Introducing OxTerm

Zterm has been one of my larger projects and, by my reckoning, fairly successful in its original goals. I've enjoyed the project immensely, and learned a great deal along the way. However, writing an end-user application in a pre-1.0 language while simultaneously working on the language bindings for the toolkit being used has presented some challenges, to say the least.

This isn't a dig on Zig

I love Zig. It's a fantastically well-designed language, and when I say that I like to emphasize the word design. C and C++ both kind of organically grew, in fits and starts, with additions happening in an ad-hoc manner by people who didn't necessarily share the same grand ideas about what the final result should look like. Frankly, it shows. A lot. I've talked about this before. In contrast, Zig presents a welcome sense of consistency, even in its early stages. There are some exceptions to this rule in the std library, but Zig's creator, Andrew Kelley, has made it known that these are going to be addressed before the language hits its 1.0 release. Based on his record to date, I trust him fully to deliver on that promise.

Zterm isn't going away (yet)

Yes, I'm creating Zterm's spiritual successor, but I intend to maintain Zterm in a working state for the foreseeable future. I'm just not going to be adding much in the way of new features to it.

So... OxTerm

OxTerm is a chance to take most of the ideas that went into Zterm and approach them from a fresh perspective, with a robust language ecosystem clearing certain blockers, and a lot of knowledge gained regarding what worked and what didn't work as well. So what are some of the current limitations of Zterm that we might overcome with a rewrite?

  1. Each tab can only be split in a single direction at a time. More complex layouts are impossible.
  2. Only one window can be opened at a time. In fact, attempting to open a second window crashes the program.
  3. The background can be set as a solid color, gradient, image or solid color with transparency. Transparency should be available no matter the background type.
  4. Background gradients are limited to between two and four stops. Granted, that's already more than just about any other terminal emulator will give you, but...
  5. Zterm uses global variables in a few places. It's not currently hurting anything, but I just don't like the practice. This is related to the issue of being limited to one window.
  6. With vte now able to build against gtk4, this is a chance to advance to a more modern toolkit.

This should not be considered an exhaustive list, of course, but let's dig into each one a little bit.

More complex layouts

Zterm splits tabs by just inserting another terminal into a GtkBox widget. The reason for this was originally just to avoid complexity. Doing things the right way would have meant using a GtkPaned widget, and then for each successive split temporarily removing the focused terminal, replacing it with another GtkPaned widget, putting the terminal back into the new GtkPaned and opening a new terminal as the second child of the new GtkPaned. Removing terminals is, if anything, even more complicated. This is not really an issue with Zig, and it's a solvable problem in either language. It's just where I originally drew the line on complexity. In fairness, I've been using Zterm as my daily driver anyway for close on a year now and find it mostly adequate for my workflow.

Switching to Rust, however, adds a few tools to the toolbox which can help to manage this complexity. In Zterm, there are two global variables in gui.zig, named terms and tabs, which keep track of the opened terminals and tabs. With the Rust bindings there is the option of leveraging subclassing and composite templates, and then we can make our list of open terminals a property of each tab, and likewise the list of open tabs can be a property of our new OxWindow widget. In point of fact, this helps to clean up a number of other issues as well. It is now much easier to open multiple windows, as well as to know which tab a given terminal actually belongs to.

How does this actually help though? It's a fair amount of boilerplate, so it needs to be justified. Well, let's give an example of something that having the option of subclassing makes simpler. Let's start with finding what terminal we're currently in. In Zterm, we do this by iterating over our list of terminals and checking for focus.

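// in src/gui.zig
    /// Returns the terminal in this tab that currently has focus, if any.
    fn currentTerm(self: Self) ?vte.Terminal {
        // get_children returns an optional list; null means there is nothing to search
        if (self.box.as_container().get_children(allocator)) |kids| {
            defer kids.deinit();
            // Linear scan: ask each child widget in turn whether it has focus
            for (kids.items) |child| {
                if (child.has_focus()) {
                    return vte.Terminal.from_widget(child);
                }
            }
            // No child of this tab is focused
            return null;
        } else return null;
    }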

Walking through this code, which is a method on the Tab struct in gui.zig, we're attempting to get the children of the Tab struct's box widget, and then looping over them, checking to see if one of them has focus. Note that this is pretty fallible in a couple places as well, and it can only operate on the currently focused tab.

Taking a look at OxTerm, the story changes quite a bit. In addition to storing a list of open terminals, it's possible to store the currently focused terminal as a property of the Tab widget, subclassed from a GtkBox. Here, rather than storing another reference to the terminal, I've elected to store a String representing the widget name of the terminal. Since we also have a HashMap with this string as the key and a reference to each terminal as the value, the lookup is easy and cheap.

// in src/tab/imp.rs
#[derive(Default)]
pub struct Tab {
    pub label: TabLabel,
    pub terms: RefCell<HashMap<String, Terminal>>,
    pub current_term: RefCell<Option<String>>,
}

// in src/tab/mod.rs
    pub fn current_term(&self) -> Option<Terminal> {
        if let Some(name) = &*self.imp().current_term.borrow() {
            return self.imp().terms.borrow().get(name).cloned();
        }
        None
    }

It's also quite easy to keep this updated. We just update Tab.current_term by connecting to the "notify::has-focus" signal (connect_has_focus_notify in the Rust bindings), which every Gtk widget emits when its has-focus property changes, and recording the terminal when it gains the input focus. This happens in the new_term method of our Tab widget, which is called whenever a new terminal is going to be opened and does all of the required setup. No looping required. It also accounts for the need to get the current terminal if something else in the ui currently has the input focus, the obvious example being when someone clicks on the menu button. In fact, Zterm had to have a workaround for just this reason and disallow that particular button from ever receiving focus. Not having to do that actually makes the program more accessible, which is another small win.
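To make that bookkeeping concrete, here is a minimal sketch using plain Rust types in place of the real widgets. It is not OxTerm's code: TabModel, add_term and on_focus_changed are hypothetical names, and Terminal is a stand-in struct rather than the vte widget. In the real program the equivalent of on_focus_changed would be called from the focus notification handler.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Stand-in for a vte terminal; only the widget name matters here.
#[derive(Clone, Debug, PartialEq)]
pub struct Terminal {
    pub name: String,
}

/// Hypothetical, toolkit-free model of the Tab widget's bookkeeping.
#[derive(Default)]
pub struct TabModel {
    terms: RefCell<HashMap<String, Terminal>>,
    current_term: RefCell<Option<String>>,
}

impl TabModel {
    /// Register a terminal under its widget name.
    pub fn add_term(&self, name: &str) {
        self.terms
            .borrow_mut()
            .insert(name.to_string(), Terminal { name: name.to_string() });
    }

    /// What a focus handler would call. Only a focus gain on a known
    /// terminal is recorded, so losing focus to some other widget
    /// (a menu button, say) leaves the last terminal in place.
    pub fn on_focus_changed(&self, name: &str, has_focus: bool) {
        if has_focus && self.terms.borrow().contains_key(name) {
            *self.current_term.borrow_mut() = Some(name.to_string());
        }
    }

    /// Cheap lookup: one hash map access, no iteration over children.
    pub fn current_term(&self) -> Option<Terminal> {
        let current = self.current_term.borrow();
        let terms = self.terms.borrow();
        current.as_ref().and_then(|name| terms.get(name).cloned())
    }
}
```

With this shape, focusing "term1" and then clicking a widget that is not a terminal still leaves current_term() returning "term1", which is the behavior the menu button case needs.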

Being able to reliably and cheaply get the current terminal is necessary in order to be able to divide up the tabs into panes. Simplified slightly, our first terminal in each Tab widget goes directly into the tab, which is a subclass of a GtkBox. When the tab is split the first time, a GtkPaned widget is appended to the box and the first terminal becomes its "start child", while the second terminal becomes its "end child". When it comes time to split the tab again, it gets slightly more complex, as each GtkPaned can only have two children. It is necessary to get the current terminal, then from that terminal get its parent, temporarily remove that terminal, insert a new GtkPaned in its place, and put the old terminal into this new GtkPaned widget along with another new terminal.

As terminals get closed, this process basically goes in reverse. Once we're down to one terminal, if that one is closed then the Tab widget emits the custom signal "close-tab", which is how the OxWindow knows to close the tab. This code was always going to be complicated no matter what language it was written in, but it's significantly cleaner when we can do things like adding custom signals and storing extra housekeeping data right in our widgets.
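The splitting and closing logic is easier to see as a tree. The following is only a sketch of the idea with no Gtk involved: Pane, split and close are hypothetical names, a Split node plays the role of a GtkPaned, and terminals are identified by name only. Terminal names are assumed to be unique.

```rust
/// Toolkit-free model of a tab's layout. `Split` stands in for a
/// GtkPaned with a start child and an end child.
#[derive(Clone, Debug, PartialEq)]
pub enum Pane {
    Term(String),
    Split(Box<Pane>, Box<Pane>),
}

impl Pane {
    /// Replace the terminal named `target` with a split that holds the
    /// old terminal as start child and `new_term` as end child.
    /// Returns false if `target` is not in this tree.
    pub fn split(&mut self, target: &str, new_term: &str) -> bool {
        let is_target = matches!(&*self, Pane::Term(name) if name.as_str() == target);
        if is_target {
            // Temporarily take the old terminal out, then put it back
            // inside the new split, as described above.
            let old = std::mem::replace(self, Pane::Term(String::new()));
            *self = Pane::Split(
                Box::new(old),
                Box::new(Pane::Term(new_term.to_string())),
            );
            return true;
        }
        match self {
            Pane::Term(_) => false,
            Pane::Split(start, end) => {
                start.split(target, new_term) || end.split(target, new_term)
            }
        }
    }

    /// Remove the terminal named `target`. The surviving sibling takes
    /// the place of the split that held both. `None` means the last
    /// terminal is gone, which is the point where "close-tab" would fire.
    pub fn close(self, target: &str) -> Option<Pane> {
        match self {
            Pane::Term(name) if name == target => None,
            Pane::Term(name) => Some(Pane::Term(name)),
            Pane::Split(start, end) => match (start.close(target), end.close(target)) {
                (Some(s), Some(e)) => Some(Pane::Split(Box::new(s), Box::new(e))),
                (Some(only), None) | (None, Some(only)) => Some(only),
                (None, None) => None,
            },
        }
    }
}
```

The real widgets need extra care that this model skips, such as reparenting and restoring focus, but the shape of the recursion is the same.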

Multiple windows

This requires less explanation as we've already laid a lot of the groundwork. In Zterm, since our lists of open tabs and open terminals are global variables, it would be impossible to tell which window each one belonged to. OxTerm has the advantage here, largely due to composite templates and custom widgets, as I explained in the previous section. Our list of tabs is a RefCell<HashMap<String, Tab>> that is stored in each OxWindow widget. Similarly, the list of terminals is a RefCell<HashMap<String, Terminal>> that is stored in each Tab widget. It's a problem that was solvable in Zterm but would have come only with a significant refactor.
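As a rough illustration of why per-widget state makes multiple windows straightforward, here is a sketch with plain structs. WindowState, TabState, open_term and tab_of are hypothetical names, and a set of terminal names stands in for the map of Terminal widgets.

```rust
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};

/// Stand-in for the Tab widget: it owns the names of its terminals.
#[derive(Default)]
pub struct TabState {
    pub terms: RefCell<HashSet<String>>,
}

/// Stand-in for the OxWindow widget: it owns its tabs.
#[derive(Default)]
pub struct WindowState {
    pub tabs: RefCell<HashMap<String, TabState>>,
}

impl WindowState {
    /// Record a terminal in the given tab, creating the tab if needed.
    pub fn open_term(&self, tab: &str, term: &str) {
        let mut tabs = self.tabs.borrow_mut();
        let tab = tabs.entry(tab.to_string()).or_default();
        tab.terms.borrow_mut().insert(term.to_string());
    }

    /// Which tab of this window holds `term`, if any.
    pub fn tab_of(&self, term: &str) -> Option<String> {
        self.tabs
            .borrow()
            .iter()
            .find(|(_, tab)| tab.terms.borrow().contains_key_like(term))
            .map(|(name, _)| name.clone())
    }
}

/// Small helper trait so the lookup above reads the same whether the
/// terminals are kept in a set (as here) or in a map (as in OxTerm).
trait ContainsKeyLike {
    fn contains_key_like(&self, key: &str) -> bool;
}

impl ContainsKeyLike for HashSet<String> {
    fn contains_key_like(&self, key: &str) -> bool {
        self.contains(key)
    }
}
```

Because nothing lives in a global, two WindowState values never see each other's tabs or terminals, and a terminal opened in one window is simply absent from the other.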

Backgrounds and transparency

This really is something that I could have implemented in Zterm without much fuss, had I given it more thought the first time around. Well, this time it's going to get baked in from the start.

Gradient stops

So yeah, while this code is nowhere near complete currently, Rust's trait system provided a great way to simplify handling multiple gradient stops. In Zterm, if the position of the first stop is changed, then the range of possible values for the second stop gets truncated to go no lower than the first stop, the third to the second, and so on. Here's a look at the amount of code that goes into managing those adjustments in Zterm.

pub const StopControls = struct {
    // Some fields omitted
    stop1_grid: gtk.Widget,
    stop1_color: gtk.ColorButton,
    stop1_position: gtk.Scale,
    stop2_grid: gtk.Widget,
    stop2_color: gtk.ColorButton,
    stop2_position: gtk.Scale,
    stop3_grid: gtk.Widget,
    stop3_color: gtk.ColorButton,
    stop3_position: gtk.Scale,
    stop4_grid: gtk.Widget,
    stop4_color: gtk.ColorButton,
    stop4_position: gtk.Scale,

    fn updateScale2(self: Self) void {
        const val = self.stop1_position.as_range().get_value();
        const adj = self.stop2_position.as_range().get_adjustment();
        if (adj.get_value() < val) adj.set_value(val);
        adj.set_lower(val);
    }

    fn updateScale3(self: Self) void {
        const val = self.stop2_position.as_range().get_value();
        const adj = self.stop3_position.as_range().get_adjustment();
        if (adj.get_value() < val) adj.set_value(val);
        adj.set_lower(val);
    }

    fn updateScale4(self: Self) void {
        const val = self.stop3_position.as_range().get_value();
        const adj = self.stop4_position.as_range().get_adjustment();
        if (adj.get_value() < val) adj.set_value(val);
        adj.set_lower(val);
    }

    fn connectSignals(self: Self) void {
        const Callbacks = struct {
            // Some functions omitted

            fn stop1PositionValueChanged() void {
                gradientEditor.stops.updateScale2();
                gradientEditor.setBg();
            }

            fn stop2PositionValueChanged() void {
                gradientEditor.stops.updateScale3();
                gradientEditor.setBg();
            }

            fn stop3PositionValueChanged() void {
                gradientEditor.stops.updateScale4();
                gradientEditor.setBg();
            }

            // functions omitted
        };

        self.stop1_position.as_range().connect_value_changed(@ptrCast(c.GCallback, Callbacks.stop1PositionValueChanged), null);
        self.stop2_position.as_range().connect_value_changed(@ptrCast(c.GCallback, Callbacks.stop2PositionValueChanged), null);
        self.stop3_position.as_range().connect_value_changed(@ptrCast(c.GCallback, Callbacks.stop3PositionValueChanged), null);
        // Some lines omitted
    }
};

OxTerm will be taking advantage of a composite template to define a StopEditor widget, consisting of a GtkBox with a GtkColorButton and a GtkSpinButton, so when we define our stops we're going to have significantly fewer struct fields. Additionally, OxTerm isn't going to care where the stops get set, and won't have to adjust the range of the next stop when a stop's value is changed. It's instead going to order their output by position. Let's look at the Gradient struct.

#[derive(Deserialize, Serialize)]
pub struct Gradient {
    pub kind: Kind,
    pub stops: Vec<Stop>,
}

The kind field here is a complex enum which encodes information like whether we have a linear, circular, or elliptical gradient and its directionality. This is largely a straight translation from the Zig code in Zterm into Rust. However, in Zterm we had four fields for stops, the last two being optionals. In OxTerm I'm going for a Vec<Stop> instead, which will be constrained by the ui to have at least two entries. And our Stop struct implements PartialOrd.

#[derive(Deserialize, Serialize)]
pub struct Stop {
    pub color: RGBA<f32>,
    pub position: f64,
}

// Ordering is std::cmp::Ordering (`use std::cmp::Ordering;` at the top of the file)
impl PartialOrd for Stop {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        self.position.partial_cmp(&other.position)
    }
}

impl PartialEq for Stop {
    fn eq(&self, other: &Self) -> bool {
        self.position == other.position
    }
}

When the time comes to turn this into css, all that is required is to sort the vec and then iterate over its members, creating css for each one. Drastically simplified in concept and design.
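A sketch of that sort-then-emit step follows. It makes several assumptions that are not from OxTerm: the color is kept as a ready-made css color string instead of RGBA<f32>, the position is treated as a percentage, and stops_to_css is a hypothetical helper that only produces the list of color stops, not the surrounding gradient function. One detail worth noting is that a type which is only PartialOrd (as anything ordered by an f64 must be) cannot use the plain sort method, so sort_by with partial_cmp is used here.

```rust
use std::cmp::Ordering;

/// Simplified stand-in for the Stop struct; `color` is a css color
/// string here rather than an RGBA value.
#[derive(Clone, Debug)]
pub struct Stop {
    pub color: String,
    pub position: f64,
}

impl PartialOrd for Stop {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        self.position.partial_cmp(&other.position)
    }
}

impl PartialEq for Stop {
    fn eq(&self, other: &Self) -> bool {
        self.position == other.position
    }
}

/// Hypothetical helper: order the stops by position and render them
/// as a comma separated list of css color stops.
pub fn stops_to_css(stops: &mut [Stop]) -> String {
    // f64 has no total order, so `sort()` is unavailable. Positions
    // that cannot be compared (NaN) are treated as equal here.
    stops.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal));
    stops
        .iter()
        .map(|stop| format!("{} {}%", stop.color, stop.position))
        .collect::<Vec<_>>()
        .join(", ")
}
```

The user can then enter stops in any order, and the ordering is fixed up once at output time instead of being enforced on every slider change.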

And speaking of Rust's trait system, I recently revamped the rgba-simple crate, which provides color data structures for Gfret, Eva and now OxTerm. Conversions between RGBA<T> and gdk::RGBA are now done via implementations of the From trait in std::convert. So when it's time to go from our config, which is stored in rgba-simple's RGBA format, to gdk::RGBA, which is what Gtk+ uses internally, it's just a matter of color.into(), because when you implement From you get Into for free. But I'm taking this a step further by combining it with the previously mentioned StopEditor widget. We get a stop out of it by once again using into().

impl From<&StopEditor> for Stop {
    fn from(editor: &StopEditor) -> Self {
        Stop {
            color: editor.button().rgba().into(),
            position: editor.scale().value(),
        }
    }
}
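For anyone who hasn't run into the "implement From, get Into for free" rule, here it is in isolation. ConfigColor and ToolkitColor are hypothetical stand-ins for rgba-simple's RGBA<f32> and gdk::RGBA; the real types have their own field names and constructors.

```rust
/// Stand-in for the color type stored in the config file.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ConfigColor {
    pub r: f32,
    pub g: f32,
    pub b: f32,
    pub a: f32,
}

/// Stand-in for the toolkit's color type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ToolkitColor {
    pub red: f32,
    pub green: f32,
    pub blue: f32,
    pub alpha: f32,
}

// Only `From` is written by hand, once for each direction.
impl From<ConfigColor> for ToolkitColor {
    fn from(c: ConfigColor) -> Self {
        ToolkitColor { red: c.r, green: c.g, blue: c.b, alpha: c.a }
    }
}

impl From<ToolkitColor> for ConfigColor {
    fn from(c: ToolkitColor) -> Self {
        ConfigColor { r: c.red, g: c.green, b: c.blue, a: c.alpha }
    }
}

/// `into()` works here because the standard library provides a blanket
/// `impl<T, U: From<T>> Into<U> for T`.
pub fn to_toolkit(color: ConfigColor) -> ToolkitColor {
    color.into()
}
```

The target type is picked by inference, so a call site that already knows what it needs (a function argument or a struct field) can simply write value.into().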

Global variables

Yeah, that's already been covered. Those have been moved into our OxWindow and Tab widgets.

Moving to Gtk4

So how hard would this have been in Zig, and where is the gain? Well first off, the language bindings for Gtk3 in Zig were my own project, and one for which I've only ever received one community contribution (thank you Gertkeno). Since there were a large number of deprecations between Gtk3 and Gtk4, a lot of the code would just be thrown out, which isn't all that difficult. That said, there was a complete overhaul around menus, which means that all of that code would have had to be new. And there are enough subtle changes that overall it's just another large project, easily as large or larger than rewriting Zterm in Rust.

And here's the thing, really. Even were I to take that on, the result would not be anywhere near as awesome as the existing Rust bindings already are. And while I haven't even attempted to figure out certain things like subclassing and composite templates in Zig, I suspect that would entail a great deal of pain, as the C interface is likely quite macro-reliant, and C macros are one of the areas where Zig's C interop falls down pretty regularly. In short, it's a lot of mostly thankless work for little benefit. It's yak shaving at its finest.

So what does this move actually give us? Most importantly, it gives the program several years more support than it would have if written against Gtk3, because even though I expect Gtk3 to still be around for years to come, Gtk4 is going to significantly outlive it. But that's not all. For the first time, the Gtk4 API has been promised to remain stable, meaning new minor releases will not break existing code. That's huge in itself. A lot of the other changes do not directly affect OxTerm because they deal with things that are out of scope for a terminal emulator (we're not going to be playing back video or doing fancy transformations), but there are other things which are more subtle.

Example: in Zterm, when the switch happened from hard coded keyboard shortcuts to user configurable ones, the little hints went away in the menu entries. With the new menu and actions system in Gtk4, they're back. I wasn't sold on these particular changes at first, back when I began porting Gfret from Gtk3 to 4, because it wasn't at all well documented and it felt inconsistent. In Gtk3, a menu entry was treated exactly like any other widget. I still think there are some issues with consistency, but after using it a bit I've discovered that it's actually simpler, and the way Actions work puts it over the top. It was just a matter of getting over that "poorly documented" hurdle. But I digress.

The bad

First off, while what is already there is a great base to build on and already has some features that Zterm lacked, it has nowhere near the polish that Zterm has yet.

Second, an awful lot of distros don't even have a new enough version of Gtk4 in their repositories to build gtk4-rs against yet. That's just the Gtk+ issue, before you take into account vte, which only very recently began to be buildable against gtk4 and hasn't yet been released as stable. So realistically you're not going to be able to try out OxTerm unless you're using Arch Linux, in which case you can get an appropriate vte from the AUR package vte-2.91-gtk4, or using some other rolling release distro and compiling vte yourself. FreeBSD ports are up to date with Gtk4, so that's another possibility. But I wouldn't know where to even start with Ubuntu.

Third, there are probably bugs. The entire project is less than a week old at this point and I'm far from perfect. That will get better though. As will the features, and the rest of the ecosystem should catch up quickly as Gnome makes the push to get the rest of its applications onto Gtk4.