Rust Criticism - from a Rustacean
I've been using Rust for a little over three years now and I absolutely love it. Rust makes good on a lot of its promises. Rust makes highly reliable software, makes concurrent and parallel code much easier to reason about, and will make you a better programmer even on those occasions that you write some code in another language. I could continue heaping praise. But that's not the point of this post. No, this post is about taking an honest look at what I think are a few of Rust's shortcomings. Or at the very least, things that I wish had been done a little differently in the early days and which are now pretty firmly entrenched and thus difficult to fix.
1. std calls into libc
Sometimes you want, or need, to call into some really low level
functionality provided by the OS. Rust is a systems programming
language after all. You might be surprised just how often the official
or semi-blessed solution is calling into libc under the hood. I was
when I looked at it. This has a number of implications. It's difficult
to build a system completely with Rust because std requires libc. The
Redox project had to write their own libc, in Rust, in order to solve
that chicken-and-egg problem. They still aren't self-hosting and a lot
of the reason revolves around porting rustc itself, which depends on
std (and thus libc).
If you poke around in the source code of any given libc implementation
(I recommend either the NetBSD code base or musl) then what you tend to
see after a while is that a huge amount of the provided functionality
is a thin wrapper around the syscall interface. When you start doing
low level but not quite embedded sorts of programming in Rust, you
quickly discover that not all of those system call wrappers are
provided by std. In fact, there isn't even a way to make
syscalls using std. More on that later.
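To see how thin that layer is, here is a small sketch that declares two libc functions by hand and calls them from Rust. It assumes a Unix-like target where pid_t is a 32-bit signed integer and uid_t is a 32-bit unsigned integer, and a compiler recent enough to accept `unsafe extern` blocks. No extra crate is involved, because std already links against the platform libc. The helper names current_pid and current_uid are made up for this example.

```rust
// Hand-written declarations for two libc functions. The symbols resolve
// because std already links the platform libc on Unix-like targets.
// Assumption: pid_t is i32 and uid_t is u32 on this platform.
unsafe extern "C" {
    fn getpid() -> i32;
    fn getuid() -> u32;
}

/// Process ID as reported by libc's getpid wrapper (hypothetical helper).
fn current_pid() -> u32 {
    // SAFETY: getpid takes no arguments and always succeeds.
    unsafe { getpid() as u32 }
}

/// Real user ID as reported by libc's getuid wrapper (hypothetical helper).
/// As far as I know, std has no direct function for this.
fn current_uid() -> u32 {
    // SAFETY: getuid takes no arguments and always succeeds.
    unsafe { getuid() }
}

fn main() {
    println!("pid via libc: {}", current_pid());
    println!("pid via std:  {}", std::process::id());
    println!("uid via libc: {}", current_uid());
}
```

The two pid lines should agree, which is the point: on these platforms the std call ends up in the same libc wrapper.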
There have been a couple of abortive attempts to change this by
creating a version of std that is freestanding from libc.
They have mostly come to nothing because there hasn't been enough
community interest to sustain a project of that magnitude. I find this
kind of sad. I also want to contrast this with Zig, whose standard
library not only doesn't rely on libc but also provides a syscall
interface to the programmer.
2. Making syscalls in Rust
There is no official way to make syscalls in Rust, but there are crates
on crates.io which can provide this functionality. So let's look at the
first one which pops up when one does a search on crates.io, the aptly named
syscall crate. The syscall crate was last
updated 7 years ago. The repository no longer exists. It supports the
following platforms.
- x86 Linux
- x86_64 Linux
- x86_64 FreeBSD
- armeabi Linux
- x86_64 MacOS
Wow, so no updates in 7 years, and that is a really short list of supported platforms!
Having explored this space previously, I happen to know that the
sc crate was forked from the syscall crate a
pretty long time ago when the latter became abandoned by its author.
It also has a much longer list of supported platforms.
- x86 Linux
- x86_64 Linux
- x86_64 FreeBSD
- x86_64 MacOS
- armeabi Linux
- aarch64 Linux
- aarch64 MacOS
- mips and mips64 Linux
- powerpc and powerpc64 Linux
- riscv64 Linux
- sparc64 Linux
Ok, so that list is much better. It's more than twice as long. If
you're paying attention you'll no doubt have noticed that Linux is the
only OS getting much in the way of love here, though. If you're using
FreeBSD on anything other than x86_64 you're still out in the cold, and
if you care about NetBSD, OpenBSD or Solaris you're just plain SOL.
Your only recourse at that point is to either call into inline assembly
or to use the libc crate. I don't consider that a good
state of affairs for a systems programming language. Your systems
programming language should be capable of interop with other languages
but probably shouldn't require calling into C just to provide full
functionality, such as interacting with the kernel.
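For the curious, this is roughly what the inline assembly route looks like. It is a minimal sketch, not a portable solution: it covers only Linux on x86_64 and aarch64, it hard-codes the getpid syscall numbers for those two targets (39 and 172), and the function name raw_getpid is invented for the example. On any other target it just falls back to std so that the program still builds.

```rust
/// getpid via the raw syscall instruction on x86_64 Linux.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn raw_getpid() -> u32 {
    let ret: usize;
    // SAFETY: getpid takes no arguments and touches no user memory.
    // The kernel clobbers rcx and r11, so both are declared as outputs.
    unsafe {
        std::arch::asm!(
            "syscall",
            inlateout("rax") 39usize => ret, // 39 = getpid on x86_64 Linux
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack)
        );
    }
    ret as u32
}

/// getpid via the raw svc instruction on aarch64 Linux.
#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
fn raw_getpid() -> u32 {
    let ret: usize;
    // SAFETY: getpid takes no arguments and touches no user memory.
    unsafe {
        std::arch::asm!(
            "svc 0",
            in("x8") 172usize, // 172 = getpid on aarch64 Linux
            lateout("x0") ret,
            options(nostack)
        );
    }
    ret as u32
}

/// Fallback for every other target: no raw syscall, just ask std.
#[cfg(not(all(
    target_os = "linux",
    any(target_arch = "x86_64", target_arch = "aarch64")
)))]
fn raw_getpid() -> u32 {
    std::process::id()
}

fn main() {
    println!("raw syscall pid: {}", raw_getpid());
    println!("std pid:         {}", std::process::id());
}
```

Every additional OS and architecture needs its own block like these, with its own calling convention and syscall numbers, which is exactly the work the sc crate does for the platforms it supports.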
3. The crates ecosystem is based on GitHub and has a flat namespace
Ok, so part of this calls back to the previous section. Remember the
(abandoned) syscall crate? The fact that anyone can
publish on crates.io and then subsequently abandon their crate means
that over time these situations are coming up more and more. This is
something that I am far from the only person criticizing, so please can
we have namespaced crates already? Pretty please, with sugar on top?
This was a laughably bad design choice right from the start.
Then there's the issue that the crates.io registry is on GitHub, and only GitHub, and in order to publish on crates.io you have to have a GitHub account. For some people this is obviously not an issue, but I'm sorry to be the dick that points out that GitHub is owned by a mega-corporation with a questionable history regarding open source and free software and there are a lot of people who experience varying levels of discomfort around this. I am sure that there are those for whom this state of affairs is a deal breaker. Personally, I'm fine with the Rust organization hosting their code on GitHub, but I think it's a bad situation that in order to get a crates.io account you must have a GitHub account. I know that the mega-corp in question hearts Linux now, big time, but we are humans with long memories after all.
4. Errors in std are inconsistent
What am I talking about here? Consider the
std::fmt::Display trait.
    pub trait Display {
        // Required method
        fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>;
    }
Display takes a formatter as an argument and writes into it. This is a
fallible operation, hence the Result<(), Error>
return type. Now let's look at some common String handling
methods.
    let r = "Rust";
    // `x` must be mutable, since push_str and push modify it in place.
    let mut x: String = format!("{r} is inconsistent");
    x.push_str(" in handling memory allocation errors");
    x.push('.');
Let's unpack that. The format!() macro calls the
std::fmt::format function under the hood. Let's look at
that code.
    #[cfg(not(no_global_oom_handling))]
    #[must_use]
    #[stable(feature = "rust1", since = "1.0.0")]
    #[inline]
    pub fn format(args: Arguments<'_>) -> string::String {
        fn format_inner(args: Arguments<'_>) -> string::String {
            let capacity = args.estimated_capacity();
            let mut output = string::String::with_capacity(capacity);
            output.write_fmt(args).expect("a formatting trait implementation returned an error");
            output
        }

        args.as_str().map_or_else(|| format_inner(args), crate::borrow::ToOwned::to_owned)
    }
The observant will immediately notice the line that starts with
output.write_fmt and ends with expect. This
means that whenever you use the format!() macro, there is
a potential panic.
The push_str and push methods rely on
functionality provided by the underlying Vec, since that
is what a String is. As such, they also inherit the
behavior of never reporting allocation errors to the caller and instead
having a panic (or an outright process abort) hidden in the code, which
we opt into literally every time we use std in a project.
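To be fair, std does offer an opt-in way to surface the failure: String::try_reserve returns a Result instead of taking the process down. The sketch below wraps it in a helper, try_push_str, which is a name invented for this example and not something std provides. It relies on the documented behavior that push_str does not reallocate once enough capacity has been reserved.

```rust
use std::collections::TryReserveError;

/// Appends `extra` to `buf`, reporting allocation failure to the caller.
/// Hypothetical helper for illustration; not part of std.
fn try_push_str(buf: &mut String, extra: &str) -> Result<(), TryReserveError> {
    // Ask for the space up front; this is the step that can fail.
    buf.try_reserve(extra.len())?;
    // Capacity is now sufficient, so this push does not need to allocate.
    buf.push_str(extra);
    Ok(())
}

fn main() {
    let mut s = String::from("Rust");
    try_push_str(&mut s, " is inconsistent").expect("allocation failed");
    println!("{s}");

    // An absurdly large request fails with an error value instead of
    // panicking or aborting.
    let mut empty = String::new();
    println!("huge reserve failed: {}", empty.try_reserve(usize::MAX).is_err());
}
```

Nothing forces you to write code this way, and almost nobody does, so the everyday push_str path keeps its hidden failure mode.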
Now, when I say that Rust is inconsistent here, look at this.
    use std::fmt::Write; // brings write_fmt for String into scope

    let mut r = "Rust".to_string();
    write!(r, " is inconsistent")?;
    write!(r, " in handling memory allocation errors")?;
This is another way of appending to a String and is
treated as a fallible operation (hint - appending to a
String is almost always a fallible op). The extra
observant will also know that the to_string method comes from the
ToString trait, which is implemented automatically for every type that
implements Display, and that it calls the fmt method we wrote, which is
considered fallible. So why is the
to_string method considered infallible? Once again, we're
just opting in to ignoring the possibility of an underlying
panic!().
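The difference is easy to demonstrate with a Display implementation that always fails. This is a contrived sketch: the type Broken and the two helper functions are invented for the example, and it assumes the default panic strategy (unwinding), since catch_unwind cannot observe a panic otherwise.

```rust
use std::fmt::{self, Write};

/// A type whose Display implementation always reports an error.
struct Broken;

impl fmt::Display for Broken {
    fn fmt(&self, _f: &mut fmt::Formatter<'_>) -> fmt::Result {
        Err(fmt::Error)
    }
}

/// write! hands the error back to the caller as a Result.
fn write_reports_error() -> bool {
    let mut out = String::new();
    write!(out, "{}", Broken).is_err()
}

/// format! has no way to return the error, so it panics instead.
fn format_panics() -> bool {
    std::panic::catch_unwind(|| format!("{}", Broken)).is_err()
}

fn main() {
    println!("write! returned an error: {}", write_reports_error());
    println!("format! panicked: {}", format_panics());
}
```

The same fmt call sits underneath both paths; only the wrapper decides whether you get to see the error.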
Unlike the other gripes that I've brought up earlier, this one is so entrenched by now that it's never going to be fixed for common usage. All you can do is live with it and realize that if you are that low on memory, the OS is likely going to be killing processes anyway. That said, I don't like the way Rust is inconsistent here. Rust normally enforces correctness, yet turns around and treats memory as an inexhaustible resource, ignoring or handicapping a lot of potential use cases for the language.
5. Macros were a mistake
I'm writing this hot on the heels of the 2023 RustConf fiasco wherein a very well known and respected person in the field was set to give a talk on improving reflection in Rust, but was notified very late in the game that his talk was to be downgraded from a keynote to a regular talk. This caused much weeping, wailing and gnashing of teeth and did a lot of harm to the community and how we are all viewed by the wider FLOSS community. It also sucks, because that would have likely been a very interesting talk on a subject that I care a lot about in relation to Rust.
Apart from Rust, the other language that has really caught my attention
is Zig. One of Zig's outstanding core features is
comptime, which is basically code which is evaluated and
executed at compile time rather than at runtime. The way that Andrew
Kelley has implemented this has not only provided a full replacement
for any kind of macro system but has provided other benefits as well.
According to Kelley, when he fully embraced comptime then
generics "just fell into my lap". He's not kidding, either. The concept
of generics is only possible with some sort of compile time reflection
in any language, and when you look at Zig code what is amazing
is that there is no special syntax required for doing very
interesting things at compile time, including generics and pretty much
all of the things that we reach for macros to do in other
languages, Rust included.
I guarantee that a lot of people disagree with this. I don't care. Macros are a mistake in light of the fact that there is a way to do the same thing without having to have a second DSL to learn on top of the language itself. Zig has proven it without any doubt to myself and to anyone else paying attention. You can, at least in this particular case, have your cake and eat it, too.
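For reference, the closest thing Rust has built in is const fn, which runs ordinary Rust at compile time. The sketch below builds a small lookup table that way; the function and constant names are made up for the example. Note what it cannot do: there is no way for a const fn to inspect the fields of a type, which is precisely the part that comptime reflection covers.

```rust
/// Builds a table of the first N squares. Callable at compile time.
const fn squares<const N: usize>() -> [u32; N] {
    let mut table = [0u32; N];
    let mut i = 0;
    // `for` loops are not allowed in const fn, so use `while`.
    while i < N {
        table[i] = (i * i) as u32;
        i += 1;
    }
    table
}

// Evaluated by the compiler; the finished array is baked into the binary.
const SQUARES: [u32; 8] = squares::<8>();

fn main() {
    println!("{:?}", SQUARES);
}
```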
Consider this code from the json module in Zig's
std. This function takes a value of any type,
including your own custom complex types, and uses type reflection to
output a json string to a stream of any type that meets the function's
requirements.
    pub fn stringify(
        value: anytype,
        options: StringifyOptions,
        out_stream: anytype,
    ) !void {
        const T = @TypeOf(value);
        switch (@typeInfo(T)) {
            .Float, .ComptimeFloat => {
                return std.fmt.formatFloatScientific(value, std.fmt.FormatOptions{}, out_stream);
            },
            .Int, .ComptimeInt => {
                return std.fmt.formatIntValue(value, "", std.fmt.FormatOptions{}, out_stream);
            },
            .Bool => {
                return out_stream.writeAll(if (value) "true" else "false");
            },
            .Null => {
                return out_stream.writeAll("null");
            },
            .Optional => {
                if (value) |payload| {
                    return try stringify(payload, options, out_stream);
                } else {
                    return try stringify(null, options, out_stream);
                }
            },
            .Enum => {
                if (comptime std.meta.trait.hasFn("jsonStringify")(T)) {
                    return value.jsonStringify(options, out_stream);
                }

                @compileError("Unable to stringify enum '" ++ @typeName(T) ++ "'");
            },
            .Union => {
                if (comptime std.meta.trait.hasFn("jsonStringify")(T)) {
                    return value.jsonStringify(options, out_stream);
                }

                const info = @typeInfo(T).Union;
                if (info.tag_type) |UnionTagType| {
                    try out_stream.writeByte('{');
                    var child_options = options;
                    child_options.whitespace.indent_level += 1;
                    inline for (info.fields) |u_field| {
                        if (value == @field(UnionTagType, u_field.name)) {
                            try child_options.whitespace.outputIndent(out_stream);
                            try encodeJsonString(u_field.name, options, out_stream);
                            try out_stream.writeByte(':');
                            if (child_options.whitespace.separator) {
                                try out_stream.writeByte(' ');
                            }
                            if (u_field.type == void) {
                                try out_stream.writeAll("{}");
                            } else {
                                try stringify(@field(value, u_field.name), child_options, out_stream);
                            }
                            break;
                        }
                    } else {
                        unreachable; // No active tag?
                    }
                    try options.whitespace.outputIndent(out_stream);
                    try out_stream.writeByte('}');
                    return;
                } else {
                    @compileError("Unable to stringify untagged union '" ++ @typeName(T) ++ "'");
                }
            },
            .Struct => |S| {
                if (comptime std.meta.trait.hasFn("jsonStringify")(T)) {
                    return value.jsonStringify(options, out_stream);
                }

                try out_stream.writeByte(if (S.is_tuple) '[' else '{');
                var field_output = false;
                var child_options = options;
                child_options.whitespace.indent_level += 1;
                inline for (S.fields) |Field| {
                    // don't include void fields
                    if (Field.type == void) continue;

                    var emit_field = true;

                    // don't include optional fields that are null when emit_null_optional_fields is set to false
                    if (@typeInfo(Field.type) == .Optional) {
                        if (options.emit_null_optional_fields == false) {
                            if (@field(value, Field.name) == null) {
                                emit_field = false;
                            }
                        }
                    }

                    if (emit_field) {
                        if (!field_output) {
                            field_output = true;
                        } else {
                            try out_stream.writeByte(',');
                        }
                        try child_options.whitespace.outputIndent(out_stream);
                        if (!S.is_tuple) {
                            try encodeJsonString(Field.name, options, out_stream);
                            try out_stream.writeByte(':');
                            if (child_options.whitespace.separator) {
                                try out_stream.writeByte(' ');
                            }
                        }
                        try stringify(@field(value, Field.name), child_options, out_stream);
                    }
                }
                if (field_output) {
                    try options.whitespace.outputIndent(out_stream);
                }
                try out_stream.writeByte(if (S.is_tuple) ']' else '}');
                return;
            },
            .ErrorSet => return stringify(@as([]const u8, @errorName(value)), options, out_stream),
            .Pointer => |ptr_info| switch (ptr_info.size) {
                .One => switch (@typeInfo(ptr_info.child)) {
                    .Array => {
                        const Slice = []const std.meta.Elem(ptr_info.child);
                        return stringify(@as(Slice, value), options, out_stream);
                    },
                    else => {
                        // TODO: avoid loops?
                        return stringify(value.*, options, out_stream);
                    },
                },
                .Many, .Slice => {
                    if (ptr_info.size == .Many and ptr_info.sentinel == null)
                        @compileError("unable to stringify type '" ++ @typeName(T) ++ "' without sentinel");
                    const slice = if (ptr_info.size == .Many) mem.span(value) else value;

                    if (ptr_info.child == u8 and options.string == .String and std.unicode.utf8ValidateSlice(slice)) {
                        try encodeJsonString(slice, options, out_stream);
                        return;
                    }

                    try out_stream.writeByte('[');
                    var child_options = options;
                    child_options.whitespace.indent_level += 1;
                    for (slice, 0..) |x, i| {
                        if (i != 0) {
                            try out_stream.writeByte(',');
                        }
                        try child_options.whitespace.outputIndent(out_stream);
                        try stringify(x, child_options, out_stream);
                    }
                    if (slice.len != 0) {
                        try options.whitespace.outputIndent(out_stream);
                    }
                    try out_stream.writeByte(']');
                    return;
                },
                else => @compileError("Unable to stringify type '" ++ @typeName(T) ++ "'"),
            },
            .Array => return stringify(&value, options, out_stream),
            .Vector => |info| {
                const array: [info.len]info.child = value;
                return stringify(&array, options, out_stream);
            },
            else => @compileError("Unable to stringify type '" ++ @typeName(T) ++ "'"),
        }
        unreachable;
    }
So what exactly is going on here anyway? Well,
@TypeOf(value) is a compiler builtin function which gets
the type of value (in Zig the '@' sigil signifies a
compiler builtin). Then we get a description of T with
another builtin, @typeInfo, which returns a tagged union that we can
switch on (match is the closest Rust equivalent). What I love about this
is that all of this is written in plain Zig, not a DSL or macro language.
There is no special syntax required and the code is just as readable as
any other Zig code which is not using any type of reflection.
To contrast this, the serde crate has to call into
syn to build an AST in order to get type information for
serialization. Should you decide to take a peek into the source code of
syn you will find a great big ball of macros
which tends to make your eyes glaze over and gives me a feeling of
wanting to curl up into a ball and hug my knees while rocking back and
forth to Pink Floyd's "Comfortably Numb". You shouldn't have to
generate an AST in order to accomplish this task, because the compiler
has to generate an AST as part of compilation. That's duplicate effort.
Further, even those who regularly write proc macros can have a hard
time reasoning about what is going on. The fact that macros are so
powerful combined with the fact that they are so difficult to
understand leads to, IMO, an unsafe situation. This should not be a
black art or an arcane science. A language feature as powerful as proc
macros should be simple to read and understand without any special
knowledge beyond understanding the language itself. The Zig way is
indeed the better way.
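To make the "second DSL" point concrete, here is about the smallest Rust version of the same idea that avoids proc macros: a trait, a few hand-written impls, and a macro_rules! block to stamp out the repetitive ones. Everything here (the ToJson trait, the macro name) is invented for illustration and has nothing to do with how serde is actually built. The part inside macro_rules! is written in the macro pattern language, not in Rust proper, and none of this can walk the fields of an arbitrary struct the way the Zig function does.

```rust
/// Hypothetical serialization trait for this sketch.
trait ToJson {
    fn to_json(&self) -> String;
}

// The macro pattern language: `$t:ty`, `$(...)*` and friends are a
// separate syntax layered on top of the language.
macro_rules! impl_to_json_for_numbers {
    ($($t:ty),* $(,)?) => {
        $(
            impl ToJson for $t {
                fn to_json(&self) -> String {
                    self.to_string()
                }
            }
        )*
    };
}

impl_to_json_for_numbers!(i32, u64, f64);

impl ToJson for bool {
    fn to_json(&self) -> String {
        if *self { "true".to_string() } else { "false".to_string() }
    }
}

impl<T: ToJson> ToJson for Option<T> {
    fn to_json(&self) -> String {
        match self {
            Some(inner) => inner.to_json(),
            None => "null".to_string(),
        }
    }
}

impl<T: ToJson> ToJson for Vec<T> {
    fn to_json(&self) -> String {
        let items: Vec<String> = self.iter().map(|item| item.to_json()).collect();
        format!("[{}]", items.join(","))
    }
}

fn main() {
    println!("{}", vec![Some(1i32), None, Some(3)].to_json());
}
```

Supporting user-defined structs from here means either writing each impl by hand or reaching for a derive macro, which is where syn comes back in.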
My Ideal Language
Rust comes very close to being my ideal language, but I'm actually
looking for "the next language" to come after it. I mean that in a very
positive way. I think that when we look back at the early part of this
century there will be a dividing line in software development that sits
right where Rust emerged as a real challenger to the existing order. I
believe Rust has kicked off a renaissance of sorts, but that perhaps
Rust itself is just showing the way, and that an even better solution
could be on the horizon. All it's going to take is for a strongly
motivated person or, preferably, group of people to do the work of
building such a language, without repeating the same mistakes. Granted,
some of the gripes that I mentioned are deficiencies in
std and the crates ecosystem, and as such aren't
technically a reason to write a whole new language. But at this point I
just don't see anyone realistically switching to a
std-2.0.
So anyway, let's indulge for a moment my fantasy "post Rust era ideal
language". It would actually be a lot like Rust, but with a little
sprinkling of Zig's comptime goodness. There would be a
borrow checker, and it would give great error messages just like rustc
does. We'd have the concept of comptime so no macros, and
it would be obvious that's how generics functioned under the hood. But
there would also be traits, which Zig doesn't have, because that is
such a useful concept. I'm not picky about syntax other than the fact
that it has to behave as consistently as possible. The borrow checker
would obviously lead to a great concurrency and parallelism story just
like Rust. Unlike Rust, the std of this new language would
be built from the ground up to be able to do anything and everything
that we think of as C's domain but without ever resorting to calling
into C or assembly. We would get built in functionality for
interfacing with other languages as well as directly talking to the
underlying OS kernel. And finally, I envision a library package
ecosystem which not only has namespaces, but would actually be fully
distributed and decentralized - federated if you will. How that would
work I will leave up to the imagination, other than to say emphatically
that there would be no blockchain shenanigans and cryptobros can just
get off my blog already, but the Fediverse and Matrix have shown that
distributed communication can scale, and now we just need to iterate
and improve.
In the meantime, you can and should use Rust. In spite of any shortcomings it's still the best language that has come so far in terms of enforcing memory access safety and overall correctness. A lot of the little gripes I've mentioned are indeed pretty uncommon things for the average programmer to have to deal with. If you feel like trying something else (and haven't already done so) then you should also try out Zig, because it's a wonderful language with some fantastic ideas of its own and is actually ahead of Rust when it comes to its suitability to fully replace C.