Assuming this is talking about normal "safe" Rust, I think I disagree with this. The Rust analogue of a pointer in safe code is a reference, not a raw pointer, and these can't be null at all. You could use an Option<> of a reference, and Rust will internally use null to represent the None (empty) case, but an attempt to use the option without checking for None will result in an error at compile time, not runtime. Yes, you could convert that into a runtime error, but if it was an error condition for that variable to be None then (depending on the context) you could choose not to use an Option at all, and then it would be a compile-time error at the call site to attempt to put None into it.
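To illustrate with a minimal sketch (function and names invented): the compiler won't let you touch the reference until the None case is handled.

    fn print_len(s: Option<&String>) {
        // Calling s.len() directly would be a compile-time error:
        // Option<&String> has no len() method, so the None case
        // must be handled before the reference is usable.
        match s {
            Some(s) => println!("{}", s.len()),
            None => println!("no string"),
        }
    }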
I don't think I understand what is meant by "type confusion". Surely this would also cause compile-time errors? Even C++ would give compile time errors for this unless you use a cast! (C, unlike C++, lets you implicitly convert from void* to any other pointer type so you don't need a cast to get pointer confusion.) Could someone think of an example of what might be meant here, and how it would cause a runtime error?
> I don't think I understand what is meant by "type confusion".
Accessing a memory location with one type as if it's another. In C or C++ it's usually because you accessed a union without checking a tag somewhere else.
One classic way this happens is in an interpreter where you have a big enum for all the different possible data types, along these lines (a minimal Rust sketch; the variant names are invented):
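    enum Value {
        Int(i64),
        Str(String),
    }

    impl Value {
        fn as_int(&self) -> i64 {
            match self {
                Value::Int(n) => *n,
                // Guessing the variant wrong is a checked runtime
                // error here, not a silent reinterpretation of bits.
                _ => panic!("expected an Int"),
            }
        }
    }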
In Rust and Zig this code will produce a runtime error if you screw up, but in C it's easy to forget to check the tag and then you get UB. Similarly, unwrap is checked in Rust and Zig, but the equivalent in C - dereferencing a pointer that you are pretty sure is not null - is not.
Of course Rust and Zig both have support for C-style unions too, but they're not the first thing people reach for.
Thanks for answering. As I said in another comment (in reply to tsimionescu, who suspected that's what you meant), I maintain that this is a compile-time error. Yes, one usage is to panic if the contents of the enum are not what you expect, in which case an unexpected type causes a runtime error. But that is the programmer choosing to deal with it that way.
Fundamentally, in the language, a type mismatch is a compile-time error. In many situations it's feasible to exhaustively match and deal with all possible cases of an enum (which then causes a compile-time error if you try to reference one of the other inner types from within the "wrong" case). Where one particular case is expected, you can often ensure through the type system that this case is the only possibility by using that individual type directly rather than passing the enum around. In some situations it's not feasible, or it is feasible but the extra faff isn't worth the reward - but even then I don't think it justifies saying that the type confusion is detected at runtime in a comparison of languages.
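For instance, a minimal sketch (enum and names invented) of how exhaustive matching makes the "wrong case" inexpressible:

    enum Shape {
        Circle { radius: f64 },
        Square { side: f64 },
    }

    fn area(s: &Shape) -> f64 {
        match s {
            // Each arm only has that case's fields in scope; referring
            // to `side` inside the Circle arm simply won't compile.
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Square { side } => side * side,
        }
    }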
> In rust and zig this code will produce a runtime error if you screw up, but in c it's easy to forget to check the tag and then you get UB.
As pointed out elsewhere, somewhat surprisingly, this is not UB in C, though it is in C++. In C it is merely unspecified behavior, but perfectly safe if intended.
Another (much smaller) detail is that signed overflow in C is undefined (and iirc GCC takes advantage of that when optimizing) but signed overflow in Rust is precisely defined to error in debug mode and wrap in release mode.
And unsigned overflow is also error-in-debug, wrap-in-release with Rust. If you want wrapping arithmetic, you have to ask for it. In C, unsigned overflow always wraps, so even though you can compile with overflow checks, there's no way to distinguish unsigned arithmetic that is supposed to wrap from arithmetic that isn't.
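Concretely, a small sketch of the default behaviour:

    fn bump(x: u8) -> u8 {
        // Debug build: panics with "attempt to add with overflow".
        // Default release build: wraps, so bump(255) returns 0.
        x + 1
    }

    fn main() {
        // Wrapping on purpose is spelled out explicitly:
        assert_eq!(255u8.wrapping_add(1), 0);
        println!("{}", bump(254));
    }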
> Java-style wrapping integers should never be the default, this is arguably even worse than C and C++’s UB-on-overflow which at least permits an implementation to trap.
I actually dislike Rust's "wrap in production" default, tbh. It strikes a strange balance: "we care about performance in release mode but we are also going to check and make sure this code does specific things on overflow".
An important thing to remember is that even with wrapping, Rust maintains memory safety.
Still, I agree, I prefer consistent semantics between dev/prod as much as possible. Especially since there are methods for checked/wrapping/etc. arithmetic, so I can always reach for those if I want the other behavior.
I'm not sure I follow. In release mode it does the most performant thing by default. In debug mode and tests it catches potential problems with using this default.
Either way there are explicit methods for doing wrapped, checked or saturating operations in every mode.
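For example, a quick sketch:

    fn main() {
        let a: i32 = i32::MAX;
        assert_eq!(a.wrapping_add(1), i32::MIN);   // two's complement wrap
        assert_eq!(a.checked_add(1), None);        // None instead of a panic
        assert_eq!(a.saturating_add(1), i32::MAX); // clamp at the bound
    }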
Undefined behavior on overflow is always the most performant, followed closely by "the result is unspecified". Wrapping is less performant because it often forces the implementation to actually wrap if the behavior is observable, which might be extra work (e.g. i32) or interfere with loop optimizations.
Undefined behaviour isn't an option for Safe Rust, where the impossibility of invoking undefined behaviour is the whole point. Non-deterministic unspecified values aren't in keeping with the Safe Rust philosophy either.
I imagine throw-on-overflow is slower than wrap-on-overflow.
C# can be configured to throw an exception when an int is overflowed, [0] but this behaviour isn't the default and is rarely used (typically it uses wrap-on-overflow). I imagine it might have a significant performance impact, but I'm not sure.
In a language like SPARK Ada, intended for formal verification, you can insist upon a rigorous proof that unintended overflow can never occur. That isn't an option for Safe Rust, at least not without significant breakthroughs in tooling.
> I imagine it might have a significant performance impact,
It depends on the CPU: MIPS has (or had, if you consider MIPS dead) some integer operations which trapped on integer overflow. AFAIK the 'trapping operations' were as fast as the non-trapping operations (when no overflow occurred, of course).
Unfortunately, even though RISC-V is MIPS's successor, it doesn't have those trapping operations :-(
I must admit I don't understand why no modern CPU ISA has these trapping instructions: explicit checks have an 'instruction cache cost', so users won't enable them, which reduces security.
Perhaps, but I often know via information not available to the compiler (though it might be available to a SPARK proof) that overflow won't happen, and I don't want to check for it.
Programmers can't be trusted to do this kind of free-form reasoning correctly, as attested by the unending stream of security vulnerabilities arising from undefined behaviour in C and C++ codebases.
The push for safe languages is motivated by pragmatism, not theoretical purity.
The same argument applies to various other instances of undefined behaviour.
Integer division in Java never results in undefined behaviour. Divide-by-zero results in an exception, as does (Integer.MIN_VALUE / -1). In C/C++, both of those operations result in undefined behaviour. Modern JVMs have to generate some additional instructions to implement this [0], but I wonder if the real-world performance penalty is that substantial.
Another example is reading an uninitialized variable, which is undefined behaviour in C/C++. Does this footgun really improve performance, with modern compilers? I don't have a solid answer here but I suspect not.
> IMHO 'integer overflow is UB' should be scrapped
The C committee is opposed to radical change; they never want to step on the toes of exotic compilers and exotic hardware architectures, so I doubt they'll ever change it. I think it would be more realistic to ask the major compiler vendors to commit to never doing anything unsafe on signed integer overflow. I believe GCC has an 'opt-in' flag for this. I wonder what the performance cost is, if any. Perhaps it breaks some optimisations, but as you say, other fast languages like Rust seem to get by fine.
Zero UB in safe code is a core design constraint of Rust, so I think it makes sense. Having integer overflow be implementation-defined would sound more logical to me than the current behavior, but maybe there are arguments against it also.
To be clear, it actually is implementation defined. The rules are:
Integer overflow is a "program error." This case is handled by either "default" or "enabled" overflow checking:
* If checks are "enabled", then overflow must panic
* If checks are "default", then you'll get two's complement wrapping
For implementations, if debug_assertions are enabled, then so must overflow checking be, unless the user specifically requests otherwise.
According to these rules, rustc today has "enabled" checking when debug_assertions is on, or when the user requests it via a flag. Otherwise, it leaves it to "default." If these checks ever become cheap enough, rustc may move to "enabled" in all cases by default. We'll see if that ever happens.
Introducing implementation-defined behaviour would undermine the advantages of Safe Rust. If I understand the goals of the Safe Rust project correctly, it aims to be a truly safe language, like Java or JavaScript. This means it must have no undefined behaviour, and beyond that, it should be as close to 'totally defined' as possible, without leaving program behaviour up to the particular platform, which would open the door to subtle bugs. (Concurrency is an exception here, as it really can't be made to be deterministic. Floating point might be another.)
An obvious example: does this code result in a divide-by-zero? (I'll use C syntax.)
    int myInt = INT_MAX;
    ++myInt;
    int myOtherInt = 1000 / myInt;
If signed overflow is permitted to result in myInt holding zero, then we have a divide-by-zero. Not the kind of thing that should be left up to the particular platform.
The behaviour of your Java code does not change when you move it from a 32-bit x86 machine to a 64-bit ARM machine. That's part of the appeal of Java. The same should be true of Safe Rust.
To put that another way: Safe Rust is remarkable because of its ambition: to be a truly safe language, while also having excellent real-world performance. It seems to be succeeding in doing both, without trading off on performance (Java, Go, C#) or safety (C++, and even Ada). If it starts compromising on either dimension, it becomes 'just another language'.
> "we care about performance in release mode but we are also going to check and make sure this code does specific things on overflow".
More like:
> "integer overflow checks are a painful/unacceptable performance degradation for some use-cases, but we still want to cough over-/under-flow bugs during testing"
Anyway, luckily you can just enable integer overflow checks in release builds, which is not an uncommon setup in use-cases like server code.
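With Cargo, for instance, that's a one-line profile setting:

    [profile.release]
    overflow-checks = true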
So, how do professional C programmers deal with this in general? Do they manually check for `x > INT_MAX || x < INT_MIN` every time they want to do some arithmetic? Do they manually check the CPU overflow flag after an operation? Or something else?
(I only have limited C experience, and only for hobby projects)
Use unsigned integers, which have well-defined overflow behavior.
In my experience with C (which is biased towards some specific use cases), most numbers that are likely to overflow are things like sizes and counts, which cannot meaningfully be negative anyway, so you may as well use unsigned integers for them. Other cases really do require signed integers, but for most arithmetic operations you can 'just' convert to unsigned before doing the arithmetic and then convert the result back to signed.
(Some may disagree. For instance, the Google C++ style guide [1] specifically says not to "use unsigned types to say a number will never be negative", because they want the undefined overflow behavior of signed types, in order to allow the compiler to diagnose bugs and to avoid "imped[ing] optimization". I think this is mostly nonsense; the drawbacks far outweigh the benefits, and tools for detecting overflow like UBSan can be told to check unsigned overflow as well.)
That said, even if you avoid the UB cases, checking for overflow correctly is hard; I've found many security vulnerabilities caused by missing or incorrect overflow checks. __builtin_add_overflow and friends are very nice if you have them, though unergonomic. I wish a more ergonomic version were standardized as part of the language.
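For comparison, a sketch of that checked pattern using Rust's standard integer methods (function name invented), roughly the ergonomics the builtins lack:

    fn add_sizes(a: usize, b: usize) -> Option<usize> {
        // None on overflow, instead of UB or silent wrapping.
        a.checked_add(b)
    }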
> (Some may disagree. For instance, the Google C++ style guide [1] specifically says not to "use unsigned types to say a number will never be negative", because they want the undefined overflow behavior of signed types, in order to allow the compiler to diagnose bugs and to avoid "imped[ing] optimization". I think this is mostly nonsense; the drawbacks far outweigh the benefits, and tools for detecting overflow like UBSan can be told to check unsigned overflow as well.)
Yes, I really disagree. Unsigned integers mean one thing, which is "modular arithmetic". Unless you are in the very uncommon case of actually needing modular arithmetic, for instance when implementing a crypto or hash algorithm, you want normal integers. As soon as you have anything that has any chance of introducing a subtraction somewhere, unsigned will cause bugs.
I don't know how many times I had to debug broken code such as
    for(int i = 0; i < some_size - 1; i++) { ... }
because some_size was unsigned.
If you really want a "number that cannot be negative", you don't want some_size - 1 to silently give you UINT_MAX, you want a type that will give you a compile-time or at worst run-time error.
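A sketch of what that looks like in Rust (function name invented):

    fn process(some_size: usize) {
        // In Rust, some_size - 1 already panics in debug builds when
        // some_size == 0, instead of silently producing usize::MAX; to
        // get that error in every build, make the check explicit:
        let upper = some_size.checked_sub(1).expect("size must be >= 1");
        for i in 0..upper {
            println!("{}", i);
        }
    }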
Let's say some_size is the length or size of something, so a negative number does not make sense, as there will only ever be a positive length or size (is that actually the case for arrays and such?). Then how would you express that fact? What is your proposal for such a type?
C does not provide anything specific for this (though some have argued that it should). Many projects use compiler builtins such as __builtin_add_overflow.
Either you need to restrict arithmetic to values which can’t overflow, or yes, you need to check for overflow manually. Note that you need to do those checks without actually triggering the overflow because once undefined behaviour is possible, all bets are off.
It varies a lot depending on your needs. Here are a few options.
Parts of postgres just heavily document the allowed range and expected domain of such functions (effectively encoding a type system into the comments and relying on a human to enforce it).
I've seen people drop down into assembly to prevent UB. Note that checking the overflow flag doesn't necessarily suffice; the problem happens during compilation -- e.g. if you write `x = (x-INT_MIN)+INT_MAX;` the compiler might be able to reason that the only value for x which wouldn't have UB is INT_MIN. Consequently the result must be INT_MAX, and the compiler can inline that constant anywhere else it's used without ever issuing the instructions that would let you check an overflow flag. If you do those calculations in assembly then you have a lot more freedom in that regard.
Some projects do manually check every arithmetic call (or more commonly they'll lean on the pre-processor to do that kind of busywork for them), or at least they'll do so outside of some small, core kernel which is more heavily vetted and can't afford the overhead.
Well, first, `x > INT_MAX || x < INT_MIN` makes little sense if x is an int; it will never be true. If adding a+b, you would check b > INT_MAX - a.
I would say that's very rare, only for special cases or defensive programming. Usually you either know/assert that the operation will not overflow because the inputs are bounded, or you use wider types (e.g. use an int32_t when adding two int16_t).
This has nothing to do with C programming. If an arithmetic operation could overflow, you always have to add a check, regardless of programming language. It's simply that a lot of high-level code doesn't care about such a level of correctness. Another exception is languages like Python that automatically upgrade your integers to arbitrary precision on overflow.
That said, most of the time you end up counting objects in the current address space. If you assume that there can exist no more than `SIZE_MAX` objects in memory, you can avoid many overflow checks.
My question does have to do with C programming. Other languages (not C++, of course) do not treat signed overflow as undefined behaviour. My question is specifically about this.
The point is that integer overflow practically always indicates a programming error. How a certain languages handles integer overflows is secondary. These overflows shouldn't happen in the first place. Making overflows a well-defined operation actually hides programming errors.
With regard to C compilers, there are a few cases where a compiler performs optimizations because it assumes signed integer overflow cannot happen. This is bad but typically, compiled C behaves like the underlying platform which means signed integers wrap around. With GCC, you can enforce this behavior and make signed integer overflow a defined operation with `-fwrapv`. You can also compile your code with UBSan to get runtime checks during testing. UBSan can also check for unsigned integer overflow which is defined behavior in C. So with modern C compilers, the situation is basically the same as with Rust, Zig or other safer languages.
Zig has a nullable pointer type, which will have null checks. But only in ReleaseSafe build mode, and it's trivial to construct a non-nullable pointer that is null or uninitialised.
Depends on the exact guarantees e.g. technically it's trivial to construct a nullable reference in Rust, but it's also `unsafe` and flagrantly UB so...
> I don't think I understand what is meant by "type confusion". Surely this would also cause compile-time errors? Even C++ would give compile time errors for this unless you use a cast! (C, unlike C++, lets you implicitly convert from void* to any other pointer type so you don't need a cast to get pointer confusion.) Could someone think of an example of what might be meant here, and how it would cause a runtime error?
Given the footnotes, I think they are referring to unions - probably writing to one union member but reading through another (e.g. uni.intVariant = 19; float a = uni.floatVariant).
I don't personally know how Rust and Zig handle this, but I believe it is UB in C.
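For what it's worth, Rust allows the same access but makes it explicitly unsafe; a minimal sketch (type and field names invented):

    union IntOrFloat {
        i: u32,
        f: f32,
    }

    fn main() {
        let u = IntOrFloat { i: 19 };
        // Reading any union field requires an unsafe block; the
        // compiler cannot know which field was last written.
        let f = unsafe { u.f };
        println!("{}", f);
    }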
> I think they are referring to unions - probably writing to one union member but reading through another
Thanks, you could be right. In that case, I think it ought to be listed as a compile-time rather than a runtime error too (just as I already argued for null pointer dereference).
It depends a bit on use of course: if you have a function that takes an enum and considers all cases using pattern matching, then the compiler will stop you from accessing the wrong member within a given case. If you pass an enum to a function that expects one particular case to be active, then yes you can convert the compile time error into a run time one, but this is your own choice, not something that is naturally a run time error.
It's actually defined as doing a type pun on the bit representation, interestingly. (The history of this specific behavior is somewhat complicated–it wasn't always clear-cut what this did.)
[Edit: this isn't true] That is definitely how it is used, and it works on the major compilers. But, surprisingly, this is indeed undefined behaviour according to the standard. There was an infamous rant by Linus [1] (more infamous for the language than the subject) about how the kernel can and should continue to use unions for type punning. He noted that even though the C standard doesn't specify a behaviour for that usage, gcc does (so long as you don't use a particular compiler switch mentioned in that discussion), and went on to say "The standard simply is not important, when it is in direct conflict with reality and reliable code generation."
EDIT: Having looked into this more I think I was getting confused with C++ (where the only supported way to type pun is to use memcpy or similar, which despite the name might not have to actually copy the bytes at runtime). Here [2] is a StackOverflow answer (/discussion) on the matter.
> If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
This verbiage has existed in a footnote of the standard since a defect report was filed against C99: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm. This happens to be one of the few instances where Linus is right about the C standard, but without knowing why ;)
Does the Linux kernel still only support compilation using gcc? That’s about the only situation in which the standard could be considered “not important”. I wonder how many other C projects are in the same situation and only support the use of a single compiler?
> Does the Linux kernel still only support compilation using gcc?
There was an effort to build the kernel with Clang [1]; I'm not sure about its current status. It helps that Clang tracks GCC's features quite closely, including some of its idiosyncrasies.
> but an attempt to use the option without checking for None will result in an error at compile time, not runtime. Yes, you could convert that into a runtime error, but if it was an error condition for that variable to be None then (depending on the context) you could choose not to use an Option at all, and then it would be a compile-time error at the call site to attempt to put None into it.
This wording still suggests that it is a null pointer dereference, which is technically UB but in practice a segmentation fault on any modern operating system; that is not what happens in Rust.
Rather, an optional reference cannot be dereferenced at all before it is converted to an actual reference, typically with appropriate guards for the None case. It is true that there is a way to convert an optional value of type `T` to type `T` without handling the None case, but that is simply a trivial library function that handles the None case by panicking the thread. The None case must technically always be handled in some way in safe code.
The distinction is rather meaningful, I would say, as the error occurs not at the null pointer dereference stage, but at the panic raised when attempting to convert a None value, which not only happens before that point, but provides far cleaner debug information.
In safe Rust, one simply cannot dereference an optional reference; one can only coerce it to a reference, with the understanding that this operation panics the thread in the None case. An optional reference in Rust is not a "reference" at all as it is in other languages with special support for it.
I don't understand the practical consequence. It will make your program panic and therefore stop, which is orders of magnitude worse than throwing a null pointer exception. Can you imagine your long-running application (e.g. a server) stopping in production?
The main difference[1] between a panic and undefined behaviour, including a null pointer access in C/C++, is that undefined behaviour might not crash. It's tempting to equate "undefined behaviour" with SIGSEGV but it doesn't have to be so. The undefined behaviour is even allowed to travel back in time to an earlier line of code, so long as the compiler is able to prove that it would have eventually invoked that undefined behaviour (e.g. it can prove that it would later dereference that null pointer).
So however bad it might be for your production server to crash, imagine how much worse it might be for it to appear to continue working while it actually corrupts memory and database entries, or an attacker uses it as an exploit to read other users' information, or whatever else. At least when it crashes you can detect that with your health monitoring and potentially start it back up (if the same problem doesn't cause it to immediately crash again in a loop, of course).
[1] A secondary difference is that it is possible to catch Rust panics in an analogous way to catching C++ exceptions. This isn't encouraged and might not even work (if they've been turned into calls to abort() at compile time). The fact that they're not undefined behaviour is the big one.
Well I did link to a blog post that explains this, so I feel like it's sort of on you at this point. But I guess the short answer is that in Rust you know exactly what 'unwrap' will do on a None, and in C/C++ you can't.
> can you imagine your long running application (e.g a server) stopping in production..
Yes. It happens all the time, and in fact it is inevitable. Far better than the program misbehaving, which is what undefined behavior leads to. In fact if you're building a serious, production service, you might want to skip panicking altogether and just kill the process.
Java's NPE is not at all the same thing as the undefined behavior of dereferencing a null pointer in C (or Zig in release mode). In fact, it's the same exact thing as calling unwrap. Your Java server's NPE is not bringing down the entire server because the exception is caught somewhere. The same can be done in Rust (and is, by many frameworks), where panics are caught before they crash the entire application.
My understanding is that Zig doesn't allow pointers to be null in the first place (regardless of release mode) unless you're 1) manually creating a pointer from an integer or 2) interfacing with C, and in both of those cases all bets are already off anyway (as they would be in Rust). The only "supported" options outside of that would be a non-null pointer or None.
You can likely use SEH/signals in Zig to catch the panic and resume.
In C++, you'll be lucky if you throw a nullptr exception. It's undefined behaviour, and it's often exploited by compilers for optimization. Here's a super simple example [0] showing how the compiler makes assumptions and generates a very unexpected result.
Almost no code can correctly reason about a stray null pointer, and usually they result in strange auxiliary crashes or data corruption. Loudly crashing is often the best choice, even in production.
If my code dereferences null, I want it to panic even in production. Assuming your server will never fail is very dangerous. The OOM killer can just kill your process for reasons unrelated to your server.
There's a difference between a value being null, and a pointer being null. While Rust disallows both, it handles the cases differently; for the latter, a Rust program containing only safe Rust cannot create "null" references. However, for a "nullable" type, Rust encourages using `Option`, and forces you to handle the `null`/`None` case via a check.
Here's the score for tokio without tests and examples:
$ find . -name '*.rs' | grep -v -e test -e example | xargs -n1 sed '/mod test/q' | grep -v '^\s*//' | grep -cF -e '.unwrap(' -e '.expect('
224
I see a lot of unwrapping of locks. IIRC that only fails if another thread crashed while holding the lock, in which case crashing the current thread is often unobjectionable.
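For context, the pattern looks like this (a minimal sketch): `Mutex::lock` returns a `Result` that is `Err` only when the lock is poisoned.

    use std::sync::Mutex;

    fn main() {
        let m = Mutex::new(0);
        // Err here means another thread panicked while holding the
        // lock; propagating the crash via unwrap is the usual choice.
        let mut guard = m.lock().unwrap();
        *guard += 1;
    }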
You are right. I should not have picked two random projects. I merely wished to say that most Rust projects I have encountered contained a lot of unwraps.
In addition to what Ygg2 mentioned (tests and unwrap_or), Rust is still (slowly) moving towards support for the `Never` type that auto-erases impossible branches. Without `Never`, if you statically know that your function always returns `Option::Some`/`Result::Ok`, you have to use `.unwrap()` or `.expect("reason")`. This is just you doing the obvious work for the compiler, which is not smart enough yet to figure it out by itself.
This really should get more attention. Safe null access in languages like TypeScript or Kotlin through the '?' syntax can still be abused (through '!!'), but the incentive to abuse it is much lower than in Rust, because the safe syntax is just as short or shorter (in Kotlin).
This is the opposite of reality: in Rust, the safe syntax `?` exists and there is no short syntax for the panicking form (`.unwrap()`). Users are not incentivized to unwrap in Rust; users are as disincentivized as possible to unwrap.
The uses of unwrap in the projects the GP cited are overwhelmingly in tests and examples, which are not expected to handle errors in the same way as application code. The remainder are mainly lock poisoning unwrapping, which is a completely endorsed idiom.
For errors the ? operator is both correct and easy most of the time.
But for options it's trickier. Unless the surrounding function is structured just right, the equivalent to TypeScript's and Kotlin's ? operator is .map()/.and_then(), and that's pretty ugly. .unwrap() is easier.
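A sketch of the contrast (function names invented):

    // `?` reads nicely when the function itself returns Option:
    fn first_byte(s: Option<&str>) -> Option<u8> {
        let s = s?;
        s.bytes().next()
    }

    // When it doesn't, you fall back to combinators, which is clunkier:
    fn describe(s: Option<&str>) -> String {
        s.and_then(|s| s.bytes().next())
            .map(|b| format!("first byte: {}", b))
            .unwrap_or_else(|| "empty".to_string())
    }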
Those are not mutually exclusive statements: it can be officially discouraged yet abused in practice, because it is often the path of least resistance for the developer.
One thing that's important to note is that Zig in general enforces much more correctness than C and also has much better stdlib APIs. IME many memory corruption issues in C (and C++) are actually "secondary effects" of C's and C++'s general "sloppiness" and bad stdlib APIs.
One thing I found where Zig is currently worse than C compilers: returning a pointer to a stack variable generates a warning in "modern" C and C++ compilers, while Zig lets this slip through. I hope that "trivial" things like this will be fixed on the way to 1.0
I ran into this in Zig and I was surprised that it wasn't caught. Since I'm newer to lower-level programming I didn't automatically think to look for this, and I was lost for a day or two.
One thing that's important to emphasise: sound safety, i.e. using a safe language, is no one's goal; rather, it is a means to an end. What people want is correct programs. The question is, then, does a language help write correct programs?
Ensuring safety from important classes of bugs with sound compile-time guarantees is one way to help write correct programs, and both Zig and Rust use it; ensuring safety from important classes of bugs with sound guarantees based on runtime checks is another way, and both Zig and Rust do it, too; a simple language that's easy to understand and analyse is another very important way to help write correct programs, and both Zig and Rust try to be simpler than their predecessors; making it easy to write tests and run them frequently is another way to get more correct programs that both languages try to employ. Both languages drastically differ in the use of those techniques from either C or C++, because they both put a very strong emphasis on writing correct programs, but they also differ a lot from each other in how they balance those techniques.
It is impossible to tell without careful empirical research which helps write correct programs more than the other, and it is also possible that different people find it easier to write correct programs in either Zig or Rust. Rust certainly provides stronger guarantees against temporal memory bugs than Zig, so let's assume Rust programs will contain zero, and a Zig program will contain more than zero but far fewer than C or C++. But that delta is insufficient to determine that Rust's balance of techniques reduces more bugs overall.
Also, Zig already has decent checks for use-after-free, and they'll get better; and avoiding uninitialised memory is also very easy to do (and verify) in Zig, despite there not being any checks. Even if, like other runtime checks, it is turned off in production, it still helps catch errors in that category.
At first order not even that; what people want is programs that behave correctly (or correctly enough for their purposes, which may not be very correct at all) for their particular inputs and execution environment.
Conversely those of us who want the industry to advance the state of the art generally don't want to just produce correct programs at a particular point in time, but programs whose correctness can be easily maintained even as implementations and requirements change. More than that, we want to produce libraries and frameworks that will lead as-yet-unknown programs to be correct.
> what people want is programs that behave correctly (or correctly enough for their purposes, which may not be very correct at all) for their particular inputs and execution environment.
Ah, yes. This raises an interesting philosophical question with real ramifications for software quality assurance: is a bug in the algorithm that never manifests in the system really a bug? Something like that happened in two well-used pieces of code: there was a bug in the TimSort algorithm used in both Java and Python whose probability of actual failure is similar to the probability of failure due to a bit flip caused by cosmic rays. Because hardware can only be correct with some probability, no running system can be soundly verified, i.e. with certainty, anyway; so while the correctness of algorithms can be absolute, the correctness of systems cannot. And since soundness has a big cost in verification, many in software correctness research now focus on unsound techniques that are cheaper.
> Conversely those of us who want the industry to advance the state of the art generally don't want to just produce correct programs at a particular point in time, but programs whose correctness can be easily maintained even as implementations and requirements change. More than that, we want to produce libraries and frameworks that will lead as-yet-unknown programs to be correct.
True, but that is not a winning argument for soundness. The cost of soundness manifests even during maintenance. It's therefore an equally strong argument that a language that compiles quickly and makes it easier to run, say, concolic tests, mutation tests, etc. serves that goal, too.
A language that makes code reviews easier also works toward that goal of maintaining program correctness over time. The point is, there are many different paths to correctness, all of them state-of-the-art, yet often in conflict with one another, and we don't have any mechanism other than empirical research to compare them. For example, is it beneficial to increase soundness at the expense of making code reviews harder? Not only do we not have an answer to that question, it is likely that there is no general answer (I say it's likely because whatever empirical research we do have shows messy results with large variance).
So much this. And also keep in mind the way that we typically do code reviews: we are usually looking at GitHub diffs. So if you are in a situation where code changes compose badly - for example, if something looks safe in place A and something looks safe in place B, but when you put them together it's unsafe - then you could be in deep trouble with the async way that we do reviews.
I've seen you make that argument a few times in Zig-related discussions but I'm not sure I buy it. It essentially boils down to: simpler language => easier to reason about and build tools for => fewer bugs.
While the thought has merits, the empirical evidence we have indicates that yes, it is possible to achieve good software with faulty languages plus strong rules and tooling, but it is certainly not as straightforward as you make it seem.
In the end though, for the Zig case I agree that the jury is still out. But if I was a betting man, my money would not be on it, even though I personally prefer Zig.
You're also forgetting: simpler language/explicit code => faster build times. Zero-cost abstractions are only zero-cost in optimized builds, and complex optimization isn't free.
Whether static checking vs faster iteration time is more important depends entirely on the context, but rust isn't going to help you when you accidentally did front-face culling instead of back-face culling.
> but it is certainly not as straightforward as you make it seem.
I never claimed it is straightforward; it is anything but. As a practitioner and advocate of formal methods and verification, I've been following research in software correctness for many years (and have written much about it, e.g. https://pron.github.io/posts/correctness-and-complexity), and I've come to realise how complex the problem is. There's more we don't know than we know, and even for the things we know are problems, we don't know what the best solution is, because solutions often carry with them more problems.
Nonetheless, there are certain principles. We know that we can eliminate certain bugs with compile time guarantees; we also know that code reviews catch many (many!) bugs, and so making them easier helps. But what if these two are in opposition? It's not easy to tell which wins in which circumstances.
> In the end though, for the Zig case I agree that the jury is still out.
True, but the jury is still out on Rust, too. In fact, for most languages. However, there is no clear argument that we should assume, a priori, that Rust results in more correct programs than Zig. Many such arguments in the past have failed to yield positive empirical results (e.g. https://youtu.be/ePCpq0AMyVk). In fact, given the empirical research, the safest bet is to assume the null hypothesis -- that there is no difference. Out of an abundance of caution, I'll assume that languages whose designers place a strong emphasis on correctness might achieve it more easily than languages whose designers put no emphasis on it at all, but Zig and Rust are in the same category here. Both are designed with correctness as a primary goal. But as their designs and means of achieving correctness are so different, I think it's impossible to make an educated guess as to which of them, if any, yields more correctness more easily.
If we want some bottom line, it is this: software correctness is so complex, and solutions are often so non-obvious (i.e. many work in theory but not in practice), that we cannot say anything with certainty until we have actual empirical results, and even then we need to be careful not to extrapolate from one study to other circumstances with different conditions (i.e. that TypeScript seems to have fewer bugs than JavaScript does not extrapolate to the general claim that typing always reduces bugs compared to no typing, in the same amount or at all, when other languages are concerned).
Wow, thanks for that link. I only made it through the first part for the moment but it is an incredible read. You clearly thought about this more deeply and carefully than I did.
Edit: I'm not entirely sure how that came across so I want to explicitly say that this is not a dry ironic statement (communication is hard, and I am a poor writer).
> We know that we can eliminate certain bugs with compile time guarantees; we also know that code reviews catch many (many!) bugs, and so making them easier helps. But what if these two are in opposition? It's not easy to tell which wins in which circumstances.
I understand the argument, but I'm not sure on what basis you consider that Rust's type system harms code review. Do you have specific examples in mind? (And because the discussion is about Zig, this is a pretty strange argument to make, because Zig's ubiquitous usage of metaprogramming is in fact a hindrance to code review).
Rust is easily among the top five most complex programming languages ever created (it's in the good company of other low-level languages that follow a similar design philosophy, like C++ and Ada).
Calling Zig's comptime "metaprogramming" is a little misleading when compared to other low-level languages. It is used for the same purpose as metaprogramming in other low-level languages (like macros in C++ and Rust, or templates in C++), but doesn't have any quoting mechanism [1] and doesn't operate at any "higher level." In fact, Zig's semantics would be unchanged if comptime were executed at runtime. It is more similar to metaprogramming in a dynamic language with reflection, with the benefit that the related "runtime" errors are actually reported at compile time. So comptime doesn't increase Zig's complexity. It can be thought of as a pure optimisation.
[1]: Zig's comptime is referentially transparent, i.e. if two terms, x and y, have the same meaning, then one cannot write a unit e in Zig such that e(x) and e(y) have different meanings, unlike in C++ or Rust. So the metaprogramming features in C++/Rust are trickier than Zig's.
> Rust is easily among the top five most complex programming languages ever created
You said that already[1], this is unsubstantiated and you declined to answer to my rebuttal.
> So comptime doesn't increase Zig's complexity. It can be thought of as a pure optimisation.
I'll grant you that it doesn't increase Zig's implementation complexity and also has a smaller learning-curve cost than other mechanisms. But when reading a piece of Zig code, you constantly have to wonder at which time the given code is gonna run. And there's much, much more comptime in use in any piece of Zig code than you'll encounter macros in Rust or C++. So yes, it adds its share of friction when reading Zig code.
> You said that already[1], this is unsubstantiated and you declined to answer to my rebuttal.
Sorry, didn't see your response. I can answer it in two ways, subjective and objective. The subjective is "I know it when I see it," which roughly corresponds to the difficulty of determining what an unfamiliar piece of code does, as well as how many language rules I need to know to figure that out. The objective one is literally language complexity, i.e. the computational complexity of determining whether a string is in the language or not (i.e. whether or not it is well-formed).[1]
> you constantly have to wonder at which time the given code is gonna run
You really don't. The semantics of Zig are the same as those of Zig', which would be the language that runs comptime at runtime. The whole point of comptime is that as far as semantics -- not performance -- is concerned, you do not have to care when code would run.
[1]: There's a complex theoretical caveat here, because I believe both Zig and Rust are undecidable. So we can exclude degenerate cases from Rust, and look at the complexity of Zig', the language I introduced in the second paragraph, which is semantically the same as Zig.
Anyway, complexity can come from many factors:
- feature bloat: C++ is way more complex now than it was in 1990, because features were added on top of features. In that regard, the older a language gets, the more complex it becomes. C++ is the most cited example, but I think PHP is even worse in that regard: it's probably the single most feature-bloated PL ever, probably because there is not even a standardization committee to add friction to the feature-addition process. By that metric, Rust is slowly becoming more complex every year, like every other language (but the growth of its complexity isn't particularly concerning compared to others; Go, for instance, has recently been on a much steeper track).
- platform fragmentation: when Internet Explorer was still a thing, JavaScript development was made incredibly complex by the huge implementations differences between browsers. Code that worked somewhere failed somewhere else more often than not, and you had to keep work-around for old versions or IE for years. IE is mostly dead, Safari is less shitty every year, and google killed Android Browser and replaced it with Chrome, so it's a much smaller issue than before, but problems remain.
- cultural factors: Haskellers' love of obscure mathematical terms, or the fetishism of OOP design patterns in Java in the late '90s and 2000s, are good examples of culturally-induced complexity.
- ecosystem churn: JavaScript between 2013 and 2018 or so, with new frameworks, libraries, and tools replacing the old ones every six months before getting replaced themselves in the following months, was a massive source of complexity; fortunately it seems to have settled a bit and the churn rate is lower than before. In Rust's early days, when many useful features were still unstable and feature-gated in the nightly version of the compiler, this phenomenon also existed (though at a much smaller scale). By that metric, Rust's complexity has decreased quite a bit since 1.0, as many libraries have been adopted as the de facto standard way of solving a bunch of problems (a few domains remain prone to this though, like error-handling helpers, and ECS for game engines apparently), and Rust is now roughly in the same situation as most languages.
- counter-intuitive semantics: cf. pre-ES6 JavaScript and how `this` and `var` bindings worked, which was simply the opposite of what people wanted in 95% of cases.
- obscure control flow: `with` statement in non “strict mode” JavaScript, languages relying on a lot of `goto`, or even languages with exceptions.
- too much responsibility: manual memory management in C (or Zig for that matter), for which we now have half a century of evidence that no human is able to do it consistently right all of the time.
- poor interactions between features: see C++, how modern features interact poorly with older (more C-like) ones.
Rust is less complex than many mainstream languages on at least one of these dimensions, and less complex than JavaScript on most of them…
> The objective one is literally language complexity, i.e. the computational complexity of determining whether a string is in the language or not (i.e. whether or not it is well-formed).[1]
This is a stupid metric, because it confuses implementation complexity with user-facing complexity (brainfuck wins this benchmark, yet good luck building anything with it). But from a theoretical perspective, this is a fun one, because there's not one but two classes of undecidability involved:
First, with most languages with type polymorphism, it is undecidable whether a given program will successfully compile. But there's also a second level: when a language has undefined behaviors, a program compiling successfully isn't enough: it can still be invalid, and whether or not it is valid is also undecidable. C is not in the former situation but is in the latter, C++ and Zig are in both, safe Rust is in the first only, but unsafe Rust is also in both. So in that regard, safe Rust is strictly less complex than Zig, but the whole of Rust is equivalent.
> You really don't. The semantics of Zig are the same as those of Zig', which would be the language that runs comptime at runtime. The whole point of comptime is that as far as semantics -- not performance -- is concerned, you do not have to care when code would run.
This argument is pretty similar to the Rust point of "when you get used to it, ownership doesn't add any cognitive burden". Maybe when you gain enough familiarity with Zig you can gloss over it without hassle, but I'm clearly not at that point yet, so you'd really better not assume that it's gonna be straightforward and instantaneous for everybody; it is not.
I don't think so, because I don't think I'm claiming what you think I am.
> Anyway, complexity can come from many factors:
I completely agree, but I'm only talking about language complexity, in the strict syntactic, linguistic sense. I am not saying that all things considered, Rust makes maintaining programs harder than other languages -- nobody knows that until we have some empirical study -- but linguistic complexity is one very prominent property of Rust, as is, say, the memory safety of safe Rust, which, similarly, does not mean that Rust programs are overall safer than those written in, say, Zig, when all things are considered, because correctness also has many contributors, not just sound syntactic guarantees. There, too, only empirical study can settle the issue.
But you can't have it both ways, focusing on one specific piece when it comes to correctness yet insist on looking at the full picture only when it comes to complexity. All you can say is that, linguistically/syntactically, Rust offers some sound guarantees re memory safety and that it is complex, and that overall, both subjects are complex, involve many aspects, and require empirical study to make any definitive claim about.
> This is a stupid metric, because it confuses implementation complexity with user-facing complexity
It is obviously not stupid because it is commonly and usefully employed in computer science. But as with any precise definitions, it focuses on some aspects and not others. It captures the intrinsic difficulty of answering a question about a program. You are correct that it does not take into account human ergonomics and psychological aspects, but it is one more useful metric, even if not comprehensive.
> This argument is pretty similar to the Rust point of “when you get used to it, ownership doesn't adds any cognitive burden”,
Absolutely not (and, BTW, I was not referring just to ownership and lifetime when I spoke of Rust's complexity). It is a very precise and well defined property of Zig. The semantics of a Zig program, i.e. what it does in terms of what action it computes, is completely independent of comptime. It is not an ergonomic or psychological argument. comptime does not change the meaning of anything, and not only do you not need to figure out what happens at compile time and what happens at runtime -- unless you want to reason about efficiency -- but that knowledge contributes nothing. It's a meaningless distinction when it comes to semantics. It's a very powerful, well thought-out, theoretical and practical aspect of Zig's design.
> but I'm only talking about language complexity, in the strict syntactic, linguistic sense.
Then again, on the strict syntactic sense, Rust is even less complex than C, because of the “most vexing parse” issue. If you wanted a rigorous analysis of the syntactic complexity, you could attempt to measure how difficult it is to write a lexer and a parser for every popular languages, and see how Rust performs. But given that the language grammar has been designed with parsing complexity in mind and have benefited from the hindsight of others before it, you'd be terribly disappointed.
From this discussion, and many of your previous comments on this forum, it's pretty clear, even though the reason isn't, that you have developed a resentment towards Rust and you can't help bashing it.
Rust isn't a silver bullet; it has a fairly tough learning curve, and as it tries to push the frontier of system programming languages forward, it will take a few decisions that will ultimately be regarded as dead ends, and I have no doubt that future languages will avoid these pitfalls and improve over the state of the art.
In the meantime, spreading your hate with unsubstantiated judgements like “Rust is one of the 5 most complex programming languages ever” or “Rust harms code reviews” isn't really constructive for anyone.
Zig is a cool motorbike, Rust is a SUV. Arguing that your bike can indeed be safer than a SUV because you have more visibility and agility to avoid the danger is beyond childish.
Super easy cross-compilation and incredible development velocity on small-medium projects are super cool features of Zig, and Rust can't beat that. No need to downplay the importance of Rust for the software industry (and as a friendly reminder, Rust is making its way into the Linux kernel, with the approval of Linus, because unlike C++ or Ada it isn't too complex for his taste ;).
Perhaps this may disappoint you, but I -- like many and perhaps most developers -- don't have such emotional responses, positive or negative, to any programming language [1], which might appear as resentment to the emotionally attached. I am very impressed with some aspects of Rust, less impressed with others, and my feelings toward it overall are shaped just like yours: by personal aesthetics. I don't find Rust's aesthetics very appealing, and so Rust isn't my cup of tea (although I wouldn't resent working in it, because I'm not emotional toward languages, and I currently program mostly in a language whose aesthetics I like even less than Rust's [2]), while you find them appealing and so you do like Rust. It's all just a matter of taste, and I fully accept that not everyone shares mine. I think your approach is too coloured by emotion, and is therefore unconstructive. You're a zealot, and you project that attitude onto others, so "unconvinced" appears to you as a personal attack, and scepticism or dislike seems to you like bashing.
> Rust is even less complex than C, because of the “most vexing parse” issue. If you wanted a rigorous analysis of the syntactic complexity, you could attempt to measure how difficult it is to write a lexer and a parser for every popular languages, and see how Rust performs
No. The complexity of a formal language, like that of any set, is the computational complexity of deciding whether a string is in the language (so, including type-checking), not the complexity of the parsing phase (https://en.wikipedia.org/wiki/Computational_complexity_theor...). I'm not saying this is the most useful way to talk about language complexity in this context (and caveats are needed, anyway, to make a finer distinction between languages), but it is certainly one well-known way to talk about the intrinsic complexity of a language.
> Zig is a cool motorbike, Rust is a SUV. Arguing that your bike can indeed be safer than a SUV because you have more visibility and agility to avoid the danger is beyond childish.
It is beyond childish to make such inane statements about software correctness when you're clearly not very familiar with the subject, and are drawn to arguments like "more soundness => more correctness". The effect of language design on correctness is a complex subject with mostly unsatisfying answers, and even in software verification research, the debate over the value of soundness is far from settled (and not currently leaning toward more soundness). An equally inane statement would be, "Zig is like a modern aeroplane, relying on multiple levels of safety, some mechanical and some human, while Rust is like an old train that breaks down and kills everyone once there's a problem with the tracks." If we've learned anything about software correctness in the past decade, it's that there is not much we can assume in advance, and that we don't really know one best way to improve it. It is true that some researchers think that the best answer to any correctness issue is more soundness in the language, but not only is this not a consensus opinion, I doubt it's even a majority opinion.
[1]: I would say I'm a "language sceptic." I'm generally sceptical toward any claim about the bottom-line effectiveness of linguistic features without empirical support, and overall think that whatever empirical studies we do have show little overall impact from language design (comparing "reasonable" alternatives, at least), certainly compared to what all language fans claim. I would never, say, make a definitive claim like, "Zig yields more correct programs than Rust", or "Rust yields more correct programs than Zig," without clear empirical support (and my guess based on prior results would be that they're about the same).
> I think your approach is too coloured by emotion, and is therefore unconstructive.
Your little “I'm a rational agent, you are too emotional” bit is pretty cute. But it would work better if your whole attitude in this thread didn't contradict it, don't you think? “Rust is among the five most complex languages” is not a rational argument, it's a personal feeling. Why you feel the need to spread your feelings over the internet while pretending you're not an “emotional” person is quite intriguing. If you want to look more like a rational person (no human really is), try to keep as much personal and unsubstantiated judgement out of your writing.
“Rust marketing makes safety claims that we should not take at face value” is alright; “Rust is one of the five most complex languages ever” doesn't pass this test.
> is the computational complexity of deciding whether a string is in the language (so, including type-checking)
But for Rust, C, Zig, and many others, it's undecidable, so by this definition of complexity, these languages are definitely too complex (and equally so).[1]
In fact, your desperate attempt to save your initial argument about complexity, by narrowing it to a tiny technical corner makes me cringe a bit.
> I would never, say, make a definitive claim like, "Zig yields more correct programs than Rust", or "Rust yields more correct programs than Zig," without clear empirical support (and my guess based on prior results would be that they're about the same).
The technicality of what constitutes a “definitive claim” in a human-to-human conversation is an interesting question, but in practice truisms like “we can't conclude whether Rust is safer than C” or “we can't conclude that Rust doesn't bring more bugs than Zig does” aren't neutral: what such a claim attempts to do is insinuate the opposite. And when combined with gratuitous judgements like “Rust is one of the 5 most complex languages ever”, it looks a lot like an attempt to deter people from using a language you don't like. (And now I have a clue about the root cause of your bad feelings.)
[1] And as I said earlier, in the case of Zig and unsafe Rust it's not just about type-checking: because of UB, even after compilation, whether the compiled binary is the binary of a valid program is also undecidable.
> But for Rust, C, Zig, and many others, it's undecidable, so by this definition of complexity, these languages are definitely too complex (and equally so)
My comments on this tried to be as careful and as precise as possible, and touched on this very issue.
> In fact, your desperate attempt to save your initial argument about complexity, by narrowing it to a tiny technical corner makes me cringe a bit.
Your emotional response here is so powerful that I think we're conversing on entirely different levels. I am sorry if my mild and careful statements have touched on something that you clearly see as essential to your identity.
> aren't neutral: what such a claim attempts to do is insinuate the opposite
No. It is the most precise and careful statement that I can make, having followed the research for years and being a practitioner and advocate of formal methods. The more careful I try to be in what I say, the further it seems to send you into rage (and abuse). I think you're in the middle of a tantrum that's clouded your judgment, or perhaps, being a zealot, you cannot imagine any other attitude. Anyway, this conversation is making me very uncomfortable, as I sense you're in a very agitated emotional state, and I want no part of that.
> No. I think you're in the middle of a tantrum that's clouded your judgment.
I have to admit, this is a cool rage-quit punchline!
Edit:
> I am sorry if my mild and careful statements have touched on something that you clearly see as essential to your identity, but I simply see no way to discuss this subject with you.
> Anyway, this conversation is making me very uncomfortable, as I sense you're in a very agitated emotional state, and I want no part of that.
You said that already[1]; this is unsubstantiated, and you declined to answer my rebuttal.
How would you propose to measure the concept of "programming language complexity"? One metric could be "how difficult is it to write programs that do not contain certain classes of bugs"? By that metric, C is indeed incredibly complex. An alternate metric might be "how long does it take the average developer to learn the language well enough to write reasonably effective programs"?
In the absence of formal studies we just have to go by our intuition. Personally, I kinda hate the "I'm not smart enough to write C, so I write Haskell/Rust" argument. It comes across as incredibly condescending to me. What I can tell you from my experience is that I spent a month trying to learn Rust on nights and weekends, and by the end of that was able to write some extremely simple programs with a lot of effort. On the other hand I was making nontrivial contributions to Zig itself within a week of learning the language. So to me, Rust is much more complex than Zig.
I'm not a native English speaker, but as far as I know, the word complexity in English is pretty close to its meaning in French (where it comes from). From Wikipedia:
> Complexity characterises the behaviour of a system or model whose components interact in multiple ways and follow local rules, meaning there is no reasonable higher instruction to define the various possible interactions.
This is in fact the most antithetical possible description of Rust, which, thanks to its strong type system and compile-time rules, keeps the interactions between different components and features as clear and specified as possible.
Yes Rust is hard to learn, but learning curve and complexity are orthogonal concerns.
From my point of view, until Zig fixes the issues marked as "none" in the table, it adds very little value over existing alternatives.
I can already use C and C++ to suffer those issues in production and use VC++ static analysers to mitigate them, while languages like Ada, D, Rust, Nim and Swift prevent them from happening at all.
Zig's safety is not at all like C's (or even C++'s), even with static analysers and sanitisers. It is core to the language through things like slices and nullability types. What Zig brings to the table is an extremely powerful and expressive, yet remarkably simple language that places as much emphasis on correctness as Rust (albeit in a radically different way).
I don't think you can say that Zig is as correct as Rust given that memory-safety is not guaranteed by Zig (as evidenced by the article we're commenting on).
Whether a language is "correct" is meaningless (hopefully, most compilers/interpreters are reasonably correct); we're talking about which language makes it easier to write correct programs, and because both languages focus heavily on that goal yet take very different approaches to achieving it (the article only compares one), it is simply impossible to tell at this point which of those languages, if any, achieves that goal better than the other.
Except you are assuming that Zig will never change after 1.0 release.
C17 is also quite different from K&R C, especially in what optimizers do with UB.
The only way OS vendors can fix the issues that Zig shares with C, C++ and Objective-C, as per the article, is to adopt hardware memory tagging, something already available on Solaris SPARC, Azure Sphere and iOS (yes, PAC is a bit different), with ongoing work for ARM.
So I really don't see the benefit, but let's see what Zig 1.0 actually looks like, and I might be wrong by then.
I'm talking about Zig as it is now. If the design drastically changes, it would be a different story. The UB comparison is not very relevant, though, because Zig, a language that takes correctness seriously, aims to make it very easy to not have any UB (at least with high probability) in its safe mode. Aside from being inherently safer than C, and arguably C++, even with all of those enhancements (not considering safe variants of C), Zig brings benefits other than safety, like terrific cross-compilation, fast builds, and a language that is extremely expressive yet very simple and easy to learn.
But I've long ago learned that language preference is mostly a matter of personal aesthetics, so all I can say is that I find Zig very appealing. Its design is certainly radical, and it doesn't feel like any other low-level language I've ever seen (it is about equidistant from C, C++, Rust, D, Nim, Ada; even when pushed I don't think I'd be able to say which of those Zig is most like, because it is so different). Like it or not, it offers a fresh vision on how low-level programming can be done.
By the way, Apple decided to just use "Safe C" for their iBoot firmware, but other than documentation references on Apple Developer, they are probably not going to share it with the world.
It's important to clarify that memory safety is only one aspect of writing safe, secure software.
To this list, then, I would also add and compare: OOM-safety under overload conditions, and fine-grained error-handling safety, in particular because error handling tends to be one of the leading causes of faults in distributed systems [1].
To be fair, I was surprised that Rust did not have checked arithmetic on by default, and that this needs to be turned on via a compiler setting or linted against. The presence of integer overflow in a program can facilitate a whole range of exploits, even with memory safety.
> memory safety is only one aspect of writing safe, secure software
A good example of this is the SQLite documentation page "Why Is SQLite Coded In C". Among other things, it describes those memory safety issues as "the easy problems" compared to "the rather more difficult problem of computing a correct answer to an SQL statement".
Not all of our programs are like SQLite, of course (and not all of us mere mortals are like its developers). But I would certainly say that just because you've eliminated memory safety bugs doesn't mean you've eliminated all bugs. Depending on the program, you might not even have eliminated most bugs.
> We established that simply querying a database may not be as safe as you expect. Using our innovative techniques of Query Hijacking and Query Oriented Programming, we proved that memory corruption issues in SQLite can now be reliably exploited. As our permissions hierarchies become more segmented than ever, it is clear that we must rethink the boundaries of trusted/untrusted SQL input. To demonstrate these concepts, we achieved remote code execution on a password stealer backend running PHP7 and gained persistency with higher privileges on iOS. We believe that these are just a couple of use cases in the endless landscape of SQLite.
IIRC, QOP/QH relies on the somewhat unfortunate way that tables are laid out and initialized in SQLite, so my impression was that QOP/QH is the highest-order problem that needs to be patched; after all, there are other types of vulns, beyond memory safety problems, that are reachable via QOP/QH.
I found this part about Rust at the end particularly interesting:
> All that said, it is possible that SQLite might one day be recoded in Rust. Recoding SQLite in Go is unlikely since Go hates assert(). But Rust is a possibility. Some preconditions that must occur before SQLite is recoded in Rust include:
> Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.
> Rust needs to demonstrate that it can be used to create general-purpose libraries that are callable from all other programming languages.
> Rust needs to demonstrate that it can produce object code that works on obscure embedded devices, including devices that lack an operating system.
> Rust needs to pick up the necessary tooling that enables one to do 100% branch coverage testing of the compiled binaries.
> Rust needs a mechanism to recover gracefully from OOM errors.
> Rust needs to demonstrate that it can do the kinds of work that C does in SQLite without a significant speed penalty.
Sure: almost anything that uses numbers, e.g. to control access, parse data structures, or do "something important". Here are two off the top of my head:
* Rate limits for an OTP, which could be trivially reset through overflow.
* Parsing zip files, where the hostile content is self-referencing, to sneak in cyclic references for a DoS, or to change file extension type for code execution after bypassing a content filter, or to change output destination (e.g. as part of a symbolic link or directory traversal) to overwrite system files.
Unchecked arithmetic is, for me, by far one of the scariest exploit vectors, because it's so easy to get wrong and one of the first things a trained attacker would look for.
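To make the overflow point concrete, here is a contrived Rust sketch (the u8 counter and the wrapping call are purely illustrative stand-ins for unchecked arithmetic): wrap the counter past its maximum and the limiter thinks almost no attempts were made:

    fn main() {
        // Hypothetical OTP guess counter; u8 keeps the numbers small.
        let mut attempts: u8 = 250;
        for _ in 0..10 {
            // wrapping_add stands in for unchecked arithmetic that silently wraps
            attempts = attempts.wrapping_add(1);
        }
        assert_eq!(attempts, 4); // 250 + 10 wrapped around 256 down to 4
        if attempts < 5 {
            println!("OTP attempt allowed"); // rate limit effectively reset
        }
    }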
Programs in any language are always exploitable, in some way or another. Memory safety is no guarantee that a program is "safe" let alone correct.
> Rust did not have checked arithmetic on by default
Except it does, in debug mode, which is the default compilation mode. And you don't have to tweak compiler settings or linters to get checked arithmetic in release: you simply call the checked arithmetic functions, which have the added benefit of giving you complete freedom in how to handle overflow.
Wherever you 'learned' this misinformation from, please stop considering it a trustworthy source :(
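For anyone unfamiliar, a quick sketch of what that looks like in practice; plain `+` panics on overflow in a debug build, and the explicit methods spell out the intended overflow behaviour in any build mode:

    fn main() {
        let x: u8 = 250;
        // `x + 10` panics in debug builds ("attempt to add with overflow")
        // and wraps in release builds (unless overflow-checks is enabled).
        assert_eq!(x.checked_add(10), None);          // overflow reported as None
        assert_eq!(x.wrapping_add(10), 4);            // wrapping requested explicitly
        assert_eq!(x.saturating_add(10), u8::MAX);    // clamp at the maximum
        assert_eq!(x.overflowing_add(10), (4, true)); // value plus an overflow flag
    }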
The league of being safe, without use-after-free, letting the developer focus on productivity, while providing the language features to do C-style programming when required.
It is optional in the sense that D's GC is also optional. Technically true, but you have to go out of your way to make it work for you; average off-the-shelf libraries cannot be easily utilized.
I don't have much experience with Zig, but one thing that stuck out to me was that I was able to build for RISC-V with a one-liner. I didn't have to change or do anything at all to make this happen. That's so cool.
In contrast, I have yet to be able to build any RISC-V binaries with Rust. It just doesn't work. Sure, I can see some potential approaches, like writing a custom JSON target spec to describe the environment and maybe building with a cross-compilation toolchain. But after a certain amount of time and no answers, it was not worth my time anymore.
The recommended way to deal with use-after-free and double-free in the language is to catch them in tests. You can pretty trivially get "asan"-like behaviours out of the Zig std library. A good demonstration is here: https://www.youtube.com/watch?v=4nVhByP-npU&t=3h12m
I kind of like this philosophy, because in a sly way it's a carrot to get you to write tests. Come for the memory safety, stay for the robustness.
As a bonus, the first two hours of the video are a fantastic and honest discussion about the role of emotional empathy in tech communities and tech employment (while also acknowledging that it is possible to be an asshole and deliver good tech).
True, UAF still kind of sucks in zig. My prediction is that we are going to eventually get some sort of formal verification engine for zig as a third party tool.
To be clear - you get this same UAF protection in ReleaseSafe mode by default with GeneralPurposeAllocator and PageAllocator. It's not just a test thing. The testing system just chooses some nice defaults for you.
Thanks for the clarification! Also, you can arbitrarily compose a GPA backed by allocators other than PageAllocator, in which case you might lose the early-exit segfault behaviour.
This is as ridiculous as it is in C. You can't be sure you're testing all the use-after-free cases, and if the code changes and you forget to update the test, then you have no idea that you could have introduced a use-after-free. Tests rot over time.
The more I read about Zig I feel like it's made for C programmers stuck in the Stockholm Syndrome of C and don't want out. (Speaking as a former C programmer.)
It’s interesting to note and probably underappreciated how many of these safety issues are addressed simply by having a garbage collector — eliminating manual memory management prevents all of these kinds of bugs except for data races. Of course there are situations where you cannot afford a GC, but for how many programs is avoiding a GC worth all the additional language complexity? Preventing data races is no small thing, but this observation certainly suggests that the approach of GC + tasks & channels + a good race detector is more powerful than it is commonly given credit for — think about how much user-facing language complexity it replaces. This sounds like a pitch for Go, and to some extent it is, but Julia takes very much the same approach for the same reasons.
One other underappreciated point is how to deal with other resources besides memory.
Go has a GC, which helps with memory, but you're entirely on your own for files, sockets, database connections, mutexes, channels, etc.
The features that Rust uses for memory management are fully general-purpose, and also help you with safely handling all other kinds of resources.
Consider how many ways you can misbehave with a Mutex in Go that are all caught by rustc at compile-time. This has nothing to do with memory management, but the same thing that prevents use-after-free is what also prevents using a mutex-protected value after releasing the lock.
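A small sketch of what I mean; the protected value is only reachable through the lock guard, so "use after unlock" is a compile error rather than a latent bug:

    use std::sync::Mutex;

    fn main() {
        let data = Mutex::new(vec![1, 2, 3]);

        let mut guard = data.lock().unwrap(); // acquire the lock
        guard.push(4);                        // the Vec is only reachable via the guard
        drop(guard);                          // dropping the guard releases the lock

        // guard.push(5); // error[E0382]: borrow of moved value: `guard`
        // No handle remains that could touch the data without relocking.
    }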
The power of software is in composing things via powerful abstractions. When you write a useful library in a PL with GC, you automatically make it unavailable to all applications that can't use GC. What's worse, other libraries that could have built something around yours have to choose whether to limit their applicability to applications with GC or find another way.
But if you have a clever language that navigates this tradeoff and lets you build powerful zero-cost abstractions (C++, Rust, Zig), then it attracts significantly more talent that compounds.
That’s a fair point. Something we’re very interested in is exposing libraries written in Julia without the full runtime, ditching the JIT or the GC if you don’t need them. It would be easy to write a libm replacement in Julia, for example, without using either of those.
Regarding talent, it depends on the kind of talent you’re talking about. Yes, systems programmers like Rust. On the other hand, needing to deal with a strict borrow checker excludes a very large number of people with numerical computing, data science and machine learning expertise (not all, but definitely most). So it cuts both ways.
I think this is a roughly fair assessment, but I also think it's important to contextualize memory safety. Ultimately, the goal here is to produce /correct/ software. Memory safety is a subset of this, but there are other aspects to correctness as well.
I really like Zig's approach of explicitness and fast iteration cycles. Fast compile times and the very flexible build system make me hopeful for a really slick workflow for embedded development, where Zig code can be used to deploy and test as well. For my own use I think it's a clear win.
On the other hand, the amount of damage poorly architected Zig code can cause is about as large as for poor C code. For typical enterprise code, the Rust compiler will make sure that many bad decisions will not even compile. There's still a risk of towering abstractions, but at least I could avoid spending as much time debugging hideous race conditions.
Just somewhat related to the article, but I think one thing that is often misunderstood about Rust is its borrow checker.
The borrow checker is not about memory safety but about aliasing guarantees.
It just happens that combining this with deterministic destructors (RAII) enables reliable "automatic manual memory management" (or whatever you want to call it).
And combining it with some clever auto traits (Send/Sync) happens to prevent data races (as always, provided no unsafe is used).
But the benefits are not limited to just that: not only memory-resource management but also other kinds of resource management benefit from this design.
Similarly, while Send/Sync is about preventing multi-threaded data races, there are single-threaded patterns with quite similar problems, e.g. "racing" between iterating a collection and changing it in the body of the iteration, and the aliasing guarantees make sure you don't have such problems either.
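A minimal example of that single-threaded case; mutating the collection during iteration is rejected at compile time:

    fn main() {
        let mut v = vec![1, 2, 3];
        for x in &v {       // immutable borrow of `v` for the whole loop
            println!("{}", x);
            // v.push(*x);  // error[E0502]: cannot borrow `v` as mutable
            //              // because it is also borrowed as immutable
        }
        v.push(4);          // fine once the iteration is over
    }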
Similarly, Rust's main pointer type (`&`) provides not only compile-time non-null guarantees but also compile-time guarantees about how the data can be accessed (dereferenceable, writable, etc.).
And then there is the choice to use the type system to prevent application-logic bugs in many ways.
So the bullet points in the table miss many dimensions.
But then, Zig is still a great language; it's just that trying to convince people it's good enough by pointing at not reusing allocations doesn't seem to be the best way.
Instead, look at the reasons people still use C today (not C++!) and what they conceptually like about it, and you might realize that many of those points still apply to Zig.
Honestly, Zig seems to be a great choice for WebAssembly or similar sandboxed systems, where the potential damage of use-after-frees or double-frees can be massively reduced.
I think it is very important that people have started to actually discuss and compare different approaches to safety, instead of just saying that since Rust is safer, we should throw out all C/C++ code and rewrite everything in Rust.
It is amusing to me because the last time I was using C, the problems Rust solves weren't the problems I had in C.
Deeply embedded code doesn't use malloc, doesn't use threading.
I could use a better type system, à la Ada, being able to say "this variable is of type distance-in-meters, this variable is of type time-in-milliseconds"; that would have cut the number of bugs by a huge amount.
But simple, unsexy type system changes like that aren't what language designers are focused on.
Who here has never confused milliseconds and seconds when passing a variable around? It's trivial for a compiler to catch with a half-decent type system, but few modern languages bother to try.
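For what it's worth, this is expressible today with newtypes; a Rust sketch (the wrapper names are made up for illustration):

    // Zero-cost wrappers; Millis/Secs are hypothetical names.
    #[derive(Clone, Copy, Debug, PartialEq)]
    struct Millis(u64);
    #[derive(Clone, Copy, Debug, PartialEq)]
    struct Secs(u64);

    impl From<Secs> for Millis {
        fn from(s: Secs) -> Millis {
            Millis(s.0 * 1000) // the unit conversion lives in exactly one place
        }
    }

    fn sleep_for(_d: Millis) { /* ... */ }

    fn main() {
        sleep_for(Millis(500));
        sleep_for(Secs(2).into());
        // sleep_for(Secs(2)); // compile error: expected `Millis`, found `Secs`
    }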
Even when writing modern code in newer languages, I rarely directly use threads, and if I need to pass data between them 95% of the time I can get away just doing a deep copy to avoid the hassles of sharing data between threads!
Obviously Rust is meant to solve different problems than the ones I face. I have friends who frequently write highly threaded code, but in my day-to-day, Rust doesn't offer much more safety.
(However, Zig does look super cool and interesting!)
Have you looked into the subset of D called "Better C"? I recently stumbled upon it and have been wanting to try it out. It seems to solve the exact same problem you're describing, though I don't know how good of a job it does at that.
Oh we had this long before Rust, and most of C++ usage in new applications was displaced by safer (among other things) languages.
I think the biggest thing was that university curriculums and mainstream app development platforms (like Microsoft) stopped pushing it as hard when the level of horror got past a certain point. It used to be pretty bad. Business apps being written using MS "Active Template Library" in C++ and then used as signed ActiveX plugins on IE6-only web pages etc.
Safety (memory and otherwise) isn't new, but during my CS curriculum, including a course on programming language theory, there was little/no mention of techniques to ensure safety in the space between C++ and Java. I probably would have pointed toward formal verification if someone said they needed safety guarantees in the absence of garbage collection and a potentially slow or bloated runtime.
Though I believe there were some languages with features to that end, at least research languages, they weren't that well represented. I think Rust's presence brought attention to the possibilities there, and an increasing number of people see the value of investigating and developing that niche.
Microsoft still is the main company pushing it hard (C++ use) despite all security reports, most likely due to how the Windows and Office teams don't accept anything else.
So basically you have the DevTools and Azure teams pushing for .NET, Java and other safer languages, while Azure Sphere has a C only SDK and WinUI/UWP push C++ above anything else, with some C++ only APIs.
> Temporal memory safety and data race safety. [...] Unique to rust. [...] add a significant amount of complexity to the language.
Rust seems to be a very complex language. Is all that complexity essential to providing memory safety without GC? Or would it be possible to have a significantly simpler language that is equally safe? A language that’s “safe & C-like” compared to Rust’s “safe & C++-like”?
You can write very simple Rust, it's worth trying it out if you haven't already.
I'm not a rust developer, but I've ported over a handful of python or golang projects to see how it works. I managed to write the code without understanding much about things like the borrow checker.
I'm certain my code is not as performant or elegant as it could be using some of the more complex tools and concepts in the language, but it is possible.
Technically true, but only really true if you don't use many dependencies. At some point you're going to use some dependency that uses async/await all over the place or really goes wild with generics and then it is definitely not simple.
Two examples:
* Heim (https://docs.rs/heim/0.0.11/heim/) is a great crate for getting system info, but it only uses async/await so you are thrown into that rather painful world even if you don't need it.
* Plotters (https://github.com/38/plotters) is a pretty great graph plotting library for Rust (the only one as far as I know), but they have definitely gone a bit overboard with the generics. Want to draw a scatter graph?
I tried simply calling `PointSeries::new()` and got a basically impossible-to-follow error about Rust not being able to infer the type `E`.
"You can write <adjective> <programming language>" is an answer that usually get rebutted by pointing out that a) you need to read a lot more code than you write and b) other people might not write <adjective> code or c) other people might have a different definition of <adjective>.
It’s also true that teams using languages with more features than they need can just take the parts that they do need. It’s not quite as ideal as having a language that’s perfectly suited to your use cases, but it works well enough.
For example, I’ve been writing JavaScript for 3+ years. I have yet to use the prototype chain directly, only through the use of the `class` keyword, and I’ve only reviewed code using it once. I hear C++ is similar in that teams use a slice of the available language features.
It’s all a matter of opinion but I actually find Rust a wonderful language to write in, given the right circumstances. Which usually means “without having to deal with lifetimes”.
I tried Go; I wanted generics and errors. I like C#, but not the ecosystem that comes along with it. And so on. So for me personally, Rust is a valid choice even when performance isn’t a first concern.
Reusability is an important feature: if you write a Rust library you can reuse it from most runtime-based languages with “native modules” (or whatever they are called in the language in question), exactly like a C library.
Because even slow Rust code might have better performance or memory usage characteristics than idiomatic code in these languages. Or because it still protects you from race conditions. Or because you like cargo more than, say, maven. Or because you want to learn the language.
I haven't read that article yet, but I wouldn't really call the simple Rust version idiomatic. Stdin::lines() incurs a string allocation for each line (fixing this takes three lines of source code), which can be quite significant. And garbage-collected languages will have faster allocation, so I'm not too surprised.
Of course, I know who the author of that code is. I just wanted to point out that it's not such a trivial comparison to make.
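For reference, the buffer-reuse fix mentioned above looks roughly like this: read into one String instead of allocating a fresh one per line:

    use std::io::{self, BufRead};

    fn main() -> io::Result<()> {
        let stdin = io::stdin();
        let mut handle = stdin.lock();
        let mut line = String::new();
        // read_line appends into the same buffer, so one allocation
        // simply grows to fit the longest line seen.
        while handle.read_line(&mut line)? != 0 {
            // ... process &line here ...
            line.clear(); // reuse the buffer for the next line
        }
        Ok(())
    }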
>Rust seems to be a very complex language. Is all that complexity essential to providing memory safety without GC?
The language features specific to memory safety i.e. the borrow checker, are essentially irreducible. It is also Rust's biggest piece of complexity, and the one that is hardest to learn. There is no simpler language inside Rust that has the same safety guarantees, unless you strip out other useful features (traits, async, etc.).
> There is no simpler language inside Rust that has the same safety guarantees
I’d argue there is: there’s reference counting. Rather than using references and fussing with lifetimes you could sprinkle Rc<> wherever it’s necessary. You’d take a performance hit but the code would be simpler to write.
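A tiny sketch of that style; shared ownership through Rc means no lifetime annotations at all, at the cost of a reference-count bump per clone:

    use std::rc::Rc;

    struct Config {
        verbose: bool,
    }

    fn main() {
        let cfg = Rc::new(Config { verbose: true });
        let a = Rc::clone(&cfg); // cheap pointer copy, no lifetimes to annotate
        let b = Rc::clone(&cfg);
        // The Config is freed when the last of cfg/a/b is dropped.
        println!("{} {}", a.verbose, b.verbose);
    }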
>So there could be a simpler language with Rust’s safety guarantees if you were willing to strip out traits, async etc?
Well yes, there exists a hypothetical C + borrow-checker language. But that language wouldn't really be significantly simpler, because the borrow-checker is the largest contributor to Rust's complexity. The only things you would have taken away are the more well-understood features, as they already occur in other languages.
The C/C++ comparison doesn't really work, because there is (to my knowledge) no single C++ feature which makes up the majority of its complexity over C. You could strip out independent features of C++ one at a time to return to a simpler language. Rust doesn't have the same property.
I don't have enough Ada experience to compare, but Rust is IMHO much easier and less complex than C++. Rust is renowned to have a steep initial learning curve, but that curve doesn't climb nearly as high as C++'s.
Interesting comparison. Long term we badly need something to replace C (or at least minimize its usage drastically), so perfect should not be the enemy of good.
I hope something like Zig gets widespread adoption, including in embedded/IoT/automotive environments. Especially automotive. We're moving more and more life-and-death scenario-type tools into software.
We are, and I hope for life-and-death situations people are willing to work a little bit harder to get the extra protections Rust provides ... or much harder, and formally verify their code in which the language you use no longer matters as much.
Considering that Rust exists, I really don't hope that Zig gets much adoption, at least until the language improves a lot in some key aspects.
There definitely is a design space for a language simpler than Rust that is easier to write, but Zig is too far toward the C side and has lots of unsafety that is trivial to introduce. It's an improvement over C, but IMO not enough.
I have tried to like the language, but sadly having to think about types and lifetimes robs precious energy that should be devoted to thinking about business rules and what I am actually trying to achieve.
In some niches Rust is perfect, but in every language thread on HN there's often someone who suggests using Rust whatever the use case. C, in that respect, is more flexible and gets out of the way much more, of course while being less safe, but it's easier to keep your mind on the goal rather than on figuring out the best memory-safe approach for this piece of logic.
Which is why I'm very excited for Zig. I don't want another C++. Give me safer C, thanks.
> I have tried to like the language, but sadly having to think about types and lifetimes robs precious energy that should be devoted to thinking about business rules and what I am actually trying to achieve.
I've always wanted a "shut up about memory safety for a while, just don't free anything, I want to find out if my code produces the right answer" mode in rustc.
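You can get surprisingly close to that today by just leaking: Box::leak hands back a 'static reference, so a prototype can ignore lifetimes entirely. A sketch, not a recommendation for production code:

    struct Parser {
        input: &'static str, // 'static because the input is simply never freed
    }

    fn main() {
        // Leak the buffer; the OS reclaims the memory at process exit.
        let data: &'static str =
            Box::leak(String::from("prototype input").into_boxed_str());
        let p = Parser { input: data };
        println!("{}", p.input);
    }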
This might just be me being naive, but if you have a business-rules heavy project, why not just use a garbage collected language? I can't think of many use cases where you need a systems programming non-GC language but also have to write tons of custom business logic.
Because, for example, a business-rules heavy project would also benefit from type safety and from a compiler that checks that return values are not ignored, that variables are defined correctly and not shadowed, and that errors are all handled. I'm not sure if there are many GC'ed languages that would do all that? These kinds of safety guarantees tend to come from compiled languages not GC'ed languages.
Beyond the correctness argument, also because the GC can really come back to bite you when you least expect, following the sudden "knee" of Little's Law. I've seen multi-minute pauses every few seconds even with V8's GC in production and it was not a pleasant experience. It cropped up, out of the blue, and in the end required a V8 core team member to advise and help comment out a few lines of C++ GC code that were overzealous.
In your first paragraph, you seem to be confusing interpreted languages and GC'd languages. Even Java has all of the features you've listed above, afaict.
> Because, for example, a business-rules heavy project would also benefit from type safety and from a compiler that checks that return values are not ignored, that variables are defined correctly and not shadowed, and that errors are all handled. I'm not sure if there are many GC'ed languages that would do all that? These kinds of safety guarantees tend to come from compiled languages not GC'ed languages.
GC is orthogonal to (as in: has nothing to do with) type safety.
Java is GCed; so are Scala, Kotlin, C#, and F#.
Even dynamic GCed languages towards the scripting side of things are moving to static typing: TypeScript, Python's mypy, Ruby's types (I forgot the name of the project).
Of course, but in the past these would tend to go hand in hand, and the context here is not only about type safety, but about checked behavior for ignored return values and for exhaustive switches, i.e. all syscall errors are handled, and the compiler (or interpreter) will crash at compile time or run time with an error if not. Do Scala, Kotlin, C#, F# have all those features? Do most Java versions also not allow integer overflow?
I'm responding really late because I didn't check my threads for a while, but in response to your first point, there's plenty of GC'd languages that can give you type safety and compiler checks. As others have pointed out, Java, TypeScript and Go are fairly solid compiled languages with GC. If you want more type safety, Scala, F#, Ocaml and Haskell all have extremely powerful type systems that I personally find very useful when working on complicated business logic (my personal favorite is Ocaml). In the more exotic space, Nim has a really cool cross between ARC and GC, and comes with dependent types for extra safety! GC is not perfect, and as you point out, a "stop the world" GC can cause more pain than is saved from not having to do manual memory management, but I think there's a lot of good work in the space that makes me think really hard about picking up a language that would require manual memory management.
Indeed I do, these days I spend most of my time on Elixir. But it's handy to have a lower level language that can compile statically for some complex sysadmin task. Go is fine, but a little too plain for my tastes.
Of course, but you don't need a lot of work to make the C compiler happy. The result might not be 100% safe, secure and mathematically proven, but sometimes you need to deliver, fast, not create the safest 1k lines of code on Earth.
To be honest I haven't used C in a long time, but I've been looking for a low level language that sparks as much joy as C does. Go, Rust ain't it, IMO.
In C you have to think about lifetimes (and, to a lesser extent, types) at specific times, typically after the "make it work" step of writing code. Rust, on the other hand, forces you to think about these things from the start, and this inhibits solving the actual problem first.
Now, there's an argument that front-loading these decisions may be beneficial overall, but I don't find that compelling either. After the "make it work" stage there's thinking about performance, comments, logging, etc., and you usually need to think about lifetimes again as they change at this point, so front-loading them has only added work overall.
This description doesn't match my experience of writing Rust at all.
During the exploratory phase of my projects, I very rarely run into nontrivial lifetime issues, and when I do, I can just put whatever data into an Arc and then it just works. Most of my exploratory code is just objects that own their data.
Later in a project, while doing a bunch of refactoring to handle performance, logging, etc. , I have a much easier time letting the compiler tell me when I've made mistakes with a value's lifetime rather than trying to keep the whole program in my head for the duration of the refactor.
I don't spend a lot of time thinking about lifetime issues during either phase of my projects. Mostly I just write the same sort of thing that I'd have written in other languages, and the compiler tells me when I've made a mistake.
I totally agree that Rust isn't the right language for every domain.
But especially in all those domains where memory safety is an issue , in my view it is currently the best option.
Rust forces you to think about memory safety and ownership, which is hard to adjust to for many. But it does so for a good reason.
In C you also have to think about lifetimes all the time, but the compiler lets you do whatever you want, and the issues instead have to get fixed when bugs pop up, or with static analysis tooling, etc.
"I don't want to think about lifetimes" is exactly how we end up with vulnerable and buggy software.
After the initial learning curve, Rust is a very productive language, exactly thanks to the powerful type system.
Like I said, I do wish for a simpler language that can provide similar guarantees, and I do think the design space is in reach, but Zig is (currently) not it.
If we are talking about domains where memory safety is an issue, then surely the category of memory safety must include OOM-safety, i.e. safe handling of out-of-memory conditions under overload?
Which, relevantly, is something where Zig excels. OOM is just another ordinary error to be handled, and indeed is an error that can be leveraged on a much more fine-grained basis than an OS-level allocation. Being able to gracefully handle OOM conditions is a dramatic improvement over the average C codebase; enforcing that graceful handling, as Zig does by requiring callers to handle errors as part of the return type and by including allocation failures in that category, is a godsend.
And taking this further, since Zig's convention around allocators is for them to be an explicit argument of all functions needing to allocate, it's trivial to write tests specifically to validate correct behavior in OOM conditions. There's even a custom allocator in the standard library for exactly this purpose.
Yes, and I think the matrix should really include these aspects of Zig's safety to be a fair comparison, because otherwise it's like evaluating Rust's safety but without mentioning the borrow checker.
Lots of things that must be memory-safe are run on top of the Linux kernel, which doesn't give you the OOM-safety you're looking for because of overcommit+OOM killer.
Anyway, for Windows and other platforms where this is a reasonable goal, there is work in progress to add this to Rust. See this RFC[1], which has been merged and whose implementation progress can be followed here [2].
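Parts of this are already usable on current stable Rust; for example, collections expose try_reserve, which surfaces allocation failure as a Result instead of an abort. A minimal sketch:

    use std::collections::TryReserveError;

    fn append_chunk(buf: &mut Vec<u8>, chunk: &[u8]) -> Result<(), TryReserveError> {
        // Report allocation failure to the caller instead of aborting,
        // so the program can shed load or degrade gracefully.
        buf.try_reserve(chunk.len())?;
        buf.extend_from_slice(chunk);
        Ok(())
    }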
Lots of things are done that way but there is also plenty of software for which OOM safety is a critical component of memory safety. As you say, not every platform is Linux, and if Rust will be moving towards OOM safety as a global default and making this explicit throughout the std lib, then I think we are both in agreement.
I personally like panics and assertions, but the "safety" of the approach would still have to depend on the characteristics of the system. I do not agree that panic-on-OOM should be considered "safe" at a global level hidden within a std lib, where there is no knowledge of the target domain.
For example, if an attacker could arbitrarily inject overload to restart rate-limiting processes and then abuse this to trivially brute-force OTP logins.
The definition of safety with respect to a resource in general always needs to include the safety of the system as it crosses the threshold i.e. in this case into out-of-memory, so if a system claims memory safety, the first thing I would want to ask is, what about OOMs?
Is it? I'd say "crashing cleanly" is oxymoronic; "crashing" strongly implies a failure of the process to clean up after itself. Any sort of crash can wreak havoc on whatever non-atomic operations were in flight.
> but sadly having to think about types and lifetimes robs precious energy that should be devoted to thinking about business rules and what I am actually trying to achieve.
That's a really strange argument, because as soon as you're not in a GCed language, you need to think about the lifetime of your objects. The big difference with Rust is that you can't make mistakes when doing so, because the compiler will catch it.
You don't have the mental burden that if you make a mistake everything will blow up and you can focus on your business rules instead.
Which is why Zig shipping with a functional C compiler with cross platform support is a brilliant idea. Just run `zig cc` and you don't even need `clang` around.
Kotlin has settled its future by marrying Android.
As Java evolves and Kotlin needs to cater to its Mountain View masters, upgrading your Java code file by file won't be enough, as many modern features don't exist on ART.
Kotlin sealed classes are planned to be rendered as JVM sealed classes when JVM sealed classes go out of preview (probably Java 17), and the same is planned for mapping Kotlin value classes as Project Valhalla user-defined primitive types.
All these features obviously won't be available if you target Android, but they will still be there for you if you don't.
And isn't it ironic that devs will be forced to use KMM between the JVM and ART, despite Kotlin's selling point of Java compatibility?
Faking JVM features on other platforms means that performance is not what you would expect when moving across them, and some surprises might happen when linking to libraries that use modern features.
> JetBrains went out of their way to make migrating to Kotlin as easy as possible: you can literally upgrade your Java project file by file.
Yes, offering a good path to switching (and conversely keeping the old voodoo part of the code no one wants to touch) is the way to go. And as I understand it, Zig offers this possibility as well.
The word "none" in this blog post is misleading, for the rows use after free, double free, and uninitialized memory. Criticism is of course welcome but let's make sure we get all the facts on the table so we're not arguing a straw man.
With regards to use after free and double free, this is solved in practice for heap allocations. The basic building block of heap allocation, page_allocator, uses a global, atomic, monotonically increasing address hint to mmap, ensuring that virtual address pages are not reused until the entire virtual address space has been exhausted. In practice, this is a very long time for 64-bit applications. The standard library GeneralPurposeAllocator in safe build modes follows a similar strategy for large allocations and for small allocations, does not re-use slots. Similarly, an ArenaAllocator backed by page_allocator does not re-use any virtual addresses.
This covers all the use cases of heap allocation, and so I think it’s worth making the safety table a bit more detailed to take this scenario into account. Additionally, as far as stack allocations go, there is a plan to do escape analysis and add this (optional) safety for stack allocations as well.
As far as uninitialized memory goes, Zig forces you to initialize all variable declarations, so deliberately uninitialized memory has the keyword undefined in its declaration. And in safe build modes, this writes 0xaa bytes to the memory. This is not enough to be considered "safe", but I think it's enough that the word "none" is not quite correct.
As for data races, this is an area where Rust completely crushes Zig in terms of safety, hands down.
I do want to note that the safety story in Zig is under active development (https://github.com/ziglang/zig/projects/3) and will be worth checking back in on in a year or two to see what has changed :)
Integer overflow is defined behaviour, however, so we can provide library types which are checked. And since D's templates are very clean to define, they are easy to use.
The null pointer thing is true; however, all the other mechanisms mentioned do help avoid bumping into them, and there will be more on the way.
If you want to provide the null checks yourself, D has Ada-style contract programming too.
WebAssembly has several flaws; one of them is the lack of bounds checking for memory accesses within the same linear segment. So while it is sandboxed, it cannot prevent the data from turning into garbage, and is thus open to attacks that change the outcome of public APIs based on internal memory state.
Yes, but when you put it like that, it feels like D makes use of a certain hardware protection unlike other languages. The article states "none" for those cases.
I think if Zig wanted to, it could introduce lightweight linear types using the concept of proof variables and interleaving from ATS. Since resource management is explicit in Zig anyways, there's not much additional overhead in "consuming" proof values to signal that you've dealt with a resource.
Not really: Rust's pointers are the `&`/`&mut` references, which are proven at compile time to be not just non-null but actually dereferenceable and, in the case of `&mut`, writable in the given context. Those are MUCH stronger guarantees than just "not null".
> 2. only when using tagged unions
Which in Rust are the default: untagged unions didn't exist in the language for quite a while, and they require unsafe, which heavily discourages their use.
Besides that, Rust enums are not tagged unions where you check a tag at runtime and then access the payload, or where you panic/throw an exception when you access the wrong type; they are incorporated into the type system and the language, giving quite a different experience from classical tagged unions.
Lastly, type confusion applies to more than just tagged-union-style access; it also covers subtype-style access, in which case Rust can use trait objects instead of sum types.
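To illustrate the enum point with a minimal sketch: the payload is only reachable through a match on the variant, so "reading the wrong variant" isn't even expressible:

    enum Value {
        Int(i64),
        Str(String),
    }

    fn describe(v: &Value) -> String {
        // The only way to reach the payload is through the matching arm,
        // and the match must be exhaustive.
        match v {
            Value::Int(n) => format!("int: {}", n),
            Value::Str(s) => format!("str: {}", s),
        }
    }

    fn main() {
        println!("{}", describe(&Value::Int(42)));
        println!("{}", describe(&Value::Str("hi".into())));
    }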
Besides that, many of the listed ways Zig can achieve more safety (whether applicable or not) are also applicable to C. And some of the checks Zig does can be "somewhat" achieved in C too, by combining non-standard compiler options and code analysis tools.
Don't get me wrong: Zig is a very interesting language, and I would argue the spiritual successor of C in how it's designed.
Still, I guess the main way to add more safety (and the like) to languages like Zig (or C) is to compile them to WebAssembly. The module isolation it provides, while still allowing calls to other functions without too much overhead, might lead to quite interesting trends in the future.
Safety, for me, is the confidence to use the thing: for me in my own code, but also for others on my team who may work on this code.
I mostly have experience building things in GC languages. But with Rust I managed to safely use [1]:
- stack references in threads
- kept mmap references alive until threads finish work
- zero copy xml parsing (from mmaped data!)
- SSE/AVX enabled searching
The Rust language empowered me to do these things with a high degree of confidence. Not one segfault or core dump, just lots of compiler errors.
I played with Zig. Admittedly, the small-ecosystem aspect is something all languages go through, and it would be a better experience with Zig-specific libraries. But Zig doesn't empower library authors to make a large category of bugs impossible; it leaves that to documentation. This is like C: I don't have enough confidence in myself to use it.
Brilliant people are building powerful, safe-ish, reusable libraries in Rust. For mere mortals like me, this is Awesome.
Namely, there is no guarantee that the bytes between `<page>` and `</page>` will be valid UTF-8. It may be the case that you only run this program with UTF-8 input, in which case, UB is never triggered. But it's worth pointing out here since there is nothing actually stopping your program from hitting UB.
Also, as long as you're bringing in the twoway crate, you might as well use it on lines 43 and 48 since you're just searching for a single needle.
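For readers following along: the checked conversion is the safe alternative here, turning invalid input into an error value instead of UB. A sketch, assuming the original program reached for the unchecked variant:

    // Hypothetical helper: interpret the bytes between the tags as &str.
    fn page_text(bytes: &[u8]) -> Option<&str> {
        // from_utf8 validates the bytes; invalid UTF-8 becomes None here
        // rather than undefined behaviour.
        std::str::from_utf8(bytes).ok()
    }

    fn main() {
        assert_eq!(page_text(b"hello"), Some("hello"));
        assert_eq!(page_text(&[0xff, 0xfe]), None); // invalid UTF-8 rejected
    }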
Ah, gotcha. Yeah, I haven't added reverse searching to aho-corasick yet. Ran out of steam.
Either way, my point here is to be a counter-balance. To be fair, you did say, "But with Rust I managed to safely use." But the code you posted is technically unsound. It's not a huge deal if you know you'll always be feeding the program valid UTF-8. But it is worth mentioning here in this HN thread that is specifically comparing the safety properties of competing programming languages. :-)
A problem for Zig is the combination of unsigned overflow being UB, implicit widening, and always performing operations at the type size of the operands.
For example, let’s say you have `a = b + c` with b and c being i8 and a being i32. This calculation is first performed as an 8-bit add, then extended to 32 bits. This is true for both Rust and Zig, but Rust requires an explicit cast to widen the result of `b + c`, making it obvious that an extension happens and that `b + c` is not performed in 32 bits. In Zig there is no such indication; you need to look up the definitions of b and c. Other problems occur as well that both C and Rust avoid in their own ways. Hopefully Zig can improve this situation.
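Concretely, in Rust the widening has to be spelled out, and the spelling determines the width at which the add happens; a small sketch:

    fn main() {
        let b: i8 = 100;
        let c: i8 = 100;
        // let a: i32 = b + c;     // error: mismatched types (i8 vs i32)
        // let a = (b + c) as i32; // compiles, but the add is 8-bit:
        //                         // 100 + 100 overflows i8 (panics in debug)
        let a: i32 = i32::from(b) + i32::from(c); // widen first: a 32-bit add
        assert_eq!(a, 200);
    }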
I cannot live without RAII. Zig is just a no-go for me without automatic resource cleanup. Without RAII, a lot of people forget cleanup, and that destroys safety.
https://github.com/ziglang/zig/issues/782
Zig is apparently a PL for folks with perfect memory who never make mistakes like those described in that ticket. "Lesser" programmers like me can choose another language.
I don't know Zig, at all, but I do know Rust and C, and I know what a UAF bug looks like. What does a UAF bug look like in Zig? That a modern memory-safe language could be vulnerable to the C-language UAF pattern is a surprising claim.
Ok, but that video suggests that Zig's allocator wires the program to segfault if you access the freed memory. With a 64-bit address space, I guess you can do this perpetually?
That's what I figured: the 64-bit address space ensures that they're just never going to reuse address space. Which, in turn, means that C-style UAFs are unlikely to be an issue. I think this page should probably capture that.
> The standard library includes a set of allocators which don't reuse allocations, preventing use-after-free, and which catch double-free. I'm not clear yet on how high the runtime and memory overhead are though, which will dictate when it is practical to use these.
I didn't include it in the table because I'm not yet convinced that the overhead will be low enough that people will actually ship software using those allocators. (All the zig programs I've written so far use the libc allocator and are definitely susceptible to UAF)
Perhaps I'll spend some time measuring it this week and post an update.
Zig is a C-like language that isn't memory safe. A UAF bug looks the same as it does in C where you manually call the allocator to get a block of memory, manually free it, then try to perform an access via a pointer to it.
To be fair, Zig provides a spectrum of memory safety, as the comparison table in the post makes clear. Sure, it isn't 100% memory safe, especially not around UAF, but it's still orders of magnitude safer than C. In the safety department it's not at all a "C-like language"; it's a massive leap forward.
Type confusion and memory lifecycle flaws are probably the dominant source of exploitable vulnerabilities at this point. I'm surprised to see it suggested that Zig is weak to them.
Hey Thomas, I would have thought you would have said that at this point JavaScript or Postel's law were probably the dominant source of exploitable vulnerabilities. You're right though, Zig is weak to them, but it's not all or nothing as with C. It's a spectrum, and having spent some time with the language, I think that for Zig's goals, it makes the right set of trade-offs.
But I think this page may be overstated? Again: I don't know anything about Zig, but I sure know how a UAF bug works. :) And it doesn't look like Zig is meaningfully susceptible to them? You can crash a Zig program with a UAF, but the actual vulnerability wants more than the crash: it wants the program making uncontrolled writes to live memory used elsewhere in the program, which is a condition I don't think is present in Zig as it's being described.
If that's the case, that bodes poorly for the claim that Zig is susceptible to C/C++-style double free vulnerabilities, too.
It would be genuinely weird to see a new language rolling out that had C/C++'s UAF problem.
(As was pointed out elsewhere: if you're using an external allocator, or the `c_allocator`, all bets are off. But so is unsafe code in Rust, I guess?)