Zig self-hosted compiler is now capable of building itself

AndyKelley · on April 16, 2022

Hello HN! Here is some context to decorate this announcement:

The Zig self-hosted compiler codebase consists of 197,549 lines of code.

There are several different backends, each at varying levels of completion. Here is how many behavior tests are passing:

       LLVM: 1101/1138 (97%)
       WASM:  919/1138 (81%)
          C:  740/1138 (65%)
     x86_64:  725/1138 (64%)
        arm:  490/1138 (43%)
    aarch64:  411/1138 (36%)

As you might guess, the one that this milestone is celebrating is the LLVM backend, which is now able to compile the compiler itself, despite 3% of the behavior tests not yet passing.

The new compiler codebase, which is written in Zig instead of C++, uses significantly less memory, and represents a modest performance improvement. There are 5 more upcoming compiler milestones which will have a more significant impact on compilation speed. I talked about this in detail last weekend at the Zig meetup in Milan, Italy[1].

There are 3 main things needed before we can ship this new compiler to everyone.

1. bug fixes

2. improved compile errors

3. implement the remaining runtime safety checks

If you're looking forward to giving it a spin, subscribe to [this issue](https://github.com/ziglang/zig/issues/89) to be notified when it lands in the master branch.

Edit: The talk recording about upcoming compiler milestones is uploaded now [1]

[1]: https://www.youtube.com/watch?v=AqDdWEiSwMM

erwincoumans · on April 16, 2022

Congratulations on the milestone!

Does using Zig over C++ lead to "less memory, and represents a modest performance"? Or was the C++ implementation a bit sloppy? (lacking data oriented design for instance)

Also, what specifically are you most excited about using Zig for?

AndyKelley · on April 16, 2022

Thanks :)

The new Zig implementation is certainly more well designed than the C++ implementation, for several reasons:

* It's the second implementation of the language

* It did not have to survive as much evolution and language churn

* I leveled up as a programmer over the last 7 years

* The Zig language safety and debugging features make it possible to do things I would never dream of attempting in C++ for fear of footguns. A lot of the data-oriented design stuff, for example, makes use of untagged unions, which are nightmarish to debug in C++ but trivial in Zig. In C++, accessing the wrong union field means you later end up trying to figure out where your memory became corrupted; in Zig it immediately crashes with a stack trace. This is just one example.

* Zig makes certain data structures comfortable and ergonomic such as MultiArrayList. In C++ it's too hard, you end up making performance compromises to keep your sanity.

Generally, I would say that C++ and Zig are in the same performance ballpark, but my (obviously biased) position is that the Zig language guides you away from bad habits whereas C++ encourages them (such as smart pointers and reference counting).

As for less memory, I think this is simply a clear win for Zig. No other modern languages compete with the tiny memory footprints of Zig software.

Some of the projects I am excited to use Zig for:

* rewriting Groove Basin (a music player server) in zig and adding more features

* a local multiplayer arcade game that runs bare metal on a raspberry pi

* a Digital Audio Workstation

jchw · on April 16, 2022

> * a Digital Audio Workstation

Woah.

A little bit ago I went to write a little tool that mucked with the FL Studio FLP format. It was easy enough to guess out the bits that I cared about, so I pretty much just did that using a couple quick projects with specific things in them. However, I did check to see if anyone else had mucked around with the FLP format, and couldn’t help but notice your name. Was pretty surprised as a very curious onlooker to the Zig programming language. You certainly seem to get around :)

That’s a little tangential, but I guess I mention it because I was actually wondering if this was ever something you planned on doing given the fact that it was clear you had dabbled with DAW stuff (forgive me for not knowing if you have a more rich connection to music production than just that; I never bothered to check.)

A DAW in Zig sounds like a kick-ass idea. I tried to write a sort-of DAW toy with friends in Rust and it was a lot of fun even if it mostly made me realize how challenging it could be. (And also how bad at math I am. It took me so much effort to feel like I could understand FFTs enough to actually implement them.) It makes me wonder if your Zig DAW would be open source? It would be a fun project to attempt to contribute to, if or when it ever came to fruition.

Exciting stuff. Congrats on the Zig milestone.

PaulDavisThe1st · on April 16, 2022

> It took me so much effort to feel like I could understand FFTs enough to actually implement them.)

Do you understand how little DSP is involved in writing a DAW? It has almost nothing to do with DSP and everything to do with application architecture, data management, threading and more.

jchw · on April 17, 2022

I think you may be reading too much into what I said; I was working on a DAW-like toy, not a full DAW that uses VST plugins, and because I chose to write DSP code, I had to understand it. I don’t know what a real DAW looks like because I am not a subject expert.

That said, Rust seemed pretty promising for writing the audio engine of a DAW due to the memory ownership model. It was relatively easy to come up with a way to architect a very basic lock-free audio thread and feel sure that it was at least memory correct. I have no idea how a real DAW avoids certain pitfalls in the audio thread; lock-free isn’t too hard, but avoiding allocations in all circumstances seems tricky.

PaulDavisThe1st · on April 17, 2022

Most DAWs do not avoid locks in RT context.

Take a listen to my interview with Justin Frankel of Reaper - somehow in our 3hr+ epic chat, he acknowledges that they don't avoid locks entirely. And Ardour, despite trying, also fails to do so. Anecdotally, from my conversations with other DAW developers, they also do not manage it 100%. As Justin put it, it's more about contention avoidance than lock avoidance.

Avoiding actual heap-based allocation is pretty easy in C++. You just have to want to do it, and remember to do it.

Memory correctness is really easy in the RT threads of a DAW precisely because they do (should) not allocate. You're dealing with pre-allocated memory blocks that do not change over time. It's almost the simplest case of memory mgmt that I know of within a DAW. I have seen several new-to-DAW developers adopting ill-advised schemes for memory mgmt in an RT context. The "list of blocks" is one pattern that I consider something to avoid (and unnecessary). Single-reader/single-writer lock-free (circular) FIFOs get you almost all the way there, for everything.
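For readers who have not seen one, a single-reader/single-writer circular FIFO can be sketched roughly as below. This is a minimal illustration in Rust, not code from Ardour or any other DAW; the type name `SpscRing` and the choice of `f32` samples are assumptions made for the example. All storage is allocated once up front, and neither `push` nor `pop` allocates or takes a lock.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical fixed-capacity single-producer/single-consumer ring of samples.
/// Storage is allocated once in `new`; `push` and `pop` never allocate or lock.
pub struct SpscRing {
    slots: Box<[UnsafeCell<f32>]>,
    head: AtomicUsize, // next index to read; written only by the consumer
    tail: AtomicUsize, // next index to write; written only by the producer
}

// Assumption: at most one thread calls `push` and at most one thread calls `pop`.
// The sketch does not enforce this; a real API would split producer/consumer handles.
unsafe impl Sync for SpscRing {}

impl SpscRing {
    pub fn new(capacity: usize) -> Self {
        // One slot is kept empty so that head == tail always means "empty".
        let slots: Box<[UnsafeCell<f32>]> =
            (0..capacity + 1).map(|_| UnsafeCell::new(0.0)).collect();
        SpscRing { slots, head: AtomicUsize::new(0), tail: AtomicUsize::new(0) }
    }

    /// Producer side. Returns false, without writing, when the ring is full.
    pub fn push(&self, value: f32) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        let next = (tail + 1) % self.slots.len();
        if next == self.head.load(Ordering::Acquire) {
            return false;
        }
        // Only the producer writes this slot, and the consumer cannot read it
        // until the Release store below publishes the new tail.
        unsafe { *self.slots[tail].get() = value };
        self.tail.store(next, Ordering::Release);
        true
    }

    /// Consumer side. Returns None when the ring is empty.
    pub fn pop(&self) -> Option<f32> {
        let head = self.head.load(Ordering::Relaxed);
        if head == self.tail.load(Ordering::Acquire) {
            return None;
        }
        let value = unsafe { *self.slots[head].get() };
        self.head.store((head + 1) % self.slots.len(), Ordering::Release);
        Some(value)
    }
}
```

When the ring is full the producer simply gets `false` back and decides what to do, which fits the "contention avoidance" framing: neither side ever blocks the other.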

jchw · on April 17, 2022

Interesting. I actually thought this was the case, but when I made a similar argument in a chatroom I got pretty severely roasted by people who were sure real audio engines never used locks in the audio thread. It seems like whichever way I go I’m wrong with audio :P but that’s OK, because I really am not pretending to be knowledgeable here, for me it was only ever for fun.

The main reason I felt Rust was nice was simply the re-assurance that I could not, even if I wanted to, accidentally cause a data race within the confines of safe Rust. It’s not that the actual code was hard, but it did force me to ensure that what I was doing with the data model was actually safe. It wound up guiding how data flow worked in the playback engine.

Anyway, thanks for the pointer to the interview. I don’t expect myself to be writing the next Reaper or Ardour so it is probably OK if I don’t quite grok what you’re getting at. But, I do find this stuff rather interesting, so I would love to have a listen when I get a chance.

aaaaaaaaaaab · on April 17, 2022

It's a fun exercise to implement FFT and other DSP algorithms, but for real-world software please stick to true and tested libraries like FFTW or FFTPACK instead of your own naïve implementation.

jchw · on April 17, 2022

I feel like there’s some kind of PTSD in the audio field. I was only talking about a fun toy project among friends and I already have two replies critiquing it presumably by subject matter experts. Just an observation, I’m not really that sore over it. But still, even with cryptography, where it is well-known you never roll your own code (and yet people do anyways) I would not expect people to get all flustered if I had said “toy EC-SRP implementation” because the word “toy” is meant to signify “I know that’s not what you would do in the real world.”

maleldil · on April 16, 2022

Thanks for the detailed answer! I have more questions, if you'll indulge me:

If I understood it correctly, you think smart pointers and reference counting are bad habits. Why? Especially the smart pointers bit.

Why does Zig use less memory than other languages? Is it inherent to Zig, or can it be reproduced in other languages?

lupire · on April 16, 2022

I think Zig prefers explicit memory management, because allocations may fail and should be handled explicitly, and because automatic deallocations lead to hard-to-predict lifetimes (excess memory usage, and bugs for resource handles that are destructed at hard to predict moments).

These are things that a "systems language" programmer should put in the work to do correctly/near-optimally, and not ask the compiler to just do something "good enough", like Python would.
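To make "allocations may fail and should be handled explicitly" concrete, here is a small sketch in Rust (the language of the other snippets in this thread) rather than Zig. The function name `make_buffer` is made up for the example; `Vec::try_reserve_exact` is the standard-library call that reports allocation failure as an error instead of aborting.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: builds a zeroed buffer, surfacing allocation failure
/// to the caller as an error value.
fn make_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // Fails with an error (rather than aborting) if the memory cannot be reserved.
    buf.try_reserve_exact(len)?;
    // Capacity is already reserved, so this resize does not need to reallocate.
    buf.resize(len, 0);
    Ok(buf)
}
```

The caller then has to decide what an out-of-memory condition means for the program, which is the explicitness being described here.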

coder543 · on April 16, 2022

> because automatic deallocations lead to hard-to-predict lifetimes (excess memory usage, and bugs for resource handles that are destructed at hard to predict moments).

I don't really feel this is the case in Rust.

varajelle · on April 16, 2022

But it is. If you have a String on the stack, its memory is only reclaimed at the end of the scope, while it often could be freed before. This is especially bad in async code around await points, since it means the memory needs to be kept alive longer than needed.

coder543 · on April 16, 2022

I disagree. It is substantially and unequivocally better to hold onto memory until the end of scope than to leak memory by default. How can anyone argue that leaking memory is a better default? That’s a ticking time bomb. Maybe someone is so optimistic as to believe they’ll catch every leak before shipping new code?

You can easily add a manual “drop” call in Rust at any point if you want to force an allocation to be freed sooner, but I speak from years of experience using Rust for work when I say that Rust’s RAII model is not problematic in practice. I’m not simply speculating or theorizing, and I have professional experience with a variety of languages at all levels of the stack. I personally don’t mind garbage collectors most of the time, but Rust is great when you need more control.

In C++, RAII can absolutely be problematic because you are able to easily do things that cause undefined behavior by accident, which is arguably worse than either leaking by default or the mere act of holding onto memory for the duration of a scope.

If you can propose a system which cannot be contrived to have any downside, that would be fantastic! In the real world, Rust’s approach to memory management is extremely pragmatic and beneficial. I’m sure someone will eventually improve on Rust’s approach, but “leak by default” isn’t it.

I honestly do enjoy following Zig… it is a fascinating language taking a really interesting approach to solving many problems, but its memory safety story is not where I want it to be yet. Leaking memory by default is technically safe, but it's not exactly endearing.

guidoism · on April 16, 2022

I’ve recently been writing personal code that leaks like a sieve. It’s just not worth my time to find every leak when the lifetime of the process is finite and short and it will only ever run on a machine with gigs of memory. I haven’t thought through your question enough but maybe a situation where memory usage would be super high if waiting until the end of scope? I’m probably trying too hard to come up with a situation but I have a gut feeling that freeing mid scope is important under certain circumstances to keep the code simple and understandable.

coder543 · on April 16, 2022

> I’m probably trying too hard to come up with a situation but I have a gut feeling that freeing mid scope is important under certain circumstances to keep the code simple and understandable.

I explained in my previous comment that you can explicitly "drop" any value at any time in Rust, if you choose.[0] But if you don't, it will still be dropped at the end of the scope. The developer has control, but the language will watch your back.

[0]: https://doc.rust-lang.org/std/mem/fn.drop.html
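A minimal sketch of the difference, with made-up names (`Noisy`, `drop_order`) used only to record the order in which values are dropped:

```rust
use std::cell::RefCell;

/// Hypothetical type that appends its name to a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order of events: an explicit mid-scope drop, some work,
/// then the implicit drop at the end of the scope.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let early = Noisy { name: "early", log: &log };
        let _late = Noisy { name: "late", log: &log };
        drop(early); // released here, before the rest of the scope runs
        log.borrow_mut().push("work");
    } // `_late` is released here automatically
    log.into_inner()
}
```

Calling `drop_order()` yields `["early", "work", "late"]`: the explicitly dropped value goes first, and the other one is still cleaned up at the closing brace without any further code.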

StefanKarpinski · on April 16, 2022

Heck, this is the entire memory management model of PHP, which I found shocking when I learned it, but makes sense given that the language is intended to generate web pages: just allocate, never reclaim the memory, then kill the process when you’re done.

coder543 · on April 17, 2022

Do you have any sources? Maybe some truly ancient version? PHP has done garbage collection (seemingly mixed with some reference counting, like Python) for at least 10 or 15 years... I didn't bother to keep searching for even older information, but nothing I saw indicated that PHP only released memory when the process for a request exited.

I don't even think most mainstream uses of PHP have done the process-per-request model for decades, but I could be wrong.

tored · on April 17, 2022

Correct, a garbage collector was added in PHP 5.3; prior to 5.3, PHP did only reference counting. PHP 5.3 was released in June of 2009.

Here is a series of articles that describes it in detail

https://www.php.net/manual/en/features.gc.php

Especially look at this example where the same script is executed with and without GC

https://www.php.net/manual/en/features.gc.performance-consid...

varajelle · on April 16, 2022

Of course you can call drop() manually, but almost nobody does or even thinks about it because that's not the way you program in a language with RAII.

Don't get me wrong. I do think that rust and c++ RAII is much more convenient and safe than the C or Zig way.

(I'd even prefer if you could annotate a given struct in Rust so the compiler could drop it as soon as it's no longer used, but that's not that simple)

coder543 · on April 16, 2022

I definitely wish that Non-Lexical Lifetimes would eagerly drop anything that doesn't implement Drop.

It would probably be a breaking change to automatically call an explicit Drop implementation anywhere other than the end of the current scope, so I think that would have to be left as-is. String doesn't implement Drop, so it could easily be dropped eagerly within the scope as soon as it won't be referenced again. Such a change would be roughly equivalent to any of the compiler optimizations that reorder statements in ways that should be unobservable.

______-_-______ · on April 17, 2022

100%. I would actually go further and say every value should be dropped immediately after its last use, including temporaries in the middle of statements, whether or not it implements Drop. Reuse the same rules that NLL uses. Breaking change yes, so do it next edition. It would lower memory usage in general and solve so many little pain points, not least of which is holding strings across await points.

varajelle · on April 17, 2022

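    fn foo(xx: &Mutex<String>) {
        // lock() returns a Result, and the guard must be mutable to reborrow as &mut.
        let mut lock = xx.lock().unwrap();
        let ptr = &mut *lock as *mut String;
        // The guard has to stay alive across this call even though `lock` is never named again.
        unsafe { use_from_c(ptr) }
    }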

You get the idea: you don't want the lock or the string to be dropped before the unsafe code, even if the actual string is no longer used. That's the breaking change. It's hard to detect automatically, so hard to justify even in an edition.

coder543 · on April 17, 2022

I generally agree with you (which is why I made an exception for values with a manual Drop implementation), but to advocate on behalf of the idea… I don’t think it would be too crazy to make a rule that any function that invokes “unsafe” (or is itself defined as an unsafe function) would fall back to the old scope-based drop rules. Worst case scenario is that you’d be dealing with the current level of memory usage efficiency, and you might need to manually call drop if you want something dropped earlier, but in most cases things would be eagerly dropped even sooner.

Outside of uses of unsafe, are there any other serious problems with eagerly dropping values that manually implement Drop? Maybe this fallback mechanism would also have to be invoked simply for casting to a raw pointer anywhere in the function. Either way, a list of exceptions to eager drop would arguably be better than not having eager drop, as long as the list was sound.

It would still be a breaking change, and would definitely require at least being restricted to a new edition. Some people currently use Drop as a kind of scoped “defer”, especially for things like instrumentation, so maybe it would be time for the language to introduce a proper “defer” statement that exists for that purpose instead of making Drop so lazy for everyone.
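For anyone unfamiliar with the Drop-as-"defer" idiom mentioned above, a rough sketch follows. `Defer` and `instrumented` are hypothetical names for this example, not a standard-library API.

```rust
use std::cell::Cell;

/// Hypothetical scope guard: runs the stored closure when it is dropped.
struct Defer<F: FnMut()>(F);

impl<F: FnMut()> Drop for Defer<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

/// Returns the counter value seen inside the function body; the guard bumps
/// the counter only when the scope ends.
fn instrumented(calls: &Cell<u32>) -> u32 {
    let _guard = Defer(|| calls.set(calls.get() + 1));
    // `_guard` is never used again, so an eager-drop rule would run the closure
    // right here. With today's scope-based rule it runs at the closing brace.
    calls.get()
}
```

With a counter starting at 0, `instrumented` returns 0 and the counter reads 1 afterwards. Code like this relies on the guard living until the end of the scope, which is why eager drop would be a breaking change for it.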

vrfvr · on April 16, 2022

I still don't get why one should do that in the first place

pcwalton · on April 17, 2022

This doesn't match the experience of GC'd languages. I've never heard of problems in practice that arose from treating all values present on the stack/registers as GC roots, even if those values are dead in the data flow sense.

_3u10 · on April 17, 2022

Stacks are a fixed size. Unless you are running out of stack space there is no point in “freeing” memory from the stack. Where in the stack the stack pointer points has no bearing on stack size for any OS I’m aware of.

I’m assuming by string you mean a stack allocated array of char and not a std::string.

varajelle · on April 17, 2022

I meant a std::string on the stack, which is just a handle for more memory on the heap

kaba0 · on April 17, 2022

You can just put a single pair of braces around it and that’s it.

For locks it is seriously the best way to manage them.

pjmlp · on April 16, 2022

And as we know from C, every developer is quite capable of taking care of possible use-after-free bugs.

Comevius · on April 16, 2022

The point of C and Zig is that you can solve any kind of problems with them. In particular there is much more to memory reclamation than garbage collectors or Rust's borrow checker. Some of the reclamation schemes offer wait-freedom, or efficiently support linearizable operations.

pjmlp · on April 16, 2022

Any kind of problem?

So how do you solve SIMD with C, without language extensions?

C is not a special snowflake.

Comevius · on April 16, 2022

SIMD is not a problem, it's hardware. Compilers tend to take advantage of the CPU-specific SIMD registers and instructions, or you can use them explicitly.

Writing generic SIMD code is more portable. C has libraries, and Zig also has vector primitives available through built-in functions.

pjmlp · on April 16, 2022

Any language can have libraries, C is not special here, nor is Zig.

In fact, a few GC enabled languages have explicit SIMD support.

turminal · on April 16, 2022

Are there languages that "solve" SIMD?

pjmlp · on April 16, 2022

ISPC for one.

https://ispc.github.io/

sitkack · on April 17, 2022

https://futhark-lang.org/

https://github.com/halide/Halide

maleldil · on April 16, 2022

What can C and Zig do that Rust can't?

Comevius · on April 16, 2022

Nothing if unsafe Rust is considered, though Zig does not require you to fight the language nearly as much. This is especially obvious with embedded. Zig's philosophy of no hidden control flow or allocation makes things simple. Simplicity is power.

For certain problems you would want a TLA+ specification for safety and especially liveness either way. It's not like Rust absolutely guarantees correctness in all cases.

Rust sits in a sweet spot between C/Zig and languages like Java, but it's not an appropriate replacement for either of them.

3a2d29 · on April 17, 2022

Certainly Rust sits where C++ is. Zig and Rust are definitely best used for different things, but I think writing Rust off as in between C and Java is inaccurate.

politician · on April 16, 2022

This is kind of a trick question because they are all Turing Complete, but so is Assembly. The best way to interpret your question, then, is not “what is possible” but rather “what is simple to express correctly” and “what is simple to get correct eventually”. Those are questions about how tricky it is to get something to compile in the language and which tools exist to help determine whether a compiled program does what’s expected and does not do what’s not expected.

maleldil · on April 18, 2022

It wasn't a trick question. I'm not asking it in the same tone as someone would ask 'What can C do that Java can't?'.

I believe that Rust can be as low level as you want. If you have to fight the language to accomplish something, it's because you're probably trying to do something unsafe, and Rust will let you do that if you pay the price.

I was looking for examples of real cases where Zig is a better option than Rust. And while Zig is a lovely language, its main points against Rust are that it lets you do unsafe things more easily, and it's easier to learn.

messe · on April 16, 2022

The general purpose allocator in the zig standard library has protections against use after free bugs.

pcwalton · on April 16, 2022

By quarantining all memory forever. This is not a scalable solution because keeping one allocation in a 4kB page alive will leak the whole rest of the page. And if you don't quarantine all memory forever then use after free comes back. If it were that easy to solve UAF then C++ would have solved it by now.

There is a scalable solution for UAF that doesn't involve introducing a lifetime/region system: garbage collection. Of course, that comes with its own set of tradeoffs.

AndyKelley · on April 16, 2022

The problem of a single allocation keeping a 4 KB page alive is something that you might commonly find with the C++ or Rust way of programming that encourages allocating many individual objects on the heap, but in Zig land this is pretty rare. In fact, compare a Zig implementation of a given application with an equivalent in any other language and you will find there is no contest with respect to the size of the memory footprint.

There are many cool possibilities that C++ has never explored and frankly I find your argument unimaginative.

pcwalton · on April 17, 2022

An allocator that can leak (4kB - epsilon) of memory for each allocation is broken. It's not a question of whether the language makes such allocations unusual: free() not actually allowing memory to be freed is a violation of the contract of a memory allocator.

From the man page (which I believe quotes ISO C): "The free() function shall cause the space pointed to by ptr to be deallocated; that is, made available for further allocation." If free doesn't do that because another allocation is pointing into that page, that's a violation of the contract. Standard quarantine doesn't violate the spec because the space will eventually become available no matter what, but perpetual quarantine does.

One way to fix the problem would be to round up all allocations to 4kB, but that's very wasteful and slow due to cache and page table traffic; you'd be better off from a performance point of view with a garbage collector. More promising is ARM MTE, though the tag is currently too small to really be called a solution to UAF as opposed to a mitigation. A 64-bit tag would be enough, but I'm not sure what the performance costs of that would be--I wouldn't be surprised if the increased memory traffic makes it slower than a GC.

gnuvince · on April 17, 2022

> way of programming that encourages *allocating many individual objects* on the heap

Quoting and bolding. I think this is the thing that I need to change most about my own style of programming.

pjmlp · on April 16, 2022

Just like C and C++ debugging allocators, so what is the improvement here?

charcircuit · on April 16, 2022

The lifetime of the memory a smart pointer manages is very predictable. It will have the same lifetime as the smart pointer itself (this is ignoring moves).

RustyConsul · on April 16, 2022

I think what he's saying is there is a way to carelessly use smart pointers in rust.

    pub enum List {
        Empty,
        Elem(i32, Box<List>),
    }

instead of:

    pub struct List {
        head: Link,
    }

    enum Link {
        Empty,
        Some(Box<Node>),
    }

    struct Node {
        elem: i32,
        next: Link,
    }

charcircuit · on April 17, 2022

Can you explain the difference between these approaches? Is it just that the first example allocates an extra u16 (tag of the tagged union) (ignoring any overhead)?

tialaramex · on April 17, 2022

Notice that if I make this alleged "List" with a single data item in it, my data lives in the List object I just made (probably on the stack), but an empty List gets allocated on the heap.

I thought Aria's "Entirely Too Many Lists" tutorial actually tries to build this, but it doesn't. She draws you the resulting "list" and then is like, OK, that's clearly not what we want, let's build an actual (bad, slow, wasteful) linked list as our first attempt.
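A small sketch of that point, using the first `List` definition from upthread. The helper names `boxes` and `one_element` are made up for the illustration.

```rust
/// The first definition from the thread: the head element lives inline.
#[allow(dead_code)]
pub enum List {
    Empty,
    Elem(i32, Box<List>),
}

/// Hypothetical helper: counts the Boxes (heap allocations) reachable from the head.
fn boxes(list: &List) -> usize {
    match list {
        List::Empty => 0,
        List::Elem(_, rest) => 1 + boxes(rest),
    }
}

/// A one-element list: the value 7 lives wherever the caller puts the `List`
/// (typically the stack), while the heap allocation holds only the `Empty` marker.
fn one_element() -> List {
    List::Elem(7, Box::new(List::Empty))
}
```

Here `boxes(&one_element())` is 1, and that single allocation stores no data at all, only the terminator.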

kaba0 · on April 17, 2022

I don’t really get your example, or what it has to do with Rust. It’s just a bad linked list, isn’t it?

tialaramex · on April 17, 2022

Are you talking about the example by RustyConsul? Because I'm not RustyConsul so I can't tell you what specifically RustyConsul intended with this example. Yes this is a bad data structure in Rust, or C, or presumably Zig. In the context of this thread I'm sure Zig's proponents will say that you'd never make this mistake in Zig.

RustyConsul · on April 17, 2022

My original point was referencing the use of smart pointers. They are indeed very smart, but can be used stupidly as a bandaid kind of like .clone().

The difference really lies in the fact that I now have some data stored on the stack (the element) and some data stored on the heap because it's recursive. Was just a random example of where it's poor practice. As others have noted, linked lists are a terrible data structure to begin with.

erwincoumans · on April 16, 2022

Great, thanks for those details! I primarily develop using C++ and avoiding pitfalls (smart pointers, exceptions, unintended memory allocations) takes a lot of effort.

I enjoy synthesizers (including Eurorack) and am looking forward to playing with a Zig DAW!

abbeyj · on April 16, 2022

If the union is untagged, how can it be determined (at runtime) that you've accessed the wrong field?

Spex_guy · on April 16, 2022

The compiler adds a tag in debug modes, but not in release modes.
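For comparison, the same idea can be approximated by hand in Rust. This is only a sketch of the concept and not how the Zig compiler implements it; the names `Raw`, `Active` and `Checked` are invented for the example. The tag field and the check exist only when `debug_assertions` is enabled.

```rust
/// Plain untagged union: nothing records which field is active.
#[derive(Clone, Copy)]
union Raw {
    int: i64,
    float: f64,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Active {
    Int,
    Float,
}

/// Hypothetical wrapper: carries a hidden tag in debug builds only.
struct Checked {
    raw: Raw,
    #[cfg(debug_assertions)]
    active: Active,
}

#[allow(dead_code)]
impl Checked {
    fn from_int(v: i64) -> Self {
        Checked {
            raw: Raw { int: v },
            #[cfg(debug_assertions)]
            active: Active::Int,
        }
    }

    fn from_float(v: f64) -> Self {
        Checked {
            raw: Raw { float: v },
            #[cfg(debug_assertions)]
            active: Active::Float,
        }
    }

    /// Panics in debug builds if `int` is not the active field; unchecked in release.
    fn int(&self) -> i64 {
        #[cfg(debug_assertions)]
        assert_eq!(self.active, Active::Int, "accessed inactive union field");
        unsafe { self.raw.int }
    }
}
```

In a debug build, `Checked::from_float(1.0).int()` panics at the point of the bad access; in a release build the wrapper is the same size as the bare union and the read is unchecked.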

a_t48 · on April 17, 2022

Neat, that's somewhere in between a `union` and a `std::variant`. You could build your own, but it's cool that it's a first class member of the language.

Karliss · on April 16, 2022

Can you clarify the union part? Did you mean that Zig allowed you to replace what would be an untagged union in C++ with a tagged union in Zig? Or does the Zig compiler have some kind of debug sanitizer mode which automatically turns untagged unions into tagged unions with checks?

dralley · on April 16, 2022

In the past they've talked about the speed and memory improvements they've gotten from using MultiArrayList in the compiler, storing tags separately from the unions themselves. If you have a union with a size of 16 bytes and you add a tag to that which is 1 byte, a lot of space is wasted due to padding. If you keep the tags in a separate array, both arrays are individually densely packed. Less memory wasted due to padding == less memory use overall and better utilization of cache.

But in terms of the implementation, this means working with untagged unions, because the tags are maintained externally.
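The padding arithmetic can be checked with a short sketch. The types below (`Payload`, `Tag`, `Tagged`) are invented for the illustration and use `#[repr(C)]` so the layout is predictable; on a typical 64-bit target the interleaved form takes 24 bytes per element against 17 for the split form.

```rust
use std::mem::size_of;

/// Hypothetical 16-byte untagged payload.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(C)]
union Payload {
    pair: [u64; 2],
    bytes: [u8; 16],
}

/// One-byte tag saying which payload field is active.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(u8)]
enum Tag {
    Pair,
    Bytes,
}

/// Tag and payload stored together: the tag is padded out to the payload's alignment.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(C)]
struct Tagged {
    tag: Tag,
    payload: Payload,
}

/// Bytes per element when tag and payload live in one array of structs.
fn bytes_per_element_interleaved() -> usize {
    size_of::<Tagged>()
}

/// Bytes per element when tags and payloads live in two separate arrays.
fn bytes_per_element_split() -> usize {
    size_of::<Tag>() + size_of::<Payload>()
}
```

Keeping a `Vec<Tag>` next to a `Vec<Payload>` therefore packs both arrays densely, at the cost of the payload array no longer knowing its own tags.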

tristan957 · on April 18, 2022

In debug builds, untagged unions become tagged behind the scenes as you described.

jcpst · on April 16, 2022

Thanks for all the context. Curious to know more about the concept/design of the DAW.

aaaaaaaaata · on April 16, 2022

Seconded.

PaulDavisThe1st · on April 16, 2022

> * a Digital Audio Workstation

ORLY?

Maybe you can put together a better, faster team (like Presonus managed to do for Studio One), but most current DAWs have been in existence for 20 years or more. Catching up with that is a challenge, and if you don't catch up then it's an interesting toy or half-a-DAW.

So why?

ps. obviously I am biased.

reitzensteinm · on April 17, 2022

This argument applies equally to programming languages, and Zig seemed to go OK?

PaulDavisThe1st · on April 17, 2022

Oh yes, it absolutely applies to all programming languages.

elcritch · on April 16, 2022

Impressive work getting Zig self-hosting. However:

> As for less memory, I think this is simply a clear win for Zig. No other modern languages compete with the tiny memory footprints of Zig software.

Is not true at all. There are several other modern languages that compete with Zig for small memory footprints.

geodel · on April 16, 2022

Would you tell us about those languages, or are they left as an exercise for the reader?

elcritch · on April 16, 2022

Sort of left it to the readers and hopefully encourage people to investigate for themselves. It's far too easy to get into "this benchmark vs this benchmark", etc. Memory usage is overall a complicated topic, which I think Andrew's comment doesn't do justice to. Granted the Zig team has done some impressive work, much of the memory usage & bloat in the C++ world comes from real world programs and libraries and the inevitable drift in program architecture over time.

However, I could believe Zig's stdlib and culture encourages low memory footprints, but there isn't anything novel in the language that makes it inherently lower memory footprint. Though to name a few languages I'd say that can be directly comparable are Rust, D (esp. BetterC mode), and Nim. Even Julia & Go in the right context. Though honestly I often prefer wasting a few hundred bytes of RAM here and there, even on a microcontroller, for pure convenience.

edit: forgot Odin. Another comment mentioned it, though I've never used it. It looks like it's used in production.

dralley · on April 17, 2022

Zig has a few extra tricks up its sleeve compared to most languages. The ones that come to mind are

* MultiArrayList

* arbitrary-size integers that make it easy to pack data very tightly

* easy access to several special purpose allocators which reduce the need for strategies like reference counting

elcritch · on April 17, 2022

Perhaps, though nothing there that's not been available for a while. To be fair Zig has integrated some of these features well.

* MultiArrayList is convenient and easy to do in Rust, D, Nim. Odin seems built around it.

* Arbitrary bit-size integers seem more gimmicky than anything and similar to struct bitfields in other languages (except Rust oddly). Most compilers don't even pack uint8/uint16's for performance reasons.

* Special purpose allocators are a bit more interesting. Still they don't provide actual memory safety AFAICT and can have fragmentation and performance penalties.

It'll be interesting to see larger Zig code bases emerge in different fields and see how the memory footprint compares in practice.

bonzini · on April 17, 2022

What? 8- and 16-bit are never 32-bit aligned nor sized.

elcritch · on April 17, 2022

To clarify I meant comparing how int8/int16's are packed in structs vs struct bitfields. Can't recall about the stack rules. Here's more discussion:

http://www.catb.org/esr/structure-packing/

https://github.com/Twon/Alignment/blob/master/docs/alignment...
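As a rough illustration of what those struct-packing write-ups describe, here is a small Rust sketch using C layout (`#[repr(C)]`). The struct names are made up, and the sizes it prints assume a mainstream target where `u16` is 2-byte aligned and `u32` is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

// C layout keeps declaration order, so the u32 forces padding after `a` and after `c`.
#[allow(dead_code)]
#[repr(C)]
struct Loose {
    a: u8,
    b: u32,
    c: u8,
}

// Same fields, widest first: the two u8 fields share the tail of one 4-byte slot.
#[allow(dead_code)]
#[repr(C)]
struct Tight {
    b: u32,
    a: u8,
    c: u8,
}

// Small integers keep their own size inside a struct; only the gaps are padding.
#[allow(dead_code)]
#[repr(C)]
struct Small {
    a: u8,
    b: u16,
    c: u8,
}

fn main() {
    println!("u8:  size {} align {}", size_of::<u8>(), align_of::<u8>());
    println!("u16: size {} align {}", size_of::<u16>(), align_of::<u16>());
    println!("Loose: {} bytes", size_of::<Loose>());
    println!("Tight: {} bytes", size_of::<Tight>());
    println!("Small: {} bytes", size_of::<Small>());
}
```

Field order, not the width of the small integers, is what decides how much padding ends up in the struct.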

Also ARM for example doesn't have 8/16 bit registers so int8 or int16 will use a 32bit register:

https://stackoverflow.com/a/23716920

Curiosity got to me; perhaps Zig had improved significantly. So I ran the first benchmark I found (kostya/benchmarks/bf) with both Zig and Nim. For the smaller input (bench.b) Zig did run with ~22% less RAM (about 20kB less).

However, for the larger input (mandel.b) Nim+ARC used ~25% less RAM in safe mode (put the other way, Zig used ~33% more): Nim 2.163MB with -d:release; Zig 2.884MB with -O ReleaseSafe; Zig 2.687MB with -O ReleaseFast. The Nim version requires 0.5MB less RAM even compared to ReleaseFast, and the code is ~40% shorter. I don't have time to try out the Rust or Go versions though.

edit: grammar

dralley · on April 17, 2022

Are the two using the exact same algorithm?

elcritch · on April 17, 2022

Yes, as far as I can tell [1]. It's a simple Tape algorithm. Neither had any crazy SIMD, threads, or custom containers. They use almost the same function names (and look the same as the Go version too). I used `time -l` (https://stackoverflow.com/a/30124747) for memory usage.

Note I ran the benchmarks locally (MacBook Air M1) because the reported benchmark uses the older (default) Nim GC while I only use Nim+ARC. I also had to fix the Zig code and it took a few tries to get the signed/unsigned int conversions working. I tried tweaking flags for both a bit as well to see how stable they were. Zig's memory usage was pretty constant. Nim had a small speed vs memory tradeoff that could be tweak-able, but the defaults used the least memory.

Overall I'd expect exact memory usage by language(s) to vary some by benchmark and one random benchmark isn't conclusive. Still I didn't find anything to indicate Zig is clearly better than other new generation languages. Manual memory management might actually be worse than letting a compiler manage it in some cases.

1: https://github.com/kostya/benchmarks/tree/master/brainfuck

Tozen · on April 21, 2022

In that category (of C/C++ alternatives) is also Vlang (https://vlang.io/). It would be closer to Odin than to Julia, Go, or Nim, although Vlang and Odin have a strong Go influence.

blippage · on April 17, 2022

Odin doesn't work on 32-bit ARM, which was a disappointment to me.

elcritch · on April 17, 2022

Bummer, looks like an intriguing language.

geophertz · on April 16, 2022

> written in Zig instead of C++, uses significantly less memory, and represents a modest performance improvement

That's particularly interesting considering the Rust compiler written in Rust has never been as fast as the original OCaml one.

pcwalton · on April 16, 2022

Huh? That's not true at all. It took over 30 minutes to compile the self-hosted Rust compiler with the OCaml compiler, when rustc was far smaller than it is today. rustboot was agonizingly slow, and one of the main reasons why I was so anxious to switch to rustc back in those days was compilation speed.

I was there and had to suffer through this more than virtually anyone else :)

geophertz · on April 18, 2022

My source: https://pingcap.com/blog/rust-compilation-model-calamity#boo...

> 7 femto-bunnies - rustboot building Rust prior to being retired

> 49 kilo-hamsters - rustc building Rust immediately after rustboot's retirement

> 188 giga-sloths - rustc building Rust in 2020

sanxiyn · on April 16, 2022

Well, the OCaml Rust compiler also didn't use LLVM; it used its own lightweight code generator. And I think the self-hosted Rust compiler frontend was in fact faster than the OCaml Rust compiler frontend.

parentheses · on April 16, 2022

With both projects, how much of the improvement is simply building for the second time?

sanxiyn · on April 16, 2022

For Rust, I think the improvement was almost entirely due to LLVM producing faster code. That's not applicable to the Zig case, since both the old and new compilers use LLVM. I don't know enough about Zig to answer.

kibwen · on April 16, 2022

The original OCaml compiler didn't have essentially any of the static analysis that Rust would eventually be known for. Rust in 2011 (when rustc bootstrapped) was dramatically different from what would later stabilize in 2015.

est31 · on April 16, 2022

I wonder how much this statement still holds. I've never used the OCaml bootstrap compiler but performance wise, the rust compiler has improved incredibly since the 1.0 release.

sanxiyn · on April 16, 2022

An apples-to-apples comparison is impossible because rustboot compiled a very different language. But I suspect a suitably updated rustboot would still be faster because compilation time is dominated by LLVM.

pcwalton · on April 16, 2022

Rustboot's code generator was generally slower than LLVM. I think in some small test cases it might have been faster, but when implementing stuff like structure copies rustboot's codegen was horrendously slow because it would overload the register allocator.

icsa · on April 16, 2022

> uses significantly less memory, and represents a modest performance improvement.

The reduced memory has significant value. Being able to do the same build on less expensive hardware, or do more with the same hardware, is a significant financial performance improvement.

joshbaptiste · on April 16, 2022

yup, ~17x memory reduction, 8.5GB -> 0.5GB according to the vid..

dleslie · on April 16, 2022

Happy to see the C backend coming along. LLVM is a major barrier to use on esoteric embedded devices.

sitkack · on April 16, 2022

You can also target C through Wasm.

https://github.com/WebAssembly/wabt/tree/main/wasm2c

exikyut · on April 16, 2022

My genuine question is what sort of code-size and/or performance impact the translation imposes.

The simple example in the README.md seems straightforward enough, but I wonder if there are any pathological explosions in practice.

sitkack · on April 18, 2022

That would be an interesting undergraduate paper. Perf, Size, by primary language, by toolchain, linker, post processor (dead code elimination, etc).

For a pathological explosion, you mean something like a Zip Bomb? Wasm and C are pretty close together in their semantics, Wasm hides the stack and prevents jumping into the middle of a function (CFI, Control Flow Integrity). I think the code bloat should be on the order of some multiple of the smallest interpreter.

I just did a quick scan of Wasm interpreters (3 in Rust, 1 in C)

    yblein/rust-wasm ~4kloc
    rhysd/wain ~16kloc
    paritytech/wasmi ~25kloc
    wasm3/wasm3 ~22kloc (C)

My hunch is that the expanded code would be approximately (2x-5x interpreter + bin.wasm). I just did a spot check with doom.wasm, I am wrong. The resulting expanded C code when compiled to Arm is 2x the wasm binary size.

    4.1M wasidoom.o
    1.8M wasidoom.wasm

https://en.wikipedia.org/wiki/Zip_bomb

einpoklum · on April 16, 2022

* What kind of tests are these "behavior test"?

* Is that a list of compilation targets?

* If not all behavior tests pass, does that not mean that the compiler fails to compile programs correctly?

Please indulge those of us who are not familiar with self-hosting compiler engineering.

Spex_guy · on April 16, 2022

> What kind of tests are these "behavior test"?

Snippets of zig code that use language features and then make sure those features did the right thing. You can find them here: https://github.com/ziglang/zig/tree/master/test/behavior

> Is that a list of compilation targets?

Mostly. Pedantically, it's a list of code generation backends, each of which may have multiple compilation targets. So for example the LLVM backend can target many architectures. The ones that are architecture specific are currently debug-only and cannot do optimization.

> If not all behavior tests pass, does that not mean that the compiler fails to compile programs correctly?

Some tests are not passing because they cause an incorrect compile error, others compile but have incorrect behavior (miscompilation). Don't use Zig in production yet ;)

(edit: fix formatting)

kristoff_it · on April 17, 2022

To add to what Spex said: also some of those tests check language features that the compiler code doesn't exercise, like async/await. This means that the compiler is able to build itself, but is not able to build every possible valid Zig program. We're getting there though :^)

riffraff · on April 16, 2022

it's so surprising to hear there was a Zig Meetup in Milan, I'd not expect a large enough community to exist there, pretty cool!

vanderZwan · on April 16, 2022

Probably a significant chunk of the larger European community was represented there too - don't forget that traveling to EU countries is relatively easy for EU citizens

carapace · on April 16, 2022

Congratulations!

jonpalmisc · on April 16, 2022

I think Zig’s compatibility with C is such a valuable feature.

I also wish we could rewrite everything in a modern language, but the reality is that we can’t and that if we could, it would take a LONG time. The ability to start new projects, or extend existing ones, with a modern and more ergonomic language—Zig—and be able to seamlessly work with C is incredible.

I look forward to the self-hosted compiler being complete, and hopefully a package manager in the future. I’d really like to start using Zig for more projects.

jitl · on April 16, 2022

Zig as the “Kotlin of C” makes it very appealing. Kotlin has seen fantastic adoption in JVM projects because you can convert files one at a time from .java to .kt, with only a modicum of one-time build system shenanigans up front. Then your team can gain experience gradually, fill in missing pieces like linters over time, all without redesigning your software.

What zig offers is even better - because zig includes a CC, you can actually reduce complexity with zig by getting a single compiler version for all platforms, rather than a fixed zig + each platform’s cc. And with it, trivially easy cross-platform builds - even with existing C code. That’s cool! Go has excellent cross-compilation, but Go with C, not so much.

Rust is a powerful tool, but it’s a complex ecosystem unto itself - both conceptually and practically. It has great interoperability frameworks, but the whole set of systems comes at a substantial learning cost. Plus, porting an existing software design to Rust can be a challenge. It’s more like the “scala of C” if we’re trying to stretch the analogy past the breaking point.

pjmlp · on April 16, 2022

Kotlin has seen fantastic adoption in Android projects, because of the way Google pushes it, while stagnating Android Java on purpose.

In the JVM world, not really.

https://www.infoq.com/news/2022/03/jrebel-report-2022/

holoduke · on April 16, 2022

Are they pushing? Most documentation for Android dev is still Java. Or by default Java. It's only the IntelliJ guys pushing for Kotlin by creating a lock-in in their IDE. One reason I refuse to use it.

pjmlp · on April 16, 2022

You have missed all the Kotlin only Jetpack libraries, NDK documentation now using Kotlin, Jetpack libraries originally released in Java now rewritten in Kotlin.

They are still using Android Java on the system layers, because they aren't rewriting the whole stack.

Even the update to a Java 11 LTS subset on Android 13 is most likely caused by the Java ecosystem moving forward, rather than by the willingness of the Android team to support anything other than what can be desugared into Java 8 / DEX somehow.

You are right in relation to JetBrains, they even acknowledge it on their blog post introducing Kotlin.

https://blog.jetbrains.com/kotlin/2011/08/why-jetbrains-need...

"And while the development tools for Kotlin itself are going to be free and open-source, the support for the enterprise development frameworks and tools will remain part of IntelliJ IDEA Ultimate, the commercial version of the IDE. And of course the framework support will be fully integrated with Kotlin."

sedatk · on April 16, 2022

> You have missed all the Kotlin only Jetpack libraries, NDK documentation now using Kotlin, Jetpack libraries originally released in Java now rewritten in Kotlin.

Also, Oracle's lawsuit against Google for copying Java APIs.

pjmlp · on April 16, 2022

This doesn't compute, because Kotlin is heavily dependent on the Java ecosystem, regardless of how Google screwed over Sun and the Java community with their Android Java.

sedatk · on April 16, 2022

I still see it as a passive aggressive move for Google to get back at Oracle. They can't back away from Java API completely but they can hurt Oracle by discrediting Java language.

carstenhag · on April 16, 2022

As an Android Dev: none of us wants to do java projects anymore. If we had support for some recent versions maybe, but as it is, there's no going back.

0 of our recent or current projects still use java.

Google is either moving/extending libs to natively integrate with kotlin (numerous -ktx libs) or they are kotlin-only (Compose) anyway.

I don't really see the Jetbrains lock-in thing, because: Android Studio is free, you can use any other IDE with syntax highlighting and the terminal to run tests & to compile.

If you want to blame someone for locking in android devs into Android Studio, it would be Google, because they build the previews into Android Studio afaik. But you would have the same criticism at Apple/XCode. Supporting one IDE is already tough I guess.

Grimburger · on April 16, 2022

As someone not familiar with android development, this is somewhat confusing because I always thought Google was pushing dart/flutter for mobile these days?

Or is it both at once?

jitl · on April 16, 2022

Dart/Flutter is their React Native competitor. It’s in the same space as Xamarin - A secondary language and UI toolkit whose selling point is rapid development for multiple platforms.

Kotlin is to Java (on Android) as Swift is to Objective-C (on Apple) - the successor primary platform language.

Grimburger · on April 16, 2022

Thanks, that's a solid summary.

Feels like an environment that moves so quickly (to someone like me anyway). Can barely keep up.

Aldo_MX · on April 18, 2022

I think it is better to compare Flutter with Apache Cordova / Ionic Capacitor, as Flutter actually has a DOM tree behind the scenes, but instead of using a Web View to render the application Flutter renders the app directly to a GPU Accelerated surface using Skia.

charcircuit · on April 16, 2022

From talking to a Googler the reason why Google devotes resources to Kotlin is that there was and still is a large external demand from the Android development community for Kotlin.

pjmlp · on April 16, 2022

Kotlin adoption was triggered from inside, with some anti-Java attitude.

https://talkingkotlin.com/the-first-kotlin-commit-in-android...

charcircuit · on April 16, 2022

My information was from someone from the team working on jetpack compose. So maybe the answer I was given comes from a different context.

hota_mazi · on April 16, 2022

Not sure where you got that impression from, the Android team has never pushed, let alone mentioned, Dart/Flutter, ever. All the Flutter advertising you hear is from the Flutter team.

Kotlin and Java are the main languages supported on Android.

exikyut · on April 16, 2022

Huh. I read about this the other day as well - https://news.ycombinator.com/item?id=30842602:

> If only [Oracle] hadn't sued Google, Java would still have been the pre-eminent language for Android development. Sadly Android is stuck at legacy Java 8 permanently now. So, modern Java is stuck as a server-side language with dozens of competitors.

A reply argued that Android is on Java 11 now, and then you noted (hi!) that it's "a subset". Huh.

I'm trying to get a handle on understanding the ramifications of the legal/licensing situation, and the actual concrete impact on Java's use in Android. The subject seems somewhat murky and opaque. Is there possibly a high-level disambiguation about what's going on published anywhere?

kaba0 · on April 17, 2022

The whole Oracle google lawsuit had nothing to do with modern Android’s use of Java — java has been open-sourced since and for a long time now (by oracle themselves).

The lawsuit was about Sun’s license that explicitly demanded a purchase for use on mobile devices for their Java programming language (as that was the area they wanted to get money from). Google instead copied most of the APIs and called it a day, and Oracle bought Sun and pursued the lawsuit.

But since the license changed in the meantime so that OpenJDK is completely open-source and has the same license as the Linux kernel, it was all about an older state of things.

pjmlp · on April 16, 2022

The set of Java 11 LTS features and standard library missing from Android 13 is left as exercise for the reader, they are relatively easy to find out, hence subset.

You can check the Javadoc for the standard library, and the JVM specification. Then compare with DEX/ART and the Android documentation for Java APIs.

jitl · on April 16, 2022

That poll pours a cold bucket of water on “fantastic adoption” among all the respondents, but compare adoption of Kotlin and Java releases after Kotlin’s release. 1 in 6 respondents using language versions after 2016 are using Kotlin. I don’t think that’s too shabby.

pjmlp · on April 16, 2022

Java is still Java, regardless of the version, otherwise we should add Kotlin versions to the discussion as well.

richardfey · on April 16, 2022

The "Kotlin of C", what a beautiful metaphor!

kristoff_it · on April 16, 2022

> I also wish we could rewrite everything in a modern language, but the reality is that we can’t and that if we could, it would take a LONG time. The ability to start new projects, or extend existing ones, with a modern and more ergonomic language—Zig—and be able to seamlessly work with C is incredible.

That's the Maintain it With Zig approach :^)

https://kristoff.it/blog/maintain-it-with-zig/

mlinksva · on April 16, 2022

Sounds compelling. Is there a list of projects following this advice anywhere?

Tozen · on April 16, 2022

Zig is not the only one. Other newer languages like Vlang, Odin, Rust, Nim... offer strong C interop.

jonpalmisc · on April 16, 2022

Can’t speak about the others, but Rust’s C interop is nothing like Zig’s, not to mention that Zig can also compile C.

Shoop · on April 16, 2022

For reference:

Tracking issue for overall progress on the self-hosted compiler: https://github.com/ziglang/zig/issues/89

Zig's New Relationship with LLVM: https://kristoff.it/blog/zig-new-relationship-llvm/

fouronnes3 · on April 16, 2022

Are there any languages out there that can only be compiled by a compiler written in their own language? Presumably because the original pre-dogfood compiler stopped being maintained years ago. So that if we somehow lost all binaries of the current compiler, that language would effectively be lost?

richardfey · on April 16, 2022

Most modern languages cannot be compiled anymore with their original pre-dogfood compiler, but we have the sources of the older versions so you can bootstrap them in a sequence.

retrac · on April 16, 2022

The Haskell GHC compiler is written mostly in Haskell. Rust's compiler is also written in Rust, these days. TypeScript's compiler is in TypeScript.

It's a pretty common state of affairs, actually. Often arises out of the second or third implementation of the compiler being much better than the first attempts, probably coupled with the momentum of people using the language who can contribute to the tooling because it's in the same language.

amelius · on April 16, 2022

If your language (call it X) can only be compiled by a compiler written in X, then you can always create an X-to-C transpiler (it doesn't need to be efficient, and it can even leak memory, as long as it can complete the bootstrapping process).

fasquoika · on April 16, 2022

GHC is nearly impossible to bootstrap if you don't consider the vendored transpiled C code to be source. Versions of GHC not dependent on GHC were never public AFAIK.

https://elephly.net/posts/2017-01-09-bootstrapping-haskell-p...

dottedmag · on April 16, 2022

C

coder543 · on April 16, 2022

A lot of popular C compilers are actually written (at least partially) in C++ these days.

dottedmag · on April 17, 2022

In this case, C++?

kaba0 · on April 17, 2022

https://guix.gnu.org/manual/en/html_node/Preparing-to-Use-th...

The Guix project likes bootstrappability very much. They basically host a tiny assembly C-compiler (only for a subset of C) which can compile a C compiler written in C for the whole subset that can bootstrap the whole ecosystem.

dottedmag · on April 18, 2022

This link is better: https://savannah.nongnu.org/projects/stage0 (bits to asm to C, and then everything else follows).

viraptor · on April 16, 2022

You're commenting on an article about Zig, which became self-hosted and can compile C. (There are also lots of other C compilers available.)

dundarious · on April 16, 2022

Zig compiles C/C++ by deferring the vast majority of the work to libclang, which is written in C++. Also note Zig is self-hosted when using the LLVM backend, which means deferring to C++ for much of the code generation. There is no "end-to-end Zig" self-hosted compiler yet, because the Zig native backends are not as near completion. See the creator's comment about the breakdown: https://news.ycombinator.com/item?id=31052234. (I'm excited about this progress, so this is not meant as any kind of knock on Zig, which I think is quite impressive)

But you're right that C is not a good example.

skrebbel · on April 16, 2022

TypeScript

mkl · on April 17, 2022

Nope: https://github.com/swc-project/swc

cercatrova · on April 16, 2022

I was thinking of learning Rust but it seems a bit overkill due to manual memory management as compared to languages with similar speed like Nim, Zig, and Crystal. How would one compare these languages?

Is it worth learning Rust or Zig and dealing with the borrow checker or manual memory management in general, or are GC languages like Nim or Crystal good enough? I'm not doing any embedded programming by the way, just interested in command line apps and their speed as compared to, say, TypeScript, which is what I usually write command line apps in.

lijogdfljk · on April 16, 2022

It's funny i came to Rust from Go, Python, NodeJS, etc after a combined .. 15 years or so. I've been using Rust full time (work & home) for ~2 years now.

Obviously i'm biased, but i quite enjoy it. I find i am more efficient now than before, because it manages to give me the ease of the "easier" languages quite often with a rich set of tooling when i need to go deeper.

Personally i feel the concern over the borrow checker is way overblown. 95% of the time the borrow checker is very simple. The common worst case i experience is "oh, i'm borrowing all of `foo` when i really want `foo.bar`" - which is quite easily solved by borrowing just the `.bar` in a higher scope.
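For readers who haven't hit this yet, a minimal sketch of that "all of `foo` vs just `foo.bar`" case might look like the following. `Foo`, `bar`, `hits` and `record` are made-up names for illustration only.

```rust
struct Foo {
    bar: Vec<u32>,
    hits: u32,
}

impl Foo {
    // Going through a method borrows all of `foo` for as long as the result lives.
    fn bar_mut(&mut self) -> &mut Vec<u32> {
        &mut self.bar
    }
}

fn record(foo: &mut Foo, value: u32) {
    // This version is rejected by the borrow checker, because `foo` as a whole
    // is still mutably borrowed through `bar_mut` when `hits` is updated:
    //
    //     let bar = foo.bar_mut();
    //     foo.hits += 1;
    //     bar.push(value);

    // Borrowing just the field leaves `foo.hits` free to use.
    let bar = &mut foo.bar;
    foo.hits += 1;
    bar.push(value);
}

fn main() {
    let mut foo = Foo { bar: Vec::new(), hits: 0 };
    record(&mut foo, 7);
    foo.bar_mut().push(8); // fine: this borrow ends with the statement
    println!("{:?} after {} recorded hit(s)", foo.bar, foo.hits);
}
```

The compiler tracks borrows of individual fields separately, but it cannot see through a method signature, which is why the field access works where the method call does not.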

The lifetime jazz is rarely even worth using when compared to "easier" languages. Throw a reference count around the data and you get similar behavior to that of them and you never had to worry about a single lifetime. Same goes for clones/etc.

I often advocate this. For beginners to use the language like it was a GC'd language. A clone or reference count is often not a big deal and can significantly simplify your life. Then, when you're deeper into the language you can test the waters of lifetimes.
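A small sketch of that "use it like a GC'd language" approach, assuming single-threaded code: shared data goes behind `Rc`, and no struct needs a lifetime parameter. `Settings`, `Job` and `make_jobs` are hypothetical names.

```rust
use std::rc::Rc;

struct Settings {
    name: String,
}

struct Job {
    settings: Rc<Settings>, // shared ownership, no lifetime parameter needed
}

impl Job {
    fn describe(&self, id: u32) -> String {
        format!("job {} for {}", id, self.settings.name)
    }
}

fn make_jobs(settings: Settings, n: usize) -> Vec<Job> {
    let shared = Rc::new(settings);
    // Rc::clone only bumps a counter; the Settings value itself is not copied.
    (0..n)
        .map(|_| Job { settings: Rc::clone(&shared) })
        .collect()
}

fn main() {
    let jobs = make_jobs(Settings { name: String::from("demo") }, 3);
    // Each job holds one strong reference; the local `shared` was dropped on return.
    println!("owners: {}", Rc::strong_count(&jobs[0].settings));
    println!("{}", jobs[2].describe(2));
}
```

The cost is a counter update per clone and a heap allocation for the shared value, which is usually negligible next to what a GC'd language would do anyway.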

So yea. Your mileage will vary i'm sure. But i find Rust to be closer to GC'd languages than actual manual languages, in UX at least. You won't screw up and leak or introduce undefined behavior, which is quite a big UX concern imo.

capableweb · on April 16, 2022

The perspective of someone who has been learning Rust (but not professionally) during the last few months:

- The borrow checker is one of the easier parts of Rust to grok, it's just as you say, not that complicated in the end.

- Traits are more annoying to understand and find in source code when they can get added from anywhere, and suddenly your code gets extra functionality, or it's missing the right one unless you import the right crate but there is no "back-reference" so you're not clear what crate the code actually comes from.

- Crates/libraries are harder to grok with their mod.rs/lib.rs files and whatnot, in order to structure your application over many files.

- Macros are truly horrible in Rust, both to write and debug, but then my only experience with macros are with Clojure, where writing macros is basically just writing normal code and works basically the same way

- Compilation times when you start using it for larger projects tend to be kind of awful. Some crates make this even worse (like tauri, bevy) and you need to be careful about what you add, as some can have a dramatic impact on compilation speed

- The async ecosystem is not as mature as it seems on first look. I'm really looking forward to it all stabilizing in the future. Some libraries support only sync code, others only async but only via tokio, others using other async libraries. Read somewhere that eventually it'll be possible to interop between them, time will tell.

- Control flow with Option/Result and everything that comes with it is surprisingly nice and something I'm replicating when using other languages (mainly a Clojure developer by day) now.
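As a small illustration of the Option/Result point in the last bullet, here is a sketch of that style of control flow. The function names (`parse_pair`, `checked_ratio`, `describe`) are invented for the example.

```rust
use std::num::ParseIntError;

// `?` returns early with the error, so the happy path reads top to bottom.
fn parse_pair(a: &str, b: &str) -> Result<(i32, i32), ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok((x, y))
}

// Option makes "no answer" a value the caller has to handle.
fn checked_ratio(x: i32, y: i32) -> Option<i32> {
    x.checked_div(y)
}

fn describe(a: &str, b: &str) -> String {
    match parse_pair(a, b) {
        Ok((x, y)) => match checked_ratio(x, y) {
            Some(q) => format!("{} / {} = {}", x, y, q),
            None => String::from("division is undefined"),
        },
        Err(e) => format!("bad input: {}", e),
    }
}

fn main() {
    for (a, b) in [("10", "2"), ("1", "0"), ("ten", "2")] {
        println!("{}", describe(a, b));
    }
}
```

Every failure path is visible in the types, so there is no hidden exception flow to keep in your head.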

My development journey was PHP -> JavaScript -> Ruby -> Golang -> Clojure, doing them all in the capacity of backend/frontend/infrastructure/everything in-between, freelancing/consulting/working full-time at long term companies, so take my perspective with that in mind. Rust would be my first "low-level" language that I've used in larger capacity.

Georgelemental · on April 16, 2022

`rust-analyzer` lets you find the trait/impl that provides a method, if you don't have it in your IDE you should get it.

capableweb · on April 17, 2022

True, that does help when you want to dig into things. I think my main pain-point here is that without extra tooling, the reference is not surfaced. Compare this to Clojure, where every reference is explicit, things normally don't get "magically" created in your scope. Just by searching for a var in the current file upwards (up to the `ns` declaration), you can find out where code is coming from, which is not always possible in Rust.

jackosdev · on April 17, 2022

It takes you to the trait definition but not the specific implementation. There is an open issue; it looks like a hard problem to solve, but IntelliJ does it so it must be doable.

klabb3 · on April 16, 2022

> Personally i feel the concern over the borrow checker is way overblown. 95% of the time the borrow checker is very simple.

I have been using Rust professionally as well and had a different experience. For anything singlethreaded I agree with you. For any asynchronous behavior, whether it's threads or async or callbacks, the borrow checker gets very stubborn. As a response, people just Arc-Mutex everything, at which point you might as well have used a GC language and saved yourself the headache.
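For context, the "Arc-Mutex everything" pattern looks roughly like this. It's a minimal sketch using only the standard library; `count_in_parallel` is a made-up function name.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn count_in_parallel(workers: usize, per_worker: u64) -> u64 {
    // Arc gives every thread shared ownership; Mutex serializes the mutation.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind the value first so the lock guard is released before `total` is dropped.
    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("{}", count_in_parallel(4, 1_000));
}
```

It compiles without a single lifetime annotation, which is the appeal, but every access now pays for a reference count and a lock.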

However, enums with pattern matching and the trait system are still unbeatable and I miss them in other languages.

melony · on April 16, 2022

The problem is that when lifetimes cause you problems, it can force the entire feature development to stop until the problem is fixed. There is no reasonable escape hatch or alternative (clone doesn't always work).

klabb3 · on April 16, 2022

+1. My programming style normally consists of trial-and-error style prototyping for a couple of iterations, and then later refining that into something that's solid and robust. I find Rust's unofficial "prototyping mode" difficult to combine with its regular "production grade mode" for practical purposes.

pornel · on April 19, 2022

Rc/Arc is usually the escape hatch. I presume you put raw pointers under the "no reasonable" caveat, but if the borrow checker is wrong and you're right, they can be the right tool too. They're not unreasonable if the alternative was to use a different language that couldn't guarantee this safety either.

I know dealing with the borrow checker can feel like a dead-end sometimes, but dealing with it is something that you can learn. Things that it can't handle fall into a handful of common patterns (like self-referential structs). Once you learn what not to do, you can recognize and avoid the issues even before you write the code.

bobajeff · on April 16, 2022

That sounds like good advice. I'll keep that in mind next time I attempt to use Rust on something.

pyjarrett · on April 16, 2022

Ada is another option without a GC. I wrote a search tool for large codebases with it (https://github.com/pyjarrett/septum), and the easy and built-in tasking and pinning to CPUs allows you to easily go wide if the problem you're solving supports it.

There's very little allocation since it supports returning VLAs (like strings) from functions via a secondary stack. Its Alire tool does the toolchain install and provides package management, so trying the language out is super easy. I've done a few bindings to things in C with it, which is ridiculously easy.

throwaway239i3j · on April 16, 2022

Of the ones you mentioned, Zig is the only one that has explicit memory management.

> Is it worth learning

Languages are easy to pick up once you understand fundamentals. The borrow checker is intuitive if you have an understanding of stack frames, heap/data segment, references, moved types, shared memory.

You then should be asking "Is it worth using?", then evaluate use cases.. pros/cons.. etc.

For CLI, Rust is likely the easiest given its macros, but if you struggle with the borrow checker then it won't be. You will be fighting the compiler instead of developing something.

Depending what your CLI program is doing, you might want to evaluate what libraries are available, how they handle I/O, and parallelism.

JavaScript has incredibly easy and fast concurrent I/O thanks to libuv and v8.

travisgriggs · on April 16, 2022

>> Is it worth learning

> Languages are easy to pick up once you understand fundamentals. The borrow checker is intuitive if you have an understanding of stack frames, heap/data segment, references, moved types, shared memory.

I see this sentiment often. In the last 10 years, I have come up a level in raw C, learned Kotlin, Swift, Python, Elixir/Erlang, and a smattering of JavaScript, all coming from a background that included Fortran and Smalltalk.

My problem with the dialogue is what is meant by “learn.” I have architected, implemented, and maintain different components of our products in all these languages currently. I think that demonstrates I have “learned” these languages, at least at this level of “picked up.” But I can’t write Python the way Brett Cannon does. Or Elixir the way José Valim does. Or any of their peers. And in that regard I still very much feel I have not “learned”.

I spent a couple days playing with Zig a month or so ago. I became familiar with the way it worked. I could spend another month or so in that phase, and then could probably comfortably accomplish things with it. But I don’t think I’d feel like I’d “learned” it.

It reminds me of my experience learning Norwegian. I lived in Norway for 2 years and did my best to speak as much Norwegian as I could. At six months I could definitely get by. At 13 months, as I embraced the northern dialect, I was beginning to surprise Norwegians that I was from the states. I started dreaming in Norwegian at some point after that. But even at 24 months, able to carry on a fluid conversation, I realized I still could “learn” the language better than I currently knew it.

So I guess, it always seems there needs to be more context, from both the asker and the answerer, when this “should I learn X” discussion is had. Learning is not a merit badge.

jgillich · on April 16, 2022

I've found that Go is not elegant enough for me and Rust is too difficult to write (I started using Rust in 2015 and after years of trying I eventually realized Rust doesn't make sense for most apps), so I'm all in on Crystal. Despite not having much prior Ruby experience, I absolutely love the language.

pizza234 · on April 16, 2022

Crystal doesn't have built-in support for parallelism, let alone production-grade support. This is a significant lack for a modern language.

For a language that is around 8 years old, this may be a serious problem, since the surrounding ecosystem has probably been written without parallelism in mind, and it may take a very long time to be updated (if ever).

npn · on April 16, 2022

> Crystal doesn't have built-in support for parallelism

They do, but it is hidden inside a compiler flag, if you compile your project with `-Dpreview_mt` then it will come with multi-threaded support. This has been an experimental feature for a few years though, and there has not been much improvement since it was first introduced.

Personally I don't use Crystal for this kind of feature, and it runs stably enough for CPU-intensive tasks on the rare occasions I need it.

Crystal really shines when you need something for which you would usually write a Python/Ruby script, especially for tasks that run for hours. Converting a script from Ruby to Crystal and running it in production mode typically reduces the time consumed to 1/5 or even 1/10 of the original, depending on the job. As someone who has to read gigabytes of text files regularly, I find Crystal currently the best tool for the task.

The compilation time for release binaries needs much improvement though. And I'm not sure they can even achieve incremental compilation.

pizza234 · on April 17, 2022

> They do, but it is hidden inside a compiler flag, if you compile your project with `-Dpreview_mt` then it will come with multi-threaded support

It depends on the domain. From a production perspective, a flag-gated functionality that has been experimental for two or more years is not "built-in". Plus, as explained, the ecosystem (I think I've read even the stdlib) doesn't give guarantees about thread safety.

For small-scale scripting, then sure, it could be useful - but anything will do. I've evaluated for use at my company, and discarded it, because of the lack of libraries. Sadly, this is a chicken-and-egg situation. I've also evaluated contributing to it, but I won't until multithreading is stable.

npn · on April 17, 2022

> I've evaluated for use at my company

Well, this might be the problem. In a corporate environment you can't afford to be too adventurous.

Personally I solve the "lack of libraries" problem by using more than one language, then connect them via child process call or some persistent storage like database or plain text files.

But it's an entirely different matter when the code needs to be used by a lot of people.

sanxiyn · on April 16, 2022

I use PyPy for such cases. Is Crystal better than PyPy?

npn · on April 16, 2022

I think Crystal is better than Python in terms of language design. Unlike Ruby and Python, which are much older, Crystal is relatively new, so its designers learned from other languages' mistakes and tried to improve on them, resulting in a cleaner language.

For the cases mentioned, I think Crystal is immensely helpful:

- Reading/writing files is easy; usually a single method will give you the result you want.
- Working with directories is nice; things like `Dir.mkdir_p`, `Dir.each_child`, `File.exists?`... all exist to make your life easier.
- Like Ruby, you can invoke shell commands easily using backticks.
- There are some useful libraries for console apps, like `colorize` or `option_parser`. Crystal is a batteries-included language, so the standard library is filled with useful libraries.
- Working with lists and hashmaps is a breeze, since the Enumerable and Iterable modules are filled with useful methods, mostly inspired by Ruby land.
- Concurrency is built in, so you can trivially write performant IO-bound tasks like web crawlers.

For a project made by a handful of people, I just can't praise the dev team enough for making a language this practical.

zozbot234 · on April 16, 2022

Modern Rust is much more straightforward than it was in 2015. It's effectively two different languages, albeit maintaining backward compatibility (i.e. code written for Rust 1.0 should still compile today, with proper edition settings).

bscphil · on April 17, 2022

Suppose I wanted to try learning Rust again; is there a resource for someone with a lot of (hobbyist) programming experience, and experience with low level languages and memory management (e.g. C), but not complicated low-level languages, like C++?

When I tried to work with Rust a few years ago I found it utterly impenetrable. I just had no idea what the borrow checker was doing, did not understand what the error messages meant, and honestly couldn't even understand the documentation or the tutorials on the subject. Understanding what is happening in C or Zig is pretty easy; in Rust it's always been a nightmare for me. I just really don't grok the "lifetime" concept at all, it feels like I'm trying to learn academic computer science instead of a programming language.

Rust feels to me like a powerful, expressive language for professional programmers at the top of their game. That's a compliment for any language. But it comes at the cost of mind-numbing complexity for anyone who's not an expert.

zozbot234 · on April 17, 2022

> Suppose I wanted to try learning Rust again; is there a resource for someone with a lot of (hobbyist) programming experience, and experience with low level languages and memory management (e.g. C), but not complicated low-level languages, like C++?

The official Rust book is targeted at novices with some programming experience. There's also Rustlings https://github.com/rust-lang/rustlings for a more practical approach.

> When I tried to work with Rust a few years ago I found it utterly impenetrable. I just had no idea what the borrow checker was doing, did not understand what the error messages meant, and honestly couldn't even understand the documentation or the tutorials on the subject

The compiler diagnostics have improved a lot over time. It's quite possible that some of the examples you have in mind return better error messages.

> in Rust it's always been a nightmare for me. I just really don't grok the "lifetime" concept at all, it feels like I'm trying to learn academic computer science instead of a programming language.

Academic computer science calls lifetimes "regions", which is perhaps a clearer term. It's a fairly clean extension of the notion of scope that you'd also find in languages like C or Zig. It's really not that complex, even though the Rust community sometimes finds it difficult to convey the right intuitions.
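
As a rough illustration of the "extension of scope" reading, here is a minimal sketch. The function `longest` and the variable names are made up for this example; the annotation `'a` only says that the returned reference is usable in the region where both inputs are still alive.

```rust
// Returns whichever string slice is longer (the first one on a tie).
// The lifetime 'a ties the result to the region where BOTH inputs are valid.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let outer = String::from("outer scope");
    {
        let inner = String::from("inner");
        // Fine: `picked` is only used while `inner` is still in scope.
        let picked = longest(&outer, &inner);
        println!("{picked}");
        // Storing `picked` in a variable declared outside this block and
        // reading it after the block ends is rejected by the borrow checker,
        // because `inner` is gone by then.
    }
    // Fine: both inputs outlive this use.
    let result = longest(&outer, "static text");
    println!("{result}");
}
```

The same reasoning applies in C, where returning a pointer to a local that has gone out of scope is a bug; the difference in this sketch is that the compiler checks the region instead of the programmer.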

bscphil · on April 17, 2022

Fair enough, I do need to have a look at the book again, although that was one of the sources I found impossible to understand a few years back. I think there's a temptation to talk about lifetimes in extremely abstract terms under the assumption that the reader already understands and appreciates the abstraction. I, however, was never able to build up an intuition for it, and so tutorials that didn't explain what was happening in detail sailed over my head.

lijogdfljk · on April 16, 2022

I second zozbot234's statement about it being far better than it was in those days.

The language team has done a great job rounding rough edges, and this next roadmap is slated for even more polishing. They heavily prioritize dev experience which is why i think people like myself (a GC'd language person historically) use and love Rust so much.

sanxiyn · on April 16, 2022

Zig uses manual memory management too (even more manual than Rust), so that's a bit of a strange question.

zigger69 · on April 16, 2022

It's really easy in Zig to be honest. Just put `defer thing.deinit()` in the right scope and you're done. You gain explicitness and know exactly what's going on in your code. Everything is obvious. That's the reason Zig is so incredibly simple and easy to read. Zig also has a GPA (`GeneralPurposeAllocator`) that will tell you about memory leaks or anything.

puffoflogic · on April 16, 2022

And in rust you just put `` in the right scope and you're done. This is perfectly explicit and you know exactly what's going on in your code. Everything is obvious.

zigger69 · on April 17, 2022

You can execute arbitrary code on drop operations by implementing `std::ops::Drop` for a type.

pornel · on April 19, 2022

A `.deinit()` function could also run some arbitrary code that does weird and unexpected things. The point is that reasonable people don't abuse functions like that — if they do abuse, don't use their code. Neither Rust nor Zig is a sandbox that could stop actively stupid/malicious code.

dralley · on April 17, 2022

Rust's ownership rules are no less explicit. The object dies when the owner of the data goes out of scope.
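
For readers following this exchange, a small sketch of what is being argued over. `Resource` and `run` are hypothetical names invented for the example; it assumes nothing beyond the standard library. Values are dropped when their owner goes out of scope, and a `Drop` implementation can run any code at that point, much like a `deinit` function.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical resource type that records when it is dropped.
struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        // Any code could run here; this sketch only appends to a shared log.
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Creates two resources in an inner scope and returns the recorded events.
fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Resource { name: "a", log: Rc::clone(&log) };
        let _b = Resource { name: "b", log: Rc::clone(&log) };
        log.borrow_mut().push("end of scope".to_string());
        // `_b` and then `_a` are dropped here, in reverse declaration order.
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{event}");
    }
}
```

Nothing is written at the end of the inner block, which is the point of the empty backticks above: the cleanup is implied by scope rather than spelled out with `defer`.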

cercatrova · on April 16, 2022

Interesting, based on their code samples it looked to me like a GC language since (at least from what I saw) I didn't see anything regarding memory management.

elcritch · on April 16, 2022

It’s easy, IMHO, to mistake Zig as a GC’ed language or more broadly as a memory safe systems language. It’s neither but it is a nicer C.

sanxiyn · on April 16, 2022

I am not sure what code samples you looked at, but https://ziglearn.org/chapter-2/ should give you an idea.

pizza234 · on April 16, 2022

> or are GC languages like Nim or Crystal good enough?

Any programming language is good enough for its own use cases :) It's a matter of understanding what the use cases are.

I'm a big Rust fan, but I nonetheless believe that the use cases for programming languages with manual memory management are comparatively small, in particular, since GC has been improved a lot in the last decade.

For undecided people, I conventionally suggest Golang. Those who at some point need deep control, will recognize it and change accordingly.

kaba0 · on April 17, 2022

Why Go? It is quite terrible in expressiveness and if you do commit to a GC you have plenty of better choices. But to each their own, otherwise I agree that systems programming is a niche and a good GC is an overwhelmingly good tradeoff in almost every case.

zigger69 · on April 16, 2022

Crystal: compilation speed is just too slow, sadly. Nim and Zig: I'd definitely just go with Zig. It's an extremely simple language, has no macros (but something much better than macros), is explicit, and in the long run it's just going to be worth it much more than Nim.

mlindner · on April 16, 2022

Memory has to be managed by something. The more decisions that are made for you in how that happens the less flexibility there is for certain situations.

cercatrova · on April 16, 2022

Sure but my use cases would be stuff I'd normally write in TypeScript or Python that already have garbage collection. Like I said I'm not doing embedded programming so I don't have too much of a need to manage memory.

My question could be further constrained then to be, is learning Rust or Zig despite their manual memory management worth it for applications that are normally already garbage collected in their current implementations? Or are languages like Nim and Crystal enough? Do Rust and Zig have other benefits despite manual memory management?

MarcusE1W · on April 16, 2022

The way you describe your use case, I think you are fine with a language with garbage collection like Nim (which has a syntax a bit like Python) or Crystal. I would also throw Go in the ring, or if you are interested in learning a bit of functional programming you could also look at OCaml.

Zig has no garbage collection btw, but makes it easier than C to handle that. Another language without garbage collection that helps a lot to avoid memory issues is Ada (Looks a bit like Pascal). So there are alternatives to Rust.

adgjlsfhk1 · on April 16, 2022

imo, most code can do just fine with GC. modern GCs can be relatively low overhead even with guaranteed small pauses (10ms). furthermore, most code that can't handle pauses can be written to not allocate any memory (so GC can be temporarily turned off). as such, the only two places where you need manual allocation are for OS development, and hard real time requirements.
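
A sketch of the "written to not allocate" idea, shown in Rust only to keep the examples in this thread in one language (Rust has no GC, so this demonstrates the allocation pattern, not a collector being paused). The function `count_nonzero` and the frame data are hypothetical: the buffer is allocated once up front and reused inside the hot loop.

```rust
// Hypothetical hot loop: counts non-zero bytes in each frame using one
// reusable scratch buffer instead of allocating a new one per iteration.
fn count_nonzero(frames: &[&[u8]], scratch: &mut Vec<u8>) -> usize {
    let mut total = 0;
    for frame in frames {
        scratch.clear(); // keeps the existing capacity
        scratch.extend_from_slice(frame); // no reallocation while the frame fits
        total += scratch.iter().filter(|&&byte| byte != 0).count();
    }
    total
}

fn main() {
    // Allocate once, before the latency-sensitive section starts.
    let mut scratch: Vec<u8> = Vec::with_capacity(64);
    let capacity_before = scratch.capacity();

    let frames: [&[u8]; 2] = [&[1, 0, 2], &[0, 0, 3, 4]];
    let total = count_nonzero(&frames, &mut scratch);

    // The buffer never had to grow, so no allocation happened in the loop.
    assert_eq!(scratch.capacity(), capacity_before);
    println!("non-zero bytes: {total}");
}
```

In a garbage-collected language the same shape (preallocate, then reuse) is what keeps the collector idle during the critical section.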

Comevius · on April 16, 2022

When tail latency (high-percentile latency) is important GC is not a good choice. Wait-free (threads progress independently) concurrent algorithms also need wait-free memory reclamation with bounded memory usage to be able to guarantee progress.

But most software is throughput-oriented.

pjmlp · on April 16, 2022

Additionally, not all GCs are made alike, and languages like D, F#, C#, Nim, Swift, among others, also offer value types and manual memory management, if desired.

elcritch · on April 16, 2022

Also Swift and Nim w/ ARC use reference counting, which generally gives much better latency and lower memory overhead. Reference counting is part of the reason iOS devices don’t need as much RAM.

Nim’s ARC system also doesn’t use atomics or locks, which means its runtime overhead is very low. I use it successfully on embedded devices for microsecond timed events with no large tail latencies.
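
The atomic versus non-atomic distinction can be sketched with a Rust analogy. This is not Nim's implementation, only the same trade-off in another language: `Rc` keeps a plain integer count and cannot cross threads, while `Arc` pays for atomic updates so it can. The function names are invented for the example.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Non-atomic counting: cloning and dropping just adjust a plain integer.
// Returns the strong count before sharing, while shared, and after release.
fn rc_counts() -> (usize, usize, usize) {
    let first = Rc::new(vec![1, 2, 3]);
    let before = Rc::strong_count(&first);
    let second = Rc::clone(&first);
    let shared = Rc::strong_count(&first);
    drop(second);
    let after = Rc::strong_count(&first);
    (before, shared, after)
}

// Atomic counting: needed as soon as the value is shared across threads.
fn arc_sum_on_thread() -> i32 {
    let data = Arc::new(vec![1, 2, 3]);
    let worker_copy = Arc::clone(&data);
    let handle = std::thread::spawn(move || worker_copy.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("{:?}", rc_counts());
    println!("{}", arc_sum_on_thread());
}
```

Either way the memory is released at a deterministic point, when the last reference goes away, which is where the latency argument comes from.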

pjmlp · on April 16, 2022

Reference counting is a GC algorithm.

I wouldn't buy into much Apple marketing regarding its performance though,

https://github.com/ixy-languages/ixy-languages

It makes sense in the context of tracing GC having been a failure in Objective-C due to its C semantics, while automating Cocoa's retain/release calls was a much safer approach. Swift naturally built on top of that due to interoperability with Objective-C frameworks.

Nim has taken other optimizations into consideration, however only in the new ORC implementation.

Still, all of them are much better than managing memory manually.

elcritch · on April 16, 2022

> I wouldn't buy into much Apple marketing regarding its performance though,

I wouldn’t make claims on Swift’s overall performance, but just its memory usage (really Obj-C’s) and particularly for GUI development. Java’s GCs have always been very memory hungry, usually to the tune of 2x. Same with .Net. Though to be fair Go’s and Erlang’s GCs have much better memory footprints. Erlang’s actor model benefits it there.

Agreed, they’re all better than manual memory management.

kaba0 · on April 17, 2022

OSes and hard real-time systems can also be written in managed languages: there are many research OSes written in managed languages (with a bit of assembly, but you also need it for C as it is not low-level either), like Midori, and there are even hard-real-time JVMs used in military settings, like JamaicaVM.

mlindner · on April 16, 2022

Rust's memory management is "manual" but it feels automatic for most uses.

fyzix · on April 16, 2022

Nim strikes a great balance. No need for a low level language for cli and general software. I liked crystal but the lack of support on windows and lackluster dev experience made me stick to Nim.

Nim also can double as a web language by transpiling to JS.

arc776 · on April 17, 2022

Nim's super power is being ridiculously productive (at least for me). Hack stuff out like a Python script, yet it runs really fast and is a tiny self contained executable, so you can just use it as is and move on to the next task. If you want manual memory management, that's easy too. Want to use a C/C++ library? No worries you have ABI compatibility. As you mention compiling to JS lets you use it as a web language and share code and types between front and back end.

Then you can automate code generation with the sublime macros, which are just standard Nim code to create Nim code. No new syntax or special rules required - any Nim code can be run at compile time or run time, so you can use standard/3rd party libraries at compile time to write macros and give the user a slick syntax whilst removing boilerplate.

After using Nim, I really miss straightforward metaprogramming in languages that lack it. It's something that multiplies the power of a language, rather than just adds to it.

exikyut · on April 17, 2022

I haven't properly looked into Nim yet, and the sibling comments here make for some interesting signalling.

quazar · on April 17, 2022

I would not recommend Nim.

throwamon · on April 17, 2022

Thank you so much for such a profoundly insightful comment. I'm now even thinking of changing professions thanks to it.

planetis · on April 17, 2022

I like Nim.

cxr · on April 22, 2022

> just interested in command line apps and their speed as compared to, say, TypeScript

There are no fast languages, only fast language implementations.

NodeJS/V8 is pretty fast (even faster than you probably think)—particularly if you're already doing things like making the sort of compromises where you limit yourself to writing only programs that can be expressed under the TypeScript regime. It's usually the case that it's not the NodeJS runtime that is the problem but rather the NodeJS programming style that is the source of the discomfort with "speed" that you will have experienced.