Zig and Rust (matklad.github.io)
238 points by avinassh on March 27, 2023 | 238 comments



I tried Zig after reading so many HN articles and I am not impressed. You have to remember to free after each allocation. The language is supposed to be simple, yet there are try, catch, defer, and errdefer. Rust enums make defining errors so simple with arbitrary payload. Using data structures in Rust is much nicer. I think unless you have some precise needs about memory allocation, and want to control exactly where, when, and how memory is allocated, it is hard to argue in favour of Zig. The allocation control is very nice if you need it, though.


This is a good comment that I hope can help other readers manage their expectations.

Zig is verbose and low-level and, to like it, you have to appreciate simplicity and what it means to have control over tiny details. If you don't like either thing, then Zig will not bring anything particularly interesting to the table for you.


Yes to the appreciating simplicity bit, but if you go to Rust expecting to not sweat over having to control tiny details you are in for a ride.


To be honest I just clone and unwrap everywhere when writing the first iteration. If I care more, then I'll refactor to remove unnecessary clones. I really haven't felt the pain of the borrow checker like some people say.


After you've written your first few thousand lines of Rust, you won't clone/unwrap by default, but will immediately write correct idiomatic code that most often compiles on the first try and doesn't trigger any compiler or clippy warnings, without any added effort. As with any other skill, it's just a matter of a bit of practice.
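
To make the contrast concrete, here's a toy sketch (purely illustrative, not from the thread) of the same function in both styles:

    // First-draft style: clone() and unwrap() until it compiles.
    fn longest_draft(names: Vec<String>) -> String {
        let mut best = names.first().unwrap().clone();
        for n in &names {
            if n.len() > best.len() {
                best = n.clone();
            }
        }
        best
    }

    // Idiomatic pass: borrow instead of cloning, Option instead of panicking.
    fn longest(names: &[String]) -> Option<&String> {
        names.iter().max_by_key(|n| n.len())
    }

    fn main() {
        let names = vec!["ada".to_string(), "grace".to_string()];
        println!("{} {:?}", longest_draft(names.clone()), longest(&names));
    }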


"(Zig's) collections are not parametrized by an allocator, like in C++ or (future) Rust. Rather, an allocator is passed in explicitly to every method which actually needs to allocate. This is Call Site Dependency Injection, and it is more flexible.'

I'd once considered the idea of having "space" as a type. Space is an array of bytes, and it cannot be read or written. Constructors take in "space" and turn it into an object. Constructors have the privilege of casting "space" to something else. Destructors take an object and turn it back into "space". Destructors can cast their object to "space", and this destroys the data. This separates construction, which is a typed thing, from allocation. It can be used with malloc/free type allocation, garbage collection, or fixed allocation. You'd probably encapsulate this with a generic depending on what type of allocation you are using.
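
For what it's worth, Rust's MaybeUninit<T> captures a piece of this idea; a minimal sketch (hypothetical Point type, illustrative only):

    use std::mem::MaybeUninit;

    struct Point { x: i32, y: i32 }

    fn main() {
        // "Space": storage that cannot yet be read as a Point.
        let mut space: MaybeUninit<Point> = MaybeUninit::uninit();

        // "Constructor": turn the space into an object.
        space.write(Point { x: 1, y: 2 });

        // Using the object requires asserting that construction happened.
        let p: &Point = unsafe { space.assume_init_ref() };
        println!("{}, {}", p.x, p.y);

        // Dropping the MaybeUninit runs no destructor: turning the object
        // back into "space" is left to the caller, as in the idea above.
    }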

C++ probably would have been cleaner if it took that route, but C++ added both generics and allocators as afterthoughts.


This is exactly how C++ works.

    #include <new> // needed for placement new
    union { MyObject space; };                    // just a bunch of bytes; no constructor runs
    MyObject* myObject = new (&space) MyObject(); // placement new: invoke the constructor
    myObject->~MyObject();                        // invoke the destructor explicitly
   
If you want RAII, use a wrapper like std::optional.


Is this new(&something) syntax a recent addition? I don't remember it, but then I used to program in C++03.


I assume it’s been around forever. https://en.cppreference.com/w/cpp/language/new (Placement-new is the thing you are talking about.) C++20 has `std::construct_at` which does the same thing but is usable at compile time (in a `constexpr` context) https://en.cppreference.com/w/cpp/memory/construct_at


TIL about std::construct_at, thanks!

edit: not sure why they didn't just bless placement new with constexpr powers though.


It's been around since forever, precisely because you sometimes need to construct an object in pre-existing storage. Otherwise you couldn't implement std::vector, for example.


Placement new was already a thing in the C++ ARM (Annotated Reference Manual) days.


> This separates construction, which is a typed thing, from allocation.

But this is what C++ actually does! C++ ctors run after memory has been allocated (on the stack, on the heap, from a pool, etc.) Or maybe you mean something different?


The lifetime of the space can be different from, but must enclose, the lifetimes of the objects created in that space. C++ doesn't track lifetimes.

Such a separation is only useful in specific situations, such as the one the original article mentions. It could be useful in avionics or industrial control, where you often avoid memory pools and have arrays of specific kinds of objects instead.


> C++ doesn't track lifetimes.

Of course, C++ does not have lifetimes (in the Rust sense), but your original claim was that C++ does not separate allocation from construction.

> Such a separation is only useful in specific situations, such as the one the original article mentions.

It is actually quite common. You don't even have to reach for advanced things like memory/object pools. For example, if you want to implement a dynamically sized array type (std::vector), you typically want to grow the storage independently of the number of constructed objects.
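
Rust's Vec makes the same separation visible, for what it's worth; a small illustration:

    fn main() {
        let mut v: Vec<String> = Vec::with_capacity(16);
        assert!(v.capacity() >= 16);   // raw storage allocated up front
        assert_eq!(v.len(), 0);        // zero objects constructed so far
        v.push(String::from("hello")); // construct one into existing storage
        assert_eq!(v.len(), 1);        // one constructed; capacity unchanged
    }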

But then again, maybe you are talking about something different here...

---

Also, I may need to clarify some things about C++ allocators. They are not used to allocate storage for the object itself, but rather control how an object allocates memory for its subobjects. Keep in mind that the object itself may exist in a different kind of storage than its subobjects. For example, the object may live on the stack, but its subobjects must be allocated dynamically.


EDIT

> but your original claim was that C++ does not separate allocation from construction.

At least that's how I interpreted it.


It depends on what kind of lifetimes. C++ doesn't do Rust-like lifetime checks, at least not without help from static analysis.

However, C++23 has introduced compiler-aware functions to start specific object lifetimes out of raw memory.

https://en.cppreference.com/w/cpp/language/lifetime


If you mean std::launder, this has been around since C++17. (Not that I ever had to use it :-)



Thanks for the link! I got confused because your cppreference link only talks about std::launder.

I did not know about std::start_lifetime_as. Looks very useful! It's crazy that it took so long. Considering that one of the primary design goals of C++ was backwards compatibility with C (to a large extent), I always found it crazy that malloc'ing objects would be undefined behavior. In practice, this has always worked, though. At least now we have a standard-blessed way to do these kinds of things.

As a side note, does this mean that you can finally cast an array of bytes to an object pointer (e.g. for deserialization) without violating strict-aliasing rules? The usage example in section 1.2 of the linked paper seems to imply this, but it does not mention aliasing anywhere...


I guess this is also interesting to watch,

"Taking a Byte Out of C++ - Avoiding Punning by Starting Lifetimes - CppCon 2022"

https://www.youtube.com/watch?v=pbkQG09grFw


Goddamn, this is a great way to consider this problem! I dabble in a lot of languages, Zig included, and I never had a good way to consider the allocation model that fit in my head in a reasonable way. See, I like to think of software as a transformation process that happens to physical objects that I actually visualize, and so thinking of an allocator as a "space" that can be transformed is super useful to me. I also love very-strictly-typed programming. I enjoy making everything a type, so thinking about a "space" type is something I can use in the future.


I actually did find the allocation control very interesting and useful. It seemed like something that can be so useful in the context of request/response servers. Use an allocator per request and reset it at the end.
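
A rough sketch of that request pattern in Rust, using the third-party bumpalo crate (names illustrative):

    use bumpalo::Bump;

    fn main() {
        let mut arena = Bump::new(); // one arena per worker
        for request in 0..3 {
            // All per-request scratch data comes from the arena...
            let greeting = arena.alloc_str("hello");
            let numbers = arena.alloc_slice_copy(&[1, 2, 3]);
            println!("req {request}: {greeting} {numbers:?}");
            // ...and is released in one cheap reset at the end of the request.
            arena.reset();
        }
    }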

However, it alone does not compensate for all the other missing niceties, e.g. just dealing with strings is not straightforward compared to Rust/C++.


Super interesting idea.


> ...and want to control exactly where, when, and how memory is allocated...

Isn't this the whole point and the reason why manual memory management is still a thing - and actually getting more important with the ever widening CPU/memory latency gap?

There simply is no silver bullet for memory management, you can either have high performance or automatic memory management, but not both.

(vastly simplified of course - but in languages with automatic memory management - no matter if garbage collection or refcounting - you need to spend so much effort to reduce memory management overhead that it is usually less work to do it all manually in the first place)


There's a third option, which is what Rust and C++ do by default: types have destructors that run when the value goes out of scope, but there are also mechanisms to activate them manually at a precise moment if you know what you're doing and have specific needs.
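
A minimal illustration in Rust (hypothetical Connection type):

    struct Connection(u32);

    impl Drop for Connection {
        fn drop(&mut self) {
            println!("closing connection {}", self.0); // cleanup logic lives here
        }
    }

    fn main() {
        let a = Connection(1);
        let b = Connection(2);
        drop(b); // the manual mechanism: run the destructor right now
        println!("b is already closed");
    } // a goes out of scope here and its destructor runs automatically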


RAII is not a silver bullet. RAII wants to have neat recursive ownership and cleanup of everything, which is undoubtedly a nice model, but sometimes in low-level code you have to choose between nice recursive ownership and performance.

For example, Protocol Buffers are trees of objects. Originally they were simple and had recursively-owned, heap-allocated nodes. But it turns out that traversing and deleting a big graph of heap-allocated objects is expensive, and arenas are more efficient. But arenas make you give up recursive ownership.


Which is why low level languages like C++ and Rust make RAII the default in their standard library (easy, simple), but it's still completely possible to manage the memory on your own if that's necessary for the task at hand. Having easy defaults with the capacity to do everything by hand if you need to is the most flexible system.


The last time I looked (which has been a while, I admit), that meant that you had a very hard time using any of the standard library types with an arena or whatever form of management you used for your specialized code.

What does that look like now?


If you're willing to write Nightly Rust, you can use the allocator_api feature to have the types which allocate take an allocator parameter so e.g.

Vec::with_capacity_in(10, arena_allocator)

... obviously user-defined types don't know these allocators exist and so may not provide a way to pick which allocator is used, whereas in Zig this is "always" how it worked.

For stable Rust, until/unless allocator_api is stabilized, you will need to replace the global allocator for your code with one that has the desired behaviour. That applies to everything using an allocator, so on the other hand you lose considerable flexibility.
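
For concreteness, a minimal nightly-only sketch of the allocator_api shape (Global stands in here for a real arena allocator):

    #![feature(allocator_api)] // nightly-only, as noted above

    use std::alloc::Global;

    fn main() {
        // The extra type parameter defaults to Global, so ordinary code is
        // unchanged; the `_in` constructors name an allocator at the call site.
        let v: Vec<i32, _> = Vec::with_capacity_in(10, Global);
        assert!(v.capacity() >= 10);
    }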


In Rust terms, the arena should own the data and each object should just borrow from the arena. In Rust, when you drop a borrow it doesn't deallocate the memory; only owned values deallocate on drop (that is, in the destructor). Besides, borrowing enforces that the arena won't be dropped while borrows are still outstanding.

https://manishearth.github.io/blog/2021/03/15/arenas-in-rust...
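
The linked post covers this in depth; a minimal sketch using the third-party typed-arena crate:

    use typed_arena::Arena;

    struct Node<'a> {
        value: i32,
        next: Option<&'a Node<'a>>, // a borrow into the arena, not an owner
    }

    fn main() {
        let arena = Arena::new();
        let first = arena.alloc(Node { value: 1, next: None });
        let second = arena.alloc(Node { value: 2, next: Some(first) });
        println!("{} -> {}", second.value, second.next.unwrap().value);
        // Dropping the arena frees every node in bulk; the borrow checker has
        // already proven that no node reference outlives the arena.
    }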


There is no incompatibility between RAII and arenas. RAII doesn't require heap allocation.


That's not really the point. If you're using an arena it means that you plan to reset it at some point and erase in bulk everything that was stored on it all at once. This is the opposite of letting the destructor of every object run.


Not every object in an RAII language needs a destructor, e.g. objects that don't manage their own memory because they came from an arena.


This is by definition manual memory management.


I would argue RAII is still a type of automatic memory management, but it has the determinism and ability to reason about what's going on the same way fully manual management does.


From The Garbage Collection Handbook I got a great quote about what automatic memory management really is. I will butcher it, but the basic essence is that automatic memory management is above all a software design tool: well-designed programs are built from components that are highly cohesive but loosely coupled. For memory management, that means a component shouldn't have to care about how other modules handle memory in order to work correctly. This is possible with a GC, but isn't with manual memory management, and RAII doesn't change it; function signatures leak memory-management concerns in both Rust and C++.

Ongoing refactoring and maintenance costs are also higher in low-level languages due to this reason.


> you need to spend so much effort to reduce memory management overhead that it is usually less work to do it all manually in the first place

That has not been the case in the vast majority of applications I've written across a number of domains using garbage collected languages.

Yes, there is runtime overhead to GC. But for many many programs, the cost of the overhead is acceptable and you can spend all of your time worrying about the application domain and not worrying about allocation.


Not allowing you to talk about allocation in any great detail certainly seems, on the surface, to simplify things, but as with most things of that nature it tends to complicate the edge cases (or just entire parts of the industry). What the GP is alluding to, it seems to me, is that in those cases the effort to remedy the problem tends to become a constant hunt and guessing game of "Where are we accidentally allocating and ruining everything?"

Inevitably some kind of object pooling comes into play in most of these cases and this is arguably a much worse solution to the basic problem. If you know you're going to be landing in this space I don't think there's much purpose to pretending you're better off using a GC that you'll be trying to avoid anyway. It's far easier to sketch out your allocation strategies and do things deliberately.

Fast software allocates and deallocates in bulk, so the straw man of individual `malloc` and `free` isn't exactly relevant. Arenas, bump allocators, etc. are what is actually being discussed as an alternative.

(As a side note I think this makes RAII a somewhat peculiar solution to the issue as well. RAII implies objects themselves controlling allocation and deallocation and this is just not very useful at all if you want to make something that behaves properly.)


> Where are we accidentally allocating and ruining everything

This is trivial to profile under any platform that has remotely sane tooling, and usually it is trivial to fix as well.

Object pooling is a very last-resort thing to do; I really can’t think of any project I have worked on where there wasn’t another solution to the performance problems.


"and you can spend all of your time worrying about the application domain and not worrying about allocation."

This has never been my experience in 10+ years of middleware development in GC languages in large teams. The problem is just deferred to production where you need to analyze heap dumps and run a profiler to hunt down excessive allocation issues and why the GC is running all the time.

Sometimes, troubleshooting and correcting many of these issues takes far more man-hours than coding the service in the first place! The usual solution of "throw more hardware at it" only works until mgmt starts complaining about costs and uptime.

I have spent more time debugging memory issues in Java/C# projects than in my first large scale project in C++ where the lead architect strictly mandated the use of a custom allocator for everything.


I'm not sure why you trimmed the beginning of my sentence.

The parent comment says: "(vastly simplified of course - but in languages with automatic memory management - no matter if garbage collection or refcounting - you need to spend so much effort to reduce memory management overhead that it is usually less work to do it all manually in the first place)"

That suggests they are claiming that all users of GC languages will eventually spend more effort dealing with memory than if they'd skipped a GC in the first place.

I am making an existence proof counterargument: I have worked on many many programs in GC languages where at no point in the program's lifecycle did I need to spend much time worrying about memory.

I believe the claim "all programmers will spend more time dealing with memory in GC languages" is false.

I have absolutely not made any claim that "no programmer will spend more time dealing with memory in GC languages". There are certainly programs that are better written with manual memory management. I've written plenty.

My point was only that there are also plenty of programs where that's not true.


> There simply is no silver bullet for memory management, you can either have high performance or automatic memory management

That’s only true for workloads where you can optimize the memory layout/allocations. This may not be true in general, where you will end up implementing a shitty GC that will definitely perform worse than a properly written one.

Also, the cost of a proper GC is heavily overblown in my opinion. Especially when value types are available.


GC is not a solved problem and can be costly. We clearly see spikes in latency and memory usage in a production server written in Go. So most likely for the next server-type application we will use Rust.

Then there are benchmarks where Nginx is faster than Caddy by a factor of 3.


Go doesn’t really have a good GC, for what it’s worth. And of course there will always be domains where the tradeoffs of a GC aren’t worth it, but those are very, very rare.


The problem with GC is that when it does cause problems, there are no good choices. One either spends time trying to tune the available knobs to hide the problem, or rewrites the application to reuse objects etc., turning the initially nice architecture into a spaghetti mess.

For this reason I very much prefer reference counting. Yes, time can be wasted fighting leaks through cycles, one still has avalanche-of-releases problems, and the performance can be bad with certain usage patterns. But solving these problems is at least local and does not require rewriting the whole application.


The real downside to reference counting is concurrency. If the reference counter lives in Core A's L1/L2 cache and Core B then needs to write to it, the cache line is invalidated and Core A needs to re-fetch it. You get two cores fighting over the cache line. Everything still works correctly, but it causes delays if each core is constantly updating the reference counter.

One workaround is giving the threads their own private reference counter (making sure they're on different cache lines), and updating the shared reference counter only when their private reference count reaches 0. Not only do you stop the fighting, you don't even need atomic operations to update the private reference counts anymore.
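
A rough sketch of that scheme (illustrative names, not a real library):

    use std::sync::atomic::{AtomicUsize, Ordering};

    struct SharedCount {
        refs: AtomicUsize, // one unit per thread currently holding references
    }

    struct LocalCount<'a> {
        shared: &'a SharedCount,
        count: usize, // non-atomic: no cross-core cache-line traffic
    }

    impl<'a> LocalCount<'a> {
        fn acquire(shared: &'a SharedCount) -> Self {
            shared.refs.fetch_add(1, Ordering::Relaxed); // paid once per thread
            LocalCount { shared, count: 1 }
        }
        fn retain(&mut self) {
            self.count += 1; // cheap: stays in this core's cache
        }
        fn release(&mut self) {
            self.count -= 1;
            if self.count == 0 {
                // Only now do we pay for cross-core synchronization.
                self.shared.refs.fetch_sub(1, Ordering::Release);
            }
        }
    }

    fn main() {
        let shared = SharedCount { refs: AtomicUsize::new(0) };
        let mut local = LocalCount::acquire(&shared);
        local.retain();
        local.release();
        local.release();
        assert_eq!(shared.refs.load(Ordering::Acquire), 0);
    }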


Apple mostly solved that problem in their silicon. But thread-private counters is a good solution for other platforms.


What kind of workload does the server run?


It serves as a proxy or manager for other components. It is IO-bound, so Go should be an ideal language for it. But the GC kicking in at the wrong moment still brings trouble.


So the Go GC just needs an API that pauses collection when the application wants to run undisturbed.


Reference Counting (including a secondary counter for internal cycles) is your silver bullet.

The only hairy issue then becomes multiple processors needing to update the reference counter, where the count has to move in and out of the various processors' caches. One possible way to resolve that is to have a per-thread reference count, and only change the globally shared reference count when a per-thread reference count would reach 0.


> There simply is no silver bullet for memory management, you can either have high performance or automatic memory management, but not both.

RAII is the way.


RAII for memory management implies that destructors free owned memory, which can quickly cascade out of control for complex objects (e.g. you still need to think very carefully about memory management).


You are right in the sense that for high-performance applications you still need to explicitly take memory allocations, allocators, and indirection into consideration. RAII helps you implement your preferred design, but it doesn't let you ignore the details.


That's my complaint about Zig as well coming from C.

I feel that there are too many reserved words in Zig, and some of them are too long (comptime for instance).

Maybe that's just me being too picky about surface level concerns, but that kind of stuff really bothers me for some reason.

Other than that, I do like some parts of Zig. The way templates are designed seems superior to C macros and C++ templates, having your build scripts be in the language you are building is something more languages should do, the compiler is impressive in some ways... I suppose in theory all of that should make up for the poorly chosen keywords and whatnot.

I disagree with other commenters who think Zig is simple. At least compared to C, which I think is a fair comparison given Zig's target audience. C feels simpler despite all the undefined behavior, the extensions to the standard, and the portability issues. Somehow I think it's down to the reserved words that were chosen.


> The way templates are designed seems superior to C macros

Everything is superior to C macros; they're like the worst thing ever created. Honestly, C is not a good language, but if it had at least a proper macro system it could be remotely usable, e.g. by having proper generic data structures.


> I think unless you have some precise needs about memory allocation, and want to control exactly where, when, and how memory is allocated, it is hard to argue in favour of Zig.

I do have a hard time thinking of use cases other than embedded microcontroller work where that matters nowadays. That said, I suspect that while they could have written TigerBeetle in Rust (no_std does not assume the existence of an allocator), it would have been more difficult.


> I do have a hard time thinking of use cases other than embedded microcontroller work where that matters nowadays.

Another example: realtime audio applications. There are rather strict rules about what you can and cannot do in the audio callback. In particular, you must not use the system memory allocator (because it may block). Instead you have to pre-allocate memory or use real-time safe memory pools. For this reason, the vast majority of (realtime) audio code is still written in C and C++.
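
The discipline usually looks something like this sketch (hypothetical types; the point is that the callback only touches preallocated buffers):

    struct Synth {
        scratch: Vec<f32>, // sized once, before the stream starts
    }

    impl Synth {
        fn new(max_block: usize) -> Self {
            Synth { scratch: vec![0.25; max_block] } // allocation happens here
        }

        // Called on the audio thread: no malloc, no locks, no Vec::push.
        fn process(&mut self, out: &mut [f32]) {
            for (o, s) in out.iter_mut().zip(self.scratch.iter()) {
                *o = *s * 0.5; // pure arithmetic on preallocated memory
            }
        }
    }

    fn main() {
        let mut synth = Synth::new(512);
        let mut block = vec![0.0f32; 256];
        synth.process(&mut block);
        println!("{}", block[0]);
    }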


JIT compilers, games, database servers. Anything where you need high performance and do a lot of allocations.


JIT compilers don’t actually need that kind of low-level programming, and compilers in general use memory in a very haphazard way, so they are likely better off with a GC.

In fact, you probably lose way more by not being able to write as many (or as good) optimizations in a low-level language than you lose to the slight overhead of a managed one; e.g. Java’s Graal JIT compiler (written in Java itself) nowadays performs better than the “original” C2 compiler, which is written in C++. But this is mostly a maintainability question.


Indeed, having played with doing compiler / parser / transformation type work (for relational query transformation, etc) in Rust, and got stuck in a maze of borrow-checker disasters around the tree of references... what I'd say is: this kind of thing is actually better done in a higher level statically typed garbage collected language. For myself, I'd use a mature functional something like OCaml/StandardML or a Lisp or similar. It's just easier to not have to worry about borrow checking references and allocation issues when writing stuff like that.

The case against garbage collection is where latency is key. P95 and P99 latency always suffer under any kind of garbage collector, no matter how advanced. You can minimize pauses, you can move them around, you can play with different collector aspects, but in the end, compared to explicit allocation/destruction/lifetime mgmt, it's going to suffer.

I don't think compilers generally have these latency concerns. Throughput matters, but not the odd unpredictable pause.

There's only so many application domains where that is super important: Certain kinds of high traffic servers, database engines [not so much query parsing & planning, but execution, data structures, buffer mgmt, networking], operating systems, and games. Also embedded systems where a tight memory profile is important.

I like Rust (and Zig) but sometimes when people have a good hammer, they go looking for nails where there aren't any...

(All that said, I feel like some tree/DAG-heavy problems in garbage collected languages could benefit from static (and runtime) reference analysis tools that look a lot like Rust's borrow checker, purely for code sanitation/tracing.

Also, GC tracing and some of the programming patterns GC encourages can wreak havoc on L1 caches, reducing throughput as a side effect, so there's that)


Still, major compilers and optimizers are written in C and now C++. GCC was written in C with a custom GC. They have moved to C++ and, as far as I understand, are doing more and more with RAII instead of GC (it turns out that peak memory usage is a major factor in compiler performance).


That's fair, and I'm not a compiler engineer so I'll avoid making authoritative claims :-). But a C/C++ compiler is a whole thing, with some pretty intense performance requirements, wouldn't you say?

Mainly I was speaking from the experience of trying to write more complicated parser & tree transformation pieces in Rust and finding the ownership stuff a hassle.


I second that; there is nothing inherent about a JIT that requires low-level memory tricks, except that last step of making the machine code executable and callable. I'm not sure that Rust is a good fit for optimizing compilers, either. E.g. most compiler IRs are fundamentally graph-based and use unrestricted cycles. TurboFan in V8 uses arena-style allocation, as do most of the JITs in other JSVMs, mostly because managing every single node is overkill. A GC with a big enough young gen will do well for the kinds of allocation patterns of a JIT.


Firstly, an arena-based allocator is a custom allocator, which is something most programmers never need to touch and probably don't even know about. It is a "low-level memory trick".

Secondly, we were talking about how Zig makes allocation first-class. This is a prime use case: specifying your own custom arena-based allocator.


> I do have a hard time thinking of use cases other than embedded microcontroller work where that matters nowadays.

People keep saying things like this and software keeps getting more bloated and shittier. It is difficult to believe there isn't a correlation.


I might be misreading this comment, but the answer to bloated software is not writing all of it in Rust/C++/Zig.

The reason for bloat is not garbage collection.


It is not about the minutiae of language choices, no; it's about a pervasive programmer mindset of not wanting to think about resources like memory, clock cycles, or bandwidth as things that actually matter.


A lot of it is that APIs, particularly for GC'd languages, often require lots of intermediate garbage and defensive copies. It wasn't until Java 17 that there was a standard way of parsing integers that didn't allocate. I blame libraries more than languages, TBH.


Sure, I am just confirming that mindset, but can you really reason about the performance impact of that allocation? Even if it does end up allocating due to escape analysis not being sufficient, it will basically do a thread-local pointer-bump allocation into a hot, in-cache arena, and that will be cleared at essentially zero cost once the still-reachable objects are moved out.

My point is, knowing when to think/not think about allocating and its relevant costs is the proper way to program in a high level language. Only care about it when you are at a part that runs in a hot loop or very often, etc. Pretty much as per the second part of the often partially quoted “premature optimization…”.


The issue is not always garbage collection. The issue is, as others have pointed out, not having to think about it (memory & cycles.) But this also ties very much into another thing about many modern applications: excessive reliance on third-party packages, many of which are written with (or without) constraints and features that aren't always a match for the core application. It's very easy to blow out your memory use and runtime costs this way.

And in this respect Rust could be vulnerable, because Cargo.toml makes it very easy to pull in a huge tree of transitive third-party dependencies without even thinking about it.


I think as hardware and software have matured, more and more people have access to develop software. They get correct results without having to dictate every little item. I think it's great, personally. I don't know if it's necessarily "bloat" because it's probably a net win that we wouldn't get some of this software otherwise.

The really awesome news is that for those of us who still develop software whose requirements mandate this kind of control - better tools like Zig (and Rust) are making things easier but without requiring things like a GC.


Re: TigerBeetle... caveat: I haven't looked at TigerBeetle's internals and source code, but some thoughts...

The kind of page buffer allocation done inside databases is of a different nature than the kind where you, e.g., override the default allocator in STL collections, or pass a different allocator into your standard collections library in Zig.

It's generally such that only specific data structures -- usually BTrees or HashTables meant to store relations/tables/indexes -- are managed this way. And so they're usually built from scratch around an explicit specific page buffer mgmt system. In some systems (like Umbra or LeanStore) this might be somewhat murkier in that it might use pointer swizzling etc behind the scenes, but it's still usually the case that the data structures used for relation storage are tied directly to the DB's own page buffer implementation anyways.

Now, other parts of the application stack may benefit from having a custom allocator in the same way as any performance critical system might, but that's of a different nature.

(That said, Rust is also not easily suited to the kind of "pointer-swizzling" behind the scenes bait-and-switch with memory that e.g. LeanStore does with C++. I've tried, and, while it's possible, the language in general gets in the way and you're `unsafe` all over the place anyways.)

Anyways, all this to say, TLDR given that all serious databases do explicit memory / buffer pool mgmt ... and build data structures specific to them, I don't see any intrinsic advantage to Zig for this purpose really? Other than it's a decently modern systems programming language that mostly stays out of your way.

But... I can see major advantages to using Rust: more developers, bigger community, larger ecosystem, safer for memory mgmt elsewhere in the stack, safer for concurrency, etc.


I think you might actually like Virgil more. It's actually memory safe and has nice enums. It has nowhere near the adoption of Zig, nor the standard library or batteries included. But it is my fullest expression of what I think a systems programming language should be.


I second that. Several of Virgil's ideas, like the unification of argument lists and tuples, or its handling of objects, are very nice.


Actually I found Rust to be easier to learn than Zig, as it is more consistent in most cases.


Rust without the std lib can control all the details of allocation, and you still get safe sum types, automatic freeing, etc.

The real problem with Rust is the hostile syntax for the unsafe code that one often needs when doing low-level programming.


try = ? in Rust. catch = match in Rust. Zig does have its own general match construct (switch), and the reason it needs a catch is that errors in Zig aren't actually ADTs, but it's not that much more complex.

And you're really going to argue that defer/errdefer is more complex than RAII?
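
For reference, a small sketch of that correspondence (toy code, not from the article):

    use std::num::ParseIntError;

    // Zig's `try` corresponds to `?`: propagate the error to the caller.
    fn parse_port(s: &str) -> Result<u16, ParseIntError> {
        let port = s.trim().parse::<u16>()?; // `?` plays the role of `try`
        Ok(port)
    }

    fn main() {
        // Zig's `catch` corresponds to handling the Result, e.g. a fallback:
        let port = parse_port("not a number").unwrap_or(80); // roughly `catch 80`
        println!("{port}");
    }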


> I tried Zig after reading so many HN articles and I am not impressed. You have to remember to free after each allocation.

You write as if that were something surprising, but you know Zig is often described as a replacement for C...


Yep. I can hardly find a situation where Zig is the best tool to solve the problem. Guess it will fall out of favour with HN readers soon.


"Rust enums make defining errors so simple with arbitrary payload."

Enum variants are overused in Rust and are highly expensive in memory IMHO. One tends to run into problems like https://github.com/serde-rs/json/issues/635 in Rust projects when they are used at scale. Value is an enum: https://docs.rs/serde_json/latest/serde_json/value/enum.Valu...

Comment from that issue: "For example, ElasticSearch returns a 8,683KB document, I deser it into Value and the next RAM reading gives me delta of 98,484KB of RAM use. That's more than 10x the original size."

Compare that with a similar issue in simdjson (written in C++), where people are complaining about parsing 4GB documents into a tree !!! https://github.com/simdjson/simdjson/issues/128. serde-json has trouble moving a bullock-cart while simdjson is moving a container ship.

I don't like some parts of Zig - I think not having RAII in the language is a terrible mistake. But I do love that anything and everything that can be allocated needs to be passed an allocator. I do think they could have introduced an allocation context for RAII cleanups, though, instead of forcing the programmer to manually code defer-based cleanups - this is a major source of errors even for advanced programmers.


> Comment from that issue: "For example, ElasticSearch returns a 8,683KB document, I deser it into Value and the next RAM reading gives me delta of 98,484KB of RAM use. That's more than 10x the original size."

I think this is misleading. `serde_json` is designed to parse JSON documents directly into your user-defined structs. Thus the raw `Value` type isn't optimized for direct manipulation.

That said, what do you expect? The `Value` enum has the `Value::Array` variant that contains a `Vec<Value>`, which is 24 bytes in itself: one 8-byte word each for the heap pointer, the length, and the capacity. Because of alignment requirements the enum discriminant must thus also be padded out to the 8-byte boundary. This gives us a 32-byte type.

It is possible to reduce the variant to 16 bytes by making the value type `Box<[Value]>` instead. Ignoring the `Value::Object` variant for a second, that would make `Value` a 24-byte type. However, you've now also made the type immutable. It's a tradeoff. 8 bytes for mutable values.
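
To see those numbers directly, here's a stripped-down stand-in (not serde_json's actual definition):

    use std::mem::size_of;

    #[allow(dead_code)]
    enum Value {
        Null,
        Bool(bool),
        Number(f64),
        String(String),    // ptr + len + cap = 24 bytes on 64-bit
        Array(Vec<Value>), // ptr + len + cap = 24 bytes on 64-bit
    }

    fn main() {
        // 24-byte payload + discriminant, padded to 8-byte alignment: 32 bytes.
        println!("{}", size_of::<Value>());
    }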

Again, do remember that deserializing into `Value`s is _not_ the primary path when working with JSON through `serde_json`.


Yes, I fully understand how enum variants are implemented under the hood. My point was in response to the enthusiastic use of enum variants which generally causes excessive memory to be consumed for even moderate inputs. Folks are surprised by this and then re-write their code to avoid enum variants or use pointer tagging.

Enum variants should come with a STRICT warning in the Rust book and the Rust reference that their real-world use should be weighed very carefully. Most proponents of Rust tend to never mention their costs or caveats. They are most certainly NOT a zero-cost abstraction and they tend to trip up lots of programmers.

"Thus the raw `Value` type isn't optimized for direct manipulation."

Maybe this should be explicitly mentioned in the documentation: DO NOT USE `serde_json::value::Value` for moderate or large sized JSON inputs in production! Stack Overflow answers merrily recommend the use of `Value` to get a piece of data out.


How else would you implement a tagged union? Rust already optimizes away the discriminant when it's Option<NonZero>.
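
That niche optimization is easy to verify:

    use std::mem::size_of;
    use std::num::NonZeroU64;

    fn main() {
        // `None` is encoded in the forbidden all-zeroes bit pattern, so the
        // Option adds no discriminant byte at all.
        assert_eq!(size_of::<Option<NonZeroU64>>(), size_of::<NonZeroU64>());
        assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>()); // same for references
    }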


I don't think serde-json has a limit on size, but under some usages it will allocate extra. For example, see: https://github.com/serde-rs/json/issues/160#issuecomment-841... A 2.4 GB file can be read and parsed with serde.


I mean... if the size of `Value` is a problem, use `ijson`, the very crate that the author reporting the issue developed.

This is more or less a solved problem.


Still not convinced that memory semantics are critical in the vast majority of domains. Incredibly fast speeds can be achieved with simple GC and RC for short-running programs, and programs with sustained runtimes can rely on generational GC. These methods have the advantage of nearly eliminating the need for memory semantics, leaving only business logic behind.

The best example might be Nim, which drastically reduces GC pressure over Java or Go by stack allocating and passing immutably by value wherever it can. No gigantic runtime to link against! I've been able to make fast, safe programs so quickly with Nim that it promptly ended my Rust fascination phase. Kernels, realtime software on rockets, and Facebook- or Discord-scale backends seem like good fits for Rust, but I'm not convinced it should be used almost anywhere else. Looking at the sheer volume of memory logic in Zig and Rust, I have to ask: "Why do we continue doing this to ourselves?"


You can get the idea that a GC has minimal impact on performance when you just look at things like how fast the GC runs, or what % of time is spent in GC, but that only represents a portion of the performance implications; there are lots of indirect effects:

1. More memory is used, which means more CPU cache evictions happen

2. There will be things the language won't let you do, because it is garbage collected

3. There will be extra FFI costs, like when making OS calls or calling into other languages

4. There will be cultural tendencies: if a language uses GC, the people who designed it and used it are less likely to be interested in compromising other things for extreme performance. Example: if a LINQ query has overhead in C#, and they have to make a breaking change to eliminate it, they will not. In Rust, if an iterator is slower than what hand-coded C would be, it's a bug, and every effort will be made to fix it if possible.

That said, it is still the case that most programs aren't penalized by GC much. But people do still work on operating systems, games, databases, video editing tools and such where getting great performance is really important.


Nim uses stack-allocated value types by default, and a GC'd ref (or unmanaged ptr) is an optional tag on the type definition.

GC'd types use borrow & move analysis like Rust (not as good yet, though) so the compiler can elide GC work when possible. The GC is also not stop-the-world, is deterministic (assuming no cycles), and is configurable for soft-realtime performance.

So you don't need to use the GC, but it will give you RAII if you want, without needing to perform the dance of the borrow checker.

BTW Nim is closing in on Rust's borrow-checking semantics, too!


> More memory is used, which means more CPU cache evictions happen

The first part is true, and the second part is sort of true, but it may or may not matter. The order-of-magnitude speedups that good cache usage can bring only happen when you have a small hot loop working over a huge amount of data with an easy-to-guess access pattern (e.g. serial). I am not convinced that your ordinary program has too many such loops, and thus that it would see a huge speedup were it rewritten in C from a managed language.

> There will be things the language won't let you do, because it is garbage collected

There are always escape hatches

> There will be extra FFI costs, like when making OS calls or calling into other languages

Do you mean like GC barriers and such?

But we do agree on your last paragraph, and sure, low-level languages will always be needed. But the niche where they are irreplaceable is very small; if we are pedantic, OSes and games are also not necessarily to be written in them (see Microsoft's Singularity for the former; the latter has plenty of examples).


Also, aside from the fact that GC is optional in Nim, what are you thinking of that can't be done with a GC?


Malloc isn’t free, either.


`malloc` doesn't make up a meaningful percentage of performance-oriented software's execution time. Most of those programs bulk allocate and manage memory internally, so the cost of `malloc` isn't a relevant argument against managing memory manually. Indeed it's entirely possible to write a program that doesn't ever use `malloc` while still using allocations internally, from a stack allocator.

The point isn't to `malloc` every single allocation and pair them up with `free` calls; it's to use allocators that alleviate that burden while also allowing you to pick an allocation pattern that makes sense for the use case (and internal use cases).

As an example, if your total memory usage for your program could be determined at startup to be 1GB, then you could allocate 1GB up front and allocate from that memory in the rest of the program, guaranteeing that nothing after startup ever goes to the OS for memory.
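
An illustrative sketch of that pattern (a real allocator would handle alignment edge cases, freeing, and thread safety properly):

    struct Bump {
        buf: Vec<u8>,
        offset: usize,
    }

    impl Bump {
        fn with_capacity(bytes: usize) -> Self {
            Bump { buf: vec![0u8; bytes], offset: 0 } // the only OS allocation
        }

        // Hand out `size` bytes, aligned to `align` (a power of two).
        fn alloc(&mut self, size: usize, align: usize) -> Option<&mut [u8]> {
            let start = (self.offset + align - 1) & !(align - 1); // round up
            let end = start.checked_add(size)?;
            if end > self.buf.len() {
                return None; // reserved memory exhausted; never ask the OS again
            }
            self.offset = end;
            Some(&mut self.buf[start..end])
        }
    }

    fn main() {
        let mut heap = Bump::with_capacity(1 << 30); // reserve 1 GB at startup
        let block = heap.alloc(64, 8).expect("out of reserved memory");
        block[0] = 42;
        println!("{}", block[0]);
    }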

Overall it's much easier to write obviously efficient software in Zig than it is in Rust, IMO, and it's much easier to see at a glance that something won't behave stupidly when it comes to allocation and deallocation. To be fair, this is something Zig has over most other languages as well, so it's not exactly specific to Rust.


Another hidden penalty of some (but not all) GCs is that they can't handle interior pointers, so anything that is addressable needs to be separately allocated. So not only do you have more allocations to track, and more space and time overhead, you also add indirections where they are not needed.

Of course a Sufficiently Smart Compiler can remove the overhead, but well...


> That said, it is still the case that most programs aren't penalized by GC much. But people do still work on operating systems, games, databases, video editing tools and such where getting great performance is really important.

This statement simply isn't true though, at least in the general sense. Sure, for old school atomic RC and compilers that didn't do RC elision etc it might've been true. But we can and have made smarter compilers, just like Rust has gotten smart about life times.

Like @netbioserror I use Nim, but for embedded and real-time stuff. There is a minimal overhead (one machine word per allocation), but using Nim's ARC "memory management system" gives me results comparable to Zig or Rust though with much less work. Technically ARC without the cycle collector isn't a GC.

There's no reason D and other languages with "memory management" couldn't do similar RC based systems. As the compilers get smarter it approaches the same result as Rust's manual lifetime analysis. In many cases compilers can be smarter than human programmers about memory.

Even then, there's occasions where in Nim one can and will reach for manual memory allocations. It's easy to do when wanted.

> 2. There will be things the language won't let you do, because it is garbage collected

There will be things that systems like Rust's lifetimes makes harder / less performant as well: https://ceronman.com/2021/07/22/my-experience-crafting-an-in...

> 3. There will be extra FFI costs, like when making OS calls or calling into other languages

For generational GC's this is true, but for RC systems not as much.

> There will be cultural tendencies, if a language uses GC, the people who designed it and used it are less likely to be interested in compromising other things for extreme performance.

This is certainly a thing. However, cultures can vary a lot. For example Nim provides zero-cost iterators as well as minimal-cost "closure" iterators. Now Nim's string libraries do a lot of copies by default which makes it easy to ruin performance. However, there's an open RFC to provide CoW strings to reduce this overhead. There's also the new `lent` type too. After all, it's used in games and embedded so there's desire for that.

Certainly if your language is tedious about memory management, you'll spend more time on it. Though as an end user I'm not sure I've noticed a practical difference between Go and Rust programs as compared to C#/Java applications. The latter I avoid running if possible because of their general bloat.


Frankly, my reason for using Rust is not memory semantics (I'd be happy with a GC for most applications), it's the type system.

I could be writing in Haskell for an even more powerful type system, of course (especially now that Haskell has gained linear types), but the language never took hold outside of academia.


I know this is facile, and it's speaking as someone with a lot more enthusiasm about than expertise with Haskell, but whenever I write some toy Haskell code and have to think about "remember to close this file handle before the end of this block" or "make sure not to use this file handle outside this block (because we just used a combinator to close it)", I'm baffled that Rust has mostly figured out that whole class of problem before Haskell
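
For comparison, the Rust version of that non-problem, as a hedged sketch ("config.txt" is made up):

    use std::fs::File;
    use std::io::Read;

    fn read_config() -> std::io::Result<String> {
        let mut text = String::new();
        let mut file = File::open("config.txt")?;
        file.read_to_string(&mut text)?;
        Ok(text)
    } // `file` is dropped (and closed) here on every exit path, early returns
      // included, and ownership rules stop you from using it afterwards.

    fn main() {
        match read_config() {
            Ok(text) => println!("{text}"),
            Err(e) => eprintln!("{e}"),
        }
    }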


> I'm baffled that Rust has mostly figured out that whole class of problem before Haskell

Have you heard of `withFile`?

https://www.stackage.org/haddock/lts-20.16/base-4.16.4.0/Sys...


Yeah, that's what my second gripe was meant to complain about. Really,

    withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r 
is just a more general way of writing

    withFile :: FilePath -> IOMode -> (Handle -> IO Handle) -> IO Handle
;)

I've also heard about the ResourceT monad transformer but as of right now I don't think I could explain what it does or judge how well it does it, maybe that's also a sufficient solution.



Note that a `Handle` can escape from `withFile` (unless perhaps this library has been updated to use linear types, I haven't followed Haskell in a while).

That being said, even without linear types, it's much harder to accidentally leak a scoped value in Haskell than in most other programming languages, as the leak shows up in the type signature.

/pedantic


> Note that a `Handle` can escape from `withFile`

Yes, this is true. If someone really wanted, they could write a version that didn't, using an ST-like type-variable trick. I guess it hasn't been considered sufficiently necessary.


Well, “never took hold” is relative. It is and will likely always be a niche language, but one with a quite big audience and ecosystem. There are people happily using languages that are much smaller; they just happen to occupy a less academic niche. E.g., I’m guessing you probably don’t have such feelings about D, even though it might very well be smaller than Haskell.


To clarify, I have actually coded a bit in Haskell and I have actually released open-source code in SML/NJ, a commercial application in OCaml and worked (as a very minor contributor) on industrial code in Coq.

But if I have to start a new industrial project, I will most likely use Rust, because it is easier to hire and because there is lots of momentum in the Rust ecosystem, which means that there are crates for just about everything.


> which means that there are crates for just about everything

There is definitely momentum, but it doesn’t make Rust’s ecosystem big relative to the top 5 in any way.


Absolutely.


They don't matter for the majority of programs, but they are essential in the niches that Rust and Zig target.

This is why it took so long to have a viable alternative to C and C++. Every other language decided "memory management is hard, GC is fine for 99% of programs, let's go with GC", and language after language kept disqualifying itself from the no-GC niche.

From the Rust intro talk: http://venge.net/graydon/talks/intro-talk-2.pdf

> • Everyone is dodging the niche I'm interested in.


A great point. As I’ve commented before though, watching webdevs get excited for it, or “rewrite it in Rust,” or developers contort it to try and do things it shouldn’t is sort of morbid from a distance. I’m sure Darwin will work his magic and Rust will settle in the intended niche. But seeing the insane excitement levels and recruiters using Rust to attract applicants…like I said, morbid, and hard to sustain.


Rust has features besides memory safety that are generally useful — robust error handling, pattern matching, fast and powerful JSON support, nice package manager, ease of deployment, first-class WASM support, etc. Some people value these features enough to use Rust even outside of its niche. Even when performance isn't critical, it may be a nice to have.

I sometimes feel like Rust is overcompensating for its hard parts :) It's not easy to learn its memory management, but the error messages do their best to teach you!


I completely agree with what you're saying here and wonder every day why Nim hasn't taken over the world. People seem not to realize that Nim not only has a "pluggable" gc system (so you can pick the gc that makes sense for your use case), but also allows you to turn the gc off. As in, no gc (which is exactly what you need in certain very narrow situations).


One of these days I'll be able to try embedded Nim with no GC. I want to see what its arena allocation semantics are like.


The thing that turns me off Nim is the issue with it getting flagged as malware. I can't even download the installer for the language without it getting removed from my computer after being flagged by Windows Defender.

It's not Nim's fault, but it is still a point of friction.


That's a compelling argument: GC/RC w/ stack allocation where possible. I associate GC w/ pointer chasing and the unavoidable L1, L2, L3 cache thrashing that goes w/ it.

To what extent is this possible? Does Nim have LTOs that rewrite memory handling across compilation units? I'm guessing no, and instead it's local & one-off, rather than something one can bank on.


> That's a compelling argument: GC/RC w/ stack allocation where possible.

Indeed, it's a great combination that means you're productive and performant without really trying most of the time. The type system is really good as well, and in general there's a strong focus on compile time over run time like Zig, but with better procedural macros than Rust.

> To what extent is this possible? Does Nim have LTOs that rewrite memory handling across compilation units?

The GC is built on general move analysis with destructors you can hook for your own types. The nice thing is it's a deterministic, compile time expansion (you can view with '--expandArc:somefunction'), so it's useful for embedded or high performance stuff.

The intro to ARC is pretty good for an overview: https://nim-lang.org/blog/2020/10/15/introduction-to-arc-orc...


https://zevv.nl/nim-memory/

  Local variables (also called automatic variables) are the default method by which Nim stores your variables and data.

  Nim will reserve space for your variable on the stack, and it will stay there as long as it is in scope. In practice, this means that the variable will exist as long as the function in which it is declared does not return. As soon as the function returns the stack unwinds and the variables are gone.

  In Nim, all your data is stored on the stack, unless you explicitly request it to go on the heap.

Strings and seqs are allocated on the heap and accessed via a stack pointer, but the stack pointer controls their lifetimes, and semantically they behave like any other local variable.


Do note that this was written before Nim's new runtime, ARC/ORC, became stable! The author has been working on a companion guide to the new runtime that goes over how it manages to be efficient with reference counting, but it's (very) much a work in progress.


> Still not convinced that memory semantics are critical in the vast majority of domains.

This is okay. Zig is not targeting the vast majority of domains. It targets the low-level, performance-critical domain that C occupies. It would be a great language to write a compiler/interpreter for a higher-level language that has the bells and whistles that you want.


It's still bizarre though that Rust is capturing such ridiculous mindshare. I suspect it has a lot to do with web developers being plugged into Mozilla, and Mozilla spending quite a lot on Rust development and marketing. And Zig may be being roped into it. It seems to be a temporary low-level programming zeitgeist driven by YouTube and Reddit recommendation algorithms to an audience that has never done it and probably never will.


> It's still bizarre though that Rust is capturing such ridiculous mindshare.

I don't think it's that bizarre. The two big headline features that bring Rust such popularity are: #1 "70% of bugs are memory-safety bugs" [1] and Rust can help solve those, and #2 C/C++ have a couple of package-manager solutions, none of which have critical mass, while Rust "comes with" cargo.

Those two make me really eager to continue experimenting with Rust.

> It seems to be a temporary low-level programming zeitgeist driven by YouTube and Reddit recommendation algorithms to an audience that has never done it and probably never will.

This is some weird gatekeep-y kinda thing. Most of us didn't start out with low-level programming. Wouldn't it have been odd and frustrating for someone to tell your younger self that you have "never written C and probably never will"?

[1] https://github.com/microsoft/MSRC-Security-Research


> This is some weird gatekeep-y kinda thing. Most of us didn't start out with low-level programming. Wouldn't it have been odd and frustrating for someone to tell your younger self that you have "never written C and probably never will"?

I'm not making any sorts of demands about who should be able to program at a low level, but I'm being realistic about who will. I have an intuition that I do not think is unreasonable that the vast majority of the 20- to 30-something crowd that recently learned JavaScript in a boot camp would bounce off of systems programming hard, while a handful might discover it's what they should've been doing all along.

The young are not who I'm thinking of at all. Someone with time and neuroplasticity can and should be spending that time toying around with C, Haskell, and Lisp to expand their understanding as much as possible.


People bounce off of systems programming because C/C++ is hard. Rust's focus on ergonomics, error management, tooling and learning resources actually makes people want to try again.


> 70% of bugs are memory-safety bugs

70% of security bugs.

Approximately 0% of the bugs that I've had to deal with in my career are memory-safety bugs.

That doesn't mean it is not important to fix or avoid security bugs (it indeed is!), but you have to be clear what you are selling.


Rust is the first low-level language that actually solves a fundamental problem of low-level programming. I don’t see why it’s surprising that it gains traction.

While I appreciate many of Zig's design goals and it definitely has some novel ideas, it is “just” a better C.


I think your comment can actually help me clarify what I meant. My perception is that there is a rift between what domains Rust is targeting versus what audience is actually hyped about Rust.

When I look at the C++ world and the embedded/realtime systems industry, what I see is lots of discussion about Carbon and Herb Sutter's CppFront. I don't see as much Rust discussion.

When I look to the JavaScript/webdev/fullstack world, that is who I see discussing Rust; these are domains where GC languages, even those with lots of GC churn like Clojure, have already proven their utility. In the creator sphere, the same folks talking about JS frameworks are the ones covering Rust and, to a lesser extent, Zig.

As usual, the C world seems completely disinterested towards what is outside of it.


There is definitely a hype among programmers who are familiar only with higher level languages, but I think the actual core audience is the (ex-)C/C++ community. I am definitely not some integral part of the Rust community, so my experience there is limited, but based on discussions in for example their subreddit, it is definitely not made up of beginners who are just learning what a pointer is.

There are beginners, but I see mostly veteran embedded/systems programmers with a very good understanding of both low-level systems and the advantages/disadvantages of C++, making for very technical and interesting comments.


Rust solves a problem for people who only know GC languages - performance. They see their developer tools rewritten in Rust to great success. Examples include ruff for Python, turborepo and turbo build for JS. These tools are worth adopting because they're much faster than what came before them. Therefore Rust is worth learning to build things where performance matters.

For C and C++ developers, the value prop isn't so clearly defined. Sure it's possible that Rust code they write might have fewer bugs. But if they think they're the sort of developer that doesn't write bugs, then what does Rust do for them? Just slows them down. I won't get into whether they're right or wrong to feel the way they do.

There's also an element of the former group being more open to learning and adopting new tools. The latter group is more likely to say "if it ain't broke, I don't need to learn a new thing".


Part of my purpose with my OP was pointing out that the limiting factor for performance is not GC. When developers are coming from JavaScript and Python, the dominating performance factors are interpretation and dynamic typing. For native-compiled GC languages like Nim, Go, and D, GC only becomes an issue when it interferes with deterministic runtime constraints (RC, bounded GC, and arena allocation fix this), or when the GC is subject to lots of allocations and enormous pressure (which Nim minimizes).

Which is why I find it weird that webdevs are hyped about Rust: They're using it in an enormous number of publicly distributed applications where a native-compiled GC language would be a clearly better long-term choice.


> Part of my purpose with my OP was pointing out that the limiting factor for performance is not GC.

Agree. Also agree that it’s possible to write performant software in the languages you mentioned.

> where a native-compiled GC language would be a clearly better long-term choice.

I don’t think this is the case, and I don’t think you’ve made the case for it. I think there’s some implicit assumption that the GC gives a better developer experience and that’s why it’s a better choice? In a vacuum maybe, but the dev ex in Rust is better than Nim for a number of reasons.

It’s easy to see if you’re right or they are. In 5 years I’m willing to bet that the tooling for the JS ecosystem is all written in Rust, Zig or Go. Nothing in Nim or D.

You might see that and think “oh, it’s because they made suboptimal choices”. I see it differently though.


If not having to clutter code with memory semantics isn't enough of a case, I could gush about UFCS obviating the need for interfaces, or Nimble and the dead-simple module and export system (and not needing to scatter 'pub mod' files all over the project), or the choice of any number of useful GCs or to forego GC, or the very quick compiler presenting very powerful, sensible options and making things like static linking trivially easy. This all blew my Rust experience out of the water.

But it’s ultimately an uphill battle. Nim has no marketing budget. It’s comparable to any language introduced in the 90s going up against Java. There is a zeitgeist of “cool” around Rust that means many tools will be written in it regardless of how much grassroots rooftop shouting anyone does about Nim. I think too many people assume we all live in a world of perfect information and rational decisions, but that just isn’t the case.


Yeah it’s entirely possible that some people pick what’s popular instead of what we think is theoretically optimal.

I would only caution you about assuming that people are following the zeitgeist instead of making sound technical decisions. For example, there's only one language that prevents data races at compile time. Maybe they write more multithreaded code than you did and really prize this feature.

Maybe they like the libraries that make the ecosystem rich. Does nim have anything as polished and performant as clap and serde? Maybe they like the generally high quality of documentation in the Rust ecosystem? I could point out several other legitimate axes on which Rust is a superior alternative.

It doesn’t matter, because you’re aware of most or all of these. What’s important is for you to avoid falling into the fallacy of “what’s popular must be bad, because the average person isn’t as smart as me”. Lots of smart people fall into this trap. It’s especially bad to fall into this when you’re thinking about platforms because the utility of a platform is directly proportional to its popularity.

Language X might have so many flaws, but if it's popular then the community around it could find myriad ways to mitigate the flaws. They'll release a million useful libraries that make our lives easier. And if the language gets 50% more popular, it gets 50% more useful to us because the ecosystem gets richer. A better ecosystem means more companies adopt it => more jobs available => it's a sound choice to learn this language in your free time => more devs available with this skill => it's a good idea for more companies to adopt it. It's a flywheel of adoption. Java benefited from this, even if it wasn't a perfect language. Rust benefits, but to a lesser extent because there are so many good alternatives now (including Java!)

Even if Rust doesn’t do it for you in 2023, I’d counsel taking a look again in 2026. The growth around the language (80% per year in crate downloads, 30% per year in crate authors) means that it’s possible that you might find most of your pet peeves might have been addressed by then. Either in the main language or in a library.


>Does nim have anything as polished and performant as clap and serde?

"Polished" and "high quality" are more subjective/implicitly about adoption, IMO. "Performant" has many dimensions. That said, I just tested the Nim https://github.com/c-blake/cligen vs clap: cligen used 5X less object file space (with all size optimization tweaks enabled in both), 20% less run-time memory for large argument lists, and the same run-time per argument (with march=native equivalents on both, within statistical noise). cligen has many features - "did you mean?/suggestions", color generated help and all that - I do not see obvious feature in clap docs missing in cligen. The Nim binary serde showing is unlikely as good but there are like 10 JSON packages and that seems maybe your primary concern.

More to add color to your point than to disagree (and to follow up on my "adoption") - your ideas about polish, quality, docs, etc. are part of the feedback loop(s) you mentioned. More users => users complain (What is confusing? What is missing? etc.) => things get fixed/cleaned up/improved => more users. Besides "performant" being multi-dimensional, the feedback loop is more of a "cyclic graph". :-) Or maybe this is just a sub-graph of "ecosystem richness". While I probably prefer Nim as much as or more than @netbioserror, I am not too shocked by the mindshare capture. It seems to happen every 5..10 years or so in prog.langs.

While many of your points are not invalid, tech is also a highly hype-driven & fad-driven realm. In my experience, the more experience someone has with this meta-feature, the more skeptical they are of the latest thing (more rounds of regret, etc.). Also, that feedback graph is not a pure good. Things can get too popular too quickly, with near-permanent consequences. IPv4 got popular so quickly that we are still mostly stuck on it 40 years later as IPv6 struggles for penetration. Whatever your favorite PL is, it may also grow features too fast.


Yeah, your points are valid.

I think the beauty of serde is its flexibility. I can change the format from JSON to one of the other 21 supported formats with ease.
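
For example, a minimal sketch (Config is a made-up type here, assuming the serde_json and serde_yaml crates):

    use serde::{Deserialize, Serialize};

    #[derive(Serialize, Deserialize)]
    struct Config {
        name: String,
        retries: u32,
    }

    fn demo(config: &Config) -> Result<(), Box<dyn std::error::Error>> {
        let as_json = serde_json::to_string(config)?; // JSON today...
        let as_yaml = serde_yaml::to_string(config)?; // ...swap the crate for YAML tomorrow
        println!("{as_json}\n{as_yaml}");
        Ok(())
    }

The derived impls stay the same; only the format crate changes.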

We're both agreed - popularity benefits platforms in many ways. I'm not a fan of hype-driven things, unless the hype ends up making the thing more valuable.

Rust is a good example - there was undoubtedly something useful there with its concept of compile time checks and memory safety with performance. It also had nice tools like cargo. But that wasn't enough. Quality alone isn't enough.

Look at the adoption curve (https://lib.rs/stats). If it had remained as popular as it was in 2018 then it would not be something I could recommend for general purpose programming. The main criticism of Rust in 2018 was that there weren't enough libraries to do basic things. You had to write and maintain a lot of that yourself. Fortunately for Rust, the hype based around its feature set and early successes snowballed into more adoption. In 2023, hardly anyone complains about missing libraries. At most they're missing one or two, and they can write those themselves.

Joining a hype train of a platform is a rational thing to do, because of the technical benefits of using a popular platform.

I'm far more critical of fads in our industry where we blindly copy practices from large tech companies, like algorithmic interviews. Copying interview practices doesn't lead to shared benefits, unlike a platform.

> Whatever your favorite PL is, it may also grow features too fast.

For what it's worth, Rust is really conservative about adding new features. The only big feature I can remember from the last 3 years is GATs, and that's a pretty nice feature. Most of the feature work is making things more consistent. For example, we can write async functions but not async trait methods. They're fixing this inconsistency now.
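
Roughly, the inconsistency looks like this (a sketch; the boxed-future signature is approximately what the widely used async-trait crate generates as a workaround):

    use std::future::Future;
    use std::pin::Pin;

    async fn fetch(url: &str) -> String { url.to_string() } // fine as a free function

    trait Client {
        // On stable Rust (2023) this would not compile:
        // async fn fetch(&self, url: &str) -> String;

        // The usual workaround is to return a boxed future instead:
        fn fetch<'a>(&'a self, url: &'a str)
            -> Pin<Box<dyn Future<Output = String> + 'a>>;
    }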


While I love Nim, another reason for Rust's success (aside from its type system, general good language design, and zero-cost memory safety) is that the tooling is ridiculously good, probably the best of any language I've used, sans maybe Java. Nim's tooling is improving but still has a ways to go - I still occasionally get orphaned langserver processes, and while the langserver itself is pretty good, it's nowhere near the level of what rust-analyzer can do. Especially with error messages and (automatic) refactoring.


That's a very limited perspective.

I have worked in HPC, Cryptography and Genetics, and Rust is sailing full speed in all these domains -- which are as remote as can be from WebDev & Mozilla.


The good thing about Nim is that you can choose between several garbage collection algorithms, use no garbage collector at all and do manual memory management, or allocate without freeing if the program is short-lived and doesn't use much memory.

Also having a C interface is great since you can use any C library.


How different are these options from each other, syntactically and semantically? Do I need to know all of them to read library code?


There are lots of good languages with GC to choose from.

Zig would not have much unique to offer if it was a GC language, imo.


This title is almost perfectly designed to do well on HN, but it's definitely worth reading in its entirety.

Some highlights for me:

- The author sees Rust and Zig in different niches. When Rust was created it didn't need to specialise because there were no languages like Rust. It was free to be a general purpose language targeting multiple domains. But that's not the case for Zig. He sees Zig shining at writing systems software that requires low level control - where unsafe code is a necessity and fine grained control over allocation is a blessing.

- The author was the original author of rust-analyzer, the LSP for Rust. I remember reading his original proposal for why Rust should ditch their existing LSP effort (called Rust Language Server) and start afresh with a completely new one. And he was 100% right. He knows what he's talking about when it comes to IDE experiences for languages. I hope the Zig developers listen to what he's suggesting here.

- "Rust is a language for building modular software ... the core of what Rust is doing: it provides you with a language to precisely express the contracts between components, such that components can be integrated in a machine-checkable way." That is a good description of Rust. Rust allows you to confidently assemble a program that works well, even if you have to collaborate with hundreds of other programmers to create it.

- "Zig is about perfection. It is a very sharp, dangerous, but, ultimately, more flexible tool." Makes sense, but this tells me that I probably lack the skill to write correct Zig.

- Comment by the author on reddit (https://www.reddit.com/r/rust/comments/123jpry/comment/jdv9x...) about advantages that Zig has is illuminating. Again, none of these appeal to me, but these are great to have in the right use case.

- In a different comment on reddit (https://reddit.com/r/Zig/comments/123jpia/blog_post_zig_and_...) he said "Until release-safe guarantees the absence of UB, I wouldn't be ready to recommend Zig as a general-purpose language." Which is fair, but it's still early days for Zig, years to go before 1.0. It's entirely possible that this behaviour could come to be.

I think the author's article makes a compelling case that Rust and Zig can both co-exist and succeed.


Zig feels like a good match for replacing C on microcontrollers because it permits "hold my beer" shenanigans and memory management is less of an issue. Zig metaprogramming also seems like a better match for these environments (think about how much nicer you could do stuff like qmk); the Rust embedded stuff seems quite painful in comparison.

Haven’t tried this yet, it’s on my perennial todo list and when I touch this stuff I usually want to get something done. I also imagine you’ll have to do everything yourself with Zig here, while Rust does have a lot of HAL crates.


I have the exact opposite experience. Rust's move semantics, borrowing, Send/Sync types and checks, and type-based compile-time state machines are exactly what I was missing specifically on embedded systems in C/C++, and what I'm sorely missing in Zig.

In Rust you start by splitting peripherals into blocks, then into registers, then into register parts, and you distribute these (these are zero-cost operations!) into IRQ handlers and tasks, and the type system will check that everything that needs exclusive access has it .. that you do not use a read-modify-write operation on the same register from two interrupts where one can preempt the other without some guard such as a mutex .. while it perfectly well allows you to do this on write-only registers without a guard, etc... it is just so nice to work with.
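
A rough sketch of that pattern with made-up register tokens (not any real HAL API):

    // Zero-sized tokens standing in for pieces of one peripheral.
    struct TimerCtrl;   // read-modify-write register: needs exclusive access
    struct TimerClear;  // write-only register: fine to use without a guard

    impl TimerCtrl {
        fn set_prescaler(&mut self, _div: u8) { /* volatile read-modify-write */ }
    }
    impl TimerClear {
        fn clear_irq(&self) { /* volatile write-only */ }
    }

    // Move TimerCtrl into exactly one handler; the compiler then proves no
    // preempting interrupt can race on its read-modify-write sequence.
    fn timer_irq(ctrl: &mut TimerCtrl, clear: &TimerClear) {
        ctrl.set_prescaler(8);
        clear.clear_irq();
    }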

People who find it complicated and/or not worth it are usually doing either some rudimentary, super-duper-easy stuff on architectures or OSes which do not have or utilize nested interrupts, or they do it on some IoT-like noncritical stuff where they don't give a s** (vs industrial machinery, drones, other vehicles..).


> Zig feels like a good match for replacing C on microcontrollers because it permits “hold my beer” shenanigans

People who have used Zig to do embedded work have reported the opposite: they like Zig because it helps keep simple things simple and not more unsafe than they have to be.

https://kevinlynagh.com/rust-zig/


I'm not sure what was available to the author at the time of writing, but it feels like abstractions have gotten better and writing embedded Rust has gotten easier.

I also wrote keyboard firmware in Rust and thought it was great! I leaned on embedded-hal and the chip HAL crates (one layer of abstraction higher than the author in your link) and was able to get the majority of the firmware done in a day or two. For example, here is my key matrix scanning function:

https://github.com/bschwind/key-ripper/blob/d33db6144bbb6f80...

I also gave a small talk on the subject (aimed more at beginners) at a Rust meetup in Tokyo:

https://www.youtube.com/watch?v=x7LQevYn7d0

And finally, a (somewhat) dated article I wrote about using Rust for embedded work:

https://blog.tonari.no/rust-simple-hardware-project

Overall, it feels totally usable to me, and I love how Rust's ownership and lifetime concepts prevent me from doing stupid things like reusing a pin for two peripherals or configuring a peripheral with pins it can't work with.


> I remember reading his original proposal for why Rust should ditch their existing LSP effort (called Rust Language Server) and start afresh with a completely new. And he was 100% right.

Where can I read more about this? What ended up happening?


https://blog.rust-lang.org/2022/07/01/RLS-deprecation.html was the announcement of the replacement, and talks about it a little bit:

> RLS was introduced by RFC 1317 and development was very active from 2016 through 2019. However, the architecture of RLS has several limitations that can make it difficult to provide low-latency and high-quality responses needed for an interactive environment.

> rust-analyzer uses a fundamentally different approach that does not rely on using rustc. In RFC 2912 rust-analyzer was adopted as the official replacement for RLS.

These RFCs provide some more color.

The long and short of it is, the RLS relied on a process where rustc would sort of pre-process your codebase, and spit out a "save analysis" file. RLS would read this file, and do its thing. rust-analyzer, on the other hand, doesn't use rustc at all(*), and instead analyzes your code on-demand, in an incremental style. The latter has many advantages in an IDE context.

The * is there because part of the original idea is that rustc would end up producing some libraries that rust-analyzer would also use, allowing them to share code. Given that rustc has also been moving towards an incremental, on-demand architecture, this would be easier. I am out of the loop these days, but from the relative outside, this effort seemed to stall out. I don't really know why. Maybe it has been happening and just isn't really visible.

Regardless of those details, rust-analyzer is very good. It's a shame the transition took so long, in retrospect!


Very cool. Thanks for the links & write up!


I can't find the original link to his proposal, but I did send him a message on reddit asking if still had a link to it.

The gist of it was, he felt the first attempt to make a Rust LSP (RLS) had the wrong architecture. It was basically running the Rust compiler in check mode and relaying the errors back. In his opinion an IDE needed a different, incremental architecture that stored state. It also needed to be much more tolerant of errors.

He had already built JetBrains' IntelliJ-Rust plugin (a Rust IDE plugin written in Java), and wanted to make a similar tool in Rust.

So he started the rust-analyzer project and it succeeded. Last year it became the official LSP of the Rust project. (https://blog.rust-lang.org/2022/02/21/rust-analyzer-joins-ru...)


> - The author was the original author of rust-analyzer, the LSP for Rust

Not only that. matklad is also the original author of IntelliJ-Rust, another popular Rust IDE.


> He sees Zig shining at writing systems software that requires low level control.

Honestly don't see why we would need to specialize this use case to a different language altogether when Rust already has `unsafe`.


Some arguments about how Rust `unsafe` is not the same as Zig:

https://zackoverflow.dev/writing/unsafe-rust-vs-zig/


My general understanding is that Zig was made to make video games and Rust was made to make system software, and everything falls out from there.


Did you mean Jai, which indeed is made with game development in mind? Because I've never heard anyone talk about games with Zig, and iirc one of the original motivations for the language was audio software.


> First, I think Zig’s strength lies strictly in the realm of writing “perfect” systems software. It is a relatively thin slice of the market, but it is important.

This resonates with me. This is exactly why I use C today, and why Zig appeals to me.


This definition of "perfect" might be counterintuitive for those that haven't read the piece.

The idea is that there might be some ideal memory allocation scenario that Rust doesn't do "perfectly" that you can theoretically do better in Zig, which I buy. Most particularly, my experience with arenas and defer drop is -- they aren't always perfectly easy in Rust. But this definition of perfection tends to ignore all the manual, boilerplate memory management you have to do in the interim, that you may still mess up in some subtle way.


> Most particularly, my experience with arenas and defer drop is -- they aren't always perfectly easy in Rust.

That is my experience also: https://blog.reverberate.org/2021/12/19/arenas-and-rust.html

I was hoping the borrow checker would perfectly check arena lifetimes. And it does, but it requires that your types have lifetime parameters. That makes logical sense, but my experience attempting to actually use a type with a lifetime parameter is that it is so painful as to not be practical in a low-level library.
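
A sketch of that pain using the bumpalo crate (Parser and Compiler are hypothetical types):

    use bumpalo::Bump;

    // The lifetime parameter appears the moment a type stores arena data...
    struct Parser<'arena> {
        nodes: Vec<&'arena str>,
    }

    // ...and then infects every type, impl block, and trait bound that
    // embeds a Parser, all the way up.
    struct Compiler<'arena> {
        arena: &'arena Bump,
        parser: Parser<'arena>,
    }

    fn demo() {
        let arena = Bump::new();
        let name: &str = arena.alloc_str("hello"); // lives as long as the arena
        let _ = name;
    }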


"Why is Rust’s 'Bump' not' Sync'?" was exactly the issue I was thinking of. Thanks for writing this. Seriously, a must read!


Not a particularly deep comment, but... matklad (the author) created, essentially, the entirety of the modern Rust developer experience. He is the original author and biggest contributor to both rust-analyzer and Intellij Rust. If someone this central to the Rust world has switched to writing Zig full time, and not for boring, pragmatic reasons either, a little seed of doubt about the long-term stability of Rust as a professional community begins to grow in my heart.


Graydon Hoare, who created Rust, has been working on the Swift language at Apple for the last few years. People change jobs, there is nothing wrong with that. Rust is a community-driven project, and the community will keep building. The IntelliJ Rust plugin has not stopped evolving since matklad left. matklad may have found an interesting new project that happened to be using Zig. He didn't say he left the Rust community.


The post is about how the languages serve different purposes, and he just happens to be working on a project now that he believes benefits more from Zig. Nowhere does he make broad-strokes statements about Rust's (or Zig's) fate


> Zig forces you to pass the allocator in, so you might as well think about the most appropriate one!

I really feel like this is an underappreciated aspect of some of the more complained-about constraints that these newer languages place on programmers.

Like defaulting to an i32, the default allocator will do a reasonable thing most of the time, but having to think about each allocation (is this short-lived or long-lived? is it actually necessary to do a heap allocation at all here? do you actually want GC?) makes code better and pushes the programmer to improve their ability to reason about what they're actually doing in ways that provide tangible benefits.

By way of comparison, when I started writing Rust I was very frustrated by all the explicit conversions, but eventually realised that it (a) forces me to actually stop and think about what the most natural type is for a given value, and (b) also makes it clearer where the natural boundaries are between components. In my experience, if there’s a lot of conversion going on within a function or module, it’s usually not because the language is ‘too verbose’. Instead, it’s because the API boundaries are in the wrong places, and the language has helped to reveal this by making doing the wrong thing harder.
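
For instance (hypothetical values):

    fn demo(count: u32, len: u64) {
        // Lossless widening still has to be written out explicitly:
        let total = u64::from(count) + len;

        // Narrowing forces you to decide, right here, what failure means:
        let index = usize::try_from(total).expect("index too large for this platform");
        let _ = index;
    }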


Taken ad absurdum, that would make assembly a better choice — we can just as well reason about better register allocation strategies.

I think the fundamental abstraction level is very important to get right. Of course it depends on the problem domain, and it might just make sense for Zig to expose allocators explicitly everywhere, but maybe an implicitly passed allocator when not otherwise specified could be a better trade-off (correct me if this is already a thing).


> That’s the core of what Rust is doing: it provides you with a language to precisely express the contracts between components, such that components can be integrated in a machine-checkable way.

That's a nice description of rust.


I think writing a language full-time is a given or necessity if you've been hired to work on software that uses it. Unless you've had prior experience and can continue tinkering with other things. It's cool that matklad's moved to such an interesting project.

I saw TigerBeetle a few months back, and thought that it's interesting to build a double-entry DB. As an accountant who's built double-entry systems into 'normal' SQL DBs, I find this an interesting enough concept that I'm now intrigued to try it out.

Perhaps the year is still young enough for me to finally learn Zig while I go through their repo. It might also help me with the awkward knowledge gap that still exists between Rust FFI (in that I seldom know what I'm doing) and C.


For those reading only the beginning, one could think that Matt is endorsing Zig over Rust. It's not that, from the article:

> It’s not true that rewriting a Rust program in Zig would make it simpler. On the contrary, I expect the result to be significantly more complex (and segfaulty). I noticed that a lot of Zig code written in “let’s replace RAII with defer” style has resource-management bugs.

I also have read Matt's "Hard Mode Rust", and I think there needs to be a better way. Why can't we have a flexible language that can span both styles of allocating resources? I also wonder what TigerBeetle would look like if it were written in Rust using the "Hard Mode".


“Matt” is not his name https://matklad.github.io/about.html


> we don’t have a reliability-oriented high-level programming language with a good quality of implementation (modern ML, if you will)

Java is fairly popular, I hear.


Calling Java a modern ML is a bit of a stretch :-). It's getting sum types, I hear, but it still has null, and it doesn't really have the feel of a nice expression-oriented Hindley-Milner language. If anything, Scala 3 might be a closer candidate.


It already has sum types. Pattern matching on them is not yet finalized, but ADTs themselves are.


let us know when Java's type system is sound, and can be completely inferred.


This is easily lost in the noise, but Matklad wrote a good comparison between Rust and Zig which is worth linking to, highlighting Zig's strengths: https://www.reddit.com/r/rust/comments/123jpry/comment/jdv9x...


The rust-analyzer guy now writes Zig full time?


> I now find myself writing Zig full-time

Seemed pretty clear to me. :)


Yeah that made me wow too!

He's gonna have a wild career.


Yes, he found a job writing Zig.


I'm learning zig myself after several years of doing rust and golang (I still use golang a lot).

I particularly like this quote: "Because even if you and I both know how to write memory safe C, it's very hard for us to have an interface boundary where we can agree about who does what."

This is still a huge issue in many domains, in many languages. Do you throw the contract about responsibility out the window because you lean on your GC? Or do you make it explicit who owns what when you don't have a GC? If I allocate a struct as part of my ABI to give to you, the application developer, am I clear on who's responsible for freeing the struct? Do I provide a function for that, or do I call it out as gospel for using the library? I wish we didn't have to have these fights.
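
One way to make the contract explicit at an FFI boundary, as a sketch (a hypothetical widget library; whoever allocates provides the matching free):

    #[repr(C)]
    pub struct Widget { id: u64 }

    #[no_mangle]
    pub extern "C" fn widget_new() -> *mut Widget {
        Box::into_raw(Box::new(Widget { id: 0 })) // caller now owns the pointer
    }

    #[no_mangle]
    pub unsafe extern "C" fn widget_free(w: *mut Widget) {
        if !w.is_null() {
            drop(Box::from_raw(w)); // reclaim ownership; freed by *our* allocator
        }
    }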


Could someone chime in and make a quick 3-way comparison with Go?

I already know Rust and have shipped nontrivial stuff, but I need something simpler that average developers can pick up quickly to produce performant+parallel code and be able to crosscompile+ship a single binary.

(The last point already ruled out Crystal which has a poor crosscompilation story as well as insanely long compile times for nontrivial stuff = bad DX)


It takes a while to understand allocations in Go. There are obvious ones like `make()` and `new()` and `append()` (though this one doesn't always allocate depending on the capacity of the thing you pass to append, I think). There are less obvious ones (if you don't have experience thinking about it) like string appending and "casting" between `string` and `[]byte`. I think it's also tricky to figure out when escape analysis goes wrong and you're accidentally heap allocating things you didn't need to.

You develop an awareness of these aspects if you do performance-sensitive code in Go. But in Zig all allocations are explicit so you can't really accidentally have allocations.

On the other hand, you have to manually manage memory in Zig. Unlike Rust, Zig has no builtin atomic reference counting. So memory management in Rust and Go are "easier" if you don't want to be aware of memory allocations. Memory management in Zig is "easier" if you need to be aware of memory allocations.

And neither Zig nor Go is as strict as Rust about ownership, so it's "easier" to express internally-mutable data structures in Zig or Go (see "Learning Rust With Entirely Too Many Linked Lists" [0], and the sketch below). Whereas it's easier to be sure you haven't screwed up ownership in Rust compared to Zig. In Go you don't have to worry, since the GC will not let ownership be a problem.

[0] https://rust-unofficial.github.io/too-many-lists/
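
To sketch the Rust side of that: a doubly-linked node needs shared ownership plus interior mutability (and a Weak link to break the reference cycle) before the compiler accepts it, where Zig or Go would just use two plain pointers:

    use std::cell::RefCell;
    use std::rc::{Rc, Weak};

    struct Node {
        value: i32,
        next: Option<Rc<RefCell<Node>>>,
        prev: Option<Weak<RefCell<Node>>>, // Weak, or the list leaks
    }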


The main topic of the article is resource management (i.e. allocation), which is automatic in Go, so a comparison wouldn't make much sense in this context.

Additionally, a requirement of the project is:

> On the engineering side of things, we are building a reliable, predictable system. And predictable means really predictable. Rather than reigning in sources of non-determinism, we build the whole system from the ground up from a set of fully deterministic, hand crafted components.

which rules out garbage collected languages (also see the design document here: https://github.com/tigerbeetledb/tigerbeetle/blob/fe09404d46...).


Go is absolutely nothing like Rust, Zig, C; it is arguably closer to JS than those. I really dislike how it is camouflaged as a low-level language.


Happy to see the tide turning from Rust towards Zig. I think 2023 is going to be a pivotal year in this space!


Not entirely sure how you got that impression from the article.


Probably just a small miss, but in the first section the author refers to stack space limitations possibly hit when calling malloc. That should probably refer to heap space. Or if it is stack space they meant then maybe not malloc but rather stack allocations / function calls.


I think they’re actually referring the stack usage from the function call itself (the creation of a new stack frame). I believe the point the author is making is that while it’s common to check the return value of malloc to see if heap allocation was successful, it’s comparatively rare to do any analysis of stack usage to verify that none of your function calls are going to cause a stack overflow.


Rust at least will stack probe to trigger any guard pages upon allocation.


Besides that little inaccuracy, I very much agree with most things in the article and think it brings up good points. Specifically, the part about allocator handling is a huge thing for me. Having the allocator be a global thing has always been a bit of a cheat imo, and removing that alone solves so many memory issues. It opens up some optimization possibilities as well.

I do agree with some of Zig's cons that you bring up as well. Currently I have really been into linear types, and I feel like they would be the perfect "simple" type-based memory safety Zig could incorporate without making it C++-levels of complicated. But at the same time I don't know if it would work with comptime, and maybe it too would compromise the simplicity of the language. I don't know.


I think he's referring to the fact that as a function, calls to malloc result in a new stack frame.


Zig appeals to the “lone genius hackers”. I'd read this as the folks that like to know all the code in their code base, as is typical for embedded or moderate size projects. Folks that do not need (and resent) the abstractions made for every use case under the sun, which to them only ends up making the code harder to read and to comprehend.

Though Rust does an admirable job, and things are much better than for OOP languages, it's still a behemoth. Take a (any) small project, and the number of dependencies upon dependencies explodes. A project easily takes up 1GB on disk, and it is impossible to know exactly what has been pulled in, or why. How much of this code actually ends up in the binary? It feels like a small fraction, like 5% (or even 1%), but that 95% is there and causes breakage. Now you have to debug something that's literally out of your world. It's not Rust, it's the build. No matter how small the actual issue is, you are not qualified to find it.

Note that Zig's build system provides the exact opposite experience. Errors are always relevant, and can (therefore) always be solved.

In summary Rust takes away the helicopter view. Yes, in return it provides an unparalleled level of assurances, it really is magic. But I find myself returning to the roots, and Zig feels tasty. I bet ChatGPT will then handily convert the project, when done, to Rust, faster than I ever could.


You don’t need to pull in dependencies if you don’t want to.

But if you do, the contracts on ownership enforced by the compiler allows the software to compose better.


You're a Rust zealot? The point still stands: for lots of projects 95% of the code pulled in is not actually used, but is still cause for breakage. Of the worst kind, because it has nothing to do with the problem being solved. Sorry, but the build system just gave up, for no good reason at all.

Only total perfection is acceptable to Rust, and that is also its Achilles heel, well exemplified by this case. Now Rust could change and be more flexible, and I would have great faith in its leadership to find a way, if not for people like you, who charge ahead on a straw man and excommunicate.


# First rule of zealotry

Whoever is first to throw out the zealot accusation is himself/herself the zealot.


Really weird that my innocuous comment would prompt such a visceral response. Not cool.


> Chatgpt will then handily convert the project [..] to Rust

That's more or less impossible if you mean idiomatic Rust that isn't chock-full of unsafe blocks. Besides the (probably ironic?) overhype of ChatGPT, Rust's lifetime-memory patterns are a strict subset of what is expressible with correct Zig/C.


From the linked article

> When we call malloc, we just hope that we have enough stack space for it, we almost never check.

malloc allocates on heap, not stack.


This comment from the author elaborates on what he meant here: https://old.reddit.com/r/Zig/comments/123jpia/blog_post_zig_...


In addition to the semantics, part of which I tried to understand and failed [1], Zig also has a problem with compiler bugs.

In that same post, my very first example ran into a compiler bug. This is not reassuring.

(Though to be fair, I've run into more compiler bugs than most do: one in GCC, one in Clang. Maybe I'm just unlucky.)

That, combined with the fact that the bug was still not fixed last I checked, that Zig's policy is to consider such bugs non-security bugs until 1.0, and that it is still not 1.0, makes me wonder why anyone would choose it for a major project running in production. That was TigerBeetle's first and biggest mistake.

[1]: https://gavinhoward.com/2022/04/i-believe-zig-has-function-c...


>Erlang style, where we embrace failability of both hardware and software and explicitly design programs to be resilient to partial faults.

>SQLite style, where we overcome an unreliable environment at the cost of rigorous engineering.

I don't think that the way this contrast is illustrated is nearly as helpful as the author intended it to be. Based on my prior knowledge of Erlang and SQLite, I can reconstruct the idea that the difference is that Erlang tries to build a "reliability layer" below the software, while SQLite builds lots of checks and self-correction into the software. But if I didn't already know that, the quoted lines would leave me hopelessly confused.


And also the billions of tests they run against every build!

https://www.sqlite.org/testing.html

I enjoyed the CoRecursive podcast with Richard Hipp, it's fascinating.

https://corecursive.com/066-sqlite-with-richard-hipp/#billio...


The only advantage Zig has is its ability to consume C/C++, as it ships with clang.

That's about it. The language is too verbose, and sometimes not verbose enough: with all these .{} you always need to look at the documentation to remember what they do, which is time-consuming and causes useless context switches.

Last time I tried to build their language server to use in my editor, and I was surprised it took an eternity to compile (similar to Rust here); I was hoping it would do better.

I love the build.zig feature: you stick to one language, there is no JSON/YAML/TOML/XML BS. It's very refreshing to see.

Cons:

- ergonomics

- slow to compile

- verbose but sometimes not (.{})

Pros:

- builtin C/C++ compiler

- build.zig

- files are structs

- lazy compilation model (you can write platform specific code very easily without bloating your project)


I tend to agree with most of your remarks.

Zig would clearly be a better choice than Rust to replace C.

Zig is also better than C in some ways, but worse in others, which is frustrating.

The catch is that most of the good parts are either transient QoL improvements (such as the build system) or guard rails to avoid costly mistakes (no macros, explicit memory management), while the drawbacks (annoying syntax, peculiarities, probably trying to do too much) seem to be long-lasting daily annoyances.

I am waiting for 1.0 and also for Jai.


> Collections are not parametrized by an allocator, like in C++ or (future) Rust.

What does he mean by this?


Pseudo-Rust code here, showing off both styles.

Parameterized by an allocator:

  struct Foo<T, A: Allocator> {
      // ...
  }

  impl<T, A: Allocator> Foo<T, A> {
      fn new() -> Foo<T, A> {
          // do something to make a new foo, calling functions via A
      }
  }
contrast with "an allocator is passed in explicitly to every method which actually needs to allocate."

  struct Foo<T> {
      // ...
  }

  impl<T> Foo<T> {
      fn new<A: Allocator>(a: A) -> Foo<T> {
          // do something to make a new foo, calling functions via A
      }
  }
(EDIT: "new" was a bad choice for me to pick here, see the child comment that uses 'put' instead; I was not trying to suggest that the allocator only parameterizes a constructor, but any call that would need to allocate.)


I suppose the distinction you're suggesting here is that in the Zig-like example, two `Foo<T>` can have the same type and different allocators.

The author mostly notes that this is more flexible. I can see it being possible in each case to have the system be parametric in the allocator, though it'll be a lot more annoying in Rust given the need to pipe the types around.

Something between that noisiness and the global default does bias Rust toward being uncreative with its choice of allocator.

I wonder, too: in this example, the author notes that because their init function takes an allocator and their event loop doesn't, the event loop does no allocation. But, as long as a global allocator can be accessed from somewhere besides your entrypoint, you could still be calling it. Does Zig offer a capabilities model like that?

All in all, I think this difference is more subtle than Matklad is making it out to be, but I in no way doubt that he's correct. Especially in today's Rust.

Edit: I did some research on Zig and also noticed HashMapUnmanaged which might be more what Matklad was referencing. In pseudo-Rust, it has

    impl<K, V> HashMapUnmanaged<K, V> {
      fn put<A: Allocator>(&mut self, key: K, value: V, allocator: A);
    }
This justifies the statement "Rather, an allocator is passed in explicitly to every method which actually needs to allocate." and makes it much more clear where Zig has gone here.


Ah, you are right, "new" was a poor choice for my example. I didn't even realize it implied something slightly different until you made this comment; I'm going to update mine slightly to point this out. Thank you.


I understood the comment to be more about some way to parameterize allocators on a per-type basis for standard library types, and not a specific mechanism.

C++ nowadays has stateful allocators (e.g., allocate in this specific pool please). With that, even the variant with a type parameter on the type benefits from a constructor argument that specifies an allocator object explicitly.

And I don't think the type-parameter-on-functions-only approach can be made safe for immutable data structures because the allocators used for allocation and deallocation must match. And the appropriate allocator needs to be known at Drop time anyway.


Having dealt with the brittleness of mixed-allocator code in C/C++, I much prefer library designs that ensure I'm using the right allocator in the right place.


Thanks for the example. I wasn't aware this is something that was planned for Rust, seems like a pretty large breaking change. Interesting!


Nothing is breaking! The change isn't a move to a zig-like style, but instead, that in today's Rust, you have

  struct Foo<T> {
      // ...
  }
and not

  struct Foo<T, A: Allocator> {
      // ...
  }
This isn't breaking because there is a default type provided for A.
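
Spelled out, the declaration that makes it non-breaking looks something like this (sketch):

  struct Foo<T, A: Allocator = Global> {
      // ...
  }

Code that names Foo<T> keeps compiling because A falls back to Global.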

The handwave is over "Today's Rust", as this machinery has already landed, in a sense, it's just not really possible to use with the standard library because the allocator API isn't fully stable.

Here's an actual example: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html See that "A = Global"?


I see. I'm interested in the motivation and design goals behind it, is there something that spells those out? The only thing I could find quickly is this [0].

[0] https://internals.rust-lang.org/t/why-bring-your-own-allocat...


https://rust-lang.github.io/rfcs/1974-global-allocators.html was the original RFC.

My vague understanding is that there's a working group https://github.com/rust-lang/wg-allocators

The further I get from working on Rust day to day, the less I know about these things, so that's all I've got for you.


What if you want there to be individual allocators (of the same type) that point at different memory based on some condition? For example, in an M:N greenthreaded system perhaps you want each greenthread to have its own arena allocator. But you can't use something like thread-local storage, because at any time a greenthread might move to any given OS thread. Is that possible in Rust?


1. You can do anything you want in Rust, you can make your own collections do whatever you want at any time.

2. These changes are about two things,

2a. the first of which is a trait that represents the concept of an allocator, in case users' code would like to be generic over ones that exist in the ecosystem. You can of course still paper over this yourself but the whole point of a vocabulary trait is so that you don't have to do all the work yourself.

2b. the second of which is how the collections in the standard library are customized by allocators.

So, in your own code, absolutely you could do that if you wanted to. With this proposal, and trying to do that with a standard library provided data structure? I'm not an expert on the API that it gives, so I can't really speak to its viability here. I would imagine it's not super simple, given what I do know.


sorry, I should have been clear: I meant with a standard lib provided data structure.

Thanks!


Yes, the data structures in std are parameterized over the allocator, with Global as the default allocator if not specified.

https://doc.rust-lang.org/std/vec/struct.Vec.html


The point is, they are parametrized over a type, not an instance. By contrast, types in Zig that are parametrized by allocator are (typically) parametrized over an instance.


My understanding is, instead of the allocator being a generic parameter of the type, it's a value you pass to the constructor.


The crucial trick is that it's not just the constructor, it's everywhere which might allocate.

Rust's proposed allocator API (what the article calls future Rust) takes an allocator in constructors, but the effect is that the type parameter is just inferred during construction, the same way that, if you say "OK, make me a Vec out of this array of Strings", Rust infers the Vec's type Vec<String>.
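
On nightly, with the unstable allocator_api feature, that inference looks like this (a sketch; the API may still change):

    #![feature(allocator_api)]
    use std::alloc::System;

    fn demo() {
        // A is inferred from the argument, just as T is inferred from the push:
        let mut v = Vec::new_in(System); // v: Vec<i32, System>
        v.push(1i32);
    }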

Zig's standard library provides both conventional compound structures like those I described for Rust, and "Unmanaged" variants in which you must provide an allocator every time you do anything which might allocate. So, e.g., addOne on ArrayListUnmanaged requires the allocator, which it will only actually use if adding a single element to the ArrayList would exceed its current capacity. It will assume this is the correct allocator to de-allocate the old backing storage and allocate new storage, so you can't use this design to move from one allocator to another.

Interestingly Zig's ensureTotalCapacity not only avoids the problem of C++ reserve where it destroys the amortized growth behaviour, but it actually insists on exponential growth even if that considerably outstrips the growth requested.

Say we've got 15 Foozles in an ArrayList with capacity 16 (or Rust's Vec or C++ std::vector). We know we want to put 20 more foozles in, for a total of 35, although maybe more.

Rust's Vec says OK, reserve(20), we were thinking of next growing from 16 to 32, but 35 won't fit in 32, so 35 it is. Capacity becomes 35.

Popular C++ std::vector implementations likewise will pick 35 here. But Zig's ArrayList says 16 + 8 + 8 = 32 not big enough, try 32 + 16 + 8 = 56. Capacity becomes 56!
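
A quick check of the Rust side (exact capacities are an implementation detail, so treat this as a sketch):

    fn demo() {
        let mut v: Vec<u8> = Vec::with_capacity(16);
        v.extend(std::iter::repeat(0u8).take(15)); // len 15, capacity 16

        v.reserve(20);               // must make room for at least 35 in total
        assert!(v.capacity() >= 35); // today's std picks exactly 35 here,
                                     // where Zig's ArrayList would pick 56
    }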


This sort of convinced me to stay away from Zig


can you elaborate why


Not the parent, but Zig being much easier to use unsafely is not a good thing IMO. Rust tries to pave a path for all types of developers to eventually learn how to write performant code safely in a way that integrates cleanly. This is a much more practical need for the programming community at this time.

I have to agree with the parent - I was interested in looking at Zig eventually, but after reading this I am not. They seem to have much more different philosophies than I expected, and Zig's does not resonate.

The author's summary really sums it up nicely. I don't need a language that prides itself on being a bit more dangerous in exchange for some extra control.


The assumption with Rust is that memory safety (or, generally, resource provenance) is something that needs to exist in the type system. Zig is still young, so I would say it's not 100% certain that this couldn't instead live in a separate step, for example static analysis.

Rust is already bumping into important cases where the type system is incapable of resolving things that "you might want its type system to track", and support for these would require breaking all of Rust.

Before you say "static analysis is impossible" -- that's very likely generically true for C (something like seL4 is in C but gives even stronger guarantees than Rust, though IIRC it analyzes machine code). It might not necessarily be true for Zig. In extremis you could in principle attach a separate file that provides Rust-style function headers for every single function in Zig, and check resource lifetimes inside of each function. So it is not theoretically impossible, just a question of whether or not it's ergonomic and easily accessible.


FWIW seL4 could have verified the C; they simply decided to verify the machine code so that they didn't have to trust that the compiler correctly compiles the verified software.

The issue with broader formal-method use is that it's quite a bit of work. The code to be verified has to be designed from the outset for verification, which limits design choices in interesting ways, and even then the verification code far outweighs the code being verified, at about 25:1.


Thank you for the clarification!


> Rust is already bumping into important cases where the typesystem is incapable of resolving things that "you might want its typesystem to track", and support for these would require breaking all of rust.

I would be interested in hearing what it is you had in mind. Are you referring to the discussions around capabilities?


Yes, for example.

Another dramatic one is the whole parametrized-keywords thing, though I guess it doesn't quite "break all of Rust".


Agree with everything you've said.

> I don't need a language

This, but the bigger issue is that I don't think I can be trusted with such a sharp tool. I'm just aware of my own limitations, and I think I'd better stick to safe languages.


I don't want to need to be trusted. And I don't want to have to trust others so much. I want my tools to tell me I, or someone on my team, did something stupid.


Even better, to not be allowed to do something stupid.


> - Rust is about compositional safety, it's a more scalable language than Scala.

> - Zig is about perfection. It is a very sharp, dangerous, but, ultimately, more flexible tool.

Depending on project, experience, goals, team composition and personality, reading these two bullet points you're very likely to gravitate much more towards one or the other, and potentially feel rejection towards the other.


Essentially the other comments capture my view. Speed and memory safety with a rigorous type and language system that disallows unsafe operations unless explicitly marked as such is worth a lot. I don't think this means Zig is worthless, and I see a lot of utility in a fast-to-write, lower-cognitive-load language that offers the same speed and memory efficiency at the cost of runtime safety, but I don't trust myself or other programmers to do a good enough job. Memory safety bugs are just so difficult to reason about most of the time that I find the compile-time safety mechanisms worth whatever overhead they impose, to a point. Rust is far from reaching that point and is generally easy to develop very complex stuff in very quickly. So, after reading the article (I'm more familiar with Rust than Zig, so it was very informative), I've concluded it'll be fun to learn Zig, but I don't think it'll be useful for Serious Stuff.


tl;dr: author finally graduated to becoming a lone genius hacker.


TLDR: Zig bad (like C), Rust good.

Not surprised to see an article like this from a core Rust member after two positive articles for Zig on the HN frontpage. Gotta keep that stronghold on HN!


It seems you didn't really read the article. Nor do you have any idea who the core Rust members are.


>> When we call malloc, we just hope that we have enough stack space for it, we almost never check.

Does Rust or Zig's malloc allocate on stack? Those millennials.


It's a function; when you call it, it allocates a frame on the stack.


True. But the author mentions malloc; it just sounds like he's using "stack" instead of "heap" by mistake.



In our times malloc always allocated on heap, not on stack. But things are different now, I guess.


I think we should take into consideration AI assistants like ChatGPT when discussing Zig vs. Rust. Differences between Zig and Rust:

1) Zig is memory-unsafe and thread-unsafe whereas Rust isn't.

2) Zig is easier to learn whereas Rust isn't.

3) Zig isn't stable yet whereas Rust is.

ChatGPT can help with Rust:

https://news.ycombinator.com/item?id=33872369

However, ChatGPT can't help with Zig since Zig hasn't stabilized yet. Features like the new for-loop syntax (introduced in 2022) aren't familiar to ChatGPT, since ChatGPT's cut-off date is 2021.



