Author here. I think the Rust vs. Go question is interesting. I actually originally wrote esbuild in Rust and Go, and Go was the clear winner.
The parser written in Go was both faster to compile and faster to execute than the parser in Rust. The Go version compiled something like 100x faster than Rust and ran at something around 10% faster (I forget the exact numbers, sorry). Based on a profile, it looked like the Go version was faster because GC happened on another thread while Rust had to run destructors on the same thread.
The Rust version also had other problems. Many places in my code had switch statements that branched over all AST nodes and in Rust that compiles to code which uses stack space proportional to the total stack space used by all branches instead of just the maximum stack space used by any one branch: https://github.com/rust-lang/rust/issues/34283. I believe the issue still isn't fixed. That meant that the Rust version quickly overflowed the stack if you had many nested JavaScript syntax constructs, which was easy to hit in large JavaScript files. There were also random other issues such as Rust's floating-point number parser not actually working in all cases: https://github.com/rust-lang/rust/issues/31407. I also had to spend a lot of time getting multi-threading to work in Rust with all of the lifetime stuff. Go had none of these issues.
The Rust version probably could be made to work at an equivalent speed with enough effort. But at a high-level, Go was much more enjoyable to work with. This is a side project and it has to be fun for me to work on it. The Rust version was actively un-fun for me, both because of all of the workarounds that got in the way and because of the extremely slow compile times. Obviously you can tell from the nature of this project that I value fast build times :)
> Many places in my code had switch statements that branched over all AST nodes and in Rust that compiles to code which uses stack space proportional to the total stack space used by all branches instead of just the maximum stack space used by any one branch: https://github.com/rust-lang/rust/issues/34283.
Can you work around this by using separate functions for the branches? This will have other benefits for compile time as well. Generally, small functions are better for compile time, because some compiler passes are not O(n), including very basic ones like register allocation.
For large switch statements I do this for readability reasons, because I try to keep my functions small.
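To make the suggestion concrete, here is a minimal sketch of what "separate functions for the branches" could look like. The `Expr` type and helper names are made up for illustration and are not from esbuild, and whether this actually shrinks the frame depends on the compiler's inlining decisions, which is why the helpers are marked `#[inline(never)]`.

```rust
// Hypothetical mini-AST; these names are illustrative, not esbuild's.
enum Expr {
    Number(f64),
    Neg(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
}

// The match only dispatches, so the frame of `eval` itself stays small.
fn eval(expr: &Expr) -> f64 {
    match expr {
        Expr::Number(n) => *n,
        Expr::Neg(inner) => eval_neg(inner),
        Expr::Add(lhs, rhs) => eval_add(lhs, rhs),
    }
}

// Each helper holds the locals for its own branch. `#[inline(never)]` asks
// the compiler not to merge the helper back into `eval`.
#[inline(never)]
fn eval_neg(inner: &Expr) -> f64 {
    -eval(inner)
}

#[inline(never)]
fn eval_add(lhs: &Expr, rhs: &Expr) -> f64 {
    eval(lhs) + eval(rhs)
}
```

With real branches that each need many locals, only the frame of the branch actually taken is on the stack at any recursion level.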
> Based on a profile, it looked like the Go version was faster because GC happened on another thread while Rust had to run destructors on the same thread.
Have you tried using jemalloc? It can help a lot for situations like this.
> Have you tried using jemalloc? It can help a lot for situations like this.
Huh, I thought Rust already used jemalloc by default. I looked it up and it looks like it was removed relatively recently. When I was doing this experiment, I was using a version of Rust that included jemalloc by default so that 10% number already uses jemalloc. I remember this because I also thought of trying to speed up the allocator.
Like I said above, I profiled both Go and Rust and the 10% slowdown with Rust appeared to be running destructors for the AST. I think the appropriate solution to this would be some form of arena allocator instead of changing the system allocator. But that gets even more complicated with lifetimes and stuff.
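For reference, one form of arena that sidesteps lifetimes is an index-based one, where nodes live in a single `Vec` and refer to each other by index. This is only a sketch of the general idea with made-up names (`Arena`, `NodeId`, `Node`); it is not what either version of the parser did.

```rust
// Hypothetical index-based arena: nodes point at each other by index, so no
// lifetimes appear in the node type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NodeId(u32);

enum Node {
    Number(f64),
    Add(NodeId, NodeId),
}

#[derive(Default)]
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    // Appends a node and returns its index as a handle.
    fn alloc(&mut self, node: Node) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(node);
        id
    }

    // Evaluates the expression rooted at `id` by following indices.
    fn eval(&self, id: NodeId) -> f64 {
        match &self.nodes[id.0 as usize] {
            Node::Number(n) => *n,
            Node::Add(a, b) => self.eval(*a) + self.eval(*b),
        }
    }
}
```

Because the nodes here own no heap memory themselves, freeing the whole tree amounts to freeing one `Vec` buffer rather than walking every node.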
> Can you work around this by using separate functions for the branches?
Yeah, I could have tried restructuring my code to try to avoid compiler issues. But this would have been even more time spent working around issues with Rust. Go was better than Rust by pretty much every metric that mattered for me, so I went with Go instead.
It's too bad because I was initially super excited about the promise of Rust. Being able to avoid the overhead of GC while keeping memory safety and performance is really appealing. But Rust turned out to be not a productive enough language for me.
> I think the appropriate solution to this would be some form of arena allocator instead of changing the system allocator. But that gets even more complicated with lifetimes and stuff.
If you're doing arena allocation in a compiler you might as well just leak all your allocations (which you could get with bumpalo with a 'static lifetime); then you won't have to deal with lifetimes at all.
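A rough sketch of the leaking approach, using `Box::leak` from the standard library instead of bumpalo so it stays self-contained. The `Expr` type and the `leak` helper are hypothetical names for illustration.

```rust
// Hypothetical AST whose children are plain `&'static` references, so no
// lifetime parameters are needed anywhere.
enum Expr {
    Number(f64),
    Add(&'static Expr, &'static Expr),
}

// Moves the value to the heap and deliberately never frees it.
fn leak<T: 'static>(value: T) -> &'static T {
    Box::leak(Box::new(value))
}

fn eval(expr: &Expr) -> f64 {
    match expr {
        Expr::Number(n) => *n,
        Expr::Add(lhs, rhs) => eval(lhs) + eval(rhs),
    }
}
```

Nothing allocated this way is ever reclaimed, so this only makes sense for a process that exits once the build is done.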
> Yeah, I could have tried restructuring my code to try to avoid compiler issues.
Well, my point is that it would be good for readability to restructure your code in that way even in Go. 500-line functions are hard to read.
> But this would have been even more time spent working around issues with Rust. Go was better than Rust by pretty much every metric that mattered for me, so I went with Go instead.
I find the opposite to be true, especially for compilers. It's hard for me to go back to a language without pattern matching and enums (much less generics, iterators, a package ecosystem, etc.). The gain of productivity from GC and compile times is not worth Go's loss in productivity in other areas for me. But reasonable people can disagree here.
> If you're doing arena allocation in a compiler you might as well just leak all your allocations (which you could get with bumpalo with a 'static lifetime); then you won't have to deal with lifetimes at all.
I considered this but it's a very limiting hack. Ideally esbuild could be run in watch mode to do incremental builds where only the changed files are rebuilt and most of the previous compilation is reused. Memory leaks aren't an acceptable workaround to memory allocation issues in that case. While I don't have a watch mode yet, all of esbuild was architected with incremental builds in mind and the fact that Go has a GC makes this very easy.
> 500-line functions are hard to read.
I totally recognize that this is completely subjective, but I've written a lot of compiler code and I actually find that co-locating related branches together is easier for me to work with than separating the contents of a branch far away from the branch itself, at least in the AST pattern-matching context.
> It's hard for me to go back to a language without pattern matching and enums
I also really like these features of Rust, and miss them when I'm using other languages without them. However, if you look at the way I've used Go in esbuild, interfaces and switch statements give you a way to implement enums and pattern matching that has been surprisingly ergonomic for me.
The biggest thing I miss in Go from Rust is actually the lack of immutability in the type system. To support incremental builds, each build must not mutate the data structures that live across builds. There's currently no way to have the Go compiler enforce this. I just have to be careful. In my case I think it's not enough of a problem to offset the other benefits of Go, but it's definitely not a trade-off everyone would be comfortable making.
> interfaces and switch statements give you a way to implement enums and pattern matching that has been surprisingly ergonomic for me.
With no exhaustiveness checking (also no destructuring, etc.)
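For anyone who hasn't seen what those two features buy you, a small sketch (the `Stmt` type is made up for illustration): the `match` has no catch-all arm, so adding a new variant to `Stmt` turns every such `match` into a compile error until it is handled, and the patterns pull fields out directly.

```rust
// Hypothetical statement type, for illustration only.
enum Stmt {
    Return(Option<i64>),
    Let { name: String, value: i64 },
    Block(Vec<Stmt>),
}

fn describe(stmt: &Stmt) -> String {
    // No `_` arm: the compiler rejects this match if any variant is missed.
    match stmt {
        // Nested patterns destructure the payload in place.
        Stmt::Return(None) => "return".to_string(),
        Stmt::Return(Some(v)) => format!("return {}", v),
        Stmt::Let { name, value } => format!("let {} = {}", name, value),
        Stmt::Block(body) => format!("block of {}", body.len()),
    }
}
```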
I should also note that you can change the default stack size in Rust to avoid overflows, though there should be a bug filed to get LLVM upstream stack coloring working. It's also possibly worth rerunning the benchmark again, as Rust has upgraded to newer versions of LLVM in the meantime.
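One way to change the stack size is to run the recursive work on a thread created with `std::thread::Builder::stack_size`. A minimal sketch, where `nesting_depth` is just a stand-in for a recursive-descent parser and the 64 MiB figure used below is an arbitrary assumption:

```rust
use std::thread;

// Stand-in for deeply recursive parsing: one stack frame per nesting level.
fn nesting_depth(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + nesting_depth(n - 1)
    }
}

// Runs the recursion on a worker thread with an explicitly sized stack.
fn run_with_stack(stack_bytes: usize, n: u64) -> u64 {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || nesting_depth(n))
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}
```

For example, `run_with_stack(64 * 1024 * 1024, 100_000)` gives the recursion a 64 MiB stack instead of the platform default.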
This really reeks to me of trying to shoehorn Rust into the solution rather than it being an organic fit to the problem space.
I don’t see the value in this level of thinking here. Why should any developer go through this much hassle when they already have a perfectly good solution? I’m not seeing anything in this discussion that actually highlights issues with using Go for this sort of thing.
Generally, Rust "should" be faster, because it spends a lot of time on optimizations that Go doesn't do. That's what you're paying for in compile times. If Go is faster on some CPU bound workload despite doing a lot less optimization, that's interesting. (I should note that this is not the norm.)
I'm not sure that this is a CPU bound workload. You're right that if this is CPU bound, LLVM's code generation should come out on top, even if only slightly, but that's not the case. Perhaps writing a JavaScript bundler is more of a memory bound or I/O bound task than a CPU bound task?
What optimizations is Go not doing that Rust is that makes Rust vastly superior to Go? (Or even superior at all)
I don’t think it’s that simple. I understand Go uses garbage collection but that doesn’t automatically mean Go doesn’t do compile time optimizations or is poor at CPU bound work.
While I understand GCs add overhead I don’t think that in and of itself means much here
I remember an article from Figma, or possibly even from you, where you mentioned that you had rewritten in Rust a tool originally written in Node.js. And I remember that back then (I think it was 2 years ago) you were very excited about the language. How does that compare to the current situation? What made you think that Go is a better and more enjoyable language than Rust? Is it the faster GC?
A recently published article from a guy working at Discord started a real flame war because he was arguing the opposite: that Rust is a better and faster language than Go. But he based his conclusion on a very old version of Go (1.9), where the GC was way slower than in current versions.
> The Rust version probably could be made to work at an equivalent speed with enough effort. But at a high-level, Go was much more enjoyable to work with. This is a side project and it has to be fun for me to work on it. The Rust version was actively un-fun for me, both because of all of the workarounds that got in the way and because of the extremely slow compile times. Obviously you can tell from the nature of this project that I value fast build times :)
Was the Rust parser written by hand or did you use one of the parser frameworks (e.g. nom or pest) out there? nom, for instance, goes to great lengths to be zero-copy which would probably be a big benefit here.
Both the Rust and Go parsers were written by hand. They are also very similar (basically the Go version was a direct port of the Rust version) so the performance should be very comparable.
I assume by zero-copy you mean that identifiers in the AST are slices of the input file instead of copies? I was also careful to do this in both the Go and Rust versions. It's somewhat complicated because some JavaScript identifiers can technically have escape sequences (e.g. "\u0061bc" is the identifier "abc"), which require dynamic memory allocation anyway. See "allocatedNames" in the current parser for how this is handled.
Note that strings aren't slices of the input file because JavaScript strings are UTF-16, not UTF-8, and can have unpaired surrogates. So I represent string contents as arrays of 16-bit integers instead of 8-bit slices (in both Go and Rust).
In the past I tried using WTF-8 encoding (https://simonsapin.github.io/wtf-8/) for string contents, since that can both represent slices of the input file while also handling unpaired surrogates, but I ended up removing it because it complicated certain optimizations. I think the main issue was having to reason through weird edge cases such as constant folding of string addition when two unpaired surrogates are joined together. I think it's still possible to do this but I'm not sure how much of a win it is.
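To illustrate the representation and the edge case, here is a small Rust sketch using only the standard library. The helper names are made up and this is not code from either parser; it just shows that a `Vec<u16>` can hold an unpaired surrogate that a UTF-8 `String` cannot, and that concatenating two unpaired halves can produce a well-formed pair.

```rust
// Converts source text to UTF-16 code units.
fn to_utf16(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

// Concatenates two JS-style strings, as constant folding of `a + b` would.
fn concat(a: &[u16], b: &[u16]) -> Vec<u16> {
    [a, b].concat()
}

// True if the code units form valid UTF-16 (no unpaired surrogates).
fn is_well_formed(units: &[u16]) -> bool {
    String::from_utf16(units).is_ok()
}
```

For instance, `[0xD83D]` and `[0xDE00]` each fail `is_well_formed` on their own, but their concatenation is the valid surrogate pair for U+1F600.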
> They are also very similar (basically the Go version was a direct port of the Rust version) so the performance should be very comparable.
Sure, but different approaches are going to be more optimal for different languages.
> I assume by zero-copy you mean that identifiers in the AST are slices of the input file instead of copies?
Yes. From the README:
> zero-copy: if a parser returns a subset of its input data, it will return a slice of that input, without copying
Geal also makes claims that nom is faster than hand-written C parsers.
> It's somewhat complicated because some JavaScript identifiers can technically have escape sequences (e.g. "\u0061bc" is the identifier "abc"), which require dynamic memory allocation anyway.
Nom comes with 'escaped' and 'escaped_transform' combinators. In theory it should be possible, with relative ease, to return a slice if there are no escape characters and an allocated string if expansion is required. Presumably you'd have to use a Cow<str> though.
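Something along these lines, written by hand without nom so it stands alone. `decode_identifier` is a hypothetical helper, and as a simplifying assumption it only handles the four-hex-digit `\uXXXX` form (not `\u{...}`), leaving anything it can't decode untouched.

```rust
use std::borrow::Cow;

// Returns a borrowed slice when there is nothing to decode and an owned
// String only when an escape sequence forces an allocation.
fn decode_identifier(raw: &str) -> Cow<'_, str> {
    if !raw.contains('\\') {
        return Cow::Borrowed(raw);
    }
    let mut out = String::with_capacity(raw.len());
    let mut rest = raw;
    while let Some(pos) = rest.find("\\u") {
        out.push_str(&rest[..pos]);
        // Try to read exactly four hex digits after the `\u`.
        let decoded = rest
            .get(pos + 2..pos + 6)
            .filter(|h| h.bytes().all(|b| b.is_ascii_hexdigit()))
            .and_then(|h| u32::from_str_radix(h, 16).ok())
            .and_then(char::from_u32);
        match decoded {
            Some(c) => {
                out.push(c);
                rest = &rest[pos + 6..];
            }
            None => {
                // Not a decodable escape: keep the text as written.
                out.push_str("\\u");
                rest = &rest[pos + 2..];
            }
        }
    }
    out.push_str(rest);
    Cow::Owned(out)
}
```

Callers can treat the result as a `&str` either way, so the common escape-free case stays zero-copy.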
> Note that strings aren't slices of the input file because JavaScript strings are UTF-16, not UTF-8, and can have unpaired surrogates. So I represent string contents as arrays of 16-bit integers instead of 8-bit slices (in both Go and Rust).
Of course it is. My opinion (which is worth what you've paid for it) is that I'd just go for UTF-8 support. I can't remember the last time I've seen UTF-16 in the wild (thankfully).
Performance-wise, the other thing that I'd keep in mind with Rust is that in debug mode string handling is painfully slow.
- Go is more fun if you are trying to be productive and push stuff out. The programming experience feels fluid and there’s not much agonizing over small details; the language is simple and you don’t need to think as much about things.
- Rust is more fun if you have a focus on perfection. It offers a lot of tools for abstraction and meta programming. These tools can be challenging at times. I do think even with NLL you will find yourself fighting the compiler, trying for example to resolve how you can avoid overlapping a borrow with a mutable borrow in some complicated bit of code, but you definitely get a lot of nice guarantees in exchange. I also do find it frustrating when something as simple as passing the result up can end up being really tricky.
There's truth in what you say, but it only describes the learning phase of Rust. Rust is quite challenging to learn and you spend a pretty long time in the uncomfortable place you describe. But one day, you end up internalizing the borrow checker's rules and you just don't think about it anymore and don't have any productivity penalty at all.
I don’t know how long it takes to fully get past stumbling through borrow checking and learning the intricacies of Result but it’s long enough to be a detriment. Obviously learning curve on its own is a downside, but also this complexity does not disappear when you understand it. It’s similar to, but less severe than, C++, in this regard.
I also think it depends heavily on the type of program you are writing and how. I’ve certainly hit cases where I still don’t know the optimal way of structuring things. In other cases people have managed to help me figure out what I need to do.
Concurrent memory safety is a huge plus, without a doubt, though there are applications where it's not enough and applications where it's too much. I think that puts Rust in a spot where it has use cases where it is clearly the best option but many use cases where it is overkill. As an example, Go shines particularly well for servers thanks to Goroutines and the fact that many servers have a shared-nothing architecture these days.
> I don’t know how long it takes to fully get past stumbling through borrow checking and learning the intricacies of Result but it’s long enough to be a detriment.
That's right, but no one expects to learn quantum physics in a few weeks either. Rust indeed takes way longer to learn than Python, JavaScript & others, but it's also much more powerful. And at the same level of power, both C and C++ are way harder to master than Rust (and arguably nobody really masters them in practice, since even the most brilliant programmers shoot themselves in the foot from time to time. Yes, even DJB [1]).
> Rust in a spot where it has use cases where it is clearly the best option but many use cases where it is overkill.
Indeed. It would make no sense to switch to Rust if Python is good enough for the task. I was just arguing that once you've learned Rust, you can do pretty much anything you want with it without friction, and I personally wouldn't bother writing anything in Python nowadays, because I can write Rust as fast and get the static typing guarantees and sane error handling that comes with it.
> Go shines particularly well for servers thanks to Goroutines and the fact that many servers have a shared-nothing architecture these days.
Having done quite a bit of Go, I don't agree with you. It's way too easy to accidentally share data between goroutines, and then cause data races or deadlocks. The day they introduced the race detector, we found 6 data races in our code (a few thousand loc) and a few others in our dependencies, and in the next year we found two not caught by the race detector (because it's a dynamic thing, it can't catch all races). More than generics (which are being introduced if they don't change their mind like they did for error handling), Go really needs something akin to Send and Sync in Rust, or maybe something like Pony's capabilities system, but Go definitely needs improvement on that front. Multithreading is a hard thing, and Go makes it too easy to use for people without the necessary experience (because Go is so easy, it attracts a lot of junior or self-taught devs), without a safety net, and this generally doesn't end well.
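As a sketch of what Send and Sync give you in practice (`parallel_count` is a made-up example, not code from any project mentioned here): a counter shared across threads has to be wrapped in something like `Arc<Mutex<_>>`. Swapping the `Arc` for an `Rc`, or dropping the `Mutex` and mutating a plain integer, is rejected at compile time instead of being left for a race detector to find at runtime.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Increments one shared counter from several threads.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    // Arc is Send + Sync when its contents are; Mutex makes mutation safe.
    let total = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}
```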
- A trade off is a trade off; no need to justify it. Increased cognitive load is a con.
- Sorry your Go experience was bad. I can only say my anecdotal experience was the opposite. Mostly for shared nothing architectures, but I also worked on an MQ-style software in Go and had a relatively good time. I think things that are well-suited to CSP concurrency fare pretty OK. Rust could’ve prevented things like accidental concurrent map accesses, but it still can’t guarantee you are implementing concurrency correctly on the application level (from perspective of say, coherency or atomicity.) So for many apps I’ve written, even somewhat complicated ones, I don’t feel like Rust would always be the best option. To me Rust makes most sense when you really can’t ever afford a memory error. Web browsers seem like an obvious winning case.
While Rust can be technically superior to Go for a lot of use cases, Go has been the most fun to program in for me for the last 2-3 years. Coming from mostly Python/JS and having maintained systems in Java and Ruby as well, I still feel happiest when writing Go code. I don't know why, but I think it's because I've never had to deal with a system in Go where I had to peel back layer after layer to find out how something worked, no matter who wrote it. All Go projects I've come across and contributed to have been extremely simple to read, understand and contribute to. _I think_ that is what makes me enjoy maintaining systems in Go so much.
Also not them, but I've worked quite a bit with both Golang and TypeScript.
I find TypeScript's interface flexibility to be pretty clutch when working with high-level code that deals with input made of complex data structures: I'm thinking configuration files, REST APIs, user/developer inputs. Having worked with many async paradigms, I also favor async/await for developing asynchronous business logic workflows. If you imagine your call graph for a certain workflow, and everything is written to use "async", you can imagine just drawing a circle around a portion of it and then easily "stamp out" more of those to occur in parallel. The way you can collect results and handle errors with the async/await paradigm is a bit nicer than working with channels and goroutines IMHO.
I like Golang for lower-level tooling and network services of course. It also has seamless support for parallel processing in process. The OP's project is something I would certainly look to Golang for.
Now on a tangent, C# has async/await and most of the speed but doesn't quite have the flexibility of TypeScript's interfaces and of course doesn't have the compile speed of Golang. I would honestly use C#/F# a lot more for stuff if I thought it would fly in my work environment. Would love to work on some open source projects in C# to get it more exposure :)
After spending the last couple days in Rust and then diving back into Java for 30mins, I know what you mean by un-fun. Rust is such a PITA. I really want to like it, but I think I might end up liking C++ more, when Java doesn't cut it...