
Case Study: Npm uses Rust for its CPU-bound bottlenecks [pdf] - yarapavan
https://www.rust-lang.org/static/pdfs/Rust-npm-Whitepaper.pdf
======
javagram
Title could be improved, as I was wondering if the command line tool npm had
itself been rewritten.

It’s actually one of npm’s web services that was rewritten.

FTA: “Java was excluded from consideration because of the requirement of
deploying the JVM and associated libraries along with any program to their
production servers. This was an amount of operational complexity and resource
overhead that was as undesirable as the unsafety of C or C++.”

I find this comparison a bit odd. Even if not using containers, the JVM isn’t
hard to deploy, as distro package managers include it. Unless a team is
managing servers manually rather than with an automated tool, this doesn’t
seem that complex. Am I missing something here?

~~~
feikname
JVM tuning (especially the GC and memory allocation scaling) can be a huge
PITA.

~~~
simion314
But having options is always good. Isn't it the same with compiled languages?
You keep the default options, but if you want extra performance you try
different compiler flags, or a different compiler if the language has more
than one.

~~~
weberc2
Honestly I’d rather have good defaults than lots of options. If one tool meets
requirements out of the box, then it’s preferable to a tool that requires a
lot of tooling.

~~~
simion314
But if you do not have options, then what is a "good default"? You only have a
default option if you have more than one option. So you prefer software with
no options?

Can you give examples of languages (compilers or VMs) with bad defaults, in
your opinion?

~~~
acdha
Java by default has a hard memory limit which when hit causes services to hang
in a semi-responsive state rather than exiting cleanly, requiring every user
to hand-tune limits and carefully set up health checks and monitoring. Two
decades in, they added a non-standard option to instead exit.

If they had instead made the default to act like almost everything else it
would have worked with standard process monitors and limits with no effort
required and a substantial fraction of the downtime I’ve seen for Java
applications would never have happened.

~~~
simion314
Isn't this default better for the average Joe who runs a Java desktop app, and
shouldn't developers who deploy apps know how to change these defaults?

~~~
acdha
How is it better for something to stop working but not exit? It’s exceedingly
uncommon for anyone to have correct handling for OOM exceptions so the most
likely effect is that stuff partially works - servers accept requests but
never respond, apps have menus/buttons which don’t work, etc.

Similarly, if developers who deployed apps knew enough to avoid this, we’d
know by now because it wouldn’t happen so frequently. It does highlight who
failed to do real health-checks (e.g. years ago most Hadoop services needed
huge amounts of RAM to start without crashing but they’d be listed as okay)
but it’s the kind of thing sysadmins have been cleaning up for decades.

~~~
simion314
You are probably right in this case. My initial comment was about
performance-tuning configuration; the comment I replied to sounded like "the
JVM should have had X set to my preferred value" or "I prefer software that
has no options to confuse me with", aka the GNOME mentality.

------
steveklabnik
Hey folks! This is part of our whitepaper series. This means that the audience
is CTOs of larger organizations, and so the tone and content are geared for
that, more than HN. Please keep that in mind!

~~~
breatheoften
Are CTOs of largish organizations still not part of the Hacker News audience
...? That seems a bit of a damning statement to make about the population of
CTOs ...

~~~
tejasmanohar
I'm sure some are on HN, but it's definitely not "the place" for them, and
many have probably never heard of it

~~~
shaklee3
I would be shocked if any CTOs of large organizations browse HN. Large orgs
put people in those positions that are more about theory and future thinking
over current knowledge.

------
the_duke
A better title would be the subtitle of the article: "The npm Registry uses
Rust for its CPU-bound bottlenecks".

Note that only one service (authentication) was rewritten from node to Rust.

~~~
steveklabnik
That’s what I titled it when I submitted it after we first published this.

There’s also a second service we know about in Rust, and that’s the one that
renders package README pages.

------
BlackFly
> Java was excluded from consideration because of the requirement of deploying
> the JVM and associated libraries along with any program to their production
> servers. This was an amount of operational complexity and resource overhead
> that was as undesirable as the unsafety of C or C++.

Just so everyone here is aware, this is by now an outdated complaint against
Java.

[https://vertx.io/blog/eclipse-vert-x-goes-native/](https://vertx.io/blog/eclipse-vert-x-goes-native/)

I'm choosing vertx as an example since it already competes with Rust- and
C-based applications over at
[https://www.techempower.com/benchmarks](https://www.techempower.com/benchmarks)
but you ought to be able to compile general programs ahead of time.

~~~
ricardobeat
Are there any companies using these native images in production?

~~~
victor106
Shared this in this same thread for another comment.

The videos below show a GraalVM-based app loading the Spring framework and the
Flowable process engine, and making a REST call to an external service, all in
13 ms!!!

Check out

[https://youtu.be/9BQiDmvOnZw](https://youtu.be/9BQiDmvOnZw)
[https://youtu.be/yLvnkkRys2Y](https://youtu.be/yLvnkkRys2Y)

------
faitswulff
I like how this whitepaper sidesteps the "but the rewrite is the real
improvement!" by also rewriting the service in Node.js along with Go and Rust.

------
adamnemecek
I’ve been playing around with rust since it came out but only recently did I
decide to use it for a part of a project. It’s a very pleasant language. I
didn’t fight the borrow checker much (maybe due to prior experience).

The language is nuts. It’s true what they say: cargo is even better than the
language. It’s just so easy to add packages to your project or to split your
project into packages.

Cargo is an amazing investment, as it will help people write non-duplicated
code. Like, how many string implementations are there across C code bases?
Each C project has so much code that’s the most boring, repetitive shit you
can imagine. Cargo lets you concentrate on writing your code without hassle.

I have experience with a lot of package managers, gems, go, cocoapods, sbt,
cabal, pip, spm, npm, you name it but cargo is on a different plane of
existence. Cargo makes the whole internet your standard library.

I also like cargo workspaces. Modern development needs a workflow where you
pull in a dependency, and work on it in tandem with your code. Achieving a
good workflow for this is surprisingly hard.

~~~
c-smile
> write non duplicated code ... Like how many string implementations are there
> across c code bases.

In one of my projects I needed a foo::string that can share data with
foo::variant without copying the data each time. So foo::string was
implemented as a COW string: a smart pointer to foo::string_data. std::string
simply does not work under such requirements.

So I am not sure I understand how cargo will help in this case. Either the
Cargo source repository will contain implementations for every possible
permutation of string requirements, or people will just use the standard
std::string.

The only feature that I need in C/C++ is a unified way to include libraries
in code:

    
    
    #define PNG_APNG_REQUIRED
    #include source "libs/png/png-amalgamated.c"
    

I am perfectly fine with downloading png.tar.gz manually and putting it
where I need it.

In any case, the _decision to include a library_ in a product requires quite a
lot of reasoning and architectural investigation.

For typical web front-end projects NPM or Cargo probably make sense. But for,
say, NodeJs or Cargo itself they should not use any such automatic downloader.

~~~
adamnemecek
Your problem isn't hard to solve in Rust. Reasoning about allocations is a
foundational idea of Rust and is solved using lifetimes.

Basically, lifetimes are allocators as a first-class language construct. You
can reason about what happens if values are stack or heap allocated and
specialize your code based on that. Lifetimes are definitely an advanced
feature and I think you can do some crazy optimizations with them. But your
use case is not hard to implement. If you show me the C++, I'll show you how
to achieve the same semantics.
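
As a rough illustration of the copy-on-write sharing described above, here is
a minimal sketch using only the Rust standard library. `CowString` and
`Variant` are hypothetical stand-ins for foo::string and foo::variant (the
original C++ was not posted), and it leans on reference counting with
`Rc::make_mut` rather than anything lifetime-specific:

```rust
use std::rc::Rc;

/// Hypothetical stand-in for foo::variant: a value that may hold string data.
#[derive(Clone, Debug)]
enum Variant {
    Int(i64),
    Str(Rc<String>),
}

/// Hypothetical stand-in for foo::string: a copy-on-write string whose
/// buffer is reference counted, so handing it to a Variant copies no bytes.
#[derive(Clone, Debug)]
struct CowString {
    data: Rc<String>,
}

impl CowString {
    fn new(s: &str) -> Self {
        CowString { data: Rc::new(s.to_owned()) }
    }

    /// Share the underlying buffer with a Variant (only a refcount bump).
    fn to_variant(&self) -> Variant {
        Variant::Str(Rc::clone(&self.data))
    }

    /// Append text; the buffer is cloned first only if it is currently shared.
    fn push_str(&mut self, extra: &str) {
        Rc::make_mut(&mut self.data).push_str(extra);
    }

    fn as_str(&self) -> &str {
        self.data.as_str()
    }
}

fn main() {
    let mut s = CowString::new("hello");
    let v = s.to_variant();
    let n = Variant::Int(7);

    // Before any write, the string and the variant share one allocation.
    if let Variant::Str(shared) = &v {
        assert!(Rc::ptr_eq(shared, &s.data));
    }

    // Writing to the string copies the buffer; the variant keeps the old text.
    s.push_str(" world");
    println!("string: {}", s.as_str());

    for value in [v, n] {
        match value {
            Variant::Int(i) => println!("variant int: {i}"),
            Variant::Str(t) => println!("variant str: {t}"),
        }
    }
}
```

`Rc` is single-threaded; if the data had to cross threads, `Arc` has the same
`make_mut` method.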

Cargo helps because Rust lets you build cleaner abstractions and abstractions
that compose nicer. So integration is super easy.

> For typical web front-end projects NPM or Cargo probably make sense.

I might misunderstand your sentence, but neither Rust nor cargo are for
front-end?

~~~
steveklabnik
npm is used a lot for front end code, incidentally. Node is an environment for
a lot of build tooling.

------
ilovecaching
I'd like to point out that even though it took them a week compared to an
hour, a week is actually incredibly fast to learn Rust and build something
useful with it. Learning C++ can take more than a month of training, and it's
only because most people learn it over an entire semester at school that they
learn it at all. This is also the time it takes to learn Rust, and presumably
now that they've written one program, writing the next program will take them
a fraction of a week.

The amount of time it takes to learn something is often indicative of its
power. Anyone who has learned a foreign language or a musical instrument knows
that the time spent investing up front pays huge dividends down the road when
you have the skills and tools to richly express yourself. The reason that Go
takes two days to learn is because it artificially limits the amount of up
front investment at the cost of limiting expressiveness over the lifetime of
your use of the language.

~~~
koyote
> I'd like to point out that even though it took them a week compared to an
> hour, a week is actually incredibly fast to learn Rust and build something
> useful with it.

I would argue that something that takes even an experienced engineer a mere
hour to write is very small and has little complexity (especially if unit
tests are counted towards the hour it took them to re-write it). This means
it's difficult to gauge how much Rust was 'learned' during that week.

------
spricket
While I'm a big fan of Rust, excluding Java because "JVM" is kinda laughable.
It's not hard to run at all. You package everything into a jar then run a
single command. As easy to get working as a JS backend.

If their complaints are about GC tuning, is it not the same thing as tuning
the GC in JS/Go? Java still has arguably the most mature GC of any language.

------
StreamBright
No surprise here, good language design pays off for real world applications
especially at scale like the NPM infra.

------
nikeee
Does anyone know if they considered .NET Core? It has dependency management, can
be deployed as a self-contained binary and is memory safe. Seems to match the
requirements for me.

~~~
steveklabnik
[https://news.ycombinator.com/item?id=19295166](https://news.ycombinator.com/item?id=19295166)

~~~
colejohnson66
> This stuff happened before, or at least around the time, that .NET Core was
> released. So it either didn’t exist or was an extremely new option, at
> least.

------
nothrabannosir
This entire article is a pretty damning report on JavaScript in general, but
this sentence takes the cake (emphasis mine):

> The process of deploying the new Rust service was straight-forward, and soon
> they were able to forget about the Rust service because it caused so few
> operational issues. _At npm, the usual experience of deploying a JavaScript
> service to production was that the service would need extensive monitoring
> for errors and excessive resource usage necessitating debugging and
> restarts._

Is this satire?

~~~
twiss
They also state that writing the service in Node took them an hour, two days
for Go, and a week for Rust. Even taking into account their unfamiliarity with
the language, it's probably fair to say that when switching to Rust, you'll
usually spend more time writing and less time debugging. Whether that
trade-off is worth it depends on the project.

~~~
pjc50
> writing the service in Node took them an hour

I'm really skeptical of this unless it's just a wrapper for a thing that
happens to already exist. It would be interesting to have comparative LOC
numbers.

> At npm, the usual experience of deploying a JavaScript service to production
> was that the service would need extensive monitoring for errors and
> excessive resource usage necessitating debugging and restarts

So, they _deployed_ it after an hour, but it wasn't _finished_ until they
stopped having to debug it in production?

~~~
krferriter
Fail early, fail often

deploy anyways

------
cryptica
The idea that some programming languages can solve scalability issues is a
myth. A language cannot solve scalability issues; all they can do is push the
needle a tiny little bit further in terms of performance but this is
completely meaningless.

Scalability is an architectural concern which cannot be ignored by system
developers. This is because scalability is not about speed or performance,
it's all about figuring out which workloads can be split up and executed in
parallel; in order to do this, you need to understand the real-world problem
which the software is trying to solve; this is not something that you can
delegate to a compiler.

The best that a language can offer in terms of scalability is to make it
easier to reason about parallel workloads and make the difference between
serial and parallel workloads as explicit as possible. Whenever a language
tries to hide the complexity of parallelization behind thread pools, it's
not solving any real scalability issue; it's just delaying it some more.
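
To make the "explicit split" point concrete, here is a minimal Rust sketch,
assuming the workload is a sum over independent slices (an illustrative
example, not something from the whitepaper). `sum_serial` and `sum_parallel`
are hypothetical names; the point is that the programmer, not the compiler,
decides where the work is divided:

```rust
use std::thread;

/// Serial baseline: one pass over the whole slice.
fn sum_serial(data: &[u64]) -> u64 {
    data.iter().sum()
}

/// Parallel version: the split into independent chunks is written out
/// explicitly, and each chunk is summed on its own scoped thread.
fn sum_parallel(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division, with a floor of 1 so `chunks` never receives 0.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();

        // Joining is the only serial step: combine the partial sums.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1_000).collect();
    let serial = sum_serial(&data);
    let parallel = sum_parallel(&data, 4);
    assert_eq!(serial, parallel);
    println!("serial = {serial}, parallel = {parallel}");
}
```

This only works because addition lets the chunks be processed independently;
knowing whether a real workload has that property is the architectural part.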

~~~
mavelikara
> The idea that some programming languages can solve scalability issues is a
> myth.

True.

> A language cannot solve scalability issues; all they can do is push the
> needle a tiny little bit further in terms of performance but this is
> completely meaningless.

> Scalability is an architectural concern which cannot be ignored by system
> developers.

A language can prevent or delay such architectural concerns from being
addressed by not offering sufficient capabilities.

------
megous
This is no study. It reads more like a conclusion to one.

------
makkesk8
Interested to know why they didn't consider .net core.

~~~
steveklabnik
This stuff happened before, or at least around the time, that .NET Core was
released. So it either didn’t exist or was an extremely new option, at least.

~~~
makkesk8
According to the article they have been using Rust in production for 1½ years,
and .net core 2 was released in 2017, so it definitely existed. Whether .net
core was mature enough for their liking is, however, another matter.

~~~
steveklabnik
.net core 2 was released in August 2017. That’s basically one and a half years
ago. Depending on how fuzzy the one and a half year time is, it may not have
been released. Like it may not have literally been 18 months exactly. That’s
all I’m saying.

------
twiss
For all this talk about performance, are there any benchmarks anywhere? Also,
is there any blog post or anything by npm itself that this is sourced from?

~~~
steveklabnik
It was sourced by talking to npm directly; they signed off on the final text as
well.

They have spoken about this publicly before; it was in their newsletter when
they first deployed it, and there’s been a couple of conference talks.

------
truth_seeker
I have my doubts and confusion about this problem statement

> Most of the operations npm performs are network-bound and JavaScript is able
> to underpin an implementation that meets the performance goals. However,
> looking at the authorization service that determines whether a user is
> allowed to, say, publish a particular package, they saw a CPU-bound task that
> was projected to become a performance bottleneck.

Oh, really???

So it's essentially the authorization service, and I doubt the security
algorithm computations are the main cause.

What I don't understand here is why it isn't possible to write lower-level JS
or asm code to craft well-optimized code which V8 can totally nail down to the
minimum CPU instructions required?

~~~
spricket
My guess is that this uses public key cryptography. Generating a signature is
rather expensive. We found this out the hard way a few years ago, when we
tried to verify OAuth tokens using RSA rather than HMAC.

The server was hammered at maybe 500 signatures a second, greatly slowing down
token generation.

I'm guessing they dropped to Rust so they could use native C libraries for
signature generation.

------
erichdongubler
Previous submission here:
[https://news.ycombinator.com/item?id=19255639](https://news.ycombinator.com/item?id=19255639)

------
tekknik
The statement about Go using global dependencies being the standard is just
someone’s opinion on the Go team. I’ve written Go for a few years now and
never once shared a dep across projects. Create a new folder, set your GOPATH
(use direnv) and pull your deps. I very much doubt they’d be more productive
in Rust than in Go had they actually given Go a chance.

------
ramshanker
OK. Downloading Rust. :)

------
z3t4
You shouldn't be afraid to make independent micro services. Using a different
language makes it more likely that it can be deleted, and more easily
rewritten. If you are forced into a monolith, it usually means the
architecture has a complexity problem.

------
arhyth
can someone please give this outsider-noob an eli5 explanation of why golang
does not adopt the same policy/attitude/implementation towards package
management?

~~~
Corrado
Go is being developed by Google and as such is designed to meet their needs.
Google's view of package management is to bring every package in-house and
manage it themselves. So, if a Go program needs package "A" then Google will
directly import that package into their workflow.

This is actually one of the reasons that I don't think Go is a very good
language for most people/organizations. It was conceived with a specific set
of guidelines that Google needed; easy to learn, performant, etc. Go is also
designed to be used by teams of thousands, so it's much easier to adopt misc
packages into the fold and maintain them than it is to manage links to outside
requirements.

------
BooneJS
As the yearly pace of processor improvements continues to dwindle, more and
more problems will become CPU bound.

------
JamesClear99
It's heartening to see the community being mentioned as a positive factor.

> npm called out the Rust community as a positive factor in the decision-
> making process. Particular aspects they find valuable are the Rust
> community’s inclusivity, friendliness, and solid processes for making
> difficult technical decisions. These aspects made learning Rust and
> developing the Rust solution easier, and assured them that the language will
> continue to improve in a healthy, sustainable fashion.

------
Tomte
Editorialized title. Don't do that.

~~~
sctb
We've updated the title from “Rust at NPM”. It's not so obvious how to extract
a decent 80-character title from this one so we can cut the OP some slack.

------
hartator
Didn’t know NPM had a speed issue.

