
Malicious code in the purescript NPM installer - braythwayt
https://harry.garrood.me/blog/malicious-code-in-purescript-npm-installer/
======
hombre_fatal
Another reminder of how annoying it is for a package system to have
unqualified package names.

Having to ask someone to gift a `purescript` package shouldn't even be a
thing. It should've been `@shinnn/purescript`, and the compiler developers
could just create their own `@whatever/purescript`.

This is something Elm and many others got right:
[https://package.elm-lang.org/](https://package.elm-lang.org/) It's just
infinitely, obviously better.

You see all sorts of problems because of this, like people "giving packages
away" when they quit. Or buying package names. Or coming up with annoying name
hacks because the obvious, best name is simply taken. Or people
thinking/guessing that `npm install mysql` is the correct/best/canonical
package because it's the simplest name, and anyone who publishes a better
library has to name it mysql2 or better-mysql, etc. These just shouldn't even
be things.

~~~
munificent
_> It's just infinitely, obviously better._

Whenever a large number of skilled people do something for which an
alternative is "infinitely, obviously better", there's a good chance that
there is more going on than you know.

RubyGems used to be namespaced this way and moved away from it. They didn't do
so lightly.

The problem is that ownership, and even _names_ of owners change _all the
time_. In the very very large majority of cases, this change of ownership is
an implementation detail that doesn't need to impact package consumers. If you
enshrine the owner's name in the package, it means any change of ownership is
effectively a breaking change to the package. When you have very large
transitive dependency graphs, the result is constant, pointless churn.

~~~
hardwaresofton
It seems like a reasonable fix to this is to prevent name-specific ownership
(changing it to a group instead) -- this has the benefit that package
maintainers who plan to maintain their packages forever can keep the same
names, and those that don't can essentially fork their project, stop fixing
the older version (@<maintainer>/<project>), force all changes to go to a
new one (@<group>/<project>), and hand off ownership as necessary.

This doesn't break old consumers and it allows for a pretty graceful migration
path (if you want new updates, change your version) -- it can even be helped
along by marking the package as deprecated or the repo as archived and whatnot.

~~~
akerl_
As a thought experiment:

Assume Rubygems chose to operate in that way, where all gems must be owned by
a group rather than an individual user.

Then, assume that the most flexible option is for each gem to be owned by a
_unique_ group: that way even if two gems are maintained by the same users
right now, they use two distinct groups in case that ownership changes in the
future.

We might as well just name the “group” the same as the gem name, since only
one gem is managed by each group. So now the “purescript” group maintains
“purescript”, the “pry” group maintains “pry”, etc.

As syntactic sugar for users, since the group name and gem name will always
match, why make them type both? Let’s have all the commands support just
referencing the gem name. If somebody wants to fork a gem and release it,
their group and gem get a new name.

I think there’s a pretty compelling case that package managers should support
group ACLing on publishing (giving multiple humans the first-class right to
publish using individual creds to a group namespace, with the ability to
add/remove users from the group over time). But once you’ve done that, the
distinction between explicit group-name-in-package-path and changing-name-to-
fork (so the difference between fork-group/orig-name and orig-name_fork-group)
seems to shrink.

~~~
hardwaresofton
> Assume Rubygems chose to operate in that way, where all gems must be owned
> by a group rather than an individual user.

Sorry, this wasn't my premise -- I meant to have this _as an option_. As in
<user>/<project> or <group>/<project>.

> Then, assume that the most flexible option is for each gem to be owned by a
> unique group: that way even if two gems are maintained by the same users
> right now, they use two distinct groups in case that ownership changes in
> the future.

> We might as well just name the “group” the same as the gem name, since only
> one gem is managed by each group. So now the “purescript” group maintains
> “purescript”, the “pry” group maintains “pry”, etc.

I don't agree -- even if it's always group-a/purescript I think the
distinction is still important, because in this case previous-
group/purescript _still exists_, but is frozen/archived. Here's how I'm
understanding the scenario you laid out:

1. group-a/purescript is created

2. purescript changes ownership; group-b is going to be publishing it going
forward

3. group-b/purescript is created

4. group-a/purescript freezes/archives/deprecates itself

5. group-b/purescript is actively developed

The fact that "group-b" is the "right" purescript is arbitrary/subjective to
some degree.

> I think there’s a pretty compelling case that package managers should
> support group ACLing on publishing (giving multiple humans the first-class
> right to publish using individual creds to a group namespace, with the
> ability to add/remove users from the group over time). But once you’ve done
> that, the distinction between explicit group-name-in-package-path and
> changing-name-to-fork (so the difference between fork-group/orig-name and
> orig-name_fork-group) seems to shrink.

I think these two issues are a bit separate. Letting people dynamically change
who owns/can publish to a repository is _one way_ to solve this problem, but I
think it's more complex than the fork-and-move approach.

IMO if some user wants to give up/transfer their repo, they:

1. find someone else to take over if they want

2. freeze/archive/whatever their repo

3. let the person fork & continue their work

An ownership change _should_ be opt in, unless it was known @ package creation
time that ownership would be a shared/rotated/changing/nebulous thing (which
would be demonstrated by a group owning the package from the beginning).

~~~
akerl_
I wasn’t implying you claimed that all gems needed a group, I was proposing it
as part of the thought experiment. My apologies if that was unclear.

To your list of examples: my point parallels your own, I think. I’m saying
that given the “right” version is arbitrary and subjective, the difference
between “group-b/purescript” and “purescript-group-b” is effectively nil. More
concretely: if namespacing existed, you could fork “group-a/purescript” to
“group-b/purescript”, but if namespacing didn’t, you could fork “purescript”
to “purescript-group-b”. In either case, dependent projects need to update
where they source their dependencies from.

Namespacing, in my experience, tends to make the forking process slightly
“cleaner”, because you avoid having a potentially non-“right” “original” (for
example, “purescript” tends to look more legitimate than “purescript-
group-b”). But some comments in this thread seem to paint namespacing as a
hard requirement, or claim that package managers without namespacing are
missing a core, mandatory feature. The case I’m presenting is that this isn’t
the case: namespacing is a useful feature for several workflows, but adding
namespacing doesn’t fundamentally alter the issue.

~~~
hardwaresofton
> I wasn’t implying you claimed that all gems needed a group, I was proposing
> it as part of the thought experiment. My apologies if that was unclear.

My apologies, I certainly misread your comment.

> To your list of examples: my point parallels your own, I think. I’m saying
> that given the “right” version is arbitrary and subjective, the difference
> between “group-b/purescript” and “purescript-group-b” is effectively nil.
> More concretely: if namespacing existed, you could fork “group-a/purescript”
> to “group-b/purescript”, but if namespacing didn’t, you could fork
> “purescript” to “purescript-group-b”. In either case, dependent projects
> need to update where they source their dependencies from.

I agree -- the effects are definitely similar and almost equivalent. However,
does requiring a group/author change things _at all_? It seems like it could
introduce an abstraction layer.

> Namespacing, in my experience, tends to make the forking process slightly
> “cleaner”, because you avoid having a potentially non-“right” “original”
> (for example, “purescript” tends to look more legitimate than “purescript-
> group-b”). But some comments in this thread seem to paint namespacing as a
> hard requirement, or claim that package managers without namespacing are
> missing a core, mandatory feature. The case I’m presenting is that this
> isn’t the case: namespacing is a useful feature for several workflows, but
> adding namespacing doesn’t fundamentally alter the issue.

I'm on the fence -- I'm not sure if this is a good counter case, but what
about the layer of abstraction introduced by the implied/required existence of
<group>? You could write code that imports "purescript", but then resolve it
later (as some others mentioned, via go.mod or some other modules file that
clarifies mappings) to determine _which_ "purescript" that is. Alternatively,
you could solve this by "alias"ing "project/purescript" to "purescript" (with
some similar extra configuration that says "purescript" -> "project/purescript").
I'm not sure whether either is better -- so basically whether this indirection
should be a "module resolution feature" or a "module aliasing feature" -- and
whether there's any value in forcing one (requiring the existence of <group>
would almost certainly force the module resolution approach, but would also
break builds the second similarly named packages were published...) or whether
they really are just the same.
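For what it's worth, npm already supports the aliasing direction via the
`npm:` protocol in a dependency specifier, so a consumer can bind the bare
name "purescript" to a scoped package (the scope name below is hypothetical):

```json
{
  "dependencies": {
    "purescript": "npm:@some-group/purescript@^0.13.0"
  }
}
```

With this, `require("purescript")` in the consumer's code resolves to
`@some-group/purescript` -- roughly the "module aliasing feature" option.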

I also found the page on this by the rust team pretty convincing[0].

[0]: [https://internals.rust-lang.org/t/crates-io-package-policies/1041](https://internals.rust-lang.org/t/crates-io-package-policies/1041)

~~~
akerl_
For clarity, I think the comment referencing “go.mod” that you’re describing,
at least in this thread, is from me :D

I think I agree that the core feature that impacts this issue is what go.mod
solves, and what you’re describing: it should be easy and language-supported
to sub in one fork of a dependency for another fork, so that users can flip
between “group-a”’s purescript and “group-b”’s purescript, regardless of how
the namespacing works on the module registry (notably, golang dispenses with a
registry entirely: there’s no central system, except insofar as github is used
for lots of people’s packages).
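For reference, the go.mod mechanism for this is the `replace` directive, which
swaps one module path for another across the whole build (the module paths
below are hypothetical):

```
module example.com/myapp

require github.com/group-a/sometool v1.2.0

// Build against group-b's fork without touching any
// import statements in the source tree.
replace github.com/group-a/sometool => github.com/group-b/sometool v1.3.0
```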

------
svnpenn
The real issue is the Balkanization of JavaScript programs. The `rate-map`
package is essentially one line of code:

    start + val * (end - start);


[https://github.com/shinnn/rate-map/blob/90c234c9/index.mjs#L38](https://github.com/shinnn/rate-map/blob/90c234c9/index.mjs#L38)
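For context, the package's whole job is to map a rate in [0, 1] linearly onto
a range; a hypothetical reimplementation (not the package's actual API) fits
in a few lines:

```javascript
// Hypothetical stand-in for rate-map: linearly map a rate in
// [0, 1] onto the interval [start, end].
function rateMap(rate, start, end) {
  if (typeof rate !== 'number' || Number.isNaN(rate) || rate < 0 || rate > 1) {
    throw new RangeError('rate must be a number between 0 and 1');
  }
  return start + rate * (end - start);
}

console.log(rateMap(0.5, 10, 20)); // → 15
```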

~~~
SCLeo
I honestly don't understand why people use packages like this. If I need this
functionality, I will simply write my own. Plus, I would never be able to find
this specific package. I guess PureScript uses this because its author is also
the author of rate-map.

~~~
ricardobeat
I’ll give you one: it’s code already written and tested by > 1 person, edge
cases already figured out. Saves you time. The gains are small but quickly add
up.

This is why lately I’ve been a fan of very extensive standard libraries (like
Crystal has) - it's like having a huge repository, but vetted by the same team
and without any of the package management drawbacks.

~~~
spenczar5
> The gains are small but quickly add up.

There are costs to these micro-libraries that outweigh the gains. This code is
trivial; there aren’t really edge cases to be worked out.

~~~
jasonhansel
Also: the way the library handles those edge cases isn't necessarily the way
you want. Case in point: rate-map throws exceptions in situations where you
might expect it to fail more gracefully.
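If, say, you would rather clamp out-of-range input than throw, a hand-rolled
version lets you pick that behavior (a sketch; clamping here is a design
choice, not what rate-map actually does):

```javascript
// Map rate onto [start, end], clamping rate into [0, 1]
// instead of throwing on out-of-range input.
function rateMapClamped(rate, start, end) {
  const r = Math.min(1, Math.max(0, rate)); // clamp instead of throw
  return start + r * (end - start);
}

console.log(rateMapClamped(1.5, 0, 100)); // → 100, rather than an exception
```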

------
Benjammer
I always think of this article when these things come up:

[https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5](https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5)

------
z3t4
Before "tree shaking" I stored all npm modules in SCM and reviewed all updates,
as I had to commit after "npm update". I also put a ton of files in .ignore, as
90% of the files in some packages are not required. I also used to include npm
modules in the distribution/deployment. So my request to npm is to add an option
in the main package.json to disable tree shaking.

~~~
robocat
I wish there was a way to "bless" packages when they were reviewed.

I want a network of trust, such that a Google-reviewed package is worth 10
points, a package fuzzed by foobar is worth 2 points, something skimmed by a
dependent user is worth 1 point, etc.

I can then choose a compromise between highly rated/reviewed dependencies and
functionality/risk/cost-to-review.

My own blessing of a package I have reviewed might become a very small signal
in a web of trust.

~~~
yaa_minu
You may find npms.io[1] useful. For each package, they provide a score on
maintainability, popularity and quality.

[1] [https://npms.io](https://npms.io)

~~~
bauerd
npms.io rates the `rate-map` package with 99% quality, so I'm not sure how
helpful that is.

------
nightkoder
I have been installing purescript using Nix from
[https://github.com/justinwoo/easy-purescript-nix](https://github.com/justinwoo/easy-purescript-nix)
for a while now. It works quite well and I get to avoid npm.

~~~
Shoue
Purescript (purs) is in unstable now as well

------
a-dub
I wouldn't use the words "malicious" or "exploit" wrt this... It's more like,
I dunno, trolling on planet JavaScript? I feel like there should be a big
Twitter fight about it...

~~~
nameiscubanpete
That was my first thought. But then I realized some guy basically broke
something so his stuff would work and someone else's wouldn't. He didn't
destroy files, but that was malicious as hell.

~~~
a-dub
mean spirited and dramatic as hell, yes... also, a bad place where real
"malicious" things could be done. but "malicious" has a specific meaning and
this didn't affect users.

more like dramaticious if you ask me... but also uncovers actual dangerous
weaknesses in the npm delivery pipeline...

~~~
a-dub
it's kinda like a cat-fight in the one hundred acre javascript wood... pretty
harmless, nobody's shit got pwned, but holy shit, kind of a vulnerable vector
they found...

------
runeks
The word “exploit” is used several times, but none of the code seems to exploit
anything. Also, “malicious code” usually means something different from code
that intentionally makes the program crash during the installation process.

~~~
rjmunro
I feel like this is only part of a wider attack - by causing this not to
download, users are pushed toward some other action which opens them up to the
real attack.

~~~
jeltz
Nah, it was probably a childish former maintainer who decided to sabotage it.

------
bsamuels
I wonder how bad this credential-stuffing-of-package-authors problem will get
before npm/other package managers flat out require 2FA for maintainers.

~~~
hn_throwaway_99
I think the blog author is implying as much as he can, without directly
accusing, that he believes that
[https://github.com/shinnn](https://github.com/shinnn) was responsible for the
bad code, not a random hack.

~~~
btown
To quote the article: "As far as we are aware, the only purpose of the
malicious code was to sabotage the purescript npm installer to prevent it from
running successfully... the purpose of this condition [in the code, hardcoded
to include the word 'cli'] seems to be to ensure that the malicious code only
runs when our installer is being used (and not @shinnn’s)."

:hmm:

~~~
uponcoffee
> 9 July, around 01:00 UTC: @doolse identifies that load-from-cwd-or-npm@3.0.2
> is the cause. See purescript/npm-installer#12 (comment). @doolse opens an
> issue on the load-from-cwd-or-npm repo pointing out that the package is
> breaking the purescript npm installer (although at this stage, none of us
> spot that the code is malicious). This issue is later deleted by @shinnn.

Hmm indeed. A hack is possible but the timeline of events is dubious.

------
blackoil
What JS needs is a standard library (or several), developed and maintained by
MS/Google or preferably a foundation. There's no reason every library should
have 10-level-deep dependencies.

------
s_Hogg
High-velocity ecosystems like this make it way too easy to optimise your code
for getting pats on the head. It's good that people are pointing out things
like this (relatively speaking) not long after the fact, and that a lot of
people on here are annoyed about it. The fix is cultural as much as any 2FA
implementation.

------
quickthrower2
Part of the problem is the bounty for attacking NPM packages is high. You get
a high profile exploit and lots of people talking about it, or you can even
get some of your evil JS code running on thousands of sites on the back end or
the front end.

Compounded by the fact there is no decent base class library for JS like you'd
get for .NET [0]. Want to do anything you could do by default with .NET BCL?
Like open a url, save a file (with nice api) or parse some XML?

Then npm i ... it is. And hope it doesn't pull in an exploit.

As a mitigation, I recommend people consider writing their own code (NIH) for
simple stuff, rather than `npm i`-ing all the things.

[0] I'm comparing to .NET but same could be said of Java/Python/Ruby etc.

------
SanchoPanda
This is not the first npm issue we've seen this year, and it could have been
much worse than this. All package managers create risks in general, but how
the community etiquette evolves around package managers is just as important.
Something is wrong with the latter here.

------
ruuda
One of the things I like about Purescript is that you can use it without
needing any javascript package manager, and without running any javascript
outside of a browser. Nix works well for installing the Purescript compiler as
well as psc-package.
[https://nixos.org/nixos/packages.html#purescript](https://nixos.org/nixos/packages.html#purescript)

------
dmix
This was first posted 16 days ago
[https://news.ycombinator.com/from?site=garrood.me](https://news.ycombinator.com/from?site=garrood.me)

------
gitgud
NPM gets a lot of hate for its dependency management, but I'm not sure what a
solution to this problem would be.

- They can't curate packages, or else that _friction_ will drastically slow
down the ecosystem (1000's of packages get published every day).

- They can't remove/disable packages _(most of the time)_, or dependencies
will no longer be strictly immutable.

- They can't disable sub-dependencies, or else this would greatly reduce code
reuse and increase the redundancy and complexity of packages (every package
might have to roll their own X, or compile their dependencies into bundled
JS with no dependencies).

I think the problem is simply: it's a low-friction dependency management
solution -> which made it so popular -> which is making it a target for
malicious actors.

~~~
ricardobeat
After 10 years of nodejs I can honestly say I wouldn’t mind the friction. NPM
is a wasteland of abandoned packages and reinventing the wheel. They never
solved discovery so you have 500x implementations of the exact same thing.
Around 2015 we passed the point where looking for the “right” package takes
longer than writing your own.

~~~
xeonoex
See, .NET gets a lot of hate from the open source community, but this is one
of the big reasons why I prefer it. Even when you compare it to Java, I feel
like it's way better in this regard.

If I want to work with JSON, I use Newtonsoft.JSON. If I want an ORM I'll use
Dapper (lightweight) or Entity Framework. So many of the libraries that you
would need if you were using Java or JS are just built in to the standard
library.

I know it's not a 100% fair comparison, since I am biased. Part of it also
might be because .NET is younger and less widely used, but I feel it's pretty
true. Python, though, has a ton of open source support and is better in this
regard.

~~~
filleduchaos
> I know it's not a 100% fair comparison, since I am biased.

It might also have something to do with you apparently unironically comparing
a framework to a programming language.

~~~
xeonoex
I'm not sure that you understand what .NET is. .NET is a platform/ecosystem. I
was talking about packages. Packages are managed by nuget, which works across
all .NET languages, which is why I didn't specify a language.

~~~
filleduchaos
Framework, platform, ecosystem, SDK, etc: you're rather neatly sidestepping
the point that you're comparing _not a programming language_ to _programming
languages_. A fair comparison would be C# to Java, or Javascript.

What Microsoft chooses to call the .NET Standard Libraries is not at all the
same thing as a language's standard library.

~~~
xeonoex
How is that relevant at all? We're talking about package management. .NET uses
nuget. Let's look at the nuget site:

[https://www.nuget.org/](https://www.nuget.org/)

> NuGet is the package manager for .NET.

So my point is invalid because I wrote .NET instead of C#/VB/F#? You might
want to contact nuget to tell them their site is wrong too then.

My point is that finding quality packages is easier with nuget than the JS or
Java package managers. And using them is usually easier too. It's a fair
comparison.

------
austincheney
Uniqueness and trust of package naming is easily solved: use a URI. It doesn't
even have to resolve. A URI is a naming convention that provides both
uniqueness and universality.

------
weq
So what you're saying is, all the code I see on blogs, demoing that cool
little JS thingy, is actually just demo code, not prod ready. To get prod
ready, you need to break the NPM dep and vet everything on your own. So you're
telling me `npm run serve` isn't good enough for prod either?!

Lies, more lies and broken promises!

------
sieabahlpark
Well I guess it's good I didn't start using this after an article I saw
yesterday about it.

~~~
ff_
The events the article refers to happened two weeks ago and have since been
resolved, with further steps taken to reduce the chance of this happening
again in the future (e.g. by vendoring a lot of code).

~~~
sieabahlpark
Well that's good

------
throwaway3627
You know a platform doesn't care about security if either:

a. They don't do end-to-end integrity and non-repudiation (no signed hashes of
files -- not just https, not just hashes, but signed archives/files that can
be verified as coming from the developer, whether with gpg, s/mime or x509
certs)

b. They allow packages to execute code or scripts on download or installation

And, they don't care about your time if they don't automatically offer a
prebuilt, reproducible binary mechanism with a build-from-source
install/verification option.

------
ga-vu
Not actually malicious. It doesn't steal user data, drop malware, or damage a
computer. Just crashes the library. Looks like another developer-developer
slap fight.

~~~
braythwayt
I beg your pardon, but if I am using this library as part of a shipping piece
of software-as-a-service, and I am in the middle of shipping a new feature
when suddenly things mysteriously crash...

If I later discover that the crash was put there deliberately, I am going to
call that malice, and malice that has directly impacted a functioning business
and its customers.

It's no different than a disgruntled person putting tacks on the road outside
of a supplier. If my truck goes there, gets a flat, and crashes into the ditch
as a result, I would call that malice as well.

Deliberately crashing software that other people depend upon is malice.

~~~
danShumway
It's malicious.

That being said, I always get pushback when I mention this but I think SaaS
projects should _often_ be vendoring dependencies. It's safer, it's more
secure, it gives you more consistent installs -- and it prevents `leftpad`
scenarios. It makes source control slightly more complicated, but the other
benefits (often) greatly outweigh that.

This is something that used to be more commonplace in the Javascript
community, and it's something that `node_modules` makes very easy, but it's
fallen out of style in modern web development.

To the best of my knowledge, this was also commonplace in the original design
of Go, since it was coming out of Google, which does vendor all of its
dependencies. I'm not sure which way the current Go community leans.

~~~
d4mi3n
Vendoring as practice isn’t common in some ecosystems (ruby, JavaScript,
python, Clojure), but I do see more people using caching proxies and services
like JFrog’s to ensure they always have access to particular versions of a
dependency.

Sadly, this still doesn’t fix non-repudiation problems we see in ecosystems
that don’t enforce things like package signing.

