
NPM lockfiles can be a security blindspot for injecting malicious modules in PRs - fagnerbrack
https://snyk.io/blog/why-npm-lockfiles-can-be-a-security-blindspot-for-injecting-malicious-modules/
======
est31
Note that with Rust's cargo, replacing a registry source with a source from
github isn't possible without also changing the Cargo.toml. Cargo always
consults both .toml and .lock files, and only uses the .lock file to the
extent that it's consistent with the .toml. If it notices any
inconsistencies, e.g. .lock pointing to github but .toml pointing to
crates.io, its behaviour depends on the parameters passed. If you pass
--locked or --frozen, it gives an error. If you don't pass any parameters, it
tries to fix the issue by itself. It can't use the information from the .lock
file to get the precise version, and thus just uses the latest version that's
still consistent with the version requirement in the .toml file.
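As a hedged illustration (the crate name, versions, and commit hash here are made up), the kind of inconsistency described above looks like this: the manifest requests a crate from crates.io, while the lockfile claims it was resolved from a git source.

```toml
# Cargo.toml -- the manifest requests the crate from crates.io
[dependencies]
some-crate = "1.0"
```

```toml
# Cargo.lock -- a tampered entry pointing at a git source instead
[[package]]
name = "some-crate"
version = "1.0.3"
source = "git+https://github.com/attacker/some-crate#aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
```

With `cargo build --locked` this mismatch is rejected outright; without the flag, cargo re-resolves from the .toml and rewrites the .lock entry.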

I don't think that cargo's decision to always consult the .toml was actually
motivated with security in mind. Instead I think cargo's authors just wanted
to avoid the hassle for developers of having to constantly invoke commands to
update the .lock file (is this what "convention over configuration" means?).
So I think this is an instance where a choice made for better ergonomics
actually ended up improving security as well.

Note though that even with current cargo, any PR that claims to do a simple
cargo update could very well not be such a PR. It could e.g. include an
"update" to a yanked and possibly vulnerable/backdoored version. One malicious
upload, even if yanked, could be milked for a long time. See repi's comment in
this thread for a solution:
[https://news.ycombinator.com/item?id=21887445](https://news.ycombinator.com/item?id=21887445)

------
repi
Added a verification step to prevent this in Rust code with our cargo-deny
tool. [https://github.com/EmbarkStudios/cargo-deny/pull/80](https://github.com/EmbarkStudios/cargo-deny/pull/80)

We run this in GitHub Actions for all of our Rust repos, primarily to catch
banned or duplicate dependencies and to enforce approved licenses. But it also
works for verifying the sources of crates, which prevents this specific issue.
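The sources check can be configured in deny.toml roughly like this (an illustrative sketch; consult cargo-deny's documentation for the exact keys):

```toml
[sources]
# Refuse crates resolved from registries or git repos
# that aren't explicitly allowed below
unknown-registry = "deny"
unknown-git = "deny"
allow-registry = ["https://github.com/rust-lang/crates.io-index"]
```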

------
zeta0134
Wait, you're supposed to commit those? Genuinely curious.

I always set up an ignore rule for them and use package.json's dependency list
instead. My projects are all on the small side though; is the idea to commit
the .lock file and avoid breakage through things like minor version updates of
your first-level dependencies? Wouldn't that carry a bigger security risk for
large numbers of packages that then aren't receiving patch updates as readily?

~~~
stickfigure
Without the package lock file, your build is not repeatable. In two years (or
more likely, two weeks) someone new will check out your code, it will be
broken out of the box, and the poor sod will have no idea why. More than
likely they will blame you.

~~~
splix
Maybe I'm missing something, but why can't dependency resolution be
deterministic? If you always choose the version specified in the config, or
the highest version if a few different versions of the same dependency are
defined through intermediaries, you'd always end up with the same set. AFAIK
that's the strategy in Gradle for JVM projects, and a similar deterministic
strategy is used by Maven.

~~~
javagram
NPM resolution is not deterministic because package dependencies may be
specified using SemVer ranges, e.g. “~1.0.0”, which may result in pulling
1.0.0, 1.0.1, 1.0.2, and so on - anything that matches the 1.0.x specifier.

The NPM ecosystem commonly uses these types of specifiers. Even if your own
dependencies are all specified in “package.json” using exact versions, they
almost certainly themselves depend on other packages (dependencies of
dependencies, and so on) using semver specifiers. That means whenever an
author of one of those packages publishes a new version, your resolved
dependencies may change.

The package-lock.json is simply a record of the exact dependency resolution
graph based on the registry state that existed at the moment you generated it,
so that you can reliably reproduce that graph a month or a year later.
Otherwise it’s very common for NPM-based projects to have “npm install” fail
later on (or succeed, but create a bug or unit test failure) because some new
version was added to the registry and created a compatibility issue or bug.
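As a sketch of that non-determinism (version numbers are made up), here is roughly how a tilde range resolves against whatever happens to be published at the moment:

```python
def tilde_bounds(spec):
    # "~1.0.0" allows patch-level changes: >=1.0.0 <1.1.0
    major, minor, patch = map(int, spec.lstrip("~").split("."))
    return (major, minor, patch), (major, minor + 1, 0)

def resolve(spec, published):
    # npm picks the highest published version inside the range,
    # so the result changes as new patch releases appear
    lo, hi = tilde_bounds(spec)
    candidates = [tuple(map(int, v.split("."))) for v in published]
    matching = [v for v in candidates if lo <= v < hi]
    return ".".join(map(str, max(matching))) if matching else None
```

Run `resolve("~1.0.0", ...)` before and after a hypothetical 1.0.3 release and you get 1.0.2, then 1.0.3; the lockfile exists to freeze that choice in time.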

------
krainboltgreene
While the title of the page says "NPM lockfiles", the article actually talks
about Yarn lockfiles. In essence, though, it applies to all lockfiles. An easy
step for many organizations (especially with GitHub Actions) is to verify the
lockfile in CI.
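A minimal sketch of such a check (the allowlist and the lockfile shape are assumptions based on npm's package-lock.json v1 format; fail the CI build if anything is returned):

```python
import json
from urllib.parse import urlparse

ALLOWED_HOSTS = {"registry.npmjs.org"}  # hypothetical allowlist

def foreign_sources(lock_text):
    # Flag dependencies whose "resolved" URL points outside the
    # allowed registry hosts (e.g. a swapped-in GitHub tarball)
    lock = json.loads(lock_text)
    flagged = []
    for name, meta in lock.get("dependencies", {}).items():
        url = meta.get("resolved", "")
        if url and urlparse(url).hostname not in ALLOWED_HOSTS:
            flagged.append((name, url))
    return flagged
```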

~~~
jakear
Better yet don’t accept third party contributions to lockfiles... I don’t know
any reason I would approve an external PR that modifies the lockfile.

~~~
TheDong
Perhaps you're misunderstanding what this article is about.

The author is making a PR that updates the package.json, which also means the
lockfile is updated.

If you don't accept any PR that touches the package.json file, then that means
you're not accepting a PR that introduces any new dependency, tries to update
a dependency for security reasons, or uses a feature available in a newer
version of the dependency.

All of those seem like valid reasons to accept a lockfile change -- since the
lockfile changes whenever the package.json does.

~~~
narsil
But since the lockfile can be generated from the package.json, I still don't
see a reason that the lockfile needs to be committed by a third-party
contributor instead of being generated automatically post-merge, or by the
repository maintainers during the course of normal development.

~~~
robjan
If you use the lockfile like that then there is little point in having a
lockfile. The lockfile specifies the exact versions of the whole dependency
tree, and their hashes, at a given point in time. If you don't commit it then
everyone will have different versions of the dependencies.

~~~
narsil
I see your point, but if a dependency package needs exact versions for its
dependencies, I'd expect it to be specified in its package.json as well. If
not, then I don't see why exact dependencies for the entire tree are needed.

Sure, if a package absolutely needs exact dependencies for its entire tree, it
can check in the lock file, but I've not found this necessary in practice
provided I use dependencies I trust that follow semantic versioning.

~~~
allover
> Sure, if a _package_ absolutely needs exact dependencies for its entire tree

Not sure you're talking about the same thing as everyone else.

There's a big difference if you're maintaining a package vs maintaining an
app.

If you're a "package" maintainer, you don't want to pin dependencies. Because
the package consumer (i.e. people building apps using your package) should not
have their exact versions dictated to them.

If you're an "app" maintainer, you absolutely need to check in your lockfile,
because you should care about repeatable builds.

------
K0nserv
It certainly doesn't help that NPM's and Yarn's lockfiles are hard to read.
Compare them to the Gemfile.lock which Bundler generates.

NPM's package-lock.json [https://github.com/search?q=filename%3Apackage-lock.json&typ...](https://github.com/search?q=filename%3Apackage-lock.json&type=Code)

Yarn's yarn.lock
[https://github.com/search?q=filename%3Ayarn.lock&type=Code](https://github.com/search?q=filename%3Ayarn.lock&type=Code)

Bundler's Gemfile.lock
[https://github.com/search?q=filename%3AGemfile.lock&type=Cod...](https://github.com/search?q=filename%3AGemfile.lock&type=Code)

------
seniorsassycat
Generate the lockfile (and any other generated files) on a trusted machine
instead, e.g. generate and merge it from a CI server.

~~~
allover
But a lockfile is not like "any generated file", because for typical generated
artifacts they should always be generated the same, for the same commit.

This is not true for a lockfile, where the whole point is to capture the
specific versions at the point in time that the generation is done.

The benefit of the contributor committing the lockfile is that it encodes the
exact combo of dependencies that worked at the time the related code changes
were made (in the same PR).

This means other project maintainers aren't left scratching their heads trying
to figure out why a PR worked on your machine but fails in CI.

~~~
seniorsassycat
It's a code smell if a change works with one lockfile but not with another
generated around the same time. The utility of a lockfile increases with the
age of the file.

We expect lockfiles to change when we update or add dependencies. If a pull
request cannot build with a lockfile generated at the time it is merged, then
that change should not be merged.

~~~
allover
> Code smells if a change works with one lockfile but not another generated
> around the same time.

This isn't true at all. You can't predict when a new, potentially breaking,
version of a dependency might get published. It could happen a second after
you generate your lockfile, or create your PR.

At time of PR creation we could have versions 1, 2, 3 and 4 of our transitive
dependencies.

At time of merge we could have versions 1.1, 2.2, 3.3 and 4.x available.

Dependencies are outside of your control, so always "smell", that's why it's
crazy to do anything other than pin the heck out of them.

------
fractalf
To me this looks extremely frightening. Sure, you can create good routines
etc. to prevent this, but that's not really enough. As a user of any module
you can't trust every package to have good routines. Any basic project has
tens or hundreds of dependencies, and we all know that some of them will be
vulnerable to this exploit. We need a MECHANISM that prevents this; routines
and good practices are not enough.

------
jrochkind1
There is something really weird and complex about the way dependency
management works out in JS that I can't quite put my finger on, but it seems
to make things especially intractable somehow. Other platforms have pretty
similar dependency management tools... yet I think the Ruby `Gemfile.lock`,
for instance, would usually have much smaller, more human-readable diffs for
an analogous change.

------
mxscho
Instead of changing the module's URL to e.g. a private evil GitHub repository,
how about registering a new package in the official npm registry, using
homoglyphs/similar-looking characters in its name?

For diff viewers that highlight entire lines of changed content, the
difference in the package name wouldn't be spotted in review if the version
or some other part of the URL were also changed.

Or am I missing something?
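
You're not missing much. A cheap review-side heuristic is to flag any non-ASCII characters in dependency names (a sketch; note it won't catch ASCII lookalikes such as an uppercase I standing in for a lowercase l):

```python
import unicodedata

def confusable_chars(package_name):
    # Report non-ASCII characters with their Unicode names, e.g.
    # a Cyrillic "а" masquerading as the Latin "a" in "lodаsh"
    return [(ch, unicodedata.name(ch, "UNKNOWN"))
            for ch in package_name if not ch.isascii()]
```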

~~~
saghm
At least when I review code, if I see a line changed that doesn't need to be,
I leave a comment asking why and hold off on approving the code for merge
until I figure out what's going on. This is more due to not having changes
people made by mistake merged in rather than a strategy for dealing with
malicious submitters, though.

~~~
jakear
Curious, would you ever approve a third party change to a lockfile? I maintain
a very large OSS project and I don’t know any reason I would accept a PR that
changes the lockfile.

~~~
reificator
Would you accept a PR that changes the package.json file to add a new
dependency?

~~~
jakear
Not likely, we try to minimize dependencies as much as possible. If a new
feature were to absolutely necessitate a new dependency, it probably wouldn't
be open for third party contributions in the first place (not least of all
because maintenance of the dep now falls onto us if ever upstream falters).

------
mmis1000
It looks like validating only the domain isn't enough.

NPM does not care whether the package name inside the tarball matches the
name in the lockfile.

So you can just point the tarball path at another package with a similar
name, and nobody will ever notice.

And a domain whitelist won't work in this situation either.
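
NPM doesn't perform this check itself; as a sketch, a review-side script could compare the locked name against the tarball filename (this assumes the conventional registry layout `/<name>/-/<name>-<version>.tgz`; `tarball_matches` is a hypothetical helper, not an npm feature):

```python
from urllib.parse import urlparse

def tarball_matches(name, resolved_url):
    # The registry conventionally serves a package's tarballs from
    # /<name>/-/<name>-<version>.tgz; flag any mismatch between the
    # locked name and the file the URL actually points at
    filename = urlparse(resolved_url).path.rsplit("/", 1)[-1]
    base = name.rsplit("/", 1)[-1]  # drop the scope for @scoped/pkgs
    return filename.startswith(base + "-")
```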

~~~
sombremesa
Does it care about casing? I doubt it does... so I could, for example, point
ABCmagical to ABCmagicaI (the last letter in the latter being an uppercase i).
You would have to update the package too, or do something else to red-herring
the colorized diff.

------
29athrowaway
The irony of this article is that it recommends that you use npx.

------
leblancfg
Great work, very creative vector!

On the other hand, this looks like a big missed opportunity for snyk.io!
Feels like they could have taken the extra step and time to come up with a
counter to this in their product, using the post to showcase both the product
and the discovery.

------
andy_ppp
In general I wonder about the ability to receive a rootkit from any number of
sources; sharing software projects is a major concern. What is best practice
here? Only work inside a VM?

------
aliceryhl
Why would the lock file contain the url?

~~~
city41
Different modules can come from different hosts. Whether that’s a good idea is
debatable.

~~~
chii
so why isn't the host for the dependency specified in the package.json
instead?

I think NPM is poorly designed. These lessons were all learnt in the past by
the Maven ecosystem, and yet they are being repeated in the JavaScript npm
ecosystem.

~~~
AgentME
Maven sidesteps the issue by not even having lockfiles and leaving everyone
who checks out a repo free to have different versions of subdependencies, so
I'm not sure it's a very stellar example.

~~~
chii
Only if you use snapshot dependencies (or your dependencies use them).

Maven pins down the version for you if you follow the "best practice", or you
can override sub-dependencies' versions if you so choose (via
dependencyManagement), and it also lets you control the actual repo to
download from. As a bonus, you get to share the binary across different
projects if they are on the same machine.

------
labawi
Why does the article recommend requiring https, if the packages have sha512
integrity checks?
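
The integrity field is indeed the stronger check. As a sketch, verifying an npm-style Subresource Integrity string against downloaded bytes looks roughly like this (the helper name is mine):

```python
import base64
import hashlib

def check_integrity(data: bytes, integrity: str) -> bool:
    # npm integrity strings look like "sha512-<base64 digest>";
    # recompute the digest over the tarball bytes and compare
    algo, _, b64digest = integrity.partition("-")
    expected = base64.b64decode(b64digest)
    return hashlib.new(algo, data).digest() == expected
```

HTTPS still matters for the metadata requests and for privacy, but a tampered tarball fails this check regardless of transport.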

