Alert: NPM modules hijacked (drinchev.com)
238 points by vdfs on March 23, 2016 | 141 comments

There's a lesson to be drawn here about dependency hygiene. On one hand, code-reuse is a good thing, and incorporating other projects by reference is a way to make it happen. On the other hand, each dependency you add creates a little bit of risk: that the package will be updated in a way that breaks your application, or the package maintainer will go rogue, or the package will get hijacked.

Most programming-language communities manage the code-reuse/dependency-hygiene tradeoff by concentrating code into a relatively small number of large libraries, and when people want a function or two that isn't in one of the libraries they're using, they incorporate it by copy-paste. In the Javascript/NPM world, on the other hand, I see a lot of projects with dependency references to huge numbers of tiny dependencies.

Today, we're seeing one of the reasons why that's a liability. Most people will take away lessons about code signing and signatures, and presumably the NPM software is going to be improved in that regard. But the other lesson is that projects should have fewer dependencies. Using a library may be as simple as adding one line to a package configuration file, but using a library properly requires substantial due diligence on the library, its license, its bug tracker and its author.

> Most programming-language communities manage the code-reuse/dependency-hygiene tradeoff by concentrating code into a relatively small number of large libraries, and when people want a function or two that isn't in one of the libraries they're using, they incorporate it by copy-paste.

I don't think it's necessarily "most programming languages". I know Perl, and probably Python and Ruby, have very large ecosystems of small modules. I suspect it's more along the lines of static vs dynamic, or compiled vs interpreted.

> On the other hand, each dependency you add creates a little bit of risk: that the package will be updated in a way that breaks your application, or the package maintainer will go rogue, or the package will get hijacked.

Those specific risks do not necessarily come from including a module; they come from subscribing to a module. If you include a specific version of a module and can confirm it has not changed to a degree that satisfies you (whether that's keeping a local copy to build from, or trusting the distribution system to be immutable and pegging a specific version to install), then your risk is fairly well defined. It's not zero, but it is limited to the problems that existed in the module at the point you reviewed and included it.

Subscribing to a module, that is, having the system download and build "the latest and greatest of whatever thing you call $foo", is not very safe, and if the build mechanism can execute arbitrary code and is automated, it is insanity.
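One way to make "confirm it has not changed" concrete is to record a digest of the archive you reviewed and compare it on every build. A minimal sketch using Node's built-in `crypto` module; the function names are hypothetical, and this illustrates the idea rather than describing what npm itself does:

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a Buffer or string.
function sha256Hex(data) {
  return crypto.createHash('sha256').update(data).digest('hex');
}

// True only if `data` hashes to the digest recorded at review time.
// (Hypothetical helper; the recorded digest would live next to your code.)
function matchesRecordedDigest(data, recordedHex) {
  const actual = Buffer.from(sha256Hex(data), 'hex');
  const expected = Buffer.from(recordedHex, 'hex');
  return actual.length === expected.length &&
    crypto.timingSafeEqual(actual, expected);
}

// Stand-in for the bytes of a reviewed package archive.
const reviewed = Buffer.from('abc');
const recorded = sha256Hex(reviewed);

console.log(matchesRecordedDigest(reviewed, recorded));           // true
console.log(matchesRecordedDigest(Buffer.from('abd'), recorded)); // false
```

A digest only tells you the bytes are the ones you reviewed; it says nothing about whether they were trustworthy in the first place.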

> I don't think it's necessarily "most programming languages". I know Perl, and probably Python and Ruby, have very large ecosystems of small modules. I suspect it's more along the lines of static vs dynamic, or compiled vs interpreted.

The Perl, Python, and Ruby communities might have a lot of "small-ish" modules, compared to something like Boost in C++, but the Node.js community has really taken it to a new extreme with an explosion of "one-liner" modules like the absurd user-home https://github.com/sindresorhus/user-home

The rallying cry of this movement is "Modules are Cheap in Node.js!". Incidents like this help demonstrate that explosive growth of your dependency graph isn't ever cheap, in any language.

I agree that there is a definite cost of dependencies. I don't agree with the implication that these small modules are frivolous - doing one thing and doing it well is a time honored Unix principle. It saves developers from having to roll their own dependency, and gives them free wins if the package is well maintained and widely used. Edge cases are often identified and accounted for, even if the responsibility of the package is tiny.

The example you gave was a poor choice - read further down https://github.com/sindresorhus/user-home#why-not-just-use-t.... It exists to avoid exactly the problem that sparked the conversation around npm today.

We disagree on the utility of adding an entire dependency for a simple rename of os-homedir.

But there are myriad other equally ridiculous packages. is-positive, for instance: https://github.com/kevva/is-positive.

Abstractions are good when they hide and generalize complexity. These modules merely conceal mundanity.

> Edge cases are often identified and accounted for, even if the responsibility of the package is tiny.

In my observation, the opposite is true -- most of these trivial packages punt on edge cases. is-positive makes no attempt to handle any. https://www.npmjs.com/package/average implements an utterly naive calculation of an array's mean that is numerically unstable in the presence of JS's everything-is-a-float semantics.
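To illustrate the kind of edge case meant here, compare a sum-then-divide mean with an incremental one. This is a sketch of the general technique, not the code of any package named above; both function names are made up for the example:

```javascript
// Naive mean: sum everything, then divide. The running sum can overflow
// to Infinity even when the true mean is a perfectly representable number.
function naiveMean(values) {
  let sum = 0;
  for (const v of values) sum += v;
  return sum / values.length;
}

// Incremental mean: keep a running mean and nudge it towards each value,
// so no large intermediate sum is ever formed.
function incrementalMean(values) {
  let mean = 0;
  values.forEach((v, i) => {
    mean += (v - mean) / (i + 1);
  });
  return mean;
}

const big = [Number.MAX_VALUE, Number.MAX_VALUE];
console.log(naiveMean(big));                            // Infinity
console.log(incrementalMean(big) === Number.MAX_VALUE); // true
```

The incremental form is not a cure-all (the difference `v - mean` can itself overflow for values of opposite sign near the limits), but it shows that even a "one-liner" has failure modes worth knowing about.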

Hiding tiny, naive implementations behind dependency imports actively harms understanding of the implementation details and failure modes of the code being relied on, in addition to making your build much more fragile by encouraging you to rely on external dependencies that can and will vanish from the Internet at inconvenient times. It's a bad trade-off from every angle.

Edit: this blog post (currently on the front page!) does a really good job of capturing my utter bafflement at how completely crazy the node community has gone with this nonsense http://www.haneycodes.net/npm-left-pad-have-we-forgotten-how...

Edit 2: and none of this even touches on the complete insanity that is npm running arbitrary code in each dependency at install time. Having 2,000 one-liner trivial dependencies amounts to inviting 2,000 strangers to run arbitrary code on your build server. The node.js community appears to be a pack of complete amateurs.

How is that absurd? I've written Node apps where I needed to get the user's home path, crossplatform. Instead of memorizing this:


I'd rather `npm install` it and require it. Of course, now Node has this: https://nodejs.org/api/os.html#os_os_homedir, but it didn't and you get the point.

The problem was with npm and allowing unpublishing/hijacking, not the Node community.

It's absurd because the obvious choice from the beginning should have been a library of abstracted OS concepts. Even if it starts as a single method/function for determining the homedir, making sure it can easily handle more abstractions is the obvious choice. People wanting homedir() will likely also want a way to separate a path from a file, know how to separate path components based on platform ('/' vs '\'), and a few other obvious things (see the os module you referenced).
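For reference, this is roughly what that looks like with Node's built-in `os` and `path` modules, with no third-party dependency. The file name is a made-up example:

```javascript
const os = require('os');
const path = require('path');

// Home directory of the current user, resolved per platform.
const home = os.homedir();

// Hypothetical file under the home directory, joined with the
// platform's own separator.
const notes = path.join(home, 'projects', 'notes.txt');

console.log(path.sep);             // '/' on POSIX, '\\' on Windows
console.log(path.dirname(notes));  // directory part of the path
console.log(path.basename(notes)); // 'notes.txt'
```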

A low barrier of entry for submitting modules is generally a good thing, but if it's too low, you end up filling your ecosystem with crap.

Plus, hopefully the functions in the module would be more consistent (with each other).

It's absurd because the repo contains files to handle git, npm and other stuff that have more lines written than the actual code.

I continue to be amazed by how far some people will take abstraction for the most menial of components.

If that's the case, that is rather excessive. I hope there's at least a movement going on to shift towards using utility modules that combine a few tens of these one-liners into a cohesive namespace/class/object where complementary. Keeping track of many, many very small modules does indeed impose a cost, which it looks like some NPM users have gotten around by doing the absolute minimum to keep track. This is the result.

There are modules that combine modules.

Alas due to the beauty and power of combinatorics, there will inevitably be more compound modules than atomic ones. If there are not already.

Yeah, I meant combine less in the "concatenate into one large new module automatically" sense and more in the "someone has put out what is rapidly becoming the de-facto standard for all OS high level actions in a comprehensive single module" sense.

I don't see automatic combination of lots of small modules actually making this situation better in any way. Quite the opposite, actually...

> If you include a specific version of a module

That only works if the central registry is append-only. But as we've seen packages can be removed from npmjs, which caused these problems.

The scope of my comment is obviously beyond just NPM, and I addressed that specific complaint, so I'm not sure how this comment follows mine.

The difference here is that many people in the npm world are working on the web. Minimizing application size is of absolute importance on the web.

If we copy and paste snippets of code between packages, we lose out on potential code de-duplication from dependency flattening.

If you're really worried about your dependencies disappearing/changing beneath you just check them into git.

This is a great comment, especially the advice to use git. Sometimes you may have dependencies, and that's okay. Keep track of them, see what changes in your own capacity. Also an added benefit is if the unicorn dependency distribution system goes bust, you still have local copies of everything.

I guess I don't see what checking the dependencies into git gets you over just archiving your build artifacts instead, while the downsides are apparently visible (repository size/clutter increase, temptation for devs to modify code in your vendored dependencies etc.)

That's also a valid solution. Definitely better than copying and pasting code snippets as OP suggested.

I tried checking in all dependencies for one of my projects a while ago, and my git history quickly became ridiculously bloated as I added/upgraded/replaced more and more dependencies as time went on.

For those who go this route, any advice on what you can do to keep your version control history from blowing up?

Ahh... That must be what I was missing.

So instead of checking in your dependencies into your app repo directly, you turn your dependencies folders into submodules to keep their histories separate from your app repo's history.

Makes sense! I'll be sure to give that a try on my next project.

I've only made some small contributions to a project that used them and haven't set them up myself. But yeah, that's my understanding of it.

There's a bit of extra overhead because you have to fetch them separately after cloning a repo, and updating them is separate from pulling origin/master. If you do one without the other you can probably get into unexpected version mismatches where your project doesn't play nicely with the dependencies.

This is absolutely true, which is what makes tree-shaking so exciting.

> On the other hand, each dependency you add creates a little bit of risk: that the package will be updated in a way that breaks your application, or the package maintainer will go rogue, or the package will get hijacked.

If your development environment is such that you can't test these things before going into production, and production updates its packages on its own, then you have much bigger (and harder to fix) issues than dependency hygiene.

How can a test detect a hijacked dependency that acts exactly like the original, but with a backdoor?

The risk is much bigger than simple lifecycle testing will account for. Upstream micro libraries could easily disappear or have bugs, leaving you spending more time trying to workaround than if you had just written the darn thing yourself in the first place. Or, you suddenly have a new requirement that doesn't play well with the lib.

Programmers love dependencies because they let them pass the buck, but each and every 3rd party module is a potential time bomb waiting to go off.

> before going into production

How about 3 years after a release? 5, 8, 10?

If you haven't upgraded your packages in 10 years, you've got nothing to worry about ;)

What about using "trustable" dependencies? For small utilities like left pad, you could use lodash. You can import just the function, but still get the benefit of knowing that it's lodash, so it won't go down, and it'll be fast and well tested.

This is what a lot of non-JavaScript languages do.

I suspect JavaScript ended up this way because people wanted to avoid downloading anything they absolutely didn't need in the browser. Now packagers can automatically remove unused functions (WebPack 2 does this IIRC, and I think require.js did ages ago) so it's no longer a concern but somehow NPM still ended up with a million tiny libraries.

Thank god Underscore/lodash didn't use the approach of one-module-per-function.

One-module-per-function is exactly the approach Lodash uses, and the Lodash devs tout it as a big advantage over Underscore.

The difference is they also provide the rollup bundle with all of Lodash. Which is what most users actually use.

Yep, that's what I meant. It's one module per function (so I can just grab what I want), but it's all part of a big project under a good maintainer, so it's trustable.

This only "kind of" works for dependencies which you explicitly include in your project.

My understanding of the left-pad issue was that it was frequently down the dependency tree, where devs themselves could not remove the dependency on left-pad without modifying or branching a dependency (or perhaps multiple).

Left-pad was a trustable, fast, well-tested package too, up until NPM let it disappear.

How is it fast? (obviously it was not trustable or well-tested) It is the slowest possible way to implement string padding without any optimization effort whatsoever. This is what I hate about npm, every module claims to be fast even though they have made no effort to be fast.
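For comparison, a padding helper built on the standard `String.prototype.repeat` produces the padding in one call instead of one character per loop iteration. This is a hypothetical sketch, not the left-pad package's code, and no benchmark is claimed:

```javascript
// Pad `value` on the left with `fill` until it is `length` characters long.
// Returns the input unchanged (as a string) if it is already long enough.
function leftPad(value, length, fill = ' ') {
  const str = String(value);
  const missing = length - str.length;
  if (missing <= 0 || fill.length === 0) return str;
  // Repeat the fill enough times, then trim to the exact width needed.
  const padding = fill
    .repeat(Math.ceil(missing / fill.length))
    .slice(0, missing);
  return padding + str;
}

console.log(leftPad('5', 3, '0')); // '005'
console.log(leftPad('abc', 2));    // 'abc'
```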

It's fast 'cus it uses libuv, duh!

Isn't the bigger lesson that Open Source projects are for the people and not for the interest of a corporation?

No, no. This is about whether people should be able to unpublish from the registry, or how that should be handled since it's problematic in any case.

You are part of the problem here. Many folks are refusing to even consider the fundamental issue because it is different from the mindset they are used to.

Yes this particular episode is related to the unpublishing results... however the underlying issue is not only about unpublishing results. The underlying issue is:

Number 1: Having thousands of dependencies is a software engineering trade off; there is an upside and a downside to that trade off. Having Thousands of dependencies is fundamentally more 'risky' than having a few. Period. End Of Story. It is ok to accept the trade off based on what you are optimizing for. It is not ok to refuse to accept the fact that you are making a tradeoff.

Number 2: You should not be running production builds or builds for important components based on dependencies on the naked internet. This is also an engineering tradeoff but is firmly in the camp of what is considered 'bad practice'. This is fundamentally the same exact underlying reason why we have build servers instead of just builds from developer work stations. Developer workstations are not controlled environments, which means the builds are not guaranteed repeatable. A build server that depends on the state of the naked internet is not a controlled environment. It blows up today because of unpublishing, it will blow up tomorrow because of something else.

I agree with what you're saying. But I think the "grand experiment" of npm and modern javascript development is to go "all in" on the "lots of small modules" approach. Having many small dependencies is, and always has been, a trade off. The question is how far can we get optimising away from the pitfalls, while still retaining the benefits?

I think it's a worthy exercise, since the prize is no less than efficiently distributing coding effort across the globe.

Problems like we've seen today are a double edged sword. On the one hand they're troubling, but it'll focus the community on the problems, and hopefully npm will make a few changes too.

> A build server that depends on the state of the naked internet

Whether npm based or not, don't the vast majority of build servers, for whatever language or platform, rely on the state of the naked internet? Whether that's downloading binaries, compiling from source, or whatever?

Your build servers shouldn't really be exposed to the public internet; best practice would be to get all dependencies/binaries off a local network.

> This is fundamentally the same exact underlying reason why we have build servers instead of just builds from developer work stations. Developer workstations are not controlled environments, which means the builds are not guaranteed repeatable.

A side note, if your build depends on the box it's being built on, your build process is bad. The build from any dev workstation should be identical to any other build. People have build boxes to automate building/deployment and testing and alerting people things broke; they aren't about dependency management.

If you're using a build box because of dependency management issues with dev workstations, you're doing it wrong.

> Having Thousands of dependencies is fundamentally more 'risky' than having a few. Period. End Of Story.

No, this is misrepresenting the issue. The amount of dependencies you have is just as irrelevant as the amount of lines they contain.

What actually matters here are the following things:

1. How many different entities am I trusting?

2. How hard is it to audit this code?

3. How easy is it to 'lock in' a dependency version without missing out on updates I need?

4. How much work is it to review updates?

There's no significant difference between small and large dependencies for point 1. You can choose how many entities you trust, regardless of whether the functionality you are using is split up into many small packages from the same author, or combined into one big one.

There will usually be no difference for point 4, because the amount of updates generally relates to the amount of functionality that is being updated. This means that larger modules means less CHANGELOG files to track down, but smaller modules means less chance of an upgrade affecting something you're not using to begin with. It evens out nicely.

However, for points 2 and 3, there is a difference - in favour of small modules. A much smaller, more well-defined surface with less implicit dependencies and coupling. These are the metrics that actually matter.

EDIT: To clarify, 'smaller' and 'larger' here refers to complexity, not line count.

Thank you. I find it very surprising that dependencies on the naked internet are so common nowadays. Why isn't it trivial to build my own package repositories where I mirror the known good dependencies that I need?

Your comment reminds me of what SourceForge did.

>22 Apr 2015: By this date, SourceForge Open Mirror Directory has expanded to take over popular former SourceForge projects which left the site, such as VLC and GIMP for Windows. Some projects have malware embedded despite earlier assurances that the adware system would be purely opt-in.

>16 May 2015: GIMP developer asks SourceForge to remove gimp-win from the site. They don’t. SourceForge later claims they didn’t receive this message.

Honestly, I think you're both correct. It's definitely dangerous to allow packages to be unpublished, but it can potentially be just as dangerous for people to blindly include a library without looking into it a bit first.

First lesson learned is of course in regard to how package managers such as NPM should handle scenarios like this. However, I would also hope this might make some people take a harder look at their dependencies to see if everything they are referencing is both truly needed and trustworthy.

God, I wish I could downvote you.

As it is, this is the first time I've logged in for about 12 months, just to berate you.

This is a decade-old lesson, well understood by the few developers who want their code to be around for a while.

The HN "web developers" are a different crowd.

I've been on the receiving end of a NPM name dispute violation handed down by Izs. https://docs.npmjs.com/misc/disputes

Was on vacation and found out that I had lost a published package name within 24 hours of the dispute request. Broke a few production systems. Really messed up my day.

Ended up having to beg with the person who filed the original request and they eventually gave me the package back.

Honestly, the whole process was a bit personal and I felt like I was being singled out as an individual by NPM, rather than being treated like a developer who was using the service. Not a nice feeling.

It's insane that NPM allows anybody to just take over a module name if the original is unpublished, without even a warning to users.

Insane, yes, but not unexpected. This is the sort of amateur behaviour that is emblematic of the entire nodejs community.

...Along with allowing packages to run arbitrary JS code when installed.

That's the part that really got me. Questionable dependency practices is one thing; but running whatever JS when a package is installed, with no warning whatsoever, is probably a very bad idea.

A year ago I never ever would've thought I'd be happier writing C than NodeJS, but here I am. Weird how that works.

Essentially every package manager has to allow that, and every package manager I can think of does allow that. What's your threat model anyway? The package itself will run arbitrary code once you start using it. Is it so problematic if it starts doing that at installation instead of later?

That's what happens when a language and framework have a very low entry level: more and more irresponsible people gather in one place. Yes, they make it popular and trendy, but they're dangerous for themselves and others.

Amazingly shameful cheap shots and mudslinging against hundreds of people from both you and @na85. Bravo.

Npm is the most miserable tool I've ever used. I just spent the last 30 minutes waiting for Windows (robocopy) to delete the node_modules directory.

It destroys days of development.

This problem is largely solved in npm 3.

Those people deserve criticism, because they're making bad decisions.

and by 'Those people', you mean the entire NodeJS community?

> This is the sort of amateur behaviour that is emblematic of the entire nodejs community.

This is pure mudslinging

No, I'm sure he didn't mean the whole community. There was a reason why njs became so popular, and there are great people behind it, but like everything that's easy to start with, it attracts people with no skills and no experience. When learning C, one of the first things you learn is how operating systems work, what memory leaks are, and why they are dangerous. C and C++ have a higher entry level; people who stay longer with C/C++ have very good experience, and they know more about security, risks, standards, systems... When you have a low entry level, you don't even bother reading on the wiki what XSS is or what the .ssh folder is. We just learnt the hard way that the package manager for NJS is crap. Just to compare, take a look at the crates of a language that has a high entry level:


If any of thousands of strangers could push a button and prevent me from deploying to production, I have made several bad decisions. This seems to apply to the majority of NPM users.

> If any of thousands of strangers could push a button and prevent me from deploying to production, I have made several bad decisions.

That's a strange thing to say considering how insanely complex modern hardware and software are. While the NPM setup is indefensible, there are countless ways a stranger can prevent your deployment, if that stranger works at Microsoft, Intel, or the manufacturer of any production hardware component. OS bugs, driver bugs, compiler bugs, and hardware/microprocessor bugs are a thing. At any point, you are dependent on the work of many thousands of strangers, no matter how perfect your decisions are.

A proper build is hermetic. Sure we have to deal with pre-existing bugs, but nobody outside our org can make intentional changes to our build today without our actively consenting to deploy their updates after first testing them.

And it's somehow one of my most upvoted answers on HN.

I use GitHub to get golang/bundler dependencies. This could easily happen on GitHub: when someone deletes their account (or changes their handle), someone could hijack their account and create a repo with the same name.

It's insane that NPM allows anybody to un-publish modules to begin with.

Even if I have a dependency locked down in my NPM shrinkwrap file, it can change underneath me? That is pretty absurd and gives me zero confidence in my packages. It means I MUST commit them to source control or risk having my project completely broken some day.

I thought for sure that since NPM removed the ability to re-publish the same version of a package with different content, they also wouldn't let you remove versions of a package. It also means you should never use the "^" version specifier or risk downloading some completely different project.

They do have a "warning" at https://docs.npmjs.com/cli/unpublish:

> WARNING
>
> It is generally considered bad behavior to remove versions of a library that others are depending on!

Seriously, what good does that do? Nobody takes warnings seriously.

A specific version of a package is immutable on npm. So if you depend on exactly foo@1.2.3 in your shrinkwrap you'll always get the same code. In other words, even if the package is unpublished, and picked up by a new user they can't re-publish new code to 1.2.3.

If you depend on a version range like ^1.2.3 or ~1.2.3 it's a different story, of course.

Moral of the story is imo always pin to exact versions and use shrinkwrap for production apps.

EDIT: This may not be true .. see below.

> A specific version of a package is immutable on NPM.

That's not true. I thought it was but it's not. That's the terrible part.

To demonstrate with one of the packages that was removed, run:

$ npm info andthen

You'll see that versions 0.0.1 and 0.0.2 were published at one point. However, for "versions" it only mentions 2.0.0. And of course, if you run:

$ npm install andthen@0.0.2

It blows up in your face.


Insane, yes, but I guess it's just never really been that much of a problem until now. Hopefully something good will come out of this..

> it's just never really been that much of a problem until now

That's a reasonable excuse in the 90s, before automatic updates over HTTP were common. Our industry now has decades of experience securing HTTP updates and package managers, with various Linux solutions demonstrating good practices.

Something similar is (or at least was) true for Github: I changed my username a couple of years ago and I recently found out that another user is now using my original username. Less of an issue due to the much longer timeframe, but I think not allowing username re-use would be a safer choice.

You would think this is part of security 101 for these things...

And people wonder why big enterprises are scared of touching open source stuff.

> And people wonder why big enterprises are scared of touching open source stuff.

some open source stuff. Most enterprises dig distributions, especially with LTS.

Many big enterprises run on free software. Just because this particular free software project is a shit-show doesn't make all free software projects this bad.

Module repositories have been a thing for decades. Even a platform as crusty-old as WordPress doesn't allow repository take-overs for plugins. Surely someone had some experience outside npm to think about this.

Link rot and domain squatting are well-known, decades-old issues. NPM identifiers can be considered as just another weird form of pseudo-URIs and aren't special here.

It is insane that modules don't have signatures (that are actually verified, of course). Because npm can feed you basically anything.

It's not a problem that something else gets published under the old address. It's perfectly natural. The real problem is the trust model - that new content's accepted without even warnings.


My semi relevant tweet, it's just asking for it with that warning

Why not just post the tweet here instead of spamming your Twitter feed?

That's a perma link to the direct tweet... why paste in the tweet when I can just link it...

HN would be unusable if everyone stopped writing comments and just pasted links to an ephemeral third-party service.

Bravo, you've illustrated the whole problem

Edit: I'm definitely wrong. The person who published these was merely parking the packages. Leaving the existing (inaccurate, see response) message as one example of how badly this could have turned out.

> and the content of the files is suspicious

The script seems as though it might publish your entire codebase to NPM. Sorting NPM packages by creation date[1] reveals[2] a[3] few[4] potential victims. Filtering by the ISC license also seems to work.

[1]: https://libraries.io/search?order=desc&platforms=NPM&sort=cr...

[2]: https://libraries.io/npm/alaska-dev - internally hosted repo, edit: license doesn't match other projects by the same developer

[3]: https://libraries.io/npm/b3app-prototype - private bitbucket repo, edit: deleted from npmjs

[4]: https://libraries.io/npm/nodework - private bitbucket repo

Installing one of these packages doesn't execute the script though, you'd have to do that yourself. (It could be updated to do that though, which is terrifying.)

Well there is pre/post install scripts that can be run. Not saying this package does that, but it is very easy to run a script just after installing from npm
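As a rough illustration, a package declares these hooks under `scripts` in its package.json, so you can at least list them before installing. A sketch that takes an already-parsed manifest object; `findInstallScripts` and the sample manifest are hypothetical, and it only covers the three install-time hook names checked here:

```javascript
// Lifecycle hooks that npm runs as part of installing a package.
const INSTALL_HOOKS = ['preinstall', 'install', 'postinstall'];

// Return the install-time scripts declared in a parsed package.json.
function findInstallScripts(manifest) {
  const scripts = manifest.scripts || {};
  return INSTALL_HOOKS
    .filter((hook) => typeof scripts[hook] === 'string')
    .map((hook) => ({ hook, command: scripts[hook] }));
}

// Made-up manifest for demonstration.
const sample = {
  name: 'example-package',
  version: '1.0.0',
  scripts: { test: 'node test.js', postinstall: 'node setup.js' },
};

console.log(findInstallScripts(sample));
// [ { hook: 'postinstall', command: 'node setup.js' } ]
```

This only inspects the top-level manifest you hand it; every package further down the dependency tree can declare its own hooks.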

I wonder if the author is possibly proving a point. Regardless, it seems as though having an NPM user set up on your CI server is a big risk.

You have to have an NPM user set up on your CI server in order to use private NPM packages, which some companies do instead of hosting their own onsite NPM.

Yes, npm will publish any package.json without `private: true` set, and will publish all files in the current directory by default (anything not listed in .gitignore or .npmignore). There is a great CLI tool called "irish-pub" I use, it shows a publish dry-run to avoid these simple mistakes.

Great suggestion, thanks!

x.sh was just the script I used to automatically register all the packages.

It takes one argument, a package name, then attempts to publish that package.


    cat list | xargs -I{} ./x {}
was what I used to publish the whole list.

It seems as though, unintentionally, you've found a pretty big vulnerability. Either way, kudos for parking these packages until they find a new home - this could have turned out so much worse.

Hey really sorry for what happened.

Anyway, did you think of explaining your intent in doing so?

I've been doing development for years and I've never seen a developer naming files "x.sh", except if it was not malicious.

As shitty as unpublishing is, I think the best thing the npm community needs to take away from this is auto updating is bad. Remove the ~ and ^ from your dependencies so that only a package of a specific version can be installed. "This or better" thinking doesn't work if "better" is unknown. I know that the version I am using currently is fine but I don't know about future updates. Even if we had signed packages, we are still installing unknown software if we just trust other devs. I lock my dependency version and then use https://www.npmjs.com/package/npm-check-updates to figure out what needs updating, then I test my code. This should not be done as an automatic part of the build process.

*edited npm capitalization
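A quick way to find the offenders is to scan package.json for anything that isn't a plain `x.y.z` version. A minimal sketch working on a parsed manifest; the function name and sample data are made up, and the regular expression is deliberately conservative (it flags tags, URLs and every range form as "loose"):

```javascript
// Matches only a plain semver version such as 1.2.3, 1.2.3-beta.1 or 1.2.3+build.
const EXACT = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// List every dependency whose spec is not pinned to one exact version.
function findLooseDependencies(manifest) {
  const loose = [];
  for (const field of ['dependencies', 'devDependencies']) {
    for (const [name, spec] of Object.entries(manifest[field] || {})) {
      if (!EXACT.test(spec)) loose.push({ field, name, spec });
    }
  }
  return loose;
}

// Made-up manifest for demonstration.
const manifest = {
  dependencies: { a: '^1.2.3', b: '1.2.3', c: '~0.4.0' },
  devDependencies: { d: '*', e: '2.0.0-beta.1' },
};

console.log(findLooseDependencies(manifest).map((dep) => dep.name));
// [ 'a', 'c', 'd' ]
```

Note that this only covers direct dependencies; it says nothing about the ranges declared by the packages you depend on.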

No need to muck about with your package.json. Actually fix the problem by just checking in your node_modules folder.

Then continue working like you always have.

There are some changes that you'll need if you are using native modules, but they are simple and easy to do.

You can use shrinkpack[1] to solve the issues with native modules. It allows you to check in the archives that npm downloads instead of the installed node_modules folder.

[1]: https://github.com/JamieMason/shrinkpack

That's a great solution, but I was just talking about installing the source into node_modules and running `npm rebuild` instead of `npm install` when you install your app.

Dependency version ranges should be as _wide_ as possible, so that users of your libraries have flexibility when selecting versions.

Rather than having you artificially constrict your version ranges, NPM should support real version locking, and applications should check in their lock file.

Known ranges are fine. e.g. It is fine to accept known published versions 0.0.1 to 0.0.4, but not unknown future versions 0.0.5 and up, which ^ and ~ allow.
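To make the "known range" idea concrete, here is a small sketch of a closed-range check. It assumes plain "major.minor.patch" versions with no prerelease suffixes, so it is not a substitute for a full semver parser, and both function names are invented for the example.

```javascript
// Compare two plain "major.minor.patch" versions numerically.
// Returns -1, 0 or 1. Assumes no prerelease or build suffixes.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

// True when `version` lies inside the closed range [lowest, highest].
function inKnownRange(version, lowest, highest) {
  return (
    compareVersions(version, lowest) >= 0 &&
    compareVersions(version, highest) <= 0
  );
}

console.log(inKnownRange("0.0.3", "0.0.1", "0.0.4")); // true
console.log(inKnownRange("0.0.5", "0.0.1", "0.0.4")); // false
```

In package.json itself the same bound can be written as a closed range such as `>=0.0.1 <=0.0.4`, which has an upper limit where `^` and `~` leave the patch level open.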

Do you mean like npm's shrinkwrap [1]?

[1] https://docs.npmjs.com/cli/shrinkwrap

By adding `save-exact=true` to .npmrc in a project NPM saves the exact version.

Learned this while trying to fix shrinkwrap a few weeks back, but it seems like good practice for security's sake now.

This doesn't lock the versions of your sub-dependencies.

`npm shrinkwrap` exists to do that. Applications should use it to pin the versions of all of their dependencies and sub-dependencies.
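As an illustration of what that pinning covers, here is a sketch that walks a shrinkwrap-style object and lists every pinned package, sub-dependencies included. It assumes the nested `dependencies` layout that npm-shrinkwrap.json uses; the function name and the sample data are made up.

```javascript
// Hypothetical helper: flatten a shrinkwrap-style tree into "name@version"
// strings, depth first, so the whole pinned set can be reviewed or diffed.
function listPinned(node, out = []) {
  for (const [name, dep] of Object.entries(node.dependencies || {})) {
    out.push(`${name}@${dep.version}`);
    listPinned(dep, out); // recurse into sub-dependencies
  }
  return out;
}

// Made-up tree for illustration.
const sample = {
  name: "my-app",
  version: "1.0.0",
  dependencies: {
    "some-lib": {
      version: "4.13.4",
      dependencies: { "left-pad": { version: "0.0.3" } },
    },
    "other-lib": { version: "4.6.1" },
  },
};
console.log(listPinned(sample));
// prints [ 'some-lib@4.13.4', 'left-pad@0.0.3', 'other-lib@4.6.1' ]
```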

Yes, and updates of sub dependencies have bitten me before. I will look into shrinkwrap.

I challenge someone to tell me the difference between npm and an RCE.

"npm" is two things in this picture: the server and centralized service, and the software tool on everyone's build/dev machines.

Folks have historically been happy trusting the centralized npm server to behave consistently and pleasantly, and some opinions have recently shifted on that. But frankly, the server doesn't matter, in the big picture. The npm tool on your computer does. It's the one executing code on your computer with all of your local user's privileges.

This tool starts executing new code from a new author on my host as my user without any authentication except "trust the server". This is exactly the same words we would use to describe behavior of $script_kiddie_virus_of_the_week:

> "download code from C&C server; run it, thanks:D"

What's the difference here? "good intentions"?

I'd rather have something more than "good intentions" controlling what code ends up running on my computer. Wouldn't you?

Author of the post here. According to a tweet [1], the user @nj48 seems to be non-malicious.

I updated the blog post.

Nevertheless I find his actions dangerous and irresponsible.

[1] https://twitter.com/seldo/status/712673227630313472

Given the conversation in yesterday's thread [1], I think a known (and personally identified) non-malicious person defensively reserving these packages is less dangerous than some unknown entity taking over these packages later.

I personally think the best course of action would have been for the NPM team to immediately blacklist these names (aside from left-pad, that's a separate conversation) after the entire list of unpublished packages was shared, and then make them available on a case-by-case basis.


Curious, should those of us using their Linux PPAs disable them until this is sorted? The idea of re-using disabled package names is out there now and I wonder how long it is before someone malicious tries this. Is this an unreasonable concern?

Not sure about that. I'm unaware of how the PPAs + NodeJS's NPM packaging model work.

For those of us who are pissed that this is going down but need the status to get on with our lives:

nj48 is a known friendly who has identified himself to us. We're going to clarify later today.


I bet the orig dev was considered a known friendly until he decided to unpublish.

Relying on the notion of a "known friendly" to protect packages and namespace does not strike me as a sound practice.

As others may have mentioned, publishing packages really should be fire and forget. If something bad goes out, a replacement should be sent out. And for the life of me, I don't understand why they did not go with the <author/package> scheme.

Yes, as far as I've heard, this was done defensively to prevent malicious actors from claiming these packages.

Also, relevant conversation here in yesterday's thread: https://news.ycombinator.com/item?id=11340510

It's worth noting that the official NPM dispute resolution policy makes the following very clear (https://docs.npmjs.com/misc/disputes)

> Some things are not allowed, and will be removed without discussion if they are brought to the attention of the npm registry admins, including but not limited to:


4. "Squatting" on a package name that you plan to use, but aren't actually using. Sorry, I don't care how great the name is, or how perfect a fit it is for the thing that someday might happen. If someone wants to use it today, and you're just taking up space with an empty tarball, you're going to be evicted.

5. Putting empty packages in the registry. Packages must have SOME functionality. It can be silly, but it can't be nothing. (See also: squatting.)

Looks like the author of the modules wrote a shell script to generate a package.json file and publish empty modules to npm to grab up all the unpublished names. However, when they ran it, they ran it in the same folder with their shell script and list of available modules and `npm publish` included them in the published modules as it does by default.

A big problem with software repositories that don't allow for/enforce cryptographic signing by the developer is that this can happen...

Ideally the developer would sign before publishing and the consumer could check the signature to validate before using.

Whilst not a silver bullet this is a kind of essential part of a secure package management solution.
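The sign-then-verify flow described above could look roughly like this with Node's built-in `crypto` module. This is a generic sketch, not something npm provides: the choice of Ed25519, the function names and the fake tarball contents are all assumptions for the example, and it needs a Node version that supports `crypto.sign` with Ed25519 keys.

```javascript
const crypto = require("crypto");

// Developer side: generate a key pair once; the public key is shared with
// consumers through some trusted channel.
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");

// Developer side: sign the exact bytes that will be published.
function signPackage(tarballBytes, key) {
  // Ed25519 takes no separate digest algorithm, hence the null.
  return crypto.sign(null, tarballBytes, key);
}

// Consumer side: check the downloaded bytes against the signature.
function verifyPackage(tarballBytes, signature, key) {
  return crypto.verify(null, tarballBytes, key, signature);
}

const tarball = Buffer.from("pretend these are the bytes of a package tarball");
const signature = signPackage(tarball, privateKey);

console.log(verifyPackage(tarball, signature, publicKey)); // true
console.log(verifyPackage(Buffer.from("tampered"), signature, publicKey)); // false
```

A republished package under the same name would fail verification unless the new publisher also held the original private key, which is the property that matters here.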

Plenty of repositories require signatures.

For NPM? As far as I'm aware it's not even an available feature. None of RubyGems/PyPI/NuGet require digital signatures...

What repositories were you thinking of that do require that?

NPM doesn't. Maven does. Debian has authentication of the repository itself.

Yep, the Linux repositories are generally way ahead of the programming language lib ones in this regard (evidently with the exception of Maven), which is one of the reasons it's a shame to see newer ones not learn the lessons that previous repos have learned on security.

Overly dramatic. The content is not suspicious at all, it's clearly a script to generate the package.json for an arbitrary package name passed in as an argument to the script.

I'm guessing @nj48 used a script to go over the official list of unpublished modules and attempt to generate a placeholder for each of them.

The same user also published some actual (unmanipulated) forks of the original modules, so I'm guessing this is mostly a quick move to preempt any malicious hijacking by others.

There is no reason to assume the "hijacking" is malicious. Certainly not in the scripts. The user is active on GitHub and shows no indication of malicious intent.

However it IS worth pointing out that the unpublished modules should now be treated with caution if you still rely on them because even if the replacements are identical and benevolent you probably need to take action.

> a script to generate the package.json for an arbitrary package name

Yep, that and it publishes your project to NPM!!!

edit: I was wrong, the x.sh script looks malicious but appears not to be? In any case it shows that installing NPM modules is just as unsafe as 'curl | sh'.

Ways to be safe are to first check what scripts will be run by a package: 'npm show $MODULE scripts'. There is also the '--ignore-scripts' flag[1].

[1]: https://blog.liftsecurity.io/2015/01/27/a-malicious-module-o...
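If you have a package's package.json in hand (for example from the output of `npm show`), a quick sketch like the following picks out the hooks that run automatically on install. The helper name and the sample manifest are hypothetical; `preinstall`, `install` and `postinstall` are npm's actual install-time lifecycle scripts.

```javascript
// Lifecycle hooks that npm runs automatically during `npm install`.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

// Hypothetical helper: return only the scripts that would execute on install.
function installScripts(pkg) {
  const scripts = pkg.scripts || {};
  const found = {};
  for (const hook of INSTALL_HOOKS) {
    if (typeof scripts[hook] === "string") found[hook] = scripts[hook];
  }
  return found;
}

// Made-up manifest for illustration.
const manifest = {
  name: "suspicious-module",
  scripts: { test: "mocha", postinstall: "node ./setup.js" },
};
console.log(installScripts(manifest));
// prints { postinstall: 'node ./setup.js' }
```

An empty result only means the package itself declares no install hooks; its dependencies would need the same check.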

If you're logged in, if you call it with an argument that isn't already in use as a module name, and if you specifically enter the full path to the script (you don't have cwd on your $PATH, do you?). TFA would make more sense if it contained a warning not to run random scripts that one downloads without reading them.

As it is, TFA seems like an attempt to invent malicious rumors about a completely innocent person.

[EDIT:] you're still wrong; in order to be equivalent to "curl | sh", x.sh or its equivalent would have to be referenced in the "scripts" object in "package.json".

You got any proof of that?

I don't think there's anything malicious here. The file x is simply a list of the modules that were 'given up', and x.sh is used to loop through that list and republish them as placeholders, meaning other people can't use it for malicious purposes. The author seems to be in good standing and has even republished the original code to a few modules

There are some very shaky bits in the JS dev arena. Just a little earlier, we read about how a developer unpublished his libraries from NPM; the fallout from this affected numerous high-profile projects like node.js, which were relying on a left-pad "library" that was basically a small string padding function.

This is the same story... It's a module that the developer unpublished

Noticed another random package was uninstalled from NPM. Please oh please don't let this thing become a trend.

Sorry but the elephant in the room IS the lack of namespacing. There is no namespacing on NPM. If there were something like that, this issue would never have happened. This is not a problem of distributed vs non-distributed; NPM doesn't need a "blockchain" or whatever. Packages SHOULD be namespaced, period, and any serious package manager uses namespaces. I (and many others) called it from the very beginning and warned NPM authors that the lack of namespaces would eventually lead to this kind of issue. NPM authors didn't give a damn (there are other issues that have been known for years but they fell on deaf ears). Packages should be resolved by a namespace + the name of the package. Now everybody is in panic mode because nobody knows what one is fetching from NPM anymore.

Now I hope the NPM authors will come to their senses and change the way NPM works. But the trust is broken, no question. Between stuff like that, people selling "real estate" on NPM for real money... Nodejs has been the least professional platform I have ever used. Everybody's out there to make a quick buck, nobody gives a damn, and this whole thing will collapse sooner or later.

Finally people need to stop with the "unix philosophy" excuse. Importing 10 lines of code from a random package is not the "unix philosophy". Splitting a 100 loc module into 20 packages is not the "unix philosophy". Packages have so many dependencies it's getting ridiculous.

edit: corrected

Sorry, 100% agree with you, but for your life going forward, it's "fell on deaf ears".

Oh, sorry, I'm not a native English speaker so yeah I tend to make stupid mistakes like these.

Name squatting in those package managers is a real problem. I found this on pypi and it was a very unpleasant surprise. https://sourceforge.net/p/pypi/support-requests/571/

In the root project folder (containing package.json and ./node_modules), you can run the following:

    comm -12 <(ls ./node_modules | sort) <(curl -s https://gist.githubusercontent.com/azer/db27417ee84b5f34a6ea/raw/50ab7ef26dbde2d4ea52318a3590af78b2a21162/gistfile1.txt | sort)

It will output any of the modules that @azer unpublished yesterday (that are being used by your project).

NOTE: this only works for NPM 3's flat node_module structure
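The same intersection can be sketched in JavaScript, assuming you have already loaded both lists as arrays of names (for example from `fs.readdirSync("./node_modules")` and the downloaded gist split by line). The function name and the sample lists are made up for illustration.

```javascript
// Hypothetical helper: return the installed module names that also appear
// in the list of unpublished modules, sorted alphabetically.
function unpublishedInUse(installed, unpublished) {
  const gone = new Set(unpublished.map((name) => name.trim()));
  return installed.filter((name) => gone.has(name)).sort();
}

// Made-up inputs for illustration.
const installedModules = ["some-lib", "left-pad", "other-lib"];
const unpublishedList = ["left-pad", "another-module", "other-lib "];

console.log(unpublishedInUse(installedModules, unpublishedList));
// prints [ 'left-pad', 'other-lib' ]
```

Like the shell version, this only looks at top-level directory names, so it has the same flat-layout limitation.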

Anything wrong with checking your dependencies into your repo? That way, you know, they can't go away or be hijacked.

As a rule of thumb, if a dependency would take you less than two days to rewrite (one day for writing, one day for testing), it's better to do it yourself and avoid the problems of dependencies.

How many nodejs developers actually audit their dependencies? In my experience, not many. It's just "npm search" without thinking, then "my problem is somebody else's now". Frankly, the whole ecosystem is just scary.

PSA: you should be keeping your dependencies in git

Bundling is not the answer.

It worked for me. After countless injections of errors after updating a large list of dependencies, we made the decision to stop updating and freeze the versions, patch remaining errors ourselves and leave it alone. Worked so well we now have time to rebuild the whole front end on a new stack as an experiment instead of wasting time with endless bug hunts.

Maybe bundling is fine for your in-house proprietary software, but it's absolutely not OK for free software where users and administrators need to keep on top of things like security updates. When projects bundle their dependencies, users become dependent on that project to provide critical updates to software that the project didn't even write. This multiplies for each piece of software that bundles their dependencies. It's simply unsustainable and irresponsible.

I agree. Someone making free OS software for others to use shouldn't bundle.

I made the assumption on the top post that they were in-house proprietary software given the reference to keeping everything in git.

I guess we're on the same page! Sorry!
