Dev corrupts NPM libs 'colors' and 'faker', breaking thousands of apps (bleepingcomputer.com)
924 points by curling_grad on Jan 9, 2022 | 1063 comments



Here's my $.02:

Packages are literally remote code exec vulns in the hands of package authors. At the very least, it takes them under a minute to break your app, simply by deleting their package. Read the article. This is not the first time it's happened, and it's not going to be the last. [0]

I write backends (mostly in PHP, although not exclusively), and I release a lot of my code under libre licenses. But I don't do packages. I don't want that level of control over other people's projects, it's scary as fuck. I have enough responsibilities as is.

I have a mailing list for people who use my code, when an update is out they can download the .php files, 'require' them and test them before deployment, but never will I do packages.

IMO, re-inventing the wheel sometimes is not the worst thing. Including code written by strangers that you haven't inspected and that they can remotely modify is. Stop using packages that are essentially wrappers around three-line Stack Overflow answers.

In this case, the old-fashioned way is the better way, and you'll have a hard time convincing me otherwise.

[0]: https://qz.com/646467/how-one-programmer-broke-the-internet-...


> I have a mailing list for people who use my code, when an update is out they can download the .php files, 'require' them and test them before deployment, but never will I do packages.

This offers no security benefit over a package dependency locked at a specific version.

The end result is the same: the user ends up downloading the .php files and testing them before deployment, but through composer instead of curl.

It doesn't contribute to security at all, it just makes it awkward for other people to use your code.

I would also assume that people are connecting your library to a package management system anyway, to overcome this unnecessary hurdle e.g. https://getcomposer.org/doc/05-repositories.md#loading-a-pac...


I do this solely because I don't like packages, I don't use them, and I don't want to maintain them for other people.

To the people who want to use my code, it is recommended prominently in multiple places that they not blindly trust the code and actually inspect it before using it. The friction in this process is intended.

The code I write is primarily for me. Other people can use it if they want to, and I hope it helps them, but I don't care much about how many choose to use it or not. If they do, they have to work with my preferred way of distributing code.

There have been times where third parties have included my code in their packages, but I'm explicitly not the package author in those cases, so it (the package) is not my responsibility.


> it is recommended prominently in multiple places that they not blindly trust the code and actually inspect it before using it. The friction in this process is intended.

There is nothing inherent in using packages that means you have to blindly trust the code, nor does providing a package mean you have to accept any more responsibility than providing a .php file (packages are just .php files with a few metadata files that allow them to be downloaded using composer rather than curl).

Fair enough if someone doesn't want to add metadata to allow their code to be downloaded by composer, but I disagree that that offers any security benefit.


> There is nothing inherent in using packages that means you have to blindly trust the code

Agreed, but packages are an additional layer of abstraction, and you and I both know that the vast, vast majority of devs will not "look under the hood".

Packages are often seen as a one-step plug-and-play solution. I don't want people to see my code that way. They should dive in and inspect it before using it (it is always written with this in mind - with extensive commenting and documentation).

> neither does providing a package mean you have to accept any more responsibility

Honestly, this is a personal thing for me. If people are using my code, I will feel responsible to some extent. IMO, the advantage of my method is that (at least a few) more people will test/audit my code as opposed to if it was available as a package. Which increases the likelihood of any possible bugs in the code getting caught.


> Packages are often seen as a one-step plug-and-play solution. I don't want people to see my code that way. They should dive in and inspect it before using it (it is always written with this in mind - with extensive commenting and documentation).

> IMO, the advantage of my method is that (at least a few) more people will test/audit my code as opposed to if it was available as a package. Which increases the likelihood of any possible bugs in the code getting caught.

The person who unthinkingly installs a package will also unthinkingly include your script using 'require'.

The only thing that happens is that anyone who is interested in auditing your code and uses composer is inconvenienced with busywork that would otherwise be handled by composer, e.g. autoloading the library.

> Honestly, this is a personal thing for me. If people are using my code, I will feel responsible to some extent.

The point you made was that you would feel more responsibility for a package rather than a PHP file. There's no reason why this should be the case. Both methods result in your code being run by 3rd parties.


> The person who unthinkingly installs a package will also unthinkingly include your script using 'require'

The npm stories show that most people do this with npm though. This colors incident shows many people will just install whatever without checking, whether manually or automatically.

The advantage of this PHP require approach is that it takes effort, and the author makes sure it is not 100,000+ files (npm routinely installs that many files on npm install). Package management is great; it works well with NuGet, for instance. But that is a sane community: no one used leftpad-style packages, so the tree of source to audit is not so large (not counting MS, but then again, you are not auditing Node.js either, are you?). npm is worse than gems, NuGet, or whatever PHP has, simply because the community is broken enough that everything has to be a package, and people use those packages even when you can type the functionality faster than you can search for it (yeah yeah, whine tests, whine docs: for leftpad, nobody cares about those things; it's trivial functionality).

Now faker (I don't know colors) is non-trivial. The question is, what makes this happen here and not in, say, popular NuGet packages? Is it still/again the community, or something else...


Passing around PHP files via email is functionally equivalent to passing out mix-tapes on street corners. Not a good tactic when a record label right around the corner will give you world-wide distribution for free. The only string attached is that you'll have to rely on others about whom you know very little, if anything.

I do not recommend being consistent with that position in other areas of your life otherwise you might quickly find yourself in a jungle, starving and naked. Given that relying on others for shelter, food, or clothing is clearly out of the question!


You've misunderstood, the email only contains a notification that a new release is out, along with a notice about inspecting+testing code and a changelog. Similar to how many FOSS mailing lists work.

The actual code is downloaded from either a git or http server, not via attachments to the emails themselves.


> The person who unthinkingly installs a package will also unthinkingly include your script using 'require'.

Yeah, everything I'm talking about is to make the latter a less likely occurrence.


You have misunderstood, the latter refers to "download the .php files, 'require' them", which is the situation you say exists right now.

I'm going to leave this by saying that I think the idea that you can make developers more conscientious by increasing busywork is false. All it achieves is creating more busywork. Unconscientious developers will do the busywork and not scrutinise the library anyway; conscientious developers will just have to do extra busywork.

A better solution would be to provide a composer metadata file and to publish each new release using a new major release number each time, which is arguably the proper way to signal to consumers of the library that each version needs careful scrutiny and testing, as major release numbers signal breaking changes.


If I understand notRobot correctly, each instance of this 'busywork' is initiated by an email, which is an opportunity for pointing out the importance of testing. It is a matter of fact that most people are susceptible (in some degree) to such influences, so if notRobot is making this point with each announcement, it may have some effect (though probably small) - and regardless, if the process turns away some people who consider this busywork too onerous, and some of those also take the same attitude with regard to testing, then so much the better, from notRobot's point of view! NotRobot is under no obligation to do anything any differently, or give any justification at all.


Yes, you get it! :D


Thanks for the suggestions, I'm not yet convinced, but I will give this more thought!


> There is nothing inherent in using packages that means you have to blindly trust the code

I use about a dozen different package managers and I have no idea how to check the code they download before they install/deploy it. I often check the source on Github if I need to look something up, but I have no idea how I'd go about verifying that the code on Github is the same as whatever the package managers install.


In the context of PHP, the package source is put under vendor/ and in my IDE is automatically indexed. It's very easy to view the source code.

You can even experiment with the packages directly, by editing the files in vendor/.


With node_modules, the amount of required code becomes unmanageable to review very, very quickly (sometimes with the installation of a single package).


It would be nice if Composer could give me a `diff` of before/after an update though.


Git submodule with vendor packages checked in? Delete the module after the upgrade and you’ve inspected it.


That sounds like a personal problem. .deb and .rpm packages are nothing more than tar archives with a specific file structure. dpkg and rpm both have options to extract the package locally. dpkg -L NAME will show you all the files the installed package has placed on your file system (not the ones generated by the code, obviously, but the ones that came with the archive). pip has similar options.

More broadly, and I am sorry if I am wrong here, but what do you expect to glean from reading that code if you don’t bother reading the man page for your package manager?


The point is, if you want people to review the code before they deploy it, it's better to just give them a source file.

Package managers just make it so convenient to use code without ever looking at it.


That is a truly absurd argument.


This just seems like willful ignorance and has very little to do with package managers. If you were interested in looking at the code, a quick google search or running `--help` would go pretty far.


Yes, somehow people seem to confuse a link to the GitHub repo with matching tags for a verifiable build and a hash of the result.


At the very least, distributing in this way (presumably with some license clause that it can't be later placed in a package repository) prevents other libraries using this library as a dependency. Expecting developers to review what's happening with libraries is mostly unrealistic, but expecting them to review changes in the dependencies of the libraries you use is completely hopeless.


By using automatic upgrades you trade theoretical security fixes for undefined behaviour and bugs. One is clearly much worse than the other.


Composer (PHP dependency manager) does not force you into doing automatic upgrades.

You can keep a dependency fixed at a particular version indefinitely. You can also point composer at a private vendored repository of the dependency if you don't trust the upstream server.


> At the very least, it takes them under a minute to break your app, simply by deleting their package. Read the article. This is not the first time it's happened, and it's not going to be the last. [0]

That hasn't been true for almost six years now; it was changed after the left-pad incident, and that article everyone keeps quoting is from 2016. Deleting a GitHub repo or a package does not remove it from npm, as a matter of their policy.


Does updating it with junk take any longer?


Published versions are immutable, you can only submit a new patch with a new version number. It's common for dependencies to be pinned to a minor version (getting patches automatically), however if you use a package-lock.json, as is the default/best-practice, I believe you should be guarded from any surprise patches. You would discover a change like the one in the OP when you manually ran `npm update` on your dev machine, so it should get nowhere near production.


>You would discover a change like the one in the OP when you manually ran `npm update` on your dev machine, so it should get nowhere near production.

Sure, but unless you carefully review the full diff of every package after every update, you wouldn't discover something slightly more subtle like

    if (Date.now() > 1648771200000) { require('child_process').exec("rm -rf ~") }
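    // (1648771200000 ms is 2022-04-01T00:00:00 UTC, so this would only fire from April 2022 onward)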


Moreover, anyone with malicious intentions (or who depends on other packages whose authors have them) can make the whole process much less noticeable by relying on values fetched from URLs that get executed, which may themselves be linked to other dynamic dependencies, creating all sorts of logic/time-bomb or RCE attacks.

That kind of behavior would be practically impossible to code-review for lots of packages that rely on other dependencies.

Maybe we need a different approach that "sandboxes" an external package by default somehow, while keeping breaking changes to a minimum, for the sake of security.


This is what the folks working on WASM/WASI and related projects are trying to achieve.

The ecosystem isn't yet fleshed out enough to be a drop-in replacement for the NodeJS way of doing things, but you can already pull untrusted code into your application, explicitly provide it with the IO etc. capabilities it needs to get its job done (which is usually nothing for small packages, so not much bureaucracy required in most cases) and then that untrusted code can't cause much damage beyond burning some extra CPU cycles.
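
To make that concrete, here is a minimal sketch in plain JavaScript (the file 'untrusted.wasm', its exported run() function, and the env.log import are all made up for illustration): the host decides exactly which capabilities the Wasm module gets via its import object, and the module has no ambient access to the filesystem or network.

    // Hypothetical untrusted.wasm: it imports env.log and exports run().
    const fs = require('fs');

    const bytes = fs.readFileSync('./untrusted.wasm');
    WebAssembly.instantiate(bytes, {
      env: {
        // The only capability we grant: a logging function we control.
        log: (value) => console.log('untrusted module says:', value),
      },
    }).then(({ instance }) => {
      instance.exports.run(); // can compute and call log(), nothing else
    });

WASI standardizes the same wiring for files and sockets: capabilities are handles the host passes in, not something the module can conjure up on its own.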

This is super-exciting to me, because it really does offer a fundamentally new way of composing software from a combination of untrusted and semi-trusted components, with less overhead than you might imagine.

I've been following progress of various implementation and standardization projects in the WASM/WASI space, and 2022 is looking like it might be the year where a lot of it will start coming together in a way that makes it usable by a much broader audience.


sounds like java's SecurityManager all over again


Nothing could be further from the truth. Capability-secure Java code just looks like Java with no surprises. The only difference is that ambient authority has been removed, which means that no code can just call new File("some_file.txt") and amplify a string that conveys no permissions into a file object conveying loads of permissions; you have to be explicitly given a Directory object that already conveys permission to a specific directory, and on which you call directory.createFile("some_file.txt").

Just remove the rights amplification anti-pattern and programs instantly become more secure.


Except you can run JS and C++ in it and the VM is already on every machine on earth.


Which GraalVM can do much better as well, even optimizing through language boundaries (it can effectively inline a C FFI call into whatever language made that call).


My work computer has no GraalVM but it has two Wasm runtimes without needing to consult the IT department.

Graal is cool tech but it's not playing the same game Wasm is.


How is that relevant? Like, if there is a need for it then the IT department will install Graal.

Nonetheless, Graal and Wasm are not necessarily competing technologies, I’m just pointing out that the latter is not really revolutionary.


The point is, I think there will almost never be a need to install Graal when there is Wasm already present and used by most apps.

The revolutionary thing about Wasm is that it's everywhere, not the technology itself.


Java's security manager blocks access to existing APIs that are already linked. The new approach relies on explicitly making only specific APIs available.


or like BSD's pledge


This is on a different level than pledge. pledge applies to the whole process. This sandboxing, as far as I understand, would restrict syscall access to individual functions and modules inside a process.


Can pledge apply to child/sub-processes only?


I think this will be the way to go. It reminds me of having server components all over the place creating a mess, and now we have Docker and Kubernetes. From what I see, this would be a more lightweight version of containerization: not for VMs/services but for each JS package.


Which is totally fine: my build running in a Docker container on a CI server fails, I investigate and see why, and it's all good.

The way we discovered today's problem was that the builds were running indefinitely, just printing stuff in a loop.

If that makes it to production, you've got a problem with your internal processes, not with NPM and their policies.


You realize the code above triggers at runtime? Isolating the build here makes zero difference against such a time bomb.


if (hostname != "ci") { exec("rm -rf ~") }


Just do it randomly... 6.9% of the time be evil. People will write it off as flakiness in ci.


Why would I have this hostname? It is a random string with letters and numbers, as usual. A container per build, never heard of it?


Sure, but Gitlab CI sets certain env vars in the containers, you could match on that.


This. Some antivirus sandboxes use similar heuristics as well.


Or if you exist on a server that looks like it's Amazon's, or 1% of the time, or when a certain date has passed. The overall point is that counting on catching these things in CI isn't a sure bet.


I mean... that's true if you ever use any code that you haven't read through line-by-line. That's not specific to package managers in general, much less NPM, so I think it's out of scope for this discussion.


Not really. I can be reasonably sure that end-user applications I download for a desktop are limited in the damage they can do (even more so for iOS or Android). This isn't something that happens often with programming libraries, but there's no inherent reason they can't be built in a way that they run in a rights-limited environment.


A fine-grained permissions system could fix this by disallowing raw shell execs, or at least bringing immediate attention to the places (in the code) they are used.


> I believe you should be guarded from any surprise patches

As far as I know, npm install still treats it as a feature that it installs new versions (compatible with package.json, but not with the lockfile).


Which is why you only use `npm install` for development, and `npm ci` for production.


No, updating versions should require an explicit `update` command of some sort. The NPM commands should really just be renamed:

- `npm install` should be renamed to `npm upgrade`

- `npm ci` should be renamed to `npm install`


Of course, this can lead to pinning a version out of fear of breakage. Which... is its own problem.


Easy: throw a line of copyrighted code in it so you can DMCA the plug-in later.


dependabot (GitHub's free? notifier) is probably the biggest risk factor in npm supply-chain attacks. Because who audits the actual diffs?

"npm-crev" can't come soon enough...

https://web.crev.dev/rust-reviews/ https://github.com/crev-dev/cargo-crev


Interesting. Do those reviews apply to packages as a whole, or different versions of a specific package? Edit: Yes, the reviews can apply to specific versions.

I'm personally a fan of using Debian/Ubuntu packages, because generally code goes through a human before it gets published. That human has already been trusted by the Debian or Ubuntu organization.


This aims to explicitly solve the problem of "okay, but most maintainers just skim the code at best and spend time on packaging", plus it aims to parallelize it.

And while some packages have been distroized (e.g. a lot of old Perl packages, a lot of Python packages, some Java/Node packages), I have no idea if any Rust package is distro-packaged separately. (Since Rust is statically linked there's no real reason to package source code. Maybe as a source package. But crates.io is already immutable.)


They can still delete the package from NPM can't they?



Even if they did they're not 'breaking your app in minutes', as if all live apps which use that package are suddenly going to poll npm for deleted packages. That's absurd.


Of course that's absurd, that's not really the core of the argument though. I would still consider it breaking my app if I now need to go replace that package somehow, or pull it from some archive, before I can re-deploy my application.


Are these really RCE vulnerabilities? Looking at it systematically I only see this as an RCE vector if you're doing one or more things very wrong. This assumes that packages are immutable and an author can't update a version that's already there. This is how NuGet works, and IMO is how any remotely sane package manager will work. There's no reason for a version to be mutable in this context.

Pegging to a specific version limits exposure. Syncing these packages to your own on-prem/isolated environment limits it further. Deploying all changes to a test/staging environment where they're reviewed first limits it even more.

I mean yeah if your build process takes @latest of all your packages and then pushes it right into production, that opens you up to a lot of risk. It's also incredibly stupid for anything beyond a personal project (and probably even those).

This doesn't strike me as a weakness in package management, it strikes me as a weakness in doing package management wrong.


The tool should take some blame here. I agree that it's ultimately the developer's fault for allowing code to be automatically injected from not fully trusted sources on minor updates, but the package manager makes it way too easy to do.

For example, when I npm install a package, it defaults to specifying a semver compatible version in package.json, rather than doing the secure thing and pinning a version.

But whether this default behaviour should change or not is also a security tradeoff. Pinning versions means that you will keep using an insecure version of a dependency until you update, whereas using a semver-compatible version allows you to "automatically" pick up a fixed and compatible version. In practice, however, with lock files and local caches, the developer always needs to update for security patches anyway.

However, given the current NPM landscape (with packages having numerous small dependencies from a large variety of authors), going towards the former instead of the latter definitely makes a lot more sense.


> it defaults to specifying a semver compatible version in package.json, rather than doing the secure thing and pinning a version

Note that if you have a package-lock.json (which you will by default), it will prevent any surprise updates even within the semver range specified. You have to manually run `npm update` to get the latest versions that match your semver. Personally I think this is the best middle-ground.
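
As a rough illustration (the version numbers are made up), the 'semver' package that npm itself builds on can show what a caret range in package.json does and does not allow compared to an exact pin:

    const semver = require('semver'); // npm install semver

    // "^1.4.0" in package.json: any 1.x.y >= 1.4.0 is fair game on `npm update`
    console.log(semver.satisfies('1.5.2', '^1.4.0')); // true
    console.log(semver.satisfies('2.0.0', '^1.4.0')); // false, major bumps excluded

    // An exact pin only ever matches itself
    console.log(semver.satisfies('1.4.1', '1.4.0'));  // false

The lock file then records the exact version that was resolved, so surprises only show up when something regenerates or updates the lock.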


This is, unfortunately, not true by default. I had a case where I did `yarn install` and there were updates installed.

To make this work correctly, you need to do `yarn install --frozen-lockfile` or `npm ci`.

It’s absolutely _insane_ that this is the case. Gemfile.lock, Cargo.lock, and every other lock file format that I have used in packaging does this correctly.


It used to be true. npm install used to do what npm ci does. It was super annoying to learn that the hard way.

One of the core issues of NPM-style package management is that package bloat means you absolutely can't review all release notes for every module in your tree. So you just trust the top-level packages, and pray they would mention something if their dependencies change how they themselves work. Practically, I rarely see anyone read release updates for even those top-level packages; they just update everything, test, then send it up to prod, which is very typical.

If you are cool with that, rad, but it's the pinnacle of the fast-food tech ethos littering software right now. Everyone is moving so fast that you barely get to learn something properly or maintain it well enough before it's defunct and we are on to the next thing. I might have a slightly biased view of it; working mostly for agencies, I see a lot of projects.


Some orgs are much more in line with GP's suggestion. Marketing sites may feel low risk, and in my view the iteration speed required justifies having a trusty stack with known good versions to start from. Personally my method of construction is very conservative and I thrive in B2B SaaS environments, whereas in consumer front-end orgs I can be seen as a dinosaur at times. I love new and shiny things as much as the next dev, and enough incidents will hopefully create a more conservative culture of using free-lunch-looking stuff more cautiously. Race-to-the-bottom dynamics in a sense, lacking any regulation. The expectation is move fast and break things, I get that, because of the first-to-market/time-is-money bias/truth. Inexperienced devs won't have the scars to push back if there are upstream changes to review while their boss expects the feature updates to be live ASAP. I imagine that with decades regulation will force certain processes, not that I want it more than the next dev who loves shiny stuff and delivering results fast/delighting my boss.


Go does this well. It uses minimal version selection: the minimum version that satisfies the constraints for each package.

The minor security updates are solved well by periodically running security linters and scanners. There's even a recent GitHub feature for it. That will alert you that you need to update a package.


> Pinning versions means that you will keep using an insecure version of a dependency until you update

Which is why you schedule time each sprint/release to check your dependencies and upgrade them in a controlled fashion.


Have you ever run npm audit on a project more than a week old? My high score is 3k new vulns in a single week…


> I agree that it’s ultimately the developers fault for allowing code to be automatically injected

Let's not do victim blaming here.

This is ultimately the fault of the person deliberately updating their package to break other people's software.


Nah, open source software is "use at your own risk" and there's 0 guarantee for anything. All responsibility lies with the user. If you don't like that responsibility, don't use open source software without reviewing it first.


It’s one side of the coin.

The other is, to do anything at all of practical use in 98% of jobs, day 1 is installing a tonne of OS stuff.

It’s not practical to expect pretty much every dev to inspect 100% of that, even if that’s what they implicitly agree to do in the license.


We're not talking about "a ton of OS stuff," we're talking about NPM packages.

If you have your package manager set up in a way that allows it to automatically upgrade/break your code, that's 100% on you.


I have a medium-sized data science project in Python. Nothing crazy. It's 180 packages, apparently, and 2.9M lines of code (whitespace, comments and all). Charitably let's call it 1m SLOC.

Seriously, you expect anyone to audit all this? It's basically impossible for any solo dev / small org, and as I say, it's not even a big project. A vulnerability is like half a line, or sometimes a typo.

Clearly, very different proposal for a large org, but even then, no small task.


"0 guarantee" might apply to accidental bugs, but a developer maliciously sabotaging their packages?


"Victim blaming" is a little harsh when it's literally a developer not doing their job and letting arbitrary code get inserted into their product.

Do your job and make sure the code that's running is what you expect. There's no valid excuse not to.


> This is ultimately the fault of the person deliberately updating their package to break other people's software.

Ultimately the 2017 Equifax data breach was the fault of the people who hacked into Equifax's website.

We need systems in place to defend against people doing malicious things, but yes ideally individual developers shouldn't be the ones tasked with reviewing all of their dependencies' code.

Operating System provided packages, for example, are generally reviewed by someone other than the author, which can lead to a more secure supply chain.

Rust's cargo-crev review system also seems like a possible solution to the problem.


To boost this: it's worth reading about the difference between "npm install" and "npm clean-install". The "ignore-scripts" flag/configuration setting can also be valuable.


You can pin the direct dependency, but what if the packages you depend on don't pin their own dependencies? The standard (default behavior) is to use ^, which will automatically install new minor versions. package-lock.json helps, but there's no sane way to manage upgrades. Just running "npm audit fix" could result in pulling down a bad package.


If you pin your direct dependency doesn't that mean it can not change versions of its dependencies?

The same version number of a package should always link to the same version numbers of both its direct and nested dependencies. No?


Pretty sure pinning only pins the direct dependency. And most libraries do not "pin" their own dependencies, because it's more work to maintain. Security & bug fixes that would otherwise be resolved via minor patches must be manually addressed. It also helps with resolving shared dependencies.

NPM is highly optimized to make sharing code as easy as possible, but that comes at a heavy price.


If you are not updating daily any typical webapp tree will accumulate known vulns.

Holding back updates is not a sane strategy either. You must review all the new patched versions of all dependencies daily or, failing that, do the work once to drop all the deps you can not afford to maintain reviews for.


Security auditor here.

Every time I see a client importing unsigned code with no evidence anyone they trust has reviewed it, I flag it as a supply chain attack vector in their audit and recommend mitigations.

Some roll their eyes, but I will continue to defend it is a serious issue almost every company has, particularly since I have exploited this multiple times to prove a point by buying a lapsed domain name that mirrors JS many companies import ;)


If we are willing to admit that repositories like npm are useful, what can be done to mitigate these issues?

Is there some tooling we can build?


If projects are importing tens or hundreds of third party libs without any kind of validation or review the process is fatally flawed.

Whatever the language or repository system, reusing libraries like React, Requests, Apache Commons, or lodash makes sense after reviewing the pros and cons (functionality, security, size, performance, etc). But blindly adding small repositories to your packages file without understanding the implications only increases the risk of trouble.

Node and npm for some reason seem to have encouraged this - remember left-pad.


A meta repository that lists versions reviewed by a trusted group of people? It would add latency to bug fixes and limit the number of available libraries, but it would prevent single developers from taking down the ecosystem on a whim.


This is what `npm audit` and GitHub's "DependaBot" are both doing (originally in parallel with their own meta-databases, though now that GitHub owns npm things are a lot more tightly integrated, it sounds like).

Admittedly:

A) Both of these meta-repository tools are reactive rather than proactive: they flag bad versions rather than known good versions.

B) It doesn't take too many HN searches to find people don't trust `npm audit` or DependaBot either because both have provided a lot of false positives and false negatives over the years.

C) If someone does trust one or both, often the easiest course of action is to automate the acceptance of their recommendations and just blindly accept them leaving us about where we started and just blurring the lines between what is repository and what is "meta-repository". (Even the "Bot" in DependaBot's name implies this acceptance automation is its natural state, and the bot's primary "interface" is automated Pull Requests).


That is more or less what Arch Linux does. There are official repos (core and extra) maintained by Arch Linux developers, an unsupported packages collection (AUR) where anyone can upload a package recipe, and an intermediary between those two called the community repository that is maintained by trusted users.


Use something like crev to do distributed code review:

https://github.com/crev-dev/


Maybe limit the capabilities of software e.g. dictate what permissions are reasonable. Maybe require certain "standard libs" for things like console output that limit what can be output.

Also, no auto-update of packages.


You don't need to dictate a standard set of permissions, you just need to remove a single very common anti-pattern called "rights amplification".

Why is a program able to turn a random string that conveys no authority into a file handle that conveys monstrous authority, potentially over an entire operating system, i.e. file_open : string -> File?

That's just crazy if you think about it: a program that only has access to a string can amplify its own permissions into access to your passwords file. This anti-pattern is unfortunately quite pervasive, but it's a library design issue that can be tackled in most existing languages by using better object-oriented design: don't use primitive types, use more domain-specific types, and don't expose stdlib functions whereby code can convert an object that conveys few permissions into one that conveys more permissions.

This means deeper parts of a program necessarily have fewer permissions, and the top-level/entry point typically has the most permissions. It makes maintenance and auditing easier to boot.
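
A minimal sketch of that shape in JavaScript (every name here is illustrative, not an existing library): the entry point holds the broad authority and hands deeper code only a narrow Directory capability, never a raw path.

    const fs = require('fs');
    const path = require('path');

    // Entry point: constructs a capability scoped to one directory.
    function makeDirectory(root) {
      fs.mkdirSync(root, { recursive: true });
      return {
        createFile(name, contents) {
          const target = path.join(root, path.basename(name)); // confined to root
          fs.writeFileSync(target, contents);
          return target;
        },
      };
    }

    // Deeper code: only ever sees the capability it was given.
    function writeReport(directory) {
      return directory.createFile('report.txt', 'no way to reach ~/.ssh from here');
    }

    writeReport(makeDirectory('/tmp/reports'));

Of course, in today's JavaScript nothing stops a dependency from requiring 'fs' itself; that ambient authority is exactly what the pattern asks languages and runtimes to take away, so the sketch only shows the shape of the API once they do.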


Pay the maintainers of the libraries you use, and have a contract with them that states their obligation to maintain and support your use of their code


The solution to fix FLOSS is for it not to be FLOSS?


How does a maintenance contract make the software not-FLOSS? It's a working option if you need more promises than the license gives you.


Because you haven't solved the problem for FLOSS, you've solved the problem for non-FLOSS that might also happen to be FLOSS, aka contribute somehow to a FLOSS version of the project - but the solution doesn't help those under the FLOSS licence, and it complicates incentives to contribute to FLOSS/"community" versions.

Great for corporates who can buy the support contract, but it is also suspiciously similar to the "freemium" model where FLOSS devs are suddenly incentivised to make the FLOSS offering insecure in comparison to the paid licence.

In this case, Marak is his own bad-actor/saboteur, how would the support contract help? It would be far more likely to make free-users 2nd class users, and as such it might be better to simply keep the products managerially separate due to that conflict of interest.

And let's be honest here - when does something stop being a reasonable "maintenance fee" and start to become rent-seeking / extortion? I think if you want to get paid, you simply don't work with MIT/GPL, or fork to a different licence; changing your mind halfway through isn't reasonable IMHO.

MIT basically means "anyone working on this codebase agrees to MIT terms for their code, and as such authorship isn't so important". If you change your mind, you broke your agreement. If suddenly your authorship matters, what about every other author who stuck to their MIT agreement?


The F in FLOSS is supposed to stand for Free.

Many people interpret it as "Free as in beer", not "Free as in speech", so expecting people to pay for it disqualifies it as FLOSS.


Again, it's still under a FLOSS license. You can use it for free as in beer, under the terms of the license. If you want further expectations satisfied (i.e. ongoing maintenance of the software), that's what money is paid for. You're not paying for the software, you're paying for services around it. (And the nice thing with Open Source is that if the original creator isn't available for whatever reason, you can pay somebody else for those services, which is a lot harder with not-open software.)


> Many people interpret it as "Free as in beer", not "Free as in speech", so expecting people to pay for it disqualifies it as FLOSS.

Only for people with the wrong expectations. Just because there's many of them doesn't make them right.


Do you have advice for projects that use Maven? I know every package on Maven Central has a PGP signature, but as far as I know, Maven doesn't verify them.


You can say the same thing about the entire Linux stack


Not really, individual package developers don't have as much immediate control over the repository's state as they do with NPM. Packages go through a review by one of the trusted developers and sometimes automated QA and testing (including, as of late, reproducibility testing, i.e. does the source match the binary?), before being uploaded to the repository.

If you can't trust the team behind the distro, then sure, your supply chain is compromised, but it's significantly less likely for a single package developer to cause any damage, as all the big distros have rather extensive policy and procedures to prevent such things.


I use Gentoo, which uses Portage as its package manager, and the way Portage works is that it pulls source and then compiles it. The source is rarely checked by anyone. Small packages exist as well. Many Linux distros simply borrow binaries from "trusted" sources. The entire ecosystem is really a house of cards.


> Many Linux distros simply borrow binaries from "trusted" sources.

The crappy ones maybe. Proper distros build everything from source.


This is a false equivalence brought up every time anyone mentions how vulnerable the npm/gems/pip ecosystems are to supply chain attacks.

Linux code is always reviewed before deployment, goes through many eyeballs, people are careful about this. The same is not true of npm, or any of the other services (as this event clearly shows).


Eh, that's not true. I use Gentoo, so trust me, most things are run by little dictators of their own little fiefdoms.

I'm talking about not just the kernel but all the various other things from libraries to servers to tools and everything in between.


OK, but none of those little fiefdoms are "Linux".


I literally said the Linux stack which includes everything from the kernel to init to libs. You can't run just the kernel.


It's still a false equivalence. You'll agree that all the important bits of the Linux Stack are audited and reviewed by multiple people, right?


Parts of the Linux stack equivalent to colors and faker are carefully audited and reviewed by multiple people? That sounds to me like elevating them to important bits in a false equivalence.


When it comes to security (among other things), one simply cannot say that all the important bits are in the kernel. If that were the case, there would not be an issue to discuss here.


Lol hell no. You're joking right?


any operating system, really, if you want to play that game


Unless you're using LFS, of course.

The problem you describe isn't Linux, it's Linux Distributions.

Where would you draw the line?

Source packages are available, and if the binaries don't match the code a distro would soon be outed a la "many eyes" thinking.

We have to trust some or none.

Get the top off that chip, see if the factory put an extra core in for the NSA (IME).


No, serious Linux distributions audit their code.


Dependencies are a major attack vector now.

Tread carefully with all the supply chain attacks out there; it might not even be the authors doing these. We are entering a massive dependency-attack war.

Dependencies are a balance but also a sign of weakness in a system in the modern day. There at least needs to be delayed, dependency-bot-like analysis before you integrate. Even then, they just leave your systems open to worse than DLL hell, telemetry tracking/data, and attack vectors that can take down or target many, many systems.


How about using dependencies but pinning the version and only updating if you know what the update contains?

I'm still continually baffled that we ended up in a world where automatically accepting updates from every dev and their dog is not just the norm but recommended practice.


> if you know what the update contains?

I think anyone who thinks they're doing this is fooling themselves. You can review code for accidental vulnerabilities but if someone is trying to slip in a backdoor it shouldn't be hard to do so in a stealthy manner.

The reality is that the entire dependency concept is just broken. There is an implicit trust that all dependencies are equally trusted. Your logging package is just as capable of performing file and network operations as your http package, even if you assume it won't.

That's silly.

It is up to programming languages and package managers to solve these problems. They're also not that hard to solve, in my opinion. "Run arbitrary code on a computer" is a model we've been securing for decades with web browsers, both in terms of web pages and extensions, and now too with mobile.

Solving "this code can do X but that code shouldn't be able to" is similarly easy to solve with languages that support effects or capabilities.

It just hasn't been done yet.


Adding permissions is a reasonable step, but I don't think it solves the problem. We know it's very hard to get granularity right with permission systems, and there is a strong temptation to just give everything all permissions.

Dependencies with dangerous but necessary permissions can still abuse them: Your network library will still be able to add a bitcoin miner.

What happens if an update requests a new permission?

Also, how would that have prevented the current situation? Infinite loops are famously hard to detect and prevent automatically.


> Adding permissions is a reasonable step, but I don't think it solves the problem. We know, it's very hard to get granularity right with permission systems and there is a strong temptation to just give everything all permissions.

A lot of that stems from permissions systems being implemented outside of the code they constrain. In theory a compiler knows every reachable system call and all points of data input that could reach them, and as such it could constrain the program's capabilities accordingly.

In fact, compilers already do this for control flow integrity - it would just be a more advanced system.

> What happens if an update requests a new permission?

It's going to depend on the system. For browser extensions the new permission means a new prompt, so you'd get a CI failure until a human updated a lockfile.

> Also, how would that have prevented the current situation? Infinite loops are famously hard to detect and prevent automatically.

It really depends on the system. You could have a CPU capability that restricts cycles or forces preemption, etc.

I'm not saying you can solve literally all security problems but you can reduce risk considerably. If "infinite loop" is the scariest thing a dependency can do we're in a pretty good position. An unconditional infinite loop should break your CI tests.
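
For the infinite-loop case specifically, you can already buy a crude cycle budget at the JavaScript level by running suspect code on a worker thread and killing it on a deadline; a rough sketch (the './untrusted.js' module is hypothetical):

    const { Worker } = require('worker_threads');

    function runWithBudget(script, ms) {
      return new Promise((resolve, reject) => {
        const worker = new Worker(script);
        const timer = setTimeout(() => {
          worker.terminate(); // kills the thread even if it is stuck in a loop
          reject(new Error(`${script} exceeded its ${ms}ms budget`));
        }, ms);
        worker.once('exit', (code) => { clearTimeout(timer); resolve(code); });
        worker.once('error', (err) => { clearTimeout(timer); reject(err); });
      });
    }

    runWithBudget('./untrusted.js', 2000).catch((err) => console.error(err.message));

It is not a security boundary on its own, but it does turn "the build hangs forever" into a failure you can alert on.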


> In theory a compiler knows every reachable system call and all points of data input that could reach them

Sorta yes, sorta no.

Imagine I'm making a chat client, and I want users to be able to drag and drop images to share. But the OS doesn't have an "open drag-and-dropped file, extension .png or .jpg" function call, it only has "open file" which lets me open ~/.ssh/id_rsa too.

Or if I'm making a web browser and I want to support U2F tokens. But there's no OS "talk to U2F token" call - the browser needs access to the system calls for "talk to arbitrary USB devices".

Sandboxing PC software is tough.


If you end up with "X can open any file" that's something that's worth noting to a consumer. The sandbox capabilities don't have to be perfect in order to expose a scary situation.

Further, you can restrict a process to only open specific files in a number of ways on Linux, including based on path. There's room for improvement, though.


> But the OS doesn't have an "open drag-and-dropped file, extension .png or .jpg" function call, it only has "open file" which lets me open ~/.ssh/id_rsa too.

A programming language doesn't have to expose system calls directly. Arguably it shouldn't, in fact, for exactly this reason.


Permissions inside a program's own code seem incredibly difficult without radical change.


Pony's object capabilities are one example of an existing implementation. I don't think there's any "inventing" to do here; it's all just implementation work.


It's actually trivially easy once you remove ambient authority, which is the real source of these security problems. Consider how a program could modify your files if it cannot willy-nilly turn any old string into a file handle.


Adding permissions would not have caught this case, though, because there is no need for permissions to run an infinite loop.


> I'm still continually baffled that we ended up in a world where automatically accepting updates from every dev and their dog is not just the norm but recommended practice.

I think it follows from two things:

1. We use open source software for everything. This is also true of our dependencies, so we get hundreds or thousands of transitive dependencies. Many of which are presumably written by dogs, because (i) no one can tell if you're a dog on the internet, and (ii) OSS maintainers are so overworked they ask their dogs for help.

2. Languages and libraries are full of footguns, software is full of bugs and therefore vulnerabilities, and no one cares enough to go through the enormous effort to fix things. And this is true through the whole stack. So the only practical way to stay secure-ish is to reactively patch software as vulnerabilities are publicly discovered. And also defence in depth. (I would distinguish between the publicly known time that a vulnerability is discovered, and the first time it was discovered. You hope the two are the same but for many vulnerabilities, if a clever adversary found them first we'd never know.)

With these two things together, you have a ton of questionable dependencies, and you need to update them all the time for security reasons.


I see systems with various system-level vulnerabilities all the time for work, and try to assist my clients (internal project teams) in prioritizing fixes. Besides the usual CVSS scores, I try to focus first on what is being used or exposed. Network services, file-input processes come to mind. Vulns not in this list should also be fixed when found, but my thought is centered on what might be primarily exploitable.

This leads to some thoughts on statically-compiled applications; while they might have some vulnerable dependency, I suspect that it's harder when the attacker is limited to the app's "baked-in" functionality that defines how those dependencies get used.

Edit: Also, I should note that while I would greatly prefer, and do advise, that they base their environments on minimal OS distributions, this seems rare. The base system patching would be much easier to manage if it started from some BSD-like minimal state, or Alpine Linux, and included only what it needs. Instead, any infrastructure vulnerability assessment leads the teams to chasing down numerous patches in things they have, but never use.


Yeah I guess then we can just shrug and go on because there is nothing we can do to stop our app from randomly breaking tomorrow.

> So the only practical way to stay secure-ish is to reactively patch software as vulnerabilities are publicly discovered.

But this is different from just blindly accepting any update that upstream gives you.

> And also defence in depth.

This sounds increasingly like security theater. You can always add more layers of obstacles to make things harder for malware that is already on your system, but it's not clear to me how much this actually reduces your attack surface.


> This sounds increasingly like security theater.

It does help. Quite a bit.

Each layer (e.g. firewall rules that require all internet access to go through a proxy) adds a non-trivial amount of work for the hacker to get anything useful done.

1. best case - hacker will give up.

2. good case - you have more time to notice and react.

How much does a layer cost you, and how much does it cost the hacker to overcome it?

Things to remember:

1. Not all hackers are nation states. Most are not.

2. We must accept that no security measure is absolute, even against script kiddies. Given enough time and luck/misfortune, a JS sandbox will do "rm /sensitive/file".

The recent Log4Shell example shows that one can follow all best practices and still get bitten in an unexpected way.


Defense in depth implies a whole lot of things, and is certainly not security theater. Usually it boils down to three major themes:

1. Reduce blast radius: assuming component X is compromised, how far and wide can it be felt?

2. Principle of least privilege: once compromised, what can X do or access? This extends to the credentials X carries or has access to.

3. Detection: how early and how well can you detect the compromise in the previous two steps?

You can never prevent a compromise, but you can make it easier to notice when it has happened, and you can limit what the attackers can do afterwards.


> only updating if you know what the update contains

People suck at this. What this actually tends to do is mean "no updates, ever" unless you have a particularly rigorous culture of dependency management.


Or we get a culture where upstream writes in more detail what an update is supposed to contain and downstream verifies that the update indeed does what they write. If this leads to fewer updates overall, I have no problem with that.


If you change "contain" to "do", then this is the MAC security model as implemented by SELinux.

a culture where upstream writes in more detail what their code is supposed to do and downstream enforces that the software indeed does (not do anything beyond) what they specified

It didn't lead to fewer updates, it led to less usage of SELinux.


You also have to rely on all of your dependencies doing that for their dependencies and so on. It’s really a mindset/vigilance you need for the whole ecosystem.


Transitive dependencies are also your dependencies, even if you didn't consciously include them. So in an ideal world, you should vet all changes to dependencies of your codebase, including transitive dependencies.

Whether or not this would be compatible with the way dependencies are used today is another question.


I'm not even sure it's not a fool's errand with the current software ecosystem.

I think at some point it will have to be a language level feature. The ability to sandbox or provide permissions to packages/functions. Just like our OS had to, just like browsers had to, just like phones had to.

Our code is the platform, the packages the apps. It's a similar use case.

If I could download a module, and tell the compiler this module, and everything it uses (including packages that I also use, but through a different call tree) will never access the network or write to disk, it'd help grant some small peace of mind in terms of security at least.


> I think at some point it will have to be a language level feature. The ability to sandbox or provide permissions to packages/functions.

> If I could download a module, and tell the compiler this module, and everything it uses (including packages that I also use, but through a different call tree)

Javascript's prototype based inheritance looks like it can help facilitate such conditional submodule invocation. But, and partially for performance reasons, static compiling would be necessary. So Javascript and its dominant NPM package ecosystem can never go in a direction like this.

If only C++ or Python (dynamically typed, I know) had prototypes instead of class based inheritance.

Edit:

Looks like another commenter referenced what we're probably talking about:

> Now, about the technical solution to this. We have this, for well defined programming languages (read: statically typed ones, or dynamically typed ones with a clear structure).

> It's a linker. Tech from the 1950s.

> Link (include) just the stuff you want, "tree shake"/"remove dead code" whatever you don't.

Can we create an open source linker for JavaScript and NPM packages?
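
ES-module bundlers already do a weak form of this; a minimal sketch (file names made up) of what Rollup, esbuild, or webpack will tree-shake, assuming the usual static import syntax:

    // util.js: two named exports
    export function used() { return 'kept in the bundle'; }
    export function unused() { return 'dropped: nothing imports it'; }

    // main.js: the bundler can see statically that only `used` is referenced
    import { used } from './util.js';
    console.log(used());

What this can't shake out is code a dependency reaches through require() with computed paths, eval, or install scripts, which is why it falls short of the permission model being discussed here.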


Doesn't deno take this approach? The runtime does kinda force the question by only supporting imports via fully qualified URLs.


It might; I'm not familiar, but after a quick look it seems to operate on a vetted trust model, i.e. you can use these because we checked and they are compatible. So you could miss out on a lot of the ecosystem.

I was leaning more towards the web approach where we assume everyone is out to get us, but they can't unless we give them that one permission they need. If it's a statically typed language then it'd even allow dependency walking to see what permissions are used at a granular level and we can decide not to bring in anything that's too loose. This of course won't solve cases like logic bugs, but it'd help mitigate the impact.

I'm just not sure if it's even feasible?


The checked and compatible stdlib is an extra provided by the project.

Deno runs code in a sandbox where you need to give permissions to scripts/modules for them to access local files, the network, etc:

https://deno.land/manual@v1.17.2/getting_started/permissions
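
For reference, the granted state is also queryable from inside the program; a minimal sketch, assuming it was launched with something like `deno run --allow-read=/tmp app.ts`:

    // Deno-specific API; possible states are "granted", "prompt", or "denied".
    const read = await Deno.permissions.query({ name: 'read', path: '/tmp' });
    console.log('read /tmp:', read.state); // "granted" thanks to --allow-read=/tmp

    const net = await Deno.permissions.query({ name: 'net' });
    console.log('network:', net.state);    // "prompt" or "denied" without --allow-net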


Yes, but it's not granular. You either let all modules have permission X, or none of them.


I was thinking of their scoped permissions model described at https://medium.com/deno-tutorial/deno-security-65af9811d9c9

Not sure if you can scope down permissions as part of a module import or if it only works when you initialize the interpreter.


Yeah I also don’t understand. But then again.. the js ecosystem is one big pile of turds..

Tech cycles with people who reinvent the wheel and keep making the same mistakes

All these problems have long been solved


The problem of new versions of dependencies breaking old code happens all the time in a variety of ecosystems. It's not exactly "solved," it's a continual problem similar to picking what you will eat for dinner. There are pros and cons to each approach, in this case the con is that every now and then you have to manually pin a version. If we did it the other way, we would have to manually upgrade versions to get obvious and easy improvements. As with most things people happily call "turds," it's actually a tradeoff and not as simple as just being bad.

As for reinventing the wheel, are wheels really settled science? As far as I know, new kinds of wheels are being created all the time. It's not just that new people are creating them, there are new things wheels need to do every day and new sets of requirements that the old existing wheel designs don't fulfill. Look at wheels from 50 years ago and they are nothing like the wheels of today. The wheels on cars are nothing like the wheels on aircraft, which in turn are nothing like the wheels on trains.


They were solved in a way that slowed progress.

So invariably, people discovered that if they threw out the complexities of the solutions, they could make faster progress.

Then they eventually ran into the corner cases.

That's the time loop that keeps happening.


Yeah, this seems less like actual progress and more that people wanted to drive in circles faster.


"Pinning the version" is inadequate. Do a shasum on the package contents. (Thanks, Poetry.)


Because dependencies of dependencies exist, and if you use a large framework of some sort, you could end up with literally over 1,000 dependencies.

Manually checking before updating does not scale.


Crazy what happens when you decide to freeload off the code of a stranger you have no contract or agreement with whatsoever, beyond a license you must accept to use the software, which disclaims any warranty whatsoever, even fitness for any purpose.

I have zero sympathy for anyone complaining they were hurt by this. I think Marak is teaching an important and principled lesson here.


Lesson learned:

- Don't rely on open source software. It's all FUD. Always use software from companies you can sue under a contract.

- If you need an open source library to color your console output, pay them 6 figures per year.

SCNR


> I think Marak is teaching an important and principled lesson here.

What lesson is that?


Never rely on the Javascript community/ecosystem.


"Freeload" is a fairly loaded term, and "disclaims any warranty" isn't the same as malicious action; I don't have to pay you if your house burns down and you don't have an insurance policy (contract) with me, but I'm still liable if I commit arson.

Also, AFAIK, Marak is not the original author; is he also a freeloader for attempting to commercialise this code?

> teaching an important and principled lesson here

The history of this issue speaks differently to their intentions, but even so, there is a way to "teach lessons" and that's by doing something alarming but harmless. AFAIK Marak wanted to cause harm, and acted in a way to do so.


> Packages are literally remote code exec vulns in the hands of package authors

Something mentioned in this article caught my eye:

> While searching for Marak’s libraries, I found this npm-test-access library. This library seems to be used for what the name describes: to test access to NPM. Marak seems like a very capable software engineer, and it’s unclear to me why he’d need a package like this. So, this makes me personally doubt a little bit if Marak is really behind all of this, or if maybe his account got compromised, or if something else is at play.

— [0] https://jworks.io/the-faker-js-saga-continues/

If you wanted to take over other people’s NPM packages by pushing a compromised update to one of your own widely used packages, the first thing it would do is check to see if the victim had access to publish to NPM.

Marak, who has just started publishing malicious updates to his own widely used packages, has just created a package to check for access to NPM.


This seems like irresponsible speculation, insinuating that Marak is about to commit a felony?

I’d rather skip the character assassination based on hypothetical future actions please, and focus on what’s actually happened.


> Packages are literally remote code exec vulns in the hands of package authors

There are 20m+ weekly downloads of the colors package alone. He has what amounts to remote execution privileges to people using that package. When the subject of compromised packages comes up and he’s demonstrated that he’s willing to publish malicious updates, it’s completely fair to wonder what else he’s willing to do with that level of access to that many systems. It’s irresponsible not to consider what his packages can do to your systems.


> This seems like irresponsible speculation and insinuating that Marek is about to commit a felony?

No, it's speculation that Marak didn't do any of this, but that instead his account was hacked. You either responded to the wrong comment by mistake, or totally misunderstood the one you replied to.


To resolve such issues, the central Maven repo, for example, makes artifacts immutable when you publish them.


This is true for npm. After the incident with leftpad, you can't unpublish anymore. You can, however, publish a new patch update that completely breaks everything.


> This is true for npm. After the incident with leftpad, you can't unpublish anymore. You can, however, publish a new patch update that completely breaks everything.

You absolutely can unpublish, it just requires more steps. If NPM gets a DMCA takedown request they will absolutely have to fulfill it.


> If NPM gets a DMCA takedown request they will absolutely have to fulfill it.

Assuming the package is released under a Free Software licence, what grounds would there be for a DMCA takedown?

I suppose a developer could include the lyrics to a pop song in their code (possibly encrypted), and then tell the copyright holder about it (since I don't think you can make a DMCA request on behalf of a copyright holder without their permission), but I would hope that such a poison-pill would be caught long before the package became widely depended on.

Perhaps you're thinking someone would risk perjury(?) charges for making a false DMCA request against their package, and NPM would act on the request without questioning it; but remember that NPM is owned by Microsoft and they have previously stood up to frivolous DMCA requests (after a fashion)[0]. That article has the lede: "Software warehouse also pledges to review claims better, $1m defense fund for open-source coders".

[0] https://www.theregister.com/2020/11/16/github_restores_youtu...


> I don't think you can make a DMCA request on behalf of a copyright holder without their permission

In theory, you're right. In practice, there's never any actual consequences for filing a false DMCA claim. Worst case is that the thing doesn't get taken down, but that's no worse than if they didn't file it at all.


Corps don’t care about DMCA takedowns from natural persons. I sent a takedown once, the CEO replied that he was sorry it had come to that, but they still distributed it for years under a license I did not grant. This CEO is licensed to practice law in California, btw.


Anyone is free to ignore DMCA notifications.

Some parties that are distributing other people's stuff lose a safe-harbor protection from liability themselves if they ignore it.

This means intermediaries who don't benefit much directly from distributing a given bit of content will immediately comply with the DMCA takedown process. But this does nothing if you send the notice to someone who is actually using it.

The correct move is to send the DMCA notice to the infringer's ISP/host. Then the ISP has to take it down unless the alleged infringer counter-notifies to say they're not infringing. In turn, that counter-notification improves your position for any litigation that may ensue.


Someone could have added non-free third-party code into the package (intentionally or inadvertently, it doesn't really matter).


> but I would hope that such a poison-pill would be caught long before the package became widely depended on.

I'm not sure what about the current open source ecosystem makes you think anyone would catch something like this.


Funny, my company couldn't use Webpack 1 because a dependency of a dependency... depended on an ancient package from the days when it was common to not bother with attaching a license.

Legally, that meant that noone could use it. In practice, nobody but our legal department cared, so we had to wait for version 2 when the dependency chain was updated to remove it.


You couldn't override the package locally? Or was too much of that code actually needed?


> I don't think you can make a DMCA request on behalf of a copyright holder without their permission

Tell that to YouTube's copyright trolls.


The people trolling YouTube over copyright are either making false Content ID claims[0] (not DMCA takedown requests), or claiming infringement based on an incorrect match of something they genuinely hold the copyright on.[1]

You're probably right, though, that there is enough imprecision in the system for someone to claim that someone else's code snippet infringes on the copyright of a code snippet the claimant had previously published.

[0] https://torrentfreak.com/u-s-indicts-two-men-for-running-a-2...

[1] https://freebeacon.com/culture/google-youtube-algorithm-copy...


> Assuming the package is released under a Free Software licence, what grounds would there be for a DMCA takedown?

Noncompliance with the license, e.g. by removing required copyright notices/attribution in the code (this has happened in the past). Or straight-up uploading someone else's non-free code.


The developer can file a DMCA claim saying the code doesn’t follow his license, as famously happened with Bukkit (the Minecraft server tool).


I didn't remember that particular legal complication, so thanks for prompting me to look it up. It seems that his argument was that Bukkit couldn't be distributed because it contained Mojang's proprietary code, but the fact that it also contained some of his code meant that he was a copyright holder for the purposes of the DMCA.[0]

This seems like an edge case that wasn't anticipated by the DMCA, but I can see the argument that mixing GPL code with proprietary code is creating and distributing a derivative work, in violation of the GPL. Without proprietary code being present, though, I don't think a developer can DMCA takedown their own GPL software.

[0] "As the Minecraft Server software is included in CraftBukkit, and the original code has not been provided or its use authorized, this is a violation of my copyright." https://github.com/github/dmca/blob/master/2014/2014-09-05-C...


Not only does it require more steps, it also has to meet the following criteria[1]:

* no other packages in the npm Public Registry depend on it

* has had fewer than 300 downloads over the last week

* has a single owner/maintainer

So while your point is taken that unpublishing is possible under some circumstances, it is not for popular packages that are in use today.

[1] https://docs.npmjs.com/policies/unpublish


None of these points have any legal standing, from a copyright perspective.

https://news.ycombinator.com/item?id=29868199


You are technically correct. The best kind of correct! In practical terms, it depends on the license used. Since most licenses used in open source will prevent you from making these kinds of requests, this consequence isn't likely to have any practical implications.


You are assuming that the true rights holders of all the code in the package actually agreed to the given license. Someone unrelated to the package development can still claim it includes an illegally-copied, unlicensed version of their code.


Despite the need to keep it clear, copyright does not reign supreme.


> Despite the need to keep it clear, copyright does not reign supreme.

neither do NPM TOS, or whatever Microsoft thinks they are entitled to, since NPM is owned by Microsoft.


Which is not what I argued :^)


> If NPM gets a DMCA takedown request they will absolutely have to fulfill it.

No, they don't. Honoring DMCA takedowns lets them benefit from an additional safe harbor against infringement liability for the allegedly infringing content, but it is not mandatory on its own.


Except if they have reason to believe the code was uploaded with the permission of the copyright holder.

Then they have gotten the right for npm to distribute the source code in the context of npm.


> Then they have gotten the right for npm to distribute the source code in the context of npm.

There is absolutely no copyright or publishing right transfer that takes place when one "publishes" a package on NPM (or on Github). None.

The original author is absolutely entitled to a DMCA takedown notice and NPM would have to oblige him.


Your Content belongs to you. You decide whether and how to license it. But at a minimum, you license npm to provide Your Content to users of npm Services when you share Your Content. That special license allows npm to copy, publish, and analyze Your Content, and to share its analyses with others. npm may run computer code in Your Content to analyze it, but npm's special license alone does not give npm the right to run code for its functionality in npm products or services.


First, you agreed to the ToS when you uploaded things to npm. I haven't read the terms, but they should be enough for npm to publish on npm no matter the license.

Secondly, and as important: if you publish something under an Open Source license(1), then you _cannot unpublish it_. You granted a license to _everyone_, existing both now and in the future, to distribute and use it(2) (legally it's a bit more complex, but that's what it boils down to).

(1): Assuming you had the legal right to do so; if not, you are liable for any fallout, not npm (because of the ToS; they still need to take it down reasonably fast, but they might be able to sue you).

(2): Within the constraints of the license.


You can't legally retract opening up software source code under most if not all popular open source licenses.


It is indeed all, even if you ignore the "popular" qualifier. If a license could be unilaterally revoked, it would fail to meet the Open Source Definition for that reason.


open source =/= free software.

That's the first mistake you are making.


The differences between the two are extremely minimal, basically only relating to patent rights relating to the software. Go read https://www.gnu.org/philosophy/free-sw.en.html - the FSF's Free Software definition, and https://opensource.org/osd - the Open Source Definition (both by the respective parties that coined the terms and maintain them to this day) and see what the actual differences are. They're not many.


While they are indeed slightly different, I fail to see how the differences are at all relevant in this context.


Why does a new version break projects without action by the project owners? In Go you would have to explicitly update to the broken version.


Because npm install has the insane default behavior of adding a fuzzy qualifier to your package.json; for example, ^6.0.2 means all of the following versions are accepted: 6.0.2, 6.0.9, 6.7.84.
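
A quick sketch of what those ranges actually match, using the `semver` package (the same range logic npm itself uses); the version numbers are just examples:

  const semver = require("semver"); // npm install semver

  semver.satisfies("6.0.9",  "^6.0.2"); // true  (caret allows newer minor/patch)
  semver.satisfies("6.7.84", "^6.0.2"); // true
  semver.satisfies("7.0.0",  "^6.0.2"); // false (caret stops at the next major)
  semver.satisfies("6.0.9",  "~6.0.2"); // true  (tilde allows patch bumps only)
  semver.satisfies("6.1.0",  "~6.0.2"); // false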


It’s not particularly insane. package.json and package-lock.json have different purposes: package.json specifies intent, e.g. "I want a version that satisfies >=5.2.3 && <6.0.0", and package-lock.json records the exact resolved version.

Off the top of my head Bundler, CocoaPods, Cargo, SPM, Pipfile(and various other Python dependency managers), and composer also all work like this.

Cargo even makes it implicit that a version like “1” means “^1.0.0” in Cargo.toml.


That's not an issue, that assists in quickly viewing wanted package upgrades. The problem is in not using a lockfile.


Welcome to JavaScript, where every decision is a bad one.


Very often, package installation is automated as part of a build pipeline. So if you want to build and deploy a new version of your software, you'll kick off the pipeline and that could potentially download a newer version of a package than was previously being used.

Incidents like this highlight that this may not be the best idea.


If you're using NPM without lockfiles, you're gonna have a bad time with discrepancies between trying things on your dev machine and building things in CI machines.

When you have a package-lock.json NPM will install exactly the same version of everything in your dependency tree, making the CI builds much more like what's on your dev machine (modulo architecture/environment changes)


Because of version locks. Normally you install “^X.Y.Z” which means any version at major X with at least minor Y and revision Z. For more conservative codebases you install “~X.Y.Z” which also locks the minor.

npm install will traditionally install the most recent packages that match your constraints. You need “npm ci” to use true version locks


*I revoke my comment. Child comment is correct.


That's not true.

The first line of npm install's documentation[0] says (emphasis mine):

> This command installs a package, and any packages that it depends on. If the package has a *package-lock or shrinkwrap file, the installation of dependencies will be driven by that*, with an npm-shrinkwrap.json taking precedence if both files exist. See package-lock.json and npm shrinkwrap.

What does happen is: if you have added a new package in package.json it will be installed based on the semver pattern specified there, or if you run npm install some-package@^x.y.z the same thing happens. Further, if you modify package.json by changing the semver pattern for an existing package that will also cause this behaviour.

Running `npm install` in a package that already has a package-lock.json will simply install what's in package-lock.json. `npm install` only changes the lock file to add/remove/update dependencies when it detects that package.json and package-lock.json disagree about the specified dependencies and their semver patterns, e.g. having foo@^2.3.1 in package.json and foo@1.8.3 in package-lock.json will cause foo to be updated when running `npm install`.

0: https://docs.npmjs.com/cli/v6/commands/npm-install


That's why you specify the exact version of your dependencies.


You're never going to be able to prevent that at a technical level. You can prevent it with workflow, though: 1) sync packages locally and build from those versions; 2) peg to a specific version and don't auto-update; 3) deploy to a test environment and not directly to production.


A key difference with Maven projects is that you specify exact dependency versions instead of “always use latest” or some variant of that, as is pretty common in the Node world.


This is not necessarily true, there are version ranges: https://www.baeldung.com/maven-dependency-latest-version

Admittedly, I don't think it has nearly as wide a usage as it has in the NPM world. Dependabot (I know I'm not the first to mention it, here, today) is probably more of a factor.

Still, it strikes me that this sort of "attack" (or mishap) is exceedingly rare in the Java ecosystem, while it's pretty common in the NPM world, and I don't immediately understand why that would be so.


I was not aware of that feature. To call it rare would be an understatement I think.

> while it's pretty common in the NPM world, and I don't immediately understand why that would be so.

I think it boils down to Node projects typically specifying dependencies in the form “any version >= X”, effectively “always use the latest.” Dependencies can therefore get bumped silently just by rebuilding, essentially. Whereas in the Java world updating dependencies is a deliberate process.


We abuse jitpack.io and MASTER-SNAPSHOT to keep our Minecraft Maven builds up to date.


With lock files, you will always be stuck with whatever version you first installed until you explicitly ask npm to upgrade, or delete your lockfile.


npm does as well, they made this change after left-pad.

You also can't unpublish once a single person has downloaded the package, I believe.


Immutability feels like the best approach here. Go's module system is pretty good in this respect: "proxy" is just a proxy that serves module code, and "sum" is an append-only transparency log of the hashes of all published versions. You can't "unpublish" from the log, but you can get code hosted on proxy removed for various reasons... which users can protect themselves against by running their own proxy. Go's module version resolution strategy means that the chosen module version never changes without explicit input from the user so no "publish a new version that breaks everyone's CI" issue.

Altogether, I don't see how GP's "email PHP files around" is any better than this system in any way.


How does that solve the issue here of new broken versions of packages being published?


That's another widespread malpractice in the JS ecosystem.

Autobumping versions, or version ranges as they're called in Maven land.

Dependencies should only use fixed versions and all updates should be manual.

You should only use auto-upgradable versions during development, and the package manager should warn you that you're using them (or your dependencies are).


If package A depends on package C at version 1.0 but package B depends on C at version 1.1, what version of C will be pulled in?

Dependency management is not as simple as only upgrading one direct dependency at a time after careful review.

The NPM ecosystem is particularly difficult to work with as it has deep and broad transitive dependency trees, many small packages, and a very high rate of change.

You either freeze everything and hope you don't have an unpatched vulnerability somewhere or update everything and hope you don't introduce a vulnerability somewhere.


> Dependency management is not as simple as only upgrading one direct dependency at a time after careful review.

Most package managers won't allow these stunts and conflicts have to be resolved UPSTREAM. NPM chose to go the "YOLO" way and will fetch every single version of a package that meets the dependency demands. Terrible design, but the purpose of that was growth for NPM, the company, not the best interest of the ecosystem.


There are package exclusions, package forcing and of course, full dependency tree checks where you review what everything pulls in.

The JS ecosystem will probably have to change, but because it's so decentralized, that change will be orders of magnitude harder than, for example, PHP's transition from 3 (4, 5) to 7.


> The JS ecosystem will probably have to change but because it's so decentralized,

Is it? Everybody is pulling from Microsoft owned servers now, as Microsoft owns both Github and NPM.


You're right in the package storage sense.

I don't think you're right in the builder/building practices sense.


I'm sorry, but this is completely wrong. NPM has lock files which explicitly lock down the versions you have downloaded after your first install. These are committed to source control, so all subsequent installs will use the exact same versions of dependencies, and nested dependencies too.

You need to ask npm to upgrade or delete your lock file and node modules to run into this issue.


You shouldn't blindly pull updates into production, how do you know if a non-malicious update breaks your app if you don't do any basic testing first?


> I don't want that level of control over other people's projects, it's scary

How far do you take this though? The average GNU Linux distro ships with a whole pile of packages already installed, from a multitude of different authors.


Packaging is most of the point of a distro. They are specifically taking on that responsibility. They also have a better perspective from which to handle overall compatibility.

In a sense you've pointed out the alternative to having the programmer handle the packaging -- having some third party package and distribute it. And this separation of responsibility almost always turns out to be a better solution. Distribution and coding are, after all, two full jobs (without 100% skill overlap). Plus, hopefully it indicates that at least two sets of eyeballs have at least glanced at the code (not the full desired many-eyeball outcome, but as good as we can expect sometimes).


Given that Debian (and its descendants...) packages a shitload of npm packages, it's a wide stretch to say there is more QA for these packages from the Debian side than there is from the npm side.

The one thing that Debian provides is that in the case there is a security issue, admins worldwide only need to do "apt update && apt upgrade" and they are safe, without having to check all of the software that runs on their servers (as long as said software comes from Debian, that is!).


I do think people would be served, generally, by being more aware of the fact that distros are not doing some hardcore security vetting. But the alternative is just to use whatever was pushed up to NPM, right? In that case, Debian packager+NPM push > NPM push by definition, unless the Debian packager somehow provides negative QA, which seems unlikely. (Also, on the incredibly unlikely off chance that some Debian packager reads this comment -- your work is incredibly useful and I very much appreciate it, just trying to be realistic about what exactly is provided by your group!)


> unless the Debian packager somehow provides negative QA, which seems unlikely

It has happened before. Last time there was anything major was over a decade ago though.

https://lists.debian.org/debian-security-announce/2008/msg00...


Yes, but AFAIK, those are heavily tested or audited in some manner. That's different from including code written by randos in your app that they can remotely change at any time.


> Yes, but AFAIK, those are heavily tested or audited in some manner. That's different from including code written by randos in your app that they can remotely change at any time.

It seems like the problem here is more cultural than technical, specifically that the JavaScript community has fully embraced packages that are "written by randos" that are "wrappers around three-line Stack Overflow answers."

I use packages, but I wouldn't use any that are developed by a rando with no reputation or "institutional oversight." An important part of choosing to use one is evaluating the maintainers.

Does the JavaScript ecosystem have anything like Apache Commons? I'm guessing not, but it probably should.


It's both technical and cultural.

Javascript is used on the front end. Front end devs obsess (or at least used to obsess) over download size. So you'd have crazy stuff like custom builds of Underscore (https://underscorejs.org/) with just the functions you wanted. Think manual sandboxing, if that makes any sense. You could get a package of Underscore with just map, filter and reduceRight, if you wanted to.

Now, when Node came around, people wanted as much as possible to have the same libraries available on the front end, so the same obsession with size was carried over.

Ergo the micro-milli-nano-packages they make.

Now, about the technical solution to this. We have this, for well defined programming languages (read: statically typed ones, or dynamically typed ones with a clear structure).

It's a linker. Tech from the 1950s.

Link (include) just the stuff you want, "tree shake"/"remove dead code" whatever you don't.

https://www.joelonsoftware.com/2004/01/28/please-sir-may-i-h...

Java's largely to blame for this, Sun REALLY, REALLY hated stuff that could be hooked into any OS and wasn't portable, so they didn't provide a linker. Everything was supposed to be on their JVM and you were going to install their JVM everywhere (2 billion devices!!!) and to hell with small stuff or heaven forbid, including native libraries. Javascript followed (on top of the Java restrictions they added: dynamic, poorly defined language, that would have made linking with tree shaking really hard, anyway). .Net also followed.

Almost 3 decades later we're trying to undo that damage.


The problem with tree shaking has been twofold:

- JavaScript is a very dynamic language with dynamic property access and a few other features that make it hard to guarantee that the linker won't accidentally remove too much (a rough sketch of this follows right after this list)

- historically there was no standardized "module" format until ESM (ES modules) came up (with some time in between with few competing non-standardized proposals), so statically analyzing exports/imports was difficult; in frontend you'd long rely on just creating and reading global variables (i.e. side-effects).
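
A rough sketch of that first point; the module and property names here are made up:

  // Static import: a bundler can prove exactly which binding is used.
  import { parseDate } from "./utils.js";

  // Dynamic access: the bundler can no longer tell what is reachable,
  // so it has to keep everything to be safe.
  const key = localStorage.getItem("formatter"); // value only known at runtime
  const utils = await import("./utils.js");
  const format = utils[key];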

Hence it's been "safer"/easier to create small packages.

But it's not only this. Once you put a mega-package in your repo, it's easy to gradually start relying more and more on the things it gives you. Even if it supported perfect tree shaking, you'd call one method here, one method there, and with each build your bundle size would balloon (which is not good when you could have written one line of code yourself, while the lib method's code is 1,000 lines because it supports IE4 and 17 parameters).

Whereas when you rely on small packages, you need to make a conscious choice each time to pick another dependency.

You probably don't care about this on servers written in C++ or Java that much; but on frontend it's a big deal; hell, even when building native apps for Android/iOS you have size limits for the stores submission / limits for the number of methods (tech limitation in Android). Big companies invest crazy money to shrink their native bundle sizes (https://blog.pragmaticengineer.com/uber-app-rewrite-yolo/).


I think history (20+ years from now) will prove that for all but the smallest, almost toy, systems, dynamic typing from the 80s and 90s was a mistake.

The maintenance burdens these languages are creating will make Cobol look like a kiddie bike with training wheels next to monster trucks.


I think that dynamic languages played an important role in pushing for the development of mainstream static typing that didn't suck. ML's been around for a very long time, but there was seemingly little interest in pervasive type inference in languages actually used in industry until they had to compete with the concision of dynamic typing.

Actually building large systems in dynamic languages? Probably going to turn out to be a mistake though.


I really don’t think it is fair to blame it on Java. Having very little native dependency is a huge plus to an ecosystem (just look at what Java will be able to do with Loom thanks to the almost all-Java dependencies). Also, Java was particularly keen on downloading class files at runtime, so linking everything was not even possible.

And it is not even a difficult thing to fix without going the linker way: Java's modules essentially solve it (as JavaScript modules could/can as well). Just specify what is visible outside a package, and both ecosystems can "tree-shake" unused code (though I dislike this nonstandard term).


> Does the JavaScript ecosystem have anything like Apache Commons? I'm guessing not, but it probably should.

This isn't a JavaScript problem, this is a Node problem. Node is just a JavaScript platform among many. The fact that the Node community decided to go with all these "nano packages" has absolutely nothing to do with JavaScript. Nothing forced Node, the distribution, to come with such a barebones standard library. Absolutely nothing... except the idea of being dependent on NPM, which was orchestrated by NPM's founders; that's how NPM, a private business, made money and eventually sold to Microsoft.


Not just that they've fully embraced the "written by randos" but even worse: "as soon as the rando publishes an update or change, use it!" They seem to fully automate updates because packages are so poorly written (and frankly, it probably helps with revenue stream, if their client's websites occasionally break and need them to fix it.)

...and meanwhile NPM's idea of vetting packages is basically "YOLO, BRO!"


In practical terms, can they really be audited?

This is at least an obvious DoS; I'm sure it's easy to slip in an innocuous line that, I dunno, ships your SSH keys to some rando server.


look at diffs?


I actually do this, on occasion. (Not 100%, and not to a degree I'd say "yeah, I'd've caught this colors/fakers thing." But enough to say that I've seen a decent sample.)

There is, on average, no difference in quality between commits on FOSS projects and commits on projects we pay external entities for. Some paid projects are just crap code, and some FOSS code is extremely high quality.

I've had to roll-back / hard-pin dependencies from both low-quality FOSS & low-quality paid projects because of commits that — once you find & read them — are just bananas.

(I have no idea how to solve the root problem here, honestly.)


> I have no idea how to solve the root problem here

I'm not even sure what the problem is. If it's "updating dependencies introduces severe side effects", that, I think, should be accounted for in the process.


Can you really say, with a straight face, that you inspect the diffs of your entire dependency closure every time you deploy an update? With the level of scrutiny required to detect a maliciously-obfuscated security exploit?

If you can, you're an infinitely more diligent developer than I am, that's for sure.


Fortunately the problem could become more tractable if something like SES / Endo takes off:

"Endo protects program integrity both in-process and in distributed systems. SES protects local integrity, defending an application against supply chain attacks: hacks that enter through upgrades to third-party dependencies. Endo does this by encouraging the Principle of Least Authority. ... Endo uses LavaMoat to automatically generate reviewable policies that determine what capabilities will be distributed to third party dependencies."

https://github.com/endojs/endo


Nice link, thanks! Good to see Mark Miller is still working in this space.


I look at git diffs to see what's changed between versions, but that's for fun and not diligence.

> With the level of scrutiny required to detect a maliciously-obfuscated security exploit

Nope. Not paid to do that and I have not been given any such responsibility.

That said, I think the attack vector on this is very low.

Packages are rarely updated to the latest version.

We don't use a lot of packages.

We mostly use packages from trusted sources.

We use packages that are open source.


And frankly, not just diffs. You’d need to inspect the initial state, and any new dependency added. That’s potentially hundreds of thousands of LoC.


There’s no reason people can’t keep local caches of these libs if it is a major concern. This seems like a non issue.


Stale libraries are more likely to contain known security vulnerabilities.


I know it's bad practice, but I just check in vendor files/libs to source control. It makes auditing new releases of libraries a bit easier. Assuming they aren't binaries, of course.


I also like doing this, but with node, you have a massive tree of thousands of files. It's crazy and gross.


Yarn 2 pnp kinda fixes this


I don't recommend this approach


I haven't had issues yet, but it's considered bad practice for a reason. What headaches am I in store for?


You're committing something that is not part of your source code into your version control system; assuming you're using git, this is irrevocable without rewriting history.

It's mainly a nuisance. It takes up unnecessary space. Introduces possible annoying merge conflicts etc etc and it's not trivial to remove it.

As reference, I migrated repositories from TFVC to git. One team relies on checking in packages into source control, another one does so far less. One repo is significantly nimbler.

Checking packages into source control is making your VCS a package manager. Presumably you have one. Don't hammer nails with your screwdriver


The benefit of having your dependencies vendored is that you have everything needed to build your application without having to download stuff from the internet. You get to ensure what exactly makes it into your application. Yes, it will increase the repository size, but I don't see why merge conflicts would be a problem since you are just replacing a file with a new version.


> you have everything needed to build your application without having to download stuff from the internet

that's using version control to act as a proxy. AFAIK, a lot of package managers already cache local copies

> You get to ensure what exactly makes it into your application

sorry but i don't follow

> I don't see why merge conflicts would be a problem since you are just replacing a file with a new version

Are you working alone?


>AFAIK, a lot of package managers already cache local copies

But this cache is usually not easily transferable to someone compared to them just cloning a repo.

>sorry but i don't follow

You have the source code to all of the dependencies in your application.

>Are you working alone?

How many forks of a dependency do you use? Just using the master branch and upgrading along that should be good enough for 99% of your dependencies.


> But this cache is usually not easily transferable to someone compared to them just cloning a repo

Right, so they need to download "stuff from the internet". It doesn't matter much if that stuff is from a remote repo or hosted by a package repository, except if it's architecture-dependent, in which case you definitely don't want to share across architectures. Not to mention they may already have a viable copy in a proxy or cache.

> You have the source code to all of the dependencies in your application

I'm afraid I still don't follow

> How many forks of a dependency do you use? Just using the master branch and upgrading along that should be good enough for 99% of your dependency

Well, if I was expecting things to not break I'd never follow upstream master for a dependency.

But the question pertained to merge conflicts. If several people track the same remote and check in dependencies into VCS I'd expect annoying merge conflicts

Or are we perhaps misunderstanding each other? I'm not sure I follow what you mean by forks. Releases are typically on different branches or tags


I imagine your commit log would be polluted with commits just for changes in the packages.


You would have one anyways for changing the lock file or whatever. Changing your dependencies is something that you may want to be able to undo. It's useful to be able to go back to a known working version of your program with versions of your dependencies that you know work.


Git submodule documentation explicitly says[1] it's designed for adding 3rd party libraries to a project in this exact scenario.

[1] https://git-scm.com/book/en/v2/Git-Tools-Submodules#:~:text=....


I’ve been doing it with Yarn 2 / PnP and it has been great so far.

You check out a project and start it. No downloading required, and no tens of thousands of files.


You’re free to update as you like. It’s entirely possible to audit your local packages.

Letting maintainers update your projects is a convenient feature. If it is a liability in your use case you can work around it. Yes it will take more effort, but your use case justifies it.


I’m a self-taught Python programmer. I haven’t done much front-end work.

Why do some JS devs import tiny packages to do simple things? I don’t feel like I’ve seen this behavior in Python. Is it because browsers are an awful environment?


They took the Unix philosophy of doing one thing well and drove it off a cliff.


I literally lol’d. Thank you for the laugh


In browser land, the less code you ship, the better. Removing dead JS code is hard because of the language's dynamic nature, and CommonJS imports make it even harder for tree-shaking algorithms.

So, people had incentive to write and use smaller packages.

Now, the situation has improved. If you use ES modules all the way and only import what you need, then your bundler can remove unused modules from the final build.
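
For example (lodash-es is the ES-module build of lodash; most modern bundlers can tree-shake it, though exact results depend on the bundler):

  // An ESM named import: the bundler can statically see that only debounce is
  // used and drop the rest of the library from the final bundle.
  import { debounce } from "lodash-es";

  window.addEventListener("resize", debounce(() => {
    console.log("resized");
  }, 200));

  // The CommonJS equivalent, `const _ = require("lodash")`, is opaque to this
  // kind of analysis, so the whole library tends to end up in the bundle.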


> In browser land, the less code you ship, the better

That's kinda funny. These node tools I run into these days usually take forever to install, waste a lot of space due to creating their own package mirror and generally are prone to break because of dependencies.

There is usually nothing tiny about them, even though they only have a few lines of code.


Usually the "node_modules" folder (with tons of files) is not deployed to production for front-end applications that are going to run in the browser.

When we make a build for the browser, we bundle all dependencies into fewer files that only contain the code that is used.


The stdlib of JS, vs. Python or PHP, is absolutely tiny. It's improving over time, but it's still playing catch-up.


And if you're optimizing for the browser, you can't count on the improvements being there. So you still want to use third-party libraries for their polyfills.
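
A tiny sketch of the idea (in practice you'd more likely pull in a tested polyfill package than hand-roll one):

  // String.prototype.replaceAll is a relatively recent addition, so older
  // browsers may not have it. Feature-detect and patch in a fallback if missing.
  if (!String.prototype.replaceAll) {
    String.prototype.replaceAll = function (search, replacement) {
      // Simple fallback: only handles string (not RegExp) search values.
      return this.split(search).join(replacement);
    };
  }

  console.log("a-b-c".replaceAll("-", "+")); // "a+b+c"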


As someone who does front-end JS stuff and uses a bunch of packages here is why I do it:

I got tired of copying and pasting the same classes between projects. The worst part was I'd add new features to the newer projects, and when I had to go back to work on something from a year or two ago I'd have to spend time backporting all the new code. I also don't like how bloated a bunch of the "popular" packages are. Why do something in 40kB of JS when you can do it in 3kB? Smaller is faster, which is important to me because one of my main selling points is that I build modern-looking marketing websites that load and render in under 5 seconds on a slow 3G connection.


> I got tired of copying and pasting the same classes between projects. The worst part was I'd add new features to the newer projects, and when I had to go back to work on something from a year or two ago I'd have to spend time backporting all the new code.

Why not create your own common library and publish it to a private repo? There's a lot of options between using a stranger's package and what you're describing.


> Why not create your own common library and publish it to a private repo? There's a lot of options between using a stranger's package and what you're describing.

Exactly what I thought. Fascinating how they could have missed this obvious solution.


I prefer to look at the code, decide if A. it's worth it, B. it wouldn't be funner/better to just clone it as a 'plugin' in my own code, and C. it looks like it has a good team/support around it.


I think it's more that dependencies are generally used less because all the tooling around them is much worse than in other languages.


npm (or yarn) for the most part works much better than python package managers


They don’t know better.


I often find myself... ripping out a lot of what I 'need' into something that maybe isn't always well-maintainable, but it's my fuckfest of code, and if something breaks it's because I chose to eff it up myself. Especially when it's something API-related: most PHP API SDKs are poorly maintained anyway and need updating as I go, plus I usually learn the API pretty well as I rebuild and test the new classes.

Ironically, I'm working on a Laravel package myself that I'm hoping to maintain (and turn into a viable side project). It's basically Jetstream with SaaS components and UI elements (think UI component libraries + Laravel Jetstream + extra SaaS/ERP things like tenancy beyond just teams, e.g. an Org which can have teams, projects and employees, where each user can belong to multiple orgs, teams and projects, and have attached profiles for each).

For a lot of the UI stuff, I've basically repacked MIT-licensed Tailwind CSS components and Laravel/Livewire bits, added some extra configurations and options, and made it so you don't need Jetstream, just this thing... So a lot of it is actually others' packaged code pulled into one package; ideally there's one dependency that could even be easily forked and repurposed for a team's needs while covering a lot of boilerplate possibilities.


I’m surprised the AWS SDK doesn’t pin its dependencies and put new versions through its paces before letting end users possibly use a compromised utility with catastrophic results.


> At the very least, it takes them under a minute to break your app, simply by deleting their package.

Not really, if you pin your versions exactly and don't do auto-updates. This also means that you have to update your packages manually and inspect what the latest versions are doing -- which is good practice anyway. NPM packages cannot be unpublished any longer, as far as I know.


True enough overall, but I'm surprised no one has let you know that NPM actually doesn't allow you to delete packages anymore (after the left-pad controversy). You have to email them and then you are judged by how many downloads you have. If you have users relying on your package, they do not let you delete it.


I have worked with several banks as a devops consultant and have helped implement self-hosted proxy repositories for NPM, NuGet, etc. These proxies save a copy of every package downloaded and store it locally for as long as the bank wants. Developers are then blocked from downloading from any other repository than these proxies. This solves the problem of packages being taken offline, but it does very little against malicious code on new versions of the package. However it also provides the ability to reproduce code X years into the past, as can be required of banks by financial regulators in various countries.

As always it is a cost/benefit trade-off: what is the benefit to auditing every package vs. the cost of auditing every package?


IMO this is what forking is for. You don't have to rewrite it, you simply copy it somewhere that it isn't going to change unless you make the change happen, because some third party screwing around with your production code is just Bad News even if they have the best of intentions.

You should still look over it and make sure it's not obviously malicious, but simply using forks or local repos of open source packages would probably save 98% of these kinds of headaches (with a 2% allowance for insecure/malicious open source code).


A peer-reviewed standard library is key, just like glibc or libstdc++ for C/C++: something that covers 80% of the normal use cases, with the rest being your responsibility to quality-check.

With JavaScript/Node, you can write 100 lines of code and end up pulling in 100 modules; that's quite different, and it's hard to assure quality and safety over a long period of time.

I heard Rust also has a very small stdlib, which gave me concerns, but I don't code Rust. I do hope, though, that all languages can have a stdlib that is 20% of the size but covers 80% of normal needs.


I wholeheartedly agree with this commentary. Any insight into why this is so much the case with npm but seemingly not as bad in other ecosystems (dependency trees in npm are huge)?

I feel like the implicit trust makes even using popular packages such as React seem a bit sketchy. I'm betting React devs audit upstream packages, but I don't know of any formal statements that they do. Multiply that by all the other common projects and you have a huge auditability issue.


>> Any insight into why this is so much the case with npm but not seemingly as bad in other ecosystems (dependency trees in npm are huge).

I would think that the sheer popularity of the JavaScript (and therefore Node) ecosystems contributes partially to it - there's a massive industry out there about skilling new developers up in JavaScript, Node, and some front-end frameworks. But it definitely doesn't explain all of it.


I actually attribute it more to the micro package architecture, but maybe I don’t know it well enough. I don’t know any other ecosystem with a left-padding package for instance.


I noticed a few years ago that my bank didn't use much in the way of dependencies for their website (possibly just jQuery) - clearly they agree that depending on React opens you up to depending on... who knows what.


I’ve considered that, there are a few scenarios:

- some sites favor security over UI/UX

- some organisations have the funding to review packages such as react

In the future I think security bureaucracy will prevent security conscious organisations from having nice new things. This happens in places like the military (who were known to use WinXP long after public EOL).


Yeah I'm sure a bank could afford to review React but even a minor version bump would then become a very expensive auditing operation.


Totally agree with you. I think it's time to think carefully and immediately start using services like Vulert (https://bit.ly/336DZub) that track your open-source software for free and notify you in real time if any security issue is found within your application. It's free.

At least this way we can protect ourselves from supply chain attacks.


Extending your logic to the extreme, you won't be able to use any OS, compiler, or processor unless you build it yourself. We can't build practically anything of use without third-party libraries, in any language or on any platform.

The problem here is not packages, but the lack of a stdlib and the tendency of package writers to pull in a shit ton of further dependencies.


I wish there was a package manager for node.js that is made for "static" or offline usage, and is able to compare headers of libraries before upgrading them.

But here we are, 10 years in, with nobody giving a damn about semantic versioning.

Life could be so much easier with an actual package manager that isn't just some git clone replacement.


OMG, yes, re: the mailing list for packages. It's not a technical cure, but it connects developers and library consumers and creates far more accountability.

GitHub issues can work in theory, but in practice developers are often slow to respond, i.e. GitHub Issues is where problems go to die.


You don't have to always pull the latest release on utility packages if you want to evade such problems. Sure, you would need to audit a lot of packages in certain languages...

But yes, I prefer to "vendor" my dependencies too, especially on large projects.


Security vulns are fixed daily in most webapp dependency trees.

If you do not update you are vulnerable to piles of issues anyone can look up.

If you update blindly you may import new obvious supply chain attacks.

The solution is actually doing code review. If you can not afford to review 2000 dependencies then you can not afford 2000 dependencies. The extra effort to use a minimal framework and some cherry picked functions may be worth it for most orgs.


In my opinion, golang does this correctly. You can just git submodule the source code of your dependencies. That way, you're always in control over what gets updated and when.


That's true of most package managers too though. It's very trivial to vendor the code yourself in both NPM and Cargo for example.


Yes! Re-inventing the wheel is literally perfectly fine. You take a round piece of wood, make it roll, and BAM, you've re-invented the wheel. No need to go to Toyota, buy their wheel-making machinery for millions and run that to make a little thing roll.

It's the same with packages: it's FINE to have to redo a bit of thousand-separator logic. Do you truly need a transitive dependency hell with ^1.1.1 in the package list that auto-upgrades at random?! I've had several cases where the whole company is all hands on deck because some dep somewhere moved up and all subsequent builds fail. What are people doing in Node? We never had these issues in Java.


How complicated is it to just clone the repo and point your apps to your forked version? That way, this would not have happened.


> Packages are literally remote code exec vulns in the hands of package authors

Reminder: this is why traditional Linux distributions exist.


the whole point of the internet

> are literally remote code exec vulns


I applaud you :) HTML-based Web 1.0 was so much safer and faster...

