"However, this is just a piece of an overall solution, and it brings with it a lot of the baggage that comes along whenever GnuPG or PGP get involved. Without a web of trust (sigh), a PKI (ugh), or some other mechanism to tie identities to trust metrics, this is essentially a complicated, very expensive, and fragile version of the shasum check npm already has."
I really like how the NPM simultaneously insults two legends in crypto and does _nothing_ to protect the node ecosystem, deferring to "better solutions" that don't exist and will never exist.
They've done literally nothing.
Last discussion was > 1 year ago. They simply do not care. Security isn't even an afterthought.
But they're right. What exactly would PKI do here? The attack works by generating confusion between two similar names, and the attacker can sign the fake package with a perfectly valid key of their own.
You could argue that maybe a PKI solution could be used to inform the UI such that users are less likely to make mistakes, but browbeating npm over this is silly. Maven has this problem (people really concerned built their own tools: https://github.com/whispersystems/gradle-witness), Chocolatey has this problem, pip has this problem, everyone has this problem.
The big difference is that the NPM ecosystem is just an order of magnitude bigger than most others, and its model of many small packages can hide many more key packages in the noise.
If someone gains access to my account and tries to modify my package without my private key it will not be accepted into the repo.
And actually, I'm pretty sure maven signs not just the code but also the documentation and everything else within your package.
Light-years ahead of NPM, even after their publicized issues.
In this case, you're concerned about signatures. I'll stick to that. If I believe I need crossenv, I see it advertises email@example.com as the author. I look up Kent Dodds. I find various bits of information about him. I find his twitter @kentcdodds for instance. He may even have a page published @doddsfamily.com to verify his key.
If the key has been around for a while, that's probably sufficient for me. If he doesn't have such a page, I send him a polite message, "Hi Kent. I'm evaluating your software for use in my project. Can I get you to verify your key please?" and Kent is a professional, so he agrees. I send firstname.lastname@example.org an encrypted message and he sends me an encrypted reply. Done. I'm satisfied that he controls the key.
At this point... oh, wow. That's not my key! Did you say crossenv? My package is cross-env! Alert the Node authorities! There's a malicious package pretending to be mine!
Is that 100% infallible? No, but here's the great thing. Even though my verification system may not be bulletproof, others with resources like DoD or FBI are out there verifying keys. They'll go see Mr Dodds, in person, if they have to. In this way, key verification is a bit like herd immunity. And the older the key is, the more trustworthy it probably is.
People who are here arguing against this system have a few things in common. They don't propose a better system, because they don't have one. They're also a bit like anti-vaxxers: they irrationally refuse to participate in such a system, to everyone else's detriment. They're a bit like Congress as well. They can't just look at a working system like single payer and copy it. They refuse to accept that this is the best available option, so they fold their arms and actively fight any attempt to implement it, as is the case on that GitHub issue 4016.
So now that I've maybe offended everyone, strike me down with down votes. That's the basics of how I would go about it though.
The problem is one of addressing. People want to get packages by a UTF-8, natural-language string, followed by a version boundary check.
If people referred to packages by their proper hash (as one does when referencing values from IPFS), then we wouldn't have this problem. If people had a public key and added a key fingerprint that would also work, but would not provide any additional verification to the code (1).
But people don't want to do this. They want to address packages by relatively simple and memorable tuples. That is the problem.
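For what it's worth, hash addressing already half-exists in npm: newer lockfiles record a content hash per tarball in Subresource-Integrity format, so only the naming layer stays human-friendly. A minimal sketch of that integrity string (the `sha512-` prefix plus base64 matches the lockfile format; the function name is mine):

```python
import base64
import hashlib

def sri_integrity(tarball_bytes: bytes) -> str:
    """Subresource-Integrity-style digest, the format newer npm
    lockfiles use to pin a package tarball regardless of its name."""
    digest = hashlib.sha512(tarball_bytes).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")
```

Once a consumer pins this value, a typosquatted or tampered tarball fails the check; the catch, as noted below, is that the hash changes with every release.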
> They're also a bit like anti-vaxxers. They irrationally refuse to participate in such a system to everyone else's detriment. They're a bit like Congress as well. They can't just look at a working system like single payer, and copy it.
You aren't even in Keybase. Have you personally participated in any keysigning parties? Can I find your public key details here?
I can answer yes to all 3 above questions.
> So now that I've maybe offended everyone, strike me down with down votes. That's the basics of how I would go about it though.
You haven't actually offended everyone. What you haven't done is say anything that anyone else doesn't know already. We all know how signature verification works, and we all have seen its problems in real-world implementations.
Please reconsider paragraphs like this.
(1) :: The one thing this would do is make contacting the author or maintainers of the code arguably more secure, although given GPG's failure to achieve widespread adoption or integration in popular tooling, it seems unlikely it'll see much utility.
The hash naturally changes with every release. The key fingerprint doesn't. Updating your dependencies to new releases is much, much more frequent than adding a new dependency; people are willing to put more effort into the latter than the former.
> You aren't even in Keybase. Have you personally participated in any keysigning parties? Can I find your public key details here?
Keybase encourages poor practices and as such I avoid it. But I've been to keysigning parties and my key's details are published any number of places. (Since adding one more is all to the good, the fingerprint is 400A C7D2 E7A1 802A AE2C C459 B1E5 712A 6D03 3D61)
It's true that a signature is based on a component with a longer lifespan than a hash. However, the management and trust of that component severely weakens this argument.
I am unaware of any web of trust in active use today that could operate at npm scale. Could you share one with me?
Is that not obvious from scrolling?
All package submissions are hand-curated, which should catch typosquatters. There's a clearly laid out pattern for what package space you're allowed to use based on website or company ownership.
The system is highly automated, but you have to wait to get your namespace approved. And it's not unrealistic to do this with npm; Maven has somewhere over a million packages.
NPM is such a massive package repository, it's sort of a testament to the community that these sorts of things don't happen more often.
Npm is a public internet login with a password of your choosing, probably the most insecure form of authentication there is bar doing nothing. I could be brute forcing your login credentials right now and you wouldn't even know.
Also, users are safe even if maven itself is compromised because attackers still can't validate my private key. I don't think you could say the same about npm...
Yes, there is a difference: releases must be digitally signed on Maven. They are not on npm, so a hacker can hijack your packages just by obtaining your npm credentials.
That's crypto 101 and you can't tell the difference?
Couldn't they 'just' steal your GPG key as well? If someone can trivially steal a strong password from you, I'd worry about key material as well.
> That's crypto 101 and you can't tell the difference?
Please reconsider sentences like these.
It is comparatively trivial to steal a strong password when it is transmitted over the internet vs local key material.
Nation states with bad certs (China has done this) can steal your password, npm can steal your password, npm can be hacked and leak passwords, npm can be subject to an NSL and forced to hand over passwords, etc, etc.
There is a big difference between passwords and keys.
> It is comparatively trivial to steal a strong password when it is transmitted over the internet vs local key material.
"Comparatively trivial?" I don't really understand this. I think you're suggesting that SSL cert attacks are easier than evil maid attacks due to this paragraph:
> Nation states with bad certs (China has done this) can steal your password, npm can steal your password, npm can be hacked and leak passwords, npm can be subject to an NSL and forced to hand over passwords, etc, etc.
Pardon me, but we're so far afield of the actual reality and the logistics that plague it that I decline to continue this conversation. If your adversary is nation states, you need a code audit (preferably several). You can't trust the signatures precisely because nation states are so well equipped to perform evil maid attacks.
It is 2017; evil maid attacks are the reality of the nation-state intelligence industry. We're seeing examples of them leaking out. They're in active use not just in a targeted fashion, but speculatively! We have evidence intelligence agencies sell compromised USB keys in bulk near the physical locations targets are likely to enter, just to see if they can catch them in a moment of bad opsec.
Honestly this all feels like an attempt to say, "We should use GPG because it is good." Maybe it is. I'm not so sure, given the logistical reality.
But it wouldn't prevent the attack we see in the tweet above. It wouldn't make the root post of this thread correct. And I'm not sure it would accomplish what you're saying.
I'm sorry, I simply don't have time to chase this degree of abstracted hypothetical on a day old thread. I'm writing this final message here as a courtesy to those involved.
Now we can't trust any package published from China.
This is a bigger deal than a targeted evil maid. Distributed trust is better than centralized trust.
Same principle as saying, "I will make my website more reliable by adding more servers." In fact you make it less reliable by doing so, you just change the severity of the problems.
And we've seen economic mechanisms compromise signed star-shaped trust graphs. E.g., all these atom plugins with features/spyware circulating because a small handful of companies think it's a business model. That's literally just buying the property from folks and ruining it. That's a very powerful attack against a web of trust and often cheaper than the fake cert attack, which is actually something you can guard against if you feel inclined to do so.
It's not like you'd have to write a bunch of software. Also, other mission-critical open source repos have been doing this for at least a decade, so you don't even have to invent and validate a new set of processes for this.
For me, this calls into question the reliability and quality of the whole npm infrastructure and the packages it hosts.
Also, blaming users because npm (knowingly, through willful negligence) hosts malicious packages that typo squat on legitimate packages doesn't seem appropriate to me.
It was typo squatting.
Name canonicalization gets you part of the way there. But unless you want to go full-on namespacing then you must realize you are fighting a pointless battle.
You cannot reasonably expect to save users from themselves in every way.
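To make "part of the way there" concrete, here's a rough sketch of the kind of canonicalization a registry could apply at publish time. The exact rules (lowercasing, dropping a `.js` suffix, ignoring separators) are my own guesses for illustration, not npm's actual policy:

```python
import re

def canonicalize(name: str) -> str:
    """Collapse a package name so look-alike registrations map to
    the same canonical form and can be rejected or flagged."""
    name = name.lower()
    name = re.sub(r"\.js$", "", name)   # "cross-env.js" -> "cross-env"
    return re.sub(r"[-_.]", "", name)   # ignore separator characters

# With these rules, several of the attack names from this incident
# collide with the package they imitate:
assert canonicalize("crossenv") == canonicalize("cross-env.js") == "crossenv"
assert canonicalize("node-sqlite") == canonicalize("nodesqlite")
```

It doesn't catch pure misspellings like "mongose", though, which is where edit-distance checks would have to come in.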
cross-env has had 1.3 million downloads in the last month. How many of those "hey, I am evaluating your library" emails can Dodds field?
Most node projects have hundreds of dependencies, if you include transitive dependencies. How many of those can you test?
If simple solutions worked, they'd be used.
But yeah, everyone contacting an author directly doesn't scale at all.
Websites, Twitter, Github, Keybase, etc..
It would be pretty hard for a bad actor to overtake the real author's entire Google-findable presence (assuming it's a reasonably popular package; why would you typosquat anything obscure?).
If all you do is send an email, then you haven't really done "due diligence" in any acceptable form.
This idea that everyone will just verify with the author is insane.
It's the same way you gain trust in something in real life, by watching actual behavior of someone over time. It is just assisted by technology.
Then you can ignore the issue of email completely, because you're not basing your trust on authority of the author, but on his track record as determined by you.
I bet you'll not find many attackers who would maintain some hijacked package for a few months before launching their attack. The original author would probably notice something fishy too, given enough time.
> Which part of the statement implied "vigilante justice"?
To put my own 2 cents in:
Doxxing someone is generally considered an attack, at least in some internet circles. It's pretty vigilante if you ask me, especially when we've seen some pretty striking examples of doxxing gone wrong in the past.
It's absolutely an attack, but it's an "attack" of a kind that is acting to end misuse and widespread tampering. It's difficult to imagine a coherent ethical system that gives the author of malicious software an expectation of privacy as they attack other people, violating similar rights.
I'm fine sharing with law enforcement, but sharing with other service providers seems to be a slippery slope. I imagine a dev losing access to their github account because they used a shitty password on their npm account and got compromised. That would suck.
I'd much rather we invent a better UI for dealing with software dependencies, but alas.
The problem does not lie in attacking bad people, the problem is that there is a high risk that you THINK you've identified who the bad actor is but actually the person you decide to "retaliate" against had nothing to do with what was done to you. That's why we leave law enforcement to the law enforcement officials and justice to the justice system. Even they make a lot of mistakes but at least there is a process that gives a chance for the truth to be found.
But sharing info about a suspect with law enforcement is what you should do yes.
It's unfortunate that so many people don't know what the word means, because now we're redefining the word to a very specific and malicious definition that makes communication about nuances around the intersection of rights here more difficult.
> there is a high risk that you THINK you've identified who the bad actor is but actually the person you decide to "retaliate" against had nothing to do with what was done to you.
I mean, you'll know their IP address, login, email, ISP, and whatnot at a minimum. If the target is a compromised computer, notifying them is the bare minimum you should do. So I'm sort of confused what kind of final consequence you're imagining here.
I think folks just see the word "doxxing" and their pattern matching misfires.
Or maybe you're trying to weasel out of what you said and are now going for broke.
Linking once again to define words, we go to Wikipedia:
> Doxing is the Internet-based practice of researching and broadcasting private or identifiable information
> Doxing may be carried out for various reasons, including to aid law enforcement, business analysis, extortion, coercion, harassment, online shaming, AND VIGILANTE JUSTICE.
I can see this is going to be a constructive dialogue. If I had wanted to "weasel" I would have deleted the post last night when it passed under the negative point threshold.
I have absolutely 0 moral and ethical problems with publishing any details I have on a person who is using my system to attack other users. I think in fact this is a responsible thing to do, and necessary. In this specific case, I might be careful about the timing of the disclosure to try and round up any nasty packages in other systems they might have generated.
But I'd publish it. Happily. Gleefully even. I have 0 moral or ethical obligations not to. I have a clear ethical imperative to do so.
I guess fortunately for this scammer, I don't own NPM.
I wasn't going to accuse you of being a weasel, but this is the most weasel-y thing I've ever seen.
Perhaps you can cite a history of the word, then maybe I'll trust your definition over some other.
You asked for clarification to avoid future misunderstanding and then proceed to reject our clarifications as if there's some nerd-word central authority that we're not aware of. We can't even agree on one 'x' or two.
When you're the one person in a conversation who has a totally different definition, you just might be wrong.
You're wrong. Give it up.
I've said I have 0 problems publishing their data publicly. I'm happy to own even the stronger model of doxxing you lay out. I've put a few time qualifiers on it you didn't like.
But I have no problem burning the identity of people who think they can use me or my infrastructure to defraud others. Quite the opposite.
Vilifying them online isn't reasonable. It's how you end up with harassment, death threats, swatting, people going after your job, your family, etc...
It's a really shitty thing to do.
Is a hell of a lot closer to "vigilante justice" than the version you just said. Had you made the sane posting first (or not pretended you did the second time) I wouldn't have said anything and just upvoted in agreement.
Just so I can modify my words to avoid future misunderstanding.
I can't believe it wasn't obvious, but that's the part which implies vigilante justice.
A vigilante is a civilian or organization acting in a law enforcement capacity (or in the pursuit of self-perceived justice) without legal authority.
Doxing has a very specific meaning and it directly implies taking matters into your own hands, aka vigilante.
In the case of malicious scammers, doxxing them so they can be cut off from other code repositories seems less like "vigilante justice" and more like "a public responsibility."
That something hasn't been updated in two years is because it's feature-complete and does what it's supposed to: verify the integrity of dependencies for Signal.
> If someone gains access to my account and tries to modify my package without my private key it will not be accepted into the repo.
This isn't true. Sonatype/Maven Central requires PGP signatures on all new artifacts, but there is no requirement to use the same key. It will happily accept a signature from _any_ key for new releases.
The work is shifted to the client, but there's currently no standardized way to verify dependencies and plugins.
There's an issue in the Maven bug tracker with the idea to extend the POM to allow trust information:
I once did a mvn build on a southwest flight and got stuff like "Syntax error: <h1>Click here for free TV..." all over my console.
This was ~4 years ago. If I remember right, maven "supported" package validation, but it was certainly not the de facto standard.
The default in maven client is usually to download via http. The default is usually to _not_ check the hash. There is not a great way to pin a library to a repository which, when coupled with the ease of third-party repositories slipping into your project, means that you can download things like your crypto oauth library from some random server on the web.
Many of these issues can be mitigated by running your own repository that mirrors what you need. Most big corporate shops do this. I think that approach works for any package management system. I guess open source devs and hobbyists are screwed?
There's also this,
Because you want to know when a dependency has a vulnerability, even when the developers are legit.
I'm asking about the specific attack detailed in the tweet.
In practice I think most developers would be running their project on the same exact box as they use for building it, which nullifies the separation of build/runtime environment. The reason that we don't typically see egregious typosquatting in the Java ecosystem is that Sonatype has a manual check on the claimed namespace for the organization publishing a project (among other checks). npm, Inc. could do this, but they so far have chosen not to.
It's true it doesn't immediately build on site, but it sure could run in the developer's machine.
What malicious packages have been found on Maven central?
Heck, I read that you can't even make a GPG key without proving your virtue to Richard Stallman in Festivus-style competition.
No, they don't. Some people actually implemented the "web of trust (sigh)".
Security is hard; the answer is not to just go "too hard, kick the can down the road" and then make excuses when the time comes for damage control. NPM being an order of magnitude larger means that more focus should be given to security, not less, since it has that extra noise acting as another way to hide malicious activity.
- macOS apps need to be signed (to run without extra work). The keypair is associated with a developer ID account that has a credit card on file. Abuse is still possible (stolen credit card, stolen certificate), but a lot harder.
- Some open source projects have their own WoT. For example, IIRC NetBSD required new developers to meet with one or two existing developers in person to verify their identity. (Pretty much like a regular PGP WoT.)
These are more work, but they also make the world safer for users.
Debian also requires OpenPGP keys and WoT for all developers.
A web of trust implies transitive trust.
I'm pretty sure the same is true of Debian, but I don't know about the others. But these are NOT webs of trust.
Most people won't be that strict in informal development, but that's not really what this is about.
The wild west model scares the bejesus out of me to be honest.
Are you characterizing people who verify keys as crazy? It's not like you can't just reach out to @kentcdodds and get an answer in under 5 hours as Oscar Blomsten just did in the OP.
Also, some of those 39 dependencies are things I wrote (various general-purpose ORM type mappers for different styles of datastore), so I guess that number could be a bit lower.
Either way, there is a lot of verification to do here. And there's two different kinds of attacks. The dangerous attack that automated key lookups and signing can protect against would be someone publishing a malicious version n+1 of a library I wrote or rely on (either hacking the npm credentials or just buying access as in the case of the minimap / autocomplete-python debacle from last week). In that case, the key change might be noted, but then again, people lose access to keys so you'd want to have ways for package authors to revoke and reissue keys (in which case compromised or purchased credentials might not be noted).
Another type of attack is the one we're seeing here - a typo attack. Without protections of the type discussed in maybekatz's tweet (in that thread), it's pretty hard to see you're in a typo attack. The malicious publisher can still sign a malicious package, and if you accidentally install crossenv instead of cross-env, there's no protection from signing. Again, unless you are manually auditing all 500+ dependencies in your tree by figuring out who the author should be, since you're viewing the npm-reported information as potentially compromised you'll need to find other connections to that person.
At this point, you've basically recreated a web-of-trust architecture, with all the challenges that go with it. This isn't simple, I don't think isaacs and the rest of the npm crew are maliciously ignoring obvious answers. It's more likely that the actual answers are hard to find and harder to scale.
The answer ultimately is: we need to audit all the code we use, or have someone else we trust audit the code we use. And if we're not auditing it ourselves, we probably have to pay someone else to do that, and that's not cheap. Walled gardens have their benefits, but explosive growth and rapid invention / iteration / elaboration is not one of them.
> my simple webapp has 514 de-duped dependencies
We have different ideas of 'crazy'.
Every approach has weaknesses. I'm pretty sure there are tradeoffs everywhere: ergonomics vs speed, security vs inclusivity, etc. I'm also pretty sure it's uncool to make implications about my mental health in public.
I'm unconvinced that 514 is crazy.
In fact, the only unusual thing I see there is that the author knows that number.
Back in the 1990s, PHP was very popular. To use it, you had to compile it yourself, which involved compiling Apache 1.3 with modules. There were also various image libraries, font libraries, etc. It wouldn't surprise me at all if the dependency tree of that included hundreds of libraries.
There's no secure list of "good people" and there's no secure list that provides a mapping of who should be signing each package. Especially for things maintained by multiple people, I wouldn't have (and shouldn't have to have) any idea of which particular people are the proper signers.
It not only works after-the-fact, but is usually also quite effective as a deterrent to stop people from publishing malware in the first place (except state-level actors who can afford to create and burn real identities for the sake of cyberwar.)
It's almost as if most of NPM should be replaced by some kind of ... self-contained encyclopedia of code. Maybe it could even be maintained by a single group of people that get along with each other and adhere to a release schedule. And perhaps there is some way it could be organized into modules with consistent documentation. While we're talking about this amazing world of tomorrow: maybe those docs could even be on the cloud, with hyperlinks between sections!
Okay, sorry, getting ahead of myself. It's crazy-talk, I know.
They don't tell you that the package was signed by someone you think should be authorized to produce that package.
Linux distros can get away with signing everything because there's typically a very small set of people the distro's organizational structure trusts to make packages, and thus a very small set of keys and real-world identities to verify.
Open-to-the-public package systems cannot hope to verify the identity of every person who creates a package, and thus cannot provide you with the web-of-trust model you want (since what you seem to want is not "is this signed by a PGP key" but rather "is this signed by a PGP key I personally think should be authorized to make packages").
Considering that signature checking would not have prevented this attack that has actually happened, I would say that not having signed packages is not in-fact the bigger threat.
Or can you point us to a prior example of a successful attack that could have been thwarted with proper signature checking?
I guess that TOTP-based 2FA challenges would be annoying in the case where CI performs the "publish" step.
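For what it's worth, the CI case is mechanically solvable: TOTP (RFC 6238) is just HMAC-SHA1 over a 30-second time counter, so a CI system holding the provisioned secret can compute codes itself. A minimal sketch, using only the standard library:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, timestamp=None, step: int = 30,
         digits: int = 6) -> str:
    """RFC 6238 TOTP: HMAC-SHA1 over the time counter, dynamically
    truncated (per RFC 4226) to a short decimal code."""
    if timestamp is None:
        timestamp = int(time.time())
    counter = struct.pack(">Q", timestamp // step)
    mac = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Of course, a long-lived secret sitting in CI is closer to a password than to a second factor, which is part of the thread's point.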
You can write a fairly basic script to automate the signing of stuff as well. Here's an example (it even signs the source archive): https://github.com/openSUSE/umoci/blob/master/hack/release.s...
they can if npm enforces the usage of TOTP for publishing.
As a user who uses both a GPG key to sign commits and a 2FA token to authenticate to all sites where this is possible, I can assure you that dealing with a TOTP token is more fun than dealing with GPG keys.
So what is going to happen to all of the packages published before TOTP is turned on? Not to mention that there have been many cases where second-factors have been bypassed (even Google's authentication). Which means I'm forced to trust that there are no exploits in NPM's authentication system, as opposed to trusting that PGP signatures are not broken. I know which one I would bet on.
As for dealing with PGP keys, come on. We all know that GPG's interfaces are bad for normal users, but all it takes to be able to sign things is:
% gpg --generate-key
same as what happens with all the packages that were uploaded before the hypothetical GPG support was added to npm and packages could be signed.
>Which means I'm forced to trust that there are no exploits in NPM's authentication system
with signatures, you are forced to trust npm's authentication system to make sure that nobody has stripped the signature off a published package or changed the signature of an existing package.
Alternatively, it's up to you to keep track of all previously used signing identities of all your dependencies and to manually check the whole dependency tree if any of the keys in the tree have expired and been replaced.
> but all it takes to be able to sign things is[…]
unless you have more than one machine. If you do, you have to sync your keys between machines, and just putting ~/.gnupg on Dropbox (which would be OK, as the keys are encrypted) won't do, because there are still two maintained forks of GPG out there that work differently and require different config settings.
> And answering the interactive prompts
of which depending on GPG version some give bad advice with regards to key compatibility and strength and none of these prompts will help you deal with an expired key in the future (and yet, these prompts recommend you create one that expires after only a year).
Just stating `gpg --generate-key` as the complete solution will put people in a position where, in case of an emergency release, they won't be able to publish that release because of a previous administrative failure. That's a risky proposition.
And finally, the same malware that steals your 2FA token can also steal your ~/.gnupg and the passphrase once you enter it.
What I'm getting at is that gpg is actually significantly harder to use and maintain for users, requires significant updates to npm on both the server and client end, will cause false positives due to key changes and doesn't provide much more security than enforced 2FA authentication for publishing packages which would just require a small server-side change.
I get that you personally are totally willing to deal with the key of a maintainer of a dependency of a dependency of a dependency of yours having expired and thus being replaced with a new key, and I also totally get that you yourself are willing to manually check the signatures of the whole dependency tree for changes (you're not willing to trust npm itself as a public key repository, I get that, so you'll have to manually keep all previously used public keys around), but don't expect this same due diligence from everybody else.
Once you trust NPM.com to manage identities (which is the only way to halfway conveniently deal with key rotation), everything hinges on NPM's authentication system again and at that point we're back to square one.
> I really like how the NPM simultaneously insults two legends in crypto and does _nothing_ to protect the node ecosystem, deferring to "better solutions" that don't exist and will never exist.
npm is hands down the best package manager I've used and they actually do improve with every version.
You poor poor bastard
Blog post: http://incolumitas.com/2016/06/08/typosquatting-package-mana...
Discussion: https://news.ycombinator.com/item?id=11862217 https://www.reddit.com/r/netsec/comments/4n4w2h/
The paper also discusses possible mitigation measures, including prohibiting registering new packages within a certain Levenshtein distance of existing packages and using additional namespacing.
Even if NPM isn't prohibiting packages, you'd imagine they'd have internal security alerting for Levenshtein distance from the names of very popular npm packages. Such an alerting script wouldn't take terribly long to write (or to run). It'd let them catch this type of abuse much faster even if they decided (for some inane reason) that banning the names outright would break UX.
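As a sketch of how cheap such an alert would be: a plain dynamic-programming Levenshtein distance plus a scan against a popularity list. The threshold and the short list here are illustrative, not npm's actual data:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # delete
                           cur[j - 1] + 1,               # insert
                           prev[j - 1] + (ca != cb)))    # substitute
        prev = cur
    return prev[-1]

POPULAR = ["cross-env", "mongoose", "nodemailer", "sqlite3"]

def flag_suspicious(candidate: str, popular=POPULAR, threshold: int = 2):
    """Names within `threshold` edits of a popular package (but not
    identical to it) are worth a human look before going live."""
    return [p for p in popular if 0 < levenshtein(candidate, p) <= threshold]
```

Run against the names in this incident, `flag_suspicious("crossenv")` and `flag_suspicious("mongose")` each trip on the package they imitate.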
`npm install` runs code as part of the initial step.
Also, `npm install foo` will of course not just run code from `foo` but from all its dependencies and their dependencies dependencies as well.
npm ls | grep -E "babelcli|crossenv|cross-env.js|d3.js|fabric-js|ffmepg|gruntcli|http-proxy.js|jquery.js|mariadb|mongose|mssql.js|mssql-node|mysqljs|nodecaffe|nodefabric|node-fabric|nodeffmpeg|nodemailer-js|nodemailer.js|nodemssql|node-opencv|node-opensl|node-openssl|noderequest|nodesass|nodesqlite|node-sqlite|node-tkinter|opencv.js|openssl.js|proxy.js|shadowsock|smb|sqlite.js|sqliter|sqlserver|tkinter"
npm ls | grep -E "babelcli|crossenv|cross-env\.js|d3\.js|fabric-js|ffmepg|gruntcli|http-proxy\.js|jquery\.js|mariadb|mongose|mssql\.js|mssql-node|mysqljs|nodecaffe|nodefabric|node-fabric|nodeffmpeg|nodemailer-js|nodemailer\.js|nodemssql|node-opencv|node-opensl|node-openssl|noderequest|nodesass|nodesqlite|node-sqlite|node-tkinter|opencv\.js|openssl\.js|proxy\.js|shadowsock|smb|sqlite\.js|sqliter|sqlserver|tkinter"
npm ls | grep -E '(babelcli|crossenv|cross-env\.js|d3\.js|fabric-js|ffmepg|gruntcli|http-proxy\.js|jquery\.js|mariadb|mongose|mssql\.js|nodecaffe|nodefabric|node-fabric|nodeffmpeg|nodemailer-js|nodemailer\.js|nodemssql|node-opencv|node-opensl|node-openssl|noderequest|nodesass|nodesqlite|node-sqlite|node-tkinter|opencv\.js|openssl\.js|proxy\.js|shadowsock|smb|sqlite\.js|sqliter|sqlserver|tkinter)@'
grep -r -i --include package.json -E '(babelcli|crossenv|cross-env\.js|d3\.js|fabric-js|ffmepg|gruntcli|http-proxy\.js|jquery\.js|mariadb|mongose|mssql\.js|nodecaffe|nodefabric|node-fabric|nodeffmpeg|nodemailer-js|nodemailer\.js|nodemssql|node-opencv|node-opensl|node-openssl|noderequest|nodesass|nodesqlite|node-sqlite|node-tkinter|opencv\.js|openssl\.js|proxy\.js|shadowsock|smb|sqlite\.js|sqliter|sqlserver|tkinter)@'
Does this mean everyone who's used those has been affected?
EDIT: I just realized these packages are misspelled or meant to mimic the legitimate packages.
Wow, some of these are incredibly similar. If they get indexed by Google (as most npm packages do), it's really easy to mistakenly add the incorrect package while searching for the legitimate one.
I would like to see a curated set of popular libraries that are stabilized and blessed, and a core group that handles security updates and upgrading packages in the blessed set.
Part of the excitement of js dev is that there's always really useful libraries being created and distributed (ramda, rxjs, react to just name three things that start with R).
Not sure there is a good solution here - we want tons of value, but I suspect nobody's willing to actually pay for it. Libraries like cross-env, ramda, and so on are all excellent, useful, well-written, and the authors are responsive.
No they aren't.
It's true that NPM is the biggest package repository by some way, but it's only ~2x the count of the Maven repository (492135 vs 194954). PHP and Ruby also have the same order of magnitude.
tied to technology that's in relatively extreme flux with a massive amount of users
Maven/Java has similar challenges to some extent with both Android and server side development being extremely common in the same language. The large number of users and the extreme technology flux is also similar.
1. That's a Gradle plugin. I do not use Gradle. I use Maven.
2. I don't need to "manually verify everything I whitelist" (which is not exactly how Maven works but w/e) because everything in Central is cryptographically signed and can't be replaced simply because someone deleted all of their projects in a fit of pique, and some other person swooped in and used the exact same name.
3. I don't have to worry about typos, because I can't add a dependency to my project without editing the project's POM, so fat fingering on the command line cannot add random packages to the project. Plus, Maven artifacts are namespaced, so I would have to be pretty drunk to make so many typos that a different, existing package downloaded.
3. Fat-fingering your POM is no different; this argument is as wasteful as 0 and 1 and you know it.
If you actually knew what you were talking about, you would know that you can't specify a dependency in a Maven POM without specifying a version, so everything is effectively version-pinned and doesn't magically change unless you do it explicitly. And you can't change an artifact after it has been released; you have to issue a new release with different coordinates. These are features of Maven that npm is missing that would make it vastly more secure than it is now.
And fat fingering a pom is different, because it requires many more mistakes to download some other software package than the one intended. That's what this discussion is about right? That it's absurdly easy to download the wrong package on npm but not other package managers, because npm is an insecure package manager.
Also, if you read the docs you'll notice that the deploy plugins encourage users to put their central passwords in an xml file in the plain AND their pgp key passphrase in another, in the plain. No joke. Read the docs.
And it's patently false to say that the deploy plugin documentation encourages you to include that information. They very clearly preface that with "if your repository is secured" and use the explicit example of an internal repository within a corporate context. People smart enough to read that documentation in detail should also be smart enough to determine when it is and isn't a good idea to do that.
Edit: react, webpack, babel, babel-preset-env bring in 1,257 dependencies. Try vetting all those by hand.
You know how hard it would be for me to vet the dependencies if I did have a dependency problem? `mvn dependency:tree > deps.txt`. It would take half the day, but I could vet them all.
I also have a Django project on the side. It has 8 dependencies. I have vetted all of them very carefully. It's smaller than the work project, but that's not the reason why it has fewer dependencies than a typical JS mess. It has fewer dependencies, because Pythonistas have a philosophy and cultural practices that produces quality software with lots of functionality included.
Your deps.txt is barely different than a lock file here.
Reading the code from dependencies is not really hard anywhere here.
The software works as intended and I know that my dependencies have not changed without me explicitly changing them.
> Your deps.txt is barely different than a lock file here.
That command produces a tree of all dependencies and their dependencies. I don't know what that has to do with a lock file.
It is certainly out of control though. I just checked the node_modules of a fresh create-react-app generated app and there are 877 packages! Tons of duplication here, like having both array-uniq and array-unique (not to mention that's a feature built into languages like Python).
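For comparison, the de-duplication that those two packages each provide as a standalone dependency is a one-liner with Python's standard library:

```python
# Order-preserving de-duplication, roughly what array-uniq /
# array-unique offer as separate npm packages.
items = ["a", "b", "a", "c", "b"]
unique = list(dict.fromkeys(items))  # dicts preserve insertion order
print(unique)  # ['a', 'b', 'c']
```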
npx crossenv foo
npx cross-env foo
`npx` doesn't change that dynamic. It's no different than installing an RPM or Debian package over the internet.
I know my systems wouldn't last 5 minutes.
This should be signal boosted as hard as it can be managed, because this is rough stuff.
And yes, I get the irony of adding another dependency to help with the security mess caused by the node ecosystem's bent towards external untrusted / unverified dependencies.
I wouldn't doubt that there are package names that would collide because of such a change, but that's probably a good thing.
Does npm normalize package names with unicode in them? Would "сrοѕѕ-еnν" be considered equivalent? (Although this would only work if users copy/paste the name).
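Whether npm normalizes such names is the open question; a quick way to see why it would matter is to compare the strings directly. The spoofed name below mixes Cyrillic and Greek lookalikes, so it is a different string even though it renders almost identically:

```python
import unicodedata

latin = "cross-env"
spoofed = "\u0441r\u03bf\u0455\u0455-\u0435n\u03bd"  # Cyrillic/Greek homoglyphs

print(latin == spoofed)  # False: different code points
for ch in spoofed:
    print(hex(ord(ch)), unicodedata.name(ch))

# A crude registry-side guard: reject anything outside plain ASCII.
def is_plain_ascii(name: str) -> bool:
    return all(ord(ch) < 128 for ch in name)

print(is_plain_ascii(latin))    # True
print(is_plain_ascii(spoofed))  # False
```

Real confusable detection is more involved (Unicode TR#39 covers it), but an ASCII-only rule would already close the copy/paste attack described above.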
(It's the first time I've published anything to npm so let me know if I have done anything wrong...)
It uses the list of package names from the all-the-package-names package and returns the 10 packages with the most similar names to the supplied parameter (using Levenshtein distance)
It also displays their rank based on dependent packages to give an idea of how they compare in usage.
I'm sure there are improvements that could be made - PRs welcome on the github repository.
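The lookup idea can be sketched with nothing but the standard library; `difflib` uses a ratio-based similarity rather than raw Levenshtein distance, but the effect is similar. The package list here is a tiny hypothetical sample, not the real all-the-package-names data set:

```python
import difflib

# Hypothetical sample of registry names.
PACKAGES = ["cross-env", "crossenv", "d3", "jquery", "mongoose",
            "mongose", "nodemailer", "babel-cli", "babelcli"]

def similar_names(query: str, n: int = 5):
    """Return up to n registry names most similar to the query."""
    return difflib.get_close_matches(query, PACKAGES, n=n, cutoff=0.6)

print(similar_names("cross-env"))
```

Both `cross-env` and its squatted twin `crossenv` come back for the same query, which is exactly the signal such a tool needs to surface.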
The problem is not unique to the npm ecosystem, the main problem here is "web of trust" whether through GPG or even just things like 'download counts', etc.
1. For every big, important package, you can probably count on number of downloads/stars a library has to attest its trustworthiness.
2. For small packages, you should always look at the code directly. Search npm, see the GitHub repository link, click, read the source to see if it more-or-less does what you want. I think a lot of people do this already.
3. Typosquatting is still the only unsolved problem, but an addition to the npm CLI that checks if there are packages with similar names when you're downloading and alerts you -- maybe even suggesting the package that has much more downloads/stars -- should solve that.
Instead of publishing as cross-env you publish as @guy/cross-env
That makes typosquatting harder, and can help give users some ideas of packages which are by the same authors.
NPM could help by allowing packages to be published both to the "global namespace" AND as a scoped package automatically. (In other words, always allow accessing any global package by its scoped name.)
I would rather have some GitHub integration in place, so I could `npm install github.com/someone/somepackage`, like Golang forces us to do, for example.
I don't do that for all packages automatically nowadays because there is this bizarre culture of people publishing different things to npm and GitHub. To npm they send only "built" files from ES7 to ES5-compatible mode, while to GitHub go only the unbuilt sources which will not run anywhere.
A solution to that would be for an automatic builder to be run on every `git push`. A third-party service, somehow, someone, somewhere. Travis CI, maybe? I'm waiting for someone to have an insight and solve this problem in these lines.
Not that that's an ideal system, but it's an option for some packages.
I found it at https://github.com/leanone/v1/blob/2980984c003d8016ac48d3f87... redirects to https://dev.hacktask.org/p/58ad07f57e25ce001b19f776/
I created an account (use Google Translate) and played with it; use this link to show me who you are: https://dev.hacktask.org/client/59815d7f5ff1a2001b9ee398/
It'd be interesting to write a tool that monitors as packages are added to npm, compare them against the existing list, and check for potential typo-squatting. Like, remove dashes, check Levenshtein distance, etc.
I mean, NPM themselves should be doing that but ... since they aren't, might as well do it for them, ya?
looks like there is an api
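The monitoring idea a couple of comments up could be sketched like this. The shape of the change events is an assumption modeled on npm's public CouchDB-style changes feed; here the feed is simulated with canned data so the filtering logic is what's on show:

```python
POPULAR = {"cross-env", "mongoose", "babel-cli"}

def strip_noise(name: str) -> str:
    """Normalize a name for comparison: lowercase, drop dashes and dots."""
    return name.lower().replace("-", "").replace(".", "")

def flag_new_packages(changes):
    """Return (new_name, popular_name) pairs that collide after stripping."""
    skeletons = {strip_noise(p): p for p in POPULAR}
    hits = []
    for change in changes:
        name = change["id"]
        if name in POPULAR:
            continue
        skel = strip_noise(name)
        if skel in skeletons:
            hits.append((name, skeletons[skel]))
    return hits

# Simulated feed entries; a real monitor would poll the registry's API.
feed = [{"id": "crossenv"}, {"id": "left-pad"}, {"id": "babelcli"}]
print(flag_new_packages(feed))
```

Combining this dash/dot stripping with an edit-distance check would catch most of the names in this incident.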
The only thing you can do is be careful and listen for projects like node security.
You will, of course break production for a few people who just didn't listen.
Alternatively, instead of edit distance, allow users to report problematic packages and do a similar thing. Do not provide an explicit reward to users who report, so nobody would create fake malware just to report it.
In both cases, either implicitly or explicitly you are using the wisdom of the crowd to figure out the bad packages.
Maybe have all packages be scoped under a namespace, and then globally require a minimal uniqueness for the package name itself.
since the user appears to have been nuked...
Everyone can share everything for free, safe and sound in a happy world.
Didn't happen ever in the "real world", won't happen here. It's idealistic bias.
I'm sure many things have been written on this, but this is essentially an issue rooted in human behaviour.
It always comes down to having one or multiple arbiters to maintain a standard. The issue with this in these types of ecosystems is that they're simply too big and too dynamic, unless the devs and the curators are on common terms release-wise.
At that point you're basically treading toward being an organisation that elevates privileges for just a small portion of devs in order to realistically deal with the scale of things. In this centralized state it can swing the other way, maintaining heavy arbitration and release standards (Apple, for example), creating a potentially more stable and secure, but closed, system.
Google autocomplete also suggests xss.hacktask.net
Looks like this guy is up to all kinds of no good.
We also don't want shards to become yet another system package manager, used for installing executables to your PATH. Shards should never be run as root, unlike npm and pip.
Would be interesting to know how many systems have potentially been hit by this, and if any leaked production credentials. I think it's unlikely to yield a lot of useful results due to the dragnet nature of the project. A targeted attack might make more sense (e.g. on an open source library, targeting specific developers).
Couldn't the developer have at least chosen a less suspicious domain name? :)
I think they hold some responsibility in allowing an obviously malicious package to impersonate popular packages.
I would like to see an official response with action plan. I recall this attack vector being discussed in the aftermath of left-pad.
an unfortunate irony is that the current post on the npm blog is "Securing the npm registry" from 12hrs ago.
You trust NPM to be secure and serve exactly the code that the author published unmodified.
You trust the author to not act maliciously. Nothing you can really do if a user voluntarily installs leet-virus.
yarn why crossenv
Is there a reason this isn't done, or has it just not been allocated the time to build it?
Also who's hacktask?
To publish an Android app, I need to verify my name by paying Google some money ($25?) and my code has to pass some automated checks.
It seems like anyone can publish just about anything anonymously on npm. That model has upsides, but it's not exactly state-of-the-art in terms of QA (though you could argue whether QA is the right term here).
Of course they're not.
In fact, GNU/Linux distros with even minimal QA will disallow network access during builds. Also we do in fact manually audit quite a lot of stuff to make sure this sort of bullshit doesn't get uploaded to the archives.
The problem is that JS lacks a sensible standard library. Node is at fault.