$ sudo rm -rf / === NPM install (ghuntley.com)
171 points by ghuntley 76 days ago | 173 comments

If you are on a Linux distro that supports AppArmor (Ubuntu, Debian, etc.), such issues can be easily prevented by creating a sandbox profile and aliasing commands like node/npm to run under it.

  alias npm='aa-exec -p sandbox npm'
  alias node='aa-exec -p sandbox node'
sandbox profile:

  # /etc/apparmor.d/sandbox
  include <tunables/global>

  profile sandbox {
    #include <abstractions/base>
    #include <abstractions/consoles>
    #include <abstractions/nameservice>

    /sys/** r,
    /{usr/,}bin/* ixr,

    # nodejs install dir
    owner /home/*/nodejs/**/* rix,

    owner /tmp/**           rw,
    owner /tmp/**/          rw,

    owner /home/*/.npm/_* lrw,
    owner /home/*/.npm/_*/ lrw,
    owner /home/*/.npm/_*/** lrw,
    owner /home/*/.npmrc r,

    owner /home/**/node_modules/ rwix,
    owner /home/**/node_modules/** rwix,
    owner /home/**/package.json rw,
    owner /home/**/package-lock.json rw,

    owner /home/*/projects/**/ r,
    owner /home/*/projects/** r,
  }
To load the profile:

  sudo apparmor_parser -r /etc/apparmor.d/sandbox

edit: For completeness, added step to load the profile.

I think you have a definition of "easily" that many might not subscribe to.

If this policy isn't available and enabled by default in some package containing common policies, do you really expect your average developer to a) even know anything about AppArmor, or b) know how to write an AppArmor policy file without spending a few hours digging through docs, reading examples, and engaging in a bunch of trial and error?

(If your answer is "yes" to either of those questions, I'd suggest you might overestimate what is common knowledge.)

Fair enough. I meant easy as in without installing any additional software, minimal config and ease of use.

AppArmor has come a long way. It's worth learning about it. They have quite a few builtin profiles that you can take advantage of.

Only on HN can you use the word "easy" to describe a set of steps that follow directly after, and still end up having to clarify the word easy referred to your steps, not to the entire concept of the underlying technology.

People mean the steps are not easy. Because it's not about copying them but being able to create them yourself on the fly for whatever you need.

I really don't get what this is trying to say.

Like, the tone of the comment makes no sense, the hard part isn't on the OP, and OP wasn't saying AppArmor is easy, just the content of their comment is easy (which it is!)

It's like saying flipping a light switch to turn on a light is easy... then having someone fly in complaining that understanding industrial power generation is not easy at all.

> content of their comment is easy (which it is!)

It's not, it has subtle gaps in the sandbox which still allow some of the attacks it's supposedly protecting you from.

it's easy, just do 'rm rf /'

My point is: how do we know the instructions are actually safe?

Oh brother. It's like wrestling a pig in the mud.

If you don't trust it, don't run it... but I don't know how anyone in software development would be able to function if they've never been able to trust someone else's config file suggestion.

We reached a state where code monkeys believe they are developers. Of course you should know about the machine you are developing on.

There's an insane amount of tooling with varying degrees of overlap. You've got SELinux, BPF, seccomp, AppArmor as different approaches to security. Each of these takes quite a bit of dedicated time to master, most of them have lots of nuance, and there can be overlap depending on what you're trying to secure. In a work environment, I'd typically outsource this to security engineers/experts OR spend my time learning those things for what I need them for. Knowing the tools are out there is a sufficient starting point for research when I need that knowledge.

None of this applies to maintaining my development machine. You might want to revisit your gate keeping.

I'm not a developer, but in the field I work in, a couple hours dedicated to learning a tool that would help you isn't much of an ask.

And the little demo script GP wrote seems pretty straightforward to me.

What other tasks are planned in your current sprint?

Then you are not a developer but an assembly line worker.

Nah. I agree with the other guy. AppArmor takes some time to learn and isn't super mainstream yet. At least it's a step forward from SELinux, but it still feels uncooked and ad hoc to me.

Something like bubblewrap (which uses Linux namespaces) or firejail would be a simpler way to do a sandbox.

That is neither simple nor does it prevent some of the most common npm supply chain attacks (I think; I'm no expert with AppArmor).

The most common attack is to mess with things you build for release in ways which then allow attacking the consumer.

For example by tweaking package, package lock, or even the source code.

Only the last one is prevented here.

Then there is also the thing that installing (user-wide or even global) system utilities with npm is a thing, and wouldn't work with your AppArmor profile (they all need their own profile).

Similarly, locally running node programs are an attack vector, and your profile allows modifying the `node_modules` of such programs even if they don't use npm...

Also you need root rights for loading profiles, which can make having a profile for each project a problem as the dev doesn't necessarily have root rights. (E.g. in a company with company-managed desktop systems this is the default.)

Don't get me wrong. I'm not saying you shouldn't use AppArmor and other sandboxes. I even think it's the future of Linux.

My point is AppArmor isn't simple and it's one of the simplest solutions out there.

Or in other words we still have quite a way to go.

And it's not just about doing sandboxing, but also about managing sandboxed programs. E.g. last time I looked, the app stores of both Snap and Flatpak were a horror house of not-properly-maintained packages with security vulnerabilities, as well as dodgy, potentially malicious "unofficial" packages. Even if your sandbox is perfect you still care about security, e.g. you don't want an OpenOffice package to send copies of all documents you use somewhere else... and that's if the sandbox is used (... Flatpak ... facepalm).

> I even think it's the future of Linux.

Let me correct it:

It MUST be the future of Linux or Linux will have serious problems.

This solves only `npm install` and not other ways the package can be compromised.

And you're still giving it plenty of access to any projects under your home dir, or opportunities to modify other npm packages that are being installed, to gain execution outside of sandbox later on.

This was how I knew OSX was a real Un*x....I was messing around with...I dunno...brew's predecessor? A package manager that installed linux apps on OSX...everything was based under /opt, so it was easy to just nuke and restart to clean up the cruft.

typed sudo rm -rf / opt

And let it rip.

Then got a weird error message, like a library couldn't load....then the icons in the dock went away, then the only thing that was working properly was the web browser windows that were still fully loaded in memory

then noticed the SPACE between '/' and 'opt'

rm -rf was rm -rf-ing from root.

Valuable lesson learned that day.

My "best" rm -rf story was when I created a system user account on a production system (router) to test something. What I didn't know is that for system users their home directory is set to / by default; this will become important later.

I finished what I was doing and wanted to clean up after, so after undoing my changes I ran userdel -r on that user (-r removes its home directory). The command took way longer than expected, at some point I terminated it to investigate as I assumed there was something unrelated going on (high CPU or IO contention).

At this point you probably figured out what happened. By the time I stopped the command the damage was done and a good part of the system had been nuked. Surprisingly (and thankfully I guess) the actual firewalling and routing is done at kernel level so it continued doing its job totally fine despite userspace being nothing more than a smoldering wasteland.

Back in 2002 I joined a software company where we each had personal folders on a shared 2TB volume. 2TB was a lot back then, but it was needed because we were working with gene and protein sequences so you end up with quite a bit of data.

People used these folders for builds of our systems which could be accessed from any of our various supported environments (basically every flavour of UNIX under the sun - no pun intended there). Lots of them would also use them for development work, since many people simply remoted into a convenient UNIX box and fired up emacs or vi. I was one of the few people using my local machine for development because I was working on a Java application, and running an IDE locally was simply very convenient.

We also had our own CI system that built everything for every supported system overnight, and ran huge suites of automated tests, which also used this 2TB volume.

The key word here is shared. I had my own folder but I could do `cd ..` and see everybody else's folders, and then go poking around in them with full read/write access.

You can see where this is going, can't you?

A handful of weeks before I joined the company somebody had updated a script in a test case (I forget whether it was a pre or post) that did some cleanup. The clean-up was basically an `rm -fR *` in the current directory. What they hadn't spotted before committing the script is that they'd `cd`ed up one or two directories too far, meaning that they ran an `rm -fR *` in the root folder of the volume.

Everything was gone. Nobody could get anything done, and it took them a day or two to restore the volume from backups (which, fortunately, they had).

Some people lost a day or two's work so, fortunately, it wasn't a business ending event or anything like that. More a cautionary tale and an object lesson about the dangers of running commands like this with unrestricted access to volumes.
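One way a cleanup step like that could be hardened is to resolve the target directory first and refuse to touch anything outside an expected root. This is only a sketch of the idea, not what that company did: `clean_dir`, the workspace layout, and the file names are all made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical cleanup helper: wipes a directory's contents only when that
# directory sits strictly inside an expected workspace root.
clean_dir() {
  local root="$1" target="$2" resolved
  # Resolve ".." and symlinks first; fail if the directory is missing.
  resolved="$(cd "$target" 2>/dev/null && pwd -P)" || return 1
  case "$resolved" in
    "$root"/*) ;;  # strictly inside the workspace: proceed
    *) echo "refusing to clean $resolved" >&2; return 1 ;;
  esac
  # Address the contents by absolute path instead of relying on the cwd.
  rm -rf -- "${resolved:?}"/*
}

# Demo in a throwaway directory (made-up layout).
workspace="$(cd "$(mktemp -d)" && pwd -P)"
mkdir -p "$workspace/build/out"
touch "$workspace/build/out/result.o" "$workspace/keep.txt"

# A target inside the workspace is cleaned.
if clean_dir "$workspace" "$workspace/build/out"; then inside="cleaned"; else inside="refused"; fi

# A target that climbs back up to the workspace root itself is refused.
if clean_dir "$workspace" "$workspace/build/out/../.." 2>/dev/null; then outside="cleaned"; else outside="refused"; fi

# keep.txt lives at the workspace root and should have survived.
if [ -e "$workspace/keep.txt" ]; then kept="yes"; else kept="no"; fi

echo "inside: $inside, outside: $outside, keep.txt survived: $kept"

# Remove the throwaway directory (the :? guard refuses an empty path).
rm -rf -- "${workspace:?}"
```

The point of the design is that the script never runs a bare `rm -fR *` relative to wherever it happens to be standing; one `cd ..` too many then produces a refusal instead of an empty volume.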

The day I started work, we had the ability to browse and restore backups on our Solaris system via a Windows GUI. It was useful to retrieve archived data and job state like old logs.

Within a month another graduate developer had accidentally restored the whole FS. We got to go home early, but had browse-only access from then on.

Of course, we still retained write access to the whole FS because prod and dev were just different root directories and our deployment process was "cp" if you had CLI skills, or copy/paste in Windows Explorer if you like GUIs. We got an rm -rf runaway a little later, though only on home directories IIRC. The early 2000s were wild.

> The early 2000s were wild.

They certainly were. In my prior job my Windows 98SE development PC had a public IP address.

Ouch. Even with shared storage, access control and permissions should have saved the day there!

I bet they discussed those kinds of options afterwards.

Did the same thing, wiped out a server (it was running stuff like nagios, nothing 'production').

No big deal, restore from backup, right?

It turned out nobody had connected the backup drive to the new server. It was still connected to the server it replaced, which was still in the rack, but turned off.

Hey at least your backup server wasn't SSHFS mounted on your server and wiped also!

macOS's BSD roots are fascinating. You can still run 'leave +0005' to remind yourself to leave your Terminal session after 5 minutes, to avoid using too much mainframe time.

How long ago was this? AFAIK the `rm` command in almost every Linux distro these days will NOT let you delete `/` unless you add the `--no-preserve-root` parameter.

I think the key take away in the parent comment is:

"This was how I knew OSX was a real Un*x"

Will it still start recursing and deleting most of the stuff owned by your user? That's arguably more important than the system files!


    # rm -rf /
    rm: it is dangerous to operate recursively on '/'
    rm: use --no-preserve-root to override this failsafe
There's still lots of ways to screw up a rm, like "rm -rf /*" or deleting your entire home directory, but a space after the initial / was apparently common enough that they eventually put a big failsafe for that in the GNU version.

More serious than users fatfingering the command were scripts with code like

  rm -rf /$MYVAR
If you don't set $MYVAR and don't set bash to error on unset variables you are in for trouble.

I don't recall which one it is (u for unset perhaps?) but I cover this by starting every (bash, E and pipefail are not POSIX) script with `set -eEuo pipefail`. All should be the default IMO, I don't understand - other than back-compat - why you wouldn't want them.

It's -u.

You can also prevent this by never using $VAR, and instead always using ${VAR:?} to exit with error if VAR is unset (or use one of the other options to provide a default).
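A minimal sketch of both guards, assuming bash (the constructs used are also POSIX). `MYVAR` is the placeholder name from the comment above, and `echo` stands in for the real `rm` so nothing is actually deleted:

```shell
#!/usr/bin/env bash
# Illustration only: echo stands in for rm so nothing is ever deleted.
unset MYVAR

# Without any guard, the unset variable silently expands to an empty string.
unguarded="$(echo rm -rf "/$MYVAR")"
echo "unguarded command: $unguarded"

# ${MYVAR:?} aborts the (sub)shell with an error when MYVAR is unset or empty.
if (echo rm -rf "/${MYVAR:?MYVAR is not set}") 2>/dev/null; then
  guard_status="ran"
else
  guard_status="refused"
fi
echo "with \${MYVAR:?}: $guard_status"

# set -u has the same effect for every plain $VAR expansion in the script.
if (set -u; echo rm -rf "/$MYVAR") 2>/dev/null; then
  nounset_status="ran"
else
  nounset_status="refused"
fi
echo "with set -u: $nounset_status"
```

The subshells are only there so the demo can keep going after each refusal; in a real script the failed expansion would stop the script itself, which is the desired outcome.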

A lesson I learnt several years ago when I discovered that mktemp behaves differently on macOS versus the GNU version in Linux.

From that day onward I always make sure the first line of my bash scripts contains at least "set -e".

I don't think OSX uses GNU tools unless you install them through brew.

macOS doesn't use the GNU versions, though.

I've done something like this, dunno, may have been around 2000.

I think on linux it will refuse to run by default and you have to explicitly disable that protection. Of course running rm -rf /opt /* by accident will still delete everything.

Depends on the distribution I think. The ones I tried here just started deleting things right away: https://bellard.org/jslinux/

Seems to be specific to GNU rm then, the Fedora linux on that page refuses rm -rf / unless you add the --no-preserve-root flag, the other two have a BusyBox based rm.

If you want to run rm -rf I encourage you to run rm -rf >within Gitpod< as many times as you want!



rm is disallowed to remove . and .. under POSIX, so, for that reason, / needs to be treated specially too.

I did something similar working at the South Pole about 20-25 years ago:

rm -rf * .some-extension

Missed the extra space, hit Enter, lost several days of work. Much smaller blast radius, but still an expensive mistake for a trip like that. No, it wasn't in source code control - a few of us might have been using CVS at that time (this was before Git's time), but apparently I wasn't, or I wouldn't remember the episode decades later.

That's why when using potentially destructive shell commands I always make sure to autocomplete the path using TAB.

sudo zfs rollback mypc/ROOT/ubuntu@tuesday

Would the zfs command still even be installed on a system? I'd imagine that `rm -rf /` would remove /usr/bin and /usr/local/bin (or wherever the command was installed)

I find that making sure -v always runs is pretty helpful. The one or two times I've noticed what was being printed and immediately interrupted is well worth the cost of my console being polluted in other instances.

MacPorts, most likely! Even after Brew took over I still ran it for ages.

I believe Apple forked rm to provide protection for this case?

Also, if you're willing to lean into npm a bit, there's tools that give a layer of protection over rm such as https://github.com/sindresorhus/trash-cli

You might be thinking of the GNU version of rm (the version on any modern linux). There you need `rm -rf --no-preserve-root /` to delete everything, which prevents GP's typo (`rm -rf /*` might still work though).

Can confirm /* notation still works. It comes down to how things are interpreted. The shell expands that before rm ever gets argv
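To see the expansion happen, here is a small sketch that is safe to run: `count_args` is a made-up stand-in for `rm` that only reports how many arguments it received, and the glob points at a scratch directory rather than `/`.

```shell
#!/usr/bin/env bash
# Illustration only: count_args stands in for rm and just reports its argv.
count_args() {
  echo "$#"
}

# Scratch directory with three files, so the glob has something to match.
scratch="$(mktemp -d)"
touch "$scratch/a" "$scratch/b" "$scratch/c"

# Unquoted: the shell expands the glob, so the command receives three paths.
expanded="$(count_args "$scratch"/*)"

# Quoted: no expansion, so the command receives one literal string ending in /*.
literal="$(count_args "$scratch/*")"

echo "unquoted glob -> $expanded arguments"
echo "quoted glob   -> $literal argument"

# Clean up the scratch directory (the :? guard refuses an empty path).
rm -rf -- "${scratch:?}"
```

With `/*` the same thing happens at the root: rm is handed `/bin`, `/etc`, and so on as separate arguments and never sees a bare `/`, so a check for the root directory itself has nothing to match.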

Not under Catalina. Don't ask me how I know that.

I just tried it in Big Sur and it wouldn't allow it.

Nah, it totally works. Yall should try it.

I just did it and it seems fi

Did it give you the "this is dangerous" message or a permissions error? If it’s the latter, you probably got stopped by System Integrity Protection.

You also can't mkdir there either

A fun experiment is deliberately running sudo rm -rf / on a decommissioned machine or throwaway virtual machine from a shell on it and then seeing what you still have access to. Bash has a surprising number of builtins and there are still things floating around under /proc.

Always specify in order "sudo rm -rf opt /" so that opt is fully deleted before the / deletion causes too many failures?

Some of this is solved with locked dependencies and an understanding of how "npm install" actually works, but I find this point really takes away any credibility for this article:

> This is just one of the reasons why I think by 2023 working with ephemeral cloud-based dev environments will be the standard. Just like CI/CD is today.

Over my dead body will I ever use a cloud environment to write code. tmate[0] might be the closest I ever get.

I get some developers don't care about what they throw aimlessly onto a cloud, but I don't know what's being logged, what's being stored, how long it's being stored, who has access to it, what third-parties have access to it, and so on. The business I own or the one I write code for might care if those files were exposed like that.

Use a local VM. Or a development proxy. The cloud isn't a solution for everything.

[0] https://tmate.io/

The cloud environment where the code runs might be ephemeral, but it would most likely have access to some not so ephemeral resources that should still be protected.

If one wants to go sandboxing, the right place for it would be in npm, where packages should not be able to modify anything but themselves.

Don’t think within a package manager is the right place. Fundamentally there needs to be an onion layer around the entire activity (i.e. a virtual machine, Qubes OS, OCI/container, Gitpod, GitHub Codespaces, namespace jail).

Example of tasty file that a package manager should never be able to read: /Users/mxey/.ssh/ssh_rsa or /Users/mxey/.netrc

Lovely, I had not seen this post before. Thank you for sharing.

Great read. Thanks for sharing

ssh_rsa shouldn't be readable to browser, Spotify, Zoom or whatever is currently running. And browser files should be isolated from any other process etc.

The Linux security model (and I suppose that of other OSes too, but Linux is the one I'm most familiar with) is fundamentally incompatible with desktop environments. Mobile OSes got it much better.

I love Snap because it confines processes to a sandbox.

Linux _does_ have SELinux. It's just too confusing to use.

It's unusable for anyone but a few highly qualified sysadmins.

If good security isn't easily obtainable for a tech-savvy person in one hour, there is basically no security.

It has to be on the level of the package manager, if you want to avoid package A interfering with package B

I wrote a [similar post](https://btao.org/2021/09/09/npm-install-is-curl-bash/) recently, and I think it's worth sharing this part:

> When installing an untrusted package, run `npm install` or `yarn add` with the `--ignore-scripts` flag. If, like me, you tend to forget this, you can set npm/yarn to never run scripts with `{npm,yarn} config set ignore-scripts true`.

This disables install scripts, which are a primary attack vector for malware on npm. It also breaks some packages, but I've had this setting on for a while with no major problems.

You'll presumably be executing the code which could trivially include `childProcess.spawn()`. I think energy would be better spent vetting the author of the package. If you're concerned, install it in a VM. It's never safe to run code in any language if you think it might be suspect.

The problem is rarely the author going rogue, but rather their accounts getting hacked, and the hacker pushing a malicious update to an existing package. Vetting the author won't really help, you need to vet the updates.

Vetting the author of the package is neither very realistic in a lot of cases, nor a great solution to this problem unfortunately. A lot of people got impacted by this issue through Karma, which is extremely popular and trusted by the community. In this case, the bad actor also managed to access ua-parser-js author's NPM account and push the bad version directly, so not really the case of an author going rogue.

You don't just have to trust the package, but all of its transitive dependencies as well.

Node apps should be developed in isolated containers or VMs.

We need peer review of every update, imo.

Starting to think checking node_modules into an app repo, so you get update diffs, is sane.

Small plug for LavaMoat (https://github.com/LavaMoat/LavaMoat) which includes tools to more granularly disable dependency lifecycle scripts via @lavamoat/allow-scripts.

Just as a funny data point, when it turned out that Go's go get command allowed you to run “arbitrary” code during build (not directly, but through gcc and clang's -plugin and -fplugin features), it was marked as a CVE[1], heh.

[1]: https://www.cvedetails.com/cve/CVE-2018-6574/

Rust's Cargo seems to suffer from the same. And the tooling to avoid the wild "grab things from the Internet" route is quite lacking (shameless plug for attention: https://github.com/rust-lang/cargo/issues/10045).

I don't have a problem with making the convenient route possible, but it's very troubling that the tooling actively fights you if you want to take a saner approach.

I worry about the exact same risks with things like brew and chocolatey, and even docker.

The challenge with a lot of these systems is that there _are_ "official" packages that have some higher level of checks and quality control, and these best case examples are generally talked about on the relevant marketing pages for these tools. But these vetted packages are generally accessible in the same way as unofficial "wild west" packages that would be high risk to use. The line is very blurry and you can easily be sucked into a false sense of security.

I'm amazed at how many times I see these systems being used as a matter of course without any consideration at all given to the risks of importing packages of unknown quality from unknown third-parties.

And it's very difficult to avoid using these systems too. I use brew, docker, npm, Nuget etc. - it's hard to be productive and to not do so as a developer in 2021.

I have a standard process I go through for taking on new development dependencies, of vetting the author of Nuget/NPM packages as best I can. I rarely add a dependency without having gone through my little (likely very insufficient) "due diligence" process first, but this doesn't cover downstream/implicit dependencies...

I'll also try not to bring in dependencies for simple things. I prefer reinventing the wheel where necessary and writing my own simple string formatter or parser utility as opposed to taking the easy way out and adding another dependency for the matter of a few lines of code.

But I don't know how to solve the wider issue. The first step, I guess, is to make sure that everyone really knows what the risks are, so that at least we're all using these package managers with our eyes fully open.

It's one thing if these practices are followed when building something with little to no consequence, but I've been saying for years, code that handles safety critical tasks or handles sensitive information where there is a lot of liability needs to be treated like civil engineering. There should be licensure, rigorous industry best practices and validation. We don't have that right now and I think systemically it's why you see Boeings flying themselves into the ground and financial firms losing everyone's data day in and out.

Some things should be held to a higher standard and with those, tools like NPM that only serve to make the job faster but not make the end product better shouldn't be allowed.

I remember viewing the source on my bank's online banking site a few years ago - self-hosted jQuery was their only dependency. I guess the risk for them is high enough to not branch out into npm.

I think this is the author's most novel insight:

> Honestly, I would not be suprised by 2030 if insurance companies made the usage of ephemeral sandboxes (in whatever form: be that cloud, OCI, or firecracker) a condition of issuing cyber insurance.

Whatever form it takes, insurance companies' influence on the development process is a new and profound thing. Try to think of an industry where insurance is involved that doesn't go to great lengths to comply with their requirements; manufacturing, bulk transport, personal transport, health care, home ownership, civil engineering, chemical development, plant maintenance, entertainment, farming...it's an endless list. Devs can rant all they want (people in other industries sure do!) but if the insurer's mandates are imposed at a company level, they won't have much choice.

The industry seems fairly unsophisticated so far. Our local agents are determined to attach cyber/data riders on the coverage a building owner has for an entire building. When I point out that building owners DGAF about tenants' data and how well tenants protect that data, they have no idea what I'm talking about. They seem very far from designating e.g. sufficient backup procedures.

That's interesting to hear. I wonder if smaller, more savvy insurers will lead the way in establishing relevant standards and then be bought or copied by the big ones.

I also came to this same conclusion about two years ago.

Given the pace of new compliance I've had to onboard in the last few years, I think it's coming a lot sooner than 2030.

Maybe 2024 or 2025.

What's this insurance for? Data breach liability?

I'm not even so sure it's going to come from the insurance industry, but just from B2B customers.

Banks and insurance companies have been implementing rigorous standards on their own and requiring them from their vendors. Then there's meeting the requirements for FedRAMP certification if you want to sell to Uncle Sam.

If you look at the tech stack of a company like Progressive and how they do software deployments, they are _significantly advanced_ in comparison to the rest of industry. They also made their transformation extremely quickly by committing adequate internal resources to the problem.

I would like to learn more about how insurance companies do software engineering.

Not to excuse NPM and the entire "broken in new ways every time you look at it" javascript ecosystem, but any installation system that can run arbitrary scripts/commands as a part of the install has this same problem.

This anti-feature should be removed, and other safer workarounds provided.

> any installation system that can run arbitrary scripts/commands as a part of the install has this same problem. This anti-feature should be removed, and other safer workarounds provided.

1000% agreed. I might need to refactor the article to remove the focus on NPM. It was cited for two reasons:

- awareness of an important RFC that people need to vote on

- npm is related to the ua-parser-js incident (mystery meat in a binary package)

As bin_bash pointed out [1], removing scripts/commands from the installation process won't achieve too much. The code could trivially include `childProcess.spawn()`, so you're only postponing the issue until you use the module.

[1]: https://news.ycombinator.com/item?id=29161298

So we should just leave this security hole because there's others?

It’s not a security hole. When you build whatever software on whatever language you execute build scripts. It’s a feature of every package manager that I know because, even for interpreted languages there is always the case of needing to compile some native code extensions and thus you need the possibility to run scripts.

Just don’t run npm install (or pip install) on a production server but install things exclusively from the distro packages or if they are not available use a container. On your PC… keep a backup on hand and possibly don’t save the keys that you use to access production systems on a txt file.

Also, how many people copy/paste commands on the terminal that they don’t understand without thinking about it? And do you review all the build scripts of software built from source? Let alone the proprietary driver that you install, who knows what they contain…

Really npm is not that much of a deal. The problem is more for CI machines that can get compromised but for workstations… they are already insecure

I don't get why package install scripts are the focus here. Yes they can run malicious code on your computer, but even if you take the correct precautions and block these scripts, the rest of the code from the package is still in your application. It will run regardless, not just on your local machine but presumably also on your servers and/or your customers' devices. If the install script was malicious why on earth would the rest of the package be considered safe?

The one and only solution here is – don't pull code from sources you don't fully trust.

> The one and only solution here is – don't pull code from sources you don't fully trust.

An idea that is antithetical to the whole ecosystem surrounding npm/JavaScript. You can be careful with your direct dependencies, but their dependencies and those things' dependencies? That's how a bunch of otherwise very respectable/well engineered projects ended up with left-pad in them.

About 10 years ago I saw a few emails regarding broken video clips. Ignored them as it’s always a local problem (like the volume isn’t plugged in)

Eventually it got passed to me. The files were missing. This was on a 16T file store, one of about 10, and investigating showed about 200G was missing.

Backup was an rsync job every hour to another server, with a --delete option every week on a Monday morning.

Trawling through the auth logs eventually showed a second line engineer had elevate shimmed to root and managed to delete a bunch of files by running “find -delete” with a mangled name filter. He’d tried to solve a real problem by copying and pasting from google, failed, not realised why it failed, and given up.

Several dozen hours of archive material were gone, forever.

(There was also the time a different engineer at a different site transferred an archive with ftp. In ascii mode.)

Why were they gone forever? This was on a Monday afternoon?

It had been several days since they were deleted. By the time the problem had filtered through the layers of support it was too late.

> git rev-list --all | xargs git grep "ua-parser-js@" | cut -d@ -f2 | uniq

Shouldn't there be a sort in there before the uniq?

> Honestly, I would not be suprised by 2030 if insurance companies made the usage of ephemeral sandboxes (in whatever form: be that cloud, OCI, or firecracker) a condition of issuing cyber insurance. In this distributed world where remote development is now a norm moving towards ephemeral sandboxes is an important lever to counter the increasing threat of source integrity and supply chain attacks.

Note that we still need to address the underlying issue of source integrity and supply chain attacks, because many of these things people are blindly including via NPM and similar are not just used in development. They are often also used in production code that is dealing with real customers.

> Shouldn't there be a sort in there before the uniq?

Probably. uniq only removes duplicate lines when they're adjacent to each other. I doubt the git grep command output has all the matches adjacent.
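A quick way to see the adjacency rule, using three made-up input lines in which the duplicate is not next to its twin:

```shell
#!/usr/bin/env bash
# Made-up input with a non-adjacent duplicate ("foo" appears first and last).
lines='foo
bar
foo'

# uniq alone only collapses neighbouring duplicates, so all three lines survive.
uniq_only="$(printf '%s\n' "$lines" | uniq | wc -l | tr -d ' ')"

# Sorting first makes the duplicates adjacent, so uniq can drop the repeat.
sort_uniq="$(printf '%s\n' "$lines" | sort | uniq | wc -l | tr -d ' ')"

echo "uniq alone:  $uniq_only lines"
echo "sort | uniq: $sort_uniq lines"
```

So unless the upstream command happens to emit matches already grouped, the pipeline wants a `sort` before the `uniq`.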

I've read it's more efficient to use 'sort -u' instead of 'sort | uniq'. These days I only use the latter if I need uniq's -c to show the count of matches for each unique line.

Careful: "sort -u" and "sort | uniq" are not equivalent. The former works on keys and the latter works on lines. It should be the same if you are sorting based on the whole line, but if you are sorting on part of the line it could make a difference.

Consider this input

  1 foo
  2 foo
  1 foo
  2 bar
  1 bar
  2 bar
For that "sort -u" and "sort | uniq" would give the same thing:

  1 bar
  1 foo
  2 bar
  2 foo
But if you wanted to sort numerically, "sort -n -u" would not give the same result as "sort -n | uniq". The latter gives:

  1 bar
  1 foo
  2 bar
  2 foo
but the former gives:

  1 foo
  2 foo
The man page for GNU sort says of "-u":

> with -c, check for strict ordering; without -c, output only the first of an equal run

but "sort -n 1" gives:

  1 bar
  1 foo
  1 foo
  2 bar
  2 bar
  2 foo
and so the equal runs are "1 bar", "1 foo", "1 foo" and similarly for the "2" lines, so I'd expect the output to be "1 bar" and "2 bar", not the "1 foo" and "2 foo" that it actually gives.

The man page for BSD sort's explanation of "-u" explains what is going on:

> Unique keys. Suppress all lines that have a key that is equal to an already processed one. This option, similarly to -s, implies a stable sort. If used with -c or -C, sort also checks that there are no lines with duplicate keys.

Doing "sort -s -n 1" shows "1 foo" as the first "1" line and "2 foo" as the first "2" line, explaining why those are the two lines that make it past "-u".
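For anyone who wants to reproduce this, a minimal sketch using the six-line input above. `LC_ALL=C` is my addition, only there to keep the byte ordering predictable across systems; which "1" and "2" line survives `-u` is as described above for GNU and BSD sort and may differ on other implementations.

```shell
# The six-line input from the example above.
input=$(printf '1 foo\n2 foo\n1 foo\n2 bar\n1 bar\n2 bar\n')

# Line-based: sort numerically, then drop adjacent identical lines.
by_line=$(printf '%s\n' "$input" | LC_ALL=C sort -n | uniq)

# Key-based: -u keeps one line per distinct numeric key.
by_key=$(printf '%s\n' "$input" | LC_ALL=C sort -n -u)

printf 'sort -n | uniq:\n%s\n\nsort -n -u:\n%s\n' "$by_line" "$by_key"
```

The first form keeps four lines and the second keeps two, so they are only interchangeable when the key is the whole line.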

> The thing about open-source software that’s too often forgotten, it’s AS-IS, no exceptions. There is absolutely no SLA. That detail is right there in the license! In business terms, open-source maintainers are unpaid and unsecured vendors.

By extension, people can learn some things about their employer by their opinions of using open source libraries in company products.

1. The company is against open source entirely, preferring to buy closed source alternatives, which tells you that the company thinks software development (and by extension you as the developer) is just a cost to be minimized that they don't care about otherwise.

2. The company is not against open source, but does a poor job of keeping tabs on security patches, which just tells you that the bosses are not very smart, I suppose.

3. The company does a good job of using open source, contributing back to the libraries they use to improve them, and keeps up with security patches and issues, which means the company is invested in its developers and development processes and is likely a good place to work.

Great article. The JS ecosystem/supply chain scares me. 94% of all active repos on GitHub rely on Javascript (according to the 2020 state of the octoverse report). There is a lot of motivation for black hats to insert Bitcoin miners into semi-popular npm packages so they can fly under the radar. Scary times.

JS land is a total shitshow, that's a given.

But this sort of thing happens quite often outside NPM as well. If you're an author of a popular web browser extension then you've probably received emails from random shady people offering to buy your extension. And of course, some people are offered a sum they can't refuse. Google and Apple app stores aren't immune from this either.

I just remembered the absolute saddest hijacking I ever witnessed. It was this blog I read one day. The guy had some interesting articles (business, I believe). I kept reading and he starts talking about his health. It gets worse and worse. Then suddenly the tone of the articles change. I notice weird links to vitamin and supplement crap stuck in articles that had nothing to do with it. The articles became incredibly generic. So I did some research. Turns out, the guy died from cancer, apparently his domain name expired, and some SEO spammer type took his domain and his content and repurposed it for shitty harvesting purposes.

Not sure about pypi but any non-trivial node app has exponentially more package dependencies compared to a similar sized ruby or go app, due to a combination of the "micropackage" fad and the language being a bit of a fragmented mess with a not-great standard library. You end up with a lot of polyfills, normalization, and helper functions that are wrapped as their own library.

That's my experience, anyways, having previously worked on a large rails application, as well as large golang applications, and now spending most of my time in a typescript project.

It's not really a fad as much as a selection pressure. I wrote an article about it because there is a lot of confusion about why one-line npm packages are popular: https://erock.io/2021/03/27/my-love-letter-to-front-end-web-...

I may have missed it, but I didn't see any explanation in that article of why anyone would depend on something like leftpad instead of just writing the single LoC directly into their application.

Can I ask, is the Node ecosystem different in this respect to PyPI for example? Or is it not different at all and the same thing could (does?) happen there too?

They're really quite different in my experience, but the difference is not one of availability so much as it is Python developers (mostly) following traditional software engineering best practices. I write almost all my hobby projects in Python. Very rarely do I incur a dependency tree involving more than a dozen projects while doing so.

The rm -rf vulnerability is essentially a problem with a chain only being as strong as its weakest link. If any of the maintainers in the hundreds of node dependencies your project uses is malevolent, you're screwed. Hence, the security of the chain depends much more on the number of links it has than on its innate strength.

Even large and complex dependencies tend to be well engineered in Python. BeautifulSoup is a widely used library for loosely parsing HTML. It requires only Python and an internal library. lxml is another HTML parser (which BS can optionally use), and it requires only Python and a couple of C libraries. Even an entire web framework (Flask) uses only 4 Python dependencies directly and only one of these (the Jinja template engine) has recursive dependencies on other Python packages. All told it's about 10 Python packages needed in total.

Or consider this, if you want an example of a trivial command line tool: the Python tldr[1] client uses only 3 libraries as recursive dependencies. The Rust client, tealdeer, has 119. The official nodejs client has, if I'm counting correctly, 603.

[1] https://tldr.sh/

Thanks for this.

I'm not sure what countermeasures that project has in place, but there are plenty of articles exploring malicious packages found there.

    me: hey, how can I do $x?
    random: $ sudo rm -rf 
I love stupid stuff like this that gullible and naive people follow. It reminded me of my early days in Counter Strike.

    XxXButtStuffXxX: how do I spray decal?
    random: press F10
    random2: press Alt+F4
    XxXButtStuffXxX left the game
It's classic, harmless trolling/pranking before it became the toxic beast it is today.

It'll slow down your firing rate but I suggest to all noobs that they `bind kill lmouse` to make sure they can compete on a level like the pros do.

Looks like the wrong way around. You want `bind lmouse kill`, probably?

Quite possibly - it's been about a decade since I last had to fiddle with valve keybindings.

Isn't it `bind "mouse1" "kill"` ?

Unrelated, but it is interesting to me that in some OSs you can make a file named "-rf" and if you "rm *" the filename becomes a switch argument.

That's perhaps slightly the wrong way of looking at it. "rm" is an executable that parses its command line; the asterisk is a special symbol that the shell expands to a list of all file names (except those starting with a period). So the arguments `rm` receives when you type `rm *` are a list of all file names. It does not know that, because all arguments are just strings, and therefore it parses those strings the same way as if they had been typed, which means a filename like `-rf` can trigger the behavior of the "flag" `-rf`.

Are there any (Unix-likes) where you can’t? Aside from degenerate cases like Windows SFU/SUA (though I’d expect that one to allow it as well).

Globs getting interpreted as switches is a very unfortunate footgun, sure, but I don’t see how you could eliminate it while retaining the same level of smarts in the shell (tokenization but only into dumb strings), and DOS/Windows command line parsing is IMO essentially a reductio ad absurdum for the idea of having less (no tokenization, but the shell still splits commands). Except maybe require the -- between switches and arguments whenever there are arguments? But not only is that an incomplete fix, it also goes contrary to convenience of interactive usage, which is important: much of the tension in shell syntax and Unix utilities is due to balancing it with conventional programming, I feel, and I wouldn’t want to be writing even Perl or Tcl instead, let alone Powershell or the like.

Maybe. I hedged my statement with the weasel word "some" since I haven't used many Linux flavors and I anticipated someone coming back with 'BespokeLinux doesn't do this!!'. For what it is worth, it has been true of all that I have used...

For a similar reason, `touch ~/-i` can protect you from some problematic `rm -rf *`s.

The canonical writeup about this issue is here:


getopt-like libs stop parsing options after “--“, so “rm -- *” should be safe, afair.

Using double-dash solves almost all such issues. One notable exception: a file named "-" (a single dash) will still be interpreted as stdin.

  echo a > -
  echo b | cat -- -
Output: b
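A common way around the lone-dash case is to give the file a path prefix, so the argument is no longer exactly "-". A small sketch, run in a throwaway directory (the directory and variable names are just for illustration):

```shell
# Work in a temporary directory so the demo leaves nothing behind.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo a > -                           # creates a file literally named "-"
via_dashdash=$(echo b | cat -- -)    # "-" still means stdin, so this reads "b"
via_path=$(echo b | cat ./-)         # "./-" is unambiguously a path, so this reads "a"

printf 'cat -- - : %s\ncat ./-  : %s\n' "$via_dashdash" "$via_path"

cd / && rm -rf -- "${demo_dir:?}"
```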

What do you mean some OSs? It works fine on Linux and bash. With zsh I have to expand the * to make it work.

No OS is special here, nor is ZSH/bash... Tried this with bash on linux:

    $ touch -- -rf
    $ mkdir subdir1
    $ touch subdir1/test1
    $ rm *
    $ ls
With zsh it's almost the same, it just warns me first:

    $ touch -- -rf
    $ mkdir subdir1
    $ touch subdir1/test1
    $ rm *
    zsh: sure you want to delete all 2 files in /path/to/test [yn]? y
    $ ls

Since there was a file called -rf, `rm *` expands to `rm -rf subdir1`, and that's what it does. The actual file named -rf survives because -rf was interpreted as an option rather than a filename. It would only have been removed if I did `rm -rf -- *`.
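For comparison, here is a sketch of the same setup with a `./` prefix on the glob, which keeps every expanded name from starting with a dash. It runs in a temporary directory, and `keep.txt` is an extra file I added for the demo.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

touch -- -rf keep.txt
mkdir subdir1
touch subdir1/test1

# "rm ./*" expands to "rm ./-rf ./keep.txt ./subdir1": no argument starts with
# a dash, so nothing is parsed as an option. rm refuses the directory (no -r
# was given) and returns non-zero, hence the "|| true".
rm ./* 2>/dev/null || true

remaining=$(ls -A)
printf 'left after rm ./*: %s\n' "$remaining"

cd / && rm -rf -- "${demo_dir:?}"
```

Here the file named -rf is deleted like any other file and subdir1 survives, which is what a plain `rm *` is normally expected to do. `rm -- *` has the same effect.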

Since there are so many "I screwed up my rm" command examples in here, let me recommend to you all `safe-rm`. It's in Homebrew and launchpad: https://formulae.brew.sh/formula/safe-rm and https://launchpad.net/safe-rm

Basically the entire implementation is defined by this one file, it's pretty simple, like you'd hope: https://git.launchpad.net/safe-rm/tree/src/main.rs

Re https://github.com/npm/rfcs/pull/488. Thank you for your service! I had this on my TODO for months. Actually I just use yarn now wherever possible and set `enableScripts: false` both globally and per package. Doesn't solve all of the npm ecosystem's sec issues (like allowing downloading binary blobs and other assets from arbitrary URLs, no cleaning on permission bits on archives, not blacklisting certain typo-esque package names, ...) but it's clear low hanging fruit and the push-back from maintainers is really crazy ...

This article brought back memories. When discussing disaster recovery with developers, they said we would just rebuild the application server from source control. I then asked what if you installed a malicious program that deleted all of our data? They said that's the DBAs' problem. The DBAs said (at the time) we have so much data our "backup" consisted of swapping out our EMC SANs. So if we lost our data, the other SAN contained last week's data. At the time, there was no realistic backup other than manually syncing the SANs and trucking the copy to a safe location. Today, now with DevOps, recovering lost data is also the developer's problem.

"whilst the internet has fundamentally changed over the years, one thing has not - the internet is still a dangerous place filled with bad actors."

I absolutely do not remember the internet being so characterised back in 1996. It was a place of wonder and hope.

IRC (well, EFNET) especially towards the later years was anarchy. If people weren’t instructing newbies to `rm -rf` they were running scripts that resulted in people’s modems hanging up upon joining.



I never understood the reason why anybody would sensibly install and run NPM (or even any other development tools) outside a sandboxed environment like container or VM on your workbench machine. Let alone cluttering the system with massive amounts of packages that will be a huge pain to remove once they are not needed anymore.

The reason I use a VM or container is so that I can get rid of it easily.

That's the funny thing about npm though, is that it's so easy to nuke all of the packages that the first thing anyone tells you to do when you are troubleshooting is rm -rf node_modules. Python can be harder if you're not paying attention to where you're pip installing.

Exactly. Software development which involves 3rd party code should be done in a sandbox.

and of course the sandbox is also third party code... uf... you now went from using 1 third party code party to two. good job.

This is why I use Qubes rather than Docker, and why I don't run untrusted code on a local machine. Similarly, if you are okay with getting cracked software from the internets, you should use a dedicated, clean VM clone for it.

Mitigate risks. Please.

>As a member of the generation that is the Eternal September (ie. complete unawareness of pre 1993 internet etiquette) I launched right into asking my first question without saying hello.

This struck me, because in most help-oriented discords I've joined nowadays, the policy seems to be "don't beat around the bush and just ask your damn question". I've actually been chastised for not just asking my question immediately a few times. Funny how cultures change.

Last year, I was trying to download wine to do something on Linux... I was in a hurry and just copied a `sudo apt-get install wine-i386` command... I said yes, and it proceeded to uninstall my Linux install, because I am on amd, so it basically saw a conflict and removed all my operating system packages. I went back to windows after realizing there was no easy fix and I'd have to hard reformat now that I had uninstalled everything.

All modern software has "undo" functionality. Except package managers ...

How does this work when I want to take personal notes on source code, not push them anywhere? For those who'd ask why anybody would want or should be allowed such a thing: There's little distinction to me between thoughts in my head and notes in a file on disk, fundamentally and practically. I don't think it's a good thing to push all my thoughts that happen to be on disk to the cloud.

Or mount my local emacs configuration, which I sometimes tweak and currently just save to disk and carry on with my work (I'm not asking how to run emacs in gitpod, I know there was some work on that)?

Am I out of luck, gitpod/insurer/employer knows best? Or are there ways provided accessible to individual developers to 1. do arbitrary customization and 2. "mount" writeable directories from a laptop filesystem (or something similar)?

Hey thingification, Geoff here from Gitpod. I build Gitpod with Gitpod in Gitpod from an iPad. For taking notes I use the iPadOS15 notes feature by swiping from the bottom right hand hot corner.


Have you blogged about your method of note taking? Are they done in-line on the code? I see the mention of emacs (high five) but have you come across https://marketplace.visualstudio.com/items?itemName=vsls-con... yet? I wonder if such a thing exists yet for emacs…

I use emacs to take notes, yes. I haven't seen any other tool that has similar features for that purpose.

But more to the point, I don't want to have to justify every tiny aspect of my development environment. Does gitpod provide a way out of that sort of authoritarian system?

> For taking notes I use the iPadOS15 notes feature by swiping from the bottom right hand hot corner.

Honest question: what does swiping have to do with how to not have all my personal notes in the cloud?

Hm, I think perhaps you're saying one can keep notes that are not in the dev environment at all? Of course that's true, but I'm incredibly accustomed to working entirely in a heavily customized emacs, switching between org-mode buffers and code. Sometimes that's via org links to specific tiny fragments of code content (a link to e.g. a filename + the text "def my_function(" will take me right there even if the code changes a bit).

At first I lol'ed at https://www.youtube.com/watch?v=0t85TyH-h04 because of the idea that payment reduces burnout. I'm happy to write software at my own pace and focus on what interests me. Where payment implies deadlines and working on things someone else is interested in. I'd rather you keep your money and learn how to write your own software.

But then I realized he's still at the bottom of the hierarchy of needs, while I'm at the top, self-actualized. If he ever does get paid to produce drudgery, he will see it was not the solution he hoped.

I can't recall the software, but a few years ago there was a famous typo in the update script, namely a space after '/usr' in the rm -fr command, which effectively removed the whole /usr. Does anybody remember what that was?

Bumblebee (related to NVIDIA hybrid graphics offload):


There are so many comments on that commit that Microsoft Github's backend apparently cannot load the page -- I just get the "unicorn".

Anyway, this issue also points it out: https://github.com/MrMEEE/bumblebee-Old-and-abbandoned/issue...

I get the unicorn as well. So it seems like you could kill GitHub issues this way?

In 2015, Steam's Linux updater included

    rm -rf "$STEAMROOT/"*
and if $STEAMROOT wasn't set, that amounted to

    rm -rf *

That's why I always use :? with rm in bash scripts. eg

    rm -rf "${STEAMROOT:?}/"*
Will exit with an error instead of running rm if STEAMROOT is unset or empty. Even if used inside a conditional.

Or just `set -u`
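The two guards are not quite equivalent, though: `set -u` only trips on an unset variable, while `${VAR:?}` also trips on an empty one. A sketch that shows the difference without deleting anything (`echo` stands in for `rm`, and the glob is quoted so it is printed rather than expanded):

```shell
# Each attempt runs in a subshell so a failed guard does not end the demo.

unset STEAMROOT
guard_unset=$( (echo "rm -rf ${STEAMROOT:?}/*") 2>/dev/null || echo "blocked" )

STEAMROOT=""
guard_empty=$( (echo "rm -rf ${STEAMROOT:?}/*") 2>/dev/null || echo "blocked" )

unset STEAMROOT
nounset_unset=$( (set -u; echo "rm -rf $STEAMROOT/*") 2>/dev/null || echo "blocked" )

STEAMROOT=""
nounset_empty=$( (set -u; echo "rm -rf $STEAMROOT/*") 2>/dev/null || echo "blocked" )

printf ':? unset: %s\n:? empty: %s\nset -u unset: %s\nset -u empty: %s\n' \
  "$guard_unset" "$guard_empty" "$nounset_unset" "$nounset_empty"
```

Only the last case gets through, and it prints the command you least want to run. Using both guards together costs nothing.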

This happened to Steam on Linux.

It also happened to squid proxy https://access.redhat.com/solutions/1391523

The good old hunter2 days.

The good old what days? All I'm seeing is '*******'

Last time there was widespread abuse of post-install scripts, NPM created `npm fund`. It's sort of like an abandoned street corner that overly idealistic OSS maintainers are ushered off to when they try to make money off their work.

Maybe if this also picks up steam, we will get `npm brick` as the solution from the NPM team.

while `rm -rf` is intimidating, it's `fdisk /dev/hda` that strikes fear into my heart

I always fear using the disk destroyer, dd.


- if=input file

- of=output file

Never swap the order (you’ll make this mistake once and hopefully never again) and always write `if` before `of` (I comes before O in the alphabet)
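One habit that helps is naming the two paths in variables first, so the direction is spelled out before dd ever runs. A harmless sketch using regular files in a temporary directory (the file names are made up; with a real disk the destination would be a device node):

```shell
demo_dir=$(mktemp -d)
src="$demo_dir/source.img"   # the thing being read
dst="$demo_dir/target.img"   # the thing being overwritten
printf 'important data\n' > "$src"

# if= always takes the source, of= always takes the destination.
dd if="$src" of="$dst" bs=512 2>/dev/null

copied=$(cat "$dst")
printf 'target now contains: %s\n' "$copied"

rm -rf -- "${demo_dir:?}"
```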

Not to mention the whole house of cards comes crashing down when some poorly maintained project either throws a major release out of the blue or retags "latest" to something that suddenly has huge breaking changes...

I haven't tried it yet, but Ego [1] feels like an imperfect but good enough solution to run your dev environment.

[1] https://github.com/intgr/ego

My similar lesson was towards the end of the 90s, following a tutorial on how to install redhat on my windows machine. It went something like:

First, you need to create a partition: Run fdisk.

A few short seconds later I unfortunately no longer had a windows machine.

This is why I try to assume the worst case.

I.e use timeshift backups / cloud sync / src control for anything vital + treat your OS as a throw away, or at least, re-instate-able in <30 min.

"bash < (curl $joerandomurl)" is roughly in the same category.

Just do:

1) regular backups

2) use LVM for your filesystems and make a script to create regular CoW snapshots. I always have a snapshot running that lets me get back to previous, known state with very little effort.

Yeah, I have done it too.

I intended to delete a folder, but somehow `/` sneaked in and I pressed Enter...

I also think my PC back then looked exactly like the one in the picture from the article :)

Hah! I fell for the exact same thing, from a guy on IRC, back in maybe 1998. Learned a lot that day.

I worked with a jr. dev who copy+pasted a crypto miner in JS, then came asking why the code was slow

Why would you run npm install as root? That sounds like a bad idea.

Used to do that to noobs all the time.

Ah, so it was _you_. My name is Geoff. You killed my /etc/ppp/options. Prepare to die. </joke>
