Detecting the use of “curl | bash” server side (2016) (idontplaydarts.com)
198 points by tosh on Dec 9, 2020 | 132 comments



I wonder why there is such a focus on this `curl|bash` pattern. Meanwhile most of us are downloading hundreds of thousands of lines of code through all kinds of package managers, and I don't see many people inspecting those downloaded files, especially not manually. If you really wanted to verify everything, I don't think you'd ever get around to doing anything else.

I'm not saying that downloading something from your official OS package repositories is the same as downloading a random URL from the internet. What I'm thinking of is language-specific package managers such as NPM, Composer and Cargo, or user repositories like the AUR, PPAs and non-official apt repositories, where any random person can put something up. The thing with those is that they almost look official and trustworthy. Often they are displayed on an official site, you download them from a trusted URL, and they look really secure, even with hashing and things like that built in. Lots of package managers don't support any way of verifying the identity of the person uploading the files, and even if they do, we often import signing keys into our chain of trust without a moment of thought, or we don't use the signing mechanism at all.

And with something like NPM packages you are likely to download a few dozen other packages which you didn't even intend to download. You will probably run a lot of code there that could be doing all kinds of horrible things.

At least with `curl|bash` I get some feedback about where the code is originating from: which URL I will be downloading from, and whether that is a place I can trust. At least I get somewhat of an identity verification (albeit a very, very weak one), as long as I trust the owner of the site to protect it adequately against unauthorized uploads.


I would put the language-specific package managers in the same category as curl|bash because anyone can push code without anyone else checking it, but there is a real difference with your distribution: it acts as an independent third-party. In that sense they act somewhat similarly to Certification Authorities, in that I as a user will not blindly trust a self-signed certificate but will trust a certificate that was vetted by this third-party.

In practice when you install something from AUR with a helper, it's not that far from doing a curl|bash (except the helpers will nag you to inspect the content, but allow you to skip doing it by default). The difference is who you curl it from.

Edit: to be precise, I do differentiate between official repos and "third-party" repos; the latter are definitely just a more integrated curl|bash, and the same precautions apply.


It's more complicated than that.

curl http | bash - you're basically throwing caution to the wind, as everyone between you and the server can rewrite the request, meaning a fairly large number of people can serve you something malicious.

curl https | bash - you're putting your trust in the server and the PKI / CA infrastructure. A small number of people can hurt you. If we park the PKI argument, only the owner of the server can attack you. The problem here is that the server owner can specifically detect your behaviour and take advantage of your trust.

language specific - _generally_ you're pulling a hash from a repository that's public and many people can and do audit. You can't be spot served a slightly different, malicious version of a file, without it first being published for others to see. This means the vendor risks their reputation with this kind of attack, and you're likely to find out about it at some point down the line.

Obviously reality is slightly more complicated, but if your language package manager is relatively modern, pulls and checks via hash, and offers up a .lock file functionality, then it's quite a bit different from a curl http(s) | bash.


For starters I think we can agree that no-TLS is just out of the question in any case.

You are right that there is a difference, but to me the real threat model is different: in your comparison you assume that the original author is legit but the vendor can be malicious. I believe it's more accurate to assume the original author is malicious. In that case:

curl https | bash - you are compromised

language specific - the malicious author's content isn't checked before it is pushed. They have a window of opportunity before being discovered by the community and banned, but the hashes don't protect you: the verification must be done manually

third-party repos - I'll only take the example of AUR because it's the one I know best: if the malicious author is also the packager, the situation is the same as the previous one. But, as is often the case, if the malicious author is not the packager, the latter has to be convinced to serve bad content and acts as a simple gateway


Agreed yeah.

There's a whole bunch of complexity that goes into whether or not you should trust an entity.

In general though, in my opinion, if you use reasonably popular (and thus regularly audited) packages, have protective monitoring, and follow a defense-in-depth framework, there is obviously still a risk of being the first to pick up a bad commit, but you can mitigate it relatively well.

Front end has different considerations. I believe you can defend against the magecarts of the world with CSP but it's not my own forte.

The big thing is, of course, if you're not willing to do your part scanning, reviewing and auditing, nobody else will. Tragedy of the commons and all that.


>it acts as an independent third-party. In that sense they act somewhat similarly to Certification Authorities, in that I as a user will not blindly trust a self-signed certificate but will trust a certificate that was vetted by this third-party.

I have no idea why anybody trusts CAs in the first place. People seem to imagine that there's some gate in play where Mr. D. Badguy doesn't get certs signed by Verisign. He absolutely does.

This has been an issue that "Web of Trust" doesn't really do anything to solve, and delegating this crypto nonsense to admins instead of users themselves just kicks the can down the road. Random code on the net is exactly like buying a black box in a bazaar somewhere. If you don't have the skills to run/vet/sandbox it safely, no amount of Web of Trust nonsense will save you from it.

All it does is piss off users, devs, and admins alike when something goes wrong with certs, and gives a centralized authority a lever to pull to screw with you. Another brick in the monopolistic wall.


> All it does is piss off users, devs, and admins alike when something goes wrong with certs, and gives a centralized authority a lever to pull to screw with you. Another brick in the monopolistic wall.

Oh, c'mon. Bad certs do get issued, but it's rare. And blindly trusting an attestation from DigiCert that you're talking to Amazon.com is a whole lot better than most ways you'd check.

And then pinning, in turn, makes things a lot more resistant to many of the attack scenarios that remain, for users who visit you multiple times.


It's because security is not an on/off switch, it's a sliding scale. The further you push it, the less convenient it is. No one ever said Verisign as a CA is a perfect system; it's just better than assuming the server's certificate is legit. It reduces the risk, it doesn't remove it.

At some point you want to use the service / see the content. As you said, you can't vet the whole stack from top to bottom; there is not enough time in a life for that. You have to start trusting someone, somewhere.


Exactly this. There is precisely one entity[0] who can legitimately certify a particular public key as who `example.com` belongs to, and that is whichever entity controls the (definite article, globally unique) DNS servers for `com`, exclusively in a capacity not detectably distinct from the rest of the process of registering `example.com` as a resolvable domain name.

0: Mumble mumble namecoin, mumble mumble not technically an entity, but that's not particularly relevant for most cases.


I personally don't like `curl | bash` because I don't know _how_ something will install: 1) What are all of the directories that something will insert itself into? 2) What of my files (.bashrc, etc.) will it modify? 3) If it modifies those things, will it tell me?

The `curl | bash` install pattern means that it can do _anything_.

Using a package manager I know that the install will be "typical", and easy to uninstall (that's the case with most of the package managers that I use anyways). Each package manager has a different pattern, sure, but at least it will be predictable.


This isn't how I'd describe the guarantees provided by a package manager. In fact, most package managers don't really provide any guarantees at all; almost all of them support something like preinst.sh and postinst.sh scripts which can basically do anything. It's the package maintainers that are supposed to provide the guarantees you describe. Of course, they're only human, and their incentives might not line up with yours.

And if you stray outside the official channels, as most users must at least some of the time, then you're back to all-bets-are-off. Fetching and installing packages from a channel hosted by some third party really is no better from a security standpoint than running a (signed) shell script from that same party.

EDIT: I should add that there may be some new, advanced package management systems that do actually provide strong guarantees, like only putting files in certain directories, never setting the setuid/setgid bits on executable files, or perhaps ensuring that all files from a package are owned by a user:group associated with that package (the Linux From Scratch docs describe a package management scheme like that, it's worth checking out). I'm referring here to the majority of popular package managers, e.g. dpkg, which will run arbitrary code during installation.
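If you want to see what a given .deb would actually run before installing it, you can at least extract and read its maintainer scripts first. A quick sketch (`foo.deb` is a stand-in name for whatever you downloaded):

    dpkg-deb -c foo.deb            # list the files the archive would install
    dpkg-deb -I foo.deb            # show the package's control metadata
    dpkg-deb -I foo.deb postinst   # print the postinst maintainer script, if the package has one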


You make some good points, but I want to follow up:

With dpkg packages for example, you do get a few guarantees.

1. The package will include a list of files which it installs.

2. The package manager will not overwrite existing files which were installed by dpkg without an explicit diversion.

3. When uninstalling, the package manager will remove any of those files, and the directories created for them (unless they are not empty or are also created by another package).

4. It won't run as non-root (unless you've made some major changes to your system), and as such won't prompt you for, or try to take advantage of, sudo access.

Sure, that doesn't stop out-of-the-norm behaviour; the Oracle Java packages are a great example of this: they contain only a shell script which downloads, unpacks, copies, and symlinks the actual Oracle Java tarball from Oracle's website, and then (ideally) removes those files if you're uninstalling. Still, that's far more in the way of guarantees than curl|bash provides.
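For what it's worth, you can also query those guarantees after the fact (assuming a Debian/Ubuntu system; `curl` is just an example package here):

    dpkg -L curl            # list every file the package installed
    dpkg -S /usr/bin/curl   # find out which package owns a given file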


I don't think the guarantees you've numbered 1, 2, or 3 are true. Insofar as the package uses the standard mechanism for installing files, sure, it can guarantee that. But I don't believe it hooks a tracer up to the installer script to detect the betrayal of those guarantees. I think it just runs the install script, as root, trusting that the files list and uninstall scripts will do their job. The whole thing is based on implicit trust of the package maintainer, not guarantees in software.


You're right, and I've called that out in my post as well (re: Oracle Java, as an example).

That said, I've got far more trust in someone who's gone to the trouble of making a .deb file than someone who put a shell script on GitHub.


This is not exactly a fair comparison, because it is documented and configurable, but I recently found out that apt, on its default settings, does something unexpected (for me at least) when removing packages (purge + autoremove). Normally you (I) would expect all automatically installed dependencies (depends/recommends/suggests) to be gone after this, if no other package references them in its depends/recommends lists (which is what gets installed on the default settings).

However, it turns out that if a package suggests another package and that other package somehow gets installed, the suggested package will not be autoremoved anymore, because autoremove honors suggests relationships as a reason for not removing automatically installed packages. While there are valid reasons for this (e.g. when installing something with --install-suggests), it also amounts to a lot of unwanted packages after a while of installing/uninstalling software. I don't know if this has a widespread name, but I call it "suggestion congestion".

Of course, one can turn this off by setting APT::AutoRemove::SuggestsImportant to "false". And to be fair, it is an awkward problem to solve, since you have to deal with different users and package maintainers with different expectations. And apt still solves a lot more problems than it creates.
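For reference, a minimal sketch of making that setting persistent (the file name under apt.conf.d is arbitrary):

    # drop the directive into an apt config snippet, then autoremove ignores Suggests
    echo 'APT::AutoRemove::SuggestsImportant "false";' | sudo tee /etc/apt/apt.conf.d/99-no-suggests
    sudo apt autoremove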

But I'm now convinced that there is no such thing as a clean uninstall. At least not until the year of the stateless ZFS snapshot rollback NixOS desktop.


You can also pass --no-install-recommends to apt for a one-off installation to avoid pulling in a ton of garbage from a specific installation.


Meanwhile, in Windows-land, literally everything is installed by clicking Setup.exe.


Not quite. Corporate machines will most likely have some kind of management like SCCM, and there are options like PatchMyPC or Ninite for home users.

There's also Chocolatey and OneGet or whatever it's called today, plus vcpkg and nuget over in developer land.


Chocolatey and OneGet packages are usually wrappers around setup.exe/setup.msi with commandline arguments to keep the installer quiet, nothing more.


> I wonder why there is such a focus on this `curl|bash` pattern.

Because it's easy to understand, so it's a cheap way to look smart on the Internet by bashing people. Also, on a lot of servers people might only run distro-packaged software. More eyes have gone through those packages, so ops people would bash someone for curl | bash on their servers, while it's perfectly "acceptable" on client machines.


I have put several curl|bash things into production, and it's done because it's the only way to run installers that work on everything but Windows without having to maintain a .deb, an .rpm, a brew formula or something.

Often I'll write something like: here are the install steps, or just enter this curl|bash line into your terminal. Guess which one users prefer.

People who care can download the script first and/or run it as a different user or in a vm. It's not that scary.


I like it, can you make them available?


This is the answer. It’s the “never use inline styles” of ops: A rule that was once taught for good reasons, and is easy for people who know little else to call out and enforce. Never mind that times have changed and the reasoning that caused people to create these rules in the first place no longer makes sense.


Agreed.

The truth is: We're downloading and executing code from the internet all the time and the amount of trust we can put into this is very fragile. Some risks can be mitigated by installing stuff in containers if you don't need them to interact with the rest of your system. It's conceivable that the whole situation could be improved by a combination of reproducible build and packaging processes, transparency logs etc., but none of that exists today in any way that would provide a reasonable level of protection.

Right now the curl|bash-pattern isn't any more problematic than downloading an installer from a random page and doing chmod +x;./install.sh or using a package manager installing an unreasonable amount of dependencies.


> Meanwhile most of us are downloading hundreds of thousands of lines of code using all kinds of package managers.

It depends of course on the package manager, but traditionally those are signed, usually by people who actually do inspect the code. (I used to maintain Fedora RPMs; we audited code before putting our signature on it.)


curl|bash allows personalized attacks... for example if you have an IP address from a certain company. (If you have access to ad-targeting data you can refine a lot further - just remember website visits from an IP and match them to the IP of the curl request.)

repos are mirrored, come with signing keys, and any successful attacks are detected sooner or later and become public knowledge.


I wasn't arguing that official distro repositories are unsafe; I was actually saying user-provided repos are almost as bad as `curl|bash` (or even worse in some ways, given that they give the feeling of being way more secure). Even if they are signed (such as the AUR and PPAs), most people will blindly add signing keys for people or organisations they do not know, giving them the feeling that they have secured themselves. But have they really?

I guess detecting attacks is easier if all files have to be uploaded to a central service, which does allow everyone to see a personalized attack (I mean, adding `if (targetUser()) attackTarget()` isn't that hard, but it would be visible to everyone, compared to doing it server-side). But then, if I'm a sophisticated attacker I'd be sure to make that way less obvious in my code. My feeling is that it would be detected later rather than sooner if hidden well enough. And that is excluding things like non-official apt repositories.


Is `curl raw.githubusercontent.com/.. | bash` fine for you? I think most curl | bash uses point at a GitHub master branch. Using your own domain is actually scary, as the owner has to be sure that they never lose control of it.


I would never pipe something to bash.

Always download, inspect, run. (Maybe even keep a backup in case something strange happens.)


Really though? What are you looking for in this inspection?

This strikes me as one of those things where the “inspectors” underestimate the security of “curl|bash from a known HTTPS origin” and overestimate their ability to detect anything that could evade that security. At that point you’re dealing with a g0d level hacker, or your cert trust has been broken, and in either of those cases you were already pwned.


I read the script and see if I like what I see.

As an example: https://sh.rustup.rs - it's really easy to read and useful for understanding what it does.

If it's too obfuscated and I can't understand it, I don't run it; I look for other install options or give up.

If I do spot bugs, I'll go to their github and provide a PR.

If I spot something malicious I'll check the github to see who put it in and raise the problem. (if it's not on github then alarm bells)


> repos are mirrored, come with signing keys and any successful attacks are detected sooner or later and become public knowledge.

1. Not all package managers come with signing keys or actually check them.

2. "Sooner or later" - weasel words. Some of these breaches have been discovered years after the fact. Who really cares if they get discovered after 3 years? By that point all the harm has been done plus the attacker could have taken control of the systems in more varied ways so even removing the initial entry point won't save you.


> 1. Not all package managers come with signing keys or actually check them.

Seems like a very big problem with those package managers... Ubuntu, as far as I'm aware, does proper signing (as does any sane distro and, hell, Microsoft too).

I would not be using those package managers.

> 2. "Sooner or later" - weasel words.

What's your point? I trust Ubuntu/Red Hat to keep their keys safe. I trust that Google Project Zero and others would notice anything spooky.

I do not trust a random distro with only a few users to keep their keys safe and I do not use that.

It's also hard to do a proper attack when you have:

ubuntu -> (n) mirrors -> me

Ubuntu can't push a malicious package directed at me (I go via mirrors which can be picked at random)

Mirrors can't push a malicious package directed at me (they would need ubuntu signing keys, and someone would need to own all of them or be very lucky)

And if someone does compromise Ubuntu's keys, they're not going to go after me and risk getting detected that way.

There is a lot more security built into package managers than what I've described here, compared to the zero you get with curl|bash.


That's true if you're using Ubuntu's repos. But a lot of software on Linux comes as a key that you need to tell apt to trust, plus a repo that uses that key. This is just as unsafe as curl | bash, if not more so - it gives me a way to send you malicious code not just today, but also any other time you apt upgrade.


We are commenting on an article that highlights a major flaw of curl|bash.

The website owner can determine whether you are just downloading the script to investigate it, or downloading and running it.

In the last scenario the owner can decide to give you bad code and you won't know what happened / can't prove that the website owner did anything to you.

With apt the owner cannot see which case it is; someone can always investigate what is being published by just downloading a package.

Otherwise, as you noted - if you trust the wrong person you will get owned either way, but curl|bash is inherently more dangerous due to easy targeting.

(I can push an apt package via curl|bash too, so that it gets upgraded regularly.)


While this technique allows an attacker to avoid revealing the exploit if you simply redirect the curl output to a file, the saved file will contain tell-tale information (in this case, bufferloads of zero bytes) allowing one to discern that it is up to no good.

The author hints at other techniques for detecting curl|bash (HTTP or DNS callbacks), which would obfuscate but not completely mask the attacker's intentions.

Note that I'm not advocating for using curl|bash: it's a technique for gathering low-hanging fruit, and there's no point in putting yourself in that position.
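As a rough sketch of what that check could look like if you do redirect to a file first (GNU coreutils assumed; the URL is hypothetical):

    curl -fsSL https://example.com/install -o install.sh
    tr -dc '\0' < install.sh | wc -c   # counts NUL bytes; a legitimate script should print 0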


Quick note - I've had this happen to me.

- browser crash

- I reload last website

- crash again

- I know that site has an exploit - so I try curl to get the payload - it's no longer there.

- I set up wireshark - open up in browser - exploit no longer there.

I'm now stuck with no way to figure out what happened; the core dump is useful for preventing the crash, but not for finding the code that triggered it.

So disconnect / fresh install OS.

This kind of targeting can happen now with curl|bash detecting if you install or just download.


It would require somewhat more sophistication on the attacker's part to detect curl|tee|bash being run in a VM, I think. Also, can you start bash with tracing on? Or put awk in the pipeline to turn it on, and also filter out attempts to turn it off?
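Something along those lines is possible today; a sketch that just keeps a copy and a trace (it's no defense against a script that is malicious from the start, and the URL is hypothetical):

    # tee keeps a local copy while bash -x writes its execution trace (and any errors) to a log
    curl -fsSL https://example.com/install | tee install-copy.sh | bash -x 2> install-trace.log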


Package managers include npm, bundler, maven, gradle, cargo, etc, not just distro ones.


and those package managers need to have security built into them as well.


Anyone that dedicated would more likely bash you over the head with a rock until you give up your password.


Why would they do that when they can run a script from halfway around the world and turn a profit without getting caught?

I'm thinking ransomware attacks, bitcoin mining farms set up on AWS accounts after stealing keys / racking up huge costs, bank account takeovers, stock market account takeovers...

Someone hitting you in the head is easier to avoid / easier to recover funds from. (and if you're in the US and have a gun, that person trying to hit you in the head is going to have a bad time)


I mean they'd more likely do that than target you personally with a curl|bash. That's a very noisy and blatant move, super unlikely to work on anyone techy enough to know what curl and bash are, probably the last resort. Other exploits are on the table and indeed much more likely too.



Because piping curl into bash is just an unnecessary risk that gives you very little benefit (you speed up a setup a bit), while package managers actually help keep a project update-able and deployable in the long term. In the end we all end up with some sort of compromise between security and usability/maintainability - 100% secure doesn't exist. Trimming as many risks as you can do without, while keeping most of the useful functionality, is a reasonable strategy for most projects.


Maybe it is because the website owner has full authority to change anything at their discretion, while git packages usually exist in an ecosystem that can be observed and tracked.


git allows rewriting history. It doesn't seem unlikely that one could come up with an attack which gives a malicious git clone to one user, and then rewrites history so that other users later don't see the maliciousness.


Rewriting history has absolutely nothing to do with this. In a VCS that doesn't allow this, I could just hand out repo1 and repo1+malicious-patch. In both cases (as with git as well), I can detect this by comparing hashes.


> And with something like NPM packages you are likely to download another few dozen of other packages which you didn't even intent on downloading. You will probably run a lot of code there that could be doing all kinds of horrible things.

This got me thinking - how easy would it be to orchestrate a dependency-based attack that would cripple a large number of applications, for example with the help of a maintainer of a popular open-source project gone rogue? Do large tech companies frequently audit the third-party code that goes into their applications, or is it largely based on trusting the open-source maintainer?


Are you familiar with the left-pad incident? One maintainer dropped a bunch of predominantly trivial packages that had a large impact on npm.


Note that for the left-pad incident, the impact was build failures, not remote code execution.


There was a time when I made a point of only installing from source code, and never even using package managers. Although, of course, it wasn't possible to read through all of the source code and make sure that it wasn't doing anything malicious, this felt safer to me. I eventually had to give up, though, because troubleshooting a failed install from source is damned near impossible, and all the documentation you can find on anything assumes that you're using package managers to install everything.


Not that I personally care that much, but the idea is that with curl|bash you can get an incomplete script because of a network error, and "incomplete" can cut off in the middle of any command: instead of "rm -rf /home/user/.config/program/useless_dir" it could end at "rm -rf /home".
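That's also why well-behaved install scripts wrap everything in a function that is only invoked on the very last line, so a truncated download does nothing. A minimal sketch:

    #!/usr/bin/env bash
    set -euo pipefail

    main() {
        # all real work goes in here; nothing executes unless the
        # closing brace and the final call below arrived intact
        echo "installing..."
    }

    main "$@"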


All Composer packages are namespaced, and Composer warns if the command is being run as root.

It has its own potential security issues with post-download scripts, but knowing the namespace helps a bit.


curl | sudo bash (the typical use case) means that whatever ass-backwards method of installation the developer thinks makes sense just happens, without you being able to put any reins on it.

For example, Homebrew by default installs everything into /usr/local, but as your user. This is great for single-user systems, but everything goes all the way to hell when someone goes and installs it on a multi-user system and suddenly whatever versions of anything they've chosen to install become the default version for everyone on that system.

For Linux, if you have sudo permissions, it recommends you install it into /home/homebrew/.linuxbrew, which is completely nonsensical; it doesn't create a 'homebrew' user, and it shouldn't store local data in /home/<wherever>/ anyway (use /usr/lib/<wherever> or /var/lib/<wherever>).

Basically, the people who created HomeBrew don't seem to really understand the benefits of not making a complete mess of an existing system.

Compare that with, for example, MacPorts. They have an installer package that you can use on MacOS, or you can just clone the code and do './configure' and pass in whatever options you like. The first is great for the less technical, and the second is great for more technical. They install by default into /opt/local, which I've never seen anything else use, and they help you add the relevant paths to your path so that you can use it, but no one else does by default.

I've also seen other "install shell scripts" which do even worse things. One (I think from the Apache project?) would download a .deb package, if you were on Ubuntu, and then just manually unpack it over top of your existing filesystem. It wouldn't `dpkg -i foo.deb`, it would `dpkg-deb -x foo.deb /`, potentially overwriting anything that shared the same path, and making it impossible to uninstall. It's already a debian package! Just install it normally!

In other words, aside from encouraging the bad habit of "run code from the internet blindly as root", it's extremely, extremely rare that I come across a project which instructs me to do this but doesn't do something incredibly stupid in their script.

> At least with `curl|bash` I get some feedback of where the code is originating from, what URL will I be downloading something from and is that some place that I can trust. At least I get somewhat of an identity verification (albeit very very weak) as long as I trust the owner of the site to protect it adequately from preventing unauthorized uploads.

This isn't even remotely true. That shell script that you downloaded from https://llamasi.te/install might download and install arbitrary binary packages, binaries, config files, etc. from anywhere on the internet. It might install an older version of npm with security holes, overwrite your local node installation, and then download a bunch of npm packages with pinned versions full of exploits.

Unless you stop and read through their shell script to see specifically what they do, you have literally no idea what is going to happen with your system, and if you're going to stop and read their shell script it's probably significantly faster to just provide you with a list of prerequisites and a few commands to run, rather than make you read through a shell script full of if/else/fi to check which versions of sed and awk you have and where they are, just so that it can use them to parse out version information from other tools that you wouldn't need to use.

Basically, when you curl|bash, you're assuming that the other person is trustworthy and knows what they're doing, and while you can make the determination of #1 fairly quickly, it takes a lot more time and energy to determine #2.


I'm in no way advocating for curl | bash, but one way to defeat the detection would be to use "sponge" from moreutils[1].

    curl "https://example.com/install" | sponge | bash
[1] https://manpages.debian.org/testing/moreutils/sponge.1.en.ht...


> This allows constructing pipelines that read from and write to the same file.

TIL, how did I not know about sponge before?


sponge rules, but you need to be careful. Let's say you wanted to search/replace a file with sed and sponge; then this does not work:

   cat file.txt | sed "s/a/b/g" | sponge > file.txt
Because bash opens file.txt for writing and truncates it immediately, so "cat file.txt" won't output anything. You have to do this for it to work:

   cat file.txt | sed "s/a/b/g" | sponge file.txt
sponge won't open the file for writing until it has sponged everything up. If you remember this, sponge is an awesome tool in the UNIX toolbox.


Of course with sed you'd use the -i flag.


Unless you're on MacOS, in which case it's -i''. I always have to rewrite cross platform scripts at least once.


Right. The sed -i option is incompatible between MacOS & Linux, and it isn't in the POSIX standard. The POSIX folks don't want to add it because it's possible to do work without it, even though no one wants to. Maybe sponge should be in the POSIX standard.


I ran into this on FreeBSD. I do have gnu sed and other gnu utilities installed on my Mac to help me feel more at home.


This doesn't seem to have any real advantage over just using an intermediate file.


Of course you can always use an intermediate file, the point is that sponge is a very convenient way to just do it in a single command, without having to use intermediate files.

You don't have to use pipes either, you know. You can just save the output from every command to an intermediate file, then run the next command with that file as input.


Check out the other moreutils utilities as well, it's a gem. vidir is especially useful with vim, but you can use it with any other text editor.


it's like half of tee


I don't think tee buffers all its input before writing to its outputs.


Anyone dare to incur the wrath of the Rust contingent by linking to the elephant in the room? OK then, I will

https://www.rust-lang.org/tools/install


The suggestion of the website is certainly not great, but note that:

* It has been improved a little by the https and tls 1.2 parameters compared to earlier versions that didn't have these params

* You are not required to install rustup that way. There are rustup packages available in many distros today: Arch has one, NixOS has one. Debian is dragging their feet [0], but they aren't entirely opposed to the idea, so maybe something will change about this in the future.

* Even if the download process is secured, rustup doesn't really verify any signatures; it just downloads what's inside the manifest. So you can download potentially malicious code if the S3 bucket that rustup downloads the compiler from is hacked.

* Even if the entire process of obtaining rustc is secured, the standard way of using rust + cargo basically gives maintainers access to your local computer through build.rs and other methods. You can sandbox the cargo invocation, but who does that?

[0]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=955208


What, specifically, is wrong with it?

The article treats this as obviously bad, but I think it needs to be expanded upon; it's not obvious.


`curl | bash` downloads an installation script and runs it as it's being downloaded, with no opportunity to read the script and ensure it does nothing malicious, and has no inadvertent bugs that could cause damage.


As opposed to downloading a binary and running it? We don't verify what the binaries we download contain, so why do it for a bash script?


Binaries we download and run (as well as packages from package managers) are generally cryptographically signed by people you trust. Shell scripts you curl into bash are not. It is generally MUCH easier to compromise a website and insert your own malicious script than it is to compromise a developer's secret key, which is usually stored much more securely.

Put it like this: if an attacker compromised Rust's website and put in their own malicious Rust installer, macOS and Windows would both show a very scary warning that's like "This is unsigned! Don't run this!" (macOS would even refuse to run it unless you did the right-click to open trick). A Linux package manager would refuse to install such a package outright, unless you --forced it. Not so with curl-to-bash, it would just silently compromise your computer.

Basically, if you believe that code signing is a good thing for security (and I think we all do), curl-to-bash is awful security practice, and you should manually review the script.


> Binaries we download and run (as well as packages from package managers) are generally cryptographically signed by people you trust. Shell scripts you curl into bash are not.

There is no reason in principle why a shell could not implement digital signatures. bash could have a "--require-signature" option where it looks for a GPG signature appended to the end of the script, and refuses to run the script if the signature is missing or the signer is untrusted.

(No idea if the Bash maintainers would be willing to accept a patch adding such a feature if someone were to write one; but, signed scripts is something supported by PowerShell, although most people's experience with it is limited to turning off the default setting which requires all scripts to be signed.)
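You can approximate this by hand today with a detached signature, assuming the author publishes one and you already trust their key (URLs hypothetical):

    curl -fsSLO https://example.com/install.sh
    curl -fsSLO https://example.com/install.sh.asc
    gpg --verify install.sh.asc install.sh && bash install.sh   # only run if the signature checks out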

A related question – what's the difference between downloading an unsigned script over HTTPS versus downloading a signed script (whether over HTTPS or just plain HTTP)? Ideally, the code signing is done offline, so an attacker that compromises the HTTPS server gets the HTTPS private key but not the code signing private key. However, I suspect that ideal often isn't actually obtained; offline code signing can be a pain for CI. If someone has a CI pipeline which signs the code and then deploys it to the HTTPS server, then all an attacker has to do is compromise access to that CI pipeline and neither HTTPS nor code signing will stop them.


> There is no reason in principle why a shell could not implement digital signatures.

They currently don't, though, which is the point :)

The reason in principle this might not work is that different distros and os's trust different keys. Fedora trusts different keys than Ubuntu, than Arch, than macOS, etc.

> Ideally, the code signing is done offline, so an attacker that compromises the HTTPS server gets the HTTPS private key but not the code signing private key.

I don't think this is necessarily the case. Even if you gain access to the server, HTTPS keys are usually stored in such a way that only root can read them, right? So if you only compromised a non-root account, you wouldn't necessarily be able to read them (I might be wrong about this, I don't do this kind of thing professionally).

Also, you could imagine that the script is not stored as a file, but in a database or something, and then you could compromise it with SQL injection or a similar technique. That would allow you to change the script without any access to HTTPS keys.

> However, I suspect that ideal often isn't actually obtained; offline code signing can be a pain for CI.

You're probably right about this, but this is a great example of why you shouldn't use the same key for both code signing and HTTPS.


> The reason in principle this might not work is that different distros and os's trust different keys. Fedora trusts different keys than Ubuntu, than Arch, than macOS, etc.

Every distro has a keystore / certificate store to which root can add whatever keys/certificates they like. The only difference is in which keys/certificates are present OOTB.

> I don't think this is necessarily the case. Even if you gain access to the server, HTTPS keys are usually stored in such a way that only root can read them, right? So if you only compromised a non-root account, you wouldn't necessarily be able to read them (I might be wrong about this, I don't do this kind of thing professionally).

The purpose of the key is to vouch for the integrity of the data. If I can change the data the key vouches for at will, I've effectively compromised the key, even if I never gain access to the bytes of the key itself.

> You're probably right about this, but this is a great example of why you shouldn't use the same key for both code signing and HTTPS.

If the CI pipeline has privilege both to (1) sign stuff with the code signing key (2) upload stuff to the HTTPS server, then even if the code signing key and the HTTPS key are different keys (which is the norm), compromising the CI pipeline is enough to effectively compromise both of them.


Pardon my ignorance, but couldn't we just cryptographically sign bash scripts as well?

Perhaps modify the installation script, something like `curl | some-verification-tool | bash`


You can. PowerShell already has a 15-year precedent of signed scripts - you generate a signature and embed it in the script in a specific way, and the shell can be configured to only run scripts if their signature is valid.

(PowerShell script signing is based on standard Windows codesigning certificates, but of course this hypothetical bash script signature verifier can use GPG keys instead.)


Those who do not understand package managers are doomed to rewrite them. Badly.


Not easily, and probably not in a way that would work on all different systems (different distros/os's trust different keys).


Not that it would fix anything, but wouldn't it be possible to:

- Store the install script on www.xyz.com/install.sh

- Store the hash of the install script on www.abc.com/hash.dat

Then in the installation command:

- curl the install script from www.xyz.com/install.sh

- hash it

- curl the hash from www.abc.com/hash.dat

- compare the computed hash with the "curl'd hash"

- only if they match, pipe www.xyz.com/install.sh to bash

In that case, a malicious actor would need to hack at least two locations to be successful.
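A minimal sketch of that flow (the URLs are the placeholders from the example above):

    curl -fsSL https://www.xyz.com/install.sh -o install.sh
    expected="$(curl -fsSL https://www.abc.com/hash.dat)"
    actual="$(sha256sum install.sh | cut -d' ' -f1)"
    [ "$expected" = "$actual" ] && bash install.sh   # only run if both sources agree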


The malicious actor would simply edit the install script to comment out the checksum verification step, since they already own the script.


Ah yes...good thing I don’t work in IT security...


No worries, I've made the same mistake and everyone in software security probably makes that mistake at some point.


Except if you're using Homebrew where nothing is signed, so signing is not as ubiquitous as you seem to think.

But signing is largely a moot point, as people will follow the instructions on the website, where a clever attacker would simply create their own cryptographic keys, sign with them, and provide instructions for adding them to your keychain on the compromised website. Public-key crypto does not solve establishing trust in an actor. It can only do so through delegation, which, as we have seen with the deprecation of EV certificates (CAs no longer being trusted to establish the identity of legal persons), is not something that is easy to get right.

Certainly it is better to trust the maintainers of your operating system/package manager than a random website. However, they only sign things where they have done due diligence on the project, which is a slow process - too slow for projects with faster release cycles such as Docker or Rust.


> Public key crypto does not solve establishing trust for an actor. It can only do so through delegation which as we have seen with the deprecation of EV certificates (CAs no longer being trusted to establish the identity of legal persons) is not something that is easy to get right.

Public-key crypto can help carry the established trust. Say you track some developer's work and the dev behaves consistently in a trustworthy manner; their signing key can be used to help carry this trust forward, so that after a while you don't need to keep checking and can just rely on the assumption that the dev is probably honest going forward, plus crypto to verify you're really using this developer's output.

No need for delegation. Delegation doesn't solve trust anyway, only identity verification.


The biggest issue is not bash script vs. binary. It is that “curl | bash” does not save the script locally.

If something goes wrong, you can’t examine the script afterwards to help figure out what happened and how to recover.


I suppose the difference is, if people do verify the script before they run it, this could bypass that. But I think those people would download it, inspect it, and then run the downloaded script.


I think it would take next to no effort to hide something in a bash script in such a way that it passed casual inspection even by an expert. See code obfuscation competitions for examples.


The whole point of the installation script is that it downloads and installs binaries obtained from a remote server though. You already need to trust the server if you're going to do that, don't you?


Just like if I'd done `curl > install.sh; bash install.sh` without reading install.sh or installed a binary package, both of which are done every day by people with work to do and without the time, skill and inclination to audit millions of lines of source code.


or curl --output instead of using stdout


In the meantime all the distributions (at least the major ones) have Rust in their official repositories.


Unless you are on a rolling release distro, you can run into problems with the rust compiler package being out of date, as the rust ecosystem quickly starts requiring newer versions. 500 crates for a medium sized project are not unheard of, and if one of the maintainers decides to adopt a new feature, you run into issues.

That's why in order to write Rust, you'll likely need rustup, which is packaged on way less distros than the rust compiler is. Debian for example doesn't have an official rustup package.


> That's why in order to write Rust, you'll likely need rustup

It seems that way, which is a big disincentive to use Rust at all.


The same issues exist for other languages and runtimes so I'm not sure why Rust having a standard solution is a disincentive. How many ways are there to ruin a python environment?


The issue is worse with Rust, which is still a young and rapidly-changing language. Debian stable currently packages rustc 1.41.1 (February 2020), which is 11 releases behind the most recent 1.48.0 release. The older compiler's differences after 10 months include:

* doesn't have subslice patterns

* has `Error::description`, which has since been deprecated

* doesn't have x86 CPU feature detection

* lacks fix for unsound typecasts between integers and floats

* lacks fix for incomplete constant propagation

* doesn't let you use conditionals, loops, match, &&, or || in constant fns, or cast arrays to slices in constant fns

* has less-helpful error messages and backtraces

* doesn't let you implement traits on arrays with lengths >32

* has worse type inference, this (valid since 1.43.0) code doesn't compile because 0.0 and &0.0 are f64s: let n: f32 = 0.0 + &0.0;

* requires you to import standard libraries to use predefined numeric constants (like MAXINT or NaN)

* doesn't support Control Flow Guard on Windows

* doesn't generate documentation links from classpaths (you have to write relative file paths instead)

and numerous libraries being pre-stable.

For code that you're writing yourself, not having these features is mostly just a quality-of-life issue: you'll have to do things that you wouldn't have to do if you had the latest compiler. For code that you're collaborating on, having a 10-months-older compiler can be a showstopper: the code might just not compile at all, and very likely the only fix will be to uninstall your system-packaged rustc and install the version from rustup.

Contrast this with C/C++ development, where your system's pre-packaged compilers are much more likely to be sufficiently up-to-date. These languages are more mature than Rust: they change slower, and new features are adopted by programmers more gradually. For example, the Linux kernel only requires GCC 4.9 [1], which was released in 2013. Debian stable supplies GCC 8.3, which will probably be able to compile unmodified kernel source code for many years without being updated.

[1] https://www.kernel.org/doc/html/latest/process/changes.html


I can contrast with C/C++ development, where C++17/20 are not available by default on many distros because package managers lag so far behind the curve and new projects get tied to outdated versions.

I just don't really see this as a problem - Rust really isn't that much of a moving target, and the benefit is there is one solution to the problem: use Rustup to manage your Rust installation.

That is significantly more robust than "use your package manager."


Or avoid distributions like the former CentOS.


Yikes.


Those advocating we inspect the downloaded script surely wouldn't rely on plain old "cat" would they? :) https://www.infosecmatter.com/terminal-escape-injection/


Pfft, that would be a Useless Use of Cat[1].

[1]http://porkmail.org/era/unix/award.html#cat


Many examples of "useless uses of cat" are useful IMO because they improve clarity.

  cat some.file | a_process | another_process > output.file
is clearer than

  a_process < some.file | another_process > output.file
IMO because it maintains a simple left-to-right flow. I can mentally translate between the two without thinking about it, as no doubt can you, but the clarity could be useful to others (including tired me after a long day!).

Yes it wastes a bit of processing time and memory, but if what you are doing is going to significantly benefit from that small saving is the shell the right place to be doing it anyway?


This argument has been done to death. For the record yes, I agree with all of that and secretly I use it all the time.

I only invoked the "useless use of cat" demon because in this case it makes an interesting difference: by default, less doesn't evaluate the escape sequences which make this attack work.


You can put the redirection before the command name, which keeps the same left to right flow.


Though the arrow still points the wrong way, if I'm being fussy.

I was also about to suggest that was shell specific so not available everywhere, but a quick test says it is supported by /bin/sh at least so might be POSIX and therefore widely available.


Does it matter if it's bad form if every terminal noob does it anyway? An attacker using this vector certainly wouldn't care - so long as the attack works.

There's definitely a place for poking fun at Unix noobs - I have learnt better tools than cat because of it - but it is not terribly relevant here.


It's not even a noob thing. I've been using *NIX for 15+ years and I still don't subscribe to this "useless use of cat" pedantry.

Without cat I need to use redirections or find the parameter for each command I use, in order to get things from a file.

cat bla | command1 | command2

is much nicer to write and read, much more logical and much easier to modify if needed.

"Useless use of cat" is a micro-optimization. Use if you want or need, don't turn it into dogma or even better, newbie shaming (yay for gatekeepers!).


It's actually one of the few cases where there is a functional difference between using cat and the alternative. Using less instead wouldn't execute the escape sequences.


Unless the user has less aliased to `less -r`, which many users do.

I wouldn't be surprised if there existed some obscure distro that ships this alias in the default bashrc



Slightly related, fun with copy and pasting into terminal: copy&paste&getpwned - https://web.archive.org/web/20170421210256/http://hysteria.c...


I already knew it was dangerous, but TIL. IIRC there is an online service (I can't remember the name) that detects if a script is malicious - does it work in this case?


That was my first thought too. Someone can return legit script when it is not "curl | bash" and return a malicious one otherwise.


That's probably why you should:

- download the bash script

- inspect the script / add a 'set -x' for tracing

- then run (the downloaded script) with log output somewhere.
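In shell terms, roughly (URL hypothetical):

    curl -fsSL https://example.com/install -o install.sh
    less install.sh                          # read it; add 'set -x' near the top if you want tracing
    bash install.sh 2>&1 | tee install.log   # keep the output around for later inspection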


1. Carefully inspect the 15 lines of the installer.

2. Run the million lines of code that it installed, without inspecting them.


The difference between a malicious and a legit bash script can be just a couple of characters. Eg. the script can have a deliberate injection vulnerability in a variable expansion...

If people commonly inspected bash scripts they downloaded, attackers would just write the script more carefully to make sure 15 minutes reading the script wouldn't find anything bad going on.


Fortunately it’s not common, so they don’t. If it became common and they did, I could inspect the script using automated tools that check for obfuscation, etc.

I think it’s not about eliminating, but making a little better. With lots of this stuff, it’s just about being faster than the other guys that the bear is chasing.

It’s kind of moot because all this is solved by “don’t download and run scripts from strangers.” Then you just have to worry about someone rooting homebrew.

I don’t like this line of thought because it ultimately gets us to a Microsoft world where everything is signed and there’s still tons of crap getting through.

I’d rather have it where scripts can be individually assessed and run without lots of expensive stuff in the middle. Even if that means some risk.


If you want to enable tracing you can just

  bash -x


set -x won't do much though. All it will tell you is what it did, but by the time you realise, the damage could already be done.


that's why you download and inspect the script first.

and you run the downloaded script, never curl|bash


And then just leave it to install whatever binary it did.


Cool concept. But "curl | less" will display those null chars, so I'd use many shell comment lines instead to fill the TCP buffers. Which, again, will fill the reader's page and make the pager stop for them to read, which might take more than 10 seconds. Seems it's not _that_ easy to be nasty?


You could as well send a different response the second time some IP+user-agent visits the resource. That would be almost as effective.

Just download the script, inspect the code, run the code.


Sounds like you could have bash sleep long enough for your server to time out and close the connection before the rest of the script gets downloaded.


One can also run `curl | pv > tmp.sh` to spot this pattern.


Are there any good http client classifier tools for servers?


I have spent a stupid amount of time trying to come up with a somewhat more secure approach and wound up with `scurl` which requires a cypher, a checksum, a url that points to an immutable file, a target path, always writes the output to disk, and will not write to the target path if the checksum does not match [0]. For an extra level of paranoia add a check to ensure that the target directory exists and is only writeable by the user currently running the command. As far as I can tell there isn't a simpler way to do this that doesn't have some hidden pitfall, and I'm sure even this way has pitfalls, because I only just realized that I had not accounted for the race condition on the target path. There are variants of this which can stream, and there was even some work to try to integrate this functionality into curl itself [1], but they are 1) not portable 2) more complex, longer, and harder to audit and 3) don't protect you anyway. Why not? Because you can't know that a file doesn't match until you read the last byte, at which point all of those bytes will already have been piped into bash. Oops!

I have also spent a stupid amount of time trying to figure out a reasonable workflow that streamlines auditing and updating of the immutable links used in this pattern [2]. The same issue is faced by anyone who pins their dependencies. Auditing dependencies is a huge bottleneck, to the point where it is often neglected entirely and/or has the effect of preventing updating dependencies altogether.

Immutability of the data behind the link is critical to prevent checksum failures due to legitimate changes to the underlying data, but it also means that you can't take advantage of being able to point users to a single unchanging url. For one type of security the name has to change with the content. For usability and another type of security it is vital to be able to quickly push updates. If you have users with an immutable url pointing to a bad version, the only thing you can do is take down that version, but you can only do that if you control the server. If you point to a raw github url, that bad version will never go away. Thus it is critical to have tooling in place that makes it easier for users (aka other developers) to find and update those immutable urls in a way that minimizes the audit load.

The sandbox won't save you this time!

0. https://github.com/tgbugs/orgstrap/blob/master/get-emacs.org...

1. https://github.com/curl/curl/issues/1399

2. https://github.com/tgbugs/orgstrap/blob/master/reval.org

3. https://blogs.sciencemag.org/pipeline/archives/2008/02/26/sa...


I don't install anything that comes as a shell-script download; this speaks volumes about the security of the 'software' itself.


So it is safer to use `sh -c "$(curl https://whatever.com/install.sh)"`?


Only mildly so. This hides from the server the fact that you are not checking what is received, so prevents the server from adapting its results based on that, but still allows a server to just unconditionally serve a malicious script. The safe way is to save to a file, actually inspect the file to make sure there is nothing malicious in there, then run that file.


No. It’s safer to download the file, inspect it manually, and then run it.



