"So that a truncated partial download doesn't end up executing half a script" (tailscale.com)
78 points by obi1kenobi 21 days ago | 81 comments



The shell script starts with the following comment:

    # All the code is wrapped in a main function that gets called at the
    # bottom of the file, so that a truncated partial download doesn't end
    # up executing half a script.
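
The pattern itself is tiny. A minimal sketch of the idea (not Tailscale's actual script):

    #!/bin/sh

    main() {
        # all of the real installation work lives in here
        echo "installing..."
    }

    # Nothing has executed yet. If the download was cut off anywhere above
    # this line, the shell either hits EOF after some unused function
    # definitions or reports a syntax error; either way, nothing half-runs.
    main "$@"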


Most of these scripts have been doing that for years.


I'm sure of it. And yet based on this rapidly getting to the front page, it seems like many of us are part of today's lucky ten thousand: https://xkcd.com/1053/

Might be more than ten thousand, even, based on the reactions :)


I had to do stuff like this over COM ports using systems made before PCs.


Another trick I've also seen is to enclose the whole script in a code block

    {
        # ... the entire script body goes here ...
        echo "doing the actual work"
    }
That way, if the file is not fully loaded, the block is never closed and the script fails to parse instead of running.


A simple (not perfect) approach could be to have a comment containing a "unique" string on the last line and grep for it as the first check to ensure that the entire script has downloaded.

    #!/usr/bin/env bash
    
    set -u
    
    grep -wq '^# asfewdq42d3@asd$' "$0"
    [ $? -ne 0 ] \
        && echo "script is not complete - re-download" \
        && exit 1
    
    echo "script is complete"
    
    # asfewdq42d3@asd


This doesn't work for something like `curl ... | sh`, since `$0` is then the shell itself rather than the downloaded script.


I've seen scripts (self-extracting archives for Linux, for example) that checksum themselves either by some trickery, or just ignoring the first line after the shebang (which itself is the computed checksum of the rest of the file).


Incorporating an MD5 quine into a shellscript would be funny.


And a sha256 quine would be terrifying. :)


The problem is, the wrong party is doing the check (from a security point of view, not integrity).

When we download a script from a remote domain we don't trust, we have to validate its checksum against the known one; we can't leave that to the script, which we don't trust.


In this case we’re specifically talking about the possibility of a truncated script from a trusted source


99% of the time you are downloading from a domain that you do trust. This check is to detect corruption, not malice.

But yes, if you were downloading from an untrusted mirror you would want to check the signature or trusted hash before running the script at all.


That can be useful against download corruption but wouldn't do much against an actual attack (in this case, the attacker can just update the checksum).


This probably means you can edit the script while it's running without it falling over confusingly. Might cargo cult this pattern - I'm very prone to editing a build.sh while it runs.


Shot myself in the foot so many times with this!


    #!/bin/bash
    SHA512="485fe3502978ad95e99f865756fd64c729820d15884fc735039b76de1b5459d32f8fadd050b66daf80d929d1082ad8729620925fb434bb09455304a639c9bc87"
    # This line and everything later gets SHA512'ed and put in the above line.
    # To generate the sha512 simply: tail -n +3 [SCRIPTNAME].sh | sha512sum
    check_sha512() {
        # Compute the SHA512 hash of the script excluding the first two lines
        local current_sha=$(tail -n +3 "$0" | sha512sum | awk '{print $1}')
    
        # Compare the computed SHA512 hash with the predefined one
        if [[ "$current_sha" != "$SHA512" ]]; then
            echo "Error: SHA512 hash does not match!"
            exit 1
        fi
    }
    
    # Call the function to perform the hash check
    check_sha512
    
    # Rest of your script starts here
    echo "Script execution continues..."
The idea is simple: if the first line (#!/bin/bash) gets mangled, the script probably won't execute at all. If the second line gets mangled, then obviously the SHA512 comparison won't work (variable name or value).

Finally, if the rest of the script gets mangled or truncated it won't SHA512 the same, and the check will exit with an error.

For bonus points you can add a check that the first line of the script is exactly "#!/bin/bash" as well.
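
Something along those lines (a sketch; note it assumes the script is actually saved to a file, so like the other self-checks it won't help with `curl | sh`):

    # abort unless the very first line of this file is exactly the expected shebang
    head -n 1 "$0" | grep -qxF '#!/bin/bash' || { echo "shebang check failed" >&2; exit 1; }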


If the file is truncated after the function's closing brace, it will succeed but do nothing.

If the file is truncated in the middle of the word `check_sha512` it will try to execute a hopefully-not-existing command.

Wrapping in simple { braces } should fix this - if the brace is missing, you get a syntax error, and if present, you can execute the full thing, regardless of whether a trailing newline is available. This is admittedly bash-specific, so won't work for the linked script, but (subshell) doesn't cause too many problems

Using a function and checking the SHA don't really add anything after these fixes.

Checking the shebang is hostile to environments that install bash elsewhere.

An almost-working possibility would be:

  exec some-interpreter -c 'commands' "$0" "$@" ""
which will fail if the second ' is missing. The child interpreter can then check for later truncation by checking that at least 2 arguments were passed and the last one is an empty string. However, this is still prone to truncation before the -c.
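
A rough bash sketch of that idea, purely illustrative:

    exec bash -c '
        # "$0" "$@" "" were appended after the -c string. If the trailing ""
        # (or more) was lost to truncation, the last argument is either
        # missing entirely or non-empty, and we bail out.
        if [ "$#" -lt 1 ] || [ -n "${!#}" ]; then
            echo "install script looks truncated" >&2
            exit 1
        fi
        echo "arguments arrived intact, continuing..."
    ' "$0" "$@" ""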


A serious question for any Linux-heads here, no insult intended.

How is it possible that there are ELEVEN different possible package managers that need to be supported by an installation script like this?

I can understand that some divergences in philosophical or concrete requirements could lead to two, three, or four opinionated varieties, but ELEVEN?

Does that mean that if I want to write an app that runs on Linux I should also be seeking to support 11 package managers? Or is there something unique about tailscale that would necessitate it?

edit: Thank you for the responses so far, but no one has yet answered the core question: WHY are there eleven of them?


Tailscale may be unique as a network appliance that users like to run on a lot of different equipment.

Used GPT-3.5 to summarize these and tried to edit the response for brevity. Pardon any hallucinations here. Looks like it's mostly just different OSs all running their own software publishing/distribution portals. Lot of NIH maybe.

""" 1. apt: Debian, Ubuntu.

2. yum: Red Hat / replaced by DNF.

3. dnf (Dandified YUM): Red Hat, Fedora / successor to yum.

4. tdnf (Tiny DNF): lightweight DNF for minimalist distros.

5. zypper: SUSE/openSUSE.

6. pacman: Arch Linux, Manjaro.

7. pkg: FreeBSD.

8. apk: Alpine Linux.

9. xbps: Void Linux.

10. emerge: Gentoo Linux.

11. appstore: Apple / iOS, macOS. """


It's like anything else, it depends on how many people you want to get. Apt alone will get you 50%. Add pacman and that's another 30%. Yum is another 15%. Nix is another 2%. Foo is another 0.3%, bar 0.1%, and so on and so on. (Numbers are made up).

You don't have to do anything, it's just about how convenient you want to make it.


Surely pacman is nowhere near as popular as yum/dnf.


> edit: Thank you for the responses so far, but noone has yet answered the core question: WHY are there eleven of them?

Because Linux is "just" a kernel that happens to be used by different OSes. Linux is just the program that runs binaries on your computer; it’s not a package manager or an OS.

There is no official cooperation between the different Linux OSes. The people developing package management at SUSE or Red Hat (which are commercial companies) aren’t the same people developing APT at Debian. Odds are that they don’t even know each other.

Look, Android is based on Linux but they have their own package management because most of the existing ones weren’t compatible with what they wanted to achieve.

It’s the same with other vendors: while package managers look like they are mostly doing the same thing, they all had their own requirements that justified their creation.

Anyway, as a developer you mostly don’t have to package your app yourself. It’s the job of the distribution developers themselves (in fact that’s most of the work in making a "distribution": they package and distribute software), or, in organizations like SUSE or Arch, it can also be done by the community, which allows for more up-to-date packages.


It turns out my actual question is "Do we really need this many different Linux distributions?", and I'm sure this will sound even more wrong, but it really feels like the answer is no.

OK different ones for different hardware devices, sure.

But I asked ChatGPT about the difference between openSUSE and Debian (since you mentioned those), and everything listed seems like it could just be variants within the same OS, not a fundamentally different OS.

> Package Management:
> openSUSE uses the Zypper package manager and supports the RPM package format.
> Debian uses the APT (Advanced Package Tool) package manager and supports the DEB package format.

This comes back to my original question - they need different package managers because the OSes are different; the OSes aren't different because they wanted different package managers.

> Release Model:
> openSUSE typically follows a fixed release model with regular releases of a stable version, like openSUSE Leap, which has a predictable release cycle.
> Debian follows a more flexible "when it's ready" release model. There are three main branches: stable, testing, and unstable.

you could have unstable, testing, stable, and then on top of that have a predictable stable release cycle. These aren't incompatible with each other.

> Philosophy:
> openSUSE is known for its strong integration with the open-source community and its adherence to the principles of the Free Software Movement. It emphasizes stability and ease of use.
> Debian is known for its commitment to free software principles, as outlined in the Debian Free Software Guidelines (DFSG). It prioritizes stability, security, and freedom.

This just feels like a ChatGPT hallucination. It sounds like both are focused on Free Software. That said, I suppose no one is stopped from creating a Linux OS extremely focused on FSM, a commercial one not at all interested in FSM, and one somewhere in between.

> Default Desktop Environment:
> openSUSE offers various desktop environments, including KDE Plasma and GNOME, but its default desktop environment may vary depending on the edition (Leap or Tumbleweed).
> Debian offers a wide range of desktop environments, including GNOME, KDE Plasma, Xfce, LXQt, and more. Its default desktop environment is GNOME.

Like the package manager, inverted cause and effect.

> Community and Support:
> Both distributions have active and supportive communities, offering forums, mailing lists, documentation, and other resources for users seeking help or guidance.

> System Configuration:
> openSUSE uses YaST (Yet Another Setup Tool), a comprehensive system configuration tool that allows users to manage various aspects of the system through a graphical interface.
> Debian relies more on manual configuration files and command-line tools for system administration, although there are also some graphical tools available.

I wonder what YaST uses underneath, I bet it's a series of...configuration files :)

No reason why both couldn't work on the same OS.


If you write an app that runs on Linux, you should support flatpak, and (for bonus points) nix.

The rest should be done by the distros' maintainers.


For something like Tailscale, running as a Flatpak would require the user to relax the sandbox boundaries, assuming it would work at all!

It's probably easier to script up the installation via package manager. You also get the benefit of upgrades along with the rest of the system.

Furthermore, I haven't seen a single instance of Flatpak being used to install applications on headless servers.

I also don't know many sysadmins who would be happy that each application they install in their servers will come with a full set of dependencies rather than being dynamically linked to the base system.


Why? Because there is. Every package repository has a community of maintainers who make sure a package is compatible with their OS/distribution. That's just the way each linux distro solves this problem. It's worked out pretty well so far, despite the sheer number of redundant packages.

Why not? This kind of script is a generally bad idea, because it's hoarding the responsibility of package maintenance. The better solution is to maintain a working package (better yet, convince someone to maintain it for you) for each distro's public repository, good documentation on their wiki, and working links to that documentation in your readme.

Why not not? Despite being a bad idea, it's not a hard idea. The implementation of this script is likely easier to manage than doing it the proper way.

--

Now if you really want to feel upset, start looking at build systems...


Normally it's not your job to package it yourself; the distro maintainers do that.


If you're writing proprietary software and trying to get it out there so you can get paying users, it's your job.


Because there is, unfortunately, no "Linux".

Sure, there is the Linux kernel in different versions and patch states, but everything else, including how to manage software (the package manager), is something the distribution decides.

As there is no standard, and for historical reasons, different distributions chose different package managers.

If you want to support Linux you normally decide which distributions you want to support and, more importantly, which versions of them.

The big ones out there are probably Ubuntu, Fedora and Arch.

Then you can decide between building packages for the different package managers or just building a static/dynamic binary that works across those distros.

You can also use Flatpak and Snap, which make it easier to support different versions of the same distribution, but you run in a sandbox and AFAIK lower-level access to the graphics stack (games) is a mess.

Yeah, it is a mess, but at least most distributions have the same service/boot-up manager.


The only proper approach is to come up with a twelfth package manager that will encompass all the other eleven.

https://xkcd.com/927/


I read TFA. Why would a truncated partial download happen and still run the script?


I believe 'curl XXX | sh' will start executing any complete shell statements in the input, before the full script is downloaded.
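
You can see the streaming behaviour locally with something like:

    # the first echo runs immediately, well before the "download" has finished
    { echo 'echo started'; sleep 5; echo 'echo finished'; } | sh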


The most obvious answer is `curl | sh`. But also perhaps a network blip interrupting curl/wget, but the user failing to notice and going ahead and executing the file anyway.


A truncated download might happen for all sorts of reasons, like your internet connection dropping while you download the script. If you don't notice you might accidentally run an incomplete script and leave your system in some broken or at least confusing state. They wrapped everything in a main function to prevent that from happening


Browsers typically emit downloads to temporary files until they are complete, then rename them into the final location, to prevent this kind of issue.


Tools like wget or curl often do not. And the shell doesn't when doing something like `curl ... >myscript.sh`.
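
If you want the browser-style behaviour by hand, something like this works (placeholder URL):

    # download to a temporary name, and only rename into place if curl succeeded
    curl -fsSL https://example.com/install.sh -o install.sh.part \
        && mv install.sh.part install.sh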


Sure. Just pointing out that there's a good reason browsers do this.


I mean, could you put the main function at the top of the script, so that it calls later definitions?

The problem is that the script could be truncated in such a way that it executes successfully. It defines a bunch of functions and then quits.

If you're not checking for the success or failure of the download, you're probably not checking for the success or failure of the script; something is just going to assume the script worked.


If only there was a way to transactionally run shell scripts such that if they don't complete fully, the changes are automatically reverted.

Edit: cue the HN responses to use nix, and other solutions


Make curl | sh automatically upgrade the user's system to nix?


Nix, as the name implies, is a moissanite, a fake, bullshit.


Potentially sounds like a job for a BTRFS snapshot?
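
Roughly, as a sketch (assuming a btrfs root and somewhere like /.snapshots to park the snapshot):

    # take a read-only snapshot of the root subvolume before installing
    sudo btrfs subvolume snapshot -r / /.snapshots/pre-install
    # ...run the installer...
    # rolling back means restoring from that snapshot, which for the root
    # subvolume generally needs a reboot into something else first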


Not really... although there have been attempts at doing this (e.g., Solaris) I've never seen it work out well for systems which could have multiple concurrent users. The main problem is that there are unrelated changes that could occur while performing an installation, for example adding a user while (but not part of) installing a package is a valid configuration right now but wouldn't be in the described scenario.

Solaris handled this by having a config file in which you could specify which files to copy between boot environments. Don't have /var/mail, /etc/passwd, /etc/shadow, /etc/hosts, ... in a different FS? Better remember to copy them, or you lose your emails, users, hosts, etc.

The problem with this kind of failure mode is that it's silent and generally irreversible. There generally aren't tools to MERGE files that conflicted later on if you discover the issue while the old boot environment snapshot is still around.

On the other hand, this is sort of how Docker (and Solaris zones) work -- and why you can't upgrade a container in a container-like way, you must replace the entire container (i.e., any upper-COW is lost, unless you export it and build something new atop it outside of any sort of reconciled process).

On the other other hand, I've actually used BtrFS snapshots for exactly this successfully in exactly one case: a read-only volume containing my OS files (12 files total: kernel, kernel debug, several initramfs) for PXELINUX/EXTLINUX booting, which can be atomically upgraded. Since it was read-only and the only operation supported on that volume was replacing the files, and it only had those 12 files, it was safe to do so.


Idempotence isn't guaranteed either


...isn't that a common pattern? I'm pretty sure most install scripts I download and run already do that, though I don't run those very often.


Don't pipe curl/wget a script to a shell without reading what you've downloaded. This should be common sense. Do `wget $url; most install.sh` and only if you're satisfied with what you read, execute `sh install.sh`.


Well, you're going to run the thing the script downloaded with exactly the same user and privileges as the script you're running. Unless you're doing a full audit on all the code and not only a cursory look on the installation script, this looks to me more like security theatre.


Because if looking at the script you've downloaded for obvious errors or issues is not 100% effective, then it's only theater. Is that what you're saying?


It is an awful habit of some open source projects to have the official way to install their software be to execute a shell script from the Internet. Nobody reads it, as they are usually quite complex and given the xz situation a well crafted shell script can seem harmless while being very dangerous.


How is this different to Windows users downloading a .exe file and running it?


Short answer: it's not, and that's the problem.

Long answer: Windows has a few conventions that make it "better", like a predictable place to install your files, a global authoritative "registry", and never having dynamically linked (and separately installed) dependencies. By sheer virtue of not having a good package manager, Windows has avoided dependency hell. That does, however, still leave it without the utility of a package manager.


Windows checks the code signing certificate of the exe, and if it isn't present and the binary not widely used shows you a big scary warning to discourage you from running it. And if the exe is signed that at least tells you where to send the police after you were infected.

Of course open source projects rarely sign their exes because those certificates are expensive ($300+/year).


> Windows checks the code signing certificate of the exe, and if it isn't present and the binary not widely used shows you a big scary warning to discourage you from running it.

Actually, even if the file is correctly signed but is new, users will still see the warning banners. (Unless using the more expensive EV Code Signing certs.)

> Of course open source projects rarely sign their exes because those certificates are expensive ($300+/year).

I'm not sure where the $300/year comes from, but one can get valid certs for less than 50 EUR a year (https://shop.certum.eu/open-source-code-signing-on-simplysig...). I got a physical-key one for 65 EUR and it worked just fine.

If the open source project is widely recognizable I'd suggest contacting https://signpath.org/ to get code signing for free (as in beer) via simple Github Action workflow.


That's great to see that they are so cheap now for open source work. I must have remembered the price of EV certificates (which are handy for completely getting rid of the warning screen and for getting Windows Defender off your back)


I skim little python or bash scripts after downloading them. Therefore, there’s at least one person who does it… sometimes. Nobody checks an exe!

Mostly it is the same though shrug. There thankfully don’t seem to be many hackers going after the niche of desktop Linux users.


It's better because you almost never need to give root permissions to the installer, unlike on Windows


Yeah, usually it is just your user account: https://xkcd.com/1200/


Didn't the xz situation kinda prove that even reading the script is probably not gonna do you a lot of good if you're up against someone smart?


Exactly. Also, if you're already thinking in adversarial terms when using something, why would you use the thing to begin with?

Maybe I'm too naïve.


> given the xz situation a well crafted shell script can seem harmless while being very dangerous

That’s exactly what they are saying.


The xz situation proved the opposite: if you're up against someone smart, you won't read the script (and you'll think you have).


The xz situation proved that while you didn't read the script, someone did detect the problem. It shows the benefit of many-eyes.


Everyone says this, but nearly nobody does it.

Just like security through open source, it's more a nice myth than a reality.


I do it; I like messing with install scripts and optimizing them. My next big yak-shaving project will be to optimize rkhunter. Did you know it is a 20K-line POSIX shell script? I read through it a couple of times, and there's significant potential for improvement both in performance and in its security. For me it's a lot of fun because I like programming in Bourne-derived shells.


I skim little python or bash scripts when I download them. It depends on the project. (I try not to download much source code because this is a pain to do).


As opposed to downloading a binary install file?


Distributors usually give you a way to verify that what you've downloaded is correct, through checksums, PGP signatures, code signing... You forgo that if you pipe the script to your shell. What if you make a typo and somehow pipe an HTML document to your shell? If you're unlucky this could wreak havoc.
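
For comparison, the manual verification dance usually looks something like this (file names are placeholders):

    # check the download against a published checksum file
    sha256sum -c some-package_amd64.deb.sha256
    # or verify a detached PGP signature against the vendor's public key
    gpg --verify some-package_amd64.deb.asc some-package_amd64.deb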


From the threat vectors you presented, I assume you already trust the vendor. That means you trust their installation script. You are, after all, going to run their binary after the installation!

In this case, I assume the reason to inspect the script is not so much that the script might be doing something bad, but rather that you may have downloaded the wrong file to begin with.

With that in mind:

> Distributors usually give you a way to verify that what you've downloaded is correct

The first thing is that not all software is downloaded from Linux distribution repositories. This technique doesn't work if you're just downloading an installer from a website or Github releases page, etc. Sure, many also provide you with a checksum that you need to manually verify, but the shell script in question can also do the same. In fact, it can help by automating the check after it inevitably downloads the application's binary package.

In this case, the vector is you getting something different from what the vendor intended you to download. An example would be if your connection had been MITM'ed and a malicious package had been sent in its place.

This is largely a non-issue these days with TLS certs everywhere, SNI, OCSP stapling and other protections that more or less ensure you're connected to the right server.

> What if you make a typo and somehow pipe an HTML document to your shell?

That's quite the bad luck!

In this case, the user made a typo.

Most `curl | bash` commands are copy-pasted from a website rather than typed out, so this is _mostly_ a non-issue as well.

For those cases where the user typed the command and got it wrong, for it to become a problem, at least these 2 things need to be true:

  * the typoed URL actually downloaded something that the shell can interpret

  * there are commands in this downloaded document that actually wreak havoc to the system where they ran
I fail to see a scenario where that would happen. Not that it's impossible, but it's so unrealistic that if it happened to me I might just shut down the computer and go buy a lottery ticket!


Having a distribution maintainer provide official packages is the best way.


That's who pushed xz out as far as it went.

Everyone is asleep at the wheel.


Distribution maintainers are also the people that pushed out a fix as quickly as the compromised version was released.

It's unrealistic to expect software to never have security holes, bugs, or vulnerabilities. How they're handled matters more than the fact that they were introduced to begin with.


A single incident. How many incidents in how many decades have there been?


Ask the bottles team how they feel about maintainers and the job they are doing.

A fair number of upstream developers are unhappy with what maintainers do and how they deal with bugs and updates.

The real problem is that software packaging and distribution is so very broken. I had a systems admin say "I love containers, they are a circle of salt around demonically bad software"... He wasn't wrong.


I thought that the xz backdoor was only in bleeding edge distributions?


My obligatory yearly scream into the void that the preferred way to install Rust is still `curl https://sh.rustup.rs | sh`

But security!


While I agree, how does this mesh with standard operating procedure on Windows/Mac being to download binary executables and run them? Is the analogous advice "inspect any exe files with Ghidra and fully understand them before running"? Or "only run executables from official distribution channels of open source projects the code of which you've read and understand"? Where, generally, should we place our trust in terms of what code we run on our computers?


Windows and particularly macOS makes it difficult to run things that aren't code signed with trusted certificates. Same with packages in Linux package managers. That provides a large level of assurance that the thing you download is verified by a distributor that you presumably trust (otherwise why are you using their software?). Pipe to shell has no such guarantee: if a bad actor either MITMs you or gets access to their server and stuffs something bad in the script, you're out of luck.

Basically, if you believe that code signing is a good thing (and I hope we all can agree on that), curl to shell is not great security practice.


Far better advice would be:

Unless you're ready to fully analyze all the code, not just the install script, don't be an eager early adopter of every project you see posted somewhere. Wait for it to get some social validation, and give it some time so that smarter people with more time than us have had a chance to look for vulnerabilities in the whole codebase, not only the install script. Or, if you really want to check it out, use an isolated VM first.


Shell is complicated enough that I suspect it would be quite easy to trick anyone who does a cursory glance over the script, unless they are an absolute shell master.


    curl -fsSL https://ollama.com/install.sh | sh && ollama run llama3 "Why it is bad to curl | sh?"
...for details.



