Understanding the bin, sbin, usr/bin, usr/sbin split (2010) (busybox.net)
398 points by rohitpaulk 11 months ago | hide | past | favorite | 105 comments



For quite some time it was common for workstations to have a bare-bones OS installed on their small hard drives, but for /usr to be mounted via NFS to a large server with a large disk. This also, for a long time, justified the split.


Interesting this use case wasn't covered in the mailing list post, I too experienced this exact reason for them being split in the early 90s. In that era, a lot of university terminals were very low end (small disks) - they mounted shares if they ran as Netware clients as well, not just Unix/Linux/BSD. Some of them only had boot floppies.



In that article, he discusses mounting the machine-specific /etc (and the base root fs implied thereby) via NFS. I recall playing with the design for a small group of diskless Linux workstations (which I never actually built, because there was nothing I actually needed it for; it was just a fun idea). My design didn't mount root via NFS; rather, each machine was supposed to be entirely stateless, booting from the network with PXE, getting a minimal rootFS and kernel from tftp, and then mounting /usr and /home via NFS.

I don't recall if these boxen were then supposed to act as X-terminals, with only GUI display happening locally, and all execution happening on another box on the network, or if that was its own hey-wouldn't-it-be-cool-to-build-this-useless-but-nifty-thing-I-don't-need kind of project.


I did an academic clustering project at one point with Cluster Knoppix, which booted over TFTP much like this. Shared data for the calculations was mounted with a script via NFS, and everything else was memory resident/stateless via PXE boot of a Live image.


This was one of those few cases where the split was necessary. You could ship a very small initrd via bootp (or floppy, or much later, CDROM), and once it came up it could mount /usr over NFS. In case there were problems with the NFS mount, you still had some utilities in /bin for troubleshooting. The alternative was 'remote root NFS', where the root filesystem itself was NFS, and any network problems hosed the machine entirely. Trying to attach some alternative troubleshooting tools to a thin terminal, or customize the bootp server to boot a troubleshooting image just for the one malfunctioning box, was a lot more painful than just shipping troubleshooting tools in /bin.


Or just that you had to spread the local files over multiple drives, it wasn't really tied to the use of NFS.

The /bin and /sbin directories would be on the boot partition/disk and would contain the programs needed to get to multiuser state.

I still use multiple partitions, even on a 1TB disk.


Tangentially related, one of my favorite FreeBSD manual pages: hier(7)

https://www.freebsd.org/cgi/man.cgi?hier(7)


See also mtree(8):

> The default action, if not overridden by command line options, is to compare the file hierarchy rooted in the current directory against a specification read from the standard input. Messages are written to the standard output for any files whose characteristics do not match the specification, or which are missing from either the file hierarchy or the specification.

* https://www.freebsd.org/cgi/man.cgi?mtree

It can also be used to create a hierarchy:

* https://github.com/freebsd/freebsd/tree/master/etc/mtree


Recent Debian and Ubuntu systems symlink (by default, possibly with exceptions) /bin, /sbin, and /lib* to their counterparts in /usr. Older systems may be able to perform the merge by installing the "usrmerge" package.

I believe Arch takes it further and combines /bin and /sbin (and /usr/sbin) by symlinking them to /usr/bin.
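
A quick way to check whether a given machine has been merged is to test whether the top-level directories are symlinks. A minimal sketch in POSIX sh; the function name `report_usrmerge` is made up for illustration, and the optional root argument exists only so it can be pointed at a scratch tree:

```shell
# Hypothetical helper: for each classic top-level directory under the
# given root (default: /), report whether it is a symlink or a real directory.
report_usrmerge() {
    root=${1:-}
    for d in bin sbin lib; do
        path="$root/$d"
        if [ -L "$path" ]; then
            # readlink prints the target the symlink points at.
            echo "$path -> $(readlink "$path")"
        elif [ -d "$path" ]; then
            echo "$path is a real directory"
        else
            echo "$path does not exist"
        fi
    done
}

report_usrmerge
```

On a merged system each line should show a symlink into /usr; on an unmerged one the directories are reported as real.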


Doesn't this seem backwards? Given that the usr prefix doesn't mean anything anymore, wouldn't it make more sense to just put everything in /bin and keep /usr/bin as a symlink for backwards compatibility and eventually drop it?


Putting everything in a single directory allows you to mount it from a remote system with a single mount point. /usr means "the part of the OS that is the same across all hosts".


Everything is always in a single directory, it's called "/". I don't see why you think the files under /usr are particularly likely to be remote mounted.


If you remotely mount "/" then you ran out of namespace for non-remote files.


Isn’t that supposed to be share/ and maybe a few others? bin/ and lib/ are architecture specific.


Here are the wiki pages for the move in Fedora and Ubuntu:

https://fedoraproject.org/wiki/Features/UsrMove

https://wiki.debian.org/UsrMerge


Correct, ArchLinux symlinks /bin, /sbin and /usr/sbin to /usr/bin.


Also /lib to /usr/lib.


Confirmed on Manjaro (which is Arch-based) as well.


Do you know in which Ubuntu version this showed up? I just tested on my Ubuntu 16.04 (I know, I know) and they are not symlinked.


I believe Ubuntu started being usrmerged-by-default in 19.04 (Disco Dingo), although the release notes do not mention it.

On 16.04 you can usrmerge your system by installing the usrmerge package:

https://packages.ubuntu.com/xenial/usrmerge


18.04 here, /bin and /usr/bin are still different.


I think it was done in Debian 10, which was released last summer. So you'll probably need an Ubuntu version newer than that.


> I'm still waiting for /opt/local to show up...

Wait no more, MacPorts installs to this by default.


To be fair there is a useful difference between /usr and /opt being that /usr and /usr/local are prefixes so software designed to be installed in a prefix can commingle sanely.

/opt is for programs that don’t follow any sort of FHS and need to be installed in their own world. Google Chrome and Mathematica come to mind.

/opt/local makes total logical sense as the counterpart to /usr/local.


And then there was Nix(OS). [1]

> A big implication of the way that Nix/NixOS stores packages is that there is no /bin, /sbin, /lib, /usr, and so on. Instead all packages are kept in /nix/store. (The only exception is a symlink /bin/sh to Bash in the Nix store.) Not using ‘global’ directories such as /bin is what allows multiple versions of a package to coexist. Nix does have a /etc to keep system-wide configuration files, but most files in that directory are symlinks to generated files in /nix/store.

[Insert obligatory xkcd quote about competing standards.]

[1] https://nixos.org/nixos/about.html


The idea of Nix is that you have a huge datastore somewhere with views, where a possible view is a Unix OS. So those differences are no more relevant than saying your data is kept on disk sectors too.


Really? I have only used Nix, not NixOS, but when I'm using Nix, I can tell you that I don't see a huge datastore somewhere with views. The programs I use that have been installed with Nix just use absolute paths exclusively.

So if bash wants to load up a library, it doesn't say "dearest operating system, please provide me with a convenient version of libdl.so.2". Instead, it says "dearest operating system, please provide me with a convenient version of /nix/store/blahblahblah-glibc-2.27/lib/libdl.so.2".

I do have a profile folder, symlinked as ~/.nix-profile where some programs I've specifically requested to install are symlinked, but that's not sufficient and all encompassing - every program there still refers to the hardcoded /nix/store locations. I guess they also launch programs from $PATH. But I've never seen or heard of a pseudo-FHS nix view of the store.


Well, ok, I've only used nixos, so I don't know all the details.

But on the case of nix, your view is the set of packages you installed. It can not be a full Unix for obvious reasons, and the dependencies are not part of the view (so the packages themselves are not installed in a normal Unix).


> The only exception is a symlink /bin/sh to Bash in the Nix store

I only briefly tried out Nix a while back, but doesn't /usr/bin/env also exist (to facilitate shebangs)?


I run Unstable, not Stable, but yes it exists.


I think software collections for Fedora/RHEL/CentOS may do this too. https://www.softwarecollections.org/en/

Edit: No, I’m wrong. /opt/<provider name>/...

https://www.softwarecollections.org/en/docs/guide/#sect-The_...


I wonder what the next step is then. /local? /usr/opt?


RedHat actually has /var/opt and /etc/opt which, while odd, makes some sense since /opt is a free-form /usr and still needs variable space and configs.


and which preserves the functional separations between the hier elements and doesn't require changing any backup procedures.


Well, /usr/local is set as its own partition on OpenBSD so it can have different mounting instructions.

  b1c4a5629703d4fd.d /usr ffs rw,nodev 1 2
  b1c4a5629703d4fd.e /usr/local ffs rw,wxallowed,nodev 1 2

I tend to do the evil thing of always creating an /opt just for my stuff that is system wide. So, I do have an /opt/bin/ so I don't mess with the system or package level stuff.



Effort to clean it up: UsrMove

https://fedoraproject.org/wiki/Features/UsrMove


And this has led to de facto standards, like '#!/bin/sh' is a 'standard' shell shebang, but not POSIX; if you want your script to be portable, you have to use '#!/usr/bin/env sh'. env is forever in '/usr/bin' probably because it first came about after new tools started moving to '/usr'. And I don't think env's file path is even in POSIX, it's just a de facto standard.

Another fun fact is that execve() (which executes shebangs) only allows a single argument to the shebang program. Meaning if you use '#!/usr/bin/env sh', you can't specify arguments to 'sh' in the shebang. In any case, the name of the file being executed is appended as an additional argument at the end of the shebang.


env hasn't always been in /usr/bin. I've worked on systems (probably SunOS 4.something) that had /bin/env but not /usr/bin/env. But you're unlikely to run into any reasonably modern Unix-like system that doesn't have /usr/bin/env .

See also my Stack Exchange answer on the advantages and disadvantages of the "#!/usr/bin/env ..." convention: https://unix.stackexchange.com/a/29620/10454

Quick summary: The advantage of "#!/usr/bin/env INTERP" is that it will use the first INTERP in the user's $PATH. The disadvantage is that it will use the first INTERP in the user's $PATH.
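
Both halves of that summary are easy to see with `command -v`, which searches $PATH in order much as env does (for a name that is not a builtin, alias, or function). A small sketch; `fakeinterp` and the scratch directories are made up for illustration:

```shell
# Two scratch directories, each holding an executable called "fakeinterp"
# (a made-up name, so it cannot clash with anything real).
first=$(mktemp -d)
second=$(mktemp -d)
for dir in "$first" "$second"; do
    printf '#!/bin/sh\necho "running from %s"\n' "$dir" > "$dir/fakeinterp"
    chmod +x "$dir/fakeinterp"
done

# Resolve "fakeinterp" against the PATH given in $1.
# Runs in a subshell so the caller's PATH is left untouched.
resolve_with_path() {
    ( PATH=$1; command -v fakeinterp )
}

resolve_with_path "$first:$second"   # prints the copy in $first
resolve_with_path "$second:$first"   # prints the copy in $second
```

The same script with the same shebang would run a different interpreter depending only on the order of the caller's $PATH.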


If neither /bin/sh nor /usr/bin/env is standard but both are de facto standards, then what is the relative value of `#!/usr/bin/env sh` over `#!/bin/sh`?



As a data point, my phone has /bin/sh but not /usr/bin/env. But not many shell scripts need to target Android, probably.


env is part of POSIX, so I guess your phone just doesn’t make the symlink. (If it’s busybox, perhaps it still has the applet compiled in?)


It has /bin/env, it just doesn't have /usr.


> And I don't think env's file path is even in POSIX, it's just a de facto standard.

You would be correct about that. And that sucks.


There's some debate about this if I recall correctly from the last deep dive I did about the pros and cons of it.

If I am writing a script that needs to be that portable, then I'll use #!/usr/bin/env sh but otherwise /bin/sh will work in all the cases I need it to.

Since bash's path might vary more, I might consider calling env to locate it then more often.


If you're writing a script that needs to be that portable, you're doomed anyway and you might even find the autoconf manual's "Limitations of Shell Builtins" section useful.


If you are using /bin/sh you need to make sure it is a POSIX shell script and not using bash features [1]. On Debian and derivatives /bin/sh is dash (a descendant of ash), not bash.
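
As an illustration of the kind of thing that check looks for: `[[ ... ]]` pattern matching is a bash extension, while a `case` statement does the same job in any POSIX shell. A minimal sketch; the function name `starts_with` is made up here:

```shell
#!/bin/sh
# bash-only version, which a strict POSIX shell such as dash rejects:
#   if [[ $1 == /usr/* ]]; then ...

# Portable version: case patterns are part of POSIX sh.
starts_with() {
    # $1 = string to test, $2 = literal prefix (quoted, so not a glob)
    case $1 in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

if starts_with /usr/bin/env /usr/; then
    echo "under /usr"
fi
```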

[1] https://linux.die.net/man/1/checkbashisms


I'm not aware of anywhere that /bin/sh won't work. /bin/bash is extremely non-portable though.


POSIX does not specify the path of anything. It says that "sh" is a shell, but doesn't say that "/bin/sh" is that shell. See: https://pubs.opengroup.org/onlinepubs/9699919799/

I believe that in Solaris before Solaris 10, /bin/sh was the original Bourne shell (which didn't comply with POSIX), and in Solaris 10 it's ksh (which isn't either, but is way closer): https://unix.stackexchange.com/questions/538844/does-posix-g...

Also, I don't think that "#!" is specified in POSIX. It's extremely common, it's just not in the specification. This is similar to find's "-print0" option, which also isn't in the POSIX specification but basically everyone implements it.

If you want a specification of "where things go", the closest one in practice (and one that most modern distributions try to follow) is the Filesystem Hierarchy Standard (FHS): https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html


(providing more detailed references)

POSIX specifically calls out that it doesn't guarantee that the path is "/bin/sh":

    107271  Applications should note that the standard PATH to the shell cannot be assumed to be either
    107272  /bin/sh or /usr/bin/sh, and should be determined by interrogation of the PATH returned by
    107273  getconf PATH, ensuring that the returned pathname is an absolute pathname and not a shell
    107274  built-in.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/s...

POSIX mentions "#!" several times, but doesn't require it; saying things like "on systems that support executable scripts (the "#!" construct)".

As the lack of a hard-codable path is frustrating, it recommends setting the shebang at install-time:

    107279  Furthermore, on systems that support executable scripts (the "#!" construct), it is
    107280  recommended that applications using executable scripts install them using getconf PATH to
    107281  determine the shell pathname and update the "#!" script appropriately as it is being installed
    107282  (for example, with sed). For example:
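
The quotation breaks off before POSIX's own example. What follows is not that example but a rough sketch of the same install-time idea, assuming `getconf` is available; the function names `posix_sh_path` and `fix_shebang` are invented for illustration:

```shell
# Find an absolute path to "sh" by searching only the directories that
# getconf PATH reports, as the quoted text recommends.
posix_sh_path() {
    (
        PATH=$(getconf PATH) || exit 1
        p=$(command -v sh) || exit 1
        # Reject builtins or functions: only an absolute pathname will do.
        case $p in
            /*) printf '%s\n' "$p" ;;
            *)  exit 1 ;;
        esac
    )
}

# Print the script named by $1 with its first line replaced by a shebang
# pointing at the shell found above. An installer would redirect this
# output to the script's final location.
fix_shebang() {
    sh_path=$(posix_sh_path) || return 1
    sed "1s|^#!.*|#!$sh_path|" "$1"
}
```

For example, `fix_shebang myscript.sh > /some/install/dir/myscript` (both names hypothetical) would write a copy whose first line names whatever shell the local system reports.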


As I wrote at https://unix.stackexchange.com/a/567405/5132 , I have a strong suspicion that Arch Linux dropping the FHS is the tipping point, and adherence to the FHS is now the exception not the norm.


Why would Arch dropping the FHS be the tipping point? Surely Arch represents hackers and not mainstream Linux users. In much the same way I wouldn't cite Linux users as a tipping point in any computer system convention (but Mac OS or Android or Windows), I wouldn't cite Arch in a Linux system convention (but Debian or Ubuntu or Red Hat). Why do you consider Arch as the tipping point?


Because it would be, if my suspicion proves to be the case, the point where adherence to the FHS tips over from being the norm to the exception. MacOS, Windows, and who Arch users are have nothing to do with it. How many distribution creators have made each of the various choices, does. Arch's decision to use systemd had some influence (in both directions, c.f. Hyperbola Linux). Arch's decision to abandon the FHS in favour of the systemd file-hierarchy may well do likewise.


/bin/bash is popular in Linux systems, but not BSD or Mac or probably other Unixes which may not always have bash and if they do have it, it will be somewhere else. Also, a number of niche-ish Linux distributions are trying to replace bash with something else. It's no longer as portable as it once was.


It's not just about whether it exists but also the version.

For example, on my (Mac) laptop, /bin/bash is bash 3.2.57 (released in 2007), while "/usr/bin/env bash" gets the version installed by homebrew which is 4.4.19 (released in 2016).


The PDP-11 became super influential because it was the first machine that had seemingly unbounded performance for its price, compared to any computer from the 60s or 50s. This gave programmers the freedom to "write to their heart's content," leading to the innovations of Unix and the C programming language. Compare it to how modern web developers write entirely in the application domain, giving no regard to operating system or machine constraints. I'm proposing that the PDP-11 gave a similar feeling to programmers who had been restricted before, to a programming process that today seems as streamlined as carving hieroglyphics with a pickaxe.

Eventually though the new programs started pushing up against the limits of the PDP-11. I propose that virtually anything that seems like magic in Unix or C, from filesystem organization to page sizes, traces its origin directly back to the PDP-11.

We are still living in the world of microcomputers, no matter how far we may seem to have ventured away.


Wasn't UNIX originally hacked together in assembly language on an already-obsolete PDP-7 that was sitting around unused?

And according to Dennis M. Ritchie, the B language that preceded C was also first implemented on the PDP-7, whose auto-increment memory cells "probably" inspired Ken Thompson to create B's ++ and -- operators.


Page sizes on the PDP-11 are 8 KB. Its successor the VAX had 512-byte pages. I don't think there's any PDP-11 heritage there. The filesystem organization has more to do with disk sizes than with the PDP-11. Also, the PDP-11 wasn't a microcomputer but a minicomputer.


Compared to the PDP-7 and PDP-8, the PDP-11 instruction set and architecture is a breath of fresh air.


Great story, I love these! Personally I always thought that the difference between `bin` and `sbin` was that `sbin` contained binaries with Superuser privileges (hence the `s` at the beginning).


The s is for "statically linked". Binaries that run early enough in the boot process that /usr/lib might not be there yet.


Sources I find claim "system binaries":

http://www.linfo.org/sbin.html


Yes.

My system does it like this.

    /bin - static linked basic user commands
    /sbin - static linked system commands (incl. important daemons)
    /usr/bin - dynamic linked user commands
    /usr/sbin - dynamic linked system commands and system daemons
    /usr/local/bin - user commands installed via package
    /usr/local/sbin - system commands installed via package


Is it? Root and admins have sbin in their path while normal users don’t, so sbin contains more system executables. They may happen to also be static because they are called early in the boot. At least that’s what I thought.


For example, /sbin/sh was statically linked on the various flavors of Unix, while /bin/sh was dynamically linked.

So, I suppose it's really both. The statically linked shell would allow you to upgrade the shared libc, for example. Which is both "system" and "static" related.


I have been using Linux/Unix boxes for 19 years. I am ashamed to admit that I never learned this until now. Thanks!


Don't feel too ashamed. I also learned a lot from this article and I've been using Linux/Unix boxes for even longer than you.


This story isn’t about the difference between bin and sbin.

edit: Clearly I'm missing something in the linked article because I'm being downvoted. Could someone show me what I'm missing?


The headline promises it, but the story doesn't tell.

I don't know what the correct historical explanation is. "s" like in super user, system program or statically linked or is there yet another theory?


The headline is unfortunately a little ambiguous. We could clarify it by introducing some extra punctuation and modifying it slightly to make it clear the "split" in question is between "/" and "/usr":

"Understanding the split between (/bin, /sbin, /lib) and (/usr/bin, /usr/sbin, /usr/lib)"

Not perfect but maybe clearer


Well, it kind of is: "The /bin vs /usr/bin split (and all the others) ...".


It really isn't. The story explains why some directories in / have an analogous version under /usr. If reading this article informed you about the difference between bin and sbin then ... I'm really not sure what to tell you because the article doesn't contain anything about this.


One can learn a modicum about the differences among lbin, mbin, rbin, bin, 5bin, sbin, amdahl/bin, ucb, sun/bin, ccs/bin, local/bin, xpg4/bin, xpg6/bin, and some others at https://unix.stackexchange.com/a/448799/5132 .


I thought it meant "secure", as in the programs you can use when your system crashes into a "secure" mode or something like that.


Wait, that's not the reason?


It kind of is and isn't.

So sbin is reserved for system binaries concerned with booting, administrative tools, repairing/restoring (the system), and the like.

Since these essential utilities by their very nature require superuser privileges, it's tempting to think that sbin simply means "superuser binaries" or something.

Someone else mentioned "statically linked", which is also true, but again just a side-effect of being for system programs. The programs in sbin need to be able to run without /usr, /lib, or /bin being mounted or even existing is the idea here.

To me personally - and this is just my uninformed(!) opinion - the only split that makes sense across all circumstances is /bin vs /sbin and /var vs /. Simply because anything but /var could safely be considered read-only, while /var exists explicitly for writable data ¯\_(ツ)_/¯


Also, you can mount /var with no executable set, for security purposes. If you go through the various security guidelines (such as STIG and various others), they give several paths that should exist as individual mount points -- both for the purpose of setting appropriate mount options, but also if a user program has write access somewhere in it, then they can't cause a disk-full condition on another (logging-type) path.
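
On Linux, the options a filesystem is currently mounted with are listed in /proc/mounts, so a setting like noexec can be checked from a script. A minimal sketch; the function name `has_mount_option` is made up, and it reads the mount table from standard input so it can be tried on sample data:

```shell
# Succeed if the mount point $1 carries the option $2.
# Expects /proc/mounts-style lines on stdin:
#   device mountpoint fstype options dump pass
has_mount_option() {
    awk -v mp="$1" -v opt="$2" '
        $2 == mp {
            # Options are a comma-separated list in the fourth field.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == opt) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Linux-only usage: is /var mounted noexec right now?
if [ -r /proc/mounts ] && has_mount_option /var noexec < /proc/mounts; then
    echo "/var is mounted noexec"
fi
```

If /var is not a separate mount point, the check simply fails, which is itself a hint that the guideline is not being followed.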


Same with /tmp - mounting it and allowing executables is a recipe for disaster in a multi-user web hosting environment.


It doesn't help that many distros don't add sbin to the path of normal users.


Splitting data by access pattern is still valuable today: setting permissions, using different filesystems and so on


IIRC, they got rid of all this in Plan 9. /bin was where binaries go. /usr was for user data. There were also /proc, /net, and probably /tmp?


Yes, /bin is a directory where the bind[1] command can be used to multiplex/union other directories on top. The / directory in general is built this way as a virtual top directory, with the disk root being stored on /root and being bound over /.

Platform-specific files are stored in /$architecture and binaries in /$architecture/bin. Shell scripts are stored in /rc/bin. For the home directory, traditionally, this style is reversed so your shell scripts would be in $home/bin/rc.

The namespace for any process can be enumerated, so in my default shell, I can see which directories are currently bound (and how) on /bin:

  tenshi% ns | grep 'bind' | grep '/bin' 
  bind  /amd64/bin /bin 
  bind -a /rc/bin /bin 
  bind -b /usr/seh/bin/rc /bin 
  bind -b /usr/seh/bin/amd64 /bin 
  bind -a /sys/go/bin /bin 
  bind -a /usr/seh/go/bin /bin 
  tenshi% 

[1] http://man.cat-v.org/9front/1/bind


This Plan 9 nostalgia weirds me out. It never happened! It was vaporware all along!

It's like having fond memories of Hillary Clinton's inauguration speech.


Are you mixing up Plan9 with some legendary vapourware OS, MULTICS or something? Maybe the HURD? (NB: not entirely vapourware, but less finished than Plan9). Because Plan9 was (and is) a real system that you could (and can) run.


> It was vaporware all along!

I don't think “vapourware” means what you think it means.

To elaborate, the comparison with an alternative universe is silly, since there are actually papers that people can read and there is actually code that people can use. So saying that “it never happened” is wrong.


I installed Plan 9 once back in the 90's. It certainly exists, it just never reached mass levels of acceptance.


I installed it in the mid-2000s to be cool.

I'd say it's closer to the DMC DeLorean: short production run, works but has shortcomings, more of a meme than an actual commodity.

It's not vaporware though, they definitely delivered something that works. It's just that there wasn't a good reason to use it compared to other options.


I was using Plan 9 for years, attended several workshops, and even was paid to work on it for a little while. It happened.


I mean, you can run it right now...


Pros of using 9P: /bin is where everything goes, but you can bind any directory to it, so there's no worrying about running out of disk space since you can mount more disks.

Also, gobolinux tries to mimic this behavior as well. https://www.gobolinux.org/


Which is essentially the Microsoft Windows hier.



I am sure that one of the original aims of the Linux Standard Base Project[1] was to fix and simplify the default file system hierarchy, but researching it now, it seems that they've backed away from doing anything drastic.

---

[1] https://en.wikipedia.org/wiki/Linux_Standard_Base


The confusing and nonsensical (not immediately obvious by typing ls and reading) folder structure in unix/linux/etc has always been one of my biggest complaints with it as an operating system.

I consider it to be a warning about the importance of refactoring and renaming things sooner rather than later, because the longer that something sticks around, the more dependencies it develops and the harder it is to change.


How does the terminal work in Unix? Well, in 1869... https://www.linusakesson.net/programming/tty/


In my life we will probably never have a man on the moon and never replace POSIX.


minor nit - initrd/initramfs don't really precede linux


What is with the insistence of (in-paragraph) hard line breaks on documents still?

I put this page into Firefox's reader mode and I get lines breaking all over the place as it's now got a mix of soft and hard line breaks.

Should presentation of line length not be up to the client?


The email is space-stuffed, meaning it was probably written with format=flowed[1] (you can tell because there are spaces appended to hard-wrapped lines). That RFC was invented precisely for the purpose you mention: hard wrap lines in "dumb" plain text clients, but allow "intelligent" clients to soft-wrap according to their own display width.

This email would display at full-width in a client that respects format=flowed (many modern email readers, except for, notably, Outlook and Gmail). But as others have pointed out, this is literally just the plain text of the email wrapped in a <pre> tag, so you get the hard line breaks.

[1]: https://joeclark.org/ffaq.html


It is a <pre> formatted section from an old usenet post with 80 char line length... Usenet posts were often specially formatted with whitespace chars. Usenet display tools almost always use PRE and monospace fonts to preserve the full possibilities of the information transfer.


It’s from a mailing list, not Usenet.


In Firefox's reader mode, click the "Aa" icon in the left-sidebar and you'll see -><- and <--> icons for changing the line length. Click <--> a couple times.


You're not wrong, but this is plaintext in its purest form. It's literally just plaintext wrapped in a <pre> tag. They don't use any CSS on the site, besides the very infrequent `style` tag. This is more common with the type of people who use text-based browsers like lynx. (edit: As another comment says, note they use 80 chars)

But as someone who has bash aliases for extracting text from a website so I can read it in my terminal (or page it with `less -r`), yes, this is something that bugs me too.



