I consider this book pretty much required reading for junior devs and anyone else in engineering who doesn't have a decent handle on basic Unix skills for whatever reason -- which seems to be more common these days than it was earlier in my career.
It is a very kind gesture to make this book freely available.
It does have one small problem: the number of "bashisms" that it quietly promotes.
The entire Debian/Ubuntu family has chosen the Almquist shell because of speed and standards compliance, and many things in this book will not work there.
It would be helpful to have a book that is clear that the POSIX.2 shell is not Bash.
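For instance, here's a minimal sketch of two common bashisms next to their portable forms (`name`, `match`, and `equal` are just illustrative variables):

```shell
#!/bin/sh
# Two bashisms and their portable equivalents. The commented-out lines
# are a syntax error (arrays) and an unknown command ([[) under dash:
#
#   files=(a b c)          # arrays: bash only
#   [[ $name == a* ]]      # [[ ]]: bash only

name=abc

# Portable pattern matching uses case:
case $name in
    a*) match=yes ;;
    *)  match=no ;;
esac

# Portable string comparison uses the [ (test) builtin with =, not ==:
[ "$name" = "abc" ] && equal=yes

echo "$match $equal"
```

Running this under dash, bash, or any POSIX sh prints the same thing, which is the whole point.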
> It would be helpful to have a book that is clear that the POSIX.2 shell is not Bash.
Maybe the one thing I love about Linux culture (compared to the BSDs, illumos, etc.) is that it generally doesn't require everyone to adhere to some high UNIX religion.
Writing a shell script of more than 50 lines is hard to do correctly. Bash sometimes makes that a little easier. And it's almost ubiquitous. Forcing new Linux users to learn a POSIX compliant shell, when they are only likely to be using Linux and bash, is simply cruel and unusual. Just as including only `vi` in the base system instead of `nano` is a silly self-own.
I see this unreflective question far too often: "Why aren't people using my system and using Linux instead?" And the answer is pretty simple, Linux at least made efforts to reach out to people where they were, in hundreds of low effort ways, whereas your system probably didn't.
"Everyone should be using Linux" and/or "should love the command line" is pretty silly, but just as silly is "people should learn the POSIX shell" first. A shell they are likely never to use interactively. Because of a hypothetical portability concern? It's nuts.
In college, during the introductory command-line class, the professor simply required everyone to write only POSIX-compliant shell scripts. And your shell scripts had to work with newlines in file names. We did fun things like parsing (a subset of) HTML using only sed (did you know about sed's hold space and pattern space?). It was basically an intellectual puzzle for geeks with too much time on their hands.
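For the curious: the hold space is sed's second buffer, alongside the pattern space that holds the current line. The classic line-reversal one-liner (a tac clone) is a small, safe demonstration:

```shell
# Reverse the lines of input using only sed's two buffers:
#   1!G  on every line except the first, append the hold space to the pattern space
#   h    copy the pattern space into the hold space
#   $p   on the last line, print the accumulated (reversed) result
printf 'one\ntwo\nthree\n' | sed -n '1!G;h;$p'
# prints: three, two, one (each on its own line)
```

Once you see lines being stacked up in the hold space like this, the HTML-parsing puzzles become more plausible.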
The impact upon the Korn shell was actually to reduce functionality, and a number of features were removed (arrays, coprocesses, etc.). The reason for this is that the source code for the Korn shell was configured to compile to a 64k text segment (for 286-Xenix), and readability/maintainability was sacrificed.
Due to this lineage of the POSIX shell, it is very friendly for embedded systems and other constrained environments in ways that its descendants and competitors are not (and can never be).
It deserves to be distinguished from Bash, without question.
A little revisionist since POSIX had its seeds in 1984, which was long before the SVR4 collaboration.
Giving DEC, IBM, and HP credit for POSIX is a little odd. They were actually the biggest beneficiaries of the fragmented nature of Unix, with their proprietary flavors. AIX vs HPUX vs DEC Unix all meant that Ken Olsen could pretend to support Unix and at the same time tell the world that "Unix is snake oil".
> It deserves to be distinguished from Bash, without question.
Absolutely, and I agree with your sentiment at a broad level -- although the book notes who it is for, and that `bash` and `sh` are distinct, it doesn't explain why you might consider `sh` for other uses which aren't really the focus of the book, and perhaps it should?
> Due to this lineage of the POSIX shell, it is very friendly for embedded systems and other constrained environments
I'm wondering if this will matter as much in the future. Are people really going to be programming 16-bit microcontrollers when 32-bit parts are becoming so cheap? How constrained are we really going to be?
> just as silly is "people should learn the POSIX shell" first
Where did the GP make that assertion? I don't see anywhere in the post where they say "please rewrite all the shell examples in POSIX-compliant sh". All I see is a request to inform the reader that not every shell is bash.
IMO, the year is 2022, almost 2023. Trying to do POSIX sh is tying both arms behind your back. Set the shebang to #!/bin/bash, and now you've at least got only one arm tied behind your back.
If your environment somehow lacks bash, consider that the bug, and get bash.
Or, untie both arms, and use a language that's not going to take every opportunity to stab you in the back with subtle idiocies dating from the last millennium.
(But I do agree that, if you're going to put down a bash-ism, it's bash, not POSIX sh.)
> Or, untie both arms, and use a language that's not going to take every opportunity to stab you in the back with subtle idiocies dating from the last millennium.
May I suggest that you attempt to convince The Austin Group/Open Group of your position, to update what was originally known as the POSIX.2 shell standard?
what kind of circumstance are people in where they can use posix sh but not bash? in the year 2022? I don't understand it. fucking everything has bash nowadays. and if you're on some machine that unaccountably doesn't have bash, then why can't you just sudo apt install bash, or scp the binary over and chmod +x it?
if you can't even do that, maybe reevaluate what life choices got you to this place. life is too short to write strictly posix sh.
Linux is not the whole world. Debian is not the whole Linux. Interactive shells on servers are not guaranteed to be available. Control over the installation image is not guaranteed, or always desirable.
> life is too short to write strictly posix sh
Alas. Life is too short for writing or responding to uninformed rants. Yet here we both are.
POSIX shell scripting has its place. It's not as difficult or complicated as you imagine. The portability tradeoff is valuable or essential in many real-world cases.
>Linux is not the whole world. Debian is not the whole Linux. Interactive shells on servers are not guaranteed to be available. Control over the installation image is not guaranteed, or always desirable.
I didn't literally mean "sudo apt install", I meant "install the thing by whatever means is appropriate to the platform". if you can't install anything then what are you actually doing with it? you have to get your software onto it somehow.
and at any rate, the article is literally called "the Linux Command Line"
>The portability tradeoff is valuable or essential in many real-world cases.
like what? seriously, what?
I don't get it. I understand software that has weird/harsh requirements because it's on a super constrained embedded platform. that's a real limitation. but being able to run posix sh but not bash seems like an entirely artificial problem. there's no rigorously conceivable resource constraint that would do that. if your unix is so fucking degenerate that it can't run bash then that's a problem of your own choosing.
anyway if it's so easy, you can take on the burden of porting scripts from bash to sh yourself, instead of annoying the rest of us with the "bashisms" whining.
Android as policy does not allow GPL software in userspace.
The MirBSD Korn shell, mksh, is used there as the system shell.
Bash's footprint, by comparison, is huge.
Some utilities that I have used on Android bring a copy of Bash along, but it's not standard.
I also mentioned above that I do some shell scripting with Busybox-win64. I can't use arrays in the "bash" that it implements.
I've also worked on routers that use the Busybox shell, same problem.
Dash is also far smaller and advertised as 4x faster than Bash. Anything I write that targets dash will run on all the others, so this skill is quite useful.
> The entire Debian/Ubuntu family has chosen the Almquist shell because of speed and standards compliance, and many things in this book will not work there.
So Debian/Ubuntu ship completely without Bash? According to [1], it's Bash, but it might be outdated.
"DASH is a POSIX-compliant implementation of /bin/sh that aims to be as small as possible. It does this without sacrificing speed where possible. In fact, it is significantly faster than bash (the GNU Bourne-Again SHell) for most tasks."
Hm but the default shell is still bash. It would make sense that "bashisms" don't work on "sh". You should use bash or bash derivatives (zsh) for bashisms.
I think the bigger stumbling block is that people often can't differentiate bashisms from basic shell scripting.
bash is the default interactive shell for users, but dash is the default shell for scripts with a #!/bin/sh shebang (or no shebang, unless they're run from bash), system() calls, etc. This means bashisms will work sometimes, which can be worse than not working at all.
I agree about the importance of differentiating bashisms from basic scripting, but I think the root cause of that is documentation (including this book) that doesn't make the distinction.
The book is very clear about the fact that it’s using Bash, even noting when a feature requires a newer version, and uses #!/bin/bash for scripts throughout.
...But the first line of any script could just as easily be `#!/bin/bash`... you're choosing not to use bash. If you're on a system with no bash, and are forced to use `sh`, then just don't use bashisms...
I'm not trying to be difficult, but I just don't see a problem. Explicitly use bash (or a derivative) if you want bashisms, don't if you do not.
Because developer tools are increasingly polished and easily accessible without in-depth knowledge of the arcana of the command line. And that's great for welcoming more people into our field.
The downside is that once they're welcome, the climb of the learning curve is barely started.
Those polished tools hide a lot of the underpinnings so that nothing is understood. We get header-only C libraries (not C++) because people don't understand build tooling and want something that works on easy mode.
I'd say that's a bit orthogonal - command line tools are also hiding underpinnings from you. The issue is about the UX, not the functionality. Seemingly every command line tool has built its own language of incantations for how to use it. I've been using grep, tar, curl, find, xargs etc. at least a few times per week for over a decade now. I shouldn't have to look at the man pages this regularly just to remember the letters I need to type for something I've done 30 times before.
I don't have this problem with the equivalent libraries in the programming languages I use, give me 'easy mode'. We need new binaries with a modern UX, or a better IDE for using the command line.
Command line tools are composable. They give you superpowers by being able to rally a collection of unrelated programs into working for your benefit. Monolithic GUIs can't do this.
Who said anything about a monolithic GUI? Just compare the UX of a modern programming language and its standard library to that of Unix OS's. By and large they have a unified, more user-friendly style, better tooling, better documentation.
For me, it's hard to make good software today because the bar is actually higher, in the sense that it must be usable by a much wider audience, and I can't be lazy about usability or, even worse, leave it unnecessarily complicated because "people should know about it or learn".
Need to migrate your IDE project to a CI system? How do you do that if you only know how to click through a wizard? Everything has to be spoon-fed, and if it isn't, they're paralyzed.
I was astonished again last week at how bad GUIs for git are. Take VSCode: it encourages you to enter just the title of the commit message. That's the boring part for people used to describing WHY the change was made in the long message.
It arguably follows the underlying git design more closely than many git clients. Git has no concept of commit titles and commit descriptions - it's all just a message, so in VS Code you write it all in the same box with newlines [1].
It's definitely not obvious if you're used to having separate fields, but it's a bit unfair to complain about all GUI clients because one of them has an opinionated text input box in a single feature, surely?
Do you think that this is because they don't really need to anymore with all the nice GUIs/tools around? Is there an advantage to being able to do something on the command line vs other ways?
Asking because I'm at a bit of a crossroads - I have a good handle on about 5% of the command line knowledge which gets me through 80% of the stuff I need to do. I'm wondering if learning to use more commands is worth the effort when I can already get the task done without using the command line?
I think that you probably have 80% of the gross capability, but you will gain a huge level of refinement. I'm about 10-15 years into the command line (depending on how you count, lol) and I keep finding more stuff that just wouldn't stick 5 years ago, because I simply didn't have those problems yet.
E.g. there was a time in my life when I didn't get what `xargs` even did, I literally couldn't grok it. Now, I can't imagine life without `-0`!
The composability of UNIX commands is just so amazing. Every command builds on every other command. I haven't seen that with any other tools or program.
Easily handling arguments with whitespace is the obvious answer.
e.g.:
find . -type f -print | xargs wc -l
vs.
find . -type f -print0 | xargs -0 wc -l
If you have filenames with whitespace, the first will skip a bunch of files.
If you're doing processing on other inputs which may include whitespace, you'll be able to handle those using xargs without worrying about additional delimiting.
And for those not aware, the '-print0' / '-0' arguments to find(1) and xargs(1) respectively tell the first to output ASCII NUL delimited arguments, and the latter to expect the same. As ASCII NUL should not normally occur within any argument, it's the ultimate delimiter.
I've used xargs and/or bash loops to do some fairly heavy-duty web scraping given an input list of arguments. Using xargs allows multiple simultaneous parallel queries such that any one request stalling out doesn't hold up the whole process.
And to clarify, by "skip a bunch of files", what I of course mean is that if your document is named "Your Document.md", what the first find(1) command will do is attempt to run "wc -l" on "Your" and "Document.md", neither of which it can find, a fact about which it will complain bitterly, creating much noise in your shell session. It will fail to run "wc -l" on the file "Your Document.md", which the second example will do correctly (and not on the name-fragments).
"wc -l" gives a line count for a file. That's ... not especially advanced, but is a trivial example of an xargs command which you should be able to try without causing any problems on your own system.
It's tough to quantify the knowledge levels, but I'd say that a practical understanding of bash loops, piping, grepping, and 'cut'-ing text is a good start for basic dev work.
I've seen hours wasted writing janky one-off C++ utils, for lack of knowing that grep/awk exist.
Being able to tail a dev log and filter it for errors is also a part of seasoned developer competence too, I'd say.
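That log-filtering workflow is a one-liner. A sketch, with `app.log` as a made-up sample log:

```shell
# Live version: follow the log as it grows, keep only error lines.
# --line-buffered stops grep from sitting on partial output while you watch:
#
#   tail -f app.log | grep --line-buffered -i error
#
# Non-following version, runnable against a sample log:
printf 'INFO start\nERROR disk full\nINFO done\nerror: retrying\n' > app.log
tail -n 100 app.log | grep -i error
```

The -i matters in practice: real logs rarely agree on the case of "error".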
I wouldn’t call myself a seasoned developer, but tail (and grep) saved so much time during my master's thesis; I don't even know how I'd have checked for new entries in my log files other than using cat. When working with anything producing a log, I’m very grateful to know about tail.
Pipes, redirection, scripts, etc are some of the most important reasons why command-line is better in many ways than GUI. In some cases GUI is helpful, but often, command-line is much better in many ways.
For this and other reasons, I will often write programs as command-line using standard I/O for most data. For example, a music playing program which reads the music file from stdin and writes the audio data to stdout which is taken by aplay to play the audio on the speaker. (In this way, you can also add other thing in between such as special effects, if desired.)
(It is also useful for other programs (interactive or GUI or whatever) to have the opportunity to open pipes with other user-specified programs; Heirloom mailx is one interactive command-line program that does this (I often use that feature to display pictures attached to messages), and Free Hero Mesh (which has a GUI) also exclusively uses such pipes to import/export levels and pictures and move lists.)
The book is freely downloadable as a PDF, and the introduction gives a brief answer to that question.
The rest of the book offers a somewhat longer-form answer.
Another reference I'd strongly recommend is the O'Reilly UNIX Power Tools book, which, though it turns 30 as of the new year, remains one of the best "what are some really useful recipes" books about and for Unix / Linux.
When contemplating a future investment into learning something, I like to consider the long-term benefits. Do we believe that command lines will become more or less important in the following years? My bet is that they will continue to fade over time, as they have been for years (decades?). I don't think there's much sense to invest more than the minimum unless there's a specific use case in mind.
Consider each program as a node in a graph, with the edges as the possibility of interop -- output of program A working as input of program B. With UNIXy command line programs this is pretty close to a fully-connected graph. So, per Metcalfe's Law, the value scales with the square of the number of programs you know how to tape together.
Command line programs compose like this, GUIs don't.
depends on how close you wanna get to the servers, and then the hardware?
I suppose it's a super rare occurrence now, but back when I was starting on this, I would mess up my X server and was forced to fix it from the command line.
this is true. getting started in the industry comes with a lot of assumptions about command line knowledge; every other tutorial or book kinda assumes it. it wasn't until 3 years later that I realized that I should have learned it sooner.
I had done a review of it as a blog post under the O'Reilly Bloggers program which existed some years ago. But it and other such book reviews disappeared after a reorg of the site. So only a stub remains:
It'd be cool to type something like `// spin up a lamp stack docker image`, and have below it the ghosted auto-completed command powered by chat gpt or something.
I'd go so far as to say all command line apps should have an --explain flag that doesn't actually run the command; it just parses it and outputs how it understands it in plain text. Combined with the above, to help avoid AI mistakes.
I'm sorry, but that sounds awful. I turn to the terminal when I want precise and exact control over commands - the above seems highly fuzzy and error-prone.
Some programs kinda do that, but you're hitting on an old problem: Unix/Linux has a few ways of doing similar things, all of which can miss the mark.

The commands info, man, and apropos are fairly close to your goal, but:

- info only covers bash/core-utils
- not everything installs something into the apropos file (and many package managers seem awful at remembering to delete entries on uninstall)
- man pages have many standards and good practices, but some are still terrible or nonexistent

So a --help switch is the lowest common denominator. And it has the advantage of living in the executable as an argument parsing outcome. But that means it suffers locale parochialism.

I think people may be afraid that adding yet another switch like --explain would only make things messier.

So far I think the best is usually the first stanza of a well written man page.
You can't fix this in the tool you are trying to invoke because the tool never runs. The error E2BIG is returned from the syscall `execve` because it fails to run the program.
If you just want to work around this on some architectures, you can simply raise the stack limit, which defaults to 8 MiB.
If you want to do this in a portable way, learn to use `xargs` or `find`.
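A minimal sketch of the portable workaround (the demo directory and file names under /tmp are made up for illustration):

```shell
# Delete a large set of files without ever building one giant argv:
# find generates the names itself, so the shell never expands a huge
# glob, and xargs batches them into as many rm invocations as needed,
# each one under the execve limit.
mkdir -p /tmp/e2big_demo
cd /tmp/e2big_demo
touch file1.log file2.log notes.txt

find . -maxdepth 1 -name '*.log' -print0 | xargs -0 rm -f

# Equivalent, letting find do the exec with the same batching via '+':
#   find . -maxdepth 1 -name '*.log' -exec rm -f {} +

ls    # only notes.txt remains
```

Either form scales to millions of files; the per-exec argument limit just determines how many times rm gets started.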
> If you want to do this in a portable way, learn to use `xargs` or `find`.
xargs and find are a hack here, because they will start the target program multiple times. They only work if CMD FILE1 FILE2 has the same meaning as CMD FILE1; CMD FILE2.
The fix with the stack limit is also a hack. What other system requires you to specify a hard limit, these days?
I think it does. The man page for execve(2) documents it. On my system, the documented limit is 2 MiB.
OTOH, I do sort of see the argument for "why is there an arbitrary limit here?", beyond just "it's limited by available memory".
OTOH, what on Earth is one doing, passing more than 2 MiB of arguments to a program? Surely one would be better served with piped input, at that point…
I've hit command line length limits on both Windows and Linux (and they're different lengths).
Typically when I've run into it, it's been with build systems, especially those built around make/gnumake. Have a set of object files that need linking, what do you do? Typically, just expanding the variable holding the list as an argument list to ld or whatever works just fine. Until the project grows substantially and you're linking hundreds or thousands of object files. And what if they're in a different path? A subfolder? What if you're using absolute paths to object file locations? These things can blow up in weird and surprising ways and it's not always obvious what the cause is.
Thankfully, most tools used to build software typically have a mechanism to read arguments from a file that doesn't suffer from this limitation.
Also, as a sibling comment mentions: glob expansions. Typically done by the shell, then passed to the invocation target. Better to pass the glob in quoted and have the invoked program do the expansion...
> OTOH, what on Earth is one doing, passing more than 2 MiB of arguments to a program?
Simplest example is:
mv * /some/place/
in a folder with a lot of files.
Also, if Python had a limit of 2M bytes in e.g. a list, then all hell would break loose. Yet, Bash is some people's Python, under some circumstances (and in Bash you're more likely to run into this problem).
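A sketch of the usual workaround for that mv case, assuming GNU mv (which has -t) and using made-up demo paths:

```shell
# When `mv * /some/place/` dies with "Argument list too long", let find
# feed the names instead of the shell expanding the glob. -t names the
# destination first so xargs can append the sources at the end:
mkdir -p /tmp/src /tmp/dst
touch /tmp/src/one /tmp/src/two
cd /tmp/src

find . -maxdepth 1 -type f -print0 | xargs -0 mv -t /tmp/dst

ls /tmp/dst
```

Unlike the `CMD FILE1; CMD FILE2` caveat above, mv to a directory is safely splittable across multiple invocations, so the xargs batching is harmless here.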