Hacker News
Pure Bash Bible (github.com)
746 points by ausjke 35 days ago | 248 comments



This seems like a good time to mention my (ridiculous) project, a ctypes module for bash.

https://github.com/taviso/ctypes.sh/wiki

There are some little demos here:

https://github.com/taviso/ctypes.sh/tree/master/test

I even ported the GTK+3 Hello World to bash as a demo:

https://github.com/taviso/ctypes.sh/blob/master/test/gtk.sh


I found ctypes.sh to be legitimately useful for managing resources in nix-shell.

E.g. flock a lock file and set CLOEXEC so that subprocesses don't hold the lock open after the shell exits.

E.g. Use memfd_create to create a temp file and write a key. Then pass /proc/$$/fd/$FD to programs that need the key as a file. When the shell exits, the file can no longer be opened.

You can do similar things with traps, but they aren't guaranteed to execute, whereas these OS primitives will always be cleaned up.
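
For comparison, the trap-based approach mentioned above might look roughly like the sketch below. The helper name `with_tempfile` and the variable `tmp` are illustrative assumptions, not part of ctypes.sh or nix-shell. The EXIT trap runs when the shell exits normally, but not if the shell is killed with SIGKILL, which is the gap the OS-level primitives close.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a temp file, use it, and rely on an EXIT trap
# to remove it. The function body is a subshell, ( ... ), so the trap fires
# when the function finishes rather than when the calling script exits.
with_tempfile() (
    tmp=$(mktemp) || exit 1
    trap 'rm -f -- "$tmp"' EXIT
    printf 'secret\n' > "$tmp"
    # Print the path so the caller can check that the file is gone afterwards.
    printf '%s\n' "$tmp"
)

path=$(with_tempfile)
[ -e "$path" ] || echo "temp file was cleaned up"
```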


"ctypes.sh is a bash plugin that provides a foreign function interface directly in your shell."


This is so awesome. I need to play with this.


This is MARVELOUSLY ridiculous and I love it. Thanks for sharing your work.


Hello, I'm the author of the Pure Bash Bible. Happy to answer any questions you may have.

Here's an example of what bash is capable of: https://github.com/dylanaraps/fff/ (a TUI file manager written in bash)!


This might be an odd/off-topic question, but in Telegram this article has an auto-fetched thumbnail of a cat smoking a cigarette and a text similar to 'heavy metal music playing', I'm just curious where this picture is from, if you have any idea? I checked the README for the repo, pictures of the contributors etc. but I'm unable to figure out where it's coming from.


That's a very very very old GitHub avatar of mine, I wonder why Telegram hasn't pulled a later one.


You can use https://telegram.me/webpagebot to refresh avatar cache


Telegram also shows my old Github avatar - it seems to cache them extremely aggressively.


Curious, thanks for the answer!


The site[1] where the video is actually hosted has the image you described as the avatar of the uploader. Could it be the reason?

1. https://asciinema.org/a/qvNlrFrGB3xKZXb6GkremjZNp


Just some feedback on the sale of the book (which I wanted and probably will buy).

Upon checkout you need to confirm the sale via an email sent to the address you provided.

I may be the exception and granted - it's basically due to a very crappy phone on which email ceased to work - but I was not able to finalize the sale since I'm not able to access my private email remotely.

I sent myself the link and will probably give it another shot from home. But you may want to take it up with the seller that there are folks out there for which this is not really a convenient way to close a sale. Especially not after entering valid credit card information.


I'll look into also putting the book on Amazon or another "ebook" website. I chose leanpub as it _is_ for books like this but if it does cause issues for people I'll explore my options on other platforms.

Thanks for letting me know and apologies for the inconvenience.


It's more an annoyance than a real problem, and I'd stick with them rather than Amazon. Leanpub looks like a nice enough bunch and you, as the author, receive most of the proceeds.

I'm probably the exception nowadays anyway, not being able to access private email remotely. But what you may point out to them is that it may be worthwhile to think through their checkout process, also for their own benefit and the benefit of other authors.

As I said I sent the link to myself and if I don't forget will buy it from home.

I really like to support you and your efforts and it looks like an awesome resource for somebody using bash a lot.


I'm a bit puzzled how you've ended up with a system where you can't access email remotely. That probably took a lot more time than just creating a gmail account or even using your own domain but hooking it up to google for all the boring stuff (like, uh, making it available remotely). Although you mention "private" email so perhaps you also have "public" email which you can access - i'm not sure how I'd configure mail servers to deal with that. Back when I wasn't confident accessing email via random internet cafes etc from my phone I created several accounts and have stuck to that system; some accounts i only access when i'm sure it's safe, and others - not used for banking etc - i couldn't care less about and use from anywhere.


My phone sucks (Nokia 8110 4G, where mail conked out after the first software upgrade) and I work for a bank, which disallows external messaging for regulatory reasons.


Pointers to documentation of features used would be a major benefit to this reference.

The hacks (as a 30+ year sh / ksh / bash user) are indeed Very Cool.


[OT] Hi Dylan! Just discovered your project - KISS. I respect you a lot for what you write, and you've been an inspiration at times.

Did you just decide one day that you have to write a distribution from scratch? What was the thought process, and how complicated is it actually? Also, I'd like to contribute if there's a chance.


> [OT] Hi Dylan! Just discovered your project - KISS. I respect you a lot for what you write, and you've been an inspiration at times.

Thanks, I appreciate it! :)

> Did you just decide one day that you have to write a distribution from scratch?

Pretty much. I'd been distro hopping for some time and wasn't happy with any of the choices in front of me.

I wanted something that could run without `dbus`, `glibc`, `systemd`, `wayland`, `polkit`, `elogind`, etc etc and none of the other distributions could provide this.

Even Gentoo threw their arms in the air when Firefox 69 broke the `--disable-dbus` configure flag (and added a mandatory dependency on `dbus`).

I instead spent the hours patching `dbus` out of Firefox 69 and that's how I ship it in KISS. https://github.com/kisslinux/repo/blob/master/extra/firefox/...

> What was the thought process

Start from zero and build piece by piece questioning each step along the way. Is this needed? Are there alternatives? Can we do this in a "simpler" way? Step away, come back to it later and ask "was this right?", "can we trim back the fat?".

This repeated until things were effectively "done".

> and how complicated is it actually

No piece of software seems to list its (mandatory) dependencies properly so it was a trial and error of figuring out _exactly_ what each piece of software needs.

Looking at other distributions themselves wasn't much help as they list a lot of "optional" dependencies as "required".

There's also no (or very little) documentation online for how to write a package manager or Linux distribution from scratch.

It's been a tedious but rewarding process thus far. I'm talking to you from KISS right now! It feels good to turn on my laptop and be running a distribution I created from scratch. :)

> Also, I'd like to contribute if there's a chance.

Go for it! In terms of contribution there's bug reporting, fixing documentation, adding missing packages, fixing bugs in existing packages etc.

Hop on IRC (#kisslinux @ freenode.net) if you'd like to chat. :)


completely unrelated, it's so funny that i saw this comment, looked at your name and thought "hey, i know this guy!". it turns out i forked your dotfiles ages ago when i first started playing with i3. also neofetch is pretty cool!


Thanks!


Any plans to sell the book in dead tree format?


That'd be really nice and it is something I've thought about, the issue is figuring out the _right_ way to do it. Self publishing or perhaps a publisher of programming books?

I can see now that there's a clear interest in a release in physical form. I'll start seriously looking into it. :)


fff looks amazing. I’ve started building something like it many times, but never finished. Thank you


Thanks :)


I have a personal dislike for regexes and non human readable code, it gives maintenance headache. It's why I avoid shell scripts as much as possible.

The first example is not human readable: if the name of the function is a lie, I have no idea what this piece of code does:

  trim_string() {
    : "${1#"${1%%[![:space:]]*}"}"
    : "${_%"${_##*[![:space:]]}"}"
    printf '%s\n' "$_"
  }


I have a personal dislike for "non human readable" used as shorthand for "not immediately and easily readable by me".


I'm not sure you are being fair. I think any frequent user of regex understands how easy it is to produce expressions that are very difficult to parse.

I also think that a discussion about readability makes sense in a post about Bash. I don't personally program very often in Lua, Go or Ruby, but for the most part when I encounter code in these languages I don't find it very difficult to understand and modify. In contrast to this, every encounter with a significant amount of Bash seems to lead to a great deal of googling.


I haven't said anything about regexes here. What I wrote is that I find the shorthand "non human readable" specifically to be objectionable. Are regexes sometimes difficult? Sure. Are people who read bash also human? Yes, and that's my point. I hope that point is fair.


The human part is a hyperbole. "This code is unreadable" implies "This code is unreadable for the intended audience."

Here's how I'd write that function: trim_string() { python3 -c 'import sys; sys.stdout.write(sys.argv[1].strip())' "$1"; }


"A collection of pure bash alternatives to external processes."


It is a neat academic exercise to find these but I think that's sort of the point - why?

If you're building something that multiple folks will be reading and using just call out to external processes and make it easier to follow what's going on. It's neat to know how to do these things, but they're honestly a bit of a security hole because a large portion of folks using them will never comprehend the why and how and just assume it's doing the proper what.


Why? No external dependencies, your code can run anywhere you have a Bash shell. This also means a smaller attack surface.

Also readability: While some of the examples may be somewhat confusing to someone who doesn't have a lot of Bash experience, to someone who does it can be a lot more readable than feeding a string in yet another language to an external program and processing the output..


Sed exists everywhere for a reasonably complete definition of everywhere - if it doesn't exist then you probably don't have access to Bash (and definitely not a fully featured one)


I have managed networks with 1000+ (microwave link) network devices that had Bash but no sed.. Sure was nice to be able to script maintenance on those..

Your definition of "reasonably complete" might not work for everyone else's use case.


Other people mention constrained environments, but tight loops are also a thing. If you know bash's parameter expansion well, it can make it really easy to write one-liners that speedily process large amounts of output. Setting up and tearing down a whole process (e.g. sed) can cause an order of magnitude slow down, which is significant if you just want something quick and dirty.
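
As a rough illustration of that point, here is a minimal sketch that rewrites each input line with parameter expansion only, so no process is started per line. The function name `strip_paths` and the sample paths are made up for the example.

```shell
#!/usr/bin/env bash
# Strip the directory and the last extension from each path using only
# builtins. Nothing is forked per line, unlike calling sed or basename
# inside the loop.
strip_paths() {
    local line
    while IFS= read -r line; do
        line=${line##*/}   # drop everything up to the last slash
        line=${line%.*}    # drop the shortest trailing ".ext", if any
        printf '%s\n' "$line"
    done
}

printf '%s\n' /var/log/syslog.1 /etc/hosts /tmp/archive.tar.gz | strip_paths
```

A single sed process reading the whole stream would also be fast; the cost being described shows up when an external command is launched once per line.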


I feel like... maybe? But quick or dirty should be enough - you could also build a python interpreter once to map over the data and then maybe write up some tests to confirm the functionality.

Every explanation of why to use bash here seems like it's got a whole lot of constraints on when it's good to use - I've finished some tasks just in bash scripts but when I'm writing something for anything other than one time passes it seems like more of a maintenance liability than anything else.


Meh. Choose your poison. As the trope goes, "No language is good at everything." Python is verbose compared to bash for many tasks, and the reverse is true as well.

IMHO, spending the time to actually learn some bash can really improve one's CLI life. Bashing on bash seems mostly like a tired ol' trope, "If it's in bash, it's bad."

The author of the original article really has written some pretty shell code!


Useful in small embedded. Add twelve lines to a boot script, or bring in several new executables into /bin?

There are resource constraints even in large-ish embedded. Your main file system may live in a decent amount of flash space, but when the kernel is booting, it uses a tiny file system in RAM, which is pulled out of an initramfs image. There is a shell there with scripts.

The partition for storing the kernel image (with initramfs, device tree blobs and whatever else) might be pretty tight.

However, because a lot of these snippets depend on Bash extensions, they preclude the use of a smaller, lighter shell.


Wouldn't a compiled binary be smaller and faster than an interpreted language? Assuming stdlib is already loaded.


Problem: we're using trim_string in an initramfs image. Now we need Python3 in initramfs. Oops, kernel doesn't fit into the ARM board's flash partition any more ...


lmao @ spinning up a python interpreter just to perform a trivial string operation


The problem is the direct logical implication of the hyperbolic statement "X is not human-readable" is that only non-humans can read X. Which is rude.


I think this bible is an interesting attempt at avoiding call outs but... I agree with the parent - /\s(.?)\s/\1/ is just easier to read, even with character classes /[:space:](.?)[:space:]/\1/ than the trim example, it's actually a bit sad that string manipulation is the first section since string manipulation is such an inherently dirty task to do.

It is neat to discover functionality of bash I was unaware of by seeing folks push it to the limit though.


Just because you seem to be unfamiliar with bash doesn’t mean it’s unreadable. Almost any language will look cryptic if you don’t know it. That function is mostly just parameter expansion and very common in most bash scripts. I bet if you read the manual you would easily be able to figure it out. You just have to learn the language.
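
To make that concrete, the trim_string function quoted earlier in the thread can be unpacked into named steps. This is a sketch of an equivalent, not the bible's own code; `trim_string_verbose`, `leading` and `trailing` are names chosen here for illustration.

```shell
#!/usr/bin/env bash
# Expanded, commented equivalent of the two-expansion trim_string.
trim_string_verbose() {
    local str=$1 leading trailing
    # %% removes the longest suffix matching the pattern. The pattern starts
    # at the first non-space character, so only leading whitespace remains.
    leading=${str%%[![:space:]]*}
    # Remove that leading whitespace from the front of the string.
    str=${str#"$leading"}
    # ## removes the longest prefix matching the pattern. The pattern ends
    # at the last non-space character, so only trailing whitespace remains.
    trailing=${str##*[![:space:]]}
    # Remove that trailing whitespace from the end of the string.
    str=${str%"$trailing"}
    printf '%s\n' "$str"
}

trim_string_verbose "   Hello,  World   "
```

The original does the same thing in two lines by nesting each pair of expansions and passing the intermediate result through `$_`, the last argument of the previous command.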


I'm with jmnicolas on this. I've been using bash on and off for years now (approaching decades). I still get it very wrong almost always. It's the only language where this is a persistent problem for me.

Just yesterday I was struck by the difference between if [[ ]]; and if [ ];

I didn't even bother grokking the difference in the end. I simply found something that worked and moved on with my day.


Single square bracket conditionals [] are the "older" (POSIX) version but do not have certain features that we would like to use in our shell scripts. Bash uses double square brackets to extend the functionality.

This is my understanding. Possibly not 100% correct but essentially correct enough for me to understand the reason/rationale for the difference.


You got it! `test` is a shell builtin. `[` is basically syntactic sugar, and is a synonym for `test` with the additional requirement that the last argument must be a closing bracket `]`. `[[` is a shell keyword with more features.
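
A small sketch of the practical difference, assuming bash; the file name and the helper `is_txt` are invented for the example.

```shell
#!/usr/bin/env bash
# How bash itself classifies the three forms.
type -t test   # builtin
type -t [      # builtin
type -t [[     # keyword

file="my report.txt"

# [[ is parsed by the shell, so the unquoted variable is not word-split and
# the right-hand side of == is treated as a glob pattern.
if [[ $file == *.txt ]]; then
    echo "[[ matched the pattern"
fi

# [ is an ordinary command: left unquoted, $file would split into two words
# and the test would fail with an error. Quoting keeps it a single argument.
if [ "$file" = "my report.txt" ]; then
    echo "[ matched the exact string"
fi

# Hypothetical helper: succeeds if its first argument ends in .txt.
is_txt() { [[ $1 == *.txt ]]; }
```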


I'm in no way defending bash as a language. There are lots of gotchas and weird constructs. I avoid bash too. It's just that trim function isn't that cryptic or "unreadable" if you know the syntax.


I think the point is that it takes a lot longer to pick the syntax up compared to some if/then/for/loop version of trim even if that would be way more verbose.


Yes, but I think that's antithetical to what languages like bash and perl are trying to do. Someone well versed in the language can do some pretty complex operations in a few keystrokes.


which gets back to the original point that these languages have chosen to be more powerful over being more immediately readable. you have to know many more features of the language to understand what is going on compared to colloquially used constructs like if/then or for loops. If you understand english you are a lot closer to understanding the latter rather than the former, more domain specific syntax.


I don't agree at all. I don't find it less "readable". You just have to be used to it. If you understand English, German isn't that hard of a language to learn. Japanese would probably be a little bit harder to pick up though. Doesn't mean Japanese is unreadable


Maybe "unreadable" and "readable" aren't the best ways to approach this conversation.

It's certainly less readable than, say, my_str.strip()


Are you comparing a call to a definition? `trim_string "$my_str"` is no less readable than `my_str.strip()`.


I mean, I would say it is. Why are there quotes around the variable? What happens when I remove the quotes?

The quotes don't function like the parens. If this were a two argument function, you wouldn't put one pair of quotes around the whole thing. They're clearly transforming the variable somehow, but I'm not sure how and/or why they're necessary.
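
One way to see what the quotes do is to count the arguments a function receives with and without them. A minimal sketch, where `count_args` is a hypothetical helper and the default IFS is assumed:

```shell
#!/usr/bin/env bash
# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

my_str="  two words  "

count_args "$my_str"   # quoted: one argument, whitespace preserved
count_args $my_str     # unquoted: split on whitespace into two arguments
```

The quotes do not transform the value; they stop the shell from splitting it on whitespace and expanding glob characters before the function is called.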


Sure, but bash only has certain language constructs. This implementation is using parameter expansion and some other built ins. It's definitely more complex than yours, but that's what bash has to offer.


I highly recommend running shellcheck on all of your bash code. It is what finally taught me the practical difference between [[ and [. There are some tests that will always pass in [, for example, but work properly in [[; one project never realized.
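
One well-known instance of a test that always passes in `[` is `-n` with an unquoted empty variable. The sketch below is an illustration chosen here, not necessarily the case the commenter had in mind.

```shell
#!/usr/bin/env bash
empty=""

# Unquoted, $empty expands to nothing, so this becomes [ -n ]: a one-argument
# test that only checks whether the string "-n" is non-empty. Always true.
if [ -n $empty ]; then single="true"; else single="false"; fi

# [[ does not word-split, so -n sees the empty string and the test is false.
if [[ -n $empty ]]; then double="true"; else double="false"; fi

echo "[ -n \$empty ]   -> $single"
echo "[[ -n \$empty ]] -> $double"
```

Quoting the variable, `[ -n "$empty" ]`, gives the expected result in `[` as well.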


My (personal) better recommendation is to avoid bash scripts whenever possible ;)

I generally tell people that, once a script is longer than ~100 lines and/or you start adding functions, you're probably better off with something like Python.

I know that's not a popular opinion with shell enthusiasts, but it's saved me so much frustration both in writing new scripts and coming back to them later for refactoring.


> I generally tell people that, once a script is longer than ~100 lines and/or you start adding functions, you're probably better off with something like Python.

Agreed. My threshold is usually "when you want to start using arrays or dictionaries". I find bash best for file manipulation and running programs in other languages


Bash is a typical Unix tool, and Unix follows the KISS principle. Hence, also Bash scripts should be KISS.

If you need >100 lines of code your code is too complex in the script world, and you should split it into several scripts where each does one thing, and that one thing well. Usually, bash scripts are applied with pipes. A sequence like cat x | tr a b | sort >output is much more likely and easy to handle than a single script which does all these things.

KISS with Bash is a very different approach compared to other scripting languages like Python and Perl. There you can write long code easily and conveniently. However, things can get tough when larger scripts need to be maintained.

I consider the examples in the "Bash Bible" a collection of useful black boxes. It's fine if they just work. Regarding the "unreadable" trim_string for instance, if you have trouble understanding that code and you have to change something, then you can simply write your own new trim_string script, even in Python or Perl if you like. Pipes work well with them too.

Update: Another advantage of bash and pipes over Python/Perl is that the Unix system runs each command in a pipe as a separate process, so the stages can execute in parallel. That means, simple bash scripts with pipes can work _much_ faster than single scripts in Python or Perl.


I tend to work in Windows, Linux and MacOS pretty regularly... so I tend to follow a similar approach (node instead of python though). Windows-isms under msys(git) bash in Windows just gets painful to coordinate at times.

I am curious if people will uptake PowerShell Core (though not much of a fan there either).


I'm so glad I'm not alone in this.

I never even bothered to learn Bash properly because even that's difficult, and I figured it'd be "good mental money after bad". And, now that Python ships with all distros, I feel even less need to.

I do like Bash's range syntax though:

    for i in {0..5} ; do echo $i ; done
Like Ruby's. It's so nice! D: I lament Python's lack of it.


    echo {1..5} 
is even nicer. Or

    echo {1..5}{1..5}{1..5}
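
For anyone who has not used it, a short sketch of how brace expansion behaves in bash (it is not part of POSIX sh):

```shell
#!/usr/bin/env bash
# Brace expansion generates words before any other expansion takes place,
# so no loop is needed.
echo {1..5}            # 1 2 3 4 5
echo {a..c}{1..2}      # a1 a2 b1 b2 c1 c2

# Adjacent brace expressions multiply: 5 * 5 * 5 = 125 words.
set -- {1..5}{1..5}{1..5}
echo "$# words, first=$1, last=${125}"
```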


My rule of thumb is to use a real language if there is more than one if fi block.


So, what in your opinion constitutes a "real language", and why?


a "real language" would be any programming language that is primarily designed and presented as a programming language.


Nice recursive definition..

The practice of programming in shell languages was well established when Bash was designed and Bash was definitely designed with that use in mind. So by your definition Bash is a "real" programming language.

Lisp on the other hand, was much more designed as a system and formal notation to reason about certain classes of logic problems.. So, Lisp is not a "real" language?


Bash is primarily a shell, it even has the word shell in its full name.


And a shell can't be a real language? https://scsh.net/


a shell can't be a real language?

No, it can not.


By "programming language" he probably means something that has robust flow control mechanisms and metaprogramming facilities.

You can definitely tell there's a different "feel" to bash and Tcl, Python, Perl, Go, etc, yes? Shell languages basically evolved out of batch processing languages that were meant to only run programs in sequence and it shows.


I agree, but I like phrasing it as "only run commands with substituted arguments." There's some more discussion on this below. :)

In the past I almost exclusively used Python, but I'm starting to like Go.


My disclaimer is that I love bash. I write way too many things in bash. But eventually I go back and refactor the scripts into Python (or lately, Go) because I am so sick of people refusing to touch my bash scripts.

String munging in Python is so much objectively easier to read than bash. I would say that it's because it is closer to a lot of the more popular languages than the sort of cryptic parameter substitution bash provides. And if that makes it easier to maintain some automation, then it's worth the effort just to use Python IMO.


Python is a portability minefield. I'll spend more time trying to get the script/package to run than reading the bash script.


Couldn't agree more, but honestly I think that is a completely overplayed issue. Just be explicit (This requires Python3.7) and have a requirements.txt file, maybe bundle with a `make install` command and move on.

If people can't figure that out... I don't know how you're going to expect them to read a cryptic bash script.

Don't get me wrong, there are totally some good use cases for bash. Init-scripts come to mind when you don't want to lug around a Python VM in a lightweight container, for example.


I don't think it's a matter of familiarity. Most of us here have used multiple languages, and I have no problem saying that the Bash syntax is atrocious.

"How do we close a statement.... errr. Let's spell it backwards".


How is it not a matter of familiarity? Just because it's not intuitive to you doesn't mean it's bad. I find anything not in an S-expression to be atrocious. Doesn't mean I can't learn and understand "horrible" syntax where I have to separate things with semicolons...


I am saying, we are allowed to have an opinion on it. You can be good at something and still acknowledge that it sucks.


> "How do we close a statement.... errr. Let's spell it backwards".

That's on the original Bourne shell and its author's evident love for Algol 68. Wikipedia has a fine summary.


"Almost any language will look cryptic if you don’t know it."

Seems disingenuous to me. Java and Python are in a different class of readability than Bash or Perl.

Example: "abc" + "def" = "abcdef" vs "abc"."def" = "abcdef"


> "abc" + "def" = "abcdef" vs "abc" . "def" = "abcdef"

Would be better comparison if you would use whitespace in the other case as well. Which then makes it just as readable.

Also. "0" + "42" would that be "042" or 42? It may be better readable, but the semantics are unclear.


That's an argument for types. While it's more convenient to use a language without types it's more error prone. So I guess in the end strong typing wins.

  >>> 0 + 42
  42
  >>> "0" + "42"
  '042'
  >>> "0" + 42
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: cannot concatenate 'str' and 'int' objects
  >>> int("0") + 42
  42


Is the following a maintenance headache, or non-human readable?

  #!/bin/sh
  FOO=" some long string here "
  FOO="$( echo "$FOO" | sed -e ' s/^[[:space:]]//g; s/[[:space:]]$//g ' )"
This isn't "pure bash", but most of what I write in shell scripts isn't "pure bash". It's shell scripting: dirty, slow, easy, effective.

Like any 'language', it takes on the complexity you put into it. English is really complicated, but you can also use a subset of it with only 850, 1200, or 2400 words, and suddenly it's very simple and clear.


> echo "$FOO" | sed -e ' s/^[[:space:]]//g; s/[[:space:]]$//g '

Your example fails when $FOO is "-n", for instance. Also the g modifiers are redundant in your example since there's only one beginning of line per line and one end of line per line.

I would instead write this:

  sed -r 's/^\s+|\s+$//g' <<< "$FOO"
EDIT: I think I see now what you probably thought would happen by using g, but no it wouldn't remove multiple spaces. So, your example also fails when $FOO is " x" (using 2 or more spaces at the ends).


Beware that -r is a GNU-specific option.
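
Putting the points from this sub-thread together, a version that avoids `echo`, GNU-only options and single-character matches might look like the sketch below. `trim_with_sed` is a name made up for the example, and like the other sed versions it still works line by line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: trim leading and trailing whitespace with POSIX sed.
# printf avoids echo's option parsing, so a value like "-n" is printed as-is,
# and the * quantifier removes runs of whitespace, not just one character.
trim_with_sed() {
    printf '%s\n' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

trim_with_sed "   x   "
trim_with_sed "-n"
```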


Unlike the original code, this strips whitespace from every line of input.


> It's why I avoid shell scripts as much as possible.

This is all from somebody who has just written a linux user space. Perhaps the author avoids shell as much as possible. It just isn't possible. I doubt it, though. Bash is awesome.


I only got to the first example and had a hard time deciding if the article was satire or not.

Let's use bash instead of a real programming language! It is super convenient! To do super fundamental things like trimming strings you just have to implement your own function with 40+ non-ASCII characters in a row! That example makes regexps seem like human readable and god knows if it is even correct or works portably across different versions of bash.

To be fair the article assumes you are first in bash and want to avoid launching subprocesses but for any production scripting that is the wrong hypothesis to begin with. A better approach would be to not start inside bash at all. Just do your scripting in a higher level language where basic ABC stuff like string, list, number and error handling are already there for you. Even if that trim_string function and everything else from the article would be provided in some bash-bible-std-lib it wouldn't even come close to what's available in say Python for example, and you still have to wrestle the syntax and other obscurities like -a (or -e) meaning file exists.

While there are some good examples in there if you are stuck in bash, this article was more of a 100 reasons not to use bash to me.


And it doesn't have to be. There should be trust that the function does what it says. If I use `printf` and don't understand how it's working internally it doesn't matter to me because I know what it does and trust it.

These scripts/snippets for me are just an unofficial extended standard lib


"maintenance headache."

Then stop maintaining code from languages you don't understand. This is frustrating. I've seen solutions on the comments that call PYTHON! Are you effing kidding me? PYTHON ??

Granted, BASH docs aren't particularly succinct, but shell scripts are an absolute necessity in the OS world.

It's pretty simple: If you don't understand shellcode, don't maintain an OS, or rather, don't expect to be accommodated for lack of knowledge for something that's been standard for 30+ years.


My inner pedant compels me to reply that these are not in fact regexes, but Bash parameter expansions.

https://www.gnu.org/software/bash/manual/html_node/Shell-Par...

My inner pedant has no comment regarding readability of Bash parameter expansions.


On the contrary regexes were invented because you shouldn't have to write 500 lines of code for that which can be done in two lines.

If we didn't have regexes we would have 100s of if/else's scattered all over program logic. That would be more hard to handle than the regex itself.


Perhaps it shouldn't be defined as what the regex can do, but which unit tests it's able to pass.


It might be worth it to learn it though, even if just for fun.


Do you have a personal dislike for math too?


those are not regexes though, are they ? apart from the [:space:] part


Didn't realize at first that "pure" refers to features available in bash without calling out to external processes, when I would've thought purism in this context should refer to avoiding bashisms and writing portable (ksh, POSIX shell) scripts.


I understand your concerns about bash features and POSIX compatibility. The bash bible was written specifically to document the shell extensions bash implements.

My focus for the past few months has been writing a Linux distribution (and its package manager/tooling) in POSIX sh.

I've learned a lot of tricks and I'm very tempted to write a second "bible" with snippets that are supported in all POSIX shells.

(I created the bash bible).


You can add me to those interested in a pure POSIX shell bible. Whenever I'm about to write a moderately large script, I'm actively avoiding bashisms as they don't buy me much yet make my script unportable. But, given you've spent so much time on this subject, are there any bashisms that are truly essential and you don't want to live without?


> You can add me to those interested in a pure POSIX shell bible.

I've started working on it here: https://github.com/dylanaraps/pure-sh-bible

> are there any bashisms that are truly essential and you don't want to live without?

The only thing I'd say I miss when writing POSIX `sh` is arrays.

I work around this by using 'set -- 1 2 3 4' to mimic an array using the argument list. The limitation here though is that you're limited to one "array" at a time.

The other alternative I make use of is to use "string lists" (list="1 2 3 4") with word splitting.

This can be made safe if the following is correct:

- Globbing is disabled.

- You control the input data and can safely make assumptions (no spaces or new lines in elements).

While it's something that'd be nice to have, there are ways to work around it.

EDIT: One more thing would be "${var:0:1}" to grab individual characters from strings (or ranges of characters from strings).
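
The workarounds described above could be sketched as follows. The syntax is POSIX sh and also runs under bash; the sample values and the helper name `first_char` are assumptions made for the example.

```shell
#!/usr/bin/env bash
# 1. The argument list as the single available "array".
set -- apple "banana split" cherry
echo "count: $#, second: $2"
for item in "$@"; do
    echo "item: $item"
done

# 2. A "string list" that relies on word splitting, with globbing disabled
#    so that an element such as * is not expanded to file names.
list="1 2 * 4"
set -f
for el in $list; do
    echo "el: $el"
done
set +f

# 3. A POSIX stand-in for "${var:0:1}": "${1#?}" is the string without its
#    first character, and removing that as a suffix leaves the first one.
first_char() {
    printf '%s\n' "${1%"${1#?}"}"
}
first_char "hello"
```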


  set -o pipefail


That, and local variables. I avoid bash for the same reasons as GP, so I tend to (reluctantly) live without pipefail and work around the lack of local variables with subshells (which I assume has a performance impact - but hey).
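
A minimal sketch of both points, run under bash; `report_pipeline`, `bump` and `counter` are illustrative names.

```shell
#!/usr/bin/env bash
# A pipeline's exit status is normally that of its last command, so the
# failing first stage below goes unnoticed unless pipefail is set.
report_pipeline() {
    if false | true; then
        echo "pipeline ok"
    else
        echo "pipeline failed"
    fi
}

report_pipeline          # pipeline ok
set -o pipefail
report_pipeline          # pipeline failed
set +o pipefail

# Workaround for missing `local`: give the function a subshell body, ( ... )
# instead of { ... }, so assignments inside cannot leak to the caller.
counter=outer
bump() (
    counter=inner
    echo "inside:  $counter"
)
bump
echo "outside: $counter"   # still "outer"
```

The subshell costs a fork per call, which is the performance impact mentioned above.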


> I've learned a lot of tricks and I'm very tempted to write a second "bible" with snippets that are supported in all POSIX shells.

That could be potentially even more interesting than a bash-specific one (as it is harder to get it right -- bash can be figured out from a single reference, anything "portable" has many dependencies).


Off the top of my head, a few notable things I've learned:

- Safely working with "string lists" (list="el el el el").

    - Filtering out duplicate items.

    - Reversing the list.

    - etc.
- Using `case` to do sub-string matching (using globbing).

- Using `set -- el el el` to create an "array" (only one array at a time!).

- `read -r` is still powerful in POSIX `sh` for getting data out of files. `while read -r` even more so.

- POSIX `sh` has `set -e` and friends so you can exit on errors etc.

- Ternary operators still exist for arithmetic (`$(($# > 0 ? 1 : 0))`).

- Each POSIX `sh` shell has a set of quirks you need to account for. What works in one POSIX `sh` shell may not in another. (I found a set of differences between `ash`/`dash` in my testing).

POSIX `sh` is a very simple language compared to `bash` and all of its extensions so there won't be as many snippets but there's some gold to be found.
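
To give a flavour of the first two items, here is a rough sketch in POSIX `sh`. The helper names (`contains`, `dedupe`, `reverse`) are invented for this example and are not taken from the repository; list elements are assumed to contain no whitespace:

```bash
# Sub-string test with case and a glob pattern; no grep needed.
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Drop duplicate elements from a whitespace-separated list, keeping order.
dedupe() {
    out=
    set -f                          # keep * and ? literal during splitting
    for item in $1; do
        case " $out " in
            *" $item "*) ;;         # already seen
            *) out="${out:+$out }$item" ;;
        esac
    done
    set +f
    printf '%s\n' "$out"
}

# Reverse a list by prepending each element to the result.
reverse() {
    out=
    set -f
    for item in $1; do
        out="$item${out:+ $out}"
    done
    set +f
    printf '%s\n' "$out"
}

contains "hello world" "lo w" && echo "match"   # prints match
dedupe "a b a c b"                              # prints a b c
reverse "1 2 3 4"                               # prints 4 3 2 1
```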


I've started working on it here: https://github.com/dylanaraps/pure-sh-bible


I see you already made progress there! Congratulations!


I just wanted to add I'd be very interested in your shell-based package manager as well!


I'd be very interested in a POSIX sh version of this bash bible.


I've started working on it here: https://github.com/dylanaraps/pure-sh-bible


I expected a tool on bash where you can access the pure "Bible"

hallelujah.sh


On that note, does anyone have a favorite cli Bible reading program?


`apt search bible` gives lots of hits, though my favourite edition is unfortunately only available on tumblr: https://kingjamesprogramming.tumblr.com/



Why is this getting downvoted?


Because it is super offtopic. I would also downvote "does anyone have a favorite CLI recipe manager?"


no mention of /dev/tcp?!

yes, it looks like a device node in /dev, but it's really a pure bashism for opening tcp connections to arbitrary hosts and ports
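
For anyone who hasn't seen it, a minimal sketch of the idea. It assumes a bash built with `/dev/tcp` support; `http_head` is a hypothetical helper and the host is whatever you pass in:

```bash
# Hypothetical helper: print the reply to a plain-HTTP HEAD request.
# Needs a bash built with /dev/tcp support; host and port are placeholders.
http_head() {
    local host port line
    host=$1
    port=${2:-80}
    # Open a read/write TCP connection on file descriptor 3.
    exec 3<>"/dev/tcp/$host/$port" || return 1
    printf 'HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n' "$host" >&3
    # Read the reply line by line until the server closes the connection.
    while IFS= read -r line <&3; do
        printf '%s\n' "$line"
    done
    exec 3<&-   # close the connection
}
```

Something like `http_head example.com` should then print the response headers, with no HTTPS, as the path is handled by bash itself and not by the kernel.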


I've been meaning to get around to it. I hadn't messed around with /dev/tcp prior to writing this bible.

I've since implemented a very bare-bones and very featureless IRC client using /dev/tcp and bash.

https://github.com/dylanaraps/birch

I will get around to it eventually. The one hurdle I want to get over before writing a piece about it is the handling of binary data using bash.

This is something a little tricky to do with bash but it'd allow for a 'wget'/'curl' like program without the use of anything external to the shell (no HTTPS of course).

I want to really understand the feature before I write about it, though in the meantime I could just write a reference to the syntax/basic usage. :)


https://www.linuxjournal.com/content/more-using-bashs-built-... has a decent explanation.

I came here to say the exact same thing. This is my favourite thing most people don't know exists in Bash.


Keep in mind that major distros (Debian, possibly derivatives) disabled tcp/udp service names in 2000:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=65172


Whoa... News to me! Thanks.


I'm only using bash occasionally, this was useful.

In particular the "obsolete syntax" section, I wasn't aware of it.

https://github.com/dylanaraps/pure-bash-bible#obsolete-synta...


I don't quite get the recommendation to always use env bash over #!/bin/bash? If I use the full path, it is to get just that - the system's Bash. If it is missing or overruled in $PATH then I most likely don't want the script to run in the first place.


'System bash' isn't a universal or clear concept. For example, if you're using Modern OS X, you likely have bash via brew or some other userspace package manager but no system bash. Presumably you (or at least, most people) would still like your scripts to run in this case.


But if I write the script for a bunch of RHEL servers then OS X and brew are irrelevant, and the full path is better (IMHO). It's the 'always use env..' I object to.


It's RHEL... today. `/usr/bin/env` is a POSIX standard. Maybe one day, RHEL will put bash in /usr/local/bin. Or maybe you'll switch to FreeBSD one day, and suddenly everything goes boom.


env and sh are both POSIX but AFAIK the path is not specified for either of them. If POSIX has an opinion on how you should start a script I would be happy for a link?

Bash in RHEL is in /usr/bin and /bin, as /bin is symlinked to /usr/bin. I think it is equally unlikely that RHEL (Debian, SLES..) will move either /bin/bash or /usr/bin/env, as it would break a million scripts out there.

If we should migrate to FreeBSD while, for some reason, reusing linux oriented bash scripts, changing the path to /usr/local/bin/ would be the least of my headaches.

I agree that 'env' can make good sense if you don't know who/where/when your script is used. For internal projects, I don't really see the advantage.


    $ uname -a
    Linux localhost 3.10.49-5975984 #1 SMP PREEMPT Thu Oct 8 17:25:20 KST 2015 armv7l Android
    $ which env
    /data/data/com.termux/files/usr/bin/env

https://xkcd.com/927/


Related: pure Bash LZ77 data compression/decompression, crc32, hex enc/dec, base64 enc/dec, binary head/cut, in a single 13KB file:

https://github.com/faragon/lzb


This is really neat, thanks for sharing.


Dylan, I was wondering if you could share your process for going from chapter text files to a full ebook? This looks like a really approachable way to writing a book. Is the conversion something scripted or is it a more involved process?


I write a bash script or two every month so I thought I'm okay. But then came along the very first example:

  trim_string() {
      # Usage: trim_string "   example   string    "
      : "${1#"${1%%[![:space:]]*}"}"
      : "${_%"${_##*[![:space:]]}"}"
      printf '%s\n' "$_"
  }

Ok, so the : is somehow a temporary variable... Then there is a variable starting at $ and you lost me :D Can someone break down that line for me? What the hell is going on here?

    : "${1#"${1%%[![:space:]]*}"}"


: is the null command. Kind of like /bin/true. So : followed by anything exists simply to perform expansion on something. If you fully understand what that means, then you will understand this: $_ is simply “whatever the last argument to the previous command expanded to”. So : is the previous command and the ${1… expanded value is now ${_….

Because $_ is used in the expansion of itself, it is the same value, because the command has not yet completed, which would (re) set $_. So use of a thing ($_) in expanding the same thing ($_) is perfectly fine until after that (null) command (:) runs. You see that the final $_ is used standalone. Hope this helps.


I think the idea is that the `:` builtin allows expansion of arguments without actually doing anything else. However, the temporary variable $_ is filled with the content of the expression. That is, after the first

    : "${1#"${1%%[![:space:]]*}"}"
The $_ temporary variable contains the result of removing the leading spaces. In the next line, the spaces at the end are removed from the temporary variable with the "${_%..." syntax.

You can test this in your own shell by e.g. doing:

    : $PATH
    echo $_
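
If it helps, here is the same logic spelled out with ordinary variables in place of `:` and `$_`. This is a sketch for reading purposes only; `trim_string_verbose` and its variable names are made up:

```bash
trim_string_verbose() {
    local str leading trailing
    str=$1
    # Remove the longest suffix that starts at the first non-space
    # character; what is left is just the leading whitespace.
    leading=${str%%[![:space:]]*}
    # Strip that leading whitespace from the front.
    str=${str#"$leading"}
    # Remove the longest prefix that ends at the last non-space
    # character; what is left is just the trailing whitespace.
    trailing=${str##*[![:space:]]}
    # Strip that trailing whitespace from the end.
    str=${str%"$trailing"}
    printf '%s\n' "$str"
}

trim_string_verbose "   example   string    "   # prints: example   string
```

The original does exactly these four expansions, only nested in pairs and passed through `$_` so that no named variables are needed.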


And that's the problem with bash scripting - very quickly it gets very cryptic, difficult to follow and understand without knowing various "clever tricks" and gimmicks. This is all fine and dandy for personal usage but god forbid other developers might need to change something inside this kind of "clever" code.


Bingo. I am a pretty casual level scripter. My programming is always just a means to an end (usually biology related) and I would much rather write easy to read code that takes 30 secs to run than whatever the pure bash thing in the parent post is.


I feel the code above is clearly a bastardization, which is possible in several languages. Not to discount your point in the many bashisms, but that example is definitely not what's wrong with scripting.


I don't like this function very much but here's a few notes...

: is a "do nothing" command -- but the line is still evaluated

%% removes the longest match of the pattern from the end of the value (trailing chars)

## removes the longest match of the pattern from the start (leading chars)

I don't know why they're using $_; that's the variable containing the interpreter name, i.e. "/bin/bash" [edit - also the name of the previous command!]

I can't be bothered analyzing it any further :-)


After the first command, $_ expands to whatever the last argument to the previous command expanded to. In this case the previous command was : and the only argument is by definition the last. This is how you chain things together without clunky temporary variables.


Kinda cool, I don’t write or read much bash and tend to stick to sh compatible stuff.

There are some neat tricks in here but they don’t seem very readable compared to perl/awk/sed.


Yeah, I also don't look at things like

    trim_string() {
        # Usage: trim_string "   example   string    "
        : "${1#"${1%%[![:space:]]*}"}"
        : "${_%"${_##*[![:space:]]}"}"
        printf '%s\n' "$_"
    }

and think: "I should use bash more".

Bash is nice for making simple things simple but for complicated things it's just shitty. I used to think that this is due to the complicated quoting rules which make the simple things simple but tcl does a much better job at that.

In either case I prefer the clean rules of a Python or Perl for anything larger.


Please limit your shell scripting to POSIX sh for the sake of broad compatibility with current and future operating systems. There's a quick reference here:

https://shellhaters.org/

If you find yourself frustrated by the lack of bash extensions, your program is probably complex enough that you probably shouldn't be writing a shell script.


Very nice recipes. I'd avoid shadowing actual executables with bash function names (see head in the book) as a harbinger of great grief.


As someone that writes bash scripts regularly for my job, one of the big struggles I've had is remembering what I or someone else wrote when I look back 6+ months later haha.

I had the same issue with reading other people's perl as well.

I think the great and terrible thing about both languages is that there are literally a million different ways to skin the cat / write a regex.


Neat. Sometimes bash is the right choice, and then a guide like this is a great complement to google / stackexchange.


Very useful. I hate having to pull in external processes to do something simple in BASH that the manpage doesn't cover. Expecting PERL to be on every machine (or the right version of SED.. posix? gnu?) is not reliable in my line of work.


Would love to see this in a ZSH equivalency... Especially with the impending move to ZSH in MacOS.


Looking at the very first example:

  trim_string(){  
      # Usage: trim_string "   example   string    "
      : "${1#"${1%%[![:space:]]*}"}"
      : "${_%"${_##*[![:space:]]}"}"
      printf '%s\n' "$_"
  }

This reads horrible. I see no reason to prefer this over programs like sed, bash is after all a shell, intended firstly for running external programs/commands.


> This reads horrible.

Just use the shell function by its descriptive name...

The reasons are explained in the foreword:

Calling an external process in bash is expensive and excessive use will cause a noticeable slowdown. Scripts and programs written using built-in methods (where applicable) will be faster, require fewer dependencies and afford a better understanding of the language itself.


Sure, but the point of Bash is to run external programs.


The slowdown caused to any developer that has to try and read that is probably orders of magnitude more worth optimizing for than how fast a bash script runs.

I'll take the grep/sed/awk version.


Not defending this particular example, but there are contexts where avoiding an external call can make a difference. Think a script calling an external program in a deeply nested loop. This sort of optimization could be used as a last resort, after identifying a real performance issue and evaluating the possibility of restructuring the code.
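
As a small illustration of that kind of swap, here is a sketch with two hypothetical functions that print the last path component of each argument. The first starts one `basename` process per path, the second uses only parameter expansion:

```bash
# External process per iteration: one fork/exec for every path.
names_external() {
    for p in "$@"; do
        basename "$p"
    done
}

# Same result for simple paths using parameter expansion only.
names_builtin() {
    for p in "$@"; do
        printf '%s\n' "${p##*/}"
    done
}

names_builtin /var/log/syslog "/tmp/my file.txt"
# prints:
# syslog
# my file.txt
```

The two are not fully equivalent: `basename` also copes with trailing slashes, so the expansion version is only a drop-in replacement when the input is known to be well-formed.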


I ran into this decades ago with file servers. With half a million files, it took too long to run a simple for loop that called 3rd-party processes like awk/sed when I was just using them to format/search text. Breaking it down to mostly bash-only scripts reduced the run time and ended all pauses.


I was going to argue that it would be better to simply not use a shell for that, but

> decades ago

Frankly, I can only imagine how the environment then would be. Thinking back with your current experience, what do you think you would have done if you had to fix it again?


Things have changed so much, from the app side laying out data, file system/storage side, and hardware speeds, that people are much more lazy with applications due to the environmental improvements.

I used to make lists first, then process the lists; I still do this sometimes since it's faster. If you have to run a query every time, you're probably doing it wrong, but for small stuff, everything is so fast that I can chain GNU apps and be done. I'm not a programmer, I'm a sysadmin, so I mostly deal with things like auditing or fixing data on a file system (or maybe a db).


And the funny thing is that some of these systems are still in use for critical missions.

Sometimes even bash isn't an option, you have to deal with older shells like ksh.


Sure. I don't disagree there is a time and a place. I just never personally encounter those kinds of times or places. I don't doubt they exist and in that case I would definitely agree the correct way to address it would be write the more performant code.

I guess my bash uses cases are more just around internal tooling and are never performance critical, so my personal preferences are for readability under those circumstances.


Would not the chances be that, if you're in a deeply nested loop requiring optimization, you're better off using a "proper" programming language? Maybe Python, or something similar.


Yes sure, but rewriting a legacy system is not always an option. My experience comes from the scientific computing world where venerable Fortran programs are glued together with huge piles of shell scripts.


I was talking about the implementation. To whoever has implemented and who will maintain this; to them I think this will read horrible, and that is not good even if it is invisible to an end-user. I mean, how long wouldn’t even a very experienced bash programmer take to understand that substitution...


To be fair this is one of the least readable examples of the document. They're not all that bad.


Yes, far from it. Actually a lot of them seem very intuitive, esp.

  lower() {
      # Usage: lower "string"
      printf '%s\n' "${1,,}"
  }

And with uppercase, "${1^^}" instead.


> I see no reason to prefer this over programs like sed, bash is after all a shell, intended firstly for running external commands.

Are there systems that don't come with sed installed? (Some docker containers I have logged into don't seem to have less).


Well, less is only intended for interactive use AFAIK, in contrast to sed which is very much scriptable; and part of the POSIX standard [1]

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/s...


I've never known `sed` not be installed but I have been caught out by different implementations of sed before.


I have inherited about 4K LOC Bash, which mostly works as advertised. No one wants to touch it! suggestions ?


Skim https://learnxinyminutes.com/docs/bash/ Then open the code and just go through it. Check everything you find unfamiliar with that cheat sheet. If it doesn't cover something in your code, write it down to research separately later.


Assuming it's all in one file I would move related pieces of functionality into separate files and then source them when necessary. Should make things more manageable for people only wanting to make small changes.


no, it's 27 directories and 947 files, with a plugin architecture (pipeline)


my mistake, the project is closer to 40,000 LOC, now that I count it...


Leave it alone if you don't need to modify it. If you start needing to modify it a lot start rewriting it in ruby, go, or python at that point. Use set -x at the top of the file to get an idea on what's going on.


I have moved to Crystal Lang for the times when I need the speed that ruby can't offer. I could have leveled up my bash skills, and was indeed using bash for that purpose, but crystal is easier, I think.


Kudos for using shellcheck. I use it all the time.


Can anyone recommend something similar for Make?


Great read dylan!


Thanks! Fancy seeing you here. :)


Oh cool. Same fellow who made pywal.


Looks more like a "cookbook"


for f in *; do is cute but what about those idiots who put spaces in the filenames?
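
For what it's worth, the glob itself copes with spaces; trouble only starts when the variable is later used unquoted. A small sketch (`list_files` is a made-up helper):

```bash
# List regular files in a directory, one per line.
# The glob expands to whole file names, so spaces stay intact
# as long as "$f" is quoted wherever it is used.
list_files() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue   # also skips the literal pattern when nothing matches
        printf '%s\n' "${f##*/}"
    done
}
```

Names containing newlines would still confuse anything that reads this output line by line, but that is a limitation of the line-based output and not of the loop.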


The title made me think you can query The Bible in bash...



How about performance? Is it fast as rust or slow as python?


While this is interesting I see doing anything but launching programs with simple text-substituted arguments as too much for bash or sh. Run shellcheck on some of your own code, or the code of even a simple project to see how hard it is to really use bash.

Why I think people gravitate towards it is because languages such as python add too much pomp to launching a shell process. A language like perl is usually easier to use but everyone hates it now.


The shell is my favorite language and I really don't know why.

It's definitely possible to write shell code which properly passes shellcheck's linter though it's an uphill battle to learn the ins and outs and _why_ X is wrong when Y is right.

I even managed to write a full TUI file manager in bash!

https://github.com/dylanaraps/fff

I fully understand that there are times when the shell should not be used and when other languages are a better way to solve a specific problem, however I love pushing the shell beyond its supposed limits! :)


Have you read Bash Pitfalls[1]? Do you write truly correct Bash/POSIX code? Do you still love it?

[1]: https://mywiki.wooledge.org/BashPitfalls


> Have you read Bash Pitfalls[1]?

I've read pretty much everything I could get my hands on regarding the shell (including the mentioned link) and I still love it.

> Do you write truly correct Bash/POSIX code?

If we define correct as passing shellcheck, avoiding all pitfalls and maintaining compatibility (POSIX sh not bash), then yes, I like to think so. :)

> Do you still love it?

Oh yeah! I've been writing a ton of POSIX sh as of late. My latest project being a Linux distribution: https://getkiss.org/

(hello from Firefox in KISS!)


Wow, very impressive. Do you also believe it's a viable language with which newcomers should start writing scripts?

Edit: follow up question is Do you believe bash / POSIX shells actually follow KISS principles?

Not questioning whether your OS is KISS, but I don't think that necessarily reflects the KISS-ness of the underlying language.

My questions clearly reflect my current impression that in the long term, shell pitfalls largely undermine the benefits of its apparent simplicity. The gist would be for you to provide some way to change my mind. I guess the codebase you provide is a strong counter example; but you'll agree it doesn't reflect general usage of shell in the wild.

Edit 2: You know what, I just read your original comment again. I kind of retract my question since you do concede that it's an uphill battle and you love it in spite of its flaws. I guess that's cool (and I agree it's fun trying to write correct bash as a challenge) as long as you're in control of the code being produced, but my main impression remains that it's a bad language to publicize and its presence in most codebases inherently bears a strong cost.


> Do you believe bash / POSIX shells actually follow KISS principles?

POSIX `sh` yes. `bash` less so but I'd still lean more towards a yes.

Ultimately though, it depends on how we define "simple". Both `bash` (2.6MB) and POSIX `sh` shells (`dash` (232KB), `ash` (1.2MB (busybox)), etc) are tiny in size if we compare them to Python (137MB) or Perl (44MB).

(Numbers taken from my system using `du` on each file which belongs to each shell/language.)

If we define "simple" to language features then I think the shells come out on top again (especially POSIX `sh`).

If we define "simple" as ease of use (without shooting yourself in the foot) then I'd agree with you and say that the shell loses here.

There's a time and place for using any tool (in production) but I find it fun to push the shell beyond what is thought possible in my personal projects. :)


Shell scripts also win "simple" in ease of deployment. (well assuming the author has paid attention to platform differences)


From what I understand, the commenter is the author of 'Pure Bash Bible' :-) https://getkiss.org/pages/team/


I don’t think it’s pomp. Once I learned Unix pipes and the tools for manipulating data (sed awk cut etc), it just became much faster and easier than writing python scripts that do the same thing. You can literally connect the output of one process to the input of another with a single character. It’s much more complex in python.

It also tends to be very portable.


I think you're reading my comment the wrong way: I meant to say that doing e.g. piping in Python is a lot of pointless work (pomp), as you agree.

This is perhaps one big benefit but not one that is exclusive to a sh-like language. Instead I would like to see a language with strong flow control or metaprogramming capabilities take on processes as a first class citizen. Perl is probably the closest but still has some warts related to redirection.

The best pattern I have seen is encapsulating business logic into Python or Go and then if really necessary piping it to another script. But, often, if you do this you can just keep the piping internal using data structures.

Unix-pattern facilities work very well for interactive use, which tends to be simple, exploratory, and trial-and-error. But a project's build script may not be simple.


Perl was literally born because Larry Wall reached the limits of what could be done with a combination of Shell + C + Unix utils.

In fact this is how Perl 1 looks: https://st.aticpan.org/source/RCLAMP/perl-1.0_16/

https://github.com/Perl/perl5/commit/8d063cd8450e59ea1c611a2...


If you pardon the shameless self promotion; I'm working on something just like that:

https://github.com/lmorg/murex

It currently has:

* Proper error handling (eg try and catch blocks)

* unit testing and debugging frameworks to help with development and maintainability

* data-type aware, including complex types like how CSV, JSON and YAML are all handled as memory structures and thus the same tools can query any structured data format without understanding its contents

* while still ostensibly working the same way as a traditional POSIX shell

There's also some work on improving the REPL experience too where I've included:

* automatic man page parsing for flags

* a "tool tip text" like hint line which tells you where commands reside on the fs, what it does, etc. Which is handy if you're trying to debug an existing command.

* if you paste multiline text into the console you get offered a chance to preview the text before executing it (handy if, like me, you're pretty useless at copy/pasting content reliably)

* a package management system so you know exactly which functions and imported scripts are loaded from which sources (no more "where did that autocomplete suggestion / alias / etc get loaded from?")

There's a few other features like support for events and such like, but they're not yet documented.

The shell is currently beta but I've been using it as my daily driver for about 18 months now. There are still quite a few bugs, plenty of places where code needs to be rewritten for performance and lots of stuff isn't yet documented (though the documentation is pretty good already considering it's only me working on it). So don't expect a finished product. However I do think I'm at the stage where I'm ready for more users to have a play and I welcome PRs, issues raised, general comments and feedback, etc.

# end of shameless self promotion :D


Seems quite similar to Elvish (https://elv.sh/); have you checked it out?


Sorry, only just seen this comment. Just in case you're still following this thread and interested in a reply:

I'm aware of Elvish and really impressed with what's been built and the traction that has gained. There is definitely some overlap between murex and elvish but also a lot of areas where our shells differ.

I think there is sufficient difference between the two shells to justify their existence.


Oh, definitely. Just thought it was something you should check out and I didn't see it mentioned anywhere.


Nice project, look forward to seeing a HN post on it!


Thank you


> Perl is probably the closest but still has some warts related to redirection.

I wholeheartedly agree on perl but can you expand on the redirection warts? I seldom had problems with perl's FHs, while, on the contrary, I seem to be unable to wrap my head around the contorted syntax involved in bash's handling of descriptors - especially when more than 2 handles are involved.


Well -- I was comparing Perl to Python in this case. There is roughly the same amount of boilerplate and both are easier to read than bash's redirection.

The wart is that you need IPC::Open3 or equivalent because Perl's intrinsics can not synthesize the pipe operator (though you will think that they can) if you need to insert yourself into the middle of a chain of commands.

Nowadays there are decent wrappers for calling Open3 but more commonly you just find people running one half of the command, buffering the output, and passing it to the second half.


Aw, right. I'd forgotten about IPC::Open3. Because I did not want to think about it, mostly (and so, I usually do exactly what you say people do :-) )


My mistake - I managed to completely misinterpret what you meant. I agree with your comment!


I think tcl is exactly in this niche. I haven't had time or an excuse to learn it, but it seems to fit perfectly. Haven't yet found out why it seems to be dying away


Following page has Tcl code written by antirez to create command pipes:

https://wiki.tcl-lang.org/page/Pipeline+programming


A collection of pipeline programming implementations in Tcl:

https://wiki.tcl-lang.org/page/Commands+pipe


So PowerShell then?


PowerShell removes the pomp from running processes and adds it squared into running normal functions.


Isn't that Windows only?


Not any more: https://github.com/PowerShell/PowerShell/releases/

I quite like Powershell on Windows, but I'm not sure I dare try it on Linux. I think it might make my head explode.


It's quite clear Microsoft are only a couple of years from either an official linux distro or some halfway house of using a linux kernel internally for major os functionality, so it makes little sense learning some new(ish) shell when you could just use bash (via wsl or whatever).


No, but I haven't seen many Linux users (admitting to) using it.


The overhead of repeatedly launching subshells/processes to do simple operations can add up quickly, especially if it is happening in loops or in parallel. Yes, you shouldn't be using bash for performance, we all know that. But scripts often grow over time and suddenly are found to be slow/resource hogs. I have seen people demand that proper logging be added to a bash program and it then get minutes behind because of the overhead of all the processes called to do the logging. I was able to make that 10000x faster using pure bash.

As to why people like it: it often feels more natural when you are automating what you would type interactively. Unix pipes/coreutils/etc. also feel like a better fit when that automation is mostly about connecting other programs (it's the auxiliary stuff that you'd maybe want in pure bash). Reading the subprocess Python documentation does not exactly fill me with joy. I've heard libraries like Plumbum make it a bit neater - but then you have to ask why learn a bunch of new libraries when I already know bash? In the end it's about the best tool for the job. The danger with bash is going too far, especially if you don't actually know it very well.


My favourite way to log: redirect stdout and stderr ( &> ) into a process substitution ( >() ) running "tee", so the output reaches the log file as well as the terminal. `exec &> >(tee "${__DIR}/${DOC_LOCAL}/${LOG_LOCAL}")`
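
Spelled out as a sketch (bash only, since it relies on process substitution; `start_logging` is a hypothetical wrapper and the log path is up to the caller):

```bash
# Hypothetical helper: mirror everything the script prints into a log file.
start_logging() {
    # >(...) is process substitution: tee runs in the background,
    # appending its input to the log file and copying it to the
    # stdout that was in place when this function was called.
    exec > >(tee -a "$1") 2>&1
}
```

Calling `start_logging "$HOME/myscript.log"` near the top of a script (the path is just an example) sends all later output, stderr included, to both places. Because `tee` runs asynchronously, the last few lines can reach the terminal slightly after the script itself has exited.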


Would you mind expanding on this with an example? I'm trying to improve performance of my bash logging


So this line only redirects all script output to a file as well as ensure that that output makes it onto your screen at the same time.

It is unstructured only in the way you allow any running command within your script to dump their output.

I run most commands inside my scripts with `> /dev/null 2>&1` and then rely on exit codes to wrap structured information to be echoed out with this function:

    echo_l() { echo; echo '--------'; echo "${1}"; echo '--------'; echo; }

Or functions such as this to populate a log:

    log_cat() { echo "${1}" >> "${__LOG}"; }

But the tee named pipe is the winner.

PS. The __DOC_LOCAL and __DIR variables start with these magic variables below. These variables are a life saver and allow easy directory and file manipulation, they kind of setup a top-level context:

    # Set magic variables for current file & dir
    __DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
    __FILE="${__DIR}/$(basename "${BASH_SOURCE[0]}")"
    __SCRIPT="$(basename "${__FILE}")"
    __BASE="$(basename "${__FILE}" .sh)"
    __ROOT="$(cd "$(dirname "${__DIR}")" && pwd)"


+1. My goal is to learn how to robustly do any logging whatsoever. Eager to learn new tricks.


Do you just want to log the scripts execution or do you want something more structured? If it's the former you can redirect the output from within the bash script with this (apologies for any condescension, I'm not familiar with your skill tree):

  if [ ! -t 1 ]; then
    exec > /my/log/file 2>&1
  fi

The if statement tests whether stdout is attached to a terminal; if it is not, all output from the script gets redirected to /my/log/file. The above poster is instead redirecting into a subprocess, "> >(tee)", that will both print the output and log it.

It should be noted that often the bottleneck is the terminal itself, try running your scripts with a "> /dev/null" to suppress output and verify the slow part is actually the script.


> I see doing anything but launching programs with simple text-substituted arguments as too much for bash or sh

I would heartily agree. If your shell script grows beyond half a dozen commands or so, you're probably better off rewriting it in just about anything. Python, ruby, go (gorun), rust (cargo-script), or whatever else, doesn't really matter as long as it's not shell.



I am slowly moving from bash to python for my utility scripts. I was unable to love Perl; I think this weird syntax is a very bad choice.

Anyway bash is still faster to use, and a lot of Unix services are based on it. The book is well written and has great added value. Thank you for sharing!!


>>I think this weired syntax is a very bad choice.

Ah, the luxury the modern breed of programmers enjoy today makes me feel jealous. In these days of Splunk and document databases (returning JSON and XML), it's hard to understand why so many things in the past were the way they were.

Apart from DBMS interaction (Perl had DBI/x modules for that), pretty much everything from the 1980s up to the late 2000s (tons of legacy systems) was so non-standardized that people were literally parsing through log files and non-standard data formats to store/exchange a lot of things. This was when the internet was growing like crazy year over year, and systems needed to be built and put in place. This means you really needed to have first-class facilities to manipulate text. You needed regexes baked neatly into the language. You needed qw, you needed ``, you needed while (<FILEHANDLE>), you needed binary file handling features, you needed string manipulation facilities that could help you drill through any text file you could imagine, you needed powerful functional programming features, you needed OO etc etc. And you needed to get this done under tough deadlines on slow machines. Remember, Java being cross-platform was one of the biggest selling points of the day. Perl had this before Java.

I personally worked on building a store using rcs and perl, that could store versioned config files, almost like a document data base. The parsing facilities required for the application we did demanded nothing short of a tool like Perl.

Python, and also Java growing rapidly once the data exchange formats were reduced to mark up languages and JSON. Suddenly you could with a library what most Perl programmers were doing using their language powers.

Also look at the Human Genome Project and Perl usage there.


Perl's Tie::DBFile is still the lowest effort data persistence I have seen.

https://perldoc.perl.org/DB_File.html#A-Simple-Example


I do Python more or less full time these days, but bash replacements are the things that I get Perl out for. It has so much more syntactic sugar that makes them better than Python. Python is better for proper applications. Perl is better for one off scripts for manipulating the underlying system.


Inability to use tools like Perl and Emacs is the biggest reason why I often have to break the bad news to people that their week- to month-long projects (typically in Python and Java) could likely be done in a few minutes to a day or two, if they knew how to use Emacs or Perl well.

Programmers take great pride in freeing accountants and warehouse workers from drudgery. But seldom do we look at our own work in the same way.

In the real world, most software work is done very similar to digging coal mines with shovels. Laborious manual hand typing jobs.


ehh, I agree that it's usually faster to do stuff in perl than in python, but if someone spent weeks doing something in python that could be done in a few minutes of perl, python is not the problem. And it would probably take them weeks to do it in perl as well.

There are things you can do in a line of perl that will take you 5-10 min in python, but once a program goes beyond one-liners, the difference is not that big.

And of course, if you learned vim instead of emacs, you would be even faster :))


Yeah perl was nice when we were allowed to use that. Why doesn't python have something like named pipes?

    f = open("ls|", "r")
    f.read()
    f.close()


I'm not sure what that's supposed to do, but subprocess provides these facilities, although definitely in a more verbose way.

    from subprocess import PIPE, Popen, run
    f = Popen('ls', stdout=PIPE).stdout
    f.read()
    f.close()
Alternatively, given this exact behaviour:

    run('ls', stdout=PIPE).stdout


yeah sure subprocess.check_output(). If you ever dealt with much perl you know how much more pleasant and easy launching processes was in that language - like shell. Python is great, I'm not on the "bash python" wagon. Launching processes, getting the output, munging it and shoving it at another process is more fiddly and less natural in python. If you want to use that as evidence that I'm a garbage developer go ahead, I realise I'm not writing a carefully researched paper here with examples.

In Perl you literally took the part of your shell pipeline, quoted it and opened it like any other file.

    ./output_generator | wc -l

becomes

    open(SRC, "output_generator|");
    open(WC, "|wc -l");
And you do whatever you want with those filehandles. It's been ages since I wrote any perl. I miss it.


You might like the python "sh" library [0]. It's not perfect, but it sure does remove the visual noise around subprocess.

[0] https://amoffat.github.io/sh/


> yeah sure subprocess.check_output()

It's deprecated and does something quite different.

> If you ever dealt with much perl you know how much more pleasant and easy launching processes was in that language - like shell.

I'm sure it is, but you're missing my point.

> If you want to use that as evidence that I'm a garbage developer go ahead

Well that escalated quickly.

> open(SRC, "output_generator|");

> open(WC, "|wc -l");

> And you do whatever you want with those filehandles.

Again python does roughly the same, just with more overhead: a trailing pipe is a "stdout=PIPE", an input is a "stdin=<whatever>", and you access / forward stdout explicitly:

    from subprocess import PIPE, Popen
    src = Popen('output_generator', stdout=PIPE)
    wc = Popen(['wc', '-l'], stdin=src.stdout)
And as the sibling notes, for shell replacements you can use the sh library to lower the syntactic overhead of popen.


Doesn't scale well when you have to stitch more than two commands with pipes, which is a very common use case for Unix CLI utilities.


I found this problem as well when doing a direct port including pipes from bash to python. It was the first iteration into python and I was lacking time to do a proper port into python native.

The python code to do piping ends up long-winded and the plumbing of pipes becomes a massive headache, so I wrote tidycmd to overcome that issue: https://github.com/laurieodgers/tidycmd


Don't see why it "doesn't scale well", the overhead is roughly constant: use the stdout of one program as the input of the next.
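
To make the "roughly constant overhead" point concrete, here is a minimal sketch of chaining any number of commands with the standard subprocess module. The helper name `pipeline` is hypothetical (it is not part of subprocess or of any library mentioned in this thread), and it assumes at least one command is given.

```python
from subprocess import PIPE, Popen


def pipeline(*commands):
    """Run argv lists as a shell-style pipeline; return the last stage's stdout as text.

    Hypothetical helper for illustration; expects at least one command.
    """
    procs = []
    prev_stdout = None
    for argv in commands:
        # Each stage reads from the previous stage's stdout (the first inherits stdin).
        proc = Popen(argv, stdin=prev_stdout, stdout=PIPE)
        if prev_stdout is not None:
            # Drop our copy of the read end so upstream stages see SIGPIPE
            # if a downstream stage exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output = procs[-1].communicate()[0]
    for proc in procs[:-1]:
        proc.wait()
    return output.decode()
```

With that in place, each extra stage costs one more argument, e.g. `pipeline(["printf", "b\na\nb\n"], ["sort"], ["uniq"])`. It is still wordier than `|`, which is the trade-off being argued about here.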


That's not as easy as

    cat something | grep "this" | cut -f 1 | sed -e 's/.../.../' 
You end up writing too much code; it's very verbose. Sometimes symbols are what you want. In fact, the biggest progress in the growth of math happened when they tossed out doing math with words and brought in symbols.


> That's not as easy as

> cat something | grep "this" | cut -f 1 | sed -e 's/.../.../'

Literally none of this is actually useful if you're already in Python:

    (
        line.split('\t')[0].replace(…, …)
        for line in open('something')
        if 'this' in line
    )
> You end up writing too much code

I can believe that if you're calling to external processes to perform operations which are pretty much trivial in the language.
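
A runnable sketch of the same idea, for anyone who wants to try it. The sed expression in the parent is elided, so the pattern and replacement used below (`\s+` to `_`) are purely hypothetical stand-ins, as is the function name `first_fields`. Note also that `grep "this"` matches a regex while `in` is a plain substring test; for a literal word they behave the same.

```python
import re


def first_fields(lines, needle, pattern, replacement):
    """Roughly mirrors: grep needle | cut -f 1 | sed -e 's/pattern/replacement/'."""
    for line in lines:
        if needle in line:  # grep (substring match only)
            field = line.rstrip("\n").split("\t")[0]  # cut -f 1
            # count=1 because sed without the g flag replaces only the first match
            yield re.sub(pattern, replacement, field, count=1)


# Hypothetical sample input standing in for the file "something".
sample = ["this one\tx\n", "skip me\ty\n", "and this\tz\n"]
print(list(first_fields(sample, "this", r"\s+", "_")))
```

Passing `open('something')` instead of `sample` would stream the file line by line, same as the generator expression above.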


Of course, if you plan to use one $language alone, you can do anything in that $language.

This question is specific to pipes.


> Of course, if you plan to use one $language alone, you can do anything in that $language.

That's missing what I'm noting though, which is that you don't need pipes anywhere near a shell script if you can simply do more of your work in-language. Case in point being that the entire pipeline you cited as an issue has no reason to exist outside of a shell or shell script.


That's because regexes are another pain in Python. You have to write a lot of boilerplate exception handling code. Plus the same boilerplate code for file operations.

It doesn't feel like the language was designed for these tasks.


You forgot setting -e and -o pipefail. In case "something" is missing or can't be read for some other reason, your script will just continue happily without warning.

But oh, if you do set -o pipefail, the grep will fail the whole pipeline when none of the lines matches "this". So you have to keep fiddling with ${PIPESTATUS[0]}. And neither pipefail nor PIPESTATUS is really portable.

Not so simple after all.
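
For comparison with the Python side of this thread, here is a hedged sketch of what a PIPESTATUS-style check could look like with subprocess. The helper name `run_pipeline` and the file path are hypothetical; the only facts relied on are that Popen exposes each process's exit status and that grep exits with status 1 when nothing matches.

```python
from subprocess import DEVNULL, PIPE, Popen


def run_pipeline(*commands):
    """Return (stdout_text, [exit status per stage]), similar in spirit to PIPESTATUS.

    Hypothetical helper for illustration; expects at least one command.
    """
    procs, prev = [], None
    for argv in commands:
        proc = Popen(argv, stdin=prev, stdout=PIPE, stderr=DEVNULL)
        if prev is not None:
            prev.close()  # let upstream stages see SIGPIPE if downstream exits
        prev = proc.stdout
        procs.append(proc)
    out = procs[-1].communicate()[0].decode()
    return out, [p.wait() for p in procs]


# A missing input file shows up as a non-zero status for the first stage,
# separately from grep's own "no lines matched" status of 1.
out, statuses = run_pipeline(["cat", "/nonexistent/something"], ["grep", "this"])
print(statuses)
```

The caller then decides which statuses are fatal, for example treating grep's 1 as fine but any failure of the first stage as an error.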


Yes, Python is the Cobol of script languages.


What? Why?


I started out with bash scripts and .bat files and eventually found perl to be the ideal solution to 99% of what I needed as well. I'm still using it for log file parsing and day to day administrative tasks, but I'm starting to pick up python because that seems to be where everyone has agreed to move to.


One reason i prefer Bash (and i mean Bash not just any shell) is that it tends to stay stable - scripts written years ago work just fine today. These days i mainly use Bash on Windows (via MSYS2) and really it is available pretty much everywhere, either out of the box or via installation.

If anything, code that i used to write in Python in the past i now write in Bash, exactly because Bash has a better record when it comes to not breaking stuff. Though it helps that my Python use is also mostly scripts meant to run from the shell.


Yes - and no. I find the shell is a useful glue language for procedural tasks (do this, run P1, then P2, and if blah) which can/had better be logically simple but rather long and heavy on the system interaction side.

For these, being able to avoid the various $(echo... |sed) can be refreshing. Besides, the book makes for a nice repository of techniques.


I use nodejs. Sometimes it's a case of "go with what you know" rather than "use the best tool for the job". I have too many other things to do for me to bother learning yet another language just for command line scripting.


I recently started working in more of a devops role at a company (I've been a ruby dev for most of my career) and one of the things that I've noticed is how _insane_ scripting by ops people is. I'll start poking around a build pipeline and there will be all sorts of tortured usages of sed and jq and bash and coreutils to get things done that would be _really_ simple in ruby. I routinely see people using wget and curl in the same script. I'll see people pipe sed text replacement into perl for more text replacement. I'm constantly aware of ruby three-liners that are fully portable, even to Windows, that could replace a dozen lines of potentially subtly buggy shell scripting.

And honestly I can't see any good argument for this patchwork approach to gluing things together. I guess some ops people might argue that you'd have to have ruby everywhere but the counter argument would be that we use docker images for everything and adding ruby as a dependency isn't any worse than all the insane dependency gymnastics it takes to get our node apps working.

And all of this applies equally for any language with a reasonable standard library (python, perl). I think people have weird feelings about using bash or make or whatever to accomplish things, like they are riding closer to the metal or that they are living some deeply pragmatic zen Unix philosophy, but mostly they are making an un-testable mess until it works once and then, if they are lucky, they don't have to touch it again.



