
Understanding Bash (2018) - sergi_chalauri
https://www.linuxjournal.com/content/understanding-bash-elements-programming
======
fooblat
I love bash and I know I'm not alone.

Sure, its syntax is based on ideas from the 1960s[0] and it has a weird core
library of functions[1]. Certainly writing code that is a blend of logic and
managing other programs takes some getting used to. However, it is very well
documented[2] and I've had great success with it over the past 20 years.

If one takes the time to get to know it, it is actually easy and fun to write
scripts that are robust and easy to maintain.

I'd venture to say that bash is the putty in the little gaps of the internet.
It is the fitting that glues programs to the various systems on which they
run.

0\. [https://en.wikipedia.org/wiki/ALGOL](https://en.wikipedia.org/wiki/ALGOL)

1\. [https://en.wikipedia.org/wiki/POSIX](https://en.wikipedia.org/wiki/POSIX)

2\.
[https://www.gnu.org/software/bash/manual/](https://www.gnu.org/software/bash/manual/)


~~~
bordercases
It's also extremely fast.

~~~
kazinator
Sure, until you need to do something advanced like have a function return a
string and use it in the caller.

Test case:

    
    
      #!/bin/bash
    
      fun()
      {
        echo "result"
      }
    
      RES=$(fun)
      echo $RES
    

Snippet from system call trace (via "strace bash ./test.sh"):

    
    
      pipe([3, 4])                            = 0
      rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
      rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
      rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
      lseek(255, -10, SEEK_CUR)               = 51
      clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fccbed72a10) = 3870
      rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
      rt_sigaction(SIGCHLD, {0x446240, [], SA_RESTORER|SA_RESTART, 0x7fccbe3b0ff0}, {0x446240, [], SA_RESTORER|SA_RESTART, 0x7fccbe3b0ff0}, 8) = 0
      close(4)                                = 0
      read(3, "result\n", 128)                = 7
      read(3, "", 128)                        = 0
    

In other words, this requires a pipe to be created, a child process to be
spawned, and its output to be read from that pipe.

In any reasonable scripting language, and even some unreasonable ones, this is
all done in one address space. How about, oh, Awk:

    
    
      function fun() { return "result" }
    

One of the most common mantras for effective shell programming is "avoid
writing loops; orchestrate external utilities".
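
A minimal sketch of what the fork in the trace above means for program semantics, assuming bash. The `side_effect` variable is a hypothetical addition to the test case: an assignment made inside `$(...)` happens in the child process and is lost, while a direct call runs in the same process.

```bash
#!/bin/bash
# Hypothetical variant of the test case: fun also sets a global as a side effect.
fun() {
  side_effect="set inside fun"
  echo "result"
}

side_effect="untouched"
RES=$(fun)               # fun runs in a forked child; output comes back over a pipe
echo "$RES"              # result
echo "$side_effect"      # untouched: the child's assignment never reaches the parent

fun > /dev/null          # direct call, same process, no fork
echo "$side_effect"      # set inside fun
```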

~~~
Spivak
The way one typically accomplishes this in bash is the following since the $()
construct spawns a subshell.

    
    
        #!/bin/bash
        
        fun() {
            res="result"
        }
    
        fun && echo "$res"
    

The same strace snippet.

    
    
        openat(AT_FDCWD, "./test.sh", O_RDONLY) = 3
        stat("./test.sh", {st_mode=S_IFREG|0755, st_size=53, ...}) = 0
        ioctl(3, TCGETS, 0x7ffd65c6d890)        = -1 ENOTTY (Inappropriate ioctl for device)
        lseek(3, 0, SEEK_CUR)                   = 0
        read(3, "#!/bin/bash\n\nfun() {\n  res=\"resu"..., 80) = 53
        lseek(3, 0, SEEK_SET)                   = 0
        prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=512*1024}) = 0
        fcntl(255, F_GETFD)                     = -1 EBADF (Bad file descriptor)
        dup2(3, 255)                            = 255
        close(3)                                = 0
        fcntl(255, F_SETFD, FD_CLOEXEC)         = 0
        fcntl(255, F_GETFL)                     = 0x8000 (flags O_RDONLY|O_LARGEFILE)
        fstat(255, {st_mode=S_IFREG|0755, st_size=53, ...}) = 0
        lseek(255, 0, SEEK_CUR)                 = 0
        read(255, "#!/bin/bash\n\nfun() {\n  res=\"resu"..., 53) = 53
        fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x2), ...}) = 0
        write(1, "result\n", 7result
        )                 = 7
        read(255, "", 53)                       = 0
        rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
        rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
        exit_group(0)                           = ?
    
    

If you want to use the old magic to return values the way the built-ins do,
then it's a little uglier but a lot cleaner.

    
    
        #!/bin/bash
    
        fun() {
          eval $1=\""result"\"
        }
    
        fun res && echo $res

~~~
kazinator
Assignment to hard-coded global variable is not a return mechanism, and
generally a nonstarter. It's not a viable approach for generally returning
strings out of shell functions everywhere in a codebase as a matter of habit.

Use of eval: still slow, because the shell eval re-processes input from the
character level and up. eval should be generally avoided as much as possible
in shell programming. Careless use of eval can introduce security holes (piece
of untrusted datum gets evaled as an expression). You really need to have your
black belt in "shell escaping karate".
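
A small sketch of the eval hazard described above, assuming bash. Both setter functions and the payload are hypothetical; the payload is harmless and only proves that the value was executed rather than stored. Neither form validates the variable name itself, so the name still has to be trusted.

```bash
#!/bin/bash
# Hypothetical setters, for illustration only.
set_with_eval()   { eval "$1=\"$2\""; }           # the value is re-parsed as shell code
set_with_printf() { printf -v "$1" '%s' "$2"; }   # the value is stored verbatim

payload='$(echo INJECTED)'    # stands in for an untrusted datum

set_with_eval out1 "$payload"
set_with_printf out2 "$payload"

echo "$out1"    # INJECTED: the command substitution ran
echo "$out2"    # $(echo INJECTED): stored as plain text
```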

Producing output and capturing with command substitution is the primary idiom
for getting text out of a shell function. It has no visible side effect. The
rebinding of standard output is scoped to the command substitution (and is
confined to the child process, in fact), and the creation of the temporary
process and pipe, expensive as they might be, are invisible to the program
semantics.

> _The way one typically accomplishes this in bash_

In summary, what you propose here is not only vanishingly atypical, but also
bad coding practice.

~~~
Spivak
> In summary, what you propose here is not only vanishingly atypical, but also
> bad coding practice.

What I'm describing has been SOP in bash for twenty years. You can program in
a more modern style and replace some uses of eval with
${!indirect_references}, declare, and associative arrays but eval is still in
wide use today. Search through /etc for eval with grep and you'll return
plenty of results in the wild.

For example the eval in my previous post could be rewritten as the following

    
    
        declare -g "$1"="result"
    

but this form, although safer, is the less common usage.

Assigning to global variables or passing in a variable name to get the return
value is just how bash works. See $MAPFILE $OPTIND $OPTARG for examples of the
former and read as an example of the latter.
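
For reference, a short sketch of the built-ins mentioned above, assuming bash. The `parse` function and the sample inputs are made up for illustration: getopts hands back the option argument through the OPTARG global, mapfile fills the MAPFILE array when no name is given, and read assigns to whatever variable names it is passed.

```bash
#!/bin/bash
# Hypothetical function: parses "-n NAME" with getopts.
parse() {
  local OPTIND=1 opt         # reset the getopts position for this call
  name=""
  while getopts "n:" opt; do
    case "$opt" in
      n) name=$OPTARG ;;     # the option argument arrives in the OPTARG global
    esac
  done
}

parse -n world
echo "$name"                 # world

mapfile -t <<< $'one\ntwo'   # no array name given, so MAPFILE is filled
echo "${MAPFILE[1]}"         # two

read -r first rest <<< "alpha beta gamma"   # caller passes the variable names
echo "$first"                # alpha
echo "$rest"                 # beta gamma
```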

~~~
LukeShu
That [declare -g] is bad too, because it means you can't use it to set a local
variable (see my sibling comment about dynamic scoping), which is surprising
and confusing to the caller.

While I agree with the parent that the whole thing is disgusting, I also
recognize that it's a valid optimization technique, and it's sometimes
necessary. So, if you do go down that route, I'd encourage you to do it as

    
    
        printf -v "$1" '%s' "$result"
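
A sketch of the difference, assuming bash 4.2 or later (where declare -g exists). The function names are hypothetical. declare -g writes to the global scope and skips the caller's local, while printf -v follows dynamic scoping and sets the caller's local.

```bash
#!/bin/bash
# Hypothetical helpers, for illustration only.
set_global()  { declare -g "$1"="from set_global"; }
set_dynamic() { printf -v "$1" '%s' "from set_dynamic"; }

caller() {
  local out="unset"
  set_global out
  echo "after declare -g: $out"    # after declare -g: unset
  set_dynamic out
  echo "after printf -v: $out"     # after printf -v: from set_dynamic
}

caller
echo "global out: $out"            # global out: from set_global
```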

------
peterwwillis
> [ ] is a command—basically another way to call the built-in test command.

There is literally a '/usr/bin/[' binary, which takes a final ']' as its last
argument, and it is actually a different binary than '/usr/bin/test'. Both are
different from the bash builtin. Crazy shit.

    
    
      vagrant@vagrant:~$ [ --version
      -bash: [: missing `]'
      vagrant@vagrant:~$ /usr/bin/[ --version
      [ (GNU coreutils) 8.28
      Copyright (C) 2017 Free Software Foundation, Inc.
      License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
      
      Written by Kevin Braunsdorf and Matthew Bradburn.
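
One way to see which of these bash will actually run is the `type` builtin. A minimal sketch, assuming bash; the output of `type -a` depends on what is installed on PATH, so no output is shown for it.

```bash
#!/bin/bash
type -t [       # builtin
type -t test    # builtin
type -t [[      # keyword: parsed by the shell itself, not run as a command
type -a [       # lists the builtin first, then any binaries found on PATH
```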

~~~
kohtatsu
I think it's cute.

------
JoelMcCracken
This seems like a very basic introduction to bash, which is fine, but I was
hoping for something called "Understanding Bash" to help with some of the
understanding about why it behaves so weirdly at times.

For me, the biggest problem when writing bash is when I need to do something
just a little more complex than what is there, but there is no good way to do
it. An example that usually trips me up is this:
[https://mywiki.wooledge.org/BashFAQ/050](https://mywiki.wooledge.org/BashFAQ/050).

FWIW, the above link is probably the most useful thing I have found that
actually helps me understand bash.

~~~
nixpulvis
I've been writing a (hopefully one day) POSIX compatible shell using a parser
generator library. Having to try and fit sh into a parser framework makes you
make some choices, and it's a great way to find these weird things. For
example, I was surprised at first to learn that `{ ls }` isn't a valid
program. Or:

    
    
        FOO=1 echo $FOO
        # vs
        FOO=1; echo $FOO
        # vs
        FOO=1 printenv FOO
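
For anyone who wants to check, here is a sketch of what those three forms do in bash, starting from an unset FOO. The labels in the echo strings are added for illustration, and the forms are reordered so the persistent assignment comes last.

```bash
#!/bin/bash
unset FOO

# $FOO is expanded before the temporary assignment takes effect.
FOO=1 echo "prefix + echo: [$FOO]"       # prefix + echo: []

# The assignment does reach the command's environment.
FOO=1 printenv FOO                       # 1

# A prefix assignment does not persist in the shell afterwards.
echo "still: ${FOO-unset}"               # still: unset

# With a semicolon it is an ordinary assignment, then a separate command.
FOO=1; echo "semicolon: [$FOO]"          # semicolon: [1]
```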

------
empath75
> Note that the exit value of true is 0, and the exit value of false is 1.
> This is somewhat counterintuitive, and it's the exact opposite of most
> programming languages.

"Happy families are all alike; every unhappy family is unhappy in its own
way."

[https://en.wikipedia.org/wiki/Anna_Karenina_principle](https://en.wikipedia.org/wiki/Anna_Karenina_principle)

~~~
banachtarski
The article is wrong though. For error codes, 0 being a success value is the
CONVENTION. A non-zero code is an error. This isn't about "programming
languages."

~~~
SAI_Peregrinus
The problem is that for bash `true` is a function (or builtin) that returns 0,
while for C (and thus most languages) any non-zero value (typically 1 for the
builtins) is true. In C99 and later with stdbool.h, `true` is almost always
defined as `#define true 1`.
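
Both conventions are visible inside bash itself. A quick sketch, assuming bash: commands succeed with exit status 0, while the arithmetic command `(( ))` treats a non-zero value as true, C-style, and maps that to exit status 0.

```bash
#!/bin/bash
true    && echo "true exits with $?"      # true exits with 0
false   || echo "false exits with $?"     # false exits with 1

# Arithmetic context: a non-zero value is "true", reported as exit status 0.
(( 1 )) && echo "(( 1 )) exits with $?"   # (( 1 )) exits with 0
(( 0 )) || echo "(( 0 )) exits with $?"   # (( 0 )) exits with 1
```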

------
m4r35n357
It is far simpler to just learn the POSIX shell first - much shorter man page.
Then the Bash stuff is just a few additions if you ever need them (other
shells are available).

~~~
frou_dh
Notably, such a concept as a function having a local variable is beyond POSIX
shell.

~~~
m4r35n357
"Variables may be declared to be local to a function by using a 'local'
command."

~~~
frou_dh
Here's the spec. Where is it?
[https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html)

~~~
m4r35n357
It is in both ash and dash, the shells I mentioned. Your attempt at point
scoring is irrelevant but expected.

ash & dash are about 100kB executables (POSIX plus a tiny number of
non-interactive improvements like 'local'), bash is 1100kB. That cost is paid
by every shell and subshell.

~~~
frou_dh
My point is that POSIX shell purity is masochism. Understanding how it's the
core and other shells build on top of it is certainly valuable, but it's not
inherently particularly good to use.

~~~
m4r35n357
Maybe we are at cross purposes. My default shell is bash, but my default "sh"
is dash. This is as it should be (and is default in Debian & Ubuntu I think).

If you are talking about interactive use, of course bash is the one to use,
but for writing shell scripts, it is a 10X "interactivity overhead" over dash.

~~~
frou_dh
IMO it's a big win for scripts to at least be able to use Arrays in that
otherwise "stringly-typed" world.

~~~
aidenn0
As a POSIX shell masochist myself, Arrays are the single biggest missing
feature. POSIX shells actually do have one array: $@. Having more would make
my life so much easier.

The amount of gyrations I go through to account for the lack of arrays is
easily 1000x worse than accounting for not having local variables.
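
A sketch of using $@ as that one array. It sticks to POSIX features, so it should run the same under dash or bash. The `demo` function and its items are made up, and wrapping the work in a function keeps the script's own positional parameters intact.

```bash
#!/bin/bash
# Hypothetical demo: uses only POSIX features.
demo() {
  set -- "first item" "second item"   # replace the positional parameters
  set -- "$@" "third item"            # append one element
  echo "count: $#"                    # count: 3
  for item in "$@"; do                # quoted "$@" keeps embedded spaces intact
    printf '<%s>\n' "$item"
  done
}

demo
```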

------
peterwwillis
I've been using bash for nearly two decades, and I just found out that Bash is
apparently the only shell whose built-in _echo_ uses the _-e_ option. I
thought others used this too, but it seems every other shell's _echo_ just
implicitly interpolates escaped characters. POSIX actually says anything
following '\' is undefined behavior. Apparently _printf_ is the only portable
way to interpolate escaped characters in output.

Here's a bunch of very useful tips like that:
[https://www.etalabs.net/sh_tricks.html](https://www.etalabs.net/sh_tricks.html)
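
A short sketch of the portable printf approach mentioned above. The sample strings are made up; the point is that escapes are interpreted in the format string and by %b, while %s prints its argument verbatim.

```bash
#!/bin/bash
printf 'col1\tcol2\n'     # escapes in the format string are interpreted
printf '%s\n' 'a\tb'      # a\tb: %s prints the argument verbatim
printf '%b\n' 'a\tb'      # %b interprets escapes in the argument, giving a real tab
```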

~~~
kohtatsu
That resource is lovely, thank you.

------
RMPR
> Second, in principle, there's nothing to enforce that a UNIX shell must have
> echo as a built-in, and therefore, it's important to have the external
> utility /bin/echo as a fallback.

I read here
[https://en.wikipedia.org/wiki/POSIX#Overview](https://en.wikipedia.org/wiki/POSIX#Overview)
that echo has been standardized for POSIX, can you elaborate further?

------
RMPR
Amazing article, I like to understand the why behind things like that. Kudos
to you.

------
ReedJessen
This is a really good article and worth the read.

