
Anatomy of a shell - emersion
https://drewdevault.com/2018/12/28/Anatomy-of-a-shell.html
======
willghatch
Some other interesting projects around the strict posix shell are Morbig[1], a
more trustworthy static parser for the posix shell, and the work of Michael
Greenberg[2], who is trying to formalize the semantics of the posix shell.

[1] [https://hal.archives-ouvertes.fr/hal-01890044](https://hal.archives-ouvertes.fr/hal-01890044)

[2] [http://www.cs.pomona.edu/~michael/](http://www.cs.pomona.edu/~michael/)

But my personal takeaway from this sort of research is that the posix shell is
a truly insane language beyond even its readily apparent shortcomings. It's
worthwhile to figure out the exact semantics of a language with so much
existing code and have a precise interpreter for it, but frankly I think the
more important task is to move away from such terrible languages. Bourne-
derived shells have some great ideas, but these could be formalized into more
sensible languages. In particular, I think the best direction is to embed
shell languages inside good general-purpose languages. I'm biased to that end
since I've written a shell embedded in Racket. But there is no shortage of
better shell languages that benefit from better language design, such as
Powershell and Elvish.

~~~
chubot
I agree that not just bash, but POSIX shell is beyond salvaging, at least for
many use cases where it should be applied.

The way I think of it is that I can't tell my (former) coworkers or my sister
to learn to program shell in 2018 "with a straight face". More than 50% of
them had Ph.D.'s in some other field like statistics, math, biology, or
physics -- i.e. the kind of person who has no problem with Python. But they
would struggle with shell or make.

They're the kind of people Mike Bostock (author of d3) wrote this post about
Make for [1]. He's absolutely right that they should use Make, because like
shell it's a great tool to combine different programs that weren't designed to
be combined.

But I feel like this is basically a losing battle -- I don't see that shell or
make is an exciting thing for most people to learn, even though it would make
them more productive and increase the quality of their work. I think that a
language free of legacy would go a long way.

I guess I can see POSIX sh being used when confined to the lower level of an
OS, by specialists. But I think it has a lot more potential than that. I'd say
it's more important now than 10 years ago, because there's so much more
software out there and systems are increasingly built of large heterogeneous
parts.

\----

FWIW I mentioned Morbig here [2], along with several other shell parsers, like
ShellCheck in Haskell and shfmt in Go.

[1] [https://bost.ocks.org/mike/make/](https://bost.ocks.org/mike/make/)

[2]
[https://github.com/oilshell/oil/wiki/ExternalResources](https://github.com/oilshell/oil/wiki/ExternalResources)

~~~
Sir_Cmpwn
I don't see sh as a losing battle. I actually think it's quite good for the
niche it aims to serve - a niche that very much exists. sh is a square peg and
should only go in square holes. I think most folks have had their impression
of POSIX shell tainted by bad shells like bash and - sorry - oil encouraging
complex, non-portable shell scripts. I intend to write more blog posts in the
future helping people gain a closer understanding of POSIX shell and how/where
it's useful and how to use it.

~~~
chubot
Yeah, I think there is a stark difference in how we're thinking about it. I
can't imagine dissuading someone from using local variables "with a straight
face". I didn't get your solution below because "local_variable" is actually a
global.

I could see that perhaps low level scripts need to be portable to different
Unixes. But again I would say that's maybe 5% of what shell is good for.

Also, someone once said "it's easier to port a shell than a shell script". So
that's why I'm paying careful attention to Oil build dependencies. For
example, Rust and Go are significantly less portable than C, and Oil is self-
contained C if you look at it the right way :)

\----

EDIT: Also I think there's no problem adding local variables to POSIX, since
all shells implement the same way as far as I can tell. That would be the much
better solution, rather than trying to convince people to use only globals.

There are many useful / real shell scripts that are thousands of lines long,
so you really need locals.

~~~
Sir_Cmpwn
>I didn't get your solution below because "local_variable" is actually a
global.

A global in a subshell, so the caller's globals aren't affected.
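
A small sketch of that point, reusing the `my_func` and `local_variable` names from the example elsewhere in the thread; the caller-side value is made up for illustration. Because the function body is wrapped in `( ... )` rather than `{ ... }`, it runs in a subshell, so its assignments should not survive the call:

```sh
#!/bin/sh
# The caller's variable, deliberately given the same name the function uses.
local_variable="caller value"

# Body in ( ... ): the whole function runs in a subshell.
my_func() (
    local_variable=10
    echo "inside: $local_variable"
)

my_func
echo "after:  $local_variable"   # still "caller value"
```

The trade-off is that nothing assigned inside the function can be handed back through a variable either.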

>Also, someone once said "it's easier to port a shell than a shell script".

Yeah, but it's pretty damn presumptuous to make someone install some random
shell on my machine to build your code, imo. The same goes double for e.g. GNU
coreutils extensions, since I have to set up some kind of separate root for
you where true --version works (ditto for assuming /bin/sh is bash, or even
dash so you can use 'local'). It's like encouraging people to use a fucked up
version of libc where the args to strcat are reversed. Fuck that. The standard
exists to _define_ the environment. Sometimes it's within reason to have
dependencies outside of that - but the shell is a good example of where the
case for it is very weak.

For the record, my motivation for working on Simon's mrsh project is
basically 50:50 between "I want to support POSIX-compatible shell scripts" and
"I want a POSIX shell whose interactive mode isn't dogshit". Both are equally
important to me.

>EDIT: Also I think there's no problem adding local variables to POSIX, since
all shells implement the same way as far as I can tell. That would be the much
better solution, rather than trying to convince people to use only globals.

That's all well and good. The Austin Group has public meetings, a public bug
tracker, and a public mailing list. Go there and make a case for it! But be
prepared for "all shells" to include more than dash, bash, ksh, and zsh.

>There are many useful / real shell scripts that are thousands of lines long,
so you really need locals.

Length of a shell script is not directly correlated with the need for locals.

~~~
willghatch
> it's pretty damn presumptuous to make someone install some random shell on
> my machine to build your code, imo

Is it more presumptuous to make people install a language you used than to
make them install a library you used? Either way people need to get your
dependencies if they want to run your code.

> The standard exists to define the environment.

Standards are great, but not all environments need to be standard. Most shell
scripts don't need to be ported to dozens of different platforms, and most
shell script authors would be much better served by a language with fewer
weird gotchas, saner semantics, and the benefit of a lot of language research
that's happened in the past few decades. For instance, as I've been porting my
personal shell scripts to Rash (my shell in Racket), I've been able to use all
the patterns I use in posix shells but with benefits of better readability,
easier reasoning, better means of abstraction, etc. It's been a solid win that
I would recommend to anybody. Sure, if someone else wants to run my scripts
they need to install Rash. But few people (in practice: nobody) are really
interested in running my personal shell scripts besides myself. If somebody
does want to run them, installing the dependencies is not very hard.

There is a small set of scripts that needs to be portable and be written in a
shell-like language. Until standards catch up, posix is perhaps the best thing
we have. But most of the time when people write shell scripts they could be
writing in any language but choose shell because they want to automate
something they do manually or they really want a processes-and-files DSL.
Choosing a better shell that fits the processes-and-files domain and allows
easy interactive use with better features and less insanity in these
situations is a slam dunk.

~~~
Sir_Cmpwn
>Is it more presumptuous to make people install a language you used than to
make them install a library you used?

A C compiler is also required by POSIX.

>There is a small set of scripts that needs to be portable and be written in a
shell-like language

I reject the assumption that the smaller of the two sets is the set which
should be portable.

Your Racket-based shell requires a supported Racket platform, which is
honestly a pathetic group: Linux, macOS, and Windows, on x86_64, i686, and
ARM(?). Maybe if your software is written in Racket in the first place this
makes sense, but otherwise it doesn't. It's not about you: it's about
everyone. I have a RISC-V machine on my desk and if I wanted to get your stuff
working on it I'd have to start by _writing a new JIT for Racket_. On the
other hand POSIX & C runs just about everywhere. You mentioned using these for
your personal scripts - fine, whatever floats your boat. You'll regret it if
you ever think about playing with BSD, POWER9, or the platform of tomorrow
(spoiler: I guarantee you that platform will support POSIX and C).

It's funny that everyone in this thread who's come out against POSIX shell in
one breath is pushing their own shell in the next ;) maybe when those shells
start to number their users in the double digits, we can talk.

>easy interactive use

A comfortable interactive shell experience is a good thing, and has nothing to
do with POSIX.

~~~
willghatch
>You'll regret it if you ever think about playing with BSD, POWER9, or the
platform of tomorrow

But the platform of tomorrow is clearly a resurrected Lisp Machine! In all
seriousness, though, I'm pretty sure Racket runs on BSD, and it would probably
not be that hard to get it working on any Unix. (In fact, portability is one
of the main thrusts right now in core Racket development.)

>Your Racket-based shell requires a supported Racket platform, which is
honestly a pathetic group: Linux, macOS, and Windows, on x86_64, i686, and ARM

>It's not about you: it's about everyone.

The same could be said about pioneering new architectures. I think you're
doing great work on e.g. getting RISC-V available on your build platform. But
by the time I will realistically buy a RISC-V machine to replace the server I
run most of my shell scripts on a lot more software will be ported to it,
probably including Racket. I'm under no delusions that the masses will be
flocking to my shell, but there are so many shells that are better than Posix
for (at least what I perceive to be) most shell scripting. If any of these
better shells becomes very widely used I'm sure it will be ported to
$platform_of_tomorrow (oops! I mean "$platform_of_tomorrow"!) before it is
very widely deployed.
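
For anyone who missed the joke: an unquoted expansion is subject to field splitting, which is one of the gotchas being argued about here. A minimal sketch, assuming the default `IFS`; the `count_args` helper and the variable's value are invented for the demonstration:

```sh
#!/bin/sh
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

platform_of_tomorrow="Lisp Machine"

unquoted=$(count_args $platform_of_tomorrow)     # split on the space
quoted=$(count_args "$platform_of_tomorrow")     # kept as one word

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```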

>A comfortable interactive shell experience is a good thing, and has nothing
to do with POSIX.

At least we agree there!

As an aside, I think your projects (sway, wlroots, sr.ht) are really cool, and
I like your blog, despite our disagreement about things like sticking to Posix
shell and C. Cheers!

------
chubot
FWIW yash is an impressive and mature project that seems to overlap with the
goals of mrsh:

[https://yash.osdn.jp/index.html.en](https://yash.osdn.jp/index.html.en)

_Yash, yet another shell, is a POSIX-compliant command line shell written in
C99 (ISO/IEC 9899:1999). Yash is intended to be the most POSIX-compliant
shell in the world while supporting features for daily interactive and
scripting use._

I've installed the Ubuntu package and poked around at it a bit. Its source
code seems well-written, and it even contains its own line editing library
(i.e. the functionality of GNU readline).

It looks like it's still being developed, and magicant is the original author:
[https://github.com/magicant/yash](https://github.com/magicant/yash)

I mentioned it here (and just added a proper link):
[https://github.com/oilshell/oil/wiki/ExternalResources](https://github.com/oilshell/oil/wiki/ExternalResources)

I wrote some notes about POSIX here:
[http://www.oilshell.org/blog/2018/01/28.html#limit-to-posix](http://www.oilshell.org/blog/2018/01/28.html#limit-to-posix)

~~~
Sir_Cmpwn
Thanks for sharing! A brief review of the home page shows that it is not
compatible with the goals of mrsh, though. It seems Yash aims for high
compatibility with POSIX, but adds extensions - mrsh is _strictly_ POSIX, such
that in some cases it even detects bashisms and aborts the shell. The goals of
mrsh are:

\- Support the proliferation of portable shell scripts and discourage the use
of non-standard extensions

\- Provide a "POSIX shell as a library" to have a useful standalone parser and
pluggable event loop

\- Provide a moderately comfortable interactive shell experience OOTB

Personally, I'm going to eventually use the second goal to make a new shell
based on libmrsh which has a fish-like interactive experience but with a
strict POSIX syntax.

~~~
hawski
On one hand I applaud, but on the other strict POSIX shell seems overly
limited.

My personal grudges:

\- no local variables (even dash implements them)

\- no process substitution [0] so one has to use temp files, temp named pipes
or possibly joggle fd numbers. I once did an external implementation of
process substitution, so it's doable with POSIX shell, but then pointless.

\- set -o pipefail

I understand that you know about and get over those limitations. But isn't it
too cumbersome when the Linux world has settled on bash and BSD on ksh? Especially
since ksh is a bit like a subset of bash. At least the things I mentioned are
common for both.

It's sad that shell makes it so hard to make more complicated pipelines than |
the | old | classic. Fds, processes and pipelines should all be first class.
Maybe that's the point to make it harder to make those complicated scripts in
something other than shell. In other languages it may be easier to make
correct pipelines, but with heavy boilerplate. Then it's probably easier to
count on libraries instead of pipelines, even if those libraries spawn
processes in background.

I'm hoping for oil shell more as it means to address all those issues and more
with an escape hatch to convert old scripts. Although I'm not so sure about
the implementation details. For those in the know it made more sense to me
after I read that OVM itself would become a VM for oil. So that there would be
a single VM to run and not two as it is now. At least that's how I understand
it.

POSIX shell is kinda magical, because it's nominally portable. However the
rabbit seems to have died in the hat, because it's really that old. Shell was
conceived in a different era with different limitations. I think that we
should go further.

[0] take-in-and-out <(process-out) >(filter-this | grep that) | grep msg

~~~
Sir_Cmpwn
>no local variables

They're not standard, so don't use them. Instead I often find a subshell to be
sufficient. You can declare a function like so:

    my_func() (
        local_variable=10
        echo $local_variable
    )

This is also a good strategy for helping you make purer shell functions.
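
One way to read "purer" here: since a subshell function cannot change the caller's variables, its result has to come back through stdout and the exit status. A rough sketch under that reading; the `to_upper` helper and its `tmp` scratch variable are hypothetical names, not anything from mrsh:

```sh
#!/bin/sh
# Hypothetical helper: upper-case its arguments. The scratch variable "tmp"
# exists only in the subshell created by the ( ... ) body.
to_upper() (
    tmp=$(printf '%s' "$*" | tr '[:lower:]' '[:upper:]')
    [ -n "$tmp" ] || exit 1   # exit leaves only the subshell, not the caller
    printf '%s\n' "$tmp"
)

# The caller collects the result with command substitution.
result=$(to_upper posix shell)
echo "$result"
```

Note that `exit` inside such a function behaves like `return` from the caller's point of view, because it only terminates the subshell.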

>no process substitution, so one has to use temp files, temp named pipes or
possibly joggle fd numbers

I don't find myself doing this often enough to care. It is possible, and if
you're writing something which heavily relies on this perhaps you're better
off with Go or Python.
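
For reference, the named-pipe route might look roughly like this. It is only a sketch of the general technique: the file names, their contents, and the scratch directory layout are all invented for the demonstration, and it stands in for the bash-only `comm -12 <(sort a.txt) <(sort b.txt)`:

```sh
#!/bin/sh
# Private scratch directory; mkdir fails if the path already exists.
dir=${TMPDIR:-/tmp}/procsub-demo.$$
mkdir -m 700 "$dir" || exit 1
trap 'rm -rf "$dir"' EXIT

# Sample inputs, made up for this example.
printf '%s\n' pear apple fig > "$dir/a.txt"
printf '%s\n' fig kiwi apple > "$dir/b.txt"

mkfifo "$dir/left" "$dir/right" || exit 1

# Each producer writes into its own named pipe in the background...
sort "$dir/a.txt" > "$dir/left" &
sort "$dir/b.txt" > "$dir/right" &

# ...and the consumer reads both pipes as if they were ordinary files.
common=$(comm -12 "$dir/left" "$dir/right")
wait

echo "$common"
```

It works, but the setup and cleanup are exactly the boilerplate that process substitution hides.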

>set -o pipefail

I agree, we've been talking to the Austin Group about getting this
standardized.
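
Until then, one commonly used workaround is to smuggle the first command's exit status out over a spare file descriptor. The sketch below is not taken from mrsh or from any Austin Group proposal; `pipe_status_of_first` and `failing_producer` are hypothetical names, and the choice of descriptors 3 and 4 is arbitrary:

```sh
#!/bin/sh
# Hypothetical helper: run "$1 | $2" and return the exit status of $1,
# which a plain POSIX pipeline would discard.
pipe_status_of_first() {
    exec 4>&1                      # fd 4: remember the real stdout
    first=$(
        {
            { "$1"; echo "$?" >&3; } | "$2" >&4
        } 3>&1                     # fd 3: feeds the command substitution
    )
    exec 4>&-                      # close the spare descriptor again
    return "$first"
}

# Made-up producer that writes some output and then fails.
failing_producer() { echo "partial output"; return 3; }

pipe_status_of_first failing_producer cat || echo "producer exited with $?"
```

This only recovers the status of the first stage; longer pipelines need one extra `echo "$?" >&3` per stage and some parsing, which is a fair argument for standardizing the option.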

I think we should go further, too, not by attempting to replace the shell with
something that tries to do shell scripts better, but by making full-blown
programming languages in which shell things are nicer to use.

------
nixpulvis
Shameless post of a little shell I've been working on:
[https://github.com/nixpulvis/oursh](https://github.com/nixpulvis/oursh). It
aims to support both a POSIX and a modern language.

And some writeups, if you're into shells:

\- [https://nixpulvis.com/ramblings/2018-07-11-building-a-shell-...](https://nixpulvis.com/ramblings/2018-07-11-building-a-shell-part-1)

\- [https://nixpulvis.com/ramblings/2018-10-15-building-a-shell-...](https://nixpulvis.com/ramblings/2018-10-15-building-a-shell-part-2)

------
frou_dh
Love this initiative. It's absolutely valid to plant a flag at the extreme end
of the standards-vs-featureful spectrum.

