User-defined functions were implemented much like external execs in early shells. As the script was parsed, functions were dropped into /tmp without their wrappings and then called as external programs. Since they would still reference parameters as $1, $2, etc., it just worked: function bodies and standalone sh scripts had the same interface! Such a clever idea to avoid managing an interpreted call stack in the parent.
That's been around since the original Bourne shell; /etc/glob, from what I can see from its source, would refuse to run the command if the resulting expansion turned out completely empty; and the globs with no matches would be simply removed.
> PS: I don't know why expanding shell wildcards used a separate program in V6 and earlier, but part of it may have been to keep the shell smaller and more minimal so that it required less memory.
See, I thought it was a nice separation of concerns and wondered why we lost such a nice approach, until I read:
> How escaping wildcards works in the V5 and V6 shell is that all characters in commands and arguments are restricted to being seven-bit ASCII. The shell and /etc/glob both use the 8th bit to mark quoted characters, which means that such quoted characters don't match their unquoted versions and won't be seen as wildcards by either the shell
at which point I suddenly became a fan of ditching it. I do wonder if there's not some better way to factor that functionality out...
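To make the quoted passage concrete, here is a toy model of the high-bit trick in Python. It is only a sketch of the idea as described in the quote, not the V5/V6 code; the names `QUOTE_BIT`, `quote`, `is_wildcard` and `strip_quotes` are made up for illustration.

```python
QUOTE_BIT = 0x80  # hypothetical name: the 8th bit marks a quoted character


def quote(text: str) -> bytes:
    """Mark every character of a 7-bit ASCII string as quoted."""
    data = text.encode("ascii")  # raises on non-ASCII: only 7 bits are usable
    return bytes(b | QUOTE_BIT for b in data)


def is_wildcard(byte: int) -> bool:
    """True only for an unquoted *, ? or [ (a quoted one has the high bit set)."""
    return byte in b"*?["


def strip_quotes(data: bytes) -> str:
    """Clear the marker bit to recover the plain text."""
    return bytes(b & 0x7F for b in data).decode("ascii")
```

For example, in `b"a" + quote("*") + b"b*"` only the final `*` is seen as a wildcard, and `strip_quotes` gives back `a*b*`. The cost is also visible: `quote` has to reject anything outside 7-bit ASCII, because the bit it needs is already taken.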
An important thing to remember is that even after the move to the PDP-11, early Unix systems had to deal with 32kB as the entire space available to a userland program, both code and data (including the stack).
You mean 32k words, not 32k bytes, right[1]? And AFAIK by V5 or V6, Unix could use split instruction and data space if the MMU supported it, giving a bit more headroom. But, yeah, memory was very tight, and a lot of very clever tricks were used to get around it.
[1] Even worse, the top 4kW/8kB was reserved for I/O.
Why would I want to factor out some syntactic functionality of one specific (and not very well thought out) shell to reuse, again?
But if you really insist, you can write your own glob(1) that would invoke glob(3) for you, sure. There is also wordexp(3) although I believe its implementations had security problems for quite some time?
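A rough sketch of what such a standalone expander could look like. It is in Python, so it uses Python's `glob` module as a stand-in for glob(3) rather than calling the C function. The no-match handling is my reading of the /etc/glob behavior described upthread (patterns with no matches are dropped, and the command is refused if no pattern matched anything), not a port of the original source. `expand`, `run` and `NoMatch` are hypothetical names.

```python
import glob
import subprocess

WILDCARDS = set("*?[")


class NoMatch(Exception):
    """Raised when patterns were given but none of them matched anything."""


def expand(args):
    """Expand wildcard arguments; pass plain arguments through unchanged."""
    out = []
    saw_pattern = False
    matched_any = False
    for arg in args:
        if not WILDCARDS.intersection(arg):
            out.append(arg)  # no wildcard characters: keep as-is
            continue
        saw_pattern = True
        matches = sorted(glob.glob(arg))
        if matches:
            matched_any = True
        out.extend(matches)  # a pattern with no matches contributes nothing
    if saw_pattern and not matched_any:
        raise NoMatch("No match")
    return out


def run(command, args):
    """Run command with expanded arguments and return its exit status."""
    return subprocess.run([command, *expand(args)]).returncode
```

Calling `expand(["*.c", "notes"])` on its own shows the argument list a command would receive before anything is actually run.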
I use xterm.js a lot and have a "shell backbone" that I use to make shell based access to APIs, S3 and other things "cloud." This is essentially how I implement globbing as well. The convenience is that you can run glob by itself to get an idea of exactly what kind of automated nightmare you are about to kick off.
Anyways.. mine currently has V3 behavior. My shell command exec routine could actually benefit from that hack. What's old is new again?
This is php.ini level of madness, and I'm glad it's gone from (semi-)modern shells. A formal (e.g. programming) language should be defined in its entirety by its formal grammar, its semantics by a formal spec, etc. There's barely any good reason to let the system administrator change the logic and semantics of deployed code.
You could argue that Lisp reader macros also somewhat violate this rule. As a longtime Lisp fan, I dislike reader macros, but I'm more conflicted about macros in general. A good macro system should aim to provide enough context for IDEs and LSPs to aid the developer, but Lisp macros are entirely about just transforming the AST. It's usually just better to evolve the language.
It's not there to give the system administrator flexibility. It's there because early Unix was heavily constrained, and doing things with lots of little overlays (and what was decades later known as "Bernstein chaining") rather than one big program was the way to architect stuff. exit(1), goto(1), and if(1) were all external commands in the Thompson shell.
The other thing to bear in mind is that it’s undergone literally decades of evolution while still being backwards compatible.
The shells weren’t originally intended to be Turing complete. They were just job launchers. What you use today would have been unimaginable when these shells were first designed.
Whereas all other programming languages have had a drastically smaller evolution in comparison and yet still had a worse compatibility story.
It’s very easy to be critical of the Bourne shell (and compatible shells too) because they are archaic by modern standards. But they weren’t written to solve modern problems. So it’s like looking at a bicycle and complaining that the designers didn’t design a sports car, while ignoring the fact that the technology didn’t exist and that push bikes are still good enough for millions to use daily.
https://www.tuhs.org/cgi-bin/utree.pl?file=V2/cmd/glob.c