Hacker News
Bash 5.0 released (gnu.org)
534 points by siteshwar 4 months ago | 296 comments



With this release, bash now has three built-in variables (um, I mean "parameters") whose values are updated every time they're read:

$RANDOM yields a random integer in the range 0..32767. (This feature was already there.)

$EPOCHSECONDS yields the whole number of seconds since the epoch.

$EPOCHREALTIME yields the number of seconds since the epoch with microsecond precision.
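A quick interactive sketch of the three (the EPOCH* variables assume bash 5.0 or later; the timestamp values shown are only illustrative):

```shell
# Requires bash >= 5.0 for EPOCHSECONDS/EPOCHREALTIME; RANDOM is much older.
echo "$RANDOM"          # a fresh integer in 0..32767 on every expansion
echo "$EPOCHSECONDS"    # whole seconds since the Unix epoch
echo "$EPOCHREALTIME"   # same, but with microsecond precision, e.g. 1546926637.123456
```

Because the values change on every read, assigning one to a plain variable ("start=$EPOCHREALTIME") is the usual way to take a snapshot for later comparison.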

I'm thinking of a new shell feature that would allow the user to define similar variables. For example, I have $today set to the current date in YYYY-MM-DD format, and I have to jump through some minor hoops to keep it up to date.

Does anyone else think this would be useful enough to propose as a new bash feature? Would it create any potential security holes? Should variables like $PATH be exempted?

(Of course this doesn't add any new functionality, since I could use "$(date +%F)" in place of "$today". It's just a bit of syntactic sugar.)


Please no. The parallel-universe me who didn't happen to read this random thread would never know, and consequently would never expect random variables to do that. And he would curse you forever if he found out while banging his head debugging some buggy script.


Yeah what's wrong with functions? Clearly the Bash developers are continuing their tradition of obeying the principle of most surprise.


Judging from this thread, Bash doesn’t seem to be the only shell to support these types of variables, but I agree – it’s unintuitive and surprising.


How does a function sound as syntactic sugar?

  $ today() { date +%F; }
  $ echo Today, $(today) is a great day!
  Today, 2019-01-08 is a great day!


Not as good. I use $today a lot in file and directory names. The overhead of having to type $(today) rather than $today would, for me, be significant.

I do have a workaround for this particular case:

    PROMPT_COMMAND='today=$(printf "%(%F)T\n" -1)'
but it only works in bash 4.2 and later.

I could use

    PROMPT_COMMAND='today=$(date +%F)'
but I'm trying to avoid executing an external command on every prompt. (Maybe the overhead is low enough that it's not worth worrying about.)
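One rough way to gauge that overhead (a sketch; the %(...)T format requires bash 4.2+, and the absolute numbers will vary by machine):

```shell
# Command substitution forks a subshell and execs the external date binary
# on every iteration...
time for i in {1..1000}; do today=$(date +%F); done

# ...while printf -v with %(...)T stays entirely inside the bash process.
time for i in {1..1000}; do printf -v today '%(%F)T' -1; done
```

In informal tests the builtin version tends to be much faster, though for something that runs once per prompt either is probably negligible.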

My thoughts are (a) if user-defined special variables like this were a shell feature, I could find other uses for them and (b) it seems neater to make such a feature available to users rather than restricting it to three special-case built-in variables.

On the other hand, it might have been cleaner for $RANDOM, $EPOCHSECONDS, and $EPOCHREALTIME to be implemented as built-in functions rather than as special variables.


I have a better (at least cleaner IMHO) workaround:

    PROMPT_COMMAND='printf -v today "%(%F)T" -1'
printf's "-v" option was added in bash 4.0.

printf's "%(...)T" format was added in bash 4.2.

The "-1" argument became optional in bash 4.3.

So here's what I now have in my .bashrc:

    if [[ "$BASH_VERSION" > "4.2" ]] ; then
        PROMPT_COMMAND='printf -v today "%(%F)T" -1'
    else
        PROMPT_COMMAND='today=$(date +%F)'
    fi
(The "> 4.2" test is true for bash 4.2 and later, since the value of $BASH_VERSION for 4.2 is "4.2.0(1)-release". The ">" does a string comparison. I'll have to revisit this when and if there's a bash version 10.0, probably using $BASH_VERSINFO.)
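 4.2" test is true">
For what it's worth, a numeric check via $BASH_VERSINFO (an array of major, minor, patch, ...) sidesteps the string-comparison pitfall now rather than later; a sketch:

```shell
# BASH_VERSINFO[0] is the major version, BASH_VERSINFO[1] the minor;
# arithmetic comparison keeps working when bash 10.0 eventually arrives.
if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 2) )); then
    PROMPT_COMMAND='printf -v today "%(%F)T" -1'
else
    PROMPT_COMMAND='today=$(date +%F)'
fi
```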


It's not as nice: "cat $today" is easier to type than "cat $(today)", and would give better completion, since it only has to match declared variables instead of matching functions, files, and executables.

On the plus side, TIL the subshell syntax plays well with eval/expand shortcut (ctrl+alt+e).


Wouldn't "cat $today" result with "cat: No such file or directory: 2019-01-08"? Did you mean echo instead of cat?


My real life use case for these dynamic variables would be more like "cat/vim/cp $log" to get today's log file which would expand to something like /somedir/logs/201901/09/product.log. Handy when you have a large matrix of products/environments.


Only in the case when no such file exists in the current directory :)


* "cat $today" is easier to type than "cat $(today)"*

Except you should really be in the habit of typing "cat ${today}" ;)


Except you should really be in the habit of typing `cat "${today}"` ;) Quote everything!


Why? The {} doesn't prevent glob expansion or field separation.
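A quick demonstration of that difference (run in an empty scratch directory so the glob has nothing to match; the braces change nothing here, only the quotes do):

```shell
cd "$(mktemp -d)"              # empty dir, so *.txt has no match and stays literal
var='*.txt two  words'
printf '<%s>\n' ${var}         # unquoted: field-split -> <*.txt> <two> <words>
printf '<%s>\n' "${var}"       # quoted: one word -> <*.txt two  words>
```

In a directory that did contain .txt files, the unquoted expansion would additionally glob-expand `*.txt` into the matching filenames.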


It's not obvious that just expanding a variable runs code though.


You mean like Ksh's discipline functions?

  dualbus@system76-pc:~$ ksh -c 'date=; date.get() { .sh.value=$(date +%s); }; echo $date; sleep 5; echo $date'
  1546926637
  1546926642
See: https://docs.oracle.com/cd/E36784_01/html/E36870/ksh-1.html ("Discipline Functions")


Yes, just like that!


Syntactic sugar is a phrase permanently marred for me by the Ruby community: it means "cognitive load for other people who know the language but don't keep up with the faddy bits". It made me tremendously sad.


Yeah, but C-style "for" loops are nice, right? Still, they really are syntactic sugar for a "while".

The problem is more with a language having faddy bits, and people using them when they don't make the code clearer, than with syntactic sugar itself.


C-style "for" loops are nice, right

Well... I know macros are supposed to be bad, but... I've been programming in C for a long time, and only recently tried 'sugaring' my loops with

  #define _(x,y) for (x=0;x<(y);x++)
...which has worked so well that I'm sticking with it. It makes programming in C much more pleasurable. (My programs mostly have a lot of nested loops over multidimensional arrays, mostly from 0, inc'ed by 1.)

So now instead of e.g.

  for (i=0;i<WID;i++) 
    for (j=0;j<HEI;j++) {
it's just

  _(i,WID) _(j,HEI) {
which makes me smile every time.


Sure, I agree actually - if you are going to have a language that has lots of nice syntax, _let it be built in from the start_. Or only released at major increments, etc etc.


Hopefully Bash 6.0 then


Nicely put.

I'm reminded of the Jargon File, which says "Syntactic sugar causes cancer of the semicolon."


$RANDOM is new in bash 5.0? I'm curious; this has been documented for ages, afaik. *headtilt*


No, $RANDOM has been there since the '80s. BASH_ARGV0, EPOCHSECONDS, and EPOCHREALTIME are new.


No, it's old; I've updated my comment accordingly.


Word. Wasn't hating, just... curious. The others do seem new. :)


What is the advantage of this technique over a shell built-in or function that echoes that value?


You can already do this interactively with PROMPT_COMMAND, e.g.

  PROMPT_COMMAND='date=$(date +%D);time=$(date +%T)'


Not quite - that is run before your prompt is displayed.

  $ PROMPT_COMMAND='time=$(date +%T)'
  $ echo $time;date +%T
  12:19:07
  12:20:19
Thus it will show the time after your last command returned rather than the current time.


Does anyone know why `microseconds since epoch` is named "realtime"? What is so "guaranteed" about it?


bash's EPOCHREALTIME was inspired by mksh, which was in turn inspired by zsh. And in zsh, EPOCHREALTIME was named after the system real-time clock (specifically the CLOCK_REALTIME clock ID passed to clock_gettime()). And I imagine that was named by analogy with hardware real-time clocks, since they both track human ('real') time. But I don't know much about electronics or the history of POSIX clock sources.


And does it even count leap seconds? Regular Unix timestamps do not (time stands still during leap seconds)


Here is my favourite bash script: https://blog.ashfame.com/2018/02/deploying-lambdas-multiple-...

Deploy a lambda function in multiple regions (15 regions!) with just one bash script using Apex up.

Add Route 53 on top with latency-based routing enabled and you've got latency in the tens of milliseconds from anywhere on the globe, without paying a hefty fee for that kind of latency.


How is that in any way related to the parent comment?


It's not my blog; I thought mentioning a script in bash would be a good idea.


Love Apex Up


Why did we keep the language of the shell and the OS separate? It seems like a needless abstraction which creates more harm than good (compare reading a shell script vs. any other language). While I'm at it, why aren't the filesystem and syscall API just part of a standard userland language? For example, the filesystem could be exposed as an object tree rather than through some syscall ritual. The syscalls could just be invisible, with the language compiler dealing with them instead of the programmer. I think the old LISP machines got this right, while we are stuck in a useless POSIX compatibility trap. The only reason I can think of that they didn't design Unix this way is that C was too low level, but we could write the OS in a "higher level" functional language.


(re: Bash-vs-OS integration)

bash is a programming language like any other, and you could use anything with a REPL as your shell. Python should do. In fact, I'll try it right now..

Yes, it works. Just sudo chsh -s /usr/bin/python <username> and off you go.

Once you start doing this for a bit, you'll notice that the Python REPL is an incredibly poor UI for repeated execution of subprocesses. It is very laborious: having to constantly wrap your strings in quotes, calling subprocess modules, exec, painstaking handling of streams, etc.

Then you start looking for a language that has better syntax for calling external programs.. hmm...

Bash. Or zsh, or ksh, etc. These languages excel at it. But that's all they are: programming languages that happen to be super easy to use when it comes to starting external programs.

This is why it makes little sense to bind them to the OS. As far as the OS is concerned, there is no Bash, just like there is no Python. There are just syscalls.


You may be interested in xonsh, a shell that supports both python and bash-like expressions: https://xon.sh/


Thank you for this


is this like fish?


It's pretty much like fish, but with Python goodness.


> Python REPL is an incredibly poor UI for repeated execution of subprocesses

Python REPL, even with recent additions of TAB completion, is a poor REPL, period.

IPython, on the other hand, offers a much better programming environment than shell while still allowing easy access to most of the things you mention. Example:

    In [1]: from pathlib import Path
    In [2]: file = Path("~/.sbclrc").expanduser().read_text()
    In [3]: !echo "$file" | awk '!/^;;/'   # EDIT: to clarify, this shows passing
                                           # Python variable to a shell pipeline.

    #-quicklisp
    (let ((quicklisp-init (merge-pathnames quicklisp/setup.lisp
                                           (user-homedir-pathname))))
      (when (probe-file quicklisp-init)
        (load quicklisp-init)))
All things considered, between !, %, %% and /, IPython is a decent shell. I was using it for a few years as my main shell, actually - its QT Console implementation specifically. I was working on Windows back then, PowerShell didn't exist yet, and `cmd.exe` was... well, exactly the same as today.

TLDR: a shell is just a REPL with a few conveniences for specific tasks. Re-implement these and suddenly any REPL becomes a workable shell.


> In [3]: !echo "$file" | awk '!/^;;/'

That's Bash (or at least sh), which excels at piping the output of programs into other programs, and pays that cost for everything else.


Yeah, that's what I mean by "easy access" - you're working with Python, but if you want to do something that Python is not well-suited for[1] - like easily piping something through many programs - you can do it simply by adding `!` at the beginning of your pipeline, right there in the REPL.

It's actually the same philosophy that shells use: in bash you frequently invoke sed, awk, grep, etc. because it's easier than writing the same directly in the shell. In IPython, you invoke bash to do piping, because it's easier than connecting a host of subprocesses' stdins/stdouts in Python.

So again, there are some things you expect from your shell, and as long as you get those things (no matter how), you can use any REPL as a shell without problems.

[1] because of syntax or lack of library or something like that


Piping program output is just function composition. The downside is that it works over expensive system processes, and data must be serialized. It's a fragile system where instead of using built-in functions, we use "standard" executables.
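The analogy can be made concrete in the shell itself: each stage is a function over a stream of lines, and the pipe composes them. A toy sketch (`double` and `add_one` are made-up names for illustration):

```shell
# Each filter reads lines on stdin and writes transformed lines on stdout,
# serializing every value as text, exactly as the comment above describes.
double()  { while read -r n; do echo $((n * 2)); done; }
add_one() { while read -r n; do echo $((n + 1)); done; }

# Composition: add_one applied after double, line by line.
seq 3 | double | add_one    # prints 3, 5, 7
```

Here the stages happen to be shell functions, so only the `seq` stage is an external process; with real executables each `|` would also cost a fork/exec and a kernel pipe.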


I think, rather, function composition is just piping program output.


Yep. IPython lets you execute bash.


That was kind of the idea of Unix. C was just a nicer step up from assembly. You might write a performance-sensitive or low-level routine in C, much like you might drop to C when writing a Python library. But the high-level language of the system was the shell. The `dc` executable wasn't just meant to be a user-facing calculator program; it was also meant to be the system's "bignum" library. That was big-picture Unix. The syscalls and C were little-picture Unix.

This 2013 thread shaped a lot of the way I think about Unix: https://news.ycombinator.com/item?id=6530180


This makes me think of PowerShell. Individual cmdlets can be written in PowerShell itself or in any language that compiles to .NET IL, plus one has direct access to the entire .NET framework.


PowerShell takes it a bit further: there are no "magic" built-in commands needed for e.g. manipulating the current directory, environment variables, etc. In PowerShell the cmdlets execute in-process and may have access to the session and its state. Indeed, all of the built-in cmdlets are just cmdlets defined not by the shell but by the core modules.


Good insight, thank you


On the same line, let me recommend you a book: "The Design of the Unix Operating System," by Maurice J. Bach. If I am not mistaken, Mr. Bach was a member at Bell Labs.

My personal take on the book is that the kernel was meant to be a portable virtual machine, extensible through processes, and that those would be the building blocks of user applications for which the shell would act as glue.

In other words, the shell and the OS are separate because most of what we commonly call "the OS" may be interpreted as mere encapsulation of less versatile hardware architectures.


If you're doing it right, you are solving very different problems with shell vs any other language. Shell is best used as a tool for orchestrating other programs, you should not be implementing your programs in shell.

Syscalls, in general, are used in lieu of objects or other abstractions because they more accurately mirror what the underlying hardware is doing. This isn't always the case; some syscalls are maintained for POSIX compatibility and add a lot of complexity to emulate behavior that is no longer reflective of the hardware.

At the end of the day, you'll find that it's very difficult to maintain the highest levels of performance while also presenting an API that has a high level of abstraction. Things like dynamically-resizable lists, using hash tables for everything, runtime metaprogramming, and other such niceties of modern HLLs aren't free from a performance perspective.

If you really want to know more, I would suggest reading one of McKusick books on operating system design (the most recent being The Design and Implementation of the FreeBSD Operating System 2/e, but even the older ones are still largely relevant).

Maintaining this "useless POSIX compatibility trap" has a certain amount of utility; I for one like not having to re-write all of my programs every few years. I imagine others feel the same.

In closing, some projects that are pushing the boundaries of OS design which you may want to check out include:

* Redox OS (https://www.redox-os.org/) - a UNIX-like OS done from scratch in Rust

* OpenBSD (https://www.openbsd.org/) - one of the old-school Unices, written in plain old C, but with some modern security tricks up its sleeve

* Helen OS (http://www.helenos.org/) - a new microkernel OS written from scratch in C++, Helen OS is not UNIX-like

* DragonFlyBSD (https://www.dragonflybsd.org/) - a FreeBSD fork focused on file systems research

* Haiku (https://www.haiku-os.org/) - binary and source compatible with BeOS, mostly written in C++, but also has a POSIX compatibility layer


It's odd not including Plan 9 here. But I guess that is the fate of Plan 9.


Yeah, I did not include anything without at least some recent development. AROS and plan9 both got cut for that reason.

I was on the fence about including ReactOS but wound up not including that either.


I would consider 9front quite active from their Hg repo.

https://code.9front.org/hg/plan9front


Redox reuses a lot of concepts from Plan 9


It has already pushed the boundaries, and now seems not to be developed or used that actively.


When I first learned one could write programs in Bash, I got very excited. It didn't take long for that to wear off :) My Bash got a lot better, though!


I think everyone who becomes proficient in shell tries to implement something large in it, gets burned, and realizes that it's a bad idea. I certainly had that experience.


> you should not be implementing your programs in shell.

What should one be using instead in the current scenario?

Python, Rust, Golang?


I think parent means (generally speaking) not to implement larger programs in shell.

If the tasks performed by your program can be expressed maintainably and performantly in shell, go for it.

For programs one size up, something like Perl or Ruby or any other language that allows easy execution of shell commands can be a great solution.

Of course, this all depends on any number of variables. Team competencies, target environments, etc. And with some effort even larger bash scripts can be kept maintainable-ish, and in some target environments (can't always count on other languages being available!) shell scripts might be the only available programming environment.


It depends. I don’t think there is a language which is categorically the best. The solution to a given problem may be harder or easier to express in different languages. Some problems are better solved in C, some in Python, ..., and some in Shell.


It's not quite the same thing, but I've been enjoying using eshell (https://www.gnu.org/software/emacs/manual/html_mono/eshell.h...). It has the usual shell interaction, except you can also interleave it with arbitrary elisp. It's not quite the OS, but it allows interaction with any parts of Emacs, which is at least OS-like (and can itself interact with the OS in various ways).


I think you might like TempleOS.


You might want to look at the oilshell project [1]. It's an attempt to bring some sanity to the shell while keeping the ability to run existing scripts.

[1] http://www.oilshell.org/blog/2018/01/28.html


> stuck in a usless POSIX compatibility trap

Exactly. Unfortunately.

Unless we have a solution that is significantly better than POSIX/UNIX, switching to anything else incurs a significant cost that no one is willing to pay.

Short of that, we probably need a technology breakthrough that brings in a complete architectural change.


> Unless we have a solution that is significantly better than POSIX/UNIX, switching to anything else incurs a significant cost that no one is willing to pay.

The problem is that it doesn't even suffice for the new thing to be "significantly better". Because of the huge sunk cost, the new thing needs to be able to do desirable things that a POSIX-compatible OS strictly cannot do. Otherwise, it will always be easier and faster to just glue another subsystem onto the Linux kernel and continue using that.


If you think of the browser as an OS, which it almost is these days, JavaScript is that language.


> The only reason I think they didn't design unix this way was because C was too low level, but we could write the OS in a "higher level" functional language.

You've already lost my interest. Bash/Shell is incredibly powerful and doesn't need a higher-level abstraction. That is what programming languages and CLI tools are for.


That misses my poorly written point. Do you use a separate, awkward tool to call functions you wrote in Python? Or do you call the function in Python itself? The shell is a useless abstraction that was only necessary in Unix because the alternative was C. Now that we have better languages, why not use them to write the OS bottom-up?


> Do you use a separate, awkward tool to call functions you wrote in python?

I use a separate tool to call a lot of functions written in C. That tool is called “Python”. Or “Ruby”.

Sometimes I use a separate tool to call functions written in Python (or JavaScript, or...) and that tool is called PostgreSQL.

Sometimes I use Ruby to call C to call Postgres to call JS.

> Now that we have better languages, why not use them to write the OS bottom-up?

What better languages for implementing an OS? Lisp? We had that before C. Rust? Either way—and I like both languages—I don't want to use either for a shell. Red? Maybe, but I don't think we're to the point of a Red OS yet, and while it's probably eventually usable for that purpose, I don't see it as necessarily ideal for OS implementation though it might be tolerable as a shell language.


Because writing new OSes is hard and writing new OSes where existing software that people want to use keeps working is even harder.


And thus the tech industry piled abstraction upon abstraction, decade after decade, until finally all software collapsed into a singularity and destroyed the earth. Which, frankly, came as somewhat of a relief to all the people forced to use it.


The choice of python as your shell would surely be as arbitrary as bash?

There is a long history of alternative shells csh, tcsh, zsh, fish, etc. All with various level of abstraction and various amounts of "programming language type constructs" (for lack of a better term).

At the end of the day, long arguments have been had over which is the better shell. It's all just personal preference. Hence it can be set in /etc/passwd per user.

You prefer python? chsh is there to change it.


My point wasn't about python, just an example of how a shell around python, made only for python function calls, would be a useless abstraction.


Something like IPython, amirite? XD


I have a different theory; perhaps line printers (which terminals try to emulate) are to blame.

Space on a line is limited and one-directional, so the tools naturally evolved towards strange single-line incantations instead of full-fledged programs.

The Commodore PET did it right by introducing a navigable terminal where you could move between lines and didn't need to open an editor to write multiline programs.


I don't have time to Google now but I think that mainframe terminals worked like that. 3270 should be the model to search for.


This was already a solved problem in the 70s. IBM's 3270 series offered full buffered access to a character matrix display, and DEC's VT52 and later terminals added bi-directional scrolling.

Modern CLIs can usually emulate at least a few of the old scrolling character matrix displays, but only a few terminal programs (e.g. top) use the extra features.


It's strange to me to imagine that there would be one language that was the best choice for OS implementation and day to day user interaction/automation. I'm curious what language do you think would be good for both use cases?


A more syntactically friendly Haskell-like language


That’s great for developers who are competent in that paradigm but what about hobbyists or sysadmins who have little interest in software development?

I’m not saying POSIX shells are without fault, but they weren’t created just because C is too low level; they were created because the people using terminals weren’t always techies, so Bell Labs created a language anyone could easily write simple programs with. Granted, that need has dissipated a little, but your solution of having one language to rule them all creates more problems than it solves (really the only problem it solves is OCD for a few LISP enthusiasts).

The reason there are so many different programming languages isn’t just an element of NIH; some languages do genuinely handle some use cases better than some other languages. Plus there is always going to be an element of user preference. So your idea of a perfect system would be hell for a great many other people - myself included (and I do generally enjoy functional programming).


Yeah I never understood why many people, at least among programmers, seem to be strongly opposed towards shell and shell scripts.

At some point, when I wasn't proficient enough in Bash yet, I was writing scripts for automation using Perl and Ruby. The 'logic part' is definitely much easier in these languages, but simple file/directory stuff is actually far more complex. A lot of this has to do with error handling and different expectations of how that is supposed to work. In a default shell script, errors are handled very forgivingly, which is very much how most people work when performing tasks manually: not every step is super important.

On the other hand Shell scripting documentation is crap. Most of it is from 80s/90s, full of irrelevant details for the practical person. A bit like 90s/00s JS documentation before MDN.


Haskell's lazy-by-default evaluation makes it more difficult to reason about memory usage, because it is all too easy to accidentally build up large expression trees at runtime [1]. This makes it a rather bad choice for a kernel or other low-level stuff.

Compare Rust, which tries to keep allocations (and the size/asymptotic complexity of allocations) explicit and visible.

[1] The optimizer catches some cases, but not all of them.


How would you deal with mutable state like, say, the current directory?


When all you have is a Haskell everything starts to look like a monad


I think that if we use pure-ish FP as our system language, it doesn't make sense to just emulate an imperative/mutable system.

But mm... why would you want mutable state like the current directory in the first place? Or even a mutable filesystem? Why aren't those just immutable parameters of your program?


> But mm... why would you want mutable state like the current directory in the first place? Or even a mutable filesystem? Why aren't those just immutable parameters of your program?

Because that's how just about any OS, application, library, filesystem, device driver, database, nearly everything related to computing at all, works. Reinventing the universe has never, ever led to success. If you want even the slightest hope for non-negligible adoption of your operating system, you need to be able to interface with the rest of the world. And that's why you don't want the filesystem to be an "immutable parameter of your program".


You can still build a compatibility layer on top of an immutable FS: mutable from the program's perspective, while you control how those changes are encapsulated from the rest of the system. You could, for example, see what your

> any OS, application, library, filesystem, device driver, database, nearly everything related to computing at all

is actually trying to do to your files, and operations like reverting changes become trivial. Sure, it's not performance-optimal, and it affects what kind of software is ideal to work within such a system. It's just a different approach with different trade-offs.

Anyway, if I one day want to build a new OS, it will be to try different interesting approaches, not to make yet another 'successful any-OS'.


RIP Terry Davis


Because one of the scenarios we were talking about is interactive use - i.e. command prompt. Current directory (and other implicit state like that) is convenient there, and removing it would make it that much harder to use.


If your workflow is like: 1. Manipulate current directory 2. Run some program 3. Manipulate current directory again 4. Run yet another program...

You can actually consider current directory as immutable already.

It became "badly mutable" for example if you manipulate the current directory and this change is observable from a program point of view which is already running. Would that be really useful?

To clarify, I am not trying to say that absolutely everything should be immutable, but it's a fun thought experiment to see how far you can go.


I don't see how. The suggestion was to use a Haskell-like scripting language for interactive use on the command prompt. Note, we're not talking about apps that run from that prompt; we're talking about the shell itself. And all shells today have mutable global state in form of current directory (and pushd/popd stack, usually).


A shell is a kind of REPL (read–eval–print loop). In OOP you use looping with mutation, and in FP you use recursion. This may not sound very informative, but still: "Just fold over inputs, what's the problem?"

Being immutable doesn't mean that everything is static, but that things aren't changeable in place. So instead of having a global mutable state, you have a local state which you can alter by creating new instances and passing them around. In the end, the difference is in what scope changes are observable and how that constraint affects your design. In some sense: mutable = changes are uncontrollable, immutable = changes are controllable.


So an ML-like language, like SML, OCaml, or F#?


Seeing this release makes me cringe. I've used Bash as an interactive shell for decades but really I'm sick and tired of it.

As a scripting language, I loathe it and really don't understand its purpose. I always write shell scripts in POSIX shell for portability reasons. Most of the time I don't need to use any of Bash's features. In cases where advanced features are needed and portability is not a concern, there are other scripting languages much better suited for this (Python, Ruby, etc).

As an interactive shell, the only features I ever use are command history and tab completion. Bash is way too bloated for my use case (it's only a matter of time before the next Shellshock is discovered). Other lightweight shells are missing the couple of interactive features which I do use.

If anyone knows of a shell which meets my criteria of being lightweight but with command history and tab completion (paths, command names and command arguments), I'd really appreciate any suggestions. Otherwise I may have to look into extending dash or something.


Okay... no one is making you use Bash. I use it a lot, depend on its features, and I'm glad they are still improving it. I agree it has lots of warts from retaining so much backwards compatibility, but for certain use cases it's still much easier to write (and read) than any "real" programming language, including stuff like Python and Ruby. Since you don't need to deal with those use cases, that's cool. Not sure why you came here to trash it, though.


I've used fish for a few years now and am a big fan. I'm not sure if it meets your definition of lightweight but tab completion works very well out of the box.


I switched to fish around 2010-2011, and I have never looked back. I’ve never had to set anything else besides my personal functions. These days it’s even faster and much more stable, and I’ve had so much more practice that I know all keyboard shortcuts and understand a lot of its inner workings.


"If anyone knows of a shell which meets my criteria of being lightweight but with command history and tab completion (paths, command names and command arguments), I'd really appreciate any suggestions. Otherwise I may have to look into extending dash or something."

I would also love to know the answer to this question. I am a big fan of shells and shell programming in general and POSIX shell in particular.

Only suggestion I currently have is ksh, of which there are a few implementations, ksh93 still developed (https://github.com/att/ast), pdksh from OpenBSD (of which there is a portable version here: https://github.com/ibara/oksh) and MirBSD ksh (https://www.mirbsd.org/mksh.htm).

Otherwise of interest is mrsh: https://github.com/emersion/mrsh which was recently mentioned by Drew DeVault in a blog post linked here: https://news.ycombinator.com/item?id=18777909.

EDIT: And by mentioning mrsh, I meant it as a better/easier base to extend to get what you are asking for.


I recently switched to fish and was quite happy with how little effort it took: chsh, use fish_config to pick a prompt with Git status, and that's it.


Thanks for all the suggestions.

Unfortunately zsh and fish are more bloated than bash, and dash and ksh are missing the features I use.

I've just found "yash" which looks like a nice compromise. I'm going to give that a try.


If I may humbly ask, what is your definition of bloated? Bloated in technical sense that the software does not operate within your determined hardware constraints (valid!) or the philosophical sense that it has more features than your needs (also valid of course)?


My definition aligns with both definitions you described. Although I don't really have any hardware constraints that would stop me from running bash, zsh or fish, so my reasons for not wanting to use them, probably align better with your second definition.

However, I wouldn't consider the second definition to be philosophical, but technical. It has more features than my needs, therefore it has more lines of code, potentially more bugs, uses more memory and has a greater attack surface.


If you're on Ubuntu, you could switch to dash. It's installed by default, and /bin/sh symlinks to it.


zsh has much better tab completion, you should check it out


I read that often but I don't understand how it can be true. Bash (and I suppose zsh's) tab-completion is programmable so you can make it do whatever you want to.


Zsh, especially with an addon package like oh-my-zsh isn't just programmable but is actually programmed. Like, it just works, near magically - for instance make <tab> looks for the Makefile in the current dir and actually scrapes the targets from it.


AFAIK Bash does that by default too (at least it's been the case on my Debian setups for years). It also works with git for example, not just with its subcommands but also with your commits, tags, branches, remotes, etc.

Many Debian packages come with a completion script for Bash, so you get it when you install them :).


zsh completion supports menus, typo correction, caching...


Oh okay, thanks.


Even more bloated


What is your definition of "bloated"?


zsh.


I think tcsh might hit your requirements.

(Personally, I use zsh, but it's much more heavy than tcsh.)


Don’t hate me for saying this but try powershell or pwsh


ksh


It's sad that lists.gnu.org is running obsolete TLS 1.0 crypto with weak 1024-bit DH. Either upgrade to TLS 1.2 with reasonable cipher suites, or just go back to plain HTTP.


Since I'm clueless on the subject, can I ask how you determined that information and what resource I could use to become better informed?


You can find that information on the security tab of your browser's developer tools. You can also find a lot more info about any site's HTTPS configuration at ssllabs.com. Here's the report for lists.gnu.org: https://www.ssllabs.com/ssltest/analyze.html?d=lists.gnu.org


A good first step is disabling SSL 3.x and TLS 1.0 in your daily browser. I would also recommend the excellent Qualys SSL Server Test: https://www.ssllabs.com/ssltest/


Is there any way to do that on Chrome macOS?


This video might help you [1].

[1]: https://www.youtube.com/watch?v=xA_IBcoQTD4


I can’t imagine what BASH_ARGV0 is for. Can someone more sage supply an example of what problem it solves?


A use-case I've long wanted it for is better "--help" messages. If you want to tell the user how to invoke the program again, argv[0] is the right thing:

Given:

    #include <stdio.h>
    
    int main(int argc, char *argv[]) {
    	printf("Usage: %s [OPTIONS]\n", argv[0]);
    	return 0;
    }
Running it as `./dir/demo --help` gives:

    Usage: ./dir/demo [OPTIONS]
Put it somewhere in $PATH, and run it as `demo --help`, and it will give:

    Usage: demo [OPTIONS]
Perfect!

But with a Bash script, argv[0] is erased; $0 is set to the script path that was passed to `bash` as an argument.

Given:

    #!/bin/bash
    echo "Usage: $0 [OPTIONS]"
Running it as `./dir/demo --help` gives:

    Usage: ./dir/demo [OPTIONS]
So far, so good, since the kernel ran "/bin/bash ./dir/demo --help". But once we get $PATH involved, $0 stops being useful, since the path passed to Bash is the resolved file path; if you put it in /usr/bin, and run it as `demo --help`, it will give:

    Usage: /usr/bin/demo [OPTIONS]
Because the call to execvpe() looks at $PATH, resolves "demo" to "/usr/bin/demo", then passes "/usr/bin/demo" to the execve() syscall, and the kernel runs "/bin/bash /usr/bin/demo --help".

In POSIX shell, $0 is a little useful for looking up the source file, but isn't so useful for knowing how the user invoked you. In Bash, if you need the source file, you're better served by ${BASH_SOURCE[0]}, rendering $0 relatively useless. And neither has a way to know how the user invoked you... until now.

It's a small problem, but one that there was no solution for.


> If you want to tell the user how to invoke the program again, argv[0] is the right thing

Some pedantry: it's actually not. The argv array is a completely arbitrary thing, passed by the caller as an array of strings and packed by the kernel into some memory at the top of the stack on entry to main(). It doesn't need to correspond to anything in particular, the use of argv[0] as the file name of the program is a side effect of the way the Bourne shell syntax works. The actual file name to be executed is a separate argument to execve().

In fact there's no portable way to know for sure exactly what file was mapped into memory by the runtime linker to start your process. And getting really into the weeds there may not even be one! It would be totally possible to write a linker that loaded and relocated an ELF file into a bunch of anonymous mappings, unmapped itself, and then jumped to the entry point leaving the poor process no way at all to know where it had come from.


Sure, argv[0] is just a string that the caller can set when they call execve(). That doesn't mean it has no meaning. You are correct that there is no way to know which file was passed as the first argument to execve(). But argv[0] is specified to mean roughly "welcome to the world, you are argv[0]", and to tell the program what it is. Sure, you could lie to the program and tell it that it's something it's not by passing a different string to execve(); you can even do this from a Bash script with `exec -a`.

I stand by my original statement: If you want to tell the user how to invoke the program again, argv[0] is the right thing. I didn't say that running argv[0] will necessarily actually invoke the program again, I said that it's the right thing to tell the user. If the caller set argv[0] to something else, it's because they wanted your program to believe that is its identity, so that's what it should represent its identity as to the user.


Surely the user already knows how to invoke the program, since he/she did it literally two seconds ago.

What if I invoke it from a distant path, do I want my 73 character long path to be prepended in the --help ?


Especially on something like NixOS, where /bin/bash is actually /nix/store/3508wrguwrgu3h5y9354rhfgw056y-bash-5.0/bin/bash.


Do note that was my complaint with $0, that when using $PATH it was set to that full gross path.

If /bin/foo is actually /nix/store/3508wrguwrgu3h5y9354rhfgw056y-foo-5.0/bin/foo, then when you run "foo", 0="/nix/store/3508wrguwrgu3h5y9354rhfgw056y-foo-5.0/bin/foo" and BASH_ARGV0="foo".


If you run --help you already have the path to the executable. Press the up arrow and you got it, even on nix.


argv[0] is defined to be the "program name" by ISO C. What exactly this means is obviously platform dependent, and of course the caller can ignore it altogether and put something random there, but at that point they're the ones misusing the API in a way that breaks the spec.

Note that this is ISO C, not even POSIX. So ELF, Bourne shell etc are implementation details that are out of scope on this level.


I'd love to see --help start with a few examples of the most common use cases and the parameters you would use. Include the exhaustive list of parameters after that. That would make the command line much more accessible for a significant portion of users. Or at least me. I'm always forgetting the syntax of commands I use infrequently.


According to the email, it expands to $0, though. So as far as I can see, its only use is to set $0.


Personally, I just don't bother trying to show the user their particular invocation:

    #!/bin/bash
    echo "Usage: $(basename $0) [OPTIONS]"


Minor nitpick, but wouldn't ${0##*/} work as well?


Yep! Not a nitpick at all. There's more than one way to do lots of things.

I prefer my version because it's easier for me to remember and read. I do use pattern substitution for other things, but I usually have to go to an interactive shell and create a test variable in order to remind myself whether I want # or %. I understand favoring built-in features over subshell calls for heavily used code, but I don't think this is especially critical for outputting usage info.
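For what it's worth, a quick reminder sketch: `#` trims a pattern from the front, `%` trims from the end, and doubling the character makes the match greedy:

```shell
path=/usr/local/bin/demo
echo "${path##*/}"   # '##' = longest match trimmed from the front -> demo
echo "${path#*/}"    # '#'  = shortest match from the front -> usr/local/bin/demo
echo "${path%/*}"    # '%'  = shortest match trimmed from the end -> /usr/local/bin
```

So `${0##*/}` is the parameter-expansion equivalent of `basename "$0"`.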


Yes, that's what we've been stuck doing, because the full path included in $0 is useless if the program was run via $PATH.


The “set” command lets you override the positional parameters, but only starting at $1. You cannot directly set $0. But you can set BASH_ARGV0, and the value of $0 changes accordingly.


“BASH_ARGV0: a new variable that expands to $0 and sets $0 on assignment.”

Changing argv[0] would make utilities like ps show a more descriptive/shorter name, eg in the case of long command paths.
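A sketch of the $0 half of that (needs bash 5.0+; the replacement name here is made up):

```shell
#!/bin/bash
# Assigning BASH_ARGV0 rewrites $0, so later usage/error messages
# can show a friendlier name than the invocation path.
echo "before: $0"
BASH_ARGV0="friendly-name"
echo "after:  $0"   # prints: after:  friendly-name
```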


I don't think changing argv[0] in the current process will have any effect in the /proc file system.

And to do what you describe, there's `exec -a NAME' already:

  $ (exec -a NOT-BASH bash -c 'echo $0; ps -p $BASHPID -f')
  NOT-BASH
  UID        PID  PPID  C STIME TTY          TIME CMD
  dualbus  18210  2549  0 19:30 pts/1    00:00:00 NOT-BASH -c echo $0; ps -p $BASHPID -f


> I don't think changing argv[0] in the current process will have any effect in the /proc file system.

Yes it does. This is a standard trick for changing the process name at runtime, several daemons do this to change the process name of child processes created by fork() that aren't separate executable. For instance, OpenSSH's sshd sets the child-process for a session to "sshd: USERNAME [priv]".

`exec -a` lets you set argv[0] through an execve() call, but many times you want to set it without exec'ing a new program.


> I don't think changing argv[0] in the current process will have any effect in the /proc file system.

Yes it does - that’s the whole point of changing it.


I would like to understand how this would work.

argv is a buffer in Bash's process memory space. This is, AFAIK, not shared in any way with the kernel.

How would the kernel know that a process wrote to the memory location of argv[0] and then reflect that in /proc?

This is what I tried:

  dualbus@system76-pc:~/src/gnu/bash$ ./bash -c 'echo $BASH_VERSION; ps -p $BASHPID -f; BASH_ARGV0=NOT-BASH; echo $0; ps -p $BASHPID -f; (ps -p $BASHPID -f && : do not optimize fork)'
  5.0.0(1)-rc1
  UID        PID  PPID  C STIME TTY          TIME CMD
  dualbus  27918 20628  0 20:16 pts/5    00:00:00 ./bash -c echo $BASH_VERSION; ps -p $BASHPID -f; BASH_ARGV0=NOT-BASH; echo $0; ps -p $BASHPID -f; (ps -p $BASHPID -f && : do not optimize fork)
  NOT-BASH 
  UID        PID  PPID  C STIME TTY          TIME CMD
  dualbus  27918 20628  0 20:16 pts/5    00:00:00 ./bash -c echo $BASH_VERSION; ps -p $BASHPID -f; BASH_ARGV0=NOT-BASH; echo $0; ps -p $BASHPID -f; (ps -p $BASHPID -f && : do not optimize fork)
  UID        PID  PPID  C STIME TTY          TIME CMD
  dualbus  27921 27918  0 20:16 pts/5    00:00:00 ./bash -c echo $BASH_VERSION; ps -p $BASHPID -f; BASH_ARGV0=NOT-BASH; echo $0; ps -p $BASHPID -f; (ps -p $BASHPID -f && : do not optimize fork)


On Linux, reading /proc/<pid>/cmdline literally asks the kernel to reach into the target process's address space and fish out its argv[0]. This, erm... has some corner cases:

https://github.com/torvalds/linux/blob/v4.20/fs/proc/base.c#...

https://github.com/torvalds/linux/blob/v4.20/fs/proc/base.c#...

Changing the `ps` output in a cross platform way requires a number of platform-dependent strategies, e.g. how PostgreSQL does it:

https://github.com/postgres/postgres/blob/REL_11_1/src/backe...


Awesome, thank you! This is super useful. I stand corrected.


The argv array (and the envp array) are in a page the kernel set up when it created the process, and the kernel holds on to a reference to it, and remembers the address of those arrays in the page. The kernel doesn't need to "watch" that memory; when you read from `/proc/$pid/cmdline` or `/proc/$pid/environ`, procfs literally reads directly from $pid's memory space (remember that the kernel controls the page table, it can look into the memory space of any process it wants to). The kernel doesn't "know" that the value changed, it just reads the current value from the process' memory.


Any process can write into the argv they get from the kernel.

The kernel doesn’t need to monitor for reads - when procfs reads it, it's read from the process.

It doesn’t need to be specially ‘shared’ with the kernel. The kernel can of course read any memory it wants to from the process at any time.

I’ve implemented setting argv[0] in another language myself.


Can you show an example of how this would work with BASH_ARGV0?


Sorry I don’t know anything about how Bash is implemented, but when someone assigns to that variable Bash just needs to write that string to argv.


I tried python, bash and even C, none of them update /proc/self/comm when argv[0] is updated:

  dualbus@system76-pc:~$ cat argv0.c
  #include <stdio.h>
  #include <string.h>
  int main(int argc, char **argv) {
      FILE *fp;
      char buf[256]; // XXX :-)
      strcpy(argv[0], "XYZ");
      //puts(argv[0]);
      fp = fopen("/proc/self/comm", "r");
      fread(&buf, 1, 256, fp);
      buf[255] = '\0';
      puts(buf);
  }
  dualbus@system76-pc:~$ gcc -o argv0 argv0.c  -Wall
  dualbus@system76-pc:~$ ./argv0
  argv0

  dualbus@system76-pc:~$ python -c 'import sys; sys.argv[0] = "XYZ"; print(open("/proc/self/comm").read())'
  python

  dualbus@system76-pc:~$ ~/src/gnu/bash/bash -c 'BASH_ARGV0="XYZ"; cat /proc/$BASHPID/comm'
  bash
Furthermore, https://github.com/torvalds/linux/blob/master/Documentation/... says:

  > 3.6   /proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm
  > --------------------------------------------------------
  > These files provide a method to access a tasks comm value. It also allows for
  > a task to set its own or one of its thread siblings comm value. The comm value
  > is limited in size compared to the cmdline value, so writing anything longer
  > then the kernel's TASK_COMM_LEN (currently 16 chars) will result in a truncated
  > comm value.
Which works as advertised:

  dualbus@system76-pc:~$ ~/src/gnu/bash/bash -c 'echo -n XYZ > /proc/$BASHPID/comm; ps -p $BASHPID'
    PID TTY          TIME CMD
  28797 pts/6    00:00:00 XYZ
Can you show me an example, in any language, where updating argv[0] causes ps (or /proc/self/comm) to show the updated value?

EDIT: formatting.

EDIT2: I stand corrected, see willglynn's comment.


You're looking at the wrong file. Setting argv[0] doesn't update /proc/self/comm, it updates /proc/self/cmdline.

demo.c:

    #include <unistd.h>
    #include <string.h>
    
    int main(int argc, char *argv[]) {
    	strcpy(argv[0], "frob");
    	sleep(100);
    }
Terminal 1:

    $ make demo
    cc     demo.c   -o demo
    $ ./demo --greppable
Terminal 2:

    $ ps aux|grep greppable
    luke     25858  0.0  0.0   2164   752 pts/0    S+   00:32   0:00 frob o --greppable
    luke     25931  0.0  0.0   8192  2356 pts/5    S+   00:32   0:00 grep --color=auto greppable
    $ cat -v /proc/25858/cmdline
    frob^@o^@--greppable^@$


    $ ruby -e "\$0 = 'im not lying to you'; sleep"
    $ ps


>The purpose of set0 in the debugger is that users often write programs that refer to $0. For example to display a usage string or maybe they change the behavior based on how they were called. So this helps in making these programs act the same way and gives an alternative to invoking with "bash --debugger". https://lists.gnu.org/archive/html/bug-bash/2008-09/msg00054...

Yes, 2008.


Any recommended reading for Bash? I'm somewhat new to it and its interesting ways of getting things done. I've used it minimally in the past, but have found myself writing a 100+ LOC script, and I can't help but feel I'm likely over-complicating certain bits and pieces.


I'm a fan of reading code.

I would generally describe a GNU/Linux distro as being a "giant pile of shell scripts". That's a little less true with init scripts generally now being systemd units. But that's where I'd start: Look at the code that distros write, that isn't part of some other upstream software.

- Arch Linux's `makepkg` https://git.archlinux.org/pacman.git

- Arch Linux's `mkinitcpio` https://git.archlinux.org/mkinitcpio.git/

- Arch Linux's `netctl` https://git.archlinux.org/netctl.git/

- Downstream from Arch, Parabola's "libretools" dev tools package https://git.parabola.nu/packages/libretools.git/ (disclaimer: I'm the maintainer of libretools)

Gentoo also has a lot of good shell scripting to look at, but it's mostly either POSIX shell, or targets older Bash (their guidelines http://devmanual.gentoo.org/tools-reference/bash/index.html say to avoid Bash 3 features). I tend to believe that changes made to the Bash language over the years are enhancements, and that they let you write cleaner, more robust code.


There used to be this great resource called Linux Documentation Project. It's not as active as it used to be but it produced some really book-quality documents, including the Advanced Bash Scripting Guide at https://www.tldp.org/LDP/abs/html/ .

Read it! It's great. But know that a lot of bash scripting isn't really in bash; it really requires being proficient with grep, sed, cut, dc and a few other text processing utilities. Learn those! Then there are a few other tricks to be mindful of: be mindful of spaces (wrap your variable substitution in double quotes, mostly), be mindful of sub-shells (piping to something that sets variables can be problematic), and a few other things that can really only be learned by reading good code.

But it's also good to know when you shouldn't venture further. My rule of thumb is when I'm using arrays or hash maps, then it's a good idea to move to another language. That's probably Python nowadays. A lot of people use tons of awk or perl snippets inside their bash scripts; that can also be a sign that it's time to move the whole script over.
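To make the two warnings above concrete, a small sketch:

```shell
# Pitfall 1: unquoted expansion undergoes word splitting.
f='my file.txt'
printf '%s\n' $f     # two lines: "my" and "file.txt"
printf '%s\n' "$f"   # one line:  "my file.txt"

# Pitfall 2: each side of a pipe runs in a subshell, so variable
# assignments made inside the loop don't survive it.
count=0
printf 'a\nb\n' | while read -r line; do count=$((count+1)); done
echo "$count"        # prints 0, not 2
```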


Advanced Bash Scripting Guide on TLDP is a verbose and cumbersome-to-read writeup. Better read the bash man page and then http://mywiki.wooledge.org/BashGuide and http://wiki.bash-hackers.org/start


It can't be done. If you want to write reliable code, and actually notice all of the possible error conditions instead of silently ignoring them, your code needs to get more verbose and complicated than it would be to just use a more capable tool like Python or Node, and it still won't be as reliable.

If you have more logic than a couple of string comparisons, Bash is not the right tool for the job.


I recommend Greg's Bash Wiki ... https://mywiki.wooledge.org/BashGuide. See general notes, then at the bottom of the page are many links to additional considerations.

Like others say, "bash" is a hard tool to get right (and I'm not saying I do it right either, necessarily, but Greg's Wiki was real helpful!). I'm building a hybrid bash/python3 environment now (something I'll hopefully open-source at some point), and bash is just the "glue" to get things set up so most aspects of development can funnel through to python3 + other tools.

But ... things that make bash real useful:

    * it's available everywhere (even in Windows with Ubuntu-18.04/WSL subsystem)
    * it can bootstrap everything else you need
    * it can wrap, in bash functions, aliases, and "variables" (parameters), the
      real functionality you want to expose ... the
      guts can be written in python3 or other tools
Without a good bash bootstrap script you end up writing 10 pages of arcane directions for multiple platforms telling people to download 10 packages and 4 pieces of software per platform, and nobody will have consistent reproducible environments.

EDIT: I think there's a revised version of Greg's Bash Wiki in the works.


It is available almost everywhere, but be careful with the version: different Linux distros are at different versions, the last time I used OSX it was stuck on a very old version, and I expect the different BSD OSes to run fairly new versions.


BSDs don't ship Bash as part of the base system - you have to install it from packages or ports. And that one is the most recent that the maintainer bothered to package. E.g. FreeBSD is on 4.4.23 right now, which actually appears to be newer than e.g. Debian unstable.


Or you just bundle your application and its dependencies into a single folder for each OS and distribute it.


>Python or Node

Python, sure, but replacing Bash with Node just seems like replacing a language crippled by its need to be backward compatible with (Bourne) sh with a language crippled by its need to be backward compatible with what Brandon Eich came up with in two weeks in 1995.


It wouldn't be my first choice either, but modern JavaScript isn't backwards-compatible, it is at least possible to write reliable and robust code if you take some care and know what options to enable, and it's mostly consistent across platforms. Last month, I spent a week trying to handle all of the possible error conditions in an overgrown bash script. It was immeasurably worse.

Python is the more direct substitute, though: it's built-in to almost every platform, and the Python 3 stagnation has even given us a consistent version: 2.7.


As they say, bash is just good enough not to get replaced.


Sure it can. Treat it as a functional and immutable language, and you can go pretty far with it, pretty safely.

You're forgetting that bash really just calls other code. So you can combine two language's stdin/out functionality (eg, Python or Node) to pair code together. Sure, it won't be fast, but it can do quick wonders as an ad hoc data pipeline.


I just use set -e at the top of the script


That's a good idea, but it's not a complete solution. See http://mywiki.wooledge.org/BashFAQ/105
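One of the classic gotchas from that page, sketched: `set -e` is suspended while a command runs as part of an `if` condition, including everything inside any function the condition calls:

```shell
#!/bin/bash
set -e
check() {
  false            # you might expect this to abort the function...
  echo "still ran" # ...but errexit is disabled inside an `if` condition
}
if check; then
  echo "check passed"
fi
# prints: still ran
#         check passed
```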


Nice. But I use C++ so I'm used to shooting off my own foot.


By far my best advice is to write auto-complete scripts if possible: https://iridakos.com/tutorials/2018/03/01/bash-programmable-... . Many of mine will do things like look up database values, so "command<tab><tab><tab>" is the workflow 99% of the time.

Keep them short. There are always exceptions but the "do one thing" mantra is handy, they can always be wrapped into a more complex workflow with a bigger script. None of my frequently used ones are over 100 LoC.

Write them for you and you alone when possible, start off simple and iterate. Fight that developer urge to solve a generalized problem for everyone.

Embrace the environment and global environment variables. We're trained to avoid global variables like the plague but they are really useful in scripts. My auto complete scripts that I mentioned above, they know which database to connect to based off the environment variable and there are separate commands to switch environment.

Make sure you aren't using it where things like make would be more appropriate.
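A minimal version of such a completion script, with a made-up `deploy` command and a hard-coded word list standing in for the database lookup:

```shell
# Complete the first argument of a hypothetical "deploy" command.
_deploy_completions() {
  local cur=${COMP_WORDS[COMP_CWORD]}          # the word being completed
  # compgen filters the word list down to matches for the current prefix
  COMPREPLY=( $(compgen -W 'staging production rollback' -- "$cur") )
}
complete -F _deploy_completions deploy
```

Source it (or drop it in /etc/bash_completion.d/), and `deploy st<TAB>` completes to `staging`.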


I recommend Greg's Wiki. It's the only resource I use. It covers common pitfalls and anti-patterns.

https://mywiki.wooledge.org/


I've recently picked up a subscription to Destroy All Software and he has a few really good demos of how to architect simple and powerful shell scripts using bash. I'd really recommend it for anyone who doesn't need help with the semantics and instead needs help with the organization of a script.


I've got a subscription too, watching him go in some of the videos as he casually does Bash blew my mind, which is another reason I ask this question. I'm trying to at least watch all his videos, I intend to work through them after watching them all at least once, and then I intend to rewrite some of the solutions (the Ruby ones) in Python and Go just to be sure I understand the concepts more, and I'm not just copying and pasting.


* look into parameter substitution https://www.tldp.org/LDP/abs/html/parameter-substitution.htm...

* use traps http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_12_02.htm...

* read about safe ways to do things in bash https://github.com/anordal/shellharden/blob/master/how_to_do...

* this is pretty helpful writeup of general CLI usage, yet not bash specific https://github.com/jlevy/the-art-of-command-line (related HN discussion https://news.ycombinator.com/item?id=9720813)
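The trap link above boils down to an idiom like this (sketch):

```shell
#!/bin/bash
# Clean up scratch space no matter how the script exits.
workdir=$(mktemp -d)
cleanup() { rm -rf -- "$workdir"; }
trap cleanup EXIT    # fires on normal exit, `exit`, and errors under set -e
: # ... real work in "$workdir" goes here ...
```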


From the Google Shell Style Guide: "If you are writing a script that is more than 100 lines long, you should probably be writing it in Python instead."

https://google.github.io/styleguide/shell.xml


LOC alone doesn't mean much. I have a backup script in fish of 400+ lines of code because it tries to dump data out of different sources like MySQL, PGSQL, InfluxDB, /etc files and others. Just use what feels comfortable achieving the task.


That guide is really useful for making scripts more readable.


I recommend reading and memorizing all 30-ish of the readline shortcuts

https://tiswww.case.edu/php/chet/readline/readline.html#SEC1...

Readline is the library bash uses for editing the input line, and it has some nice movement keys. For example Alt-b moves the cursor back a word, Ctrl-u deletes to the beginning of the line, Ctrl-w removes one word behind the cursor.

They work in a bunch of other programs, like the Python interpreter's interactive mode for example.


Excellent advice! Thanks.


> have found myself writing a 100+ LOC script

Often, this is a good sign that you might want to switch to another scripting language.


That's true, so many times I have used shellcheck and started to improve a bash script, only to realize my time was better spent rewriting the script in Ruby.


Take a look at this, as well:

https://amoffat.github.io/sh/



advanced bash scripting guide: https://www.tldp.org/LDP/abs/html/

also, https://www.shellcheck.net/ not reading, but pretty neat. static code analysis for shell scripts, points out common errors.


I have a list of resources related to bash/cli[1]

I would highly recommend BashGuide[2] and ryanstutorials[3] as a starting point. After that, go through rest of the wooledge site for FAQs, best practices, etc

shellcheck[4] is awesome for checking your scripts for potential downfalls and issues

[1] https://github.com/learnbyexample/scripting_course/blob/mast...

[2] https://mywiki.wooledge.org/BashGuide

[3] https://ryanstutorials.net/linuxtutorial/

[4] https://www.shellcheck.net/


The man pages for bash are fairly comprehensive and I'd definitely recommend referencing them while writing scripts: https://linux.die.net/man/1/bash


The full manual is here. https://www.gnu.org/software/bash/manual/bashref.html GNU projects tend to have info pages that are a lot more complete and thorough than their man pages, so that'd be a good place to check too.


> GNU projects tend to have info pages that are a lot more complete and thorough than their man pages

Why is that? Why don’t they build both from a single source?


Because they're structured differently. A man page for an app is just that, a page, with some formatting and sections. Texinfo is more like your typical website - lots of small cross-linked pages with a common index. Consequently, a man page is usually an extended take on --help, while info pages are more like product manuals.

This is the theory. The practice is that GNU mandates Texinfo for its projects, and because documentation tends to be the weak spot for all open source projects, manpages end up the most neglected as a result, which can be quite annoying. Especially since pretty much nobody else uses Texinfo - man pages are good enough for most console apps, and those that need more detailed documentation use Docbook, Markdown etc.


They could easily just serialise the info pages under page headings and have that as an additional manpage.


Historical difference in style and preferences. GNU’s Not UNIX and all that Jazz, so they have different conventions, some of which would not translate to a man page that well. Plus Texinfo is a tool for generating multiple output formats from a single source.

Typically (unless this changed at some point) the info pages are the canonical reference for GNU projects and the man pages are to accommodate Unix hackers.[1]

[1] https://www.gnu.org/prep/standards/standards.html#Man-Pages



Great video to give you the right high-level perspective:

https://www.youtube.com/watch?v=olH-9b3VJfs


doing the hackerrank bash series helped me quite a bit


You can’t beat the man page if you’re patient with it. Or you could try my book:

https://leanpub.com/learnbashthehardway


Pardon my naivety, but what do you normally use?


I usually do programming in Python and other languages. In this particular case it makes sense since I'm taking advantage of command line utilities; otherwise I'm usually just making software myself, e.g. web services or daemons.


You could learn bash to understand existing scripts but you could also learn fish, which can be a more sane looking script than bash, if you're writing your own.


What is the value in creating built-in replacements for binaries like rm and stat?


My guess is that it's less about performance so much as reliability. I've had three incidents in the last two years which required connecting to a server that had an out of control process pool that had exhausted pids.

I also had a storage controller go and had to figure out how to use shell builtins to create a tmpfs and reflash the firmware.

There are many reasons to provide builtins for basic file commands, from saving the extra process start to the tiny performance boost for scripts that should probably be using find instead.


The built-in replacements run in the same process as the Bash shell, and thus, avoid the fork/exec system calls.

It's a minor performance optimization that might be useful if you're doing thousands of rm's or stat's in a script.
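Bash ships these as optional "loadable" builtins (shared objects built from examples/loadables in the source tree). A sketch of turning one on; the path below is where Debian's bash-builtins package installs them and may differ on your system:

```shell
# Load the builtin "sleep" over the external /bin/sleep, then unload it.
enable -f /usr/lib/bash/sleep sleep
type sleep        # reports: sleep is a shell builtin
enable -d sleep   # back to the external binary
```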


It seems like this would eliminate the need to read a file from disk and fork a new process, both of which take time. If you're just removing a single file, this is probably negligible, but if you have a script iterating over 10k files, this speed-up may be more welcome.


> if you have a script iterating over 10k files

That's probably a good sign you should advance from shell. If the script is trivial, it's going to be trivial in python / ruby / go / crystal / ... as well. If it's not trivial, that's another reason to move.


I agree that as scripts get more complex, you should migrate from bash. But the number of files a script touches says almost nothing about its complexity.


Wasn't saying otherwise. One reason is complexity. Another is performance.


True, but if you are iterating over 10k files and removing them, then a find|xargs pipe with xargs feeding rm the maximum number of parameters the kernel allows per fork will likely be faster than a bash interpreter loop, even with a bash builtin rm.
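Roughly, the comparison being made (sketch; the temp directory is a stand-in for whatever you're cleaning up):

```shell
dir=$(mktemp -d)
touch "$dir"/{a,b,c}.tmp

# Shell loop: one fork+exec of /bin/rm per file.
for f in "$dir"/*.tmp; do rm -- "$f"; done

touch "$dir"/{a,b,c}.tmp
# find|xargs: batches as many paths per rm invocation as the argument
# limit allows; -print0/-0 keeps unusual filenames safe.
find "$dir" -name '*.tmp' -print0 | xargs -0 rm --
```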


You think find|xargs would be faster than (gnu) find -delete?


No, my comment was "will likely be faster than a bash interpreter loop". The key phrase is "bash interpreter loop".

I was not comparing find|xargs to find -delete.


It’s faster to run them as a builtin than to exec the external binary. It makes a big difference when looping over many items.


Yet macOS is still on 3.2.


macOS doesn't appear to ship GPLv3-licensed code, and 3.2 is the last bash release under GPLv2.

Alternatively - newer versions of ZSH are frequently provided by Apple.


What's wrong with shipping GPLv3 code? Can't they just provide the source (are they making significant changes that they want to keep proprietary?) to comply with the license?


Apple likes to push DRM, which GPLv3 forbids. Apple also is afraid to give patent grants, which GPLv3 requires. Thus, Apple refuses to get anywhere near GPLv3 code, and won't even make it easy to give you GPLv3 source that you can build yourself.


GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin18)


  brew install bash


> The `history` builtin ... understands negative arguments as offsets from the end of the history list

At last!


BASH_ARGV0 < does that mean we can set process title after the script starts?


It would appear so:

> New features [...] BASH_ARGV0: a new variable that expands to $0 and sets $0 on assignment.
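A minimal sketch (requires bash >= 5.0; the title string is made up, and whether the change is also reflected in ps output may depend on the platform):

```shell
#!/usr/bin/env bash
# Assigning to BASH_ARGV0 changes what $0 expands to from then on.
if (( BASH_VERSINFO[0] >= 5 )); then
    echo "before: $0"
    BASH_ARGV0="my-process-title"   # hypothetical title
    echo "after:  $0"
fi
```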


how about a good way to pass around associative arrays and arrays


Reminder that 2019 version of macOS ships with 2007 (last GPL2) version of Bash, and will never ship with any newer version.

    /bin/bash --version
    GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin18)
    Copyright (C) 2007 Free Software Foundation, Inc.
macOS used to be an awesome developer machine with good tools out of the box. Now the built-in tools are just a bootstrap for their own replacement via Homebrew. Like IE for downloading Chrome.


It's just GPLv3 software though. In contrast, SSH is currently reporting OpenSSH_7.9p1, and that is fairly up to date.

I do think that if Apple can't ship a current version (for whatever reason), they probably shouldn't have it pre-installed at all, much like the BSD's don't come with bash by default either. Maybe they could ship with ksh/pdksh/mksh or something instead.


MacOS used to ship with tcsh as its default shell for years.


I still don't understand how a gnu binary could taint their OS. They can provide the source on opensource.apple.com. Case closed.


The problem is that GPL3 has more restrictions than just "publish the code". That's why projects such as FreeBSD, OpenBSD, etc. also won't include GPL3 code, as including GPL3 code would restrict what you can and can't do with the entire system.

I assume that Apple's reasoning is similar.


Can you name them? It's not exactly snark. My experience is that almost everyone with strong GPLv3 opinions turns out not to actually object to the terms themselves when discussed in isolation.

To head off: it's not the patent grant. Apache 2 has a very similar patent grant and everyone is fine with it.


Apple doesn't support it because GPLv3 requires that you are actually able to run the modified source code (not just see, change, and distribute it). While they probably could do that on macOS, macOS and iOS share the same kernel and many common source trees. So by providing macOS software with GPLv3 code, some of that code could potentially be shipped with iOS, which would land them in hot water because you are not able to just execute code on iOS. Even with a developer license, you couldn't just recompile the kernel (or even a library) and replace the default one.

Edit:

Here is more information about that:

> Protecting Your Right to Tinker

> Tivoization is a dangerous attempt to curtail users' freedom: the right to modify your software will become meaningless if none of your computers let you do it. GPLv3 stops tivoization by requiring the distributor to provide you with whatever information or data is necessary to install modified software on the device. This may be as simple as a set of instructions, or it may include special data such as cryptographic keys or information about how to bypass an integrity check in the hardware. It will depend on how the hardware was designed—but no matter what information you need, you must be able to get it.

Source: https://www.gnu.org/licenses/quick-guide-gplv3.en.html

That means if there was GPLv3 code in iOS, Apple would have to send you their private keys to sign any source code so you can run it on iOS.


Sure. But iOS doesn't have Bash (I don't think, even if it does: Then just ship GPLv2 Bash on iOS, and modern Bash on macOS where users might actually care). The GPLv3 of Bash doesn't infect the shared kernel or libraries between iOS and macOS, it's just the `/bin/sh` program. It's easy enough to check if it's being included in iOS.

"The reason we can't ship modern Bash on macOS is that we're concerned we might accidentally include Bash on iOS."?

(and /bin/sh on macOS is already sufficiently user-modifiable to satisfy GPLv3)


Apple is pretty firm on no GPLv3 because accidentally shipping it on iOS can be pretty dangerous for them. Whether you agree with the decision to just not ship GPLv3 software (I don’t), I can understand apples concerns regarding GPLv3.

Furthermore, it wouldn't surprise me if iOS had bash support, even if just for debug builds, and Apple would have to worry about testing, building, and maintaining two versions of bash while still risking getting them mixed up and accidentally shipping some parts of the bash source code. It's much easier to just never use GPLv3.


Accidentally shipping IP which a company does not own is a constant risk. Companies sometimes accidentally ship proprietary software, copyrighted images, copyrighted text, patented techniques, trademarked names, trademarked images, and more.

A Swedish copyright lobby organization committed copyright infringement on their own website. Bethesda, the game company, had a trademark dispute with the developers of Minecraft. Apple and Samsung are constantly in patent wars with each other. There is an almost unlimited number of cases where one company sues another over contract and license disputes.

Shipping software is pretty dangerous, and there is no data to show that shipping GPLv3-licensed software is more likely to cause accidental infringement of someone's IP than software under other licenses.


> accidentally shipping it on iOS can be pretty dangerous

I read the same argument about GPLv2 in like 1994. To date, there remains no case law I'm aware of where a copyright holder "lost" anything by "accidentally" shipping GPL software. How many decades does it take for us to put this myth to rest?


A lot, because it's hard to derive meaningful conclusions from something that's used so relatively little precisely because of licensing concerns.

Anyway, with v2, the concerns were more vague - you'd have to accidentally link something GPLv2. With v3, even just shipping the binary signed with a private key on a platform that requires said key to load it, is enough to potentially force disclosure of that key (and that is by design of GPLv3 - it wants to kill the TiVo model).


The outcome of violating the GPL is license termination: you cannot distribute it any more. I don't think there has ever been a case of enforced further distribution or disclosure.


Termination means recalling / reimaging every device on store shelves, and maybe compensation for those distributed without a license.


My main point was that to my knowledge there has never been a case of forced further distribution.

About termination, what you mention is kind of stopping further violation of the terms and maybe compensation for the past violations. What I wanted to mention (but wasn't at all clear about) is that after termination, you cannot even distribute if you intend to abide by the terms unless you are forgiven by the copyright holder. (In the case of the GPL version 3 there is some grace period during which for the first violation, if you abide within that period, you are explicitly permitted to resume distribution under the license terms.)


Right, so the risk can't be quantified and the whole thing is just isomorphic "because I don't like it". Which was as true then as it is now. It's not about license terms, it's about what amounts to politics.


Experimenting is not the only way to quantify risk. You can also e.g. have several highly paid lawyers look over the text of the license, and see what each of them had to say.

The big players in the industry did just that, and there appears to be a remarkable consensus on it.


The remarkable consensus behind IBM's purchase of Red Hat, you mean? Seems like their highly paid lawyers were down with it.

Again, it's just stunning the extent to which this FUD hasn't changed at all over a quarter century.


The GPLv3 was intentionally designed to create the risk that Apple and others are trying to avoid. IBM doesn't have that risk because they don't sell hardware or software packages, they sell expensive service contracts.


Red Hat, a division of IBM, literally does almost $3B of revenue selling a Linux distribution that includes almost the entirety of the GNU corpus, all GPLv3. IBM bought them because they already resell packages and contracts based on the same stuff and wanted the vertical integration. So yeah, you buy an expensive service contract from IBM and it comes with (gasp) a giant license for GPLv3 stuff in RHEL.

That point is silly. If what you say were true, then the purchase would have been poison. It's not true. It's FUD. And it's hilariously the same FUD that people were flinging around in years before most of the existing FUD-flingers were even born.


The corporate risk profile of Apple vs IBM is so drastically different that it's not even a serious comparison.


> It's not about license terms, it's about what amounts to politics.

Which is exactly what the GPL is! Especially v3.


The product in question is bash, though, which doesn't ship on iOS at all and which obviously could be (and should be) modifiable on OS X.


> Apache 2 has a very similar patent grant and everyone is fine with it.

OpenBSD apparently isn't fine with it, fwiw.

> The original Apache license was similar to the Berkeley license, but source code published under version 2 of the Apache license is subject to additional restrictions and cannot be included into OpenBSD.

> -- ref: https://www.openbsd.org/policy.html


Eh, is it 2007 again? Hasn't this been discussed to death already? I have little interest to repeat it again; this discussion has been on-going for ten years already (longer if we count GPL vs. BSD discussions, which are roughly similar). Everyone should be familiar with the arguments by now; I trust you're already familiar with mine.

I understand that you're well-intentioned, but I find the suggestion that I'm somehow not informed to be rather patronizing to be honest.


So... you don't really know what restrictions you're talking about, right? I mean, you brought up the subject:

"There are restrictions!" "What restrictions?" "Everyone knows. It's patronizing to ask me. I won't say."


The whole tivo-isation thing has been discussed extensively, as well as the problems with trying to prevent it with the license.

I didn't include an entire summary of it, as it can be easily looked up on e.g. Wikipedia, as well as many other places.


On FreeBSD, as I recall, they don't want any GPL'd code in the base system, simply because the ideological goal is to have everything under BSDL. It's not that it affects other bits of the system though.


Yeh, GPL2 is also considered less than ideal, and replacing GPl2 code has been a low-key long-term goal. GPL2 is still considered acceptable though, whereas GPL3 is outright barred.


Is there any GPL code remaining in the base at this point? I thought it was all gone once they replaced gcc with Clang.


I think gcc is still in base, right? They just added clang and set that as the default, AFAIK. I haven't kept up with FreeBSD much in the last few years, so not sure what the plans are to remove gcc.

Also, a quick look at the source tree shows there are still some other parts as well. GNU binutils, GNU diff, dialog is GPL, etc. See: https://github.com/freebsd/freebsd/tree/master/contrib

In the kernel, I think there are some drivers that are based on the Linux GPL code, for example: https://github.com/freebsd/freebsd/blob/1d6e4247415d264485ee... (just first hit from GitHub search for "gpl")

So yeah, looks like there's still plenty of GPL code in FreeBSD that's not trivially replaceable. The same applies to OpenBSD, although in general the OpenBSD people tend to be a bit more proactive in replacing GPL code.


It also ships a recent version of zsh. But I agree, you're better off with Linux if you want a developer box and deploy on Linux.


How's that? The GUI tools on macOS are far more complete in quality and quantity than on the Linux desktop, and you can just run a Linux VM to mimic the deployment server.


I don't care about the GUI tools, I just need an editor, a terminal and a web browser. And maybe Sequel Pro once in a blue moon. The MacOS GUI is quite annoying when you want to do stuff fast like for instance move windows between virtual desktops or switch the desktop with the mouse wheel.

The tooling is just not there (old Python, old Perl, old Ruby, and of course different versions from your deployment environment), you have to resort to third-party tools such as Homebrew or MacPorts, you have to install Xcode to get gcc, you need an Apple ID to do that, the system-level API is incompatible with Linux, and the filesystem is, or at least was, case-insensitive. macOS versions after El Capitan are also getting worse at compatibility with other Unix-like platforms. It's a pain to set up a development environment, really, especially if you use any dynlibs. Instead of a VM we have a staging server where we deploy, and there are almost always surprises.

In Linux the tooling is just there, a few seconds and one package-manager command away. If your package is not there, then there are PPAs or OBS repos. You can reproduce the platform you're deploying to as closely as possible, and there are fewer surprises.


> It also ships a recent version if zsh.

Recent compared to bash maybe. 10.14's zsh is two years old.


Thank the GPL v3 for that one. Same reason GCC is frozen at 4.2.1 on mac.

I switched to zsh, personally - the one Apple ship is pretty current.

macOS is still a pretty solid developer machine.


Some applications still require bash, like WireGuard.


WireGuard doesn't require bash. The main configuration utility, wg(8), is written in vanilla C. However, the tools ship with a little convenience script, called wg-quick(8), which is indeed written in bash. It started out as the thing I was using to configure my laptop on the go, but then others found it helpful. It's by no means essential or central to WireGuard, and lots of people use different things to wrap wg(8) and the various networking tools.


Based on your macOS docs[1], it looks very much like WireGuard requires bash version 4 or newer, because wireguard-tools will install bash as a dependency. Maybe you should edit the docs, or not ship wireguard-tools?

   $ brew install wireguard-tools
[1] - https://www.wireguard.com/install/


Sorry if I didn't make it clear before: wg-quick(8) is part of wireguard-tools, because people find it useful sometimes. But it's in no way essential and WireGuard functions entirely without it. That's what I mean when I say WireGuard doesn't require bash. That little helper script is just a small convenience thing some people like.


Totally aside: is there a tool that can parse a programs code, then consume a folder of (headers for?) versions of a dependency and say the versions that are [a priori] compatible?


What's the problem with running it on a 10-year-old bash?


All the 10 years of new features and bug fixes?


Which are? And I doubt those scripts rely on the 'new' features out of a 30-year-old history.


bash 4 introduced associative arrays (aka dictionaries or hashmaps).

Which, while you can do without, is a nice feature that makes some things easier.


I assume you can install newer versions of Bash on macOS though?

(But it certainly suggests to me that macOS isn't actually the easiest out-of-the-box solution if your workflow includes anything more than web browsing.)


On macOS almost everyone uses the Homebrew package manager https://brew.sh. Installing bash takes a couple of extra steps, because you have to add the file path of the new bash binary to /etc/shells and then set that path as your default shell:

    brew install bash
    echo /usr/local/bin/bash | sudo tee -a /etc/shells
    chsh -s /usr/local/bin/bash
and if you're on macOS and still haven't heard of homebrew, you first need to install it with

    /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"


That command makes my eyes twitch. This sort of "download random stuff from the internet, then pipe it to ruby/perl/python" madness should end, like right now. I'm assuming you're a reasonable person who knows what that command does, but can you really not see how commands like this can be misused to execute arbitrary code on people's computers, even those of experienced engineers who carelessly copy-paste? As a rule of thumb, to whoever is reading this: please don't trust random strangers online and run their code. You never know. Always download code from trusted sources and check SHA hashes. Piping anything to ruby/perl/python is playing with fire.
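The "check hashes" advice amounts to a download-inspect-verify pattern; a sketch (the URL and expected hash below are placeholders, not real values, and macOS uses `shasum -a 256` instead of `sha256sum`):

```shell
# Fetch to disk instead of piping straight into an interpreter.
curl -fsSLo install.rb "https://example.com/install.rb"   # placeholder URL

less install.rb   # actually read what you're about to execute

# Verify against a checksum obtained through a trusted channel.
# sha256sum -c exits nonzero on mismatch, so && stops you there.
echo "0000placeholderhash  install.rb" | sha256sum -c - && ruby install.rb
```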


That command curls a ruby file from the Homebrew repository on Github. If that Github repo (or Github) is compromised, then no matter how you installed homebrew, you are going to get hacked the next time you run `brew update`, which by default is run on every `brew install`.

For me, the odds of that repo being compromised for long enough are outweighed by the convenience. This has been brought up on HN before and I'm not convinced.


Would you notice if that URL was “raw.githubusercontent.someotherdomain.com” instead of “raw.githubusercontent.com”?

I’m pretty sure at least 50% would merrily copy & paste it in their shell.


Alternatively, follow the homebrew installation instructions: https://docs.brew.sh/Installation


You probably don't need to update /etc/shells unless you really want to. chsh just won't let you switch to a shell that's not listed there without using sudo, which isn't a problem for most (generally single-user) Mac systems. You can also change your shell from the Users & Groups system preferences (in the advanced settings).


You mean, every developer, not everyone.


Sure, you can install a newer version of bash via Homebrew, then update PATH in your shell profile.


Yeah just use homebrew and forget about it


Just use zsh.

