Hacker News
How Bash Completion Works (tuzz.tech)
299 points by cpatuzzo on Oct 6, 2019 | 47 comments



I always disable shell completion after it burned me a few times:

1. Completion that blocks the shell, by doing a lookup across a network (e.g. to complete a remote path). Can hang for several seconds.

2. Completion that gives me a misleading/incorrect view of the filesystem, by "intelligently" filtering which files/filetypes it will let me complete. For example, if I have foo.mp3 and foo.txt, and a media-player command tab-completes and only "sees" the mp3, but I actually wanted to know that the .txt was there as well (and in some cases, open it with the media player!)

3. When the completion has a wrong view of the options provided by a program (e.g. completion is not aware of some useful options). Instead of looking at the man-page, I've been tricked by a bad completion-set.

I would love a better shell with good completion, but I need to avoid these types of problems. Does anyone else run into this?


Coincidentally, just this week I've been working to resolve point 1 in my own shell (https://github.com/lmorg/murex). The compromise I came up with was:

- The shell would attempt to return all suggestions immediately

- However, a subset of functions known to be slower (like recursive directory lookups on larger file system hierarchies) are given a "soft timeout": once that timeout is reached, the results from the faster subset of functions are returned as the suggestions while the slower subset continues to run in the background

- The slower subset is also given a "hard timeout": if a function still hasn't completed by that point, it is just killed. Any results it has produced by then are appended to the auto-completion suggestions.

- If the slower subset finishes before the hard timeout, its results are appended to the suggestions. If it finishes before the soft timeout, its results are simply part of the initial suggestions.
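
A minimal bash sketch of the hard-timeout step only, under stated assumptions: this is not murex's implementation, `with_hard_timeout` and `suggestions` are hypothetical names, the slow completer is faked with `sleep`, and the `timeout` utility from GNU coreutils is assumed to be installed. The soft timeout is left out.

```shell
# bash sketch: run a slow completer under a hard timeout and keep whatever
# it printed before being killed. Assumes GNU coreutils `timeout`.

# with_hard_timeout SECONDS COMMAND [ARGS...]   (hypothetical helper)
with_hard_timeout() {
    local limit=$1
    shift
    # timeout(1) kills COMMAND once the limit passes. Lines it already wrote
    # to stdout still reach the caller, so partial results are not lost.
    # A timeout is not an error for completion purposes, hence `|| true`.
    timeout "$limit" "$@" 2>/dev/null || true
}

# Fast suggestions first, then whatever the (fake) slow completer produced
# before the one-second hard timeout.
suggestions() {
    printf '%s\n' fast-one fast-two
    with_hard_timeout 1 sh -c 'echo slow-early; sleep 5; echo slow-late'
}
```

Calling `suggestions` should return after roughly one second with the two fast entries plus the single slow entry that was ready in time.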

At the moment it's only been implemented for file and directory suggestions. The fast function is just any files and directories in the current directory level, whereas the slow function will return a larger directory hierarchy (for quick navigating akin to fzf). But the plan is to extend it further to support other dynamic completers as well.

This wouldn't completely solve your point though. If the "fast" query actually runs slow (eg you're trying to complete when inside a network mounted filesystem when the network is dropping) then the shell would still hang. At some point I'll add timeouts on them as well. But I am inching towards that goal.

Be warned though, because this feature is less than a week old, it's still only in a feature branch. :)

Your point on 3 is apt too. One of the things I built early on was man page parsing for auto-completion suggestions. I've since learned that I'm not alone in that regard either (Fish does it as well, and from some of the demos I've watched it looks like they've done a nicer job there too).

More recently I've also been writing tools that parse binaries for flags as well (in instances where you download a statically compiled binary, e.g. terraform, and thus don't have any corresponding man pages). That work is very much in its infancy though.


Some of this could be down to the completions that are provided. A package such as bash-completion comes with completions for a lot of common software. One suggestion is to use oksh (https://github.com/ibara/oksh), a portable version of pdksh (the public domain Korn shell) that comes from OpenBSD. I recommend this shell because it has programmable completion that is much simpler than what bash provides. Aside from the built-in file name completion it doesn't come with anything else, and (aside from people's dotfiles on GitHub) you won't find any packages that provide completions for it. Basically you have to customise it to your needs, but because the completion is so simple, whenever you find yourself wanting completion for something, you can spend a few minutes adding it to your .kshrc and move on.


I've run into all of these, but, FWIW, I still prefer smart autocompletion in bash by a large margin. Somehow I learned to avoid the bad areas I guess.


Yeah homebrew does that remote lookup thing when I try to complete `brew install xx<TAB>`. Because of the delay, it's not a good way to discover packages at all especially if the package name has an unexpected prefix that causes bash completion to fail. Much better to just `brew search xx` then browse the entire list of possible candidates than to rely on tab completion.


> Yeah homebrew does that remote lookup thing when I try to complete `brew install xx<TAB>`.

AFAIK Homebrew doesn't do remote lookup; you can try it out by turning off your Wi-Fi and checking that autocompleting formula names still works.

> Because of the delay, it's not a good way to discover packages at all

The delay is due to Ruby's slow speed; it looks like Homebrew just searches the whole formula directory with Ruby.


I use Fish for that kind of thing. I don't write shell scripts for Fish; that's what I use Bash for (portability). But Fish has intelligent autocomplete hints.


> 1. Completion that blocks the shell, by doing a lookup across a network (e.g. to complete a remote path). Can hang for several seconds.

I've had this happen to me. Would be nice if it could be interrupted with Ctrl-C. My workaround has been to wrap the argument that's triggering that in quotes (when redoing the command in another terminal).

> 2. Completion that gives me a misleading/incorrect view of the filesystem, by "intelligently" filtering which files/filetypes it will let me complete. For example, if I have foo.mp3 and foo.txt, and a media-player command tab-completes and only "sees" the mp3, but I actually wanted to know that the .txt was there as well (and in some cases, open it with the media player!)

Maybe a keybinding can be added to enable/disable "smart" completion (leaving only simple filepath completion). Seems like it should be a relatively quick configuration fix.

My workaround for this, for when I want simple filepath completion, has been to prefix the command with `echo`. With vi keybindings it's just `<Esc>Iecho <Esc>A` (it's faster if you make your CapsLock an Esc); with emacs keybindings it's `<Ctrl-A>echo <Ctrl-E>`. Afterwards, to delete the `echo`, it's `<Esc>^dwA` with vi keybindings and `<Ctrl-A><Alt-D><Ctrl-D><Ctrl-E>` with emacs keybindings.

> 3. When the completion has a wrong view of the options provided by a program (e.g. completion is not aware of some useful options). Instead of looking at the man-page, I've been tricked by a bad completion-set.

Maybe the list of options could be parsed out of manpages and --help output. Then, it would coincide.
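
A rough bash sketch of the `--help` half of that idea. The names `extract_long_opts`, `_complete_from_help` and `mytool` are made up for illustration, and `mytool` is a fake program so the example is self-contained. Running an arbitrary program with `--help` during completion is not safe in general, so treat this as a demonstration rather than something to install as is.

```shell
# bash sketch: derive long-option completions from a command's --help text.

# Pull every --long-option out of the text on stdin, one per line.
extract_long_opts() {
    grep -oE -- '--[A-Za-z0-9][A-Za-z0-9-]*' | LC_ALL=C sort -u
}

# Completion function: bash passes the command name as $1 and the word
# being completed as $2.
_complete_from_help() {
    local opts
    opts=$("$1" --help 2>&1 | extract_long_opts)
    COMPREPLY=( $(compgen -W "$opts" -- "$2") )
}

# Hypothetical stand-in for a real program and its help output.
mytool() {
    printf '%s\n' 'Usage: mytool [OPTIONS]' \
        '  -v, --verbose    print more' \
        '      --dry-run    change nothing' \
        '      --output=F   write to F'
}

complete -F _complete_from_help mytool
```

Because the candidate list comes from the program's own help text, it stays in step with the installed version, at the cost of running the program on every completion attempt.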


> Maybe a keybinding can be added to enable/disable "smart" completion (leaving only simple filepath completion). Seems like it should be a relatively quick configuration fix.

There is a keybinding to do simple filename completion: Alt-/ (https://www.gnu.org/software/bash/manual/bash.html#index-com...).
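
For a single command, another possibility is to replace its completion spec with plain filename completion. A small bash sketch, where `mplayer` is only a placeholder for whichever command filters too eagerly; it assumes the line runs after any completion package has been loaded, for example at the end of `.bashrc`.

```shell
# bash sketch: give one command plain filename completion.
# `mplayer` is a placeholder; substitute the command that filters files.
complete -f mplayer

# Print the completion spec now in effect for that command.
complete -p mplayer
```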


At least in zsh, you can interrupt slow autocompletes with ^C.


Zsh also lets you configure, via the `remote-access` style, whether you want to allow it to make remote connections for completion to begin with.


My experience actually is with zsh, it is about remote-access, and I cannot interrupt it with Ctrl-C.

The particular case is normal filepath completion on an NFS mount where the NFS server is a very old Unix OS using a very old version of the NFS protocol. The thing is that while the server hangs under particular circumstances, and zsh is trying to read the directory to autocomplete paths, the kernel cannot return from the normal FS system calls that are invoked, and the process gets stuck in uninterruptible IO. Not even SIGKILL works then.


There's a good chance that setting the path-completion style to false will help in this case. That does disable a useful feature, though it is only a useful feature if you know it is there and use it. If not even SIGKILL works, it is stuck in kernel code. That doesn't surprise me much with NFS. That's not easy to work around in the zsh code, other than by disabling functionality as with my path-completion suggestion.


Regarding (2), I hate the completion for fdisk [1]. It filters device filenames starting with /dev/disk/{by-id,by-path} even if those names make it easy to select a particular disk.

[1]: https://github.com/karelzak/util-linux/blob/master/bash-comp...


Regarding 3: I wish that all completions were provided by the programs (ideally by invoking the program) rather than by a hard-coded list in a bash-completion file.


In Haskell, the parser library for your CLI auto-generates a completion file, which is very convenient.


Fish shell?


My issue with bash completion is that it requires a completion script (i.e executing `complete`) for each command you want it to complete. The shell cannot automatically deduce appropriate completions when possible.

This problem is not specific to bash. Fish and other shells can't automatically complete commands either. There is simply no standard way to detect what type of auto-completion a command supports.

I know this comment is not about the OP, but I've wasted so much time dealing with this that I feel compelled to post this here. I wish shell developers would form some sort of consortium (like XDG) where they can agree on standard solutions to issues like this. No one benefits from having developers waste time manually writing and testing completion scripts for each project (times the number of shells they want to support).

Side note: I remember that there was a project that modified bash internals to detect automatic completion support. It did this by scanning for the presence of a magic string in the first N bytes of a file (or in some ELF section of a binary). If that magic string was present, bash would automatically generate a completion set by passing `--_complete` to the command. I think this is a simple and elegant solution to this problem and one I hope shell developers would consider.


> The shell cannot automatically deduce appropriate completions when possible.

I don't think this even _can_ be possible for a large number of programs.

For some small number of programs it _might_ be possible, if somehow (ignoring how for now) you were exposed to some standard interfaces like getopt or argparse and didn't have to care about argument ordering.

But a ton of other programs, with extremely complicated interfaces, simply wouldn't be possible. And these include commonly used programs.

Auto-completing awk would require a full language parser at the least, for example.

Another, ffmpeg, has a dizzying array of options, and the order of those options can completely change the intent of the command, and what options are allowed to follow without being ignored silently.

Even if we had perfect interface detection, we could probably only generate a subset of auto-completion options because that particular problem might not _always_ be solvable.


You raise a good point about solvable completion, and one that I agree with. However, automatic completion support doesn't have to cover every use case. It only needs to be good enough. Complex applications can fall back to providing their own completion scripts.

One point I'm not clear on is why you think such a scheme wouldn't be feasible for a large number of commands. Completion wouldn't be invoked until the first argument (the command's name) is complete, just like how bash doesn't scan the system's completion directory until it knows what command is being invoked.

And as for completion of the command's arguments, the command is only run once to generate the list of completion candidates.

Note that there are projects that already do exactly this, like [1].

[1]:https://github.com/kislyuk/argcomplete


> One point I'm not clear on is why you think such a scheme wouldn't be feasible for a large number of commands. Completion wouldn't be invoked until the first argument (the command's name) is complete, just like how bash doesn't scan the system's completion directory until it knows what command is being invoked.

Simply because some command pipelines are extremely complicated. For example, with ffmpeg, some flags can disable earlier flags, or re-enable them. Completion isn't a simple left-to-right thing. There are other programs that can be similar.


> Completion isn't a simple left-to-right thing.

Sorry, I still don't understand why that's relevant. It's probably my fault for not being clear enough. Please be patient with me as I try to elaborate further:

If the command is in charge of both generating completions and parsing arguments, wouldn't it be aware of all interdependencies between flags? Wouldn't such awareness make it possible to generate candidates based on previous flags?

I'm not familiar with ffmpeg, but let's assume that there was another similar program called ggmpeg. Let's assume that it accepts the flag '--input-format' and then either the flag '--repeat-video-range' or '--repeat-image-frames', depending on whether the format is a video or an image.

That is, only the following flag sequences are valid:

    ggmpeg --input-format=video/webm --repeat-video-range=.. ...
or

    ggmpeg --input-format=image/png --repeat-image-frames=.. ...
    
Let's also assume that ggmpeg signals to the shell that it is responsible for generating its own completions by passing the flag --_completion to it.

Now, if the user types the following command

    ggmpeg --input-format=image/png --repea[CURSOR]
and then presses tab, thereby invoking the shell's auto completion routine, which in turn ultimately executes

    "ggmpeg" "--_completion" "--input-format=image/png" "--repea"
wouldn't ggmpeg have all the necessary information to only suggest '--repeat-image-frames' and not '--repeat-video-range'?
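
A bash sketch of that hypothetical protocol, to make the question concrete. Everything here is assumed for illustration: `ggmpeg` is written as a shell function standing in for the imaginary program, `_self_complete` is a made-up name, and the line is split on whitespace only (ignoring quoting) because bash's own `COMP_WORDS` would by default also break words at `=`.

```shell
# bash sketch of the hypothetical `--_completion` protocol.

# Stand-in for the imaginary ggmpeg: when called with --_completion it
# prints the candidates that fit the words typed so far.
ggmpeg() {
    [ "$1" = "--_completion" ] || { echo "ggmpeg: normal run"; return 0; }
    shift
    local kind="" cur="" word candidate
    for word in "$@"; do
        case $word in
            --input-format=video/*) kind=video ;;
            --input-format=image/*) kind=image ;;
        esac
        cur=$word   # the last word is the one being completed
    done
    local candidates="--input-format="
    case $kind in
        video) candidates="$candidates --repeat-video-range=" ;;
        image) candidates="$candidates --repeat-image-frames=" ;;
    esac
    for candidate in $candidates; do
        case $candidate in
            "$cur"*) printf '%s\n' "$candidate" ;;
        esac
    done
}

# Shell side: hand the words before the cursor to the program itself.
_self_complete() {
    local line="${COMP_LINE:0:COMP_POINT}"
    local -a words
    read -ra words <<< "$line"
    # A trailing space means a new, empty word is being completed.
    [[ $line == *' ' ]] && words+=("")
    COMPREPLY=( $("${words[0]}" --_completion "${words[@]:1}") )
}

complete -F _self_complete ggmpeg
```

With the line `ggmpeg --input-format=image/png --repea` the function offers only `--repeat-image-frames=`, since the program has already seen the earlier flag when it is asked for candidates.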


Ok, here's a simpler example that might demonstrate why it can't always be solved.

    ffmpeg -i input.avi -metadata author=shakna output.mp4
What metadata keys are allowed, if at all, actually depends on the output file. If I'd chosen to make an ogv instead, then that metadata flag would become disabled, because it doesn't accept most sets of metadata.

I can't complete:

    ffmpeg -i input.avi -metadata a[TAB]
Because the entire set of CLI commands isn't simply left-to-right solvable.

However, if I do:

    ffmpeg -i input.avi -metadata author=shakna output.mp4 output.ogv
That means I'm outputting one file with metadata, output.mp4, and one file without it, output.ogv. So the flag is conditionally enabled or disabled.


I can now see what you meant by completion not being left-to-right solvable. Thanks for taking the time to lay out a clear example.

While completion isn't solvable in general, surely we can agree that its usefulness (when solvable) merits making it more readily available to users.

Having the shell implementation automatically deduce suggestions from the command itself (when available/possible) would go a long way towards increasing availability in my opinion.


Some shells do parse man pages and use other tricks to auto-determine completions. Though they're not POSIX-compliant shells, so you'd have to retrain a little there.

I'm interested to know more about that ELF / --_complete trick though. Do you have any more details? (when DDging the only result I can find on the topic is your comment)


> (when DDging the only result I can find on the topic is your comment)

You weren't kidding at all.

After spending more time than I'd care to admit searching for the library, I couldn't find traces of it anywhere on Google, DDG, or Bing. I relented and dug up whatever information I could find in my history.

Revisiting these projects, it's painfully obvious that I have jumbled a few projects together. Sincere apologies for the unintended goose chase.

This project [1] is the one that introduced the `--_completion` flag. It registers itself as a default completion function, but blindly and dangerously tries to run any command it gets with this flag. (This project is completely de-indexed from search engines and I'm not sure why).

Like [1], project [2] registers itself as a default completion function, but checks the first 1024 bytes for the magic string "PYTHON_ARGCOMPLETE_OK". It seems to be relying on argparse for completion.

I couldn't manage to find the project that was scanning ELF sections. I'm not entirely sure if it was relying on ELF sections specifically or if the fact that it was scanning ELF sections was only an implementation detail.

[1]: https://github.com/dbarnett/python-selfcompletion

[2]: https://github.com/kislyuk/argcomplete
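
The magic-string check described for [2] can be sketched in bash as below. This follows only the description in this comment (marker somewhere in the first 1024 bytes); the function name `has_completion_marker` is made up, and the project's real check may differ in its details.

```shell
# bash sketch: does FILE carry the marker string in its first 1024 bytes?
has_completion_marker() {
    head -c 1024 "$1" 2>/dev/null | grep -q 'PYTHON_ARGCOMPLETE_OK'
}
```

A default completion function could call this on the resolved path of the command and only run the command for candidates when the check succeeds, which avoids blindly executing programs that never opted in.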


Awesome. I really appreciate you taking the time to research that for me. Some great material for me to read through. Thank you


I'm only guessing this is what PowerShell was supposed to enable. I've never used it but I gather programs are supposed to offer enough metadata to do this?


Nice article, it’s usually hard to keep a reader’s interest when writing about shell.

I will nitpick that COMP_LINE et al are shell variables, not environment variables - you can tell by the fact you don’t need to export the “return value”. (Scare quotes, as the function’s exit code, manually specifiable with “return NN”, is also called its return value.)


I think they are env vars. complete is a bash built-in. It is executed in the process of the shell. You only need to export for child processes.


Love the implementation.

Can't help but think about another "startup" or project that uses Machine Learning to predict your bash commands... which in most if not all cases is probably not necessary XD.



Yikes. Yes.

Not sure if other developers agree with me on this: I probably don't need autocompletion. I'm not writing an essay; I'm writing code, where every character is a lot more impactful than forgetting a semicolon here or there. For this reason, autocompletion would probably be more of an annoyance than a help.


Personally I'm the exact opposite, and weirdly for the same reasons you cited. When programming I can usually predict what will come next, or the compiler will usually slap me if I've done something dumb. Whereas it's easier to miss something in a shell because everything is a little more "golfed" and mistakes can go unnoticed for a while if you're being downright careless.

I'd love it if there was a way to do a "test run" of a command line where it prints what it would do without actually touching the file system (for example).


Pretty sure a simple "dry run" is what you're looking for. Not sure what programs support it, but I know Docker for one does.


Sorry yes, "dry run".

There are definitely ways of doing it on a command by command basis but what I was more thinking about was a dry run on an entire pipeline. It's not a use case I tend to run into much these days though because there are usually better tools.


> `function _fizzbuzz () {`

I find that extra keyword distracting, when [POSIX.2 doesn't require it][1] and some folks go so far as to say to [actively avoid it][2].

Is there some good reason for it, or are people just used to a programming language requiring `fun` before a declaration, so it's programmer muscle memory?

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...

[2]: http://mywiki.wooledge.org/BashPitfalls#function_foo.28.29


Simple keywords for definitions can make grepping for stuff much easier. Given that shell allows other forms, it isn't that helpful in practice. I still use it because my eyes find start-of-definition faster with common patterns that are not just special-chars. YMMV.

Edit: I usually don't care for portability, and just use bash. Shell is so ugly already, I refuse to write my personal stuff in portability hell mode. Rather level up my Civilization skills instead.


Great tutorial! I’ve always wondered how this is implemented, and how I could add it to my scripts. The interface seems simple enough.


Cool read. I was recently wondering how this worked. I was using zsh but I assume the implementation is similar.


Is there anything like completion for ash within Alpine containers, or do you have to install Bash?


I don't believe Almquist ever incorporated tab completion. Korn did, so a number of more limited shells may support it, but probably not ash.


To be fair, Bash does not have a large footprint.


An alternative, as already mentioned in response to another comment, is oksh (https://github.com/ibara/oksh), a portable version of OpenBSD's pdksh. It's in the same Bourne shell lineage as bash but is lighter, and the completion is simpler: you have to customise it if you want it to work for you, but it's very easy and quick to do.


TLDR version: duct tape and hundreds of people helping prevent the contraption from falling apart :-)

That’s not the scary part, though. The scary part is that those completion scripts run with the user’s privileges, unsandboxed. I could find only a single CVE (https://www.cvedetails.com/cve/CVE-2018-7738/), but I think it would be wise to sandbox these scripts.


If you sandbox, can you actually guarantee completion?

Some more complicated commands might depend on the contents of a file, or the permissions of the file, and an environment variable at the same time to find valid options.

Others may require accessing a list of processes running under the active user, which could only be accessible when running as the user in some circumstances.

Tacking on any kind of permission system will probably break thirty-odd years of programs, and would likely be a hack because of how POSIX is expected to behave.

Duct tape it certainly is, but backwards compatibility isn't something to hate either. We run code all day everyday. That completion runs code shouldn't be a surprise, and you need to judge whether or not you trust it before using it.


Thank you!



