I recently had the same problem with Conda's and Pandoc's initializations in Bash.
At first I did the same as the author and just dumped the output from `pandoc --bash-completion` et al. into a file, but then I'd have to deal with cache invalidation on every machine that I use. Doing it manually isn't that bad, but remembering to update the file on a bunch of machines is a bit too ugly for me.
After a surprisingly short Google session, I ended up settling on lazy initialization [1]. Unlike the post I found, I did it all manually, though. Now I just have this tucked at the end of my `.bashrc`:
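Roughly along these lines; this is a minimal sketch of the idea rather than the exact file, and the `_lazy_pandoc` name is just illustrative:

    # Stub completion: on first use, install the real completion spec,
    # then return 124 so bash retries with the newly installed spec.
    _lazy_pandoc() {
      complete -r pandoc                     # drop the stub
      source <(pandoc --bash-completion)     # install the real completion
      return 124                             # ask bash to retry completion
    }
    complete -F _lazy_pandoc pandoc

[1]: https://dev.to/zanehannanau/bash-lazy-completion-evaluation-...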
I've also spent time profiling shell startup and found conda's init wasn't fast. (And that's consistent in fish shell, too.) The solution you found is elegant; I'll adopt it. nvm and rvm both have initializations with a decent perf hit, too.
Not really related, but this reminds me of an issue I once saw where starting vi on one of our servers took something over a minute. Everywhere else it was imperceptibly fast. It eventually turned out that in the distant past someone had accidentally piped a large file into vi, and so every time it started up it was loading up a command history that included this stupidly big file to absolutely no purpose, annoying everyone but not quite enough that anyone else had tracked down the cause.
Seems the default limit is 20 commands - perhaps someone had overridden it to be much higher. It was about a decade ago though so I may be misremembering something about the issue.
Bonus points if zsh measured startup time by default and issued periodic warnings about slow startup dependencies. I wouldn't mind seeing the occasional message to STDERR telling me that `kubectl xyz` is actually adding 250ms to shell startup.
Answer: Because it is sourcing the output of `kubectl completion zsh`, which takes 130ms to execute. (The article says something about zsh “caching” a static sourced file; does it actually do something like that? It still has to run the shell code, right?)
Question: How the hell do you take 130ms to print out the commands to set up shell completion? Is there something meaningful happening in that time, or is this simply the cost for the kernel to load a 43 MB do-everything kubectl binary? (43 MB is a lot, admittedly, but 130ms when most of that code stays untouched still sounds high to me.)
> The article says something about zsh “caching” a static sourced file; does it actually do something like that? It still has to run the shell code, right?
No, it does not; the article is wrong or phrased incorrectly.
Zsh completion functions, when properly configured, are autoloaded. The function is only read from disk when called. Say with _kubectl, if you do it correctly with a static _kubectl instead of following the stupid advice of `source <(kubectl completion zsh)`, on shell startup your _kubectl should be uninitialized:
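For reference, the uninitialized stub looks roughly like this (the exact body varies slightly by zsh version and autoload flags):

    $ which _kubectl
    _kubectl () {
        # undefined
        builtin autoload -XUz
    }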
After you actually try to complete kubectl once, run that again and this time you can see the actual function body.
    $ which _kubectl
    _kubectl () {
        local shellCompDirectiveError=1
        # ...
    }
That function is read from disk on use, not cached.
What zsh does cache, with compinit[1], is association of commands with completion functions. compinit reads the #compdef line of functions on $fpath (e.g. `#compdef _kubectl kubectl`), and generates a ~/.zcompdump (or some other path, mine is ~/.local/share/zsh/compdump for instance), which is basically a list of command and completion function pairs, plus a huge autoload invocation registering all the completion functions. This way zsh knows which function to load when a command needs completion.
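Concretely, the static setup amounts to something like this (paths are just an example; the generated file carries a #compdef line, as described above, so compinit picks it up from $fpath):

    # once, and again after upgrading kubectl:
    mkdir -p ~/.zsh/completions
    kubectl completion zsh > ~/.zsh/completions/_kubectl

    # in ~/.zshrc, before compinit runs:
    fpath=(~/.zsh/completions $fpath)
    autoload -Uz compinit && compinit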
The kernel does not load all 43 MB. It maps the file into memory and then executes it from the start address. During execution, the kernel's paging mechanism reads parts of the file into memory as the CPU touches them. So ideally this print path should touch only a few kilobytes of those 43 MB, with 2-3 pages loaded from disk. With an SSD that should be significantly faster than 130 ms.
What exactly happens with kubectl is hard to answer; someone would need to profile it, if that number is accurate. Maybe it checks for updates or something like that.
I agree. I never found zsh inherently slow, but the reason I initially used it was to play with all the additional 'bloat' its ecosystem offered, so to speak.
It was that experience which led me to try other shells; in the end I found ksh's simpler feature set and manpage more to my liking.
I have used the original ksh[1] for years, because it seems to have more features for complex shell scripts. It also supports real (floating-point) numbers, for people who need math beyond integer arithmetic.
The ksh that ships with OpenBSD; it's available under Linux as loksh. I've never needed a tutorial as such, because I found its man page to be well written and comprehensive. However, I can recommend the old book The Unix Programming Environment by Kernighan and Pike; it is also well written and helped me quite a bit. Just understanding one smaller tool (ksh) better has helped me do more with it and use fewer add-ons and extensions, which personally makes me happy; works for me.
It seems odd to run kubectl every time you open the shell. I just save the output and source it in every shell. On my machine it's 40ms to create a shell and eval the output of "kubectl completion bash". It's <10ms to "source ~/.dotfiles/compleation/kubectl".
Should the title include "when loading autocompletions"?
zsh loads in 105ms for me on a 7 year old workstation but I don't have that source command for Kubernetes' autocompletion in my zshrc. I'm only loading a few plugins such as fast-syntax-highlighting and zsh-autosuggestions. It doesn't load slower than bash in a way that I can perceive.
Nice tip on using the profiler but how do you read the profile output compared to `time zsh -i -c exit`? The blog post you linked mentions running `time zsh -i -c exit` to get the true time which is where I saw the 105ms but none of the columns in the zsh profile table add up to 105ms. If the table reports in milliseconds, the summary table up top adds up to about 41ms for the first time column.
One tip if you are using zprof: it only reports time spent in function calls, not in commands run at the top level of .zshrc. Last time I profiled, I ended up binary searching by wrapping different parts of my .zshenv and .zshrc in functions ("zinit1" and "zinit2") and then tracking which one took longest. My problem turned out to be a buggy alias that was calling "curl" when the alias was constructed rather than when it was called.
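In other words, roughly this (the wrapped contents here are placeholders; the point is that zprof gets a row per function):

    zmodload zsh/zprof
    zinit1() { source ~/.zsh/aliases.zsh }    # wrap one chunk of top-level config
    zinit2() { eval "$(pyenv init -)" }       # wrap another chunk
    zinit1
    zinit2
    zprof    # the table now attributes time to zinit1 and zinit2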
Since nobody has mentioned a tip for his cache problem: I would just solve this with cron. Run it daily, write the file at 5am, and then forget about it. You could also spawn a subshell at login to generate the file in the background.
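Something like this crontab entry would do it (the path is just an example and the directory has to exist; use an absolute path to pandoc if cron's minimal PATH can't find it):

    # regenerate the cached completion file every day at 05:00
    0 5 * * * pandoc --bash-completion > "$HOME/.bash_completion.d/pandoc" 2>/dev/null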
This article reminds me of my shellrc, which used to have a progress bar. I had it print \r{##..}, with an increasing number of # after every big source (2-3 second start time). Looking back, I don't know how I used that daily.
If I zcompile those two files and then repeat the "mean, std.dev of best 5 out of 20" measurement above, I get
0.07270 +- 0.00015
So, basically 2x faster start-up time just by doing zcompile on those two files. Also, if I re-run the (repeat...) I get numbers within +- 2-3 std.devs. So, the numbers and the run-to-run consistency all get along fine.
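For reference, the zcompile step itself is one line per file; I'm assuming "those two files" are the usual suspects, ~/.zshrc and the compinit dump:

    # produces .zwc files; zsh uses the compiled form automatically
    # whenever it is newer than the source file
    zcompile ~/.zshrc
    zcompile ~/.zcompdump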
A good way to isolate which work is causing slowness is the PS4 variable. This is the prefix printed in "-x" tracing mode. E.g.:
    PS4='+$EPOCHREALTIME ' zsh -ilx -c exit 2>/t/zpro
You need to have `zmodload zsh/datetime` loaded early for that to work, e.g. at the very top of your .zshenv.
It's not hard to do a little post-processing script to have awk do entry-to-entry deltas on that /t/zpro file and then sort those to get a "seconds cmd" kind of report. Personally, I have found this more effective at identifying hotspots than the zprof mentioned in the article.
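A quick-and-dirty version of that post-processing might look like this (it assumes every trace line starts with the `+<epoch>` prefix set above):

    # print "delta-seconds  traced-line", slowest first
    awk '{ t = $1; sub(/^[+]+/, "", t)
           if (prev != "") printf "%.4f %s\n", t - prev, prevline
           prev = t; prevline = $0 }' /t/zpro | sort -rn | head -n 20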
EDIT: Caveat - There can be a bit of a Heisenberg-style "you disturb what you measure" effect from trace overheads. For example, the `_comps = (32 kiB report)` always shows up as the most expensive thing for me, but I think that 3.3ms comes mostly from `-x` printing the thing out after the timestamp, or maybe formatting it for print. If I just do a t0=$EPOCHREALTIME before and echo $[EPOCHREALTIME-t0] after, it takes only 1.0 ms in non-traced mode. So, 2.3/3.3 = 70% of the time is just the formatting/printing/-x work.
I really enjoy zr(at). I don't think it will help with specific completions that are slow to load, but it's simple, easy to use, and can be pretty bare-bones.
I suppose this is something that Nix would excel at; just inspect the hash of kubectl (or whatever you are running completions for), and reinstall a downstream derivation that caches the completions if it changes.
Anyone know the best way to clean up your shell init scripts? Maybe rephrased a better way: what are all the files my shell sources on startup? (fish in this case.)
I set up zsh to work like my tcsh setup to give it a try and used it for a while. Yes, the init files in zsh are a bit larger, but I noticed no speed difference at all. In the article it seems they are loading all kinds of things, so in that case I would expect zsh to be slower.
BTW, I am still on tcsh only due to my "muscle memory" and because, when recalling history, the cursor is positioned at the end of the line instead of the beginning.
Besides large init files, another way to get slow start ups is large history files. I used to set mine to hundreds of thousands of entries until I realized this was slowing down start up time by a factor of several. :-)
You can use a plugin manager like https://github.com/marlonrichert/zsh-snap to cache the output of these commands. Using something like the package's version number as the cache key will ensure that it gets regenerated only once per package update, as opposed to on every shell launch.
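With zsh-snap that looks roughly like this, if I remember its interface correctly; the command string doubles as the cache key, so appending the client version forces a refresh after an upgrade:

    # cached replacement for `source <(kubectl completion zsh)`
    znap eval kubectl "kubectl completion zsh # $(kubectl version --client 2>/dev/null | head -n1)"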
This doesn't actually make your zsh start instantly, hence the reference to the "one weird trick" advertisement. It does, however, make it feel like zsh is loading faster. Or, to put it another way: your original zsh wasn't so slow to load, but you thought it was slow. Now you see that it was pretty fast all along.
Here's how it works. To make your zsh "start instantly", all you need to do is print your prompt immediately when zsh starts, before doing anything else. The top of ~/.zshrc is a good place for this. If your prompt takes a long time to initialize, print something that looks close enough. Then, while the rest of ~/.zshrc is evaluated and precmd hooks are run, all keyboard input simply gets buffered. Once initialization is complete, clear the screen and let the real prompt be printed where the "loading" prompt was before. With the real prompt in place, all buffered keyboard input is processed by zle.
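A toy version of the first step might be something like this at the very top of ~/.zshrc (real implementations, such as powerlevel10k's instant prompt, are far more careful about redrawing and edge cases):

    # print a stand-in prompt right away; -P applies prompt expansion
    print -nP -- '%~ %# '
    # ...the rest of .zshrc runs here; keys typed in the meantime just sit
    # in the terminal's input buffer until zle starts with the real prompt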
It's a bit gimmicky but it does reduce the perceived zsh startup latency by a lot. To make it more interesting, add `sleep 1` at the bottom of your .zshrc and try opening a new tab in your terminal. It's still instant!
Using zsh now, but I had a similar issue with bash-it, and it turned out the culprit was that I was running `brew --prefix` multiple times to get the Homebrew path instead of using an environment variable.
I think zsh and others (bash, ash, etc) are a problem for many reasons:
1. Poor start-up time. It's not so much a problem if you are opening a terminal, but if you're running shell scripts in a large loop, it can soon become a significant factor in your run time.
2. Too much RAM. I'm looking at my server and bash is taking 2-3MB per script. When you're running tens or maybe hundreds of little scripts, this really adds up.
3. The syntax is wrong. For example, in bash there are more ways to write a loop than I care to mention. Often I find myself wondering if I need no brackets, [], [[]], (), (()) - it really shouldn't be this hard or varied.
4. No proper types. Sometimes you simply don't know if what you have can be parsed as an array or not, or whether you have a string representation of a number.
I haven't yet thought of a better way though. For example, it does a few things right:
1. Piping is really powerful. Where possible I use '|' as it's usually the simplest to understand; when you have multiple arrows < > with numbers on the end, it can take a moment to figure out what gets passed where.
2. 'Forking' is super simple. Just throwing an '&' at the end means the command no longer blocks, super cool. I think I would like to see some form of "join"; I know this is possible (see the sketch after this list), but it would be cool if there was a super simple syntax for it too.
Maybe:
    pid=$(some command &) # & would return a PID number
    # some time later
    join $pid # How would you know if the PID wasn't re-used?
3. No compilation is great for writing scripts and maintaining portability.
4. The completion saves loads of time. Being able to type 'ls' and tab a few times is great for finding where something is or an option for a command. In the same vein, the history accessible with the arrow keys is also really cool.
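On the "join" point in 2 above: POSIX shells already cover most of it with `$!` and `wait`, and since the shell keeps its un-reaped children around, the PID can't be recycled out from under you within that shell:

    some_long_command &     # run in the background
    pid=$!                  # PID of the most recent background job
    # ... do other work ...
    wait "$pid"             # block until it finishes; returns its exit status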
One thing that could save some time during startup is to have a spare shell already spun up and waiting to be allocated. This would cause some issues with sourcing, but it could be possible to check the timestamps on the sourced files and see whether they require another source just before handing the process over. It's a little hacky though and a better solution would still be to address the performance issues directly.
> 3. The syntax is wrong. For example, in bash there are more ways to write a loop than I care to mention. Often I find myself wondering if I need no brackets, [], [[]], (), (()) - it really shouldn't be this hard or varied.
If you want even weirder syntax, just never use square brackets and write out `test` commands manually. E.g. instead of `if [ "${string1}" = "${string2}" ]; then` write `if test "${string1}" = "${string2}"; then` (quoting the expansions so empty strings don't break the test). `[` is just an alias for `test`. Combining comparisons can get a bit harder to keep track of, though, since there's no `]` and `test` allows logical operations with `-a` and similar. Better to call `test` multiple times and use the shell-native `&&` and similar, IMO.
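For instance, a small illustration of that last point:

    # combine checks with the shell's && / || instead of test's -a / -o
    if test -n "$name" && test "${count:-0}" -gt 0; then
        echo "processing $count items for $name"
    fi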
I don't think a clean script is much to ask for. Sometimes a programming language is not particularly appropriate. There's some middle ground where it's complex, but re-writing it in a programming language is a super pain.
Pro tip: add `zmodload zsh/zprof` at the beginning of your zshrc and `zprof` at the end, then open a new shell to see those pesky long-loading sources as a nicely formatted trace with timings.
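In other words (the middle is whatever your config already does):

    # ~/.zshrc
    zmodload zsh/zprof      # first line: start profiling
    # ... plugins, completions, prompt setup ...
    zprof                   # last line: print the per-function timing table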
The core problem is using too many modules, and frameworks like oh-my-zsh look good but don't give you much visibility into the internals.
When I was using MSYS2 (Cygwin-based, so with a slow fork), I ended up writing my own solution for a cute prompt and history logging to SQLite with an fzy frontend, because the existing solutions were all made for Linux and assumed, among other things, fast forks.
> the tl;dr is to add zmodload zsh/zprof at the very top of your ~/.zshrc and zprof to the very bottom, then restart the shell. On startup, you will see a table with everything impacting your shell startup time.
My wording is pretty clear; it's not a claim to have proof. My working understanding is that .rc files are common for shell init. If it turns out that .rc files slow down any shell du jour significantly, then TIL; I don't know much about the inner workings of shells and haven't had a practical reason to learn.
In any case, I’d like to apologize because I clearly offended you somehow, despite being a fellow zsh user and for that I am sorry.
Whether or not the title is misleading, my experience is that my zsh loads slowly. You are downvoting me for sharing my genuine real world experience with a program. I’m sorry I hurt your feelings with my real life.
My feelings? Your comment demonstrated a misunderstanding of the article. Also, I don't have enough karma to downvote anyone on Hacker News.
If your zsh is slow to start, I suggest commenting out your entire zshrc and then binary searching it to see what's slow (uncomment the top half only, then the bottom half, then the top half of the slow half).