Lessons learned from writing ShellCheck (vidarholen.net)
307 points by r4um on Feb 9, 2020 | 46 comments

> discovering late in the interview process that Apple has a blanket ban on all programming related hobby projects

O_o There must be more to it than that. What the hell does anyone do in their spare time at home for fun then?

From personal experience, that's a proper reading of the rule. You can do things for "personal education" but you can't release anything that's code (open or closed source, doesn't matter, including things already released...you can't maintain them) or code-related (e.g., blog posts, books, etc.) without a ridiculous number of approvals (some from people at like the VP or SVP level).

How do they stop you from just ignoring their rules and doing it anyway?

Author here. I've heard from current and former employees that they believe that Apple doesn't actively police it. However, telling people not to makes it easier to shut down later if they want.

Thanks for writing such a fantastic tool. It's recently become for me a must-have when writing shell in any codebase.

One thing I've really appreciated is the GitHub wiki pages for individual ShellCheck errors. Without those pages I wouldn't have learnt so much more about shell, and the ShellCheck tool would be more difficult for people to use :)

It is possible to get an exception approved by SVPs -- legal/general counsel merely advises management after all. But ShellCheck's GPLv3 license is a virtual dealbreaker.

Really, really expensive lawyers I would guess. I look to Apple's behaviour towards the Right to Repair movement for proof of past behaviour/culture as an organisation.

They can terminate your employment if caught?

Yup, exactly. I can’t say I heard of that happening but it was pretty understood that there would definitely be severe consequences.

I can speak from personal experience that it's happened, though it's unclear if the violation of this rule was the real reason for the termination or just the scapegoat.

Hello, author here. Yes, it was very surprising. The rationale given to me was this:

You can't compete with Apple, which is fair. However, you're also not privy to what Apple is or isn't working on at any given time.

If Apple gave you permission to work on something, you could infer that they're not working on it, so it's not allowed.

Wow, are they so insecure about quality of their software, that they squash any possible and impossible chance of competition so paranoidly?

I'm unclear how you arrived at this being about their quality.

The statement reads simply as: apple works on many things, and doesn't want you working on things outside of apple that may compete with current r&d or future r&d.

We can disagree with them on if it seems appropriate, but it doesn't seem to be about their software quality at all.

> apple works on many things, and doesn't want you working on things outside of apple that may compete with current r&d or future r&d.

And yet Apple is totally fine with benefiting from the open source work of others...

> And yet Apple is totally fine with benefiting from the open source work of others...

And? Every other person benefits the same from the code contributed under the licenses in play. I fail to see the issue. With LLVM, for example, they seem to be upstreaming a lot of their Xcode backend stuff as they get time. So the claim by many copyleft proponents that only copyleft encourages upstreaming rings hollow to me.

Hell, go talk to people from Red Hat about custom gcc forks that target chips that aren't upstreamed to gcc. Just because you're using gcc and modifying it doesn't mean you'll actually be contributing the code if it's all internal.

To answer you, the key part of your reply is: "...that may compete with current r&d or future r&d".

It seems to me that they are afraid (or at least unwilling) to compete on even grounds, despite likely having more budget than a single random employee competing in their free time.

Or their lawyers are extremely good at getting employment contracts drawn up. Never attribute to malice that which can be explained by paranoid lawyers who got the C-levels' ears.

Who would sign something like that in the current market?

For enough TC I would contractually agree to stop touching a computer between the hours of 6pm and 8am.

More realistically... most engineers don't code as a hobby. HN, /r/programming, and other software dev hangouts are small echo chambers.

> More realistically... most engineers don't code as a hobby. HN, /r/programming, and other software dev hangouts are small echo chambers.

This sounds unnecessarily dismissive to me and also unsubstantiated. It removes the focus from the more important fact that this is a foul encroachment of personal freedom regardless of how many professional coders like coding as a hobby.

> This sounds unnecessarily dismissive to me and also unsubstantiated.

Anecdotally, the programmers I work with don't seem to read anything outside of StackOverflow or contribute to any OSS projects. Albeit, what OSS projects we contribute to or what communities we participate in isn't usually a topic of conversation among my coworkers.

> It removes the focus from the more important fact that this is a foul encroachment

I agree that this is an encroachment on personal freedom and I doubt this clause is legally sound (in California at least). I am quite surprised that Apple has such a policy. But I am even more surprised that they are able to find programmers who agree to it.

Someone who wants to work at Apple despite that.

The vast majority of Software Engineers I’ve come across don’t write OSS code, don’t write blogs, maybe read Hacker News. For most of them this is sort of a non issue.

I would not say non-issue.

It might be a non-issue as in "I probably wouldn't have exercised that right anyway", but that doesn't mean that one feels the restriction is acceptable, fair, or justified.

I would not. Regardless of whether I had any intention to do it or not.

I suspect that that would be unenforceable in the EU, UK, and Scandinavia.

Is there a lawyer with experience in the field reading this?

Not a lawyer, but I have previously contracted one to evaluate and discuss something closely related, which are the Swedish laws about non-compete clauses in employment or contractor contracts.

In summary: they are allowed to some extent, but if a court considers a non-competition paragraph too broad, the paragraph becomes void in its entirety, as if it had never been entered into the contract at all.

For non-management employees the exclusivity must be quite specific, and unless the law has changed recently without my knowledge, an entire industry or a broadly specified skill would fail the test in any case I can think of.

Applying the most basic levels of those principles, I would assume that attempting to limit a programmer's ability to engage in programming in general would lead to nothing but a complete dismissal of any claims related to the non-compete clause if someone brought it to court.

Effectively, if you try to prevent someone from applying their trades and/or talents too broadly, they could essentially moonlight for your worst competition without much of a risk, as long as they don't convey what are clearly trade secrets, since those are protected differently.

Suffice to say that companies with decent legal departments tend to write specific non competition clauses especially outside of the realm of management positions, as they get no value at all from too broad non competition clauses.

Of course it's enforceable. If you gain certain areas of knowledge working at a company where IP is restricted, then that restricted IP may be reflected in your personal code. This kind of practice is pretty common on a lesser scale, i.e. if you work on say a libc at work, then arguably you couldn't work on a libc or something very similar at home without risking a breach of IP. It just so happens that in Apple's case, they have fingers in such a large number of pies that it must take a great deal of work to verify that something /doesn't/ infringe on their IP.

Of course, they can still sue you for a long time.

I want to thank the author for an amazing tool.

I've used ShellCheck to catch/fix numerous bugs on some mission critical financial systems (that depend on shell scripts!).

Added it to pipelines and as a mandatory pre-commit hook in git, so all teams started using it (whether they like it or not).

It even caught some places where a rm -rf / could happen because default parameter substitutions were not set ( Same as the Steam bug where it would remove your home directory, but on some big financial systems :) )

All in all - an amazing tool that saved me from a lot of grey hairs throughout the years.

ShellCheck's unique error/warning codes, and the wiki descriptions for every code with explanations and examples, are a work of genius.

It seems so "obvious" in hindsight but not many other tools are so good.

Also, tangentially, this is also so perfect:


and, unless somebody proves otherwise, makes me believe that with ShellCheck and using the described conventions a shell script could be better diagnosed for mistyped variables than a Python script can (namely at "compile time" and not during the run).

Vidar, your work significantly improved the life of people confronted with shell scripts. Thanks!

Hmm, this can be done with Python code too. And there are already tools doing so.

> this can be done with Python code too

Can it? That's what I would like to know, how to know without executing the code with the argument 4 that there's an error in this Python code:

  import sys;

  def f( x ):
    if x == 4:
      x += q
    print( x )

  f( int( sys.argv[1] ) )

That will "work" until 4 is passed at runtime:

    $ ex-undefined.py 4
    Traceback (most recent call last):
      File "/tmp/ex-undefined.py", line 8, in <module>
        f( int( sys.argv[1] ) )
      File "/tmp/ex-undefined.py", line 5, in f
        x += q
    NameError: name 'q' is not defined

With this shell code

  #!/bin/sh
  f() {
    x=$1
    if [ "$x" = 4 ] ; then
      x=$(( $x + $q ))
    fi
    echo $x
  }
  f "$1"

and shellcheck I get:

   Line 5:
             ^-- SC2154: q is referenced but not assigned.

even if I have never executed the code.

The standard Python linters can do it:

    $ flake8 foo.py
    foo.py:6:14: F821 undefined name 'q'

    $ mypy --check-untyped-defs foo.py
    foo.py:6: error: Name 'q' is not defined

(They also have some other complaints about your code that you'd either have to disable or adjust to.)

My note: invoking the first on the given example produces 17 lines of different complaints, most totally irrelevant to the validity of the code (of course it complains about the "style" -- it was written as a "Tool For Style Guide Enforcement"). Invoking the second without the magical switch --check-untyped-defs produces:

    Success: no issues found in 1 source file

and additionally produced a .mypy_cache folder of 2 MB at the place where the script was.

So the style of "unreasonable" defaults of Python alone (not detecting the error, speaking from the point of view of a user of Perl) propagates to the "unreasonable" defaults of the checkers.

Still, thanks Ded7xSEoPKYNsDd, I really wasn't aware of these! Yet, even if I haven't formally specified that, I was looking for a way to do it with Python itself and its default interpreter alone, since for Perl nothing additional has to be installed:

    use 5.010;
    use strict;

    sub f {
        my $x = shift;
        $x += $q if ( $x == 4 );
        say $x;
    }

    f( $ARGV[ 0 ] );

Gets me:

  Global symbol "$q" requires explicit package name (did you forget to declare "my $q"?) at ex-undefined.pl line 6.
  Execution of ex-undefined.pl aborted due to compilation errors.

I am aware that for shell I'd need an additional shellcheck, but Python is many, many times bigger than the shell binary alone (or even the sum of the shell and shellcheck binaries) and is actively changed, whereas the shell semantics are standardized and effectively frozen in time, and a very low startup overhead is expected from the shell interpreter alone.
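
Not quite "the interpreter alone", but a rough version of this check seems possible with nothing beyond the standard library. What follows is only a minimal sketch, assuming Python 3.9 or later; `undefined_names` and `EXAMPLE` are made-up names for illustration, and the check is far cruder than flake8 or Perl's `use strict`: any name bound anywhere in the file counts as defined, so scoping mistakes, use-before-assignment, and star imports are not handled.

```python
import ast
import builtins


def undefined_names(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, name) pairs for names that are read but never bound.

    Hypothetical helper: it parses the source without executing it and treats
    every binding in the file as visible everywhere.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))  # print, int, NameError, ...

    # Pass 1: collect every name the file binds in any scope.
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, (ast.Store, ast.Del)):
            bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as c" binds "c".
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)

    # Pass 2: report every name that is loaded but was never bound.
    return sorted(
        (node.lineno, node.id)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name)
        and isinstance(node.ctx, ast.Load)
        and node.id not in bound
    )


# The example from this thread, as a string so that nothing gets executed.
EXAMPLE = """\
import sys

def f(x):
    if x == 4:
        x += q
    print(x)

f(int(sys.argv[1]))
"""

if __name__ == "__main__":
    for line, name in undefined_names(EXAMPLE):
        print(f"line {line}: undefined name {name!r}")
```

Run as a script, this prints `line 5: undefined name 'q'` for the example without ever calling `f`. It still has to be invoked explicitly, so it does not change the point about the interpreter's defaults.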

> Converting them to a cleaner ReaderT led to a 10% total run time regression, so I had to revert it. It makes me wonder about the speed penalty of code I designed better to begin with.

I have heard from other Haskellers that you can sometimes get good performance by hand-rolling an application monad, and then writing out the MonadFoo instances so you get good ergonomics:

    -- Instead of this:
    newtype App a = App { runApp :: ReaderT AppEnv (ExceptT AppError IO) a }
      deriving (Functor, Applicative, Monad, MonadIO, MonadReader AppEnv, MonadError AppError)
    -- Try this:
    newtype App a = App { runApp :: AppEnv -> IO (Either AppError a) } deriving Functor
    instance Applicative App where ...
    instance Monad App where ...
    instance MonadIO App where ...
    instance MonadReader AppEnv App where ...
    instance MonadError AppError App where ...

Your two datatypes are representationally equal, so I would be very surprised if hand-rolling the datatype made any difference. Hand-writing the instances, however, might make more sense.

Indeed. The folklore I heard is that you can get performance gains because with the handrolled version, GHC is able to inline the instance dictionaries.

I had commented in a previous discussion regarding Haskell that a common criticism of Haskell is that its lazy evaluation can lead to problems reasoning about runtime performance, a criticism which is repeated here. I wonder if there are any resources, whether language-specific tooling or general theory, that could help developers struggling with this. I suppose that flame graphs would be a useful tool to see where your time is being spent.

The article does not blame lazy evaluation for the problem. The author just could not figure out where the leaks came from.

I believe I know where some of the leaks may be coming from. I’m one of the authors of hadolint, the Dockerfile linter. We used ShellCheck as a library to lint the bash code found inside the files.

When we attempted to compile hadolint to JavaScript to embed it on a web page, we discovered that the code in ShellCheck was not optimal. The culprit for our use case was the overuse of regular expressions for checking equality of bash snippets.

In Haskell, most regexp libraries interface with external C libraries and have pinned memory objects for performance, but one must be careful not to accumulate too many of them.

The solution for us to have a usable JavaScript version was to reimplement the equality operator for the cases we needed, and memory usage went back down.

We haven’t stumbled upon our first space leak yet.

Over the years I've collected various links on topics related to Haskell performance. I'll spare you the whole list, but here are a few that may be of some use.

Haskell Performance Patterns - Johan Tibell (2012) http://johantibell.com/files/haskell-performance-patterns.ht...

Performance (Haskell Wiki) https://wiki.haskell.org/Performance

Detecting Space Leaks - Neil Mitchell (2015) http://neilmitchell.blogspot.com/2015/09/detecting-space-lea...

The Haskell performance checklist - Chris Done, et al. (2017 - 2019) https://github.com/haskell-perf/checklist/blob/master/README...

ThreadScope (Haskell Wiki) https://wiki.haskell.org/ThreadScope

Top tips and tools for optimising Haskell - Will Sewell (2015) https://making.pusher.com/top-tips-and-tools-for-optimising-...

I’m in a Haskell learning phase and honestly can’t get enough recommended study pointers. Share as many as you feel comfortable sharing, there’s always some like me who’d benefit from the curation.

Not the OC and (mostly) not related to Haskell performance, but the following could be useful to you:

State of Haskell Ecosystem: https://github.com/Gabriel439/post-rfc/blob/master/sotu.md#c...

Guide to Haskell (2018): https://lexi-lambda.github.io/blog/2018/02/10/an-opinionated...

Haskell learner milestones: https://stackoverflow.com/a/1016986

GHC optimizations (rather advanced, but interesting): https://stackoverflow.com/questions/12653787/what-optimizati...

Compilation of various Haskell tutorials on more advanced topics: https://markkarpov.com/learn-haskell.html

Despite the popular saying, there’s a wealth of knowledge about Haskell beyond just academic papers, and in digestible form, but it’s scattered around little-known blogs. I’d recommend checking out r/haskell from time to time, or subscribing to the Haskell newsletter. At least for me, those are the places where I got to know most Haskell blogs.

The Haskell runtime itself helps out here! Compiling the program in “profiling” mode supplies a large number of runtime switches so that you can profile performance, space usage over time, and so on. It’s not perfect, but it’s much more than many languages give you built-in.


It also gives you the ability to “tag” sections of code for profiling, so if you suspect that the internals of some named function are giving you grief, you simply add a tag (some pragma comment thing I think) and get profiling information for it. It’s much quicker than having to refactor just to profile sections of code.

Simon Marlow's book is also a good source of information:

"Parallel and Concurrent Programming in Haskell"


He also has some interesting articles like


If you can, get Don Stewart to help you.

> I’d also put serious consideration into how well the language runs on a JSVM

What's a JSVM? I'm not getting anything from Google that I think makes sense in the context of the post.

JavaScript virtual machine, you know, the thing that's the compile target for ClojureScript, TypeScript, ES5, asm.js etc.
