
Shelling Out Sucks (2012) - Immortalin
http://julialang.org/blog/2012/03/shelling-out-sucks/
======
billyhoffman
> Instead, code that shells out with programmatically constructed commands is
> typically riddled with potential bugs in the best case and massive security
> holes in the worst case.

This sentence sounds eerily similar to what we have with SQL Injection. Same
concept: constructing raw commands for untrusted user input. The solution is
equally simple: input validation (where possible), parameterized
queries/shellescape.

The barriers to people shelling out securely are the same we had/have with SQL
Injection:

1- Education: Many people are unaware of these solutions.

2- Bad habits reenforced by bad code examples. People don't use these security
best practices in the simplified code examples included in a blog post,
documentation, or a Stack Overflow answer. I joke that the only SQL code
example on MSDN that isn't SQL Injectable was the example for parameterized
queries (I'm not that far off). Sadly It's not much better for
escaping/defanging shell code.

Shelling out also had the added problem that far fewer people do it than use
SQL, so the propagation of security best practices is going to be slower.
Luckily it also means there are fewer places that screw it up.

~~~
StefanKarpinski
There is a common denominator between shelling out and making SQL queries:
using strings to construct executable code. Metaprogramming with strings and
eval, e.g. in Perl/Python/Ruby, is another place where this is done – and is
fraught with similar issues. Lisp has metaprogramming, yet doesn't have this
problem. Why? Because you don't do unstructured metaprogramming with strings
in Lisp. Instead you do _structured_ metaprogramming using lists and symbols.
Julia also has structured metaprogramming [1]. For running external commands,
Julia gives you a structured metaprogramming solution: the `cmd` form is
really a list construct in disguise – one that understands shell syntax and is
therefore easy and intuitive to use. The same approach can and probably should
be used to make SQL query construction both convenient and safe.

[1]
[http://julia.readthedocs.org/en/latest/manual/metaprogrammin...](http://julia.readthedocs.org/en/latest/manual/metaprogramming/)

------
mappu
_> Although shell escaping does work (assuming that there aren't any mistakes
in the shell escaping implementation), realistically, no one actually bothers
— it's too much trouble_

I bother. It's not hard. If you're not escaping outputs, you're part of the
problem.

I made substantially the same comment when this article was originally posted
(
[https://news.ycombinator.com/item?id=3689561](https://news.ycombinator.com/item?id=3689561)
)

~~~
StefanKarpinski
The point of this post – and the design of how one calls external programs in
Julia – is that it makes "being part of the solution" the default, easiest
thing to do. It's great that you bother (I do too), but many won't unless you
make the virtuous option also the easiest option.

------
pixelbeat
There are large advantages to shelling out though.

1\. Not reinventing the wheel

2\. Implicit use of multiple cores

Things could be improved I agree.

For example posix_spawn() could be efficiently implemented on glibc to avoid
kernel overhead for fork()+exec() from large processes. Then language runtimes
could use that to implement their "shell out" routines.

Also pipefail doesn't cater for SIGPIPE as detailed at
[http://www.pixelbeat.org/programming/sigpipe_handling.html](http://www.pixelbeat.org/programming/sigpipe_handling.html)
which can be awkward.

~~~
StefanKarpinski
These are advantages of calling external programs, not of shelling out – which
specifically entails using a _shell_ as an intermediary to call an external
program. All of the pitfalls in this blog post are due to using the shell, not
to calling other programs. If you avoid the shell, then you avoid these
issues.

------
keypusher
Should be titled "Shelling Out Sucks In Some Languages". Python gets this
right with the subprocess module in stdlib, as the author points out at the
very end.

~~~
galonk
As the author also points out, subprocess DOESN'T shell out. It starts the
external process directly.

------
eloisius
I don't disagree that shelling out sucks, but the entire silent failure
scenario in their example is the result of using the explicit to_i conversion
method. Using to_i on an inconsistent data source is always prone to failure
because of how it defaults to 0.

    
    
        Integer(`find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`)
    

is unlikely to fail because of bad input.

------
frou_dh
Nice article. I've noticed people often call the act of running a child
process "shelling out" even when there's no shell involved. It's good to know
what fork & exec (& dup?) are and to grok the non-magicalness of shells (you
could conceivably write your own shell just as privileged as bash - which is a
normal userspace program after all).

------
jrochkind1
While I agree that shelling out is inherently problematic, all of these
examples make me think ruby really ought to provide a shell-out wrapper that
provides many of the partial solutions that "nobody does" for you. If not in
stdlib, we could write easily a gem (that nobody would use, heheh).

    
    
         result = shell_out('find ? -type f -print0  | xargs -0 grep foo | wc -l', dir_path)
    

Would automatically `Shellwords.shellescape` interpolated variables, same as
ActiveRecord does (hash params could be supported too). Would by default
automatically `set -o pipefail` for you too, and perhaps even by default raise
for a non-succesful outcome. All of these defaults could hypothetically be
changed by option arguments.

In fact, I wonder if there already is a gem that does this, that nobody uses?
:) If not, it makes me want to write one just for fun, link back to OP as
explanation of why you want this. That still nobody would use except me, heh.

Shelling out is a hacky solution nonetheless, with a terrible performance
profile, but there are reasons people use it anyway, and a wrapper like this
could take care of many common gotchas.

~~~
e911
I think you can already do this without escaping strings? execve("/bin/bash",
"-c", "command $1 $2", "-", "parameter1", "parameter2");

------
mzs
It's also wrong irregardless of the shell escaping. That will not always count
the number of lines in all the files under a directory that contain at least
one "foo" \- when shelling-out, you have to be sure to use the tools
correctly:

    
    
      $ mkdir sh
      $ cd sh
      $ >foo find . -type f -print0
      $ >bar head -1 foo foo
      $ # four lines with "foo" in file bar
      $ >foo echo foo
      $ mv foo 'bar
      > baz'
      $ # one line with "foo" in file bar?baz
      $ find . -type f -print0 | xargs -0 grep foo | wc -l
             3
      $ # Wait a minute that's not five, I KNOW there five lines!
      $ find . -type f -print0 | xargs -0 grep foo
      Binary file ./bar matches
      ./bar
      baz:foo
      $ # oh...
      $ find . -type f -exec grep -c foo '{}' \;
      4
      1
      $ # might as well get everything closer to correct
      $ env -i /usr/bin/find . -type f -execdir /usr/bin/grep -c foo -- '{}' \; | \
        env -i /usr/bin/awk '{t+=$1}END{print t}'
      5

------
ejcx
One awesome thing I found recently was the golang exec package.

I do a lot of security things and I attempted to write vulnerable code that
shells out to do some purpose with some user provided data concatenation in.
I.... Couldn't figure out how. If you figure out how, let me know. More
languages should be doing this parametrized approach for shelling out like Go
does

~~~
jacobparker
[http://linux.die.net/man/3/exec](http://linux.die.net/man/3/exec)

Not a Go thing. "Shelling-out" is different.

------
joelanders
Is there a reason "fork over" and "shell out" can both (roughly) mean "fork
and exec a shell" _or_ the seemingly unrelated phrase "pay?"

~~~
cooper12
Well "fork over" comes from the system call fork() which creates a new
process. "Shell out" as explained by the author means to run an external shell
command, which has been implemented in some languages by forking a shell, and
then spawning the process. fork() is known as so because it results in a
hierarchy where there is a parent process and a child process, and so forth.
Shell just seems to be a general term for an interface used to operate an OS.
[0] So no, just a coincidence.

[0]
[https://en.wikipedia.org/wiki/Shell_(computing)](https://en.wikipedia.org/wiki/Shell_\(computing\))

------
danidiaz
Additional problems: running into deadlocks because a standard stream is not
being read, ensuring cancellation of the external program when the parent is
cancelled.

------
gwu78
Why not use file descriptor passing instead?

------
macoda
just write everything in bash..

