> Why provide an exploitable API, while a safe version is possible and is more direct? I don’t know, but my guess is that it’s mostly just history.
Its to ensure you can take a shell oneliner and turn it into a program one liner. Ie. anything you can do in a terminal / copy from the web you can trivially do in our program! Whilst this is a footgun for application development this is a necessity for other kinds of people who write programs that are also a target market for such scripting languages.
Of course it should really be called 'runInShell', and provide an option for which shell etc.
I think the author's point is that it would still pretty easy to "sh -c" yourself if needed, and at least you would have to be aware you are doing that.
The disagreement is as to who it is easy for; the author seems to only consider programmers by trade and then blame some notion of propagating a legacy of unsafe calls.
Every language the author lists as vulnerable was intended, at launch, to be used both as a serious programming language as well as program launching glue or by non-career programmers for things that aren't production grade applications.
The only language I can think of that is explicitly an application development language is Rust, and as the author mentions this vuln is not present.
And as a note, the alternative to exec in NodeJS are keeping execFile and using these ~10 lines [1] or using spawn directly with these 216 lines [2]. I barely trust myself to reliably reproduce those.
The need for APIs to not have non-obvious side effects is especially important for non-career developers as they’re less likely to be aware of said side effects. So your argument is actually a strong reason for exec() not to fork /bin/sh
Javascript has fetch(), PHP has a curl library built into the language, python has requests, etc. The majority of shell() calls have the same exact situation.
I'm not sure I follow the question. The main difference between the original line and the "why does it not work like so" line is the original is written as an async call and your line is as a sync call but all of that has nothing to do with the question of why call an external binary for something that has a built-in.
Node’s child_process.exec() [0] passes the first argument straight into a shell, so you can use whatever shell features you want. The functionality you’re talking about is provided by child_process.spawn() [1] (as long as the shell option isn’t specified).
The exec() documentation actually provides a warning:
> Never pass unsanitized user input to this function. Any input containing shell metacharacters may be used to trigger arbitrary command execution.
Likewise, spawn() says this:
> If the shell option is enabled, do not pass unsanitized user input to this function. Any input containing shell metacharacters may be used to trigger arbitrary command execution.
I do think the naming could be better - why isn’t exec() called shell()?
As many wrote, this is a well known problem. Running any commands on external user input is prone to similar attacks. When doing it, one should strongly consider going through an API that will prevent such command execution. And I would go for a standard/well known library for the language used instead of a homegrown one.
That said, I often use such code myself for simple tasks that I run on a local machine. The simplicity of calling a shell, internal pipes and all of its options, is great. If and when I need to polish it or especially expose this to other users I can rewrite it properly. My 2c.
I also wonder if the existence of these APIs has something to do with DOS and Windows, where the command line is actually a single string and quoting is a nightmare.
Good post! It's an interesting role reversal. Usually Unix is all about passing raw strings around whereas Windows is much more comfortable with passing structured data (see also COM).
I would warn that "CommandLineToArgvW" does not necessarily do what the C runtime does. They are different implementations.
Also your main problem in that post seems to have been in using `cmd.exe` and DOS utilities. The command prompt is all about mimicking DOS and not breaking .bat files from 80s and 90s. It's essentially stuck in stasis. Powershell is the actively developed shell (now on version 7).
It might have been worthwhile to include a discussion on how to safely escape the shell strings, in case there is no API that receives a list of arguments.
On Unix, it is not very intuitive because you cannot use backslash inside a single quoted string. To quote "o'neil" we have to write
> On Unix, it is not very intuitive because you cannot use backslash inside a single quoted string. To quote "o'neil" we have to write
'o'\''neil'
You can always do the weird one, which I find easier to remember (but is likely so for no one else), of dropping to double quotes for the single quote:
The article: "And this is what I am complaining about — that the API looks like it is safe for an untrusted user input, while it is not"
It is a bit sad to read that.
Spawning applications from your application is dangerous, so you need to be careful, and at least read the documentation.
This shows that a lot of developers nowadays just copy-paste code from StackOverflow, don't read the documentation and don't know the real low-level calls in a soup of unorganised abstractions and libraries.
This behaviour is documented, is actually intended, and is all explained:
There is even a warning:
"Never pass unsanitized user input to this function. Any input containing shell metacharacters may be used to trigger arbitrary command execution."
few lines later:
"do not pass unsanitized user input to this function. Any input containing shell metacharacters may be used to trigger arbitrary command execution."
Please read the manual before you want to operate a dangerous machine.
To avoid confusion for future readers: the second part of the post, the part which contains the quote, talks about VS Code `ShellExedution` API. This API is unrelated to node’s `child_process` module. The absence of clear docs on `ShellExecution` is one drawback of the API which is called out.
Separately from the factual response, I want to explicitly say that I don’t endorse such style of communication. Please do not blindly assume, and then pick on, on the lack of knowledge of other people (be it author of the post or abstract lot of developers).
Even when you actually find the lack of knowledge (rather than assuming it), do not scold people for it. It’s OK to not know things, we all learned them for the first time once.
My comment was more about the situation where a newbie knows that he is handling dangerous operations but still copy-paste from StackOverflow without even thinking or reading.
You went one-step ahead and wrote a summary so it wasn't about your situation.
The only point where I fundamentally disagree is about ShellExecution.
When you send a string to the shell for execution, it seems reasonable to expect >, semi-colons, pipes, and other shell operators to be interpreted by the shell unless you escape them.
> When you send a string to the shell for execution, it seems reasonable to expect >, semi-colons, pipes, and other shell operators to be interpreted by the shell unless you escape them.
Fully agree here. But the problem is the second overload, which can take an array of strings (so looks like a safe API), but concatenates them without escaping.
Ah, ah. I wanted to say no, but after reading more and more the method they offer is actually much more evil than it seems.
In the ShellExecution function, if you actually comes from C programming you can get confused:
Could the parameter "string | ShellQuotedString" mean:
"A string which has the binary flag ShellQuotedString on"
= "A shell quoted/escaped string" +/- "a string of sub-type ShellQuotedString"
which would imply that the developer has to take this responsibility.
or it means "string || ShellQuotedString", as in param is a "string OR ShellQuotedString"
which in this case the behaviour is unknown.
(well, now I know it's the second one :| ) but it's fun to read
If people read the manual before operating a dangerous machine we wouldn't need terms like "misuse resistance". Documentation is not a substitute for safe API design.
The eventual point of the article's shaggy dog story is even "misuse resistance resistance" in VSCode. There's an API provided which looks exactly like the API you'd provide that resists misuse but in fact it's designed in a way that enables misuse instead. Why?
In this particular case they know you might need to pass a bunch of completely unrelated strings, and they know their underlying primitive can't really do that, and so they... provide code that just concatenates all your strings, as if you're a programmer who doesn't know how to concatenate strings and needs help with that.
The best excuse I can imagine is somebody at VSCode assigned a junior the task of writing the trivial "escape everything properly and concatenate it" function. Except, that function isn't trivial, and the junior lacked confidence to come back with the answer "This isn't trivial after all, I need a grown-up" so they hacked up a solution that doesn't solve anything, checked it in and breathed a sigh of relief when it passed review.
If you're escaping shell tokens to reduce your arguments to a single executable command then you might as well not bother with forking to a shell in the first place and instead fork that executable directly since you're removing the one benefit forking to a shell has (effectively embedding a DSL). You'd get some small performance improvements in doing so too.
That doesn’t really solve the problem. Those shell tokens are still getting parsed and interpreted as special tokens only you’re now doing it in your host application rather than the Borne shell. And as you’ve pointed out, reimplementing shell syntax is non-trivial and highly error prone.
Plus if you’re going to those lengths then you’re better off embedding a Lua/Python/Perl/whatever interpreter in your application.
> It's a common vulnerability that can happen in all languages.
While that's true, some language's standard libraries don't expose this vulnerability so the developer has to explicitly write bad code. And that's what really matters.
Take node.js for example: it's not going to be obvious that `exec(command)` is actually equivalent to `sh -c "$command"` and thus a lot of people will get caught short if they have unescaped shell script tokens (eg $, <, >, ;, etc) in their command string. And that's very easy to happen.
Personally I think it was a boneheaded decision when most of the languages adopted that approach of forking to sh. If developers wanted to fork to a shell then they could still explicitly write the code to do so (like the `spawn("/bin/sh", "-c", cmd)` example in the article). But by having exec fork sh by default you're creating a risk for all sorts of unexpected behaviour from developers who don't know their language's standard library inside out (which, lets be honest, is going to be most developers given the point of standard libraries is to abstract away that complexity so developers don't need to think about it).
This isn't just a web scale problem either. In fact the example given in that article is a perfect one because its a CI/CD pipeline running trusted input that still failed unexpectedly.
Cannot agree more. To me it seems the original program in the article should be correct, given the `exec` from node would indeed perform a simple exec, instead of actually being an `exec_shell` function.
> It's a common vulnerability that can happen in all languages.
From the article:
> I would have written this in Rust, but, alas, it’s not vulnerable to this particular attack :)
But of course, strictly speaking, this isn't the case, as you can always call sh -c and rust doest prevent you from doing that (though makes it harder).
There is nothing fundamental about Rust that prevents command injection vulnerabilities. (What Rust has going for it in this specific case is that the standard library AFAIK doesn't have an easy way of doing it accidentally laying around - if you want to call the shell, you need to do it explicitly)
But that's exactly the point. I'm often the first to tell people that Rust doesn't automatically guarantee safer code however in this instance Rust (and others who don't call `sh` by default) are safer because the developers have to explicitly write code to fork a shell. And if they're explicitly doing that then you'd fucking hope they'd understand the risks of doing so (and at least it would stand out more in a code peer review too).
Whereas standard libraries that silently fork sh are less safer because they introduce the shell injection vulnerability by default thus requiring developers to consciously be aware of the need to sanitise their inputs.
As I've mentioned elsewhere, I personally think it was a boneheaded decision to ever make exec() functions call a shell. Maybe I'll forgive Perl because that has it's origins as being an extension to shell scripting. But by the time node.js was released people really should have known better.
That's clearly not what they are saying, given that the C standard library contains (only contains, the safer ones are platform-dependent) the vulnerable form.
https://bonedaddy.net/pabs3/log/2014/02/17/pid-preservation-...
Another issue to worry about when executing other processes is option injection:
https://www.defensecode.com/public/DefenseCode_Unix_WildCard...