The author missed that sshd will always execute the user's shell and pass it the whole command string as the argument to its `-c` option. This means that the given command string will always be parsed by the remote shell. This is required to restrict users to certain commands with special shells like scponly or rbash.
When you keep in mind that the given command string will be parsed twice, first by your local shell and then again by the remote shell, it becomes clear why running a remote ssh command behaves like this.
Yep! God though, this hits me in the face so often. Trying to add `sh -c` to fix it is a trap, because obviously, you just create yet another layer of escaping.
It really becomes one hell of a puzzle sometimes, especially when you have to nest yet another layer of escaping. It feels like you're trying to write a quine.
This works:
ssh host -- ls "folder\ name"
This also works:
ssh host -- ls \"folder name\"
This works:
ssh host -- sh -c \"ls \\\"folder name\\\"\"
OK, so clearly, just throwing more escaping at it fixes it. But even if you figure that out, the real mental gymnastics would be figuring out which of the three shells interpreting your command line in the last case would handle shell expansion.
In this case, it's the local host:
ssh host -- sh -c \"ls \\\"folder nam\\\"*\"
In this case it's the remote:
ssh host -- sh -c "\"ls \\\"folder nam\\\"*\""
Of course where you put the quotes makes no difference. All it does is prevent your shell from processing it. So this works just as well:
ssh host -- "sh -c \"ls \\\"folder nam\\\"*\""
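Peeling that last one apart, layer by layer (as I count it):

# what you type:                     ssh host -- "sh -c \"ls \\\"folder nam\\\"*\""
# what ssh sends to the server:      sh -c "ls \"folder nam\"*"
# what the login shell hands to sh:  ls "folder nam"*
# sh expands the glob itself and finally runs something like:  ls 'folder name'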
If you sit and think each layer through, it usually isn't completely impossible to understand, but the odds that you are going to get something wrong the first time are astonishingly high.
It does make me wonder why ssh handles it the way it does, though. Because with the way SSH handles it, it may as well just automatically escape the spaces. Right now, not putting an SSH command in quotes doesn't make much sense unless you for some reason want local shell expansion for something.
If you can invest a little bit of time in configuration of the remote host, then my "special shell" (also mentioned elsewhere on this page) may be of use, and you will no longer need to hit yourself in the face so often!
neat. one little pedantic note: argsh requires all arguments to be valid Unicode strings, while I think at least Linux allows argv to be arbitrary sequences of non-zero bytes. But then I really hope there are not many cases where that is a relevant consideration.
If I can't fix it in like 3 tries, I switch to having my local shell script keep the script to run on the remote end in a HEREDOC, then just scp it over and ssh exec it. Inside the HEREDOC, it's a sane environment.
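Roughly this shape, with made-up names and paths (the quoted EOF is what keeps the local shell's hands off the body):

cat > /tmp/remote-task.sh <<'EOF'
set -eu
cd /var/data
ls "folder name"    # no extra escaping layer needed in here
EOF
scp /tmp/remote-task.sh host:/tmp/remote-task.sh
ssh host 'sh /tmp/remote-task.sh'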
Copying over the commands to run has another added benefit: you get quite good documentation of what was run and when. It also lets you easily run a failing thing again (in case output gets mangled in the executing script).
Single quotes are the shell’s ultimate bulk “no touchy” escape, so if you don’t need them in the inner command, it seems easier to use them for everything. (Also when passing programs to sed, awk, jq, xmlstarlet, etc.)
Yes, that’s the intent: using single quotes at both levels will prevent either your shell or the remote shell from expanding the name, so you can interact with a file name containing a literal $.
I believe that also works, but I don't do it much because I often wind up wanting to be able to use variables from the host, especially e.g. in cron jobs and scripts.
At that point, though, I’d try to write a general shell escaping function, because I don’t trust myself to figure out which host things are OK to include unescaped in such a situation. (Here’s when I start to long for Tcl, despite all the times I’ve had to spell out [string index ...].)
The optimal solution would be to have a separate side channel for passing things to the quoted program, like -v in awk or --arg in jq (or whatever it is in your favourite SQL DBMS binding) but I don’t think SSH will let you do that.
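For instance, awk's -v and jq's --arg both deliver the value through argv instead of splicing it into the program text (the variable names here are my own):

awk -v name="$untrusted" 'BEGIN { print name }'
jq --arg name "$untrusted" '.[$name]' data.json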
This is more or less baked into the SSH protocol because the exec request only has a single string for the command line.
Also the reason echoing a MOTD from rc files or similar crap breaks tools like rsync or scp which use SSH as a neutral transport. SFTP isn’t affected because, while using the same three-pipe setup as a transport, it’s a separate subsystem with its own SSH request type to initiate the channel (just like X11 or port forwarding).
If you control client and server you can define your own subsystems and invoke them directly, which avoids this whole mess.
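A minimal sketch, with a made-up subsystem name:

# sshd_config on the server:
Subsystem mytool /usr/local/bin/mytool-server

# client side: invoke it by name; no remote shell ever parses a command string
ssh -s host mytool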
I think the confusion comes from the documentation, where ssh(1) says that the command "is executed on the remote host instead of a login shell". Which is true from the perspective of ssh(1), in the sense of the protocol: the client has no control over what the server does with that string.
However, sshd(8) clearly documents that it will always execute the login shell, even when a command has been passed. ("All commands are run under the user's login shell as specified in the system password database.")
Subsystems open secondary channels to communicate separately from stdin/stdout of the remote command. I used the X11 forwarding before to run a remote command with sudo without getting the password prompt interfering with the protocol: https://raimue.blog/2016/09/25/how-to-run-rsync-on-remote-ho...
An extra subtlety is that it is the user’s login shell (in the getpwnam sense) but the command is not run in a login shell (in the sh - / sh -l sense). That’s why proper motds or profile files work ok (requesting a shell runs the user’s login shell as a login shell) and don’t break applications. Also the reason why $PATH often differs between the two modes.
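Easy to observe for yourself, assuming a POSIX-ish login shell on the remote end; the second line forces a login shell by hand, and has to escape the $ so the outer remote shell doesn't expand it first:

ssh host 'echo $PATH'             # exec request: your login shell runs it via -c, but not as a login shell
ssh host 'sh -lc "echo \$PATH"'   # sh -l reads the profile, so the login-shell PATH shows up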
> This is required to restrict users to certain commands with special shells like scponly or rbash.
I don't think this is some specific design goal of OpenSSH, I think it's just a side effect of how shell escaping works.
> When you keep in mind that the given command string will be parsed twice, first by your local shell and then again by the remote shell, it becomes clear why running a remote ssh command behaves like this.
I get that this behavior may be surprising to new users, but anybody working with ssh regularly will encounter these kinds of escaping issues. SSH isn't even the only place you'll encounter this. Things like docker etc will have the same "problem".
In the case of ssh you can simply write your commands to a file and send them via stdin, or copy a script to the target.
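For example, sending the script over stdin means its body is never re-parsed as a command-line string:

ssh host 'sh -s' < local-script.sh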
The tone of this blog post rubs me wrong. Yes this is a footgun (in the same way many POSIX shell-related things are), but it's not like it's some "problem" with the design of SSH.
Only tangentially related.. Is there something like rbash that is actually secure and more restrictive? Like a shell that only "sees" certain files and folders and can only execute certain commands in a non privileged manner.
The shell rarely "sees" files and folders, except for expanding a glob like "*".
When the shell executes "cmd folder/file", the "folder/file" is just a string as far as the shell is concerned. It is the command that uses that string with a function like unlink or open.
The root of the problem isn't OpenSSH, it's the SSH protocol itself. When a command execution is desired, the command travels over the wire as a string, not as an array of strings as execve would expect. See RFC 4254 section 6.5. [1]
The SSH server has to figure out how to deal with it, most SSH servers to my knowledge invoke a shell. Parsing the passed command and trying to invoke it without a shell can be unsafe as escaping is shell-specific. (Imagine, for example, SSH'ing into a Windows machine, which the protocol entirely supports.)
Presumably one would expect ssh to pass "sh -c 'cd /tmp && pwd'" over the network, which would trigger the expected behavior. But the arguments are not quoted, so you end up with just: "sh -c cd /tmp && pwd".
The problem is that the quote signs are already eaten by your local shell. The SSH binary gets the parameters like this:
['sh', '-c', 'cd /tmp && pwd']
How should SSH know what shell is running on the other end and escape it properly? Will it escape for Sh? Bash? Fish? Powershell? Python shell? Custom script? The SSH client would need to know what shell is running on the other end, which it can't. The server, which knows what will run, cannot fix this because it only gets the already mangled string.
This is not solvable the way the protocol currently works, because you can't support all possible combinations of client/server applications and shells.
Rule 5 of http://cr.yp.to/qmail/guarantee.html remains completely true. Don't parse. When you try to use a user interface as a command interface, you have to parse. And some day it will bite you.
The solution is to remotely invoke programs designed for the purpose, and then pass data to them in a structured format.
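The classic shape of this, for instance, is tar on both ends: the file names travel inside a structured stream, so no command string ever has to quote them:

tar -C /src -cf - . | ssh host 'tar -C /dst -xf -'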
It's because SSH sees "localhost" "sh" "-c" "cd /tmp && pwd" and will send "sh -c cd /tmp && pwd", which the remote shell will interpret as executing "sh" "-c" "cd" "/tmp", and then "pwd" on success.
Oooh, the here string reads nicely to me and would require less thought about quoting/interpolation choices which is a win on its own even if it still takes me multiple attempts to remember to do that.
Oh yes, I've run into this when creating judo[1] (which runs commands/scripts across a fleet of hosts via SSH).
It's not even a problem with OpenSSH, the SSH protocol itself (RFC4254[2], Section 6.5) specifies that the command is sent as a string, and there's just no sane way to convert between an array of strings and a string that's also human-friendly to the command line.
My work-around was to allow only a single argument in "judo -c CMD HOST", where CMD is to be interpreted by the target's shell. If you need to do anything more complex, the "judo -s SCRIPT HOST" form is more-or-less equivalent to "chmod +x SCRIPT && ./SCRIPT".
I had a similar experience when I was testing A Black Path Toward the Sun[1], specifically using SSH and SCP over the HTTP/HTTPS tunnel, as well as when I was writing unreleased SSH pen testing tools at two different organizations.
I found that SSH is more complicated than I would have expected at a low level, and far more barebones than I would have expected at a higher level. It's incredibly sensitive to certain kinds of small delays in its packet timing. On the other hand, the actual terminal traffic is basically just a pair of streams. For an interactive SSH session, there's no useful concept of "send a command and wait for it to finish on the remote system", because the remote system is just piping the output of the shell back to the SSH client. You can hack in something like generating a random tag and appending "; echo '---complete:{random_tag}---'" to every command and assuming the command is complete when that string appears, but it's not built in, and of course every SSH server OS can have different syntax requirements, so now the client has to detect and handle each of those.
It makes sense given that SSH is a general-purpose protocol that's supposed to work for just about any CLI, so I was surprised for the same reason as the author of the article. Their issue makes sense in context as well - SSH has to support server shells like the Windows command prompt that don't even have the concept of an array of arguments at a low level, just the command string.
OpenSSH has already added plenty of extensions and the world seems to be fairly comfortable with using quite a few of them - for backwards compatibility reasons they probably can't change the current behaviour by default for -everybody- but a protocol extension and a flag to use it would improve the world substantially.
The current behaviour is already a UX trap, and such an extension would be another one on top of it. What's the sane thing to do if either end does not support it? Any kind of fallback would just make things even worse; detecting and supporting both variants is a road to hell. I still have CentOS 7 systems in production, with the last CentOS 6 box only decommissioned a few years ago. Judo mostly works because it sticks with the lowest common denominator, the value comes from things you can do on top of it.
What's interesting to me is that this "UX trap" is of ancient origin not mentioned in any post that I could find here. The whole ssh CLI started off trying to mimic the existing rsh CLI experience for basic things, just as scp tried to mimic rcp.
To me, this is a bit like the scp vs sftp topic. You really want a new remote-invocation CLI that throws away the rsh compatibility concept and is designed with "modern" goals. And, it probably needs new protocol elements to properly engineer it.
But this is also similar to the "system" callouts from multiple languages which abdicate responsibility and pass a command string to an external interpreter. Are all these API designers so naive, or do they have a philosophy of being so inclusive that they are not willing to assume that a command on a target system involves an executable and an array of arguments?
> Are all these API designers so naive, or do they have a philosophy of being so inclusive that they are not willing to assume that a command on a target system involves an executable and an array of arguments?
Remember, APIs are made for humans. (Assuming we're talking local processes,) nobody can stop you from using raw fork(2)+pipe(2)+execve(2) every time, yet you will usually reach for popen(3).
Add a flag or option to the client, e.g. -oSplitExec=true or -E* or whatever. This then uses an execv request type or something like that. If the server does not understand that, this is an error.
OpenSSH updates tend to be rolled out very slowly though. So if you wanted any kind of widespread usability of this option, you would've needed to add this about ten years ago.
* this is of course already taken; it seems only -u and -z are free at this point. You could use -U and pretend to be a Roman (EXECUE).
A while ago I wrote a Rust crate to deal exactly with this problem (which is a problem if you want to use SSH programmatically to run commands remotely).
The README contains my own explanation of the phenomenon.
That's such a useless statement when it comes to OpenSSH though. It's like saying I rode a bike for 23 years and it was fine until I tried to ride over a mountain.
This is frankly a stupid analogy. To run with it: because your bike can't ride over a mountain, and the one time in 23 years you decided to ride over a mountain it failed, you conclude your trusty bike of 23 years is no good?
This is probably an instance of the DWIM principle. You want this tool to do a thing when you invoke it a certain way, but its objectives and methods differ from yours, so despite best intentions on both sides, you end up with unexpected behavior.
Argument parsing, whitespace detection, tokenizing a string, these are really difficult things even when a remote host is not involved. Try writing a shell script that always handles variables so perfectly that you can access any pathname in the filesystem, whether it has spaces, tabs, unprintables, Unicode, or whatever.
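For what it's worth, the usual defensive idioms get you most of the way there (this is bash, since read -d '' isn't POSIX):

find . -type f -print0 | while IFS= read -r -d '' f; do
    printf '%s\n' "$f"    # quote every expansion, and don't feed arbitrary data to echo
done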
There are, of course, solutions, or at least workarounds, to this. ssh can be configured on the receiving side so that it runs a particular command, and parsing whitespace/arguments should be significantly easier when you configure it this way. Alternatively, you could write custom Bash or Python to do what you want, and just have that tool on the receiving end. (The latter is easier said than done!)
Configuring ssh to run a dedicated command line is probably an underutilized, underrated mode of operation. I used it to safely operate LAN backups: I had a client machine that would ssh in to a special user account, which had no actual shell, but immediately kicked off the backup process. That's the stereotypical use.
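That setup is a forced command in authorized_keys; a sketch, with made-up paths and the key abbreviated:

# ~backup/.ssh/authorized_keys on the server:
restrict,command="/usr/local/bin/run-backup" ssh-ed25519 AAAA... backup-client
# whatever the client actually asked to run lands in $SSH_ORIGINAL_COMMAND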
But for ad hoc use cases, like this blogger was probably trying to implement, you don't really have a chance to pre-populate the receivers with scripts or special configurations. And yeah, that becomes a pain when you're trying to do a clever one-liner and it chokes. I feel your pain. (Have you considered improvising a Python or Perl one-liner for ssh to run?)
> ssh can be configured on the receiving side so that it runs a particular command
Yes — but it will always, ALWAYS launch your shell as (typically) defined in /etc/passwd, and will pass it two string arguments: "-c" "and your particular command with arguments and all". Go strace it, go look at the argument vector, you'll see.
Thus if your shell is bash, zsh, tcsh, dash, sh, whatever: the moment that it sees and tokenizes "and your particular command with arguments and all", you'll have lost.
The only winning move is not to play.
> Configuring ssh to run a dedicated command line is probably an underutilized, underrated mode of operation.
Perhaps, but in any case, it doesn't save you. As I explained above, your shell runs first, so you've already lost. Maybe it's not underrated, maybe it's rated exactly right because of this problem!
But what if I told you you could make your problems go away? You juuuuuust have to use this very special shell. I don't want to be downvoted for linking it for a third time on this page, so I won't. Thus: see my other comments on this page :-)
This is an instance of the problems caused by throwing everything into a simple plain stream, instead of having a modal stream that can tell what kind of thing is going there.
Try writing a Python script that always handles variables so perfectly that you can access any pathname in the system. It's funny to even write that, because it's trivial (barring UTF-8 issues that require using a non-obvious API). But you can only do that because of those annoying quotes that create a new idiom inside them. Sh saves you from that annoyance.
Here's a wrapper script I wrote that has the same syntax as the ssh command but automatically shell-quotes arguments, ensuring they're passed through 1:1 to the remote command:
The quoting is done with Python's shlex.quote(), whose output should parse properly in any POSIX-compatible shell. (Before editing this comment I had linked a similar script from Stack Overflow, but it uses bash's printf %q to quote, which produces output that uses ANSI-C quoting, which isn't supported by dash.)
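The core of such a wrapper looks roughly like this (sshq is just my name for it, and it assumes python3 is installed locally):

sshq() {
    host=$1; shift
    # shlex.quote each argument, join with spaces, send as one string
    quoted=$(python3 -c 'import shlex, sys; print(" ".join(shlex.quote(a) for a in sys.argv[1:]))' "$@")
    ssh "$host" "$quoted"
}

# usage: sshq user@host ls 'folder name'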
this isn't surprising, the author is just cowboy coding. i *expect* anything in un\*x to require 5 hours for first reading the man page (which the article has shown to be useless, informal, and idiosyncratic - making it worse than nothing - in this case, which isn't surprising either) then conducting an investigation into whether the behavior is portable. now you know why real programmers hate un\*x - it's actually bogus garbage that only charlatans can grasp. just the cowboy programmers like it because they can feel like they're doing something when they're actually just making errors everywhere.
lol im just gonna leave those typos (automatically double escaped asterixes) there which were caused by writing this comment on a non logged in page after logging in on a different tab. perfect example of the same string flinging garbage
It has always been like that, at least for as long as I've known it... 20 years?! So the author thinks it should be "fixed"? Oh yeah, sure, we should add a new openssh client parameter -yaml and send the remote command nicely structured in a yaml config.
Reminds me that OpenSSH fails with some insane error indicating memory corruption if you try to run it with tcmalloc, which can happen by accident if you call it from a Python program that uses it.
(tcmalloc is commonly used to fix memory 'leaks' (fragmentation?) in Pytorch, so this happens a lot to me. At least, I've been bitten twice.)
It's hard to imagine what might be going on to make it sensitive to the details of malloc/free... and I'm not sure that I want to.
I tried OpenSSH, libssh and libssh2. All three fail, though admittedly I don't know for sure this was the cause of failure for the two latter; they gave me no diagnostics.
They all worked fine without LD_PRELOAD though, so...
OpenSSH security features are tightly coupled to the underlying OS. I remember reading the code and seeing how sshd forks and re-execs itself in order to leverage dynamic library address randomization in each connection. I wouldn't be surprised if there are some malloc/free-related tweaks in a similar manner.
> After this, the client either requests an interactive shell or execution of a non-interactive command, which sshd will execute via the user's shell using its -c option
It's not sshd doing the argument splitting, it's the shell on the server side.
now imagine if you're writing a script that calls ssh in a loop which runs on the remote host and uses sudo to run an awk cli command with a single-quoted one-liner.
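Something like this, with made-up hosts and log path; the '\'' dance is what gets the single-quoted awk program through both shells intact, so no shell ever expands the $1:

for host in alpha beta; do
    ssh "$host" 'sudo awk '\''{ print $1 }'\'' /var/log/app.log'
done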
So many things could be broken if we didn't insist on backwards compatibility with ancient systems. Virtually everything, for just one example.
Can you imagine if we didn't bother with backwards compatibility for servers? Or even just your desktop - you try to boot one day and find that your desktop doesn't boot because we dropped compatibility with MBR systems and only support UEFI, or that you choose ext4 for your filesystem, but the kernel only supports superqFS version 2023.07 ...
or that SSH backup script that has been working fine for 20 years suddenly, silently, stops working, and then your critical production systems die.
Or how about your favorite photo software decided to just change all of its keybindings, or stopped loading your photos -- from last year!
Not to detract from your overall point but for anyone out there wondering about a way to handle this in general: your job should update or annotate something (a file, a table, a bucket, etc.) upon success. Then you use a "dead man's switch"-style check/monitor that alerts if the job hasn't updated its proof of life, so to speak.
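A minimal sketch; run-backup and alert stand in for whatever your job and notifier actually are:

# job side: record proof of life only on success
run-backup && touch /var/run/backup.ok

# monitor side (e.g. from cron): alert unless the marker was touched in the last 24h
[ -n "$(find /var/run/backup.ok -mmin -1440 2>/dev/null)" ] || alert "backup stale"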
It'd be nice if they didn't break, but for example, since around OpenSSH 7.7, the argument-parsing order changed from last-wins to first-wins. This change broke all of my aliases, where I used to be able to have an alias sr="ssh -l root" and override the user with something such as "sr foo@host".
Last wins is how most other UNIX tools work, since it's the laziest approach (parse from first to last and just overwrite any old value). But I guess somebody tried to be explicit with the order of everything from the configuration file including the command line, and now this happened.
I really wish this would be reverted, but now it's probably been so long that it would break things for people again.
Your examples aren't relevant at all and full of hyperbole. Are you replying to what I wrote or showing off your oratory skills to the peanut gallery?
SSH is a protocol. Just have the client negotiate the version of the protocol when establishing a connection. Use the old protocol by default, write new scripts with a flag requiring the new protocol from the server. Boom. You maintain backwards compatibility with ancient systems, but you don't force old mistakes upon current day users.
> SSH is a protocol. Just have the client negotiate the version of the protocol when establishing a connection. Use the old protocol by default, write new scripts with a flag requiring the new protocol from the server. Boom. You maintain backwards compatibility with ancient systems, but you don't force old mistakes upon current day users.
The problem isn't so much the protocol (SSH already does the client/server protocol negotiation), but the command line interface to the SSH tool. Changing the interface can stop older scripts from working in the same manner, so either you have to add completely new options to the CLI so that the older usage still works, or you have a renamed version of the tool (e.g. nussh) that won't need to worry about backwards compatibility.
> Not a problem for free software, just update the old code if stuff breaks ;)
I know one QA team at Google had the slogan "if it isn't broken, you aren't working hard enough", but they are the QA team. It is their job to find defects before anyone else.
It is not our job as developers to create defects for no reason.
If you can even still get the updates on the backwards-incompatible package manager. (as anyone who's ever maintained an Arch system has probably run into, and even Red Hat used to run into this problem way back in the day.)
well, that's another problem. there is a difference between what the developers see and what the users see.
the developers should have access to the updated API/ABI code and maybe some migration documentation in order to update the program. the user should have a clear migration process to avoid data loss or corruption.
in this case, the package manager's core functionality should not depend upon any other packages. this way you can just store all the previous versions of the program and provide an upgrade path. sure, it will take a bit if you're doing a big upgrade, but that's your fault for not keeping stuff updated
The origins of the behaviours might be "ancient systems", but contemporary programs still rely on them. I do wish there was a bit more movement towards fixing some obvious mistakes, but it's not that easy – if it were, it would have been done already.
Oh wow, good to know it's that broken. I still had figuring out how to properly quote SSH command lines on my todo list because I've got some shell wrappers that try to transparently execute stuff inside containers on remote machines and they weren't dealing right with spaces and quotes (think trying to build `remote @server sql <statement>`)
It's weird that this has been known for over a decade, and no one ever added a 'pass on the argv[] array unchanged' option in all that time
Now that I know I can't fix this in stock openSSH I think I'll just look into throwing argv[] into a base64 encoded JSON array and somehow have jq fix and exec it on the other end.
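A sketch of that idea; it assumes jq 1.6+ (for --args) locally and jq + base64 on the remote end, and note the exec'd command's stdin is already consumed by the pipe:

enc=$(jq -cn '$ARGS.positional' --args "$@" | base64 | tr -d '\n')
# base64 only emits shell-safe characters, so only the fixed pipeline needs quoting;
# jq's @sh re-quotes each array element for the remote shell before exec'ing
ssh host "echo $enc | base64 -d | jq -r '@sh \"exec \(.)\"' | sh"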
If you use bash on both sides, you can use printf for any words in the command that need quoting (or all words if you are writing a script):
ssh foo@bar baz "$(printf '%q\n' "something that needs $quoting")"
For cases where one or both sides are not bash, but still POSIX like, it's relatively easy to write a function that uses single quotes; just replace every single quote with '\'' before wrapping in single quotes.
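A sketch of that function (shquote is my own name for it; beware that the $(...) strips any trailing newlines from the argument):

shquote() {
    # wrap in single quotes, rewriting each embedded ' as '\''
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# e.g. this should survive both shells:
ssh host "ls -l $(shquote "it's a folder name")"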