
The problem isn't actually with the transfer protocol itself; it's that the remote scp is invoked using "$SHELL -c <some string>", which turns out to be somewhat annoying to secure. The other parts (the server sending a different file than the one you requested) are really just pretty obvious oversights in validation (when you are doing open(server_response.fname, O_WRONLY) you should really have validated that fname...).

That being said, scp-the-protocol is actually very simple. There is no spec for it, but there are a number of interoperable implementations, and the protocol is really damn simple (it basically goes "file <length of name> <name> <size> write <length> <data> write <length> <data>" and so on). It achieves good throughput (for large files) over SSH, but because every file involves a few ping pongs, it is RTT-bound for small files.

SFTP is much, much more complicated. And the spec situation is much worse, because there are like a dozen drafts and half a dozen different versions of the protocol. SFTP also pulls in half of the POSIX file semantics. SFTP naively is RTT bound for throughput; read size is limited to 64K in OpenSSH, so with 20 ms RTT you're only going to get at most ~3 MB/s with a naive client.
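To put a number on the "~3 MB/s" figure, here is a back-of-the-envelope sketch in Python. It assumes exactly one read is outstanding at a time and that the link itself is not the bottleneck; the function names are made up for illustration.

```python
def naive_read_throughput(read_size_bytes: int, rtt_seconds: float) -> float:
    """Upper bound in bytes/second when only one read is in flight at a time."""
    return read_size_bytes / rtt_seconds


def pipelined_read_throughput(read_size_bytes: int, rtt_seconds: float,
                              in_flight: int) -> float:
    """Same bound with several reads outstanding (ignores link bandwidth)."""
    return in_flight * read_size_bytes / rtt_seconds


READ_SIZE = 64 * 1024   # the 64K read limit mentioned above
RTT = 0.020             # 20 ms round trip

naive = naive_read_throughput(READ_SIZE, RTT)
print(f"{naive / 1e6:.2f} MB/s")   # prints 3.28 MB/s
```

Each request has to wait a full round trip before the next one goes out, so the only ways to go faster are bigger reads or more of them in flight.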

SFTP is essentially NFS, but over a single SSH channel (and different). You get to ask for file handles, and then you can do requests on those handles. You get to opendir() remotely and get a directory handle and so on.

Like NFS, SFTP supports having multiple requests in flight (how many: implementation-defined / no way to find out), so you can request multiple reads and wait for them to get around the 64K limitation. Problem: the maximum read size is implementation-defined / no way to find out, which makes this really quite complex, since you have to account for reads coming back out of order and for reads being shorter than you requested without having reached EOF.

Say you want to transfer a 500K file in 256K chunks: you schedule two reads, one of 256K and one of 500K-256K = 244K. Let's call them r0 and r1. Now r1 comes back, but it only read 64K (or 8K or 16K or whatever the implementation felt like). Now you need to figure out that (1) you should hold this data back, because the data before the offset of r1 has not been read yet, and (2) you need to issue another read to get the contents from 320K to 500K, where (3) you may figure out that the implementation probably only does 64K reads (note: the SFTP read request length field is 32 bit... expectations and all), so you get smart and schedule a few more reads: r2 for 320K-384K, r3 for 384K-448K and r4 for 448K-500K.

Now you wait for the responses and get, e.g., r3, r4, r1, r2. You need to hold all this data and shuffle it around correctly, then write it in order to the file (assuming you want to write the file sequentially, which is very reasonable if you want to have any chance at all of resuming the transfer).
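A rough Python sketch of that bookkeeping, to make it concrete. FakeServer is a made-up stand-in, not a real SFTP implementation: it answers pending reads in random order and silently caps each one at max_read bytes. The client assumes it already knows the file size, and it simply re-requests the rest of a short read in one go rather than guessing the server's limit.

```python
import io
import random


class FakeServer:
    """Hypothetical stand-in for an SFTP server (not a real implementation)."""

    def __init__(self, data: bytes, max_read: int, seed: int = 0):
        self.data = data
        self.max_read = max_read
        self.pending = []                 # outstanding (offset, length) requests
        self.rng = random.Random(seed)

    def request_read(self, offset: int, length: int) -> None:
        self.pending.append((offset, length))

    def next_response(self):
        # Answer any pending request, possibly with fewer bytes than asked for.
        offset, length = self.pending.pop(self.rng.randrange(len(self.pending)))
        n = min(length, self.max_read)
        return offset, length, self.data[offset:offset + n]


def download(server: FakeServer, size: int, chunk: int, out) -> None:
    """Pipeline reads, tolerate short and reordered replies, write in order."""
    in_flight = 0
    for offset in range(0, size, chunk):
        server.request_read(offset, min(chunk, size - offset))
        in_flight += 1

    held = {}        # offset -> data that cannot be written yet
    write_pos = 0    # everything before this offset is already in `out`
    while in_flight:
        offset, requested, data = server.next_response()
        in_flight -= 1
        if not data:
            raise IOError(f"unexpected EOF at offset {offset}")
        held[offset] = data
        if len(data) < requested:
            # Short read without EOF: ask again for the part we did not get.
            server.request_read(offset + len(data), requested - len(data))
            in_flight += 1
        # Flush whatever is now contiguous with what has been written.
        while write_pos in held:
            block = held.pop(write_pos)
            out.write(block)
            write_pos += len(block)
    if write_pos != size:
        raise IOError("transfer incomplete")


data = bytes(range(256)) * 2000           # 512000 bytes = 500K
out = io.BytesIO()
download(FakeServer(data, max_read=64 * 1024), len(data), 256 * 1024, out)
```

Re-requesting the whole remainder keeps the code short, but it serializes the follow-up reads; a client that cares about throughput would split the remainder into server-sized reads, as in the r2/r3/r4 example above.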

This is on top of SSH already having throughput issues in the basic protocol over long fat networks.


Curious... I've never even had to think about chunk size when using sftp. It has always "just worked" for transferring files.

What scenarios are you talking about where chunks are important and you have to be concerned about ordering? Is this strictly for applications that perform large sync'ing jobs where "to-the-limit" performance is important?

It doesn't seem like a huge deal to deprecate scp and start using a short stanza of sftp for simple file transfers.

As a user of sftp(1) you don't have to care. You can even specify unsupported buffer sizes, because that sftp client does have all that complexity that I described above built-in. But if you need to interact with the protocol - well it's just a headache, plain and simple. (However, sftp(1) will write files non-sequentially - so you really can't just restart a transfer or you'll very likely end up with a corrupted file).

> This is on top of SSH already having throughput issues in the basic protocol over long fat networks.

Is that why I've sometimes observed slower-than-expected transfers when using rsync over ssh to do a mass migration of server data from one data center to another? Can you recommend an alternative (besides writing the data to external media and physically shipping it)?

You could try something like bbcp (https://www.slac.stanford.edu/~abh/bbcp/) if copying large amounts of data over relatively high-latency network connections.

I had a fun time coercing the build system into building this on an ARM SBC the other month, but alas, it did not seem to speed things up substantially over an LTE modem. I probably need to play with it a bit more.

There are patches for openssh when using it over LFNs: https://psc.edu/research/networking/hpn-ssh/ (which seems to be in a sad state currently). These patches might interact poorly with other implementations (e.g., it didn't interoperate with paramiko (2 years ago)).

Also, make sure TCP window scaling is working. I was making transfers through an F5 Big-IP which was running a profile that disabled it.

If SSH throughput itself is the problem, I don't know, encrypt a tar file and wget it?

If SCP/SFTP is the problem due to small-ish files, use a tarpipe instead. Nothing beats tarpipes for small files.
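For anyone who hasn't seen one: a tarpipe is usually spelled something like `tar cf - dir | ssh host 'tar xf - -C /dest'` (host and paths are placeholders). The point is that all files travel as one continuous stream, so there are no per-file round trips. Below is a minimal Python sketch of the same idea using the standard tarfile module in streaming mode; an in-memory buffer stands in for the SSH pipe, and the function names are made up for illustration.

```python
import io
import tarfile


def pack_stream(files: dict, sink) -> None:
    """Write every file into one tar stream; `sink` only needs write()."""
    with tarfile.open(fileobj=sink, mode="w|") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def unpack_stream(source) -> dict:
    """Read a tar stream strictly front to back; `source` only needs read()."""
    result = {}
    with tarfile.open(fileobj=source, mode="r|") as tar:
        for member in tar:
            if member.isfile():
                result[member.name] = tar.extractfile(member).read()
    return result


# A thousand tiny files, the case where per-file round trips hurt most.
files = {f"small/{i}.txt": f"file {i}\n".encode() for i in range(1000)}

pipe = io.BytesIO()      # stand-in for the SSH connection
pack_stream(files, pipe)
pipe.seek(0)
received = unpack_stream(pipe)
```

The "w|" and "r|" modes never seek, which is what lets the same code sit on either end of a real pipe or socket instead of a buffer.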

Very good description of the problems with scp and sftp.
