Sigh. If you're doing bulk file transfers, you never hit that problem. If you're sending enough data to fill up outgoing buffers, there's no delay. If you send all the data and close the TCP connection, there's no delay after the last packet. If you do send, reply, send, reply, there's no delay. If you do bulk sends, there's no delay. If you do send, send, reply, there's a delay.
The real problem is ACK delays. The 200ms "ACK delay" timer is a bad idea that someone at Berkeley stuck into BSD around 1985 because they didn't really understand the problem. A delayed ACK is a bet that there will be a reply from the application level within 200ms. TCP continues to use delayed ACKs even if it's losing that bet every time.
If I'd still been working on networking at the time, that never would have happened. But I was off doing stuff for a startup called Autodesk.
That fixed 200ms ACK delay timer was a horrible mistake. Why 200ms? Human reaction time. That idea was borrowed from X.25 interface devices, where it was called an "accumulation timer". The Berkeley guys were trying to reduce Telnet overhead, because they had thousands of students using time-sharing systems from remote dumb terminals run through Telnet gateways. So they put in a quick fix specific to that problem. That's the only short fixed timer in TCP; everything else is geared to some computed measure such as round trip time.
Today, I'd just turn off ACK delay. ACKs are tiny and don't use much bandwidth, nobody uses Telnet any more, and most traffic is much heavier in one direction than the other. The case in which ACK delay helps is rare today. An RPC system making many short query/response calls might benefit from it; that's about it. A smarter algorithm in TCP might turn on ACK delay if it notices it's sending a lot of ACKs which could have been piggybacked on the next packet, but having it on all the time is no longer a good thing.
If you turn off the Nagle algorithm and then rapidly send single bytes to a socket, each byte will go out as a separate packet. This can increase traffic by an order of magnitude or two, with throughput declining accordingly. If you turn off delayed ACKs, traffic in the less-busy direction may go up slightly. That's why it's better to turn off delayed ACKs, if that's available.
One of the few legit cases for turning off the Nagle algorithm is an FPS game running over the net. There, one-way latency matters; getting your shots and moves to the server before the other players affects gameplay. For almost everything else, it's round-trip time that matters, not one-way.
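For anyone who wants to poke at the two knobs discussed above, here's a rough sketch in Python. The helper name `tune_latency` and its parameters are made up for illustration. `TCP_NODELAY` is the standard switch for the Nagle algorithm; `TCP_QUICKACK` is Linux-specific, and as far as I know it is not sticky (the kernel can drop back into delayed-ACK mode later), so treat this as an experiment rather than a fix.

```python
import socket


def tune_latency(sock, disable_nagle=False, quick_ack=True):
    """Apply the requested options and report which ones were actually set."""
    applied = {}
    if disable_nagle:
        # Turns off Nagle: small writes go out as separate packets.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        applied["TCP_NODELAY"] = True
    # TCP_QUICKACK only exists on Linux, so look it up defensively.
    quickack = getattr(socket, "TCP_QUICKACK", None)
    if quick_ack and quickack is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, quickack, 1)
            applied["TCP_QUICKACK"] = True
        except OSError:
            applied["TCP_QUICKACK"] = False
    return applied
```

Since the quick-ACK setting may not persist, code that relies on it would probably have to set it again after reads.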
Do you have any links specifically discussing the issue?
Google for 'virtualbox sendfile' and there's a lot of discussion.
Worth noting, and this bit me recently, Go's http file handler also uses sendfile beneath the covers.
I then found out that it is actually mentioned in the Vagrant docs also:
nodelay and cork are different, indeed, but opposites? Nagle (the thing nodelay switches off) and cork both try to achieve the same effect: put more data in before sending a packet.
> [...] This mechanism is ensured by Nagle’s algorithm, and 200ms [...]
Absolutely not. Nagle's algorithm does not have any delay or timer built in. It simply holds back non-full packets while there is data in flight (sent but not yet acked). The second half of the problem is delayed ACKs, but the article never mentions them. Instead it goes on to say
> [...] but Nagle is not relevant to the modern Internet [...]
which is indeed a popular belief, but it's a very superficial analysis that holds no water if you study it further.
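For what it's worth, the rule as stated above fits in a few lines. This is a simplified model, not any real stack's code: `nagle_can_send` and its parameter names are invented for illustration, and real implementations have extra cases (window limits, Minshall's variant, and so on).

```python
def nagle_can_send(queued_bytes, mss, unacked_bytes):
    """Simplified Nagle check: may the sender emit a segment right now?"""
    if queued_bytes <= 0:
        return False  # nothing to send
    if queued_bytes >= mss:
        return True  # a full segment is never held back
    # A partial segment goes out only when nothing is awaiting an ACK.
    return unacked_bytes == 0


# send, send, reply with two 100-byte writes and a 1460-byte MSS
# (illustrative numbers only):
first = nagle_can_send(queued_bytes=100, mss=1460, unacked_bytes=0)
second = nagle_can_send(queued_bytes=100, mss=1460, unacked_bytes=100)
print(first, second)  # prints: True False
```

Note there is no timer anywhere in that function. The second write is released by the ACK for the first one, so any 200ms wait comes from the receiver delaying that ACK.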
The feeling I always get from articles like this is that they border on "technical religion". It sounds correct, it is technical, it isn't even false, but it doesn't paint a clear picture; instead it mystifies things further.
The problems nginx had:
1. Nagle and HTTP keepalive don't play nice together: the last bit of data might be artificially delayed, especially when delayed ACKs come into play. nodelay seems needed here. (It is not though, see that Minshall bit.)
2. How to send headers and use sendfile for the body, and fill the first packet with more than just the headers? nopush (tcp_cork) is a solution.
Igor said in a talk I went to that Nginx 1.x was written for FreeBSD first, while 2.0 will be written for Linux first, so perhaps some of these things may change (hence "nopush" in the config file, the FreeBSD term).
As the latter part of the article describes, "tcp_nodelay on" is at odds with "tcp_nopush on", since they are mutually exclusive. But nginx has special behavior: if you have "sendfile on", it uses "tcp_nopush" for everything but the last packet, then turns nopush off and enables nodelay to avoid the 0.2 sec delay.
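Roughly, that sequence looks like the sketch below. To be clear, this is not nginx's code, just the same idea in Python on Linux: `send_header_then_file` is a made-up helper, `TCP_CORK` is the Linux option (FreeBSD's counterpart is `TCP_NOPUSH`), and `socket.sendfile()` uses the kernel's sendfile where it can.

```python
import socket


def send_header_then_file(conn, header, body_file):
    """Cork, send header plus file body, then uncork and set TCP_NODELAY."""
    cork = getattr(socket, "TCP_CORK", None)  # only defined on Linux
    if cork is not None:
        # While corked, partial segments are held so the header and the
        # start of the body can share a packet.
        conn.setsockopt(socket.IPPROTO_TCP, cork, 1)
    conn.sendall(header)
    sent = conn.sendfile(body_file)  # file must be opened in binary mode
    if cork is not None:
        # Uncorking releases whatever partial segment is still queued.
        conn.setsockopt(socket.IPPROTO_TCP, cork, 0)
    # With Nagle off, the tail of a keepalive response is not held back.
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return len(header) + sent
```

Usage would be something like `send_header_then_file(conn, b"HTTP/1.1 200 OK\r\n\r\n", open(path, "rb"))` on an accepted connection, where `path` is whatever file you are serving.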
That said, sendfile equates to a single kernel-to-kernel copy instead of a kernel-to-userspace plus a userspace-to-kernel copy as in the read/write case.