
The curious case of slow downloads - jgrahamc
https://blog.cloudflare.com/the-curious-case-of-slow-downloads/
======
BugsBunnySan
Related story time: At university, we ran a local quakeworld server (yes, this
was in the stone age XD). And I wanted to write a tool to allow users to
control the server. It would listen on a network port and pipe its input into
the quakeworld server running as a child subprocess.

Sounds simple enough, right? Disregarding accept(2)ing connections, etc., it's
just going to sit in an endless loop, select(2) on the sockets then read what
comes in, and write that to the quakeworld server.

To debug this, every time someone sent some data over the network, it would
write a simple message ("blah") into a file on my home drive, called "~/blah"
(very sophisticated debugging, I know).

Now, the semantics of select(2), cousin of poll(2), are interesting.

My understanding: select(2) blocks until you can read data from the socket.

Actual behaviour: select(2) blocks until a subsequent read on the socket does
not block.

Fun fact: if there's no data available on the socket, read does not block,
but returns with an error immediately.

Result: Over the weekend, the process was spinning in an endless loop, writing
'blah' into a file on the NFS home drive as quickly as the filesystem allowed.
This being a university-scale Solaris NFS server, that's pretty fast.

End Result: A test file called 'blah', filled with about 800 GB of 'blah'.
Good thing the admins were nice people and didn't shoot me when I went to
explain the next Monday why the home drive was filled with "blah" XD

EDIT: read(2) doesn't actually error on EOF, just returns 0, effect is the
same though.
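
A minimal sketch of the loop described above with the EOF case handled, written in Python rather than C for brevity. The function name `drain_until_eof` and the `on_data` callback are made up for illustration; the point is only that once `recv` returns an empty result the socket stays "readable" forever, so the loop has to stop instead of going back to `select`.

```python
import select
import socket


def drain_until_eof(sock, on_data, timeout=1.0):
    """Read from sock until the peer closes; return the total bytes seen."""
    total = 0
    while True:
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            continue  # timed out, nothing to read yet
        chunk = sock.recv(4096)
        if chunk == b"":
            # EOF: select() would keep reporting the socket as readable,
            # so looping again would spin. Stop here instead.
            break
        total += len(chunk)
        on_data(chunk)
    return total
```

Without the `chunk == b""` check, every pass through the loop returns immediately and the debug write runs at full speed, which matches the 'blah' file above.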

~~~
brazzy
> Good thing the admins were nice people and didn't shoot me when I went to
> explain the next Monday why the home drive was filled with "blah" XD

They should be the ones explaining why they don't have disk quotas set up on a
shared system.

~~~
BugsBunnySan
AFAIR, they had one, which was so flakey they had to disable it all the time.

But yeah, maybe them being somewhat at fault as well helped me out a bit :)

~~~
TeMPOraL
Sounds like a story straight from Unix-Haters Handbook... :).

------
bigethan
This is why open source (and controlling your whole stack) matters in big
business.

For example, Microsoft may be changing their image, but their core software is
closed source. <cloud provider> may be great, but can you pull off something
like this when you've got an issue? The importance of being able to debug and
patch your mission critical systems is hard to overstate.

Please encourage your employer to fiscally support the open source code that
they rely upon!

~~~
masklinn
> For example, Microsoft may be changing their image, but their core software
> is closed source. <cloud provider> may be great, but can you pull off
> something like this when you've got an issue? The importance of being able
> to debug and patch your mission critical systems is hard to overstate.

Not OSS != no source access. A cloud provider on top of the MS stack would
most likely have Shared Source Initiative licenses.

~~~
bigethan
What's that cost? How quickly can it be done? If I was a developer would I
have to ask my manager to look into the details of our support plan? What
about the ability to patch the source?

`git clone` is an incredible power. Companies should reward those who are
generous with their code.

~~~
masklinn
> What's that cost? How quickly can it be done?

Careful with those goalposts, you're tearing up the pavement.

> Companies should reward those who are generous with their code.

So companies "should reward those who are generous with their code" but not
compensate them for it?

~~~
lmm
> So companies "should reward those who are generous with their code" but not
> compensate them for it?

No, they absolutely should compensate them for it. Buy OSS.

~~~
jamespo
"So you're asking me to pay for something we can get for free" said every
Finance Officer ever...

~~~
bigethan
"Hey CEO, There's an easy way I can increase team morale for less than the
cost of a happy hour" said the VP of Engineering.

~~~
jamespo
Can you explain how team morale improves if (for example) RHEL is used over
CentOS?

------
Animats
They just rediscovered silly window syndrome, but at the application level
rather than the TCP level.

In early TCP implementations, if you read from a TCP connection one byte at a
time, and the sender was writing faster than the receiver, the sender TCP
would get an ACK opening one byte of window. The sender TCP would then
transmit a packet with only one byte. This often came up if a TCP connection
was driving a slow device. The fix was to not send a window update until there
was at least one full packet worth of window available.
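
A toy sketch of the receiver-side fix described above, in Python. It is a simplification of what real TCP stacks do, and the name `window_update` and the 1460-byte segment size are assumptions chosen for illustration.

```python
def window_update(free_bytes, mss):
    """Window to advertise after the application has read some data.

    Stays at 0 (window closed) until at least one full segment fits,
    so the sender is never invited to transmit a tiny packet.
    """
    return free_bytes if free_bytes >= mss else 0


# A slow reader frees buffer space a little at a time.
# mss=1460 is an assumed, typical Ethernet-sized segment.
updates = [window_update(free, 1460) for free in (1, 2, 1459, 1460, 3000)]
print(updates)  # [0, 0, 0, 1460, 3000]
```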

This is the same problem, but at the OS buffering level. Linux has a similar
solution. The problem is 1) servers today have to protect themselves against
readers that are "too slow", because that can be used as a denial of service
attack, and 2) the approach to doing this was not well designed.

This is a tough one for Cloudflare, because they're in the DDoS-prevention
business. They have to decide whether a slow reader is an attacker or just
someone on a slow connection. Still, insisting that a reader consume at least
5MB/minute is a bit much. Some people are still on dialup.
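
To put the 5MB/minute figure in perspective, here is the arithmetic as a short Python sketch. It assumes decimal megabytes and a nominal 56 kbit/s modem; real dialup throughput is usually lower.

```python
BYTES_PER_MB = 1_000_000       # assumption: decimal megabytes
DIALUP_BITS_PER_SEC = 56_000   # assumption: nominal 56k modem rate


def required_kbit_per_sec(megabytes, seconds):
    """Sustained rate needed to consume `megabytes` in `seconds`."""
    return megabytes * BYTES_PER_MB * 8 / seconds / 1000


needed = required_kbit_per_sec(5, 60)             # about 667 kbit/s
ratio = needed * 1000 / DIALUP_BITS_PER_SEC       # about 12x a 56k modem
```

So under these assumptions the reader has to sustain roughly twelve times what a dialup link can deliver.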

------
scurvy
I really wish people would stop misusing TCP resets. It's _not_ just a fast
way to close a connection. That's not what it's for at all. If you're ever
thinking about sending a reset, please read RFC 793 and RFC 3360 first.

I'm looking at you Arbor.

~~~
Dylan16807
Well the goal here seems to be aborting a connection, not just closing it.
They specifically don't want to empty the buffer, because time is up. Is that
a misuse of reset?

~~~
scurvy
Yes, it is a misuse of a TCP reset. The RFCs are very clear about when to use
a TCP reset.

You can't use a reset just because you want to be fast or don't feel like
sending any more data. Resets are for something abnormal happening. A timeout
isn't abnormal to a TCP connection. It's part of the process.

~~~
arielb1
Closing the connection would cause the client to believe that the file has
been fully downloaded - resulting in a truncated file. Not good.

HTTP does not have a way of sending an error mid-transfer, so a TCP reset
has to suffice. The other option would be to blackhole the connection, but
that would be worse for both sides.
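
For reference, one common way an application aborts a connection instead of closing it cleanly is SO_LINGER with a zero timeout. This is a hedged Python sketch, not a claim about how any particular server does it; `abort_connection` is a made-up name, and the `"ii"` struct layout is an assumption that holds on Linux and most Unixes but not on Windows.

```python
import socket
import struct


def abort_connection(sock):
    """Close sock so the peer sees a reset instead of a clean EOF."""
    # l_onoff=1, l_linger=0: discard any unsent data and reset on close.
    # Assumption: struct linger is two C ints, as on Linux.
    linger = struct.pack("ii", 1, 0)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
    sock.close()
```

On the other end, a blocking read then typically fails with a connection-reset error rather than returning 0 bytes, which is how the client can tell an abort from a normal end of stream.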

~~~
nunwuo
> Closing the connection would cause the client to believe that the file has
> been fully downloaded - resulting in a truncated file. Not good.

That's not correct. Both Content-Length and the chunked encoding scheme would
allow the client to know that the connection was terminated before the end of
stream was reached.
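
A rough Python sketch of both checks, assuming the client already has the headers as a dict with lowercase keys and the raw body bytes in hand. The function names are hypothetical, and the chunked parser ignores trailers.

```python
def truncated_by_length(headers, body):
    """True if fewer bytes arrived than Content-Length promised.

    Returns None when there is no Content-Length to compare against.
    """
    declared = headers.get("content-length")
    if declared is None:
        return None
    return len(body) < int(declared)


def chunked_complete(raw):
    """True if a raw chunked body contains the terminating 0-size chunk."""
    pos = 0
    while True:
        eol = raw.find(b"\r\n", pos)
        if eol == -1:
            return False  # ran out of data before the next size line
        size = int(raw[pos:eol].split(b";")[0], 16)
        if size == 0:
            return True   # final chunk seen: stream ended properly
        pos = eol + 2 + size + 2  # skip chunk data and its trailing CRLF
        if pos > len(raw):
            return False  # connection ended in the middle of a chunk
```

Either way, a connection that closes early leaves the client with a body that is visibly incomplete, provided the client actually checks.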

------
Yrlec
Great that they solved it! My company (degoo.com) had to use CloudFront
instead of CloudFlare for our binaries because of this bug. Our conversion
rate went down about 10% whenever we used CloudFlare instead. Perhaps time to
switch back.

------
LoSboccacc
Ah! Got bitten by this and had to work around it by serving stuff from S3.
Nice to see there was an actual problem and it wasn't just some issue with
our stack.

------
speps
Original title doesn't mention "NGINX".

~~~
gruez
You're correct, but the title is way less helpful without the "nginx". Unless
the reader knows that Cloudflare uses nginx, he won't know the article
pertains to nginx.

~~~
masklinn
Is it now? nginx seems to have pretty normal behaviour here, so the core issue
is write-polling buffers on linux rather than nginx itself. It seems to me
nginx was the messenger of sorts, but the issue doesn't really pertain to
nginx.

In a nearby comment, LoSboccacc notes that they've hit that issue [removed: on
S3].

~~~
ParadoxOryx
It sounded to me like LoSboccacc's solution was to serve files from S3, not
that they experienced the same issue on S3.

~~~
masklinn
You're right, I misread it.

------
bluesign
I guess dropping very slow connections is also a kind of protection against
DoS or DDoS attacks, where an attacker can drain resources.

~~~
scurvy
Slow reads are not a great attack vector on modern OS releases.

Slow requests (like slowloris) were much more effective.

