Apple's got a history of noisy services. I remember when iOS 4.0.0 was blocked by the email service I was working for, and by many others, because it basically spammed mail servers. A lot of our customers were mad that the software update disconnected them, but Apple released 4.0.1 quickly to address this.
TCP cannot tell what the application wants. If the client application never closes the connection, TCP will keep that connection open indefinitely (keepalives, if you enabled them, only confirm that the other end is still there). Each connection consumes some resources on the server (primarily send and receive buffers in memory). So if your application never tells TCP to close the connection, it hogs server resources. There are many malicious DoS tactics that go for the same effect.
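For anyone who hasn't seen it, a minimal sketch of turning keepalives on for a socket in Python. The helper name `make_keepalive_socket` is made up for illustration; the option itself is the standard `SO_KEEPALIVE`. Note that this only makes the kernel probe an idle peer; nothing here ever closes a connection whose peer keeps answering.

```python
import socket

def make_keepalive_socket():
    """Create a TCP socket with keepalive probing enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to probe the peer when the connection sits idle.
    # A live peer answers the probes, so the connection stays open.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock
```

Probe timing is controlled by OS-level settings that differ between platforms, so it is left at the defaults here.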
> There are many malicious DoS tactics that go for the same effect.
One of the most popular is a slowloris attack. It's particularly pernicious if it's distributed and coming from a botnet or something. https://en.wikipedia.org/wiki/Slowloris
Would a sane server not set an upper limit to the lifetime of these connections? In YouTube's case, something like... the length of the longest video? edit: or perhaps even the _exact_ length of the video being requested?
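A rough sketch of what such an upper limit could look like on the server side, assuming a plain blocking socket. The function name `read_with_deadline` and the `max_lifetime` parameter are invented for illustration, not taken from any real server. The deadline covers the whole connection rather than each read, so a client that trickles bytes still gets cut off.

```python
import socket
import time

def read_with_deadline(conn, max_lifetime):
    """Read from conn until the peer closes or max_lifetime seconds pass.

    Returns (bytes_received, reason). Hypothetical helper for illustration.
    """
    deadline = time.monotonic() + max_lifetime
    total = 0
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return total, "lifetime exceeded"
            # Never wait longer than the time left on the overall deadline.
            conn.settimeout(remaining)
            try:
                data = conn.recv(4096)
            except socket.timeout:
                return total, "lifetime exceeded"
            if not data:
                return total, "peer closed"
            total += len(data)
    finally:
        # Free the buffers whichever way the loop ended.
        conn.close()
```

Picking `max_lifetime` is the hard part: too short and slow but legitimate clients get dropped.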
Rather than using a single HTTP request to pull a video stream, your player makes (hopefully intelligent) HTTP requests as needed to pull pieces of potentially multiple streams of different bitrates.
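To make that concrete, here is a small sketch of splitting one file into byte-range requests. It assumes the player uses HTTP `Range` headers on a single file; many real players instead fetch separate segment files listed in a manifest, so treat this as one possible shape, and `range_headers` as a made-up helper.

```python
def range_headers(total_bytes, chunk_bytes):
    """Return the Range header values needed to fetch total_bytes in chunks."""
    headers = []
    for start in range(0, total_bytes, chunk_bytes):
        # HTTP byte ranges are inclusive at both ends.
        end = min(start + chunk_bytes, total_bytes) - 1
        headers.append(f"bytes={start}-{end}")
    return headers
```

Each header goes out on its own short request, so no single connection has to stay open for the length of the video.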
[0] http://labs.prx.org/2012/11/14/ios-6-0-devours-data-plans-ca...