“Is it Still Possible to Extend TCP?” - http://www0.cs.ucl.ac.uk/staff/ucacmha/papers/extend-tcp.pdf
Minion - https://tools.ietf.org/html/draft-iyengar-minion-protocol-01
TCP Hollywood - http://ieeexplore.ieee.org/document/7497221/
Another problem is that TCP is a bytestream protocol. Apps that stream over TCP don't usually add packet-oriented framing and resync points, so if you lose a packet, the receiver will often need to discard quite a bit of data after the missing packet before it can start decoding again. Effectively this multiplies the loss rate. In the extreme, there's the potential for congestion collapse, where lots of packets are being delivered but none of them are useful, so they're all discarded at the receiver.
Edit: I should add - middleboxes often resegment the data stream, merging multiple packets or splitting large ones. So even if the sender added a header to each segment sent, those headers may not be at the beginning of the segment when it arrives. After a loss, you may not be able to reliably find the next header again.
By the way, that web server at UCL may well be the oldest on the Internet. It's probably the only server left proudly running CERN/3.0 on Sparc hardware since 1994.
What you can do is to have a good protocol that requires no interference from middleboxes but detects it if it happens, and then a less efficient legacy fallback protocol that basically looks as much as possible like HTTPS.
Then if you detect interference from a middlebox, show the user a message that says, "WARNING: MAN IN THE MIDDLE ATTACK DETECTED. Something is modifying connections on this network. This may compromise security and performance."
Then hopefully having multiple different apps show a message like that to every user on the network will get enough users complaining to fix the middlebox so that it stops breaking new things.
Usually this refers to various "security" "solutions" which attempt to do deep packet inspection and generally break various things, but it can also mean NAT or L2 bridges that filter on L3/L4 headers (for example, DOCSIS CMs and CMTSes are such middleboxes).
Once upon a time, I used to write network protocol sniffers. I basically simulated a partial TCP stack, in order to let higher-layer protocols be decoded even when traffic arrived out of order. If a packet was lost in the capture, it's basically the same thing as the "lying ack" in the article: the real endpoint would ack a frame the capture engine never saw.
This becomes really difficult to process, because TCP simulates a stream, and if you lose a packet and skip some bytes in the stream, the point where you pick up and continue may not be the beginning of a new message. So you also need heuristics for discarding received data until you reach a point where you can understand the stream again and continue processing.
As others have commented, UDP is likely a better choice here, as ordering semantics can be added, but on packet loss the next received packet can always be parsed. Or look at SCTP, which allows turning in-order delivery on and off, and I believe through an extension allows turning reliability on and off (but I'm not sure if that ever got implemented).
The irony of TCP is that in almost every use case you end up creating "frames" within the stream. Turtles all the way up I guess.
Edit: we can all collectively make fun of telco protocols that are HDLC-in-HDLC all the way down, but such things usually do framing only on the outermost layer.
One way to do this is to use a format where you can resync. For example, have special values that only occur in certain positions. You look for one of those values and can then enter a known state.
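Roughly, something like this (a Python sketch; the sync word and the header layout are invented for illustration, not from any real format):

    import struct

    SYNC = b"\xd6\xa5"   # hypothetical 2-byte sync word, chosen to be rare in payloads

    def parse_frames(buf):
        """Yield (frame_type, payload); after a gap, hunt for the sync word to recover."""
        while buf:
            if not buf.startswith(SYNC):
                i = buf.find(SYNC)           # lost sync: scan forward for the marker
                if i < 0:
                    return                   # no marker yet, wait for more data
                buf = buf[i:]
            if len(buf) < 5:                 # need sync(2) + type(1) + length(2)
                return
            ftype, length = struct.unpack_from("!BH", buf, 2)
            if len(buf) < 5 + length:
                return                       # frame not complete yet
            yield ftype, buf[5:5 + length]
            buf = buf[5 + length:]

A real format would also checksum the header, since the marker can appear inside payload data by chance.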
Another possibility is to be selective about what you will allow to be discarded. Suppose you have a framing structure so that you send a byte count followed by a blob of data (like video or audio data). Then you can use these anticipatory acks to express that you can live without that blob. For example, if your protocol says "and now here's 123456 bytes of video data", you can ack with a sequence number that reflects having received those 123456 bytes but no more. Obviously, this limits your ability to skip ahead and may not be as useful.
This can be trivially easy for some data formats, where you know the absolute or relative offset of everything. E.g. here is a chunk of data I don't care about: 32 bit length followed by that many bytes. Read the length, then hit the llseek system call (that would be supported on sockets just for this feature) to skip the stream position that many bytes ahead. Done.
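To make the bookkeeping concrete, the only arithmetic involved looks like this (a Python sketch; the 4-byte big-endian length prefix is an assumption, and no real stack supports skipping like this today):

    import struct

    SEQ_MOD = 2 ** 32   # TCP sequence numbers wrap at 32 bits

    def ack_to_skip_blob(current_ack, header_bytes):
        """Given the 4-byte length header of a blob we don't want,
        return the ack number that tells the sender not to bother with it."""
        (blob_len,) = struct.unpack("!I", header_bytes[:4])
        # Ack past the 4-byte header plus the blob itself, modulo the sequence space.
        return (current_ack + 4 + blob_len) % SEQ_MOD

    # e.g. "here's 123456 bytes of video data" starting at ack 1000:
    # ack_to_skip_blob(1000, struct.pack("!I", 123456)) -> 124460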
If it's due to congestion, you've just subverted the mechanism TCP uses to relieve it - and the more people that are using the deviant implementation, the bigger the problem becomes, which impacts everyone.
We'll stick with the VoIP example. Packets are going to be dropped independently of any information in the TCP headers. If there's too much load, some of them will disappear. If your client says "yeah I got that data" when it didn't actually get the data, it doesn't increase load any more -- if you're sending a real-time 256kbps audio stream, then you need to be able to send 256,000 bits every second regardless of whether or not the network is capable of that. By not retransmitting packets you don't care about, you're decreasing load.
Of course, if you push the information all the way up the stack to the application, you can do interesting things. You can notice that packets are being dropped and switch to a codec that sounds worse but sends fewer bits per second, and maybe you'll get a better quality call going on with fewer dropouts.
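Roughly the kind of policy I mean, as a toy Python sketch (the loss thresholds and the codec/bitrate ladder are invented):

    # Hypothetical codec ladder: (name, bits per second), best first.
    CODECS = [("wideband", 256_000), ("narrowband", 64_000), ("lowband", 16_000)]

    def pick_codec(packets_sent, packets_acked):
        """Step down the codec ladder as observed loss grows."""
        if packets_sent == 0:
            return CODECS[0]
        loss = 1.0 - packets_acked / packets_sent
        if loss < 0.02:
            return CODECS[0]     # under ~2% loss, keep the good codec
        elif loss < 0.10:
            return CODECS[1]
        return CODECS[2]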
Now that I think about it, I'm surprised this isn't how we deal with degrading things over slow connections in general. I would much prefer to only get the "mobile friendly" version of a page if I'm actually on a mobile connection. Right now, the heuristic seems to be "if screen < desktop, send mobile page". That of course is silly, because my home WiFi can happily pipe huge images into my phone faster than the web server can send them to me, while my 8-core laptop with a 4K screen tethered to my phone can't magically make 4G faster. Interesting interesting.
Suppose there is congestion, and that if the sender(s) don't slow down, it will just get worse. Well, they don't find out about the congestion if the receivers lie (and congestion is asymmetric, so that the ACKs get through just fine).
That's not good.
Now, one way to deal with this is to lie, but only for a bit. After a while of not receiving anything, the receiver should stop sending ACKs and the sender should notice the congestion. That might help. But the underlying problem is real: lying ACKs -> failure to detect congestion -> worse congestion.
Also, mobile friendly versions of sites are at least as much about the user interface and rendering speed as about reducing bandwidth usage.
I would like to see a proof of concept before I fully buy into the author's claims. It's difficult to tell what this might do to proxies or how all the various router firmwares on the Net might handle it. I could see a hop along the way having trouble with the receiver claiming it received packets it could not have. For that matter, it's possible for the receiver to claim to have received a packet that the sender had not yet generated. The sending TCP stack may very well consider this an error.
I happen to have some experience with this case (receiving an ack of an unsent packet).
Linux since 2009 will silently drop acks of unsent data. FreeBSD follows the RFC and will send an ack with the current sequence and ack numbers to try to 'resync'. As long as this modified stack doesn't respond to that ack with another ack, it would probably be OK. There's a reviewed and accepted patch for FreeBSD to rate-limit the acks it sends in this case, but it doesn't seem to have been committed.
 https://github.com/torvalds/linux/commit/96e0bf4b5193d0d97d1... (although the comment says this is consistent with the RFC, it actually isn't)
The article deals with this head on: that's the essence of the ambiguity of "I have received up to X" versus "I am not interested in bytes up to X". The second intent is consistent with not having received bytes up to X, which is consistent with them not having been sent yet at all.
The anti-congestion-control situation is when the receiver is in fact interested in getting all the bytes, and so "I am not interested in bytes up to X" is of course a lie. But so is "I have received up to X".
I think you missed the point. The article makes no mention of how existing implementations handle this case. It seems the author had only theorized based on his knowledge of the TCP protocol.
Other behaviors, like crashing or messing up the stream, of course, spoil things.
There is a problem if the receiver sends only an ahead-of-sequence ack, without acknowledging the data received before it, and the sender drops that ack. The receiver must acknowledge everything actually received, and respond properly to window probes, to ensure forward progress.
Bingo; that's precisely the context in which I first saw the technique of 'fake ACKs' described: as a congestion-control-defeating mechanism which provokes senders into sending faster.
I used to argue that you want to use fair queuing and drop the newest packet in a stream, so all the old packets get delivered. But Random Early Drop, which is trivial to implement, caught on in routers. That means you lose random packets from the stream, and they have to be retransmitted.
There's nothing awful about acknowledging a TCP sequence number for a valid packet received even if some previous packets are missing. You know the missing packets were sent, or you wouldn't have the later packet. If you don't need them, why ask for retransmission? A receiving TCP doing this should notice that packets are being lost and cut down its advertised window a bit, so the sender will slow down.
This is purely a receive-side thing. It shouldn't bother upstream middleboxes. But, as someone pointed out, about 25% of middleboxes don't like it.
The idea is to send ACKs ahead of time to provoke the sender into transmitting faster, defeating some of the congestion control mechanisms in TCP.
The network has to be reliable to make it work (the pipe has to buffer all the excess spew), and the sending stack has to ignore situations when it gets a premature ACK, ahead of the sequence number it has actually sent.
So it depends more on whether the use case is appropriate than on the technical implementation.
In a true best effort streaming system, the sender wouldn't retransmit. With this scheme, the sender might retransmit, sometimes, depending on whether certain packets get through and allow certain acks to get sent.
If the data has been sent but lost, the sender will accept the aggressive ack. If it hasn't been sent yet, it will accept the less aggressive ack.
For that matter, if you have some way of knowing the expected transmission rate (even approximately as long as you can resync to reality), you can blindly send a stream of periodic acks with steadily increasing sequence numbers.
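Something like this, say (a Python sketch; the "resync to reality" part is just re-baselining from whatever packet last arrived):

    import time

    SEQ_MOD = 2 ** 32

    class BlindAcker:
        """Emit ack numbers that advance at the stream's nominal rate,
        resyncing the estimate whenever a real packet is observed."""

        def __init__(self, initial_ack, bits_per_second):
            self.base_ack = initial_ack
            self.base_time = time.monotonic()
            self.rate = bits_per_second / 8.0     # bytes per second

        def observe(self, seq_of_last_byte_seen):
            # Resync to reality: restart the clock from the latest byte actually seen.
            self.base_ack = seq_of_last_byte_seen
            self.base_time = time.monotonic()

        def next_ack(self):
            elapsed = time.monotonic() - self.base_time
            return (self.base_ack + int(elapsed * self.rate)) % SEQ_MOD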
Subverting TCP leads to so many problems around congestion (which you can no longer manage, because which TCP streams are being non-compliant?) that it just should not be done.
Then you get 6,7. Is 4 still out there? You've got 3 packets in your pocket, waiting for 4. That adds up too.
TCP gives you some idea of what packets you SHOULD have received, so you can respond better. UDP doesn't have any windowing etc so you have no idea.
Generally if you're hitting cases where TCP is causing you grief and you need to reach for UDP you've already got enough context to understand your congestion problems/etc.
We've been doing this in game-dev for decades, ditto the voip space so it's not like you don't have a wealth of knowledge to draw from if you're really stumped.
Most folks use some UDP-based protocol package instead of reinventing the wheel. It's not rocket science, but it isn't trivial. Defining your own packets to do all the flow stuff is just work, like any other programming task.
I've built variations of UDP-based protocols 4 or 5 different times over my career. I'm literally in the middle of this right now with the radio framing protocol I've been developing. I really think you're making it out to be much harder than it is.
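For a sense of scale, the per-packet bookkeeping is roughly this much (a Python sketch; the exact field layout is invented, but seq + ack + an ack bitfield is the usual shape in game networking):

    import struct

    # Hypothetical 12-byte header: sequence number, latest ack seen,
    # and a bitmask covering the 32 packets before that ack.
    def pack_header(seq, ack, ack_bits):
        return struct.pack("!III", seq, ack, ack_bits)

    def unpack_header(datagram):
        seq, ack, ack_bits = struct.unpack_from("!III", datagram)
        return seq, ack, ack_bits, datagram[12:]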
It focuses narrowly on a congestion control protocol, and is intended to be combined with whichever datagram-based protocol you have lying around that might be suffering from congestion issues.
Especially when compared with creating and using a user-space IP protocol as was done here, or adding a new one into the kernel.
Don't underestimate how often packets are received out of order. There's even a consumer DSL modem that swaps every odd UDP packet with every even one - I had to compensate for this in a VoIP product. Using TCP in this bastardized way would cure that. That said, I tend to agree it's a poor idea to use TCP in this way. The famous book on IP used to list 8 protocol possibilities (only 2 commonly survive today, UDP and TCP), of which streaming with packet reordering was a valid combination (without acking/retransmitting). Don't know what it was called, but that's what's being attempted here.
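The compensation is just a small reorder buffer keyed on a sequence number; a rough Python sketch, with an invented 4-byte sequence prefix and an arbitrary window (wraparound handling omitted to keep it short):

    import struct

    class ReorderBuffer:
        """Hold out-of-order datagrams briefly and release them in sequence."""

        def __init__(self, max_held=4):
            self.next_seq = None
            self.held = {}                  # seq -> payload
            self.max_held = max_held

        def push(self, datagram):
            (seq,) = struct.unpack_from("!I", datagram)   # assume a 4-byte seq prefix
            if self.next_seq is None:
                self.next_seq = seq         # note: starting mid-swap can cost one packet
            if seq < self.next_seq:
                return []                   # too late: dropping is the usual jitter-buffer policy
            self.held[seq] = datagram[4:]
            if len(self.held) > self.max_held:
                # Too big a gap to be mere reordering: assume loss and skip ahead.
                self.next_seq = min(self.held)
            out = []
            while self.next_seq in self.held:
                out.append(self.held.pop(self.next_seq))
                self.next_seq += 1
            return out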
But reordering can also be fixed simply by examining the packets received in a burst and putting them back in order.
So it's six of one and half a dozen of the other, I guess. Sorry for coming off so abrupt.
But probably not.
From a stateful firewall's point of view both UDP and TCP have state.
That said, I've never seen a router that didn't allow UDP packets to flow back to the origin client.
DNS forwarders like dnsmasq are a relatively recent inclusion in home routers. Sure, they've been there for 10 years or so, but they weren't there for the 5+ years before that. Before Linux took over the embedded OS on home routers, the DHCP servers just passed the DNS configuration that the WAN port got from the ISP, and you can still do that now if you want. That's why nslookup.exe and dig still work on your workstation when you specify an external DNS server instead of the one your DHCP server on your home router gives you.
> That said, I've never seen a router that didn't allow UDP packets to flow back to the origin client.
Which is the point I was making.
> If it’s used that way, it makes TCP a reliable, in-order protocol.
the key here is "in-order".
The question is whether you want the other mechanisms of TCP: handshaking, (limited) retransmission, flow control...
In the last 10 years of streaming RTP over the Internet, the vast majority of failures are bursty. Reordering is more rare than you'd expect.
Looking at one sample from India to Europe over a period of 6 weeks, my 30 Mbit RTP stream was fine 99.998% of the time with FEC. For the rest, that's why I dual-stream: either timeshift (send the same packet again 100 ms later - as I said, most outages are time-based), or dual path (although if the packets traverse the same router en route there are issues), or even both.
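The timeshift variant is simple enough to sketch in Python (the address is a placeholder, and the duplicate filter is unbounded here just to keep it short):

    import socket, threading, time

    DELAY = 0.100                      # resend 100 ms later
    DEST = ("receiver.example", 5004)  # placeholder address

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send_timeshifted(packet):
        """Send the RTP packet now, and again after DELAY seconds."""
        sock.sendto(packet, DEST)
        threading.Timer(DELAY, sock.sendto, args=(packet, DEST)).start()

    # Receiver side: the RTP sequence number (bytes 2-3 of the header)
    # makes the timeshifted duplicate easy to discard.
    seen = set()                       # real code would bound this

    def deliver_once(packet):
        seq = int.from_bytes(packet[2:4], "big")
        if seq in seen:
            return None                # duplicate from the delayed copy
        seen.add(seq)
        return packet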