Why TCP Over TCP Is a Bad Idea (2001) (inka.de)
72 points by nikital on March 28, 2015 | 43 comments



TCP over TCP is a great idea for end user VPNs because 1) networks are generally pretty reliable and 2) NAT is everywhere. I regularly use an SSL VPN over crappy coffee shop wifi and tethered 3G connections. For the most part it works fine.


NAT is an argument against TCP, not for it, because when your NAT'ed IP address changes, TCP connections break. UDP will drift between NATs without skipping a beat.

And I'm very surprised you find crappy coffee shop WiFi and 3G to be pretty reliable - those are exactly the kinds of networks which have occasional sustained periods of high packet loss that wreak havoc with TCP.

Edit: the only downside of UDP VPNs is that stateful firewalls can have extremely short timeouts of UDP "connections" (e.g. 30 seconds!), which forces the VPN to constantly send keepalives, which kills battery life on mobile devices. TCP connections tend to be given much longer timeouts.
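
To make the keepalive cost concrete, here's a minimal sketch of what a UDP VPN client ends up doing just to keep a short-lived NAT/firewall mapping open (purely illustrative -- the endpoint, port and interval are made up, not taken from any real client):

    # Hold a UDP "connection" open through a stateful firewall by sending a
    # tiny datagram more often than the firewall's idle timeout.
    # Endpoint and interval below are hypothetical.
    import socket
    import time

    VPN_SERVER = ("vpn.example.com", 1194)   # hypothetical VPN endpoint
    KEEPALIVE_INTERVAL = 25                  # seconds; must beat the ~30s timeout

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        sock.sendto(b"\x00", VPN_SERVER)     # tiny keepalive datagram
        time.sleep(KEEPALIVE_INTERVAL)       # each wakeup costs battery on mobile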


Completely agree with your righteous rage about TCP over bad connections. I allowed myself a bitter chuckle at the GP's "networks are generally pretty reliable" claim.

Minor disagreement though:

"Edit: the only downside of UDP VPNs is that stateful firewalls can have extremely short timeouts of UDP "connections" (e.g. 30 seconds!)"

Plenty of mobile networks will time out inactive TCP connections in less than 30 seconds. TCP keepalives are an absolute requirement on long-running mobile connections... e.g. the GCM connections Google maintains on Android for notifications. A simple packet capture will show you the frequency of keepalives there, and it's almost always more often than once per 30 seconds.


> Plenty of mobile networks will time out inactive TCP connections in less than 30 seconds.

Good grief - that is awful, but sadly believable. Do you happen to know who does that? A couple years ago I tested AT&T's 3G and found that the TCP timeout was 30 minutes, versus 30 seconds for UDP. I'd love to know numbers for other carriers.

Edit: Found an interesting paper from 2011 that tested 73 cellular carriers worldwide and found only 4 with TCP timeouts less than 5 minutes. The majority had timeouts greater than 30 minutes, and 21 had a timeout in the 5-30 minute range. Some of my faith in humanity has been restored. http://www.cs.ucr.edu/~zhiyunq/pub/sigcomm11_netpiculet.pdf (see page 8, table 5)

Edit: A paper from 2012 which measured Verizon and Sprint at 30 minutes, and AT&T at 3 minutes (my tests are more recent, so perhaps AT&T wised up?) http://www.cs.umass.edu/~yungchih/publication/12_mtcp_4g_tec... (see page 11, table VI)


Timeouts that short violate the RFCs. Established TCP connections aren't supposed to be abandoned unless they've been idle for at least two hours and four minutes (RFC 5382 REQ-5), and even UDP timeouts normally have to be at least two minutes (RFC 4787 REQ-5).

If people are violating the RFCs then applications that detect it should probably start notifying users exactly why their battery life is suffering.


I see a lot of "should"s. When push comes to shove, though, a notification is not going to change the behavior of the network.

I have a lot of complaints about how some networks (especially mobile ones) deviate from RFCs and break specifications. A 30-second timeout is not even near the top of that list.


TCP over TCP for VPN is pretty useful in places where deep packet inspection blocks UDP by default, e.g. the GFW in China.


Do you have any idea why the GFW blocks UDP by default? I can imagine that for corporate networks, but as far as I know, uninspectable UDP streams only recently came into existence with Google's QUIC (which assumes pre-negotiated encryption keys are still valid).

Why block all streams if you can inspect them (or at least their handshake)?


Probably because they determined that most of it was VPN or other encrypted traffic, and that blocking it was easier than trying to inspect it.


While practically no official information exists publicly, this appears to be the reason. My gut tells me that the lack of structure in UDP makes it a little harder to inspect too.


Do you have any information you can share about this? A few years ago, I could reliably use OpenVPN over UDP, as long as I switched ports out frequently. Some time ago (I don't remember when), this ceased to be the case, and I switched to PPTP and, more recently, Shadowsocks.

What has been your experience with UDP over GFW?


UDP used to work (~3 years ago) but currently it's blocked wholesale. OpenVPN over TCP gets throttled and blocked thanks to DPI too; an obfuscation layer is required because OpenVPN traffic is identifiable by a fairly unique encryption fingerprint.

http://blog.strongvpn.asia/china-blocking-udp-ports/


Thanks. This is totally consistent with my experience, and it's good to know that it's not just me :)


I'm just saying in a previous life I used to spend a ton of time fighting with IPSec NAT traversal issues. With TCP encapsulation (e.g. SSL VPN), you don't have that problem. Most NAT firewalls do a good job dealing with TCP. Other protocols are more questionable.

When I'm using wifi at a coffee shop and start getting a bunch of packet loss, I will switch to a tethered 3G connection. When my SSL VPN reconnects, the VPN server hands me back the same IP address I had before. In some cases, my SSH sessions don't even drop.


IPSec is indeed hell with NATs, and an SSL VPN would be much better. But UDP is even better - most NATs do a good job with UDP too, and if done right, it's possible to switch Internet connections without the VPN having to reconnect.


The hell are you fellas smoking? IPsec NAT traversal has been a non-issue since it was standardized about 10 years ago.


DTLS is a standard protocol for TLS over UDP. It is used by existing commercial products, such as Cisco AnyConnect.


In my experience, many NATs just drop UDP packets altogether, but still allow TCP through.

Similarly, I find the Internet at my local Starbucks to be some of the fastest, most-reliable networks around; even faster than some local ISPs hooking up to my home.


Note that any NAT that drops UDP packets altogether will basically disable DNS and VOIP type applications or at least degrade the experience in serious ways. I haven't come across many such NATs recently.


Me neither. I have come across networks that filter everything but a few TCP ports (like 80 and 443), but that's a matter of draconian firewall policy rather than a NAT limitation.


I have come across networks that filter everything but a few TCP ports (like 80 and 443)

Another point scored for TCP in TCP with an SSL VPN. :-)


Only if your TCP VPN is listening on one of the allowed ports ;-)

(I do keep a TCP VPN running on port 443 ready to go for these situations, but UDP is always my first choice.)


Google is apparently paying for fiber connections to Starbucks, since AT&T DSL wasn't getting it done.


Did you read the linked article? (Or did anybody else who is replying?)

It's not about how TCP-over-TCP is somehow aesthetically displeasing or about how people should feel bad about doing it. It's about how TCP-over-TCP is a technically bad idea because stacked TCPs interact poorly. It's never a good idea. TCP-over-TCP is still a profoundly flawed protocol even if you don't happen to tickle its problematic cases.


> It's never a good idea.

I disagree with that profoundly:

I read the article. Yes, TCP over TCP adds unnecessary, performance-harming overhead in certain circumstances, because you have two unrelated congestion avoidance systems running at the same time. Even so, in many scenarios TCP over TCP is an excellent idea: it can provide you with many benefits, works great in practical terms, and has no readily available alternative which is better.

It doesn't solve the problem as neatly as theoretically possible, and carries some cruft. But picture hypothetical me at a Starbucks, about to open an ssh tunnel to a trusted connection, and hypothetical you telling me that's never a good idea. Okay then, what should I do instead?


TCP over UDP.


So, sitting at Starbucks with my laptop -- what do I do? I am not aware of an option to have SSH run over UDP, although I do know that some VPNs allow you to use UDP instead of TCP.

Unless there is a relatively simple way of getting an encrypted tunnel for my HTTP traffic using tools like ssh and netcat and other things I'm likely to already have installed, I disagree with the notion that it's never a good idea.


When you are running a SOCKS proxy through ssh, you are not doing TCP over TCP. We are talking about things like OpenVPN, which can do TCP over TCP, but that is generally a bad idea. Its default mode is TCP over UDP, as it should be.


> When you are running a SOCKS proxy through ssh, you are not doing TCP over TCP

Are you sure? SSH uses TCP, and encapsulates the web traffic which also uses TCP... in what way is that not TCP over TCP?


Here, read this: https://github.com/apenwarr/sshuttle#theory-of-operation

"TCP-over-TCP" and "TCP carried on something that happens to be TCP" turn out to be two very different things.


Thank you for the link :)


Ok, so how can we fix this?

Perhaps we could disable retransmission on either the upper or lower layer?


> For the most part it works fine.

Only for apps that don't try to utilize maximum throughput: Skype, YouTube, most web browsing, most mail use.

But those that do - like large file transfers over ftp/sftp or a very large email, for example - will cause the meltdown described in this article.

There are some TCP stacks that use RTT rather than packet loss as their congestion metric; those fare well under a TCP-over-TCP regime (but have other problems).


UDP (and datagrams in general) are not the only alternative to TCP-over-TCP. My sshuttle VPN uses TCP but avoids the TCP-over-TCP problem. https://github.com/apenwarr/sshuttle


Please explain how it avoids TCP-over-TCP. It doesn't seem to me that multiplexing multiple TCP connections over a single TCP link avoids the problem.


TCP congestion control fundamentally depends on packet loss to know when to slow down. If the outer TCP makes sure packet loss doesn't happen - because if it does, it retransmits - then the inner TCP won't know what's going on, and will send as fast as it can, creating a mess.

The trick with sshuttle is that you terminate the TCP sessions at the server, and just send the raw data over the multiplexed link; there are no inner TCP headers anymore. Then you add them back at the other end by reconstructing a new TCP session. This eliminates the second layer of TCP congestion control inside the tunnel.
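
Roughly, the idea looks like this (a conceptual sketch, not sshuttle's actual code -- the port number and framing format are invented for illustration):

    # Terminate the local TCP connection ourselves and forward only the payload
    # bytes over the single outer tunnel, so no inner TCP headers exist.
    import socket

    def forward_local_connection(tunnel_sock, channel_id):
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.bind(("127.0.0.1", 12300))   # hypothetical local redirect port
        listener.listen(1)
        conn, _ = listener.accept()           # the local TCP session ends here
        while True:
            data = conn.recv(4096)            # raw application bytes only
            if not data:
                break
            # Frame the bytes with a channel id and length; the single outer TCP
            # connection handles loss, ordering and congestion on its own.
            header = channel_id.to_bytes(4, "big") + len(data).to_bytes(4, "big")
            tunnel_sock.sendall(header + data)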


I'm not totally sure I understand correctly:

1) a TCP packet from source IP S comes in on side A of your tunnel

2) instead of acknowledging the packet, side A only sends it as data to side B over the other TCP connection (ssh)

3) the data may get lost, in which case, the TCP connection between A and B retransmits

4) side B gets the data, forwards the packet to the destination IP D

5) D acknowledges, sends a packet to S

S --------- A ================= B ------------ D

When there is a lot of packet loss at step 3, the delay before S gets the acknowledgement sent at step 5 increases, and S sees the congestion. This is unlike TCP-over-TCP, where A acknowledges packets from S as soon as it gets them.

Is that right?


Why wouldn't it? There aren't 2 TCP stacks. There aren't 2 timers as described in the article. I think the burden of proof is on you.


Somewhat off-topic but sshuttle is super super cool, thanks so much for writing it.


> Because the timeout is still less than the lower layer timeout, the upper layer will queue up more retransmissions faster than the lower layer can process them. This makes the upper layer connection stall very quickly and every retransmission just adds to the problem - an internal meltdown effect.

Intuitively, this is obvious at an organizational level. If your boss is constantly micro-managing each piece of work (segment), and his (or her) boss is doing the same, you are going to have a meltdown.


TCP does what it's designed for: a reliable, ordered stream connection with fixed endpoints. UDP is the other extreme of this permutation. I wonder why no one explores the spectrum in between?


All of those properties are binary things -- it's either reliable or not reliable, ordered or not ordered, etc. The notion of mostly ordered or mostly reliable we sort of get "for free" with UDP. So then your question becomes: what about some properties and not others? There are ways around the fixedness of endpoints so I'll just look at reliability and ordered-ness.

Unreliable but ordered: use UDP, enumerate your packets, and if you get a packet out of order, discard it. No cheaper than UDP, and I can't really see a potential benefit over it.
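
For what it's worth, the "unreliable but ordered" receiver really is about this trivial (a toy sketch; the port is arbitrary):

    # Receive sequence-numbered datagrams and drop anything that arrives late.
    import socket

    def deliver(payload):
        print(payload)                    # stand-in for the application callback

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 9000))          # arbitrary port for the sketch
    highest_seen = -1
    while True:
        packet, _ = sock.recvfrom(2048)
        seq = int.from_bytes(packet[:4], "big")
        if seq <= highest_seen:
            continue                      # out-of-order or duplicate: discard
        highest_seen = seq
        deliver(packet[4:])               # newer than anything seen: deliver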

Reliable but unordered: for this to make sense you have to impose a timeout (i.e. if you're willing to wait forever, UDP is "reliable" in the sense that you can never be sure you won't eventually receive that packet). So now you have ACKs and NACKs, and you essentially have TCP minus congestion control, where you don't bother to re-order packets based on seq #. I can't really see the benefit of this either.
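
And a sketch of the "reliable but unordered" sender half under those assumptions (again purely illustrative -- the peer address and timeout are made up, and there's deliberately no congestion control):

    # Number each datagram, keep it until ACKed, retransmit on timeout,
    # and never bother re-ordering anything on the receive side.
    import socket
    import time

    PEER = ("peer.example.com", 9001)     # hypothetical peer
    RETRANSMIT_AFTER = 0.5                # seconds

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    unacked = {}                          # seq -> (payload, last_send_time)

    def send(seq, payload):
        sock.sendto(seq.to_bytes(4, "big") + payload, PEER)
        unacked[seq] = (payload, time.time())

    def pump():
        # Drain ACKs (a 4-byte seq echoed back) and retransmit stale packets.
        try:
            while True:
                ack, _ = sock.recvfrom(64)
                unacked.pop(int.from_bytes(ack[:4], "big"), None)
        except BlockingIOError:
            pass
        now = time.time()
        for seq, (payload, sent_at) in list(unacked.items()):
            if now - sent_at > RETRANSMIT_AFTER:
                send(seq, payload)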

That said, there are many non-tcp-non-udp protocols out there. I just wouldn't say that the protocol-space is a spectrum with TCP on one side and UDP on the other -- there are many, many dimensions to look at.


Perhaps I'm missing something obvious, but isn't this trivial to solve by just having a deduping filter at the lower level? When the higher level keeps sending duplicate packets, just ignore them. Duplicate packets at this layer will always be useless, because your lower layer is already implementing reliability semantics. Then, when you get ACKs from the other side, translate those packets to match the sequence numbers that the higher level is expecting (i.e. the most recent sequence number that was sent out corresponding to a packet with those contents).



