
Why TCP Over TCP Is a Bad Idea (2001) - nikital
http://sites.inka.de/~W1011/devel/tcp-tcp.html
======
minimax
TCP over TCP is a great idea for end user VPNs because 1) networks are
generally pretty reliable and 2) NAT is _everywhere_. I regularly use an SSL
VPN over crappy coffee shop wifi and tethered 3G connections. For the most
part it works fine.

~~~
agwa
NAT is an argument against TCP, not for it, because when your NAT'ed IP
address changes, TCP connections break. UDP will drift between NATs without
skipping a beat.

And I'm very surprised you find crappy coffee shop wifi and 3G to be pretty
reliable - those are exactly the kinds of networks that have occasional
sustained periods of high packet loss, which wreak havoc with TCP.

Edit: the only downside of UDP VPNs is that stateful firewalls can have
extremely short timeouts of UDP "connections" (e.g. 30 seconds!), which forces
the VPN to constantly send keepalives, which kills battery life on mobile
devices. TCP connections tend to be given much longer timeouts.

~~~
chetanahuja
Completely agree with your righteous rage about TCP over bad connections. I
allowed myself a bitter chuckle at the GP's _"networks are generally pretty
reliable"_ thing.

Minor disagreement though:

 _" Edit: the only downside of UDP VPNs is that stateful firewalls can have
extremely short timeouts of UDP "connections" (e.g. 30 seconds!)"_

Plenty of mobile networks will time out inactive TCP connections in less than
30 seconds. TCP keepalives are an absolute requirement on long-running mobile
connections... e.g. the GCM connections Google maintains on Android for
notifications. A simple packet capture will show you the frequency of
keepalives there... and it's almost always more frequent than once every 30
seconds.
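
A rough sketch of turning on aggressive TCP keepalives from userspace (Linux
socket options; the intervals and address are placeholders, not what GCM
actually uses):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 25)    # idle seconds before the first probe
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 25)   # seconds between probes
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)      # failed probes before giving up
    s.connect(("192.0.2.1", 5228))     # placeholder long-lived push connection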

~~~
zrm
Timeouts that short violate RFCs. Established TCP connections can't be
abandoned unless they've been idle for at least two hours and four minutes
(RFC5382 REQ-5), and even UDP timeouts normally have to be at least two
minutes (RFC4787 REQ-5).

If people are violating the RFCs then applications that detect it should
probably start notifying users exactly why their battery life is suffering.

~~~
chias
I see a lot of "should"s. When push comes to shove, though, a notification is
not going to change the behavior of the network.

I have a lot of complaints about how some networks (especially mobile ones)
deviate from RFCs and break specifications. A 30-second timeout is nowhere
near the top of the list.

------
apenwarr
UDP (and datagrams in general) are not the only alternative to TCP-over-TCP.
My sshuttle VPN uses TCP but avoids the TCP-over-TCP problem.
[https://github.com/apenwarr/sshuttle](https://github.com/apenwarr/sshuttle)

~~~
seunosewa
Please explain how it avoids TCP-over-TCP. It doesn't seem to me that
multiplexing multiple TCP connections over a single TCP link avoids the
problem.

~~~
apenwarr
TCP congestion control fundamentally depends on packet loss to know when to
slow down. If the outer TCP makes sure packet loss doesn't happen - because if
it does, it retransmits - then the inner TCP won't know what's going on, and
will send as fast as it can, creating a mess.

The trick with sshuttle is that you terminate the TCP sessions at the server,
and just send the raw data over the multiplexed link; there are no inner TCP
headers anymore. Then you add them back at the other end by reconstructing a
_new_ TCP session. This eliminates the second layer of TCP congestion control
inside the tunnel.
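
A rough sketch of that terminate-and-relay idea (not sshuttle's actual code;
the hosts, ports, and the single plain stream standing in for the real
multiplexer are all placeholders):

    # The local end fully terminates the client's TCP session, so its
    # retransmissions and congestion control never enter the tunnel; only raw
    # payload bytes are forwarded, and the far end opens a brand-new TCP
    # connection to the real destination.
    import socket
    import threading

    def relay(src, dst):
        # Copy payload bytes one way; no inner TCP headers cross the tunnel.
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)

    # Local side: redirected client connections terminate here.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 12300))   # illustrative local redirect port
    listener.listen(5)

    # The single TCP link to the far end (simplified; sshuttle runs this over ssh).
    tunnel = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tunnel.connect(("192.0.2.1", 2222))   # placeholder tunnel server

    client, _ = listener.accept()
    threading.Thread(target=relay, args=(client, tunnel), daemon=True).start()
    relay(tunnel, client)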

~~~
throwaway41597
I'm not totally sure I understand correctly:

1) a TCP packet from source IP S comes in on side A of your tunnel

2) instead of acknowledging the packet, side A only sends it as data to side B
over the other TCP connection (ssh)

3) the data may get lost, in which case, the TCP connection between A and B
retransmits

4) side B gets the data, forwards the packet to the destination IP D

5) D acknowledges, sends a packet to S

S --------- A ================= B ------------ D

When there is a lot of packet loss at step 3, the delay before S gets the
acknowledgement sent at step 5 increases, and S sees the congestion. Unlike
TCP-over-TCP, where A acknowledges packets from S as soon as it gets them.

Is that right?

------
parennoob
> Because the timeout is still less than the lower layer timeout, the upper
> layer will queue up more retransmissions faster than the lower layer can
> process them. This makes the upper layer connection stall very quickly and
> every retransmission just adds to the problem - an internal meltdown effect.

Intuitively, this is obvious at an organizational level. If your boss is
constantly micro-managing each piece of work (segment), and _his_ (or her)
boss is doing the same, you are going to have a meltdown.
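
A toy calculation of that stacking effect, with all numbers made up:

    # While the lower-layer TCP spends a stretch retransmitting one lost
    # packet, the upper layer's shorter timer keeps firing and queues fresh
    # copies of the same segment behind it.
    upper_rto = 1.0      # upper-layer retransmission timeout, seconds (assumed)
    lower_stall = 16.0   # lower layer busy recovering for this long (assumed)

    queued, t = 0, 0.0
    while t < lower_stall:
        queued += 1          # upper layer queues yet another retransmission
        t += upper_rto
        upper_rto *= 2       # exponential backoff, but it started far too low

    print(queued, "copies of one segment queued during a single lower-layer stall")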

------
usaif
TCP does what it's designed for: a reliable, ordered stream connection with
fixed endpoints. UDP is the opposite extreme. I wonder why no one explores the
spectrum in between.

~~~
chias
All of those properties are binary things -- it's either reliable or not
reliable, ordered or not ordered, etc. The notion of mostly ordered or mostly
reliable we sort of get "for free" with UDP. So then your question becomes:
what about some properties and not others? There are ways around the fixedness
of endpoints so I'll just look at reliability and ordered-ness.

Unreliable but ordered: use UDP, enumerate your packets, and if you get a
packet out of order, discard it. No cheaper than UDP, and I can't really see a
potential benefit over it.
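
A minimal sketch of that receiver, assuming a made-up 4-byte sequence-number
prefix on every datagram and an arbitrary port:

    import socket
    import struct

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 9999))

    last_seq = -1
    while True:
        packet, _ = sock.recvfrom(65535)
        if len(packet) < 4:
            continue
        seq = struct.unpack("!I", packet[:4])[0]
        if seq <= last_seq:
            continue               # late or duplicate packet: discard it
        last_seq = seq
        payload = packet[4:]
        # hand payload to the application in order, gaps allowed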

Reliable but unordered: for this to make sense you have to impose a timeout
(i.e. if you're willing to wait forever, UDP is reliable in that you can never
be sure you won't ever receive that packet). So now you have ACKs and NACKs
and you essentially have TCP minus congestion control, where you don't
bother to re-order packets based on seq #. I can't really see the benefit of
this either.

That said, there are _many_ non-TCP, non-UDP protocols out there. I just
wouldn't say that the protocol-space is a spectrum with TCP on one side and
UDP on the other -- there are many, many dimensions to look at.

------
crazy2be
Perhaps I'm missing something obvious, but isn't this trivial to solve by just
having a deduping filter at the lower level? When the higher level keeps
sending duplicate packets, just ignore them. Duplicate packets at this
layer will always be useless, because your lower layer is already implementing
reliability semantics. Then, when you get ACKs from the other side, translate
those packets to match the sequence numbers that the higher level is expecting
(i.e. the most recent sequence number that was sent out corresponding to a
packet with those contents).
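
A purely illustrative sketch of that filter, keyed by sequence number plus a
payload hash (the function name and framing are made up):

    import hashlib

    seen = set()

    def should_forward(inner_seq, payload):
        key = (inner_seq, hashlib.sha1(payload).digest())
        if key in seen:
            return False       # duplicate retransmission from the inner TCP: drop it
        seen.add(key)
        return True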

