
Must, Should, Don't Care: TCP Conformance in the Wild - gbrown_
https://arxiv.org/abs/2002.05400
======
Animats
This takes me back to the early days of TCP, when I used to do this. I had a
TCP implementation with a "bit bucket"; every packet that was either rejected
or didn't advance the connection (such as a duplicate) was logged. Then I'd
send out emails to other developers: "Your packet sent at T did not conform to
spec, per para..." Gradually, things got better.

The "urgent" option could be deprecated. The original purpose was for when a
TELNET connection was sending data to a slow printing terminal such as a
Teletype, and you wanted to cancel the output. The TCP connection was being
held back by flow control waiting for the printer on output, and might be held
back on input if you'd typed ahead but the server wouldn't take another line
until the output was done. The user would push BREAK, the TCP connection would
send an urgent message, bypassing any queued data, and the server would get
this, stop sending, and clear its output queue.
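
Concretely, in the classic BSD sockets API the urgent mechanism surfaces as
"out-of-band" data. A minimal sketch of the sending side (the function name
and byte value are just illustrative; error handling omitted):

    #include <sys/socket.h>

    /* Send one byte of "urgent" data, Telnet-interrupt style. MSG_OOB sets
     * the URG flag and urgent pointer on the outgoing segment, so the
     * receiver can be notified (e.g. via SIGURG) even while normal data
     * sits queued behind flow control. Assumes sock is a connected TCP
     * socket. */
    void send_break(int sock)
    {
        char mark = '!';               /* illustrative byte */
        send(sock, &mark, 1, MSG_OOB);
    }

The receiver can then pull that byte out ahead of the queued data with
recv(..., MSG_OOB).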

Almost nobody has needed that feature in this century. But there's probably
someone, somewhere, with some ancient embedded device like a cash register
printer, using it.

~~~
lkrubner
Mark Pilgrim once had a hilarious blog post in which he argued that all
software developers were either assholes or psychopaths. He also argued that
some people believed in angels, who followed the specs for the best of
reasons, but who were actually mythical. Your "Your packet sent at T did not
conform to spec, per para... Gradually, things got better" would make you an
angel, as you did it for the right reasons, though I imagine some of the
people you wrote to gave you a different classification.

Sadly, Mark Pilgrim committed info-suicide and erased everything he'd written
from the Web. (Off topic: it is sad how much disappears from the Web, and how
many weblogs shut down. Some of the best essays I've ever read were on
weblogs, now gone. I just revived a weblog I had run in 2005, and checking the
links I found that linkrot was running at around 50% to 60%.)

~~~
Animats
I was at an aerospace company. In aerospace, specs matter, because part A has
to plug into part B. You can remove the Pratt & Whitney engines from an
airliner and substitute Rolls-Royce engines. One is not emulating the other;
they both meet the same spec. DoD used to be big on having multiple sources,
all making interchangeable units to the same spec.

We used to say, "If A won't work with B, check the spec. If A doesn't match
the spec, A is broken. If B doesn't match the spec, B is broken. If you can't
tell, the spec is broken."

Much of this worked in the early TCP/IP days because DoD was funding most of
the players, both industrial and academic. They wanted interoperability.

There's much less of that today. Compatibility tends to involve reverse
engineering the dominant player's product.

------
Eikon
"Should" and "May" are such horrible words to encounter when implementing an
RFC.

Sometimes, implementing workarounds for endpoints that only implement the
"Must"s is harder than just implementing the RFC as if everything were
mandatory.

In my opinion, RFCs should strive to keep the optional parts of a
specification to a minimum and, maybe, put the rest in extensions.

~~~
linkdd
Funny how your comment also uses "should" and "may" :)

Anyway, those key words are used to allow flexibility in the implementation.
Remember that an RFC is specific to a single version of a single protocol.

Thus splitting everything into multiple RFCs would still imply complexity in
the implementation, just not the same kind of complexity (which versions to
use, and when?).

> Imperatives of the type defined in this memo must be used with care and
> sparingly. In particular, they MUST only be used where it is actually
> required for interoperation or to limit behavior which has potential for
> causing harm (e.g., limiting retransmisssions)

> For example, they must not be used to try to impose a particular method on
> implementors where the method is not required for interoperability.

Source:
[https://www.ietf.org/rfc/rfc2119.txt](https://www.ietf.org/rfc/rfc2119.txt)

------
iwalton3
This reminds me of "Hyrum's Law":

"With a sufficient number of users of an API, it does not matter what you
promise in the contract: all observable behaviors of your system will be
depended on by somebody."

People might implement work-arounds for bugs in an API that could break when
the underlying bug is fixed. Or software might implement the absolute bare
minimum for it to "work" with some specific implementation.

~~~
asdfasgasdgasdg
This is kind of the opposite of Hyrum's law, actually, in that the protocol
promises one behavior but what's actually universally supported in the wild is
only a subset of that promise.

~~~
IceDane
I think you're both right.

He's still right because my experience suggests that it is 100% likely that
somewhere, some system relies on a bug in another system's TCP stack, one
that causes a segfault, to shut down something critical.

~~~
ninju
[https://xkcd.com/1172/](https://xkcd.com/1172/)

(leave your mouse over the image and read the hover text :-))

~~~
codetrotter
I’m on mobile, what does the alt text say again?

~~~
detaro
[https://m.xkcd.com/1172/](https://m.xkcd.com/1172/) <- the mobile version has
a button for alt text

> _There are probably children out there holding down spacebar to stay warm in
> the winter! YOUR UPDATE MURDERS CHILDREN._

------
oconnor663
Is there a standardized set of TCP "test vectors" anywhere? Even an informal
de facto standard test like what SSLLabs is for TLS?

~~~
Dosenpfand
There is Google's packetdrill:
[https://github.com/google/packetdrill](https://github.com/google/packetdrill)
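
For a flavor of what its scripts look like: packetdrill interleaves system
calls with injected ("<") and expected (">") packets. A sketch modeled on the
examples in the project's README (the exact timings and TCP options here are
illustrative and depend on the stack under test):

    // Passive open: inject a SYN, expect the stack's SYN-ACK, then ACK it.
    0   socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
    +0  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    +0  bind(3, ..., ...) = 0
    +0  listen(3, 1) = 0

    +0  < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
    +0  > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
    +.1 < . 1:1(0) ack 1 win 257

    +0  accept(3, ..., ...) = 4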

------
rwmj
I can understand why someone implementing, say, a custom web server stack
might never have encountered a TCP packet with the URGent bit set. As I
understand it, it's only really used by telnet (does ssh use it?).

~~~
cryptonector
SSH does not use it.

------
peterwwillis
Name a standard with 3 or more implementations and at least one of them will
have violated the standard. Sometimes it's all three. Sometimes it's the
implementation of the people _who wrote the standard_.

This is inherent to all standards implemented by different products. There's
no implementation police going around checking on people and forcing them to
implement standards properly. And in the course of regular product
development, an implementation often starts to slowly violate the standard
without anyone noticing.

It sure does help when there's a test suite that you can validate your
implementation against, but I find them rare.

------
asdfasgasdgasdg
To me, this says a lot about expectations of conformance when many different
entities need to implement the same spec. As the size of the spec or the
number of entities grows, the probability of nonconformance approaches one.
Then you end up with two specs: the ivory tower one, and the usable, actually
implemented, lowest common denominator one.

This problem seems especially acute when there are multiple hops on the path
that are able to interpret (and fuck up) data flowing through that hop.
Especially when the owners of those hops don't have a direct economic
connection to the entities dealing with their failures.

This seems to argue in favor of something like QUIC. You use an extremely
simple protocol for the transport (basically, "try to send this data to that
address"). You hide the complex parts of the protocol in an encrypted channel
so that only the economically connected stakeholders have to conform. This
aligns incentives better than in the case of TCP and probably gives you better
outcomes in the long run.

~~~
kevingadd
The downside to QUIC is that all the other nodes in the chain lose the ability
to do useful things. Of course in the long run it's turned out that Google et
al do _not_ want anyone else doing useful stuff, but for quite a while it was
very useful to be able to have stuff in the middle like a caching proxy. Alas,
that era is over.

As an admin it's appealing to be able to have stuff like deep packet
inspection to give me info on network traffic, but the price we pay
collectively for that being possible is way too high, so it was inevitable
we'd lose it.

~~~
kevin_thibedeau
The upside to QUIC is that all the other nodes in the chain lose the ability
to do useful things.

This behavior is why we can't use SCTP everywhere and can't deploy new
protocols on top of IP.

~~~
touisteur
But I thought you could use SCTP over UDP (works, I tried). If QUIC is another
layer above SCTP it feels like wasted effort. SCTP is really interesting and
featureful. Multi-homing, parallel streams, datagram oriented...
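
For example, with the lksctp userland API (netinet/sctp.h, link with -lsctp)
you can put independent messages on separate streams of one association. A
sketch, assuming sd is an already-connected one-to-one SCTP socket (names
illustrative, error handling omitted):

    #include <netinet/sctp.h>
    #include <string.h>

    void send_on_streams(int sd)
    {
        const char *a = "request on stream 0";
        const char *b = "request on stream 1";

        /* Messages on distinct stream numbers are delivered independently,
         * so they don't head-of-line block each other the way bytes in a
         * single TCP stream do. Arg 8 is the stream number. */
        sctp_sendmsg(sd, a, strlen(a), NULL, 0, 0, 0, 0, 0, 0);
        sctp_sendmsg(sd, b, strlen(b), NULL, 0, 0, 0, 1, 0, 0);
    }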

~~~
pas
QUIC is basically SCTP over UDP. It has substreams (like HTTP/2, but it
eliminates the TCP level head-of-line blocking), there's a multipath extension
(proposal from 2018) - but maybe MP-TCP will land first and then who knows.

~~~
touisteur
Thanks for this answer but I still don't understand, sorry.

What's missing in SCTP (used everywhere in telephony/3G/4G/5G, right?) that we
had to reinvent another complete transport+session layer? It's already in the
kernel, and it seems to have had its own share of CVEs, so it should be
relatively trustworthy by now? It's also available in userland, especially
over UDP.

Performance is fine, from my benchmarks. Ease of use is (to me) far better
than TCP with all the socket hand-holding you need to do (if you don't use zmq
because you're tired of writing the same thing every time). The flexibility of
substreams is amazing. Multi-homing is great, since you can bond links at the
application level (giving higher decoupling than IP bonding).

I'm genuinely curious as to why they haven't just taken SCTP as is, and added
the extensions (?) they need.

~~~
tialaramex
> What's missing in SCTP (used everywhere in telephony/3-4-5g, right?) that we
> had to reinvent another complete transport+session layer ?

SCTP isn't encrypted. Because "Pervasive Monitoring Is an Attack" (RFC 7258),
new IETF protocols should be encrypted or explain why they can't be. HTTP/2,
for example, is in effect always encrypted (the document explains how one
could in theory build an unencrypted variant, but nobody implements that).

> I'm genuinely curious as to why they haven't just taken SCTP as is, and
> added the extensions (?) they need.

If you "just" drop the encrypted transport on top you either have to do all
the work to deliver features like substreams yet again, or else all the
metadata in the layer that's not encrypted is left unprotected and you'll
regret that.

~~~
touisteur
Ah, encryption, thanks.

I thought there was an SCTP+TLS RFC:
[https://tools.ietf.org/html/rfc3436](https://tools.ietf.org/html/rfc3436)

I don't know whether any userland lib supports this with SCTP-over-UDP,
though.

~~~
tialaramex
I explained the negative consequences of just layering TLS on top in my
comment already.

RFC3436 just makes pairs of SCTP streams (one in each direction) into a
transport like TCP that TLS will run on top of. Each such stream-pair then
does a TLS handshake.

That's _not_ what QUIC does. The entire QUIC connection, however many streams
and in whichever direction, is encrypted.

As a very simple example - suppose you spin up three parallel stream pairs to
fetch three separate documents over a hypothetical HTTP-over-TLS-over-SCTP.
With RFC 3436 you reveal to an on-path adversary that there are three streams,
and they get to see how much data travelled over each stream. But with QUIC
it's just an opaque pipe, the on-path adversary can see how much data was
transmitted each way but can't determine whether that's one document in one
stream, or ten documents in two streams or anything else.

~~~
touisteur
Sorry I did not mean to say you were wrong, I just meant to say 'it exists',
for reference. I understood your first explanation, didn't know whether this
RFC encrypted substream by substream or the whole session (I should have read
more before posting). Thanks for the clarification. And your patience.

------
dirtydroog
HTTP is in a similar sorry state. For example, it offers pipelining support,
but nobody can reliably use it because it probably won't work on some
forgotten-about machine somewhere on the path.
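
For reference, pipelining just means writing the next request before the
previous response has arrived; on the wire it's nothing more than back-to-back
requests (host illustrative):

    GET /styles.css HTTP/1.1
    Host: example.com

    GET /app.js HTTP/1.1
    Host: example.com

The server has to answer in request order, so one slow response stalls
everything queued behind it, and any middlebox on the path that mishandles
back-to-back requests breaks the whole connection, which is why browsers
eventually gave up on it.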

~~~
dagenix
Pipelining support in HTTP/1.1 is basically useless even outside of the
compatibility issues.

~~~
rkeene2
I built a server-side mouse-tracking-over-HTTP application that used HTTP/1.1
Pipelining to increase precision [0], back when web browsers supported
HTTP/1.1 Pipelining, that is.

[0] [https://github.com/rkeene/webdraw](https://github.com/rkeene/webdraw)

~~~
dagenix
That does sound pretty cool. But it sounds like a fairly special case where
the responses were likely uniformly small, which is unusual for a more general
use case. Given somewhat more recent technology, it sounds like a great use
case for something like WebSockets.

~~~
rkeene2
Yeah, WebSockets would also work here -- though I would need to invent an
ad-hoc protocol for making the resource requests, and it would be similar in
spirit to a mini HTTP/1.1 Pipelining.

The project needs some other work, since it looks like changes to Chrome have
broken the mouse tracking. [0]

[0] [http://webdraw.rkeene.org/](http://webdraw.rkeene.org/)

------
Diggsey
Maybe part of developing a standard should be developing and maintaining test
and validation suites for that standard...

~~~
slezyr
That's a good idea, and one that's already been implemented:

[https://www.khronos.org/vulkan/adopters/](https://www.khronos.org/vulkan/adopters/)

------
majkinetor
I thought this was some kind of meme: from "must" through "should" to "don't
care".

------
est
Could this be used to fingerprint IP origin devices?

------
ape4
It's going to be specific stacks that don't do the MUSTs.

