

The little ssh that sometimes couldn't (2012) - jasonmp85
http://mina.naguib.ca/blog/2012/10/22/the-little-ssh-that-sometimes-couldnt.html?ref=goog

======
karlshea
Previous discussion:
[https://news.ycombinator.com/item?id=4709438](https://news.ycombinator.com/item?id=4709438)

------
foxhill
tl;dr: single bit flips at one hop on the route to a remote server.

the moral of this story is that the number of layers and abstractions between
our code (even our shell scripts - cron jobs in this case) and the network
layer is so large that even the most subtle bug in one of them is a _massive
pain_ to track down.

i am in awe of the tenacity of these bug hunters.

~~~
digi_owl
Another thing is that TCP does not have a facility for reporting what the
problem is.

So you basically have to dump signals down the wire and hope something comes
out of it.

------
gpvos
Is there some kind of TCP signal that the kernel could reasonably send back to
the originator if it detected packet corruption?

~~~
toast0
There are some ways to coax a retransmission (duplicate acking, maybe
selective ack?), but retransmission doesn't really help, since a given socket
always ran over the same route and kept getting corrupted. I guess an
explicit 'got bad data' message would have shown up better in tcpdump though.

~~~
derefr
Sounds like a session-layer/presentation-layer sort of thing. TLS or IPsec
might have such a protocol message.

~~~
gpvos
Because the TCP checksum was incorrect, the packet would never reach a higher
level such as TLS. TCP or ICMP would be the only options; maybe IPsec.
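
To make the checksum point concrete, here is a rough Python sketch of a 16-bit one's-complement checksum in the style of RFC 1071, which is the kind of sum TCP uses. It is an illustration only: the real TCP checksum also covers a pseudo-header and the TCP header, which are left out here, and the payload and the helper names (`internet_checksum`, `flip_bit`) are made up for the example. It shows that one flipped bit is enough to make the receiver's check fail, at which point the segment is dropped before anything above TCP sees it.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


# Hypothetical payload, standing in for the data carried by one segment.
payload = b"hypothetical ssh segment payload"
sent_checksum = internet_checksum(payload)

# Simulate a single bit flipped somewhere along the route.
corrupted = flip_bit(payload, 13)
segment_ok = internet_checksum(corrupted) == sent_checksum
print("receiver accepts segment:", segment_ok)
```

Nothing in this check tells the sender why the segment vanished; from the sender's side it looks like ordinary packet loss.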

