
How both TCP and Ethernet checksums fail - jsnell
http://www.evanjones.ca/tcp-and-ethernet-checksums-fail.html
======
cperciva
_If the chance is purely random, we should expect approximately 1 in 2^16
(approximately 0.001%) of corrupt packets to not be detected. This seems
small, but on one Gigabit Ethernet connection, that could be as many as 15
packets per second._

Not really. You'd only get that many corrupt packets if you have a Gbps of
traffic flowing; but as soon as you start detectably corrupting 99.999% of
packets, TCP throughput is going to drop dramatically and so you'll have fewer
packets available to corrupt.
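A rough way to see the scale-back is the classic Mathis et al. TCP throughput bound, rate ≤ (MSS/RTT) · (1.22/√p). The MSS and RTT below are illustrative assumptions, not figures from this thread:

```python
from math import sqrt

# Mathis et al. bound: per-connection TCP throughput collapses as the
# square root of the packet loss rate p grows. Assumed values below.
MSS = 1460   # bytes (typical Ethernet MSS)
RTT = 0.05   # 50 ms round-trip time (assumed)

for loss in (0.0001, 0.001, 0.01, 0.1):
    rate_bps = (MSS * 8 / RTT) * (1.22 / sqrt(loss))
    print(f"loss {loss:>7.2%}: <= {rate_bps / 1e6:8.2f} Mbit/s per connection")
```

At 1% loss a single connection is already bounded to a few Mbit/s, which is why sustained high loss throttles the very traffic that would carry corrupt packets.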

~~~
clinta
With 1500-byte packets at 1 Gbps you're pushing 83,333 packets per second. If
1% of those (833) are corrupted and 1 in every 2^16 corrupted packets has a
valid CRC, then you have 1 corrupt packet with a valid CRC every 78 seconds.

Still not something to ignore, but not nearly as bad as the author indicates.
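Back-of-the-envelope, in Python (same assumptions as above: 1 Gbps link, 1500-byte packets, 1% corruption, 1-in-2^16 checksum escapes):

```python
# Sanity check of the rate arithmetic above.
LINK_BPS = 1_000_000_000      # 1 Gbps
PACKET_BYTES = 1500
CORRUPTION_RATE = 0.01        # 1% of packets corrupted (assumed)
ESCAPE_RATE = 1 / 2**16       # corrupt packets that pass the checksum

packets_per_sec = LINK_BPS / (PACKET_BYTES * 8)       # ~83,333
corrupt_per_sec = packets_per_sec * CORRUPTION_RATE   # ~833
undetected_per_sec = corrupt_per_sec * ESCAPE_RATE
seconds_per_undetected = 1 / undetected_per_sec       # ~78-79 s

print(f"{packets_per_sec:.0f} packets/s, {corrupt_per_sec:.0f} corrupt/s")
print(f"one undetected corrupt packet roughly every {seconds_per_undetected:.0f} s")
```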

~~~
cperciva
Right, I think 1% PER is around the worst case in terms of having lots of
errors while not causing the data rate to scale back dramatically in common
internet-facing applications. I was estimating around one undetected error per
minute; as you say, not something to ignore, but definitely not as bad as the
author suggested either.

~~~
jsnell
Actually, I don't think that's the case. While it's true that the theoretical
maximum speed of an individual connection would decrease rapidly as packet
loss increases, the aggregate data rate of the traffic going through a single
network element would not necessarily be affected all that much.

It's totally routine to see network-wide packet loss rates higher than 1%. The
most I can remember was >15% sustained for weeks, for a few Gbps of real-life
traffic in a mobile network. ("Real life" as in hundreds of thousands of
normal users, with the traffic coming directly from whatever servers they were
actually accessing.)

~~~
cperciva
Right, mobile networks are unusual. My point about "common internet-facing
applications" is that _most_ systems maintain cwnd values of at least 10
segments, or else people notice and complain about poor performance.

------
X-Istence
Cut through switching in 10 GbE applications however does not modify ANYTHING
about the packet as it gets sent along. It'll calculate that a packet was bad,
but at that point it's already too late to do anything about it because it's
already forwarded part or all of it onto the next wire segment.

This can be equally frustrating, as now you have to trace the entire path from
switch to switch and try and figure out what cable/fibre is bad, and you see
error counters increase on multiple interfaces.

~~~
sargun
It's not store-and-forward vs. cut-through. It's whether or not the switch
acts as a layer 3 device, or a layer 2 device. If it acts as a plain old layer
2 device, it can pass the packet, unmodified. As a layer 3 device, it modifies
the layer 2 headers, and the TTL. As a layer 3 device, it can still cut-
through.

Source: Broadcom documentation

~~~
X-Istence
I would call a layer 3 device a router, whether it is marketed as a switch or
not. Layer 2 devices are what I was referring to.

My experience is with the Nexus platform from Cisco, pure layer 2.

------
PaulHoule
One thing to note about CRCs is that they are good for error correction but
not good as hashing functions. I ran a bunch of "almost sequential"
identifiers through CRC32, and upon producing 1024 buckets out of it, found
half of the buckets had the lion's share of hits.
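A sketch of that kind of experiment (the ID format and bucket count are made up here; whether the skew reproduces depends heavily on the identifier format):

```python
import zlib
from collections import Counter

# Run "almost sequential" identifiers through CRC32 and bucket by the low
# 10 bits (1024 buckets), then measure how concentrated the hits are.
ids = [f"user-{i:08d}".encode() for i in range(100_000)]

crc_buckets = Counter(zlib.crc32(s) % 1024 for s in ids)

def top_half_share(buckets):
    # Fraction of all hits landing in the fullest half of the buckets;
    # a perfectly uniform hash would give roughly 0.5 here.
    counts = sorted(buckets.values(), reverse=True)
    return sum(counts[:512]) / sum(counts)

print(f"top half of CRC32 buckets holds {top_half_share(crc_buckets):.0%} of hits")
```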

~~~
drudru11
s/correction/detection/

------
joosters
_This seems small, but on one Gigabit Ethernet connection, that could be as
many as 15 packets per second._

Only if your network is corrupting _every_ packet!

Data corruption is a serious problem, but it doesn't help the discussion if
you wildly over-estimate its occurrence.

------
greglindahl
If you don't monitor bad TCP segment counts, you get what you deserve. It's
also smart to have your own end-to-end checksums on serialized objects.
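A minimal sketch of an application-level end-to-end checksum: prefix each serialized blob with its SHA-256 digest and verify on read. The function names and framing format here are made up; any strong hash (or a keyed MAC, if tampering is a concern) works:

```python
import hashlib
import struct

def wrap(payload: bytes) -> bytes:
    # Frame: 4-byte big-endian length | 32-byte SHA-256 digest | payload.
    digest = hashlib.sha256(payload).digest()
    return struct.pack("!I", len(payload)) + digest + payload

def unwrap(blob: bytes) -> bytes:
    (length,) = struct.unpack("!I", blob[:4])
    digest, payload = blob[4:36], blob[36:36 + length]
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch: payload corrupted in transit")
    return payload

record = wrap(b'{"user": 42}')
assert unwrap(record) == b'{"user": 42}'

corrupted = bytearray(record)
corrupted[-1] ^= 0x01  # flip one bit, as a faulty switch might
try:
    unwrap(bytes(corrupted))
except ValueError as e:
    print(e)
```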

------
wyldfire
> The root cause appears to have been a switch that was corrupting packets.
> ... The hypothesis is that occasionally the corrupt packets had valid TCP
> and Ethernet checksums. One "lucky" packet stored corrupt data in memcache.

Did the servers in question (both source and destination) have ECC-protected
memory? Hopefully that's a foregone conclusion, but that's another big
opportunity for errors.

------
sargun
A couple things:

1\. IPv6 completely removes the checksum you used to have in IPv4. So, now
there is just the Ethernet FCS, and the TCP checksums. You should use IPv6. If
you're not using IPv6, you're only hurting yourself, and the rest of the
internet.

2\. Just transport everything over SSL, please. With AES-NI, the overhead for
encrypting data is so tiny, that it's easier just to let someone else solve
this problem.

~~~
toast0
SSL still has significant overhead. Netflix did a bunch of work [1], and they
can still only push 10 Gbps out of a box that used to be able to push 40 Gbps
(quad-port 10G NIC). 1/4th the throughput seems like a lot of overhead; and
I'm a mere mortal, and can't put TLS into my kernel.

[1]
[https://people.freebsd.org/~rrs/asiabsd_2015_tls.pdf](https://people.freebsd.org/~rrs/asiabsd_2015_tls.pdf)

------
zeveb
I agree with the comment in the article about using cryptographic hashes
(where possible, of course): there are huge peace-of-mind advantages to just
not having to worry about a problem. Obviously, there can be situations in
which one must make a pragmatic engineering tradeoff between reliability and
performance, but in the main I think it's worth doing.

------
drudru11
I have been lazy about this recently. A buddy of mine who works on flash
storage told me to do the same with any on-disk data structures.

This post has pushed me to finally get checksums in.

~~~
X-Istence
That's why file systems that do end-to-end checksumming should be the norm.

~~~
1amzave
I think a reasonable argument could be made that if it's in the filesystem
it's not end-to-end.

~~~
TheDong
Mind explaining a little more what you mean?

ZFS checksumming has served me well.

If you mean "it's not checksumming data not represented in the filesystem,
like unpartitioned sectors", well, no one cares about that data, right?

~~~
1amzave
See other reply:
[https://news.ycombinator.com/item?id=10363281](https://news.ycombinator.com/item?id=10363281)

