
Why do UDP packets get dropped? - sebg
http://jvns.ca/blog/2016/08/24/find-out-where-youre-dropping-packets/
======
ptero
I was excited when I saw the title -- UDP is the workhorse for data transfers
on many projects I work on. The info, though, was very basic.

TL;DR version: packets get dropped when some buffer, at your local computer
or at a router between here and there, gets full.

I do not want to sound too critical -- the info is good for someone who never
heard of UDP.

But I was hoping for more information: more substantiation on why full buffers
are the main source of UDP drops (e.g., can smart throttling take some or most
of the blame? given the need to drop a packet, dropping a UDP packet is
usually less painful than dropping TCP, etc.), any quantitative numbers on
sample networks/hardware, and so on.

~~~
richardwhiuk
Dropping TCP is normally preferable, I'd have thought, as it'll cause the TCP
socket to back off. Dropping UDP is less likely to lead to such behaviour.

~~~
snuxoll
It's more because UDP is designed to be an unreliable protocol: packet losses
are expected to happen and applications will have to deal with it. Meanwhile,
dropping a TCP packet _may_ cause the congestion control algorithm to back
off, but you're guaranteed to waste more traffic while those TCP sessions
figure out WTF happened and retransmit.

~~~
theandrewbailey
UDP isn't designed to be unreliable. It inherits the reliability of what it's
built and run on, and doesn't compensate for it.

~~~
justicezyx
UDP should be considered in the context of the IP network and TCP. In this
canonical and typical context, it is designed as the unreliable alternative to
TCP, as an IP transport-layer protocol.

~~~
falcolas
I recall reading that UDP was left as such a thin wrapper over packets so that
when another protocol like TCP was found to be the wrong solution for a
problem, you could easily build your own reliability and congestion protocols
over the top of UDP.

Of course, I think that reference was in a book, so I can't find it now.

------
tptacek
All of these posts are fantastic. I can only presume at some point they're
going to be collected and turned into an extremely good book on Unix
performance and diagnostics investigation; it'd be a worthy sibling to the
older great Unix books like Panic!.

The infectious tone is part of it, but I think the bigger win is just how
little Evans cares about what the reader/writer is "supposed" to know. Each
post goes from a standing start with almost nothing presupposed all the way
into the weedy details.

Regarding this post: it's getting close to a pretty big idea. Once you grok
why packets get lost† --- congestion --- you're pretty close to understanding
the big idea behind TCP, and how the Internet works and figures out how fast
to send things, even though we're on crappy wi-fi connections connected to
even crappier DSL lines connected to OC192 backbones.

† _Fun additional reason we found when trying to build a next-gen "routed" IRC
in the 90s: when routes change!_

------
rubbsdecvik
This is not an original opinion, but I love her style of writing. I can feel
the pure unadulterated joy at learning these things. Sometimes I learn along
with her, sometimes I am seeing an old subject through new eyes. Always, it's
worth the read.

------
chair6
Her category 'lost in transit' is perhaps the biggest cause of UDP drops. No
matter how big your buffers on the send/receive side, if an intermediary
carrier decides UDP is not important or a D/DoS threat to their network, bye
bye packet... or in some cases of rate-limiting, see you in a while perhaps.

A few rabbit-holes to dive down:

[https://www.us-cert.gov/ncas/alerts/TA14-017A](https://www.us-cert.gov/ncas/alerts/TA14-017A)

[http://www.christian-rossow.de/articles/Amplification_DDoS.p...](http://www.christian-rossow.de/articles/Amplification_DDoS.php)

[https://tools.ietf.org/id/draft-byrne-opsec-udp-advisory-00....](https://tools.ietf.org/id/draft-byrne-opsec-udp-advisory-00.txt)

[https://datatracker.ietf.org/wg/taps/charter/](https://datatracker.ietf.org/wg/taps/charter/)

~~~
JoeAltmaier
The app can be at fault too. Some traffic is bursty (video frames, large
files, images) and some client stacks have tragically small IP buffers (as
little as 128K in some OSs). Apps must be prepared to read to exhaustion
without pausing to process in those situations, then process once the buffer-
storm is over.
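A rough sketch of that drain-then-process pattern (Python; the `drain` helper and demo socket setup are illustrative, not from any particular stack):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))   # demo socket; a real app binds a known port
sock.setblocking(False)

def drain(sock, backlog):
    """Pull every queued datagram out of the kernel buffer without
    pausing to process; the app works through `backlog` once the
    burst is over."""
    while True:
        try:
            data, addr = sock.recvfrom(65535)
        except BlockingIOError:
            return  # kernel buffer is empty
        backlog.append((data, addr))
```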

~~~
advisedwang
Yep, I once saw a case with syslog over UDP with thousands of servers
reporting to one central logging point. All the servers ran a command at the
same time, and logged a message within ms of each other. The flood of UDP
messages caused completely predictable input buffer overflow.

~~~
jmcnulty
Otherwise called the thundering herd effect. The way to address this is to
introduce a random sleep on the remote servers before the command executes, to
spread the load.
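The spreading trick is tiny in code; a sketch (the function name and delay cap are illustrative):

```python
import random
import time

def run_with_jitter(task, max_delay_s=30.0):
    """Sleep a random interval first, so thousands of hosts triggered
    at the same instant don't all report within the same millisecond."""
    time.sleep(random.uniform(0.0, max_delay_s))
    task()
```

Cron users often get the same effect with a `sleep $((RANDOM % 30))` prefixed to the command.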

------
jws
Large UDP datagrams are much more likely to be dropped than small ones. A
short datagram will fit in a single IP packet. A maximally sized datagram may
take about 40. If you have a 1% packet loss rate, then the short datagrams get
lost 1% of the time, but the huge ones get lost 33% of the time
(1 - 0.99^40 ≈ 0.33). With a 10% packet loss you get almost 99% loss of
maximally sized UDP datagrams.
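The arithmetic, for checking against your own loss rates and fragment counts:

```python
def datagram_loss(p, n):
    """Chance that a datagram fragmented into n IP packets is lost,
    assuming an independent per-packet loss rate p: losing any one
    fragment loses the whole datagram."""
    return 1 - (1 - p) ** n

print(round(datagram_loss(0.01, 40), 2))  # 0.33
print(round(datagram_loss(0.10, 40), 2))  # 0.99
```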

The extra pain is that even though the huge datagram is lost, almost all of
its data is transmitted. So you have a congested line and you are hammering it
extra hard by resending the same data over and over.

Moral: unless you can guarantee low IP packet loss rates across the entire
route, be very careful about large UDP packets.

Counter moral: I built a very nice VPN over TCP solution for a customer who
insisted 10% packet drop on their network was fine. Completely fixed their
large UDP packet based legacy system… which I had designed a decade before.

~~~
AstralStorm
Counter-counter moral: detect jumbo frames and use them, except maybe still
not over the general Internet.

Modern OSes have a thing called black-hole detection, it is a good idea to
have something similar in your protocol running on top of UDP.

------
StillBored
This applies to _ALL_ IP packets, not just UDP. The lack of buffer credits,
etc. on IP means that flow control is a function of the higher levels. Your
TCP packets are also getting dropped for all the same reasons; it's just that
TCP backs off and retransmits, so you don't see it as anything other than a
slowdown.

------
captainmuon
If so many packets get lost in buffer overflows, can you somehow make the
system wait for the buffer to empty?

If sending a packet synchronously, the system call would just hang until the
buffers have drained enough.

How would you do it while receiving? Does the outside network just bang bits
through the cable like a radio station? Then you'd have no choice but to save
everything as fast as it comes in, lest you lose packets. Or is there, deep
down on the lowest levels of the stack, actually some kind of request/response
going on, even for UDP? For example at the level of Ethernet frames, or even
individual bits? Like "here are some bytes. got them? - yes. - here are some
more. got them? - (waiiiiting....) yes." Then you could just let the next
router in the system wait while you drain your input buffer.

Even if there is no request/response going on, you could still view your
incoming as the next router's outgoing. Configure the router to wait to send
until your client has drained the router's output buffer enough. (That would
require the router to know how much data you can take.)

~~~
krallja
> can you somehow make the system wait for the buffer to empty? If sending a
> packet syncronously, the system call would just hang until the buffers have
> drained enough.

It sounds like you're trying to make UDP lossless. Use TCP instead.

What if I'm running VOIP? Or a real-time video game? I don't want to wait for
the dropped datagram to be received before processing the next one. It's too
late! It should have been received already. Skip it and use the next one.

> Does the outside network just bang bits through the cable like a radio
> station?

Basically, yes. Ethernet has no flow control. IP has no flow control. UDP has
no flow control. TCP does.

> is there, deep down on the lowest levels of the stack, actually some kind of
> request/response going on, even for UDP?

No. Ethernet, as commonly deployed, has no need for this, though CSMA/CD is
the retransmit part of the specification in case you are using a coax tap or a
hub. Each Ethernet frame also has a Frame Check Sequence, and the receiving
node will discard the frame if it doesn't match up. It's not up to Ethernet to
retransmit; use TCP for that instead.

~~~
phab
> Ethernet has no flow control.

Except it does:
[https://en.wikipedia.org/wiki/Ethernet_flow_control](https://en.wikipedia.org/wiki/Ethernet_flow_control)

Whether people use it or not is another matter!

------
acomjean
What we found when working with UDP (multicast) at a previous gig (it's been a
while):

The messages can be up to around 64KB. I say "around" because, in our
experiments, different OSs did different things (this caused us much
confusion). I think HP-UX would drop them silently on send if they got to
62KB; at 64KB it would return an error. Keep them below 60KB to be safe.

Multicast means the router has to be set up correctly. When they muck with the
network and add a hop, make sure your TTL (time to live) is set correctly on
send. One of our OSs had a strange default for this.

I liked the all-or-nothing receive nature of UDP messages: never waiting and
reading for the rest of the data like with TCP. You have no idea if the
message got to where it was sent, but sometimes you don't need that. Very low
overhead too.
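That all-or-nothing property is easy to see on loopback (ports and payloads here are arbitrary):

```python
import socket

rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Two sends -> two distinct datagrams. recvfrom() returns exactly one
# whole message per call; boundaries are preserved, unlike a TCP byte
# stream where the two could arrive glued together or split apart.
tx.sendto(b"first message", rx.getsockname())
tx.sendto(b"second", rx.getsockname())

print(rx.recvfrom(65535)[0])  # b'first message'
print(rx.recvfrom(65535)[0])  # b'second'
```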

Multicast was the selling point for us. Send and everyone attached to the
group gets the message. You can subscribe to the groups and see all the
messages which makes debugging easier.

------
api
UDP is just a thin wrapper around an IP packet that adds ports and a checksum.
IP in turn is just a thin wrapper around whatever underlying network packets
you use.

So UDP packets are as reliable as TCP packets in principle. TCP just hides the
unreliability with flow control and retransmits.
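How thin? The entire UDP header is 8 bytes (per RFC 768): source port, destination port, length, checksum. A sketch of packing one:

```python
import struct

def udp_header(src_port, dst_port, payload_len, checksum=0):
    """Pack the whole UDP header: four 16-bit big-endian fields.
    `length` covers header plus payload; a zero checksum means
    'not computed' (legal over IPv4)."""
    return struct.pack("!HHHH", src_port, dst_port, 8 + payload_len, checksum)

hdr = udp_header(12345, 53, len(b"hello"))
print(len(hdr))  # 8 bytes of overhead, and that's all of UDP
```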

~~~
JohnStrange
There is also UDT as a reliable UDP-based alternative to TCP. Some day I'd
like to hear from someone who used it whether it's worth it and when it makes
sense.

~~~
rhinoceraptor
Also, if you use Chrome, you're probably already using QUIC. Go to
chrome://net-internals/#quic to see for yourself.

------
JoeAltmaier
Buffers are the putative reason, so we'd be tempted to suggest that bigger
buffers would help. But that contributes to the famous buffer-bloat problem
that tanked the internet some years back.

The practical solution is to meter traffic (don't send packets faster than
they can be processed on the receive end, or faster than the tightest
congestion bottleneck en route). At the same time, process as quickly as
possible on the receive side: don't let a single buffer sit in the IP stack
longer than absolutely necessary. E.g. receive on one higher priority thread
and queue to a processing thread.
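A minimal sketch of that split (Python; treating an empty datagram as a shutdown sentinel is an assumption of this sketch, not a convention):

```python
import queue
import socket
import threading

def rx_loop(sock, q):
    """Tight receive loop: move datagrams out of the kernel buffer as
    fast as possible and hand them off; no per-packet processing here."""
    while True:
        data, addr = sock.recvfrom(65535)
        if not data:          # empty datagram = shutdown sentinel (our choice)
            break
        q.put((data, addr))

def process_loop(q):
    while True:
        data, addr = q.get()
        ...  # the expensive work happens here, off the receive path
```

Raising the receive thread's priority is OS-specific (e.g. `os.sched_setscheduler` on Linux) and omitted here.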

~~~
wahern

       E.g. receive on one higher priority thread and queue to a processing thread.
    

Doing that, you've only created another buffer. But it's worse: because your
buffering thread has a higher priority than the processing thread, you've
masked the back pressure, effectively signaling to the sender that you can
handle more packets than you actually can.

Yes, the solution to buffer bloat is to keep buffers only at the ends, but
whether that's in the kernel or in your application shouldn't matter in most
cases, if any at all. Better to just let the kernel handle it and not reinvent
the wheel.

Tweak the kernel send/receive buffers if you want. The defaults on Linux are
usually too aggressive, but I don't see any need to do much more than that.
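For reference, the per-socket knob looks like this (the 1 MB request is arbitrary; on Linux the grant is capped by `net.core.rmem_max`, and `getsockopt` reports back double the granted value to account for kernel bookkeeping):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Request a 1 MB receive buffer. The kernel may clamp this, so read
# the value back to see what you actually got.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
```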

~~~
JoeAltmaier
UDP has no back pressure, that's the point here.

And as observed elsewhere in this thread, some OSs have tragically small
buffers (128K). It's absolutely vital to keep those buffers from filling in
ambitious apps.

I wrote audio/video/screenshare communications code for years. In the bursty
situation I described, the whole point is to offload the IP stack buffer into
the app buffer at high speed.

~~~
wahern
Ah, right. But with the caveat that there's no other flow control. Most
UDP-based protocols support either some kind of flow control (e.g. RTP/RTCP)
or retransmit (e.g. DNS), and that was my frame of mind. To stop buffer bloat
it's important for people to stop implementing hacks that make a peer look
more responsive than it actually is. It would be a shame if people got the
idea that UDP necessarily meant the lessons of buffer bloat don't apply.

I've also written streaming media services for many years. RTP/RTCP, for
example, supports adaptive rate limiting, though few implement it. The RTCP
sender and receiver reports signal packet loss and jitter so that the sender
can, e.g., dynamically decrease the bitrate. If implemented properly,
buffering too many RTP packets can hurt the responsiveness of the dynamic
adaptation, which can quickly lead to poorer quality. (Modern codecs help to
mitigate this issue, but largely because the creators have spent a lot of time
putting more adaptation features into the codecs and the low-level bitstream
knowing that software higher up the stack is doing it wrong.)

For DNS, because Linux has a default 65KB (or greater!) buffer on UDP sockets,
it's trivial to get huge packet loss when doing bulk asynchronous DNS queries.
The application will quickly fill the deep UDP buffer; with the deep buffer
the kernel will keep the ethernet NIC frame buffer packed, with the result
that you'll see a ton of collisions on the ethernet segment and dropped UDP
packets once the responses start rolling in. That results in a substantial
fraction of the DNS queries having to retransmit, and because the retransmit
intervals are so long, a bulk query operation that could have finished in a
few seconds or less can take upwards of a minute as the stragglers slowly
finish or time out. Without the deep UDP pipelines, the ethernet segment would
be less likely to hit capacity, would see fewer dropped packets, and so the
aggregate time for the bulk query operation would be several times less.

Reducing the UDP output buffer is substantially easier than implementing
heuristics or gating for ramping up the number of outstanding DNS queries. The
latter, if well written, might be more performant, but just doing the former
would alleviate most of the problem, allowing you to move on to more important
tasks.
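A sketch of that gating heuristic (names are illustrative; `send_one` stands in for whatever issues a single query and later signals completion):

```python
import threading

def send_gated(queries, send_one, max_outstanding=64):
    """Issue queries but keep at most max_outstanding in flight, so a
    bulk sender never piles datagrams deep into the local UDP buffer.
    send_one(query, done) must call done() when the response arrives
    or the query times out."""
    slots = threading.Semaphore(max_outstanding)
    for q in queries:
        slots.acquire()            # blocks once the window is full
        send_one(q, slots.release)
```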

------
lordnacho
So the root cause is always buffers? That's it? Nothing ever gets lost for
some other reason, maybe some sort of collision?

~~~
monocasa
There aren't a lot of true bus topologies out there any more for collisions to
occur on. And Wi-Fi retransmits.

~~~
TallGuyShort
Just to be clear, you mean Wi-Fi retransmits in the case of a collision a-la
Ethernet, right? Was talking to someone a while ago who was under the
impression that a TCP packet that didn't get ack'd within a certain window
might be retransmitted by a Wi-Fi router that was closer to the destination. I
was pretty certain it was wrong, but since you said, "Wi-Fi retransmits", just
checking that's not what you meant.

~~~
monocasa
Yeah he's wrong, routers don't buffer for downstream reasons in any context
I've seen.

That being said, Wi-Fi retransmits on more than just collisions. Under the
hood, Wi-Fi actually has its own acknowledgement protocol beneath the
Ethernet layer. The idea is that with electrical bus protocols like Thicknet,
each tap has a pretty good view of the whole bus. For wireless you might see
your own frames just fine, but the destination might not. So you need to
retransmit not just when you see collisions, but when the destination hasn't
explicitly acknowledged your packets.

------
en4bz
> So if you have a network card that's too slow or something, it's possible
> that it will not be able to send the packets as fast as you put them in! So
> you will drop packets. I have no idea how common this is.

This is extremely uncommon on 1Gbps NICs but is much more common on 10Gbps+
NICs. Also this type of dropping can happen on both RX and TX.

EDIT:

Also the author missed a set of buffers. The RX/TX rings on the NIC! These
store packets on the NIC before they are moved to RAM (for RX) or sent on the
wire (for TX). You can see them and configure their size using ethtool on
Linux.

    
    
       $ ethtool -g ens6
       Ring parameters for ens6: 
       Pre-set maximums:
       RX:		4096
       RX Mini:	0
       RX Jumbo:	0
       TX:		4096
       Current hardware settings:
       RX:		512
       RX Mini:	0
       RX Jumbo:	0
       TX:		512

~~~
hyperpape
"This is extremely uncommon on 1Gbps NICs but is much more common on 10Gbps+
NICs. Also this type of dropping can happen on both RX and TX."

Is this backwards, or does the faster NIC really drop more packets? That seems
like it would be unfortunate.

~~~
en4bz
Not at "low speeds", but at line rate it requires a lot more CPU power to
process all of those packets, which often leads to drops. These drops
specifically are called RX ring/FIFO overflows/overruns and happen when the
NIC enqueues more than rx_ring_size packets before the OS initiates a DMA of
incoming packets from the NIC to RAM.

I suppose if you point a 10Gbps UDP stream at a 1Gbps NIC there will be
drops, but those drops will happen at the switch, not at the interface, which
is a different type of dropping.

~~~
hyperpape
Gotcha, so it's the NIC dropping the packets, but really it's because the host
machine fails to process them in time.

------
wruza
Somewhat off-topic, but what has always puzzled me is why we don't have
something like BitTorrent-IP or FlashGet-IP, where you can split a stream
into independent [stream-like] segments and wait for them separately,
retaining sequential structure. I.e., TCP with separate channels. This would
solve many problems with 'site stuck in load-progress' when some item near
</head> failed to arrive and now the entire <body> cannot be rendered because
we are waiting for that retransmission.

I know HTTP/2 and/or SCTP moved in this direction, but seems no luck.

------
fdegrassi
Interesting fact: there's an ARP buffer in the Linux kernel, which holds
outbound packets waiting for ARP resolution; it holds 3 packets (not
configurable last time I checked).

~~~
noselasd
No. Linux uses the socket buffer to buffer packets while ARP is in progress
(the only sane thing to do, IMO). I believe the "buffers 3 packets" behaviour
was how it was done in the early 2.0 kernels.

Windows[1], OSX[2], and the BSDs[3] only buffer 1 packet per socket while ARPing

    
    
        [1] Empirical testing.
        [2] https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man4/arp.4.html
        [3] http://www.unix.com/man-page/FreeBSD/4/arp/ and similar for other BSDs

------
known
"when a few dropped packets here and there don't mean the end of the Universe.
Sample applications: tftp (trivial file transfer protocol, a little brother to
FTP), dhcpcd (a DHCP client), multiplayer games, streaming audio, video
conferencing, etc."

[http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html...](http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#twotypes)

------
chinathrow
I used to write processing software for stock exchange data feeds such as OPRA
([https://www.opradata.com/](https://www.opradata.com/)), they were UDP only
at the time.

Of course packets could be lost, so there actually was a paper-based form you
could fill in to re-request the missed data packet by fax (no one ever did
that, though).

I think these days they have implemented a TCP/IP connection to request
re-transmissions.

------
plopilop
The author also misses the point that packets can be dropped because of
integrity check failures: if the packet has been altered in some way (cosmic
ray, WiFi interfering with microwaves, or whatever), then the checksum fails
and the packet is dropped.

Actually, the causes of packet dropping are (AFAIK) exactly the same as in
TCP. But in TCP, dropped packets are sent again.

One exception though: in UDP, delayed packets won't be waited for long before
being dropped.

~~~
dfrey
The ways in which packets are accidentally lost is the same between TCP and
UDP, but if you have 150Mbps total coming in from four different ports and you
need to route that data over a single 100Mbps port, then the router needs to
choose which packets to drop. I bet that TCP packets take priority over UDP in
many routers.

~~~
chinathrow
> "I bet that TCP packets take priority over UDP in many routers."

Unless you prioritize them differently.

------
throwawaydfkjs
The article doesn't even skim the surface of the underlying reasons why
packets get dropped. First, unless you're using some snazzy QoS, the router
doesn't care whether the packet is UDP, so why even talk about UDP in the
first place? TCP and the others are just as likely to get dropped; TCP just
automatically resends.

------
flushandforget
(I wish authors would date articles.)

~~~
sebg
URL has the date of 2016 08 24 ...

~~~
flushandforget
That's funny, I'd even viewed the source looking for a date, and totally
missed the url!

Not so sure that urls should be any more than a uniqid - should they have
semantic meaning? Discuss... No don't.

Ta.

~~~
krallja
<time datetime="2016-08-24T18:53:10-04:00" pubdate="" data-
updated="true"></time>

