Lots of people talk about a "TCP or UDP" design decision, but that's usually not the relevant question for an application developer. Most UDP applications wouldn't just blat their data directly into UDP datagrams, just like almost nobody blats their data directly into IP datagrams.
Typical applications generally want some sort of transport protocol on top of the datagrams (whether UDP-in-IP or just plain IP) that cares about reliability and often about fairness.
That could be TCP-over-IP, but some (like MixApp, which wanted TCP but not the kernel's default implementation) use TCP-over-UDP-over-IP, BitTorrent uses LEDBAT-over-UDP, Mosh uses SSP-over-UDP, Chrome uses QUIC-over-UDP and falls back to SPDY-over-TCP and HTTP-over-TCP, etc.
The idea that using UDP somewhere in the stack means "designing and implementing a reliable transport protocol from scratch" is silly. It doesn't mean that any more than using IP does.
The question of SOCK_STREAM vs. SOCK_DGRAM is more typically, "Does the application want to use the one transport protocol that the kernel implements, namely a particular flavor of TCP, or does it want to use some other transport protocol implemented in a userspace library?"
That's in theory.
In practice ISPs routinely classify, prioritize and shape application traffic (based on either TCP/UDP ports or the results of deep packet inspection), so running SSL over IP will yield a different IP loss profile than running SSL over UDP.
That depends, I guess. As somebody who works mostly with cloud telephony, SIP (signaling) and RTP (audio) are almost always carried over UDP. The funny thing is that you'll see more occurrences of one-way audio or dropped calls because people implement an application-level protocol like SIP incorrectly than because the transport layer let them down.
That said, I would any day prefer a private MPLS network rather than the public internet.
Though I can understand why somebody who writes a fast nosql database prefers TCP. Also, thanks for redis, we use it a lot and love it! :)
UDP for RTP makes sense because transmitting packets is pointless. In case of packet loss, the end user has to interpolate the result either in software or in their head.
This private MPLS network... it wouldn't happen to run over the exact same equipment that'd be handling your IP traffic, would it?
> UDP for RTP makes sense because transmitting packets is pointless
Should be this:
> UDP for RTP makes sense because retransmitting packets is pointless
TCP will retransmit if no ACK is received; UDP won't (there being no ACK and all).
Mobile wireless stuff (EDGE/3G) goes to great lengths to avoid dropping packets, so if the conditions are right (a moving train in an area with poor coverage is one place where you can easily see this), you can get packets "reliably delivered" after 20 seconds or more.
TCP has been designed in such a way that it interprets packet drops as a sign of "congestion" (which was typically true in ye olden days of purely wired networking), and it will start sending less data in response.
Whereas in wireless networking, occasional packet drops are just a fact of life and are not indicative of competing flows trying to share the channel. So it actually makes [some] sense that wireless protocols try to compensate for the behaviour of the transport protocol used by 90% of all data: TCP.
Thanks to your comment, a short Wikipedia trip later, I now know that "penultimate hop popping" is a thing and has a fantastic name.
I read that Chrome, Linux 3.6 and Android all support this.
> It's a shim on top of IP to let unprivileged users send IP datagrams that can be multiplexed back to the right user. (Hence the name: "user" datagram protocol.)
It's mostly a way to send data without the full 'call-setup-data-transmission-and-terminate' cycle that virtual-circuit-based protocols such as TCP require, for when you don't need all that luxury. So it's for protocols that carry small amounts of data, where retrying is not a problem and where the loss of a packet is not an immediate disaster. Because of this it's also more suitable than TCP for real-time applications (especially for the first packet). And because there is no virtual circuit, a single listener can handle data from multiple senders.
The 'USER' does not refer to unprivileged users but simply to users as opposed to system packets (such as for instance ICMP and other datagram like packets that are not usually sent out directly by applications). So it's not a privilege matter but a matter of user-space vs system modules elsewhere in the stack.
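A minimal sketch of that difference in Python (addresses are placeholders, not from the thread): a UDP sender needs no call setup or teardown, while a TCP sender has to establish and tear down a virtual circuit, and a single UDP socket can hear from any number of senders.

    import socket

    PEER = ("192.0.2.10", 9000)   # placeholder address/port

    # UDP: no handshake, no teardown - one datagram goes out, unacknowledged.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.sendto(b"hello", PEER)
    udp.close()

    # TCP: connect() performs the call setup, close() the termination.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.connect(PEER)
    tcp.sendall(b"hello")
    tcp.close()

    # Because there's no circuit, one UDP socket can receive from any number of
    # senders; recvfrom() tells you who each datagram came from.
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("0.0.0.0", 9000))
    data, sender = listener.recvfrom(65535)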
> Lots of people talk about a "TCP or UDP" design decision, but that's usually not the relevant question for an application developer.
It absolutely is.
> Most UDP applications wouldn't just blat their data directly into UDP datagrams,
They usually do exactly that.
> just like almost nobody blats their data directly into IP datagrams.
You're comparing apples with oranges, IP is one layer and TCP and UDP are on another. So you'd have to compare UDP with TCP and then you're back to that design decision again.
> Typical applications generally want some sort of transport protocol on top of the datagrams (whether UDP-in-IP or just plain IP), that cares about reliability and often about fairness.
Fairness is something that is usually not under control of the endpoints of a conversation but something that routers in between influence. They can decide to let a packet through or drop it (this goes for TCP as well as UDP), if a line is congested your UDP packets will usually (rules can be set to configure this) be dropped before your TCP packets will be in spite of the fact that TCP will re-try any lost packets. UDP packets can also be duplicated and routed in such a way that they arrive out-of-order.
> That could be TCP-over-IP, but some (like MixApp, which wanted TCP but not the kernel's default implementation) use TCP-over-UDP-over-IP, BitTorrent uses LEDBAT-over-UDP, Mosh uses SSP-over-UDP, Chrome uses QUIC-over-UDP and falls back to SPDY-over-TCP and HTTP-over-TCP, etc.
Running alternative protocols packaged inside other protocols is a time honored practice. See also: TCP over carrier pigeons and tunneling HTTP over DNS traffic (effectively using UDP). This is not in any way special, it's just a means to an end.
> The idea that using UDP somewhere in the stack means "designing and implementing a reliable transport protocol from scratch" is silly. It doesn't mean that any more than using IP does.
It actually comes down to exactly that. If you use UDP as your base and your application requires reliable transmission of data, then you're going to have to deal with loss/duplication/sequencing at some other point in your application, or put another (pre-existing) protocol on top of it to mitigate these.
If your application can tolerate those errors (or if they are not considered errors) then a naive implementation will do.
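For illustration only, here's a deliberately naive stop-and-wait sketch of what "dealing with loss at another point" can look like: a sequence number, an ACK, and retransmission on timeout. Real transports add windows, RTT estimation, congestion control and so on; the function name and constants are made up for the example.

    import socket
    import struct

    def reliable_send(sock, peer, payload, seq, timeout=0.2, retries=5):
        """Send one datagram and wait for a matching 4-byte ACK, retransmitting
        on timeout. Returns True if acknowledged, False if we gave up."""
        packet = struct.pack("!I", seq) + payload
        sock.settimeout(timeout)
        for _ in range(retries):
            sock.sendto(packet, peer)
            try:
                ack, _ = sock.recvfrom(4)
                if struct.unpack("!I", ack)[0] == seq:
                    return True            # peer confirmed this sequence number
            except socket.timeout:
                continue                   # lost data or lost ACK: send it again
        return False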
> The question of SOCK_STREAM vs. SOCK_DGRAM is more typically, "Does the application want to use the one transport protocol that the kernel implements, namely a particular flavor of TCP, or does it want to use some other transport protocol implemented in a userspace library?"
TCP is the default for anything requiring virtual circuits. If you have demands that are not well described by that model, and/or need real-time, low-overhead delivery and are willing to do the work required to deal with UDP's inherent issues (if those are a problem), then you're totally free to do so.
But the question is usually not 'do I need TCP', it usually is 'how do I avoid re-implementing TCP if I need its features'.
It's a tough choice because at a minimum it means that you're going to have to write software for both endpoints.
This is one of the reasons why we see HTTP over TCP in so many places where it wasn't originally intended: it is more or less guaranteed to be well tested and there are tons of tools available to use this protocol combination, especially browsers, fetchers and servers in all kinds of flavors. For UDP that situation is much less rosy and using UDP always translates into having to do a bunch of plumbing yourself.
That is what the books say, but I don't think it is right. When you consider that raw IPv4 doesn't work in practice because of NAT, UDP is the de facto minimum internet layer.
It is possible that someone wants a virtual circuit but can do better than TCP for their application. I think the parent's explanation was more apt - it's a small layer on top of IP for you to implement your own protocol logic.
One usually wants to reimplement a reliable protocol (e.g. similar to TCP) on top of UDP when dealing with peer-to-peer capabilities.
Indeed, NAT traversal is easier to implement and more reliable when dealing with UDP techniques (e.g. UDP hole punching).
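A rough sketch of the hole-punching idea, assuming each peer has already learned the other's public address:port out of band (the rendezvous/STUN-style exchange isn't shown, and the addresses are placeholders):

    import socket
    import time

    LOCAL_PORT = 40000
    PEER_PUBLIC = ("203.0.113.7", 40000)   # learned out of band; placeholder here

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", LOCAL_PORT))
    sock.settimeout(1.0)

    # Both peers run this loop: outbound datagrams open a mapping in each NAT,
    # which then lets the other side's inbound datagrams through.
    for _ in range(10):
        sock.sendto(b"punch", PEER_PUBLIC)
        try:
            data, addr = sock.recvfrom(1500)
            print("direct path established with", addr)
            break
        except socket.timeout:
            time.sleep(0.5)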
OP, you should have done latency tests as well. Whether out-of-order packets are acceptable to an application depends on the time distribution of packet delays. If, for example, 99.9% of packets arrive in order within, say, 10 ms, then that is perfectly acceptable for video streaming. Even if a packet is 1000 packets late, that just requires a bigger buffer on the receiving end. As long as it's below a certain duration, out-of-order packets aren't necessarily a problem even for live content.
The grouping of 5 packets together, and thus checking whether a packet arrives out-of-order by more than six places, seems arbitrary. If, for example, a packet arrives out-of-order 50 packets late, but it's actually only 1 ms late, then it's not a problem. If it comes along 1 second too late, then it is a problem.
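A small sketch of measuring it that way, assuming the receiver has logged (sequence number, arrival time) pairs; the 10 ms budget is just the figure from the example above:

    def late_packets(packets, max_lateness=0.010):
        """packets: iterable of (seq, arrival_time) pairs in arrival order.
        Returns the out-of-order packets that arrived more than max_lateness
        seconds after a higher-numbered packet was first seen."""
        highest_seq, highest_seen_at = -1, 0.0
        problems = []
        for seq, arrived in packets:
            if seq > highest_seq:
                highest_seq, highest_seen_at = seq, arrived
            elif arrived - highest_seen_at > max_lateness:
                problems.append((seq, arrived - highest_seen_at))
        return problems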
As you say, that grouping is arbitrary, and it was more likely to include packets that shouldn't have been included (ones arriving multiple seconds later) than to leave out ones that should. Though that's more to do with the fact that I sent so few packets (5-10) in each burst.
"Hello, would you like to hear a TCP joke?"
"Yes, I'd like to hear a TCP joke."
"Ok, I'll tell you a TCP joke."
"Ok, I will hear a TCP joke."
"Are you ready to hear a TCP joke?"
"Yes, I am ready to hear a TCP joke."
"Ok, I am about to send the TCP joke. It will last 10 seconds, it has two characters, it does not have a setting, it ends with a punchline."
"Ok, I am ready to get your TCP joke that will last 10 seconds, has two characters, does not have an explicit setting, and ends with a punchline."
"I'm sorry, your connection has timed out. Hello, would you like to hear a TCP joke?"
@mckeay The sad thing about IPv6 jokes is that almost no one understands them and no one is using them yet.
@dildog What's up with the jokes... Give it a REST, guys...
@ChrisJohnRiley: The worst thing about #protolol is that you get the broadcast even if you really don't give a shit!
@mdreid: The best thing about proprietary protocol jokes is REDACTED.
@maradydd: The bad thing about Turing machine jokes is you never can tell when they're over #protolol
A UDP packet walks into a bar.
A UDP packet.
Conversely, I've pondered using UDP packets as a "canary in the coal mine" for a network, to monitor its health.
TCP (that is, most TCP flow control algorithms) specifically uses packet loss as an indicator of network congestion / how much bandwidth is available, and will back off and retry dropped packets at a lower rate; hence why TCP continues to function (albeit at degraded performance). UDP has no such mechanism, hence you see firsthand the dropped packets. A well-written application should back off in such a scenario so as not to flood the network, but of course many won't.
(Of course TCP may very well be prioritized over UDP! It's just not necessary to explain your observations.)
Obviously this presumes that the UDP protocol in question has some mechanism for handling lost packets. But, for instance, if you're doing lossy video or forward error correction, end users do not deal directly with lost packets.
It's a bad idea to measure UDP reliability on a quiet day and then make decisions based on those results. On a different day, everything will be happening at once - everyone trying to message their Mom, everyone trying to sell stock, everyone trying to cast a spell on the big monster - and that is when the most packets will be dropped.
You can use TCP as this canary as well by monitoring a counter of how many retransmissions have been performed.
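On Linux, for example, that counter is readable from /proc/net/snmp (the Tcp: rows have a header line of field names followed by a line of values); a sketch of sampling it:

    def tcp_retransmissions():
        """Return the kernel's cumulative count of retransmitted TCP segments."""
        with open("/proc/net/snmp") as f:
            tcp_lines = [line.split() for line in f if line.startswith("Tcp:")]
        fields, values = tcp_lines[0][1:], tcp_lines[1][1:]
        return int(dict(zip(fields, values))["RetransSegs"])

    # Sample periodically; a rising delta hints at loss/congestion on the paths
    # your TCP traffic takes.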
RUDP or similar systems allow network messaging to ACK when needed while remaining unreliable by default, which is fine for positional updates: some data goes missing, and interpolation and extrapolation simulate the missing data. Global game state and determining the winner/ending the level might need a reliable call.
With UDP and only some reliable calls you drastically improve real-time performance with less queueing. TCP is OK for turn-based or near-real-time games, but high-action real-time games almost all use UDP with an RUDP twist. Using a mix can also be harmful, so the best option is RUDP where needed, defaulting to UDP.
UDP is great because it works like a broadcast, almost like TV/radio, in that you show the data you receive and smooth over the rest. (TV/radio needs every data frame to prevent static/lag, but imagine the broadcaster having to error-check every connection: it would quickly saturate. UDP lets you hit saturation much later.) Games can predict movement and smooth out missing data for most game actions; you only need a few points for a bezier to be smooth, or you might have variables for speed/direction/forces to predict with. Pretty good reliability and the lack of ordering are not really an issue: if a message is out of order (timestamp older than the last), discard it and use the next one, or predict until the next valid message (too much of this leads to lag, but for normal UDP operation it is enough and actually smoother).
- If your bandwidth use is small, you can just spam multiple copies of each packet to decrease the chance that a laggy retransmission will be needed. If you're sending packets at a constant high rate, you can instead include copies of the last N messages in each packet, rather than just the new data (see the sketch after this list).
- You can send packets directly between two NATted hosts using STUN, rather than having to rely on UPnP or manual port forwarding. Pretty obvious, but I only see one other mention of this fact in the thread, and it vastly increases the likelihood of being able to make a direct connection between two random consumer devices.
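Here's a small sketch of the redundancy idea from the first bullet (the JSON framing and N=3 are arbitrary choices for illustration): every outgoing packet carries the last N messages, so losing one datagram doesn't lose any message for long.

    import json
    from collections import deque

    N = 3
    recent = deque(maxlen=N)

    def build_packet(seq, new_message):
        """Sender side: append the new message and ship the last N of them."""
        recent.append({"seq": seq, "msg": new_message})
        return json.dumps(list(recent)).encode()

    def process_packet(payload, last_seen_seq):
        """Receiver side: apply anything not yet seen, skip duplicates.
        Returns the updated highest sequence number."""
        for entry in sorted(json.loads(payload), key=lambda e: e["seq"]):
            if entry["seq"] > last_seen_seq:
                last_seen_seq = entry["seq"]
                print("applying", entry["msg"])   # stand-in for real handling
        return last_seen_seq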
Most game networking libraries will let the developer choose, when sending a datagram, whether it's reliable, ordered, both or neither. The reason is that you rarely actually need both, but in the rare cases you do (e.g. if the user issues a "fire" command) you can upgrade datagrams to that state.
RUDP is simply a message-based connection that is both reliable and ordered. You might be able to get away with using it for games, but why would you - RakNet is open source and is mostly the industry standard.
Also don't forget the one final advantage of UDP is that STUN/TURN work great. NAT punching with TCP is walking on thin ice.
• Timestamps are necessary: UDP loss is bursty, happening whenever any component anywhere along the path (including the endpoints) is momentarily too burdened (buffers, CPU, wires) to forward every packet.
• Try packets larger than the 'path MTU' between the two endpoints: any packet larger than that is fragmented somewhere en route, and then loss of any one fragment causes the full UDP packet to be lost.
• Try alongside other traffic on the endpoints, and note that TCP streams can't find their achievable rates (or split the available bandwidth amongst themselves) without hitting the packet-dropping pressure that also affects UDP. So perhaps try with 1, 2, or more TCP streams trying to max wire-speed, between the same machines at the same time.
Note also you can create arbitrarily-high loss rates by simply choosing to send more than the path and endpoints can handle. (Let one side send X thousand in a tight loop; on the other side only check for 1 per second for X thousand seconds. Many will be lost, depending on how much you've exceeded the available bandwidth/buffering between the send and receive.)
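A sketch of probes along those lines: each datagram carries a sequence number and a send timestamp so the receiver can separate loss, reordering and delay. The addresses, rates and 1200-byte padding are assumptions for the example, and the printed one-way delay only means something if the two clocks are synchronized.

    import socket
    import struct
    import time

    PEER = ("198.51.100.5", 9999)   # placeholder
    PAYLOAD_SIZE = 1200             # keep below path MTU unless testing fragmentation

    def send_probes(count=1000, interval=0.01):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for seq in range(count):
            header = struct.pack("!Id", seq, time.time())
            sock.sendto(header + b"\x00" * (PAYLOAD_SIZE - len(header)), PEER)
            time.sleep(interval)

    def receive_probes(port=9999):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("0.0.0.0", port))
        while True:
            data, _ = sock.recvfrom(65535)
            seq, sent_at = struct.unpack("!Id", data[:12])
            print(seq, "delay:", time.time() - sent_at)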
My intuition is we are close to relativistic problems. (Can't prove it, I have to code a kikoo lol form for tomorrow).
I agree. Having (or not having) other traffic on the network will impact UDP throughput dramatically, as anyone who does VoIP QoS will soon find out.
My biggest UDP "surprise" was on a system using UDP but treating it as perfect, because the system designers knew that the hardware bus they were using was "100% reliable". And they were right--the hardware wasn't dropping anything. Too bad that the OSes on either end would discard UDP packets when they faced buffer pressure.
Basically, if you send() 2 or more UDP datagrams in quick succession, and the OS has to resolve the destination with ARP, all but the first packet are dropped until you get an ARP reply. (This behavior isn't entirely unique to Windows, btw.)
Which unfortunately is all they can do, in the limit. Otherwise, if applications (or the OS itself) can't keep up with the incoming packet/data rate, buffers would grow without bound. Not good for a production system.
How unreliable is UDP as a protocol? Unreliable. This is really a binary state, not a percentage-measurable value.
If you need reliability (in ordering or delivery) you need to layer on top of it and unless your network usage has very specific constraints (eg. low latency is far more important than strict ordering, as with most fast action games) you should almost certainly just use TCP rather than end up badly reimplementing it over UDP.
On the other hand, in that situation TCP is the worst choice you can make: especially on bad lines, like, say, Whistler in the Canadian mountains, it's easy to get TCP into a state where it builds up massive latency because it insists on trying to push every single packet through.
It seems a mite... silly... to resend a packet from the other side of the world just because your ISP couldn't shove the packet down your internet connection fast enough.
The smart solution you're looking for is TCP ECN (Explicit Congestion Notification) - a way for routers to say "I'm buffering it for now, but you'd better slow down". If you're running Linux it's a kernel setting you can enable (it's disabled by default because some routers mishandle it).
Don't treat links as end-to-end. Treat them as a bucket chain - each link in the chain negotiates with its immediate neighbors only. Currently we do a game of "toss the bucket at the next guy and hope he catches it".
The only solution to avoid that problem would be to have every router keep state for each connection/flow it sees and manage separate buffers for each flow - but that would mean keeping state for hundreds of millions of flows on backbone routers, all of which would essentially need to sit in level 1 cache to move packets at line speed, and even that would probably be too slow - there is a reason why those routers use CAMs just to keep up with routing table lookups across only about 500,000 routes.
As a matter of fact, it would probably be easier to make dynamic. (Router A gets a packet for router Z - router A wants to send it to router B, but router B is currently congested, and router A knows that router C is an alternate route, so it sends it to router C instead.)
Now, there are circumstances where this approach is not particularly valid. In particular, on wireless networks. However, TCP over wireless networks isn't exactly great either. (TCP and this approach both make the same assumption: namely that most packet loss is congestion as opposed to actual packet loss.) This approach is for the segment of the network that's wired routers with little to no packet loss disregarding packets being dropped due to no cache space. I.e. this approach is for the reliable segment of modern networks - wireless should probably have an entirely different underlying protocol.
Routing the packet to Z and telling you that the path to Z is congested are mirror images of each other; it makes sense to use the same mechanism for both.
For games etc., you need to be able to hide 1-2 seconds of lag if you use TCP.
If you need the full reliability TCP offers, UDP isn't the answer unless you layer your own protocol on top of it, and while it is possible to layer your own protocol on top of it that will beat TCP for specific use cases, most people (who aren't expert network programmers and don't even know about [let alone understand] the minefields of issues you can run into such as dealing with NATing, etc) are much better off just using TCP, warts and all.
What would be great is a "third half" of this equation: a popular protocol a la SCTP. Imagine not having to re-invent message size headers for every damn protocol: the size of your packet is embedded in the datagram just like UDP, but the transmission is reliable just like TCP. HTTP would suddenly get so much better!
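For what it's worth, SCTP is already usable on Linux when kernel support is present; a hedged sketch (one-to-one style socket, placeholder address), where each send() is one framed message but delivery is reliable:

    import socket

    # Requires kernel SCTP support; socket.IPPROTO_SCTP is only defined where
    # the platform exposes it.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_SCTP)
    sock.connect(("192.0.2.20", 5000))   # placeholder
    sock.send(b"a whole message, framed like a datagram, delivered reliably")
    sock.close()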
Once that standardization happens, the OMA LWM2M spec is going to need some revisions, because it's very much written with "CoAP over UDP" in mind. This will ultimately be very good, because it will allow the possibility of other protocols like MQTT, HTTP, etc.
On another note... does anyone know if RDP has seen any use lately? I'm guessing it's pretty redundant vis-a-vis TCP in practice?
Each protocol has its own place (and uses)...
Also, there was an AT&T home router that dropped every other UDP packet! Really! It's a matter of: if you don't test it, it doesn't work. And home appliances get tested on delivering web pages etc., not UDP streaming.
I used to run a pretty sizable VoIP network over a DS0-based Carrier Ethernet network. When you have tight control over the network, UDP is extremely reliable. We would go long periods of time with 0% packet loss (as reported by AdTran VQM). When we did see loss, it was normally because of a problem we were already aware of. That's the beauty of DS0. When a DS0-based circuit drops frames, alarm bells sound.
So "make sure you can handle (where "handle" could simply be "safely ignore") failed/unordered transmission" rather than "expect failed/unordered transmission".
If you can't cope with or ignore missed/unordered reception, wrap your comms in a stream based protocol (TCP, one of the other existing ones, or some contraption of your own) that manages reordering and retransmission as needed.
Maybe there is a better word to cover that than "unreliable", which sounds quite definite, but I can't think of one off the top of my head.
But you can not make work what you can't test.
So anyone embarking on making their own transport protocol should test how their app behaves with various amounts of packet loss (1%, 5% and 20% are three reasonable marks to try, all of which can happen in the real world - especially over wireless links), as well as jitter and variable delay.
To do such testing, it's handy to use a Linux module called "netem", which allows you to simulate delay, jitter and loss:
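For example (a sketch that assumes root, an eth0 interface and the sch_netem module; the numbers are arbitrary):

    import subprocess

    # Add 100ms +/- 20ms of delay and 5% loss to everything leaving eth0,
    # run the application under test, then remove the qdisc again.
    subprocess.run(["tc", "qdisc", "add", "dev", "eth0", "root",
                    "netem", "delay", "100ms", "20ms", "loss", "5%"], check=True)
    try:
        pass   # run your tests here
    finally:
        subprocess.run(["tc", "qdisc", "del", "dev", "eth0", "root"], check=True)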
Packet ordering wasn't an issue in our app, though, so I'm not sure what my point is other than sometimes UDP is reliable enough.
If you're connecting over wireless, expect to see some more dropped packets, and if you're connecting over Bluetooth expect to see even more (because of its low power, Bluetooth is more likely to drop packets than either a wifi or wired connection).
I'm not sure it's that reliable for an end user on a wireless network.
Still, it's good to know.
Is that correct? If so, then regardless of UDP actually appearing somewhat reliable, the spec makes no guarantees that it will be consistently reliable.
There are (were) other protocols possible but these are about all that's supported/tested these days.
Also, there are fragmentation/reassembly issues with UDP. If you send a 30K UDP datagram, it gets sent as many Ethernet-sized fragments. The receiver is supposed to put them back together. If one is lost, the entire datagram is lost. And to boot, Linux didn't do UDP reassembly at all until recently (last year?)
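The usual way to sidestep that entirely is to never hand the stack a datagram bigger than the path MTU. A sketch, where the 1400-byte limit is a conservative assumption (room for IP/UDP headers and some tunnel overhead on a 1500-byte MTU); the application still has to do its own sequencing/reassembly of the pieces:

    MAX_PAYLOAD = 1400   # assumed conservative limit, below a typical 1500-byte MTU

    def chunked(data):
        """Split application data into datagram-sized pieces."""
        return [data[i:i + MAX_PAYLOAD] for i in range(0, len(data), MAX_PAYLOAD)]

    for piece in chunked(b"x" * 30000):   # the 30K example from the comment above
        pass                               # sock.sendto(piece, peer) in real code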
And they ain't kidding, because there are plenty of layer 3 devices out there that play fast and loose with fragments of UDP packets; it used to be that you could assume they'd drop anything that doesn't have a header, so by default UDP on Linux doesn't encourage it.
It could be that distributions have been bundling sysctl.conf files with ip_no_pmtu_disc set to true, being that most modern layer 3 devices no longer mistreat UDP so badly; this may be what you're experiencing in the last year.
Not at all! With TCP there's no way to ask for delivery of a datagram of unknown size (which is pretty much the defining feature of a datagram-oriented protocol), as there is in UDP, RDP, SCTP, and other datagram-oriented protocols.
(No, not even the PSH bit guarantees this!)
This has implications when implementing application-layer protocols. In stream-oriented protocols like HTTP, which intersperse headers with payload, it is impossible to read the header from the OS's socket without also (possibly) reading part of the payload (unless you're reading a single octet at a time). Your application is thus forced to implement buffering on top of what the OS provides (if the OS does not itself provide a general pushback mechanism).
With an (ordered) datagram-oriented protocol, this is not a problem, as you can ask the OS to return a single datagram corresponding exactly to the header, and process it before retrieving any of the payload.
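A sketch of that difference at the socket API level (ports are placeholders): one recvfrom() on a datagram socket returns exactly one datagram, while recv() on a stream socket returns "up to n bytes of whatever has arrived" and may straddle the header/payload boundary.

    import socket

    # Datagram socket: the header, sent as its own datagram, comes back as a unit.
    dgram = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    dgram.bind(("0.0.0.0", 7000))
    header, peer = dgram.recvfrom(65535)

    # Stream socket: this read may contain the header plus part of the payload,
    # so the application has to buffer and re-split it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("0.0.0.0", 7001))
    listener.listen(1)
    conn, _ = listener.accept()
    chunk = conn.recv(4096)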
And to boot, Linux didn't do UDP reassembly at all until recently (last year?)
Wow, that surprises me. Good to know.
What UDP doesn't guarantee is ordering and arrival of packets. Usually "channel reliability/integrity" or simply "reliability" is used to refer to this.
With TCP, it might be a while - you may not get notification that he's gone until he comes back.
So you end up with keepalives, and... might as well have used UDP if your media can stand it.
Of course, some of the actual packet loss at the media level is often being masked by the media protocol itself. That's why you'd often see cases where a packet meanders its way to the destination after 4-5x the normally observed ping latency. This sort of thing plays havoc with TCP congestion control algorithms (because by then the TCP send side has already decided that the packet is lost, and it can only be due to congestion... so it backs the heck off). A lot of our win comes from doing these things more in tune with how mobile networks actually behave.
These tests reveal exactly what you'd expect, as they are run between servers under prime networking conditions.
The answer is:
Pretty bad test conditions on this, tbh.