A solution for enabling UDP in the web (gafferongames.com)
383 points by vvanders 117 days ago | 180 comments

Minor nit that I almost hesitate to pick but I've been doing 100-series network training lately so this is something that I'd tell my students:

> UDP packets are not encrypted, so any data sent over these packets could be sniffed and read by an attacker, or even modified in transmit. It would be a massive step back for web security to create a new way for browsers to send unencrypted packets.

TCP packets are not encrypted, either. That data is transmitted via UDP or TCP doesn't make it encrypted; encryption is handled elsewhere.

Now if you insist that the contents of the transmitted UDP packets are encrypted--as the author does with the documented proposal--then that's one thing. But the data transport mechanism (UDP vs. TCP) doesn't inherently mean encryption.

(Fun fact I learned while writing words about TLS: Use of TLS does not automatically imply confidentiality--that is, encryption--of the data being transported. TLS supports a NULL cipher[0] that ensures integrity and authenticity but the payload is "in the clear.")

[0] https://tools.ietf.org/html/rfc4785

You're not wrong, but I think the point was, in a browser, you can get your TCP _connection_ "automatically" encrypted by connecting to HTTPS endpoints. It's not so easy -- "built in" -- with a connectionless protocol.

That's entirely true and it's one reason why I hesitated before nitpicking. That said, I think that the distinction should be properly made when someone is discussing a new way to sling data over a network from the application layer. If the author had written what you'd written, I'd have let my reply go.

(Even so, it's not that the "TCP session" is encrypted; it's that the transport-layer security mechanism encrypts the payload of the TCP session. The metadata for the session--the respective IP addresses, sequence/acknowledgement numbers, and so on--is still in the clear. I realize that, at this point, I'm being quite pedantic but I'm a networking person at heart and I want software developers to better understand the infrastructure that's being used.)

I think what you said is fair. WebRTC offers "automatically" encrypted UDP, if we are comparing like that.

But we already knew that. It's in the article, so again there is a story here that you're missing. We want a client/server protocol that is encrypted and simple -- not requiring ICE/STUN/TURN (OMG).

If you already have Websockets, then Simple-Peer makes it very easy to deal with WebRTC negotiation. As I've mentioned elsewhere, I've created a 4 process WebRTC UDP proxy with that running under nodejs. My use case is precisely client-server.

There is DTLS if you really want turnkey encryption for UDP, and it's implemented by openssl so it's straightforward enough to find an implementation. Not the most widely deployed protocol out there, to be sure, but if browsers wanted to make that the condition of use it would soon become widely implemented.

"openssl [...] straightforward" I don't think so. It's in the article; DTLS is too complex. The author wants something simple.

Most implementations (eg. WebRTC) involve pulling in truck loads of C libs; it's insane. And then you want to support other clients? Good luck.

Not quite turnkey, but OpenVPN is essentially TLS over UDP with an IP tunnel on top.

I have implemented TLS on UDP, basically to build an EAP-TTLS authentication service which sets up a TLS tunnel over EAP over RADIUS over UDP. I can attest that it is "doable" but horrendously complicated, not just to implement but to monitor and debug.

It "works" in the case of EAP because you're never sending more than a few datagrams at a time, and if one gets lost it's not a huge tragedy to have to start the whole process again ... but, I can't imagine this being feasible for more continuous data-transfers such as streaming data "oops you lost a packet - sorry you'll have to re-establish your TLS session again"

I did notice the DTLS thing lying around when I was browsing the OpenSSL source but as another commenter says it's not that popular and presumably suffers many of the same complications.

Is there a reason you couldn't use TCP/TLS to establish a shared secret, and then just encrypt all your packets with that?

Gosh .. I eh .. I guess I was just given this RFC to implement [0] I'm not sure of the reasoning specifically for doing it this way.

But kind of related to your point, TTLS does in fact use the "cryptographic material" derived as part of session establishment to derive a shared secret for hashing in CHAP, MS-CHAP and MS-CHAP-v2! [1] The nifty thing about that is it "binds" the ensuing authentication process to the actual TLS session. With MS-CHAP-v2 it also provides for mutual authentication without either party ever having to exchange sensitive info (neither client nor server ever has to know the password).

This was actually a real pain to do with OpenSSL as that information wasn't readily available with the standard API. I had to implement some horrific kludge to get the necessary info nested within 3 layers of structs.

[0] https://tools.ietf.org/html/rfc5281

[1] https://tools.ietf.org/html/rfc5281#section-8

DTLS is a foundation of the CoAP protocol (request/response) for the IoT world [1]

I personally think it will see wider usage in the upcoming decade, but that is just a guess.

[1] https://projects.eclipse.org/projects/iot.tinydtls

Gosh I hope so! The current melee of devices chatting away without any security is a big cause for concern.

DTLS is also the foundation of WebRTC, which is talked about in this topic a lot and which is implemented in most browsers now. This means DTLS is important, but unfortunately it is not offered by lots of libraries (e.g. not by the Go network/crypto libraries).

This is also true of IPSec AH and ESP. You can use it purely for labeling (turn off both)

IPsec, not IPSec

It's been 10 years since I worked on a VPN stack, but naming pedantry dies hard.

Actually, the encryption question is interesting: you couldn't use modes like CBC with out-of-order packets. What technique would be used for out-of-order decryption?

Pick your mode, separate IV per record, separate record per packet.
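To make "separate IV per record, separate record per packet" concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: the keystream is a toy built from SHA-256 (a stand-in for a real mode such as AES-CTR or AES-GCM, not something to deploy), the two keys are assumed to have been agreed on beforehand, and the names `seal` and `open_record` are hypothetical.

```python
import hashlib
import hmac
import os

NONCE_LEN = 12  # per-record IV/nonce, sent in the clear with each packet
TAG_LEN = 32    # HMAC-SHA256 tag length


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (illustration only): SHA-256 over key, nonce and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(enc_key: bytes, mac_key: bytes, payload: bytes) -> bytes:
    """Build one self-contained record: nonce || ciphertext || tag."""
    nonce = os.urandom(NONCE_LEN)  # fresh per record, so no state is chained between packets
    stream = _keystream(enc_key, nonce, len(payload))
    ciphertext = bytes(a ^ b for a, b in zip(payload, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_record(enc_key: bytes, mac_key: bytes, record: bytes) -> bytes:
    """Authenticate and decrypt one record using only the record itself and the keys."""
    nonce = record[:NONCE_LEN]
    ciphertext = record[NONCE_LEN:-TAG_LEN]
    tag = record[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("record failed authentication")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))
```

Because every record carries its own nonce, records can be opened in whatever order they arrive, and a lost record affects nothing but itself.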

> UDP packets are not encrypted

how does QUIC do encryption?

It's kinda sad that TCP was chosen so long ago for HTTP that there's effectively no changing it. With modern TLS the underlying data guarantees TCP gives you just aren't that useful. We have kinda a weird situation where we have TCP->TLS->HTTP in layers when it could all be one protocol layer. We also wrap a stateless protocol (HTTP) inside a stateful (TCP) one which causes some insanity.

What's the problem with doing it the current way? Massive routing inefficiency at scale. Since the layers for persistence and routing (L2-4) don't carry all the info needed to connect to a server (some, like headers and URL, are up in HTTP at L7), it's mandatory to "unwrap" through the protocol layers before you can determine where a stream/packet/HTTP req is supposed to go.

This means you can use something like IPVS as your L2-3 load balancer, but once the streams are divided out by IP/port you need to do the TLS+HTTP in one step. There are also some hard limits on how much traffic a single IPVS instance can handle, because balancing TCP even at a low level requires the router to keep track of connection state (conntrack). So we have this situation where there's a main low-level balancer with some arbitrary traffic limit imposed by TCP overhead, and behind that we have a bunch of child balancers doing way more work than they should, handling the connection from the TCP level through TLS and HTTP before they can pass it on to a back-end app server.

This could all be avoided if HTTP was a stateless UDP based protocol, and TLS was baked in rather than being an additional layer. It would make routing and load balancing far more effective at scale. You probably wouldn't see nearly as many DDoS attacks succeeding, because the vast majority of them exhaust CPU power far before they actually flood you off the net.

It makes sense to separate the problem into multiple different protocols, because that gives you re-usability and greatly reduces the complexity of implementation. Can you imagine the amount of effort if every application protocol defined its own mechanisms for retransmissions, congestion control, security, etc.?

It also makes it possible to swap out parts more easily. For example, to put SSL under HTTP. Or QUIC under HTTP/2 when people realized TCP is not a good fit for it. You could, if you wanted to, run HTTP over UDP. Although you'd quickly realize you actually want many things that TCP and TLS give you for free, so you'd have to start re-implementing the same functionality on top of UDP.

Don't forget the history of TCP. TCP's congestion avoidance is the evolution of 20 years of managing large scale networks. Many papers later and the CA algorithms of TCP have been proven to work at large scale for a variety of traffic types. The internet as we know it simply wouldn't work without massive adoption of TCP. Even QUIC looks very similar to TCP, and is just an evolution of it in many ways.

It's a hard distributed problem: how do you coordinate many independent flows of traffic to efficiently utilise the many network links in the internet? Do this wrong at scale, and even broadband internet will feel like you're on a 2G mobile network.

Also I don't think it makes sense for HTTP to be on UDP. While it's stateless, you still can't tolerate packet loss with HTTP. Otherwise you'll try loading a page and maybe nothing comes back, or only part of it comes back. What then?

TCP makes a lot of sense for HTTP, since once you strip off the HTTP headers on the request and response, HTTP is basically just TCP: a continuous bidirectional stream of bytes. If you want to rebuild that with all its properties (flow control, ordered and reliable delivery) you rebuild half of TCP anyway.
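As a rough illustration of how quickly that rebuilding starts, here is a minimal Python sketch of just the "ordered" part: a receiver that buffers datagrams by sequence number and releases them in order. The class name and the sequence-number scheme are assumptions made for the example; retransmission timers, flow control and congestion control are all still missing.

```python
from typing import Dict, List


class InOrderReceiver:
    """Reassembles datagrams that may arrive out of order or duplicated."""

    def __init__(self) -> None:
        self.next_seq = 0                    # next sequence number to deliver
        self.pending: Dict[int, bytes] = {}  # datagrams that arrived early

    def receive(self, seq: int, payload: bytes) -> List[bytes]:
        """Accept one datagram; return the payloads that can now be delivered in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate of something already delivered or buffered
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

    def missing(self) -> List[int]:
        """Sequence numbers below the highest buffered one that have not arrived yet."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending)) if s not in self.pending]
```

A sender would still need to learn about `missing()` somehow and resend those datagrams, which is exactly the acknowledgement and retransmission machinery TCP already provides.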

The CoAP protocol for embedded devices tries to achieve HTTP semantics on top of UDP. But it's a lot harder to implement correctly, especially if you want to support large request payloads. And I think it isn't easier to load balance or proxy than TCP-based HTTP, since it's also not sufficient to look at a packet header; you would also need to parse and keep around the whole request/response state between packets, only now the state is no longer associated with a TCP connection but with a plain UDP socket which must handle dozens of parallel requests.

Perhaps the massive pervasiveness of the internet is not obvious on a day to day basis, but you should consider it when judging whether or not TCP was a good decision.

When one considers what HTML and HTTP were meant for initially, TCP makes perfect sense. That it has since been bastardized into a UI framework is a very different matter.

I think the question is not so much could http have used UDP, but rather is the browser the right place to build a complex piece of software. The fact that you can doesn't mean that you should. I am a bit concerned that the browser is becoming the only cross platform API and we are forced to build complex software on top of a scripting language that was only designed to fire up porn advertisement pop ups.

Ignoring the (in)validity of your characterization of JS, WebAssembly will remove this reliance on JS.

I'm not so sure the internet would have been as successful without the guaranteed transmission of tcp.

TCP gets changed all the time. In fact modern TCP stacks are highly tuned for HTTP.

Perhaps this is also why HTTP2 and QUIC have been proposed, but I am not a network guy, so take my guess with a grain of salt.

My memory is fuzzy, but I'm pretty sure I did this with WebRTC a while ago. Use RTCPeerConnection in unreliable mode, "peer" with the server, and you should have a UDP-backed SCTP connection. I'll try to dig up my PoCs on this.

Edit: Oh he mentions this, but invalidates it due to the complexity of typical P2P:

> But from a game developer point of view, all this complexity seems like dead weight, when STUN, ICE and TURN are completely completely unnecessary to communicate with dedicated servers, which have public IPs.

I don't remember this being complex (there was some off-the-shelf library for getting a data connection on the server-side), but YMMV.

That's because you don't need to implement any of those things for a basic client-server architecture. Unfortunately it looks like the author saw the massive and complex webrtc.org implementation and gave up, rather than try writing their own minimal implementation, or borrow from other implementations like Janus (https://janus.conf.meetecho.com/textroomtest.html).

There currently are only massive and complex implementations, because that's what the protocol requires. Even if you leave out the p2p bits you still need to get SCTP over UDP and DTLS support, which are both uncommon. The Janus thing also pulls in usrsctp and openssl for those. In Janus's case there are even more things like libnice, which then requires you to use glib. All in all that's a massive set of C dependencies, which not everyone is comfortable pulling in. "Borrowing" from GPL software is also not something everybody is comfortable with. The webrtc.org reference implementation seems to contain half of Chrome's networking stack, which seems even worse.

I also now prefer to use languages other than C++ to build servers, be it Go, Java or C#. In all of those, getting native WebRTC data channel support is a giant effort, because there's neither SCTP nor DTLS support available. You could fall back on C libraries for this functionality, but that complicates the build process. Establishing HTTP, websocket or even HTTP/2 support is all easier than WebRTC.

Yes; vaguely... there was some (string) message format that you could manually stitch together with the right IP and port, etc, and basically tell the browser's WebRTC implementation "other client is on this IP and port". I seem to have deleted this code unfortunately, the only thing I saved is this link:



Aha! I based my code on: https://github.com/js-platform/node-webrtc/blob/develop/exam...

Yup, the format is called SDP. It's an old, crufty, and mostly effective way to specify not only the port and IP, but also whether a TCP fallback is being used, video and audio formats, and other things you can mostly ignore when just using data channels. See the "answer" section of 5.2.3 of [1], for details of some of the other magic numbers (which your JS library should handle for you). You can also check out about:webrtc to see SDPs made by other websites, like the Janus demo page.

[1] https://tools.ietf.org/html/draft-ietf-rtcweb-sdp-03
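For a feel of what "stitching together the right IP and port" looks like, here is a small Python sketch that pulls the address and port out of the `a=candidate` lines of an SDP blob. The sample SDP is made up for illustration (documentation-range IP, placeholder credentials, an older-style data channel `m=` line), and `parse_candidates` is a hypothetical helper, not part of any WebRTC library. The field order follows the candidate attribute defined in the ICE RFC.

```python
from typing import Dict, List

# Hypothetical, heavily trimmed SDP answer for a data-channel-only session.
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 0 0 IN IP4 203.0.113.10",
    "s=-",
    "t=0 0",
    "m=application 5000 DTLS/SCTP 5000",
    "c=IN IP4 203.0.113.10",
    "a=ice-ufrag:abcd",
    "a=ice-pwd:placeholderplaceholder00",
    "a=candidate:1 1 UDP 2130706431 203.0.113.10 5000 typ host",
])


def parse_candidates(sdp: str) -> List[Dict[str, object]]:
    """Extract ICE candidates: foundation, component, transport, priority, address, port, type."""
    candidates = []
    for line in sdp.splitlines():
        if not line.startswith("a=candidate:"):
            continue
        fields = line[len("a=candidate:"):].split()
        if len(fields) < 8 or fields[6] != "typ":
            continue  # not in the shape this sketch understands
        candidates.append({
            "foundation": fields[0],
            "component": int(fields[1]),
            "transport": fields[2].upper(),
            "priority": int(fields[3]),
            "address": fields[4],
            "port": int(fields[5]),
            "type": fields[7],
        })
    return candidates
```

For a server with a public IP, a single `typ host` candidate like the one above is the part that tells the browser where to send its packets; the rest of a real SDP (fingerprints, SCTP parameters and so on) is omitted here.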

You need to implement DTLS and SCTP for client->server, just not ICE. But client->server ICE is fairly trivial with ICE lite. It's DTLS+SCTP that are complex, and they don't go away client->server.

Replacing DTLS+SCTP with QUIC would be a good reduction in complexity, and we (the WebRTC team) are experimenting with doing so (https://cs.chromium.org/chromium/src/third_party/webrtc/p2p/...). It has a long way to go, but hopefully someday just having QUIC on your server will be enough.

(I work on WebRTC)

>> But from a game developer point of view, all this complexity seems like dead weight, when STUN, ICE and TURN are completely completely unnecessary to communicate with dedicated servers, which have public IPs.

> I don't remember this being complex (there was some off-the-shelf library for getting a data connection on the server-side), but YMMV.

I've implemented all of these protocols before for use in VOIP. It's only necessary to discover an open port. Once you have that, it's clear sailing -- you never need to do it again. So I have no idea what the author means. If you have a connection on the open internet, then all of these protocols will realise that they need to do nothing.

Having said that, I have never used WebRTC, so possibly the author is confusing some other WebRTC issue with this.

No you're missing his point which is that implementing STUN, ICE, TURN server side is completely pointless.

It's not complex because it's already built-in to the browser, but if your network architecture is client-server (not p2p), you have to implement that entire stack even though none of it is strictly necessary when communicating to a public IP.

The linked HN discussion has a few options:


with the top comment claiming all of this was required:


and other options that may be newer:


Hi! I'm one of the authors of librtcdcpp.

It's still new, but we're starting to pick up some steam in the pace of development, and we have a pretty easy demo in the repo.

If you're actively pursuing "the simplest server-side WebRTC UDP implementation" and the associated tech support then you should throw UDP into your project description and README alongside "DataChannels" for noobs like me.

The WebRTC Data channel is still unavailable in Opera, Edge and Safari... :( https://developer.microsoft.com/en-us/microsoft-edge/platfor...

Opera works - Safari is working on it? At least in my case, we sell a web-based phone system, and that pretty much means we don't target IE/Edge or Safari simply due to the lack of WebRTC.

Looks like it is coming to Safari soon


I'm the author of WebTorrent, and let me tell you that getting WebRTC to run server-side is not fun.

The two best implementations on npm are `wrtc`: https://www.npmjs.com/package/wrtc and `electron-webrtc`: https://www.npmjs.com/package/electron-webrtc

Both have serious shortcomings. `wrtc` uses the Google webrtc.org library which is overly complex since it includes lots of code for dealing with complex video/audio codec stuff that isn't needed to open a simple data channel. And if you're unlucky enough to be on a platform that they haven't made a prebuilt binary for, then you have to wait an hour for a bunch of Chrome-specific build tools to download and compile the library locally. Not fun to wait an hour after typing 'npm install'.

The other library, `electron-webrtc` literally launches a hidden instance of Electron and communicates with it over IPC from Node.js, which means it runs everywhere that Electron does (without waiting for anything to compile!), but it's a pretty heavy-handed approach. Launching a whole Chromium instance when you need a socket is, like, not ideal.

Another idea: how about implementing just the parts needed for Data Channel in JavaScript? A former Mozilla engineer, now Google engineer, actually tried this but gave up before finishing. His code is here: https://github.com/nickdesaulniers/node-rtc-peer-connection It's also not trivial since it requires a DTLS (TLS over UDP) implementation, and it's not exactly trivial to write one of those in JS. There is a chance that Node.js could expose the DTLS implementation from OpenSSL, since that's already part of Node. Then we could probably achieve a "pure JS" (no npm compile step) implementation of Data Channel. But there's still quite a bit more code needed to finish off this implementation. Discussion here: https://github.com/nodejs/node/issues/2398

I'm grateful to all the awesome folks that worked on these implementations, and this isn't meant to snub any of them. WebRTC is hard!

We actually use `electron-webrtc` in `webtorrent-hybrid`, a CLI version of WebTorrent that can talk to "web peers" in browsers (https://github.com/feross/webtorrent-hybrid). Fortunately, most users can just install WebTorrent Desktop (https://webtorrent.io/desktop/) which works nicely.

But this all goes to show just how overly complex WebRTC is, and why we really do need something like this post suggests, a low-level UDP API for the web.

I built a server-side WebRTC implementation for Twilio (as part of a team of two) in Java (with C bindings to libsrtp and openssl) in mid-2012 (and gradually evolved it over the next year and a half before moving on to a different team/project). I'm certainly not going to say it was easy, but it wasn't as hard as you're making it sound. It turns out you can cheat a lot here and there. You don't need TURN at all, and the ICE/STUN stuff can be dirt-simple because you don't need to deal with UDP hole punching or other esoteric things. You can cheat and hard-code most of the things that would usually need to be variable.

Honestly, most of the difficulty back in 2012 was because all of it was poorly documented and the standards were changing with every Chrome release, so it was partially a reverse-engineering effort. If I had to do it again from scratch (even though it's been at least 2 years since I've touched it), I could probably get something working in a week or two, and production-ready in a month or so.

I think a large part of the difficulty you describe is because you're trying to do this in Javascript. Doing it in a language that already has some library support for the various components cuts out a lot of the pain.

I had exactly the same experience (albeit more recently) getting a simple WebRTC DataChannels implementation up-and-running.

Once you realize it's just ICE + DTLS + SCTP, and that each layer has a corresponding library, the work of getting it up and running is mainly just 'plumbing'.

Here's a link to the library I've been working on: https://github.com/chadnickbok/librtcdcpp

(Member of the Google WebRTC team here)

We are aware that it's a pain to use just the data channel, and we're working on it. We're refactoring the native API to be more ORTC-like and make it possible to compile without all the audio/video parts. It will take time, but it should become easier as we make progress.

I disagree that the solution is to expose a UDP API to the web. Writing good congestion control is very hard, and it would not be safe to let web apps do that; doubly so for crypto. As for native apps (even JS/node ones): you already have a UDP API.

The solution is to make the WebRTC API easier to use and simplify the protocol stack. We're working on the first, as I mentioned. And we're experimenting with the second (by using QUIC instead of SCTP+DTLS). Again, it will take time, but eventually we'll get there (probably).

Feross, thanks very much for writing simple-peer! I used it to enable a server-cluster that acts as a WebRTC -> UDP proxy for my MMO game written in Golang. I am using electron-webrtc to run it under nodejs, which probably means about a gigabyte of overhead in total, but given the AWS EC2 pricing structure, I was doomed to be paying for more RAM than I plan on needing, so that turns out to be a non-issue for me.

As I was working on getting electron-webrtc running on my AWS Ubuntu instance, it occurred to me that it should be relatively easy (in other words, probably hours of tedious work) to abstract out the dependencies for everything that isn't required for only running WebRTC DataConnection. (I could produce a list of libraries to shim with placeholders. Call it 'drtc'?) If you are willing to give me some pointers, I'm willing to do the scut-work. (Contact info in my HN profile.)

IIRC the NAT traversal workarounds are optional in WebRTC? Just don't supply STUN/TURN servers when setting up a connection. Then you don't need to worry about it on the server side either.

(ICE means trying STUN and falling back on TURN).

You must supply a STUN/TURN server to the browser when setting up the connection, but if you're connecting to a public server you can use ICE-Lite implementations server-side to simplify the setup.

The ICE RFC does a good job of explaining it: https://tools.ietf.org/html/rfc5245#section-2.7

You don't need to give a server. It works fine with just host ICE candidates from the server and no stun or relay candidates.

NAT traversal is part of ICE, but another part is formally specifying things like how to trigger accepting a connection, and which connection can be tied to which WebRTC session. This can actually make it easier, not harder, to use WebRTC.

In addition, the SDP exchange also sets up DTLS, making sure that whoever the WebRTC SDP was exchanged with is the same as whoever connects at a low level. While you can implement this as a messaging exchange over UDP once the connection is established, it's a nice property that WebRTC doesn't even allow the connection to be established with a non-secured link.

I think the hardest part of the stack is getting a decent, stand-alone implementation. With things like Websockets creating a server is straightforward, but libraries for low-level webrtc are much harder to build.

> Edit: Oh he mentions this, but invalidates it due to the complexity of typical P2P

Use the simple-peer library to hide away all that complexity. Just deactivate "trickle" for negotiation, then you get a simple Answer-Offer exchange with just one message each. I am using this to have WebRTC as a client-server proxy for UDP for my MMO game written in Golang.

> Websites would be able to launch DDoS attacks by coordinating UDP packet floods from browsers.

> New security holes would be created as JavaScript running in web pages could craft malicious UDP packets to probe the internals of corporate networks and report back over HTTPS.

Is this the same reason that the "raw" TCP is not supported on the web? When I first learned WebSocket I was surprised that it is a message-based protocol. What I imagined was a protocol that utilizes HTTP only as a means of negotiation, and the actual transfer is done just as the raw TCP does. But it's not, so I had to create another layer to be interoperable with the existing (raw) TCP server. What was the exact reason?

Edit: Another reason I can think of is the encoding issue, because TCP is byte-based. But the current WebSocket spec already assumes UTF-8 for textual data, and is also capable of sending bytes using `ArrayBuffer`. I don't see how this would matter in practice.

Just a remark: You can use the websocket spec to emulate TCP behavior, that is, get a stream instead of messages: send only one giant message (payload set to the maximum) and use continuation frames to stream new chunks of data to the other side. FIN is set only once the stream has finished.
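A minimal Python sketch of that trick, following the frame layout in RFC 6455: the first frame carries the binary opcode with FIN clear, every later chunk goes out as a continuation frame, and an empty final frame sets FIN. The function names are hypothetical, and the frames are unmasked, as a server would send them; a client would additionally have to mask each payload.

```python
import struct
from typing import Iterable, Iterator

OP_CONTINUATION = 0x0
OP_BINARY = 0x2


def encode_frame(fin: bool, opcode: int, payload: bytes) -> bytes:
    """Encode one unmasked WebSocket frame (FIN bit, opcode, length, payload)."""
    first = (0x80 if fin else 0x00) | opcode
    n = len(payload)
    if n < 126:
        header = struct.pack("!BB", first, n)
    elif n < (1 << 16):
        header = struct.pack("!BBH", first, 126, n)  # 16-bit extended length
    else:
        header = struct.pack("!BBQ", first, 127, n)  # 64-bit extended length
    return header + payload


def stream_as_one_message(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield frames that together form a single fragmented binary message."""
    opcode = OP_BINARY
    for chunk in chunks:
        yield encode_frame(False, opcode, chunk)  # FIN stays clear while streaming
        opcode = OP_CONTINUATION
    yield encode_frame(True, opcode, b"")  # empty last frame only carries FIN
```

Since `chunks` can be a generator, the "message" can stay open for the lifetime of the session, with FIN sent only when the generator is exhausted.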

The downside: A lot of websocket APIs (and most especially the one in the browser) don't support this and will only send/receive complete messages. Which means if you want streaming support you better implement it as a layer on top of messages, since it works everywhere.

It's a little bit sad that the websocket spec is complicated by the continuation frame feature while in reality no one has a reason to use it.

And back to the question: The masking "feature" of websockets is also there to prevent browsers from speaking raw TCP. Without it Javascript could craft exact TCP payloads, which might in certain situations be used to directly talk and manipulate internal services. The masking guarantees that the remote on the TCP connection will get some random data after the websocket header.
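The masking step itself is tiny. As a sketch (the helper name is my own), RFC 6455 has the client XOR every payload byte with a 4-byte masking key that is chosen fresh for each frame, so the script never controls the exact bytes that hit the wire:

```python
import os


def mask_payload(payload: bytes, masking_key: bytes) -> bytes:
    """XOR each payload byte with the 4-byte key; applying it twice restores the input."""
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


if __name__ == "__main__":
    key = os.urandom(4)  # the browser picks this, not the page's JavaScript
    wire = mask_payload(b"PING internal-service\r\n", key)
    print(wire.hex())                 # differs on every run
    print(mask_payload(wire, key))    # the receiver unmasks with the same key
```

The same function unmasks on the receiving side, since XOR with the same key is its own inverse.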

That's interesting, I've always thought the WebSocket spec is too complicated with all those frame types and message fragmentation, and I completely ignored continuation frames when I had to implement the protocol myself. But it seems more versatile than I imagined. Though the workaround you mentioned sounds a bit hacky.

And yeah, I now remember that masking was also a problem at the time. So even if web browsers adopt a new API to send fragmented messages, it would still not be possible to plug them directly into legacy servers. Sad.

Edit: It would also be possible to send a single giant message, which contains a single giant continuation frame sent for the lifetime of the session, which is followed by a FIN frame. Am I correct?

Yes, you could use the single giant message too. The drawback to the continuation frame approach is that you can't interleave it with ping frames (for connectivity checks) and that you won't be able to signal end of stream (FIN). If that's not required because your higher-level protocol takes care of those things some other way, it would be fine.

The problem that's missing is being able to speak to existing services without requiring them to implement a websocket interface.

The entire websocket code should have never existed in the first place — just use standard TCP with standard TLS, like any other system, too.

Maybe require a browser permission for that, and that issue is solved, too.

I guess they didn't want browsers to annoy users with popups like: "Do you want to allow a TCP connection to host.com:234?" The average user would have no idea what this means and would just accept anything. Maybe they could have elided the check if the destination is the same host that the HTML/JS was delivered from, but that would still leave some questions unanswered.

Yes this is the reason, but the message-based part is irrelevant. They need to use an existing http connection where both sides agree on the upgrade to prevent lots of possible security holes.

If you aren't aware, the FTP protocol lets one specify a port and address to connect to; it was not that long ago that servers would not check if the address for the data connection was the same as the address for the control connection, and so one could send data to any port on the internet from an anonymous FTP server. This caused all sorts of headaches.
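For anyone who hasn't seen it, the FTP `PORT` command carries the target as six comma-separated numbers: four for the IPv4 address and two for the port (high byte, low byte). A minimal Python sketch of the check those old servers skipped follows; the function names are hypothetical and this is not taken from any particular FTP server.

```python
from typing import Tuple


def parse_port_argument(arg: str) -> Tuple[str, int]:
    """Decode 'h1,h2,h3,h4,p1,p2' from an FTP PORT command into (host, port)."""
    parts = [int(p) for p in arg.split(",")]  # int() raises ValueError on junk
    if len(parts) != 6 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("malformed PORT argument")
    host = ".".join(str(p) for p in parts[:4])
    port = parts[4] * 256 + parts[5]
    return host, port


def is_safe_data_target(port_arg: str, control_peer_ip: str) -> bool:
    """Only allow data connections back to the host that opened the control connection."""
    host, _port = parse_port_argument(port_arg)
    return host == control_peer_ip
```

Without a check like `is_safe_data_target`, a client can name any third-party address and have the server open the connection on its behalf, which is the bounce behaviour described above.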

Yeah, but both parties can leverage the existing connection without resorting to a message-based protocol, can't they? They could just assume they're now sending and receiving bytes after the upgrade.

What I meant when I said "utilizing the legacy TCP server" was creating some plugin that is attached to the server, effectively acting as a lightweight HTTP server that just does the upgrade process (within the same connection). In this way the remaining communication doesn't have to involve message encapsulation/decapsulation each time a message is transferred.

I think the message encapsulation was chosen because it's more work to implement a message-passing system on top of a stream than to do the opposite.

This is still a supported feature of some FTP implementations. It enables the client to initiate a file transfer between 2 FTP servers.

See eg https://support.microsoft.com/en-us/help/247132/how-to-perfo...

I am using WebRTC with a MMO server cluster written in Golang. Basically, I got Simple-Peer WebRTC library to run on top of node-electron in NodeJS. My Golang coordinating server then also coordinates a farm of 4 NodeJS processes which basically run as UDP proxies. The complexity of WebRTC is tamed by Simple-Peer.

One downside: Getting node-electron to run takes some wading through shared library installs to satisfy Chrome dependencies. However, this turns out to be a series of straightforward responses to error messages. Another downside: I'm wasting some memory and swap on Chrome dependencies I'm not using. In the case of my game, I'm always going to be more CPU/bandwidth bound than memory bound, so this turns out to be a non-issue. A third downside: WebRTC uses up 72 bytes of each packet with its own header information.

All that said, it appears to be running like a champ. It also makes a big difference in the play-ability of my game in poor network conditions.

Are you using the data channel p2p or client-to-server?

(I work on WebRTC and implemented much of the data channel; I'm always interested when someone uses it)

Client-server. I'm being extra paranoid about cheating/security, so it's a naive client-server, authoritative-server architecture. The client isn't much more than a dumb terminal, but for 2D sprites, not text.

I think some of the ensuing discussion here says clearly why UDP in the browser (at least as in giving developers access to firing off random UDP packets from a browser) is a bad idea: people don't understand network protocols, yet they feel they have something to contribute to the discussion.

I'd like to focus on the security issue of 'UDP probes and sending results back over HTTPS'. This is an /inherent security issue/ of running ANYONE else's code within a corporate network.

ECMAscript/JavaScript should NOT be blindly run within corporate networks, and web 2.0 is insane for making all websites expect that.

If your underbelly is that soft you have major security issues. There are dozens and dozens of ways to get a network probe behind a firewall. Pretty much anything that can be told to fetch anything anywhere can possibly be used. HTML can do a host scan or even a port scan without JavaScript using maliciously crafted DNS records and img or other tags combined with the use of timing to determine which ones are blocking and which ones got a connection refused or a connection right away.

Firewalls are much more porous than most people think. They're only a first line of defense to prevent badly configured or broken junk on your network from being instantly exploitable.


Can you elaborate on this? Legitimately curious.

It allows malicious code to scan ports. It allows easy building of difficult-to-manage tunnels (never mind corkscrew and port 443 SSH servers breaking out of 90% of corp networks I have been on via their HTTP/S proxy).

Is port scanning dangerous? How can JS in a browser build a tunnel to anything?

Port scanning itself is obviously not dangerous, however I believe the author is imagining a situation where a user on the other side of a corporate firewall views a webpage which then portscans from the inside then sends the results home/starts an attack on specific ports.

edit: To answer your direct question, port scanning is probably not /that/ dangerous. It is just information gathering. You almost always need to take steps after that. But if you can build a port scan you can build any other protocol up via this scheme (possibly even just proxy other UDP tools directly) and attack internal UDP services. It just seems like a glaring network hole considering how most corp LANs are set up.

This would take a few pieces, but port scanning is pretty feasible considering I have already done it more painfully using XSS exploitation frameworks to demonstrate XSS risks for customers before.

1.) You need some javascript running in a browser on the corporate network all set up for the corporate proxy to get out to "the internet". This browser would be on a machine on the internal LAN.

2.) UDP access enabled in JavaScript somehow.

3.) A user wanting to purposefully tunnel traffic would then set up a local UDP listener and build a proxy that would communicate with the JS using UDP. You would then need an HTTPS end point "on the internet" that would make your legitimate requests. You can do long lived streaming connections. The machinery isn't that important.

4.) That is it. [Special HTTP Proxy]<--->[Corp HTTP Proxy]<--->[Browser]<--->[UDP Proxy]<--->[Your apps that want to escape]

5.) You would of course reuse existing protocols and there are a ton of ways to flow all of this at each step. But it is just one more hole.

Side note: You can perform sort of crude TCP port scanning already with a meaty enough XSS exploit. An unfettered UDP connection would let you do UDP scanning to an internal LAN then as well.

Anyway, this whole scenario is really complex. It would be much easier to just use Corkscrew and the existing corporate HTTPS proxy, because you need to invent a browser-to-UDP-proxy scheme and a browser-to-"your HTTP proxy" scheme that can ultimately carry generic UDP/TCP request traffic. But your system listening on HTTP and/or HTTPS on the Internet would get requests and make the actual generic UDP/TCP connections for you. Honestly this whole scheme sounds complex and annoying and there are already other schemes like proxying TCP via DNS that are more accessible for exfiltrating data :)

The general point is code in a browser is inherently "permitted" and easy to get going on an internal corporate network. The more ways that code can reach out of its sandbox the larger the attack and defense surface you have as a corporate network trying to keep machines locked down and "under control".

js exploits in major browsers leading to remote code execution

There are known attacks from the last year, and probably a huge pile of 0-days waiting to be used.

Parasitic companies want the ability to run anything they want, on anyone's machine, accessing all of the users' personal data, as well as their competitors' data if they can get at it, and use it for whatever reason they want, and have the ability to secure those revenue streams by erecting wholly artificial and socially destructive barriers to their removal, such as getting accepted to represent their interests in standards bodies then twisting the standards to do what they want. An example of this is EUI addresses in IPv6 with the MAC address as part of the IP address as a revenue stream maximization method for advertisers.

There was a YouTube video a while ago showing JavaScript able to run an operating system in a web browser as well as games inside the OS. Recently, Chrome added a task manager and user logins. Chrome is no longer a web browser, it's an operating system, and what enables an entire ecosystem of abuse is JavaScript.

In a few years, someone will figure out a technology that will disable JavaScript from running on client machines via firewall filter. It will break a lot of websites. Let it.

Each tab runs in its own thread, a page consumes CPU and RAM. I'm not sure why it's a bad thing that a web browser has a task manager. Especially since one of the powerful things about the web is that no extra software is required for debugging / development.

It's totally possible today to create a filter that will scan for JS and remove it from sites with one of those man-in-the-middle corporate proxies.

But it'll break pretty much every site. If you're talking about a company, that's as conducive to security as mandating weekly password changes requiring no repetition, symbols, lower + upper case and 15 characters.

Users will find ways around it. Like bringing unmanaged devices to browse the internet.

In terms of broken websites, corporate users don't have the influence as they did in years past.

Since we're trending towards pretty much every person in more economically developed countries having smart phones.

If a website doesn't work for a business's users I don't see how site owners will care.

On the development of web browsers becoming an OS of their own. Sure there's work to be done to improve security. For example, fingerprinting needs to be properly mitigated [1].

The web is open. Anyone can implement a web browser. Chrome, Safari, Firefox, Edge are trending towards writing one webpage/app and having it run everywhere.

Windows, OSX, Linux, Intel, ARM, Desktop/Laptop, Mobile/Tablet aren't a concern past display/formatting.

[1] https://panopticlick.eff.org/

Meet DNS rebinding: any JS can probe and interact with all internal services.

The more and more I reflect on it, the greater and greater a mistake JavaScript turns out to have been.

What's a shame is that there is a need for much of what it offers (and, if not a need then at least a desire for much of the rest). If only there were a sane, least-privileged alternative.

> It falls down because WebRTC is extremely complex. This complexity is understandable, being designed primarily to support peer-to-peer communication between browsers, WebRTC needs STUN, ICE and TURN support for NAT traversal and packet forwarding in the worst case.

You don't need WebRTC's level of complexity to do P2P. WebRTC is an over-engineered Rube Goldberg machine for many reasons, including the fact that it tries to include so many capabilities in a single standard. It's also complex because STUN, TURN, and ICE are overly complex. WebRTC should have used a simpler underlying design. You really, really don't need all that. I say this as a developer of multiple P2P protocols including some that are used in serious applications.

Then there's stuff like: https://github.com/js-platform/node-webrtc -- you don't have to do your whole backend in Node if you don't want, but you could plug something like this in on your servers and speak WebRTC to clients. In P2P a server is just another node and there is nothing that says servers can't talk in P2P networks.

I do believe it would be nice to have UDP in browsers somehow. In fact I think it would have been far better to just add UDP to the browser spec and leave WebRTC out completely. It could have been implemented in JS (and WASM in the future) using TCP and UDP transport provided by the browser. Doing it the way it's been done is like including application software in a computer's ROM. It's the wrong place to put that level of abstraction.

I believe the security model in browsers is fundamentally wrong. I understand why it evolved the way it did, but it is far from optimal.

In particular, I mean that subrequests are sent by default with credentials (cookies). This is backwards. In an ideal world, I should be able to send arbitrary anonymous HTTP requests, and should have to jump through hoops to send along the user's cookies. In reality, it is the other way around. This is the original sin that makes CSRF possible. Any site can send a request to any other site I am logged into, and pretend to be me. I understand that we can't change that now, but it just seems wrong.

Right now, you can send anonymous requests (e.g. using window.fetch), but unless a specific CORS header is returned, you cannot access the results! Ideally, it would be the other way around, and you could only send authenticated requests if the target site cooperates.

Now you can make as many requests as you want (e.g. by using img or iframe tags), but if you want to use the results, you need 1. the site's cooperation, or 2. you need to include remote javascript. This is the next backwards thing. Why do I have to execute remote javascript when I just want JSON data? This should be absolutely discouraged in an ideal world!

The current security model precludes a lot of things. One is the classical Mashup. Back then, when Web2.0 was hot, I made a simple HTML page that scraped another website for geodata and wanted to display it on a Google Maps map. Imagine my shock when I found that I cannot read another (open, non-credentialed) web page with XMLHttpRequest.

There are a bunch of applications that I just cannot write within the current security model, and I believe it is in the interest of certain people to stay this way, so that they can remain in control of content.

The same applies even more when I think of allowing UDP or general network access. Imagine an IMAP client or BitTorrent client in the browser. A pirate iTunes that doesn't rely on a central server. A Tor implementation. You could even do JS crypto somewhat securely - you wouldn't serve the .html page over HTTP, just a simple static file that you'd deliver like any other program.

Sometimes I dream of a HTML 6+ "application profile" that would be opt-in, and allow you to do these kind of things.

I think the title of the blog post is misleading.

What he really wants is a connection-oriented, TCP-like protocol with more flexible handling of missing/delayed packets. (As this is a frequent use-case in game development).

His solution is a UDP-based protocol that satisfies this and other use-cases specific to networked games while keeping the connection semantics of TCP.

However the fact that it's UDP-based seems like an implementation detail to me. The proposed solution hides a lot of other UDP-features that are not contributing to those use-cases but would absolutely make sense in other situations (like connectionless messages and broadcasts)

> What he really wants is a connection-oriented, TCP-like protocol with more flexible handling of missing/delayed packets. (As this is a frequent use-case in game development).

What I really want is the most UDP-like thing possible in the browser, that lets people build whatever higher level protocol they want on top of it, just like game developers do on every other platform.

Being connection based is important because if it wasn't then it would quickly turn into a DDoS tool, or a security nightmare for probing internal networks for vulnerabilities.

So clearly it needs to be connection based. And if it needs to be connection based, then that rules out broadcast packets too.

Would a clear-text token, in addition to an encrypted connect token, allow servers to drop packets faster (since it wouldn't require decryption)? A clear token could be changed (cached, expired) however the back-end wanted, which would include dropping all new connection attempts when under DDOS.

I didn't see a port number anywhere. Shouldn't you define 443/UDP (or other) as the only connection port? This would allow networks to manage the traffic, for example dropping all UDP/443 traffic internally, or managing bandwidth at a firewall.

Finally, shouldn't P2P be in the mix? If not now, won't we be doing this again in a few years?

Thanks for taking on such a big challenge, and for listening to the peanut gallery.

I actually do stuff like this. The connection request packet mixes "additional data" protected by the AEAD with encrypted data (keys). That lets me quickly reject stale tokens because the expire timestamp is public but protected by the signature check, instead of having to fully decrypt first to see that the token is stale.
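
A rough sketch of that ordering (cheap expiry check first, cryptographic check second). Note the swap: netcode.io uses a libsodium AEAD, while this uses HMAC-SHA256 from the Python standard library as a stand-in, so the body here is authenticated but not encrypted. The packet layout and the names `seal` and `check` are assumptions for illustration only.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # size of an HMAC-SHA256 tag


def seal(key: bytes, expires_at: int, body: bytes) -> bytes:
    """Build: 8-byte public expiry | body | tag over (expiry + body)."""
    header = struct.pack("!Q", expires_at)
    tag = hmac.new(key, header + body, hashlib.sha256).digest()
    return header + body + tag


def check(key: bytes, packet: bytes, now: int) -> str:
    """Classify a packet, doing the cheapest test first."""
    if len(packet) < 8 + TAG_LEN:
        return "malformed"
    (expires_at,) = struct.unpack("!Q", packet[:8])
    if expires_at <= now:
        return "stale"  # rejected without touching any cryptography
    signed, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, signed, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return "forged"  # e.g. someone rewrote the public timestamp
    return "ok"
```

An attacker can still put a future timestamp on a packet to force the expensive check, which is the case the clear-text token idea elsewhere in this thread is aimed at.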

I don't have any defined port because in the game industry we tend to run multiple game servers on the same box, so we take up a bunch of ports. I acknowledge that this could cause issues with corporate firewalls, and would be open to ideas how to solve this.

Regarding P2P, I think WebRTC solves the problem of browser to browser communication just fine. Any attempt to support p2p here would just end up rewriting WebRTC.

What I'm trying to do is solve just one small part of the problem, in a WebSockets like way, just one thing that would enable a lot of other cool things to be built on it, if it were adopted by browsers.


> quickly reject stale tokens because the expire timestamp is public, but protected with signature check

A forged timestamp would still need a signature check, but a clear-text token would not. The speed difference would only be important with a flood of forged packets, each of which would need to have its signature checked. Additionally, each system could implement a different type of defense: like issuing regional clear-text tokens, or A/B tokens to find which account is leaking a clear-text token, or giving each server its own token so a DDOS attempt that wanted to force signature checks would need to get the right token (and get it again when an attacked system changes it). Hosting providers that offer DDOS protection today could offer integrations based on clear-text tokens that wouldn't need any interaction from a game system (other than submitting the clear-text tokens). I don't know if routers typically have the ability to check a signature, but they can probably drop packets that don't pass a (clear-text token) mask check.
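
As a sketch of the prefilter being proposed: packets carry a fixed-length clear-text token at the front, and anything whose prefix is not currently accepted is dropped before any signature work. This is not part of netcode.io; the 8-byte token length and the `ClearTokenFilter` name are hypothetical.

```python
class ClearTokenFilter:
    """Drops packets whose clear-text prefix is not an accepted token."""

    def __init__(self, token_len: int = 8) -> None:
        self.token_len = token_len
        self.accepted = set()

    def allow(self, token: bytes) -> None:
        """Start accepting a token (e.g. one per region or per server)."""
        if len(token) != self.token_len:
            raise ValueError("token has the wrong length")
        self.accepted.add(token)

    def revoke(self, token: bytes) -> None:
        """Stop accepting a token, e.g. after it leaks during an attack."""
        self.accepted.discard(token)

    def passes(self, packet: bytes) -> bool:
        """Cheap prefix lookup; only passing packets go on to signature checks."""
        return packet[:self.token_len] in self.accepted
```

Passing the filter proves nothing about authenticity. It only decides which packets are worth the cost of a signature check.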

The port issue is interesting. Maybe a range of ports (44300-45100) would work for network administrators. Maybe there should be restrictions and a listing of uses like the first 1024 (this is voice, that port's video, file transfer, etc...)

Browsers should also limit connections. Only 2 connections/IPs at a time? Only JavaScript from the origin domain (the address in the URL) can open connections, not advertisements or iframes. Maybe ask the user to open connections like browsers ask to use your location or audio/video.

I wonder too if a browser should only open a connection when presented with a request signed using the (private) domain SSL cert? That would drastically reduce the ability of bad JavaScript to co-opt browsers of a website (into a DDOS of connection requests). To use a website's traffic for a DDOS, you would need the SSL keys. Browsers could also be required to add the domain and verified domain sig to a connection request, so a website that causes a DDOS would be identified. I'm not sure identifying the website would lead to accountability since they may have been compromised, but it would prove which domain caused a browser-based DDOS; reddit.com couldn't claim a bad advertisement caused their millions of browsers to open connections. This wouldn't stop forged packets, but it should stop innocent web browsers from becoming a tool for DDOS.

You write that the big disadvantage of WebRTC is the complexity. There is a nice library named simple-peer that hides all of that. Using that and node-electron, I implemented a small server cluster that acts as a UDP to WebRTC proxy for my game. This gives you encryption and connection orientation.

See my other comment for the downsides: https://news.ycombinator.com/item?id=13745055

Don't worry about mitigating DDoS beyond not exposing amplification weaknesses. At the end of the day, DDoS is an economic problem, not a technical one; it crucially hinges on scarcity of bandwidth. It's not your problem to solve.

I understand that you love UDP and I won't try to move you on that. But can we avoid browsers? They're basically the worst things ever.

Please list the DDoS concerns you have with the proposal and I'm happy to discuss them.

> Additionally, the server enforces that only one client with a given IP address may be connected at any time

So much for people behind carrier NATs

My bad: I meant IP address + port. Fixed in the article.

Well what good is that? You can't tell if multiple IP,port pairs are from a single client or multiple clients.

The nat device will have different source ports for two connected clients.

It would also have different source ports if the client is using random source ports as well. So two client connections from the same host behind NAT would look exactly the same as two client connections from two different hosts behind NAT.

Hence the authentication on the web backend and the unique 64bit client id.
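
A small sketch of how a server might key connections on both things mentioned here: the (IP, port) pair it sees on the wire and the authenticated client id from the token. The `ConnectionTable` class is hypothetical, and rejecting a second connection with an already-connected client id is an assumption of this sketch rather than something stated in the thread.

```python
from typing import Dict, Tuple

Address = Tuple[str, int]  # (ip, source port) as observed by the server


class ConnectionTable:
    """Tracks connected clients by address and by authenticated client id."""

    def __init__(self) -> None:
        self.by_address: Dict[Address, int] = {}
        self.by_client_id: Dict[int, Address] = {}

    def connect(self, address: Address, client_id: int) -> bool:
        """Accept unless the address or the client id is already connected."""
        if address in self.by_address or client_id in self.by_client_id:
            return False
        self.by_address[address] = client_id
        self.by_client_id[client_id] = address
        return True

    def disconnect(self, address: Address) -> None:
        """Forget a client, freeing both its address and its client id."""
        client_id = self.by_address.pop(address, None)
        if client_id is not None:
            del self.by_client_id[client_id]
```

Two players behind the same carrier NAT show up with the same IP but different source ports and different client ids, so both are accepted.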

The use of libsodium looks absolutely brilliant. Would the author recommend using netcode.io for general games, not just agar.io or browser style stuff?

I'm an external contributor who did the C# bindings for netcode.io available here: https://github.com/RedpointGames/netcode.io

As someone who wrote her own (non-encrypted) networking protocol for desktop games, netcode.io certainly seems easier (especially with the built-in security), so in the near future I intend to move my projects over to using netcode.io instead of what I have currently.

Yes, it would work great for client/server games, provided that you host dedicated servers and have a web backend to generate the connect tokens (this could be your matchmaker, for example).

Answer: a browser was for traversing hypertext documents. UDP basically is not good for this.

Now a browser is just an abstracted operating system. So, fuck it, I guess, let's add a second TCP/IP stack.

More like TCP/IP is not good for data where only the current state is relevant and old state can be discarded: live video, games, stock prices...
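
What "old state can be discarded" looks like on the receiving side, as a minimal sketch: each update carries a sequence number and anything not newer than the last applied update is simply ignored. The `LatestState` name is made up for illustration, and sequence-number wraparound is deliberately left out.

```python
class LatestState:
    """Keeps only the newest update; late or duplicate ones are dropped."""

    def __init__(self) -> None:
        self.sequence = -1
        self.state = None

    def receive(self, sequence: int, state) -> bool:
        """Apply an update if it is newer than what we have; report if applied."""
        if sequence <= self.sequence:
            return False  # older or duplicate: nothing to wait for, nothing to resend
        self.sequence = sequence
        self.state = state
        return True
```

With TCP, update 3 would sit in the kernel until a lost update 2 is retransmitted, even though the application would have thrown update 2 away anyway.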

The browser is used for all these things; why not have a protocol that works for these scenarios?

(Client<->Server; WebRTC and ObjectRTC are p2p orientated and excessively complex due to this as discussed in article)

Since the P2P feature set is a superset of the client-server feature set, it would only increase complexity in the browser to add another protocol for the client-server feature set.

ObjectRTC? That's a new one..

The browser is a much better multi-user operating system than the actual operating systems people tend to use. A video game on the average Windows, Mac, or Linux desktop has full read and write access to the memory of a tax app on the same desktop, defeating the entire point of memory protection. If the game and the tax app are in a browser (or in a mobile OS, to their credit), this isn't so.

That desktop OSes happen to be called "desktop OSes" and web browsers happen to be called "web browsers" is fairly irrelevant to what they're actually good at. Desktop OSes make great hardware abstraction layers, just as server OSes make great hardware abstraction layers for virtualization platforms. But they're fundamentally misdesigned for much of anything else.

Desktop operating systems have the capacity to do this; "server operating systems" (which are just desktop operating systems configured differently) already do this.

You run each program as a different user account. The HTTP daemon doesn't run as the same user as the DNS daemon even if they're on the same machine.

This is obviously annoying on a desktop because you have to use user switching to use another app. It would be better if desktop operating systems provided native support for it so that user accounts had a sub-account per app and apps automatically ran like that, and then a permission prompt if one app tries to access another app's files, which you can grant as one-time access or permanently.

Then you expect your email program to try to access a document you're attaching to an email, but if a video game tries to read your tax returns you can be appropriately suspicious.

Correct. This is a design problem, caused by building ill-suited interfaces. (This isn't a criticism of the people who designed the interfaces; we simply didn't have the depth of experience when UNIX and Win32 and their predecessors were designed, and now we're stuck with compatibility for those.)

The interface between a web page and the outside world allows it to do things like access its site and not others, request a particular file be opened, send structured messages to other sites that wish to opt in to such access, etc.

The interface between a process and the outside world gives it either TCP sockets on behalf of the machine, or nothing at all; either the ability to open any file as the current user, or nothing at all.

I used to be excited about proposals for desktop operating systems to provide sandboxing via multiple UIDs and native interfaces for passing files around and powerboxes and all of that good stuff. Then I realized that the web platform does all of that and is very widely adopted and well-tested, and there's no hope of the existing platforms catching up.

I used to have two user accounts, one of which was allowed to run the Flash plugin so I could listen to music, and one of which held my important work. Then the web folks figured out how to sandbox the Flash plugin without requiring me to Ctrl-Alt-F8 to pause a song. Then they figured out how to not use a native-code plugin at all.

I don't particularly like this conclusion in an abstract sense, but it's certainly let me get on with getting things done instead of hoping for a future that will never come.

This is more or less what the sandbox mechanism on macOS does today. Unless a sandboxed app was given special entitlements from Apple, it can only access files that were brought into the sandbox at some point (for instance by selecting it in an 'open file' dialog, aka 'powerbox').

Firejail is another good approach.

> A video game on the average Windows, Mac, or Linux desktop has full read and write access to the memory of a tax app on the same desktop

What? No...

Not the in process memory, but yes to the stored data, which is arguably as bad.

On Linux, generally speaking yes to reading the in-process memory too.

I can read /proc/$PID/mem provided my UID = the effective UID of the process.

Or I can also use ptrace(), again if my UID = the effective UID of the process, both to read the process's memory and even to modify its code at runtime.

(Some LSMs will block this...)

What? No...

Many games on Windows even ask for administrator to run (or update), so that's already game over from a security perspective.

Debugging APIs.

If you stick to App Store (sandboxed) apps on the Mac, you're safe from this. Those apps only have access to files explicitly opened by the user (through the system open file dialog, drag & drop, etc)

> Why can't I send UDP packets from a browser?

You can, if the browser is Chrome: https://developer.chrome.com/apps/sockets_udp

Worth noting this is using the "chrome.etc" object, which is only available to apps and/or extensions.

Which, at this point, just means extensions: https://blog.chromium.org/2016/08/from-chrome-apps-to-web.ht...

There are still critical holes in the extension API that they haven't plugged, e.g. filesystem access (super duper important if you want to make an app that can edit / create images, for instance, and not want to shove everything into the downloads folder or force a chooser every time). I'm not sure how they're going to shut things down while crippling the ecosystem so severely (though to be fair, Firefox is doing a similar crippling-move (but for solid reasons, IMO)).

Firefox can do this via extensions as well. That's how the Beef fake flash update works (presents something that looks like it's a flash update that really installs a browser extension).

The author is obviously going out of his way to implement this in a secure way as a software practitioner. I would love to see a review of the design and implementation from the standpoint of a security expert.

Interesting idea I had: why not have many QUIC streams?

I think the only equivalent to UDP would be: 1 QUIC stream per packet, which avoids blocking inside a stream. I'm not into QUIC, but I guess creating and managing a (temporary) stream would have a lot more overhead than just sending an encrypted UDP packet.

The other downside is that the browser (currently only chrome anyway) would present QUIC to the user only in form of QUIC streams with HTTP on top - which means the browser would reject any streams with non HTTP content and there's no API in Javascript for working with other content. The XHR and fetch APIs will utilize QUIC but only for HTTP on top of it.

This means that for transmitting unreliable and potentially unordered packets we would need a new browser API anyway. The WebRTC data channel API allows reading and writing this kind of data, but it is bound to the WebRTC protocol stack (SCTP/DTLS).

This would be the same as having a single QUIC stream, since QUIC already supports multiple data streams. The author addresses that with "Unfortunately, while head of line blocking is eliminated across unrelated data streams, it still exists within each stream."

However, you could use a pool of x streams; when sending an update for the tick 't' then use the (t mod x)th stream, and thus if one stream becomes blocking for some time then you do receive the next update (since it's using a different stream) and you can treat it as something like a (1/x) rate of UDP packet loss.
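
The (t mod x) assignment is small enough to write out. This sketch has no QUIC in it; `stream_for_tick` and `delivered_ticks` are hypothetical helpers that only model which ticks would still get through while one stream in the pool is stalled.

```python
from typing import Iterable, List


def stream_for_tick(tick: int, pool_size: int) -> int:
    """Pick the stream index for a tick by round-robin over the pool."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    return tick % pool_size


def delivered_ticks(ticks: Iterable[int], pool_size: int, stalled_stream: int) -> List[int]:
    """Ticks that still arrive on time while one stream is head-of-line blocked."""
    return [t for t in ticks if stream_for_tick(t, pool_size) != stalled_stream]
```

With a pool of 4 and stream 1 stalled, ticks 1 and 5 out of 0..7 are held up and the rest arrive, which is the 1/x loss rate described above.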

Are you thinking one stream per packet?

You could have a pool of them, right? Or would you have to actually tear it down after each packet

Maybe. Why not?

but QUIC uses UDP?

Minecraft works great over TCP[1], so casual clickers like agar.io mentioned there, and most games except fast-paced shooters will work too.

[1] http://wiki.vg/Protocol

Minecraft "works" over TCP but I wouldn't call it great. TCP works great for things like block data since keeping those ordered makes it easy to send changes instead of resending whole chunks. On the other hand, entity movement being TCP is a real pain due to the head of line blocking which is one of the reasons PvP can be a pain and why it can be impossible to melee mobs without them hitting you first.

I like this take on UDP v TCP: https://1024monkeys.wordpress.com/2014/04/01/game-servers-ud...

While agar is a casual clicker the core game experience revolves around sorting out collision conflicts. I can see how high performance would be desirable.

Some things I noticed about the proposed protocol:

- It's not clear how data is encrypted. I don't see a client key being generated. It will be very difficult to do this right without using something like TLS. Are the keys just sent over an unencrypted channel?

- The connect token should contain the client's IP address, so the server can verify the same IP that got the token is trying to connect. This prevents someone from using a valid token for reflection attacks.

The symmetric key / connection token is generated with netcode_generate_connect_token. This is normally done on your authentication server, which runs over normal HTTPS. Thus the workflow is basically like this:

- Web browser connects to like https://auth.myapp.com. This can either be part of the login process or server discovery, it's totally up to you and specific to your app. This service calls netcode_generate_connect_token and returns the symmetric key / connection token in the response.

- The client then uses this symmetric key / connection token when calling netcode_client_connect.

- And you're done, you can now use the rest of the netcode APIs.

The second thing that you mentioned isn't required. The token is unique to the client already, so a server can confirm that it's a specific client connecting.

Presumably the keys are sent over HTTPS.

Yes, the connect token is transmitted from web backend to client over HTTPS.

This token contains public and private data. The public data tells the client which servers to connect to, what the keys are etc, hence the need for HTTPS.

The private data is what gets sent over UDP in connection handshake, and is encrypted with a libsodium AEAD primitive using a private key shared between the backend matchmaker and the dedicated server instances.

Because of this clients cannot read, modify or forge the connect token private data, so cannot connect unless they get a token from the backend.

Tokens are only valid for a specific authenticated client to connect to a small set of n dedicated servers, and expire quickly (30-45 seconds).

Do the authentication over TCP and then fallback to UDP for rest of the communication? For encryption, have you considered DTLS? Read more in `Hybrid Implementation` section of http://ithare.com/udp-for-games-security-encryption-and-ddos...

The article misses one important reason why allowing UDP from the browser is a bad idea:

UDP provides no congestion control. Given the (lack of) quality of what gets developed for the browser, allowing it to send raw UDP would most likely cause an extraordinary amount of problems for the Internet as a whole.

Games have been spamming UDP packets over the Internet for the last 20 years and the Internet seems to be doing OK.

Why enforce server limits? I'm building a multiplayer game where user actions can happen at most every 150 ms. Am I naive thinking that one server should be able to handle thousands of simultaneous users? I understand the limits when it comes to fast-paced games though.

It depends. MMOs can generally handle thousands of users per server, but realistically those generate only one event per second. There are faster-paced games like Planetside and Battlefield which allow for hundreds of users per server though.


> but realistically those generate only one event per second

How does that work with seeing other avatars walking around in real-time in for example WoW? Are their trajectories only updated once a second? In my game it's not really displayed in that way but I'm curious.

Player position is not very critical in WoW or similar games. Combat is mostly selecting an enemy and attacking it. Or selecting an enemy and casting a ranged spell. The game is designed with this server lag in mind.

Servers can easily sync player positions once per second and the clients would happily interpolate between prev & current positions. You are probably seeing things a little late/different than server but it does not matter.
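
A minimal sketch of that client-side smoothing, assuming 2D positions and a fixed one-second sync interval. The `interpolate` function is illustrative and not taken from any particular game.

```python
from typing import Tuple

Position = Tuple[float, float]


def interpolate(prev: Position, curr: Position, elapsed: float,
                interval: float = 1.0) -> Position:
    """Blend linearly from the previous synced position to the current one.

    `elapsed` is the time since `curr` arrived; the blend factor is clamped
    so the avatar never overshoots the last known position.
    """
    t = min(max(elapsed / interval, 0.0), 1.0)
    return (prev[0] + (curr[0] - prev[0]) * t,
            prev[1] + (curr[1] - prev[1]) * t)
```

Rendering this way means the client is always drawing slightly in the past, which matches the "a little late/different than server" remark above.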

I can't understand this bashing of WebRTC. It is good technology. TURN config is optional, not compulsory, but I believe STUN is still necessary because, unlike the public server, the client is behind NAT. Then again, setting up a TURN+STUN server is a trivial job...

Why would STUN be necessary if the client is behind NAT? So long as the server is not, the client can send the first packet, and the NAT will establish an association so that packets from the server can reach the client.
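A rough sketch of that pattern in Python, run over loopback so there is no actual NAT involved: the only point is that the server never needs to know the client's address in advance, it just answers whatever source address `recvfrom` reports. The function names are made up for the example.

```python
import socket


def reply_to_sender(server: socket.socket) -> None:
    """Receive one datagram and answer the source address it came from."""
    data, source_addr = server.recvfrom(2048)
    # Behind a NAT, source_addr would be the NAT's public mapping, and
    # replying to it is what lets the packet find its way back in.
    server.sendto(b"ack:" + data, source_addr)


def roundtrip(payload: bytes) -> bytes:
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))  # publicly reachable side
        server.settimeout(2.0)
        client.settimeout(2.0)
        # The client speaks first; it never binds or announces a port.
        client.sendto(payload, server.getsockname())
        reply_to_sender(server)
        reply, _ = client.recvfrom(2048)
        return reply
    finally:
        server.close()
        client.close()


print(roundtrip(b"hello"))  # b'ack:hello'
```

Real NAT mappings time out, so an actual client would also need to keep sending packets periodically to hold the association open.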

Why can't I DDoS an arbitrary host using nothing but twitter accounts?

Also includes alternative protocol implementation design and source for protocol.

Would love a simple way to do client<->server UDP from the browser!

On mobile devices, or when accessing the internet through a mobile network, I would wager that it is congestion control which causes more heartburn than head-of-line blocking.

This is predicated on the fact that packet loss, e.g. due to poor signal, is conflated with congestion in the network. This then results in TCP reverting to slow start, etc.
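A toy model of that effect, in Python. This is a Tahoe-style caricature that I am assuming purely for illustration (window counted in segments, one update per round trip, every loss treated as congestion); real stacks such as Reno, CUBIC or BBR react differently and less drastically.

```python
from typing import List, Set


def simulate_cwnd(loss_rounds: Set[int], rounds: int, ssthresh: float = 16.0) -> List[float]:
    """Return the congestion window at the start of each round trip."""
    cwnd = 1.0
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in loss_rounds:
            # The sender cannot tell radio loss from congestion, so it
            # halves the threshold and restarts from one segment.
            ssthresh = max(cwnd / 2, 2.0)
            cwnd = 1.0
        elif cwnd < ssthresh:
            cwnd *= 2  # slow start: exponential growth
        else:
            cwnd += 1  # congestion avoidance: linear growth
    return history


print(simulate_cwnd(set(), 6))  # [1.0, 2.0, 4.0, 8.0, 16.0, 17.0]
print(simulate_cwnd({4}, 6))    # [1.0, 2.0, 4.0, 8.0, 16.0, 1.0]
```

One lost packet on a flaky radio link costs several round trips of throughput, even though the network itself was never congested.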

You could always use something like Electron if you can get users to download your app. That would allow you to use whatever native Node networking libraries there are. I'm sure you could send UDP packets or anything else, and still use web technologies and the browser to develop your app/game.

Why was this downvoted? lol, it's a legitimate solution.

> Why can't I send UDP packets from a browser?

My tweet from May 2015 is titled "I've sent my first UDP packet from a browser!" [1]

[1] https://twitter.com/shurcooL/status/605218976969261056

Looks like Firefox? Did you write a browser extension?

No, it was an experimental implementation of https://www.w3.org/TR/tcp-udp-sockets/ actually. Regular Firefox browser, not Firefox OS.

Yes, it was Firefox.

I did not write any extensions, I simply turned on an experimental flag for the TCP and UDP Socket API. [1]

[1] https://www.w3.org/TR/tcp-udp-sockets/

I'm not really sympathetic to the arguments against WebRTC. Ideally someone makes an open source library or 12, problem solved.

In fact looking at other comments here it sounds like it's not hard at all.

One other problem with WebRTC at the moment, though: it's not supported by Safari.

WebRTC will be enabled in Safari production builds shortly https://bugs.webkit.org/show_bug.cgi?id=168858

Could the "connect" token just be a JWT token?

Network congestion.

Cool, another protocol which claims security and has a reference implementation in... C

Because there's no such thing as UDP "packets"?

So let me get this straight... Basically this guy is trying to champion one of two things:

- Yet another attempt to emulate a full OS-level network stack on top of HTTP (which already runs on top of TCP/IP).

- Giving random web-pages access to RAW OS-level networking, and not just sockets, from untrusted internet JS.

How incredibly ass backwards am I if I find both proposals preposterous and dangerous?

> One solution would be for Google to make it significantly easier for game developers to integrate WebRTC data channel support in their dedicated servers.

I'm confused. Since when does Google decide what the web standards are? Is he deliberately mistaking Google for the W3C?

I don't think you got it straight.

> Yet another attempt to emulate a full OS-level network stack on top of HTTP

The proposed protocol does not work "on top of" HTTP. From the article:

> The basic idea is that the web backend performs authentication and when a client wants to play, it makes a REST call to obtain a connect token which is passed to the dedicated server as part of the connection handshake over UDP.

So both the handshake and the actual data transfer happen directly over UDP.
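To give a feel for the connect-token idea, here is a rough sketch in Python. It is NOT the token format from the article: the JSON payload, the field names and the `SECRET` value are all hypothetical, and this version is only signed (HMAC), not encrypted, so it is closer in spirit to a signed JWT. The assumption is that the web backend and the dedicated server share a key.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"shared-by-backend-and-game-server"  # hypothetical shared key


def issue_token(client_id: int, server_addr: str, now: float, ttl: float = 30.0) -> bytes:
    """Web backend: called from the authenticated REST endpoint."""
    payload = json.dumps(
        {"client_id": client_id, "server": server_addr, "expires": now + ttl},
        sort_keys=True,
    ).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload)


def verify_token(token: bytes, server_addr: str, now: float) -> Optional[dict]:
    """Dedicated server: called on the first UDP handshake packet."""
    try:
        raw = base64.urlsafe_b64decode(token)
    except ValueError:
        return None
    tag, payload = raw[:32], raw[32:]
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # forged or corrupted
    claims = json.loads(payload)
    if claims["server"] != server_addr or now > claims["expires"]:
        return None  # wrong server, or token too old
    return claims


token = issue_token(7, "10.0.0.1:40000", now=1000.0)
print(verify_token(token, "10.0.0.1:40000", now=1010.0))
```

Binding the token to one server address and a short expiry limits what a sniffed token is good for; a production design would also need replay protection and encryption of the payload.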


> Giving random web-pages access to RAW OS-level networking, and not just sockets, from untrusted internet JS.

The "Why not just let people send UDP?" section in the article actually explains why that would not be a good idea.

He's not advocating either... keep reading.

I kinda disagree with the premise about UDP:

> It would greatly improve the networking of these games.

This is not accurate. I know it's a rule of thumb that UDP is faster than TCP, but that's naive. There are cases where UDP will end up being slower and, to be honest, it depends on lots of variables. Many times, the congestion control, packet ordering and all the bells and whistles that come with TCP are well worth the small performance penalty, assuming that TCP actually is slower than UDP in that specific case.

And of course if I'm wrong, someone will let me know :D

UDP provides massively better quality of service for realtime data wherein there aren't usually dependencies between particular pieces of data.

For example, if you are transmitting the position of some guy in a world, N times per second ... and you drop one particular packet ... that's fine, you just get the next one and you have more up-to-date information anyway.

TCP will block the entire connection when that packet is dropped, waiting until it is received again, and not giving any of the subsequent information to the application. This is bad in THREE different ways: (1) By the time the new position is received, it is old and we don't care about it any more anyway; (2) Subsequent position data was delayed waiting on that retransmit and now most of that data is junk too, EVEN THOUGH WE ACTUALLY RECEIVED IT IN TIME AND COULD HAVE ACTED ON IT, but nobody told the application; (3) Other data on the stream that had nothing to do with the position was similarly delayed and is now mostly junk too (for example, positions of other guys in totally other places in the world).
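A back-of-the-envelope illustration in Python, with made-up numbers: five updates are sent 50 ms apart, update 1 is lost and its retransmission only arrives at 350 ms. The function models when the application gets to see each update under in-order delivery versus delivery on arrival; it assumes contiguous sequence numbers and is not a model of a real TCP stack.

```python
from typing import Dict


def delivery_times(arrivals: Dict[int, int], ordered: bool) -> Dict[int, int]:
    """Map sequence number -> time (ms) the application sees the data.

    `arrivals` maps sequence number -> time the packet reached the host.
    """
    if not ordered:
        return dict(arrivals)  # UDP-style: handed over on arrival
    delivered = {}
    ready_at = 0
    for seq in sorted(arrivals):
        # In-order delivery: nothing is released before its predecessors.
        ready_at = max(ready_at, arrivals[seq])
        delivered[seq] = ready_at
    return delivered


arrivals = {0: 50, 1: 350, 2: 150, 3: 200, 4: 250}  # seq 1 was retransmitted
print(delivery_times(arrivals, ordered=True))
# {0: 50, 1: 350, 2: 350, 3: 350, 4: 350}
print(delivery_times(arrivals, ordered=False))
# {0: 50, 1: 350, 2: 150, 3: 200, 4: 250}
```

In the ordered case, updates 2 to 4 sat in the receive buffer for up to 200 ms even though they arrived on time, which is point (2) above.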

It is hard to overstate how bad TCP is for this kind of application.

You are absolutely correct, but not all games are FPSes. I just think that the generalisation in the premise, that UDP is better than TCP for networked games, is naive. You have to take into consideration the type of game and what data is actually being sent.

What I said is true for almost every kind of game. POV of the camera has nothing to do with it.

The main exception is low-number-of-player token-ring style games like RTSs with tons of units. Those usually simulate in lockstep, with the full state of the world extrapolated from inputs that consist of a very small amount of data. This means network traffic is relatively low, but in order for this to work you have to have complete knowledge of everything and exactly when it happened, which means no packet loss can be accepted and everything must be processed in order. So then you have the same kinds of problems as with TCP (even if the underlying transmission is via some other protocol) ... thus these games operate with some large amount of latency to hide these problems.
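For readers who haven't seen lockstep before, a stripped-down sketch in Python. The class and its one-dimensional "world" are invented for the example; the point is only that a turn cannot be simulated until every player's input for that turn has arrived, so one late packet stalls everybody.

```python
class LockstepSim:
    """Deterministic simulation that advances only on complete input."""

    def __init__(self, player_ids):
        self.player_ids = sorted(player_ids)
        self.turn = 0
        self.positions = {p: 0 for p in self.player_ids}
        self.pending = {}  # turn -> {player_id: move}

    def submit(self, turn, player_id, move):
        """Record one player's input for a turn (arrival order is free)."""
        self.pending.setdefault(turn, {})[player_id] = move

    def advance(self):
        """Step every turn that has input from all players; return count."""
        stepped = 0
        while set(self.pending.get(self.turn, {})) == set(self.player_ids):
            inputs = self.pending.pop(self.turn)
            # Apply inputs in a fixed order so all peers compute the same state.
            for p in self.player_ids:
                self.positions[p] += inputs[p]
            self.turn += 1
            stepped += 1
        return stepped


sim = LockstepSim([1, 2])
sim.submit(0, 1, +1)
sim.submit(1, 1, +1)
sim.submit(1, 2, -1)
print(sim.advance())  # 0: player 2's input for turn 0 is still missing
sim.submit(0, 2, +2)
print(sim.advance())  # 2: both turns can now run
print(sim.positions)  # {1: 2, 2: 1}
```

Only the inputs cross the network, which is why traffic stays small, and why the input delay has to be large enough to absorb a retransmit.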

But, this network design is only the case for a minority of games. Just about any modern multiplayer game that is drop-in/drop-out, where the developers really care about quality of experience, is better off going UDP. (This is not to say that developers always do the best thing, since it's much easier to just say screw it and talk over TCP and call it a day. The temptation to do this is heightened because of all kinds of problems with NAT punchthrough and whatnot; because so much traffic is Web-oriented these days lots of routers mainly care about that, which causes all kinds of interesting annoyances. Thus games that do talk over UDP generally fall back to TCP if they are unable to initiate a UDP connection).

Well, there is one other case of games that run in lockstep, which is when they are console games made by developers who want to avoid incurring the costs of running servers (which are often much higher on consoles because the platform holder charges you out the nose). When you are running in lockstep like that it is more like the RTS scenario above, and thus it doesn't matter much if you use TCP because you are already taking the quality hit. But this is a cost-cutting kind of decision, not an it's-best-for-gameplay kind of decision.

P.S. It's not a good idea to call someone naive about a subject where you yourself may not know enough to correctly judge naivete.

> What I said is true for almost every kind of game. POV of the camera has nothing to do with it.

You misunderstood what I'm talking about. I'm not talking about the camera position, I'm talking about different kinds of games, like turn-based games, chess, board games, etc.

Lockstep is out of scope here; it is surely not needed in a chess game. I'm talking about deterministic games here.

> What I said is true for almost every kind of game. POV of the camera has nothing to do with it.

Again, out of scope. I'm saying that UDP is not the answer for all games, and that the assumption that a networked game should automatically use UDP is naive. Forget open world, forget FPS, think about a simple turn-based game.

> P.S. It's not a good idea to call someone naive about a subject where you yourself may not know enough to correctly judge naivete.

Where did I call anyone naive? There's a difference between saying "you're naive" and saying "the assumption is naive". I do believe I know enough, which is why I'm commenting in the first place, even though I stated that I might be wrong about some points. I did not personally call anyone names; go back and read my comments.

If you are solving an easy enough problem, then sure, you can use a crappy technical foundation. You can use anything if the problem is easy enough.

The point of engineering is to solve actual hard problems.

What people usually want is all the bells and whistles of TCP without the occasional huge latency spike introduced by the guaranteed ordered packet delivery to the application layer.

Unfortunately, all the alternatives (SCTP etc etc) seem to have foundered, probably due to aggressive filtering by border routers/firewalls making it impossible for alternative protocols to gain traction.

Exactly my point about the whole "many variables involved" thing. Many networks still filter UDP, as you say, which is really bad if you want to make sure everyone can use your game.

SCTP is dropped 100% of the time, as there is no proper PNAT available for it. For other protocols to gain traction, there needs to be a "happy eyeballs for transport" so that the choice can be transparent to end users.


You are wrong. I would attack your argument if you had given one beyond "TCP has some convenient features", but you didn't. So all I have to say is that you should try playing slither.io or agar.io on an internet connection that isn't in a metropolitan area.

There is a reason WebRTC exists.

I'm surely wrong about many things, yes, but you are also wrong on some points. If I play a game on a bad network, UDP will still be a bad option: packets will probably be badly mangled and UDP will offer no superiority over TCP. You will see more blocking with TCP, but with UDP you will probably also see your player jump all over the place as packets come in totally out of order, and if you implement your own way of reordering packets, then you are almost reimplementing TCP over UDP again. If a network is bad, it's bad no matter which protocol you use.

The only thing that makes UDP faster is that it comes without TCP's convenience features. For many games TCP is more than enough, and the performance gain from UDP is usually not worth the effort (unless you're an AAA studio with lots of resources and a dedicated team just for the networking part).

Besides that, if you simply opt in to UDP because "it's better", you'll spend much more time working on the network than actually working on the game.

My problem was with the generalisation in the premise that `UDP is better than TCP for networked games`. This is not the case for a deterministic game of chess or any card/board game.

UDP's only point in this article is realtime data intensive games like FPS'es.

Also, I'm not taking security into consideration in this argument.

What are your thoughts?

> then you are almost reimplementing TCP over UDP again

You implement TCP features only when you need them. Player positions, for example, are not critical: you can miss some and your game will still work if you are interpolating correctly. If a player position message gets lost, you don't need it anyway; you only need the newest position. The same goes for packet ordering: if you don't need it, TCP enforces it anyway, and a lost packet delays everything in the game.

For these reasons, TCP suffers a lot more if the network is bad. You really don't need to be an AAA company to implement TCP features on UDP.
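To show how little is needed for the position case, here is a sketch in Python of a "newest wins" receiver. The wire layout (a 32-bit sequence number followed by two floats) and the names are my own assumptions for the example; sequence wraparound is deliberately ignored.

```python
import struct

# Hypothetical wire format: uint32 sequence number, float32 x, float32 y.
HEADER = struct.Struct("!Iff")


def pack_position(seq: int, x: float, y: float) -> bytes:
    return HEADER.pack(seq, x, y)


class LatestPosition:
    """Keep only the newest position; drop late or duplicate datagrams."""

    def __init__(self):
        self.last_seq = -1
        self.position = None

    def on_datagram(self, data: bytes) -> bool:
        seq, x, y = HEADER.unpack(data)
        if seq <= self.last_seq:
            return False  # stale: a newer update was already applied
        self.last_seq = seq
        self.position = (x, y)
        return True


rx = LatestPosition()
# Datagrams arrive out of order: 0, 2, 1, 3.
results = [rx.on_datagram(pack_position(s, float(s), 0.0)) for s in (0, 2, 1, 3)]
print(results)      # [True, True, False, True]
print(rx.position)  # (3.0, 0.0)
```

There is no reordering buffer and no retransmission here; the late packet is simply thrown away, so the avatar never jumps backwards.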

> UDP's only point in this article is realtime data intensive games like FPS'es.

Well, the article says, as you quoted:

> It would greatly improve the networking of these games.

And by 'these games' he means real-time games like agar.io. It is not an FPS, but it is a real-time game. There is no reason to use UDP for a turn-based game, like a card game. But a real-time game would pretty much always be better with UDP.

Totally agree with you there, but how critical player position is isn't for me or you to decide; it's game-dependent. My worry is not about dropped packets but about packets that arrive in the wrong order. Only in a game where combat has to be precise would I say it's critical.

Yeah, you don't need to be an AAA studio for sure; I was exaggerating a bit to make my point. What I'm trying to say is that newcomers to gamedev will end up struggling. TCP isn't easy, and neither is UDP; without a proper understanding of both protocols, mistakes will surely be made.

I do agree with the article 100%. I'm not saying that TCP is better in that use case. I'm just disagreeing with the premise, as it's a bit vague and assumes that you know what agar.io and slither.io are. I know both games and I know they're realtime, but there are people out there who don't know what they are. So I'm saying that just thinking "UDP is faster than TCP, so it's best for networked games" is naive. A better way to say it is to mention that this is for realtime games, or better yet, realtime games that can't allow network delays and where the data is not critical. WoW, for example, uses TCP for some parts of the game that are not critical, where player position doesn't really affect the game. They end up with huge latency now and then, but the game is programmed in a way that some latency won't matter. In other cases they use UDP, which is also one of my reasons for saying that UDP is not the only answer.

So yes, you are 100% right, and the article is 100% right, I just don't like the premise (the way it was written at least)

Everyone seems to miss the whole point of my comment up here, so here is what I was trying to say:

1- UDP is not always the answer; don't overcomplicate things. There are games like chess and board games, basically deterministic games that are not realtime. It is very wrong to use UDP for that style of game; it's just huge overkill.

2- UDP can in fact be slower for some use cases than TCP, there are lots of variables and unknowns.

3- Just opting into UDP because "it's faster" is very wrong. Before jumping to UDP, there is a whole lot of networking theory that needs to be learned, and I'm not concerned about AAA here. I'm concerned about beginners and indies who will read this post and think that this is the only right way to do things.

4- UDP is still filtered on some networks, so you'll lose some players.

5- If your network is so bad that TCP is not fast enough, or that older packets are blocking newer ones, it'll be the same with UDP: packets will come in mangled and out of order, and the game's "quality" will still be bad, with your avatar flying all over the place until the developer implements a way to reorder packets, which is still more work.

6- I'm not a network professional, and neither are lots of people. If you don't have strong enough knowledge to make sure your game's networking is in top shape, you might end up making mistakes that will be costly to fix.

7- The premise that UDP is going to greatly improve the game's networking is naive. (It's OK to call things naive by the way, don't be too sensitive.) I'm a big fan of gafferongames, there are tons of great resources there. I'm only saying the premise is naive, not the person who wrote it. Of course the author knows what they're talking about, and I do understand they mainly target RTC and games like agar.io. I would want them to state more clearly that this is only valid for non-deterministic, RTC, FPS, whatever you call it. For other cases, it's not the best.

8- Security-wise, I'm not sure how easy it is to secure a UDP connection, but it's surely not as straightforward and easy as TCP with SSL, for example.

Finally, there is never one true answer for your project's networking technology. Lots of developers mix and match TCP and UDP to get the best of both worlds. I would just say: use whatever works best for you, as long as you have enough knowledge and good reasoning to back your decision!

