WebRTC is now a W3C and IETF standard (web.dev)
645 points by kaycebasques on Jan 27, 2021 | 91 comments



This is so exciting. WebRTC is our best hope for video interop between platforms. I love that it works outside web browsers; competitors like WebTransport assume a 100% browser world. Or you have protocols like RTMP/SRT... that will never make it into the browser.

WebRTC might be our best bet to establish P2P connectivity between all languages/platforms. I would love to get rid of the single point of failure in Pub/Sub systems. WebRTC also feels like a great path towards easy cloud-agnostic code. You can use lots of different languages and not be dependent on SDKs/servers.

* https://github.com/aiortc/aiortc (Python)

* GStreamer’s webrtcbin (C)

* https://github.com/shinyoshiaki/werift-webrtc (Typescript)

* https://github.com/pion/webrtc (Golang)

* https://github.com/webrtc-rs/webrtc (Rust)

* https://github.com/awslabs/amazon-kinesis-video-streams-webr... (C/Embedded)

* https://webrtc.googlesource.com/src/ (C++)

* https://github.com/rawrtc/rawrtc (C++)

----

Then you have a couple that aren't open source, but they prove it is possible on these platforms as well.

* Shiguredo (Erlang)

* |pipe| (Java)

----

If you are new to WebRTC, I have been working on making it more accessible: https://webrtcforthecurious.com. I am currently pretty tied up with Pion, so I haven't been able to make much progress lately.


Huh? WebTransport isn't trying to compete with WebRTC. It's a server/client protocol, not a P2P protocol, and it's not designed with video conferencing in mind like WebRTC is. WebTransport is trying to be a better alternative to WebSockets, with support for h2/h3 and unreliable delivery.

> WebTransport assumes a 100% browser world.

WebTransport should work anywhere that HTTP is available. There's nothing browser-specific about it.


WebTransport (QuicTransport) was cited as an alternative by one of the RFC's authors [0]; he gives his reasoning in the thread.

I consider WebRTC agnostic because you just need to exchange the offer/answer. With something like https://tools.ietf.org/html/draft-murillo-whip-00 it means I can upload video without running any kind of SDK/blobs locally (rough sketch after the footnote below). With WebTransport we are going to have no interop between services/clients. There will need to be custom code to signal with each.

[0] https://news.ycombinator.com/item?id=23540253
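
For illustration, a rough sketch of that WHIP flow from a browser (the endpoint URL and function name are made up; the single-POST offer/answer exchange follows the draft, ignoring trickle ICE):

    // Publish local media to a hypothetical WHIP endpoint.
    async function publishViaWhip(endpoint: string) {
      const pc = new RTCPeerConnection();
      const media = await navigator.mediaDevices.getUserMedia({
        video: true,
        audio: true,
      });
      media.getTracks().forEach((track) => pc.addTrack(track, media));

      await pc.setLocalDescription(await pc.createOffer());

      // WHIP: a single HTTP POST carries the SDP offer,
      // and the response body is the SDP answer.
      const res = await fetch(endpoint, {
        method: "POST",
        headers: { "Content-Type": "application/sdp" },
        body: pc.localDescription!.sdp,
      });
      await pc.setRemoteDescription({ type: "answer", sdp: await res.text() });
    }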


I am not sure I follow this. WebTransport defines a way to transfer arbitrary data over HTTP/3; it's not a media protocol. I expect that a well-defined media transfer protocol will eventually emerge for media over WebTransport, and it will make its way to the IETF.

(FWIW, you can send RTP over WebTransport datagrams, instead of using SRTP and ICE-lite)
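
A minimal sketch of what that could look like with the browser WebTransport API (the URL and packet contents are placeholders; real code would serialize actual RTP):

    // Push RTP-ish packets over WebTransport datagrams
    // (unreliable, unordered; no SRTP or ICE involved).
    async function sendOverWebTransport() {
      const transport = new WebTransport("https://media.example.com:4433/ingest");
      await transport.ready;

      const writer = transport.datagrams.writable.getWriter();
      const rtpPacket = new Uint8Array(1200); // a serialized RTP packet
      await writer.write(rtpPacket);
    }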


> a well-defined media transfer protocol will eventually emerge

Until then interacting with a 'WebTransport Media API' involves me running code distributed by the person running the API. With WebRTC I can exchange a Offer/Answer and then have a bi-directional media session. I appreciate that these lower level APIs help companies that need the flexibility. I worry that the complexity will lock out Open Source and smaller companies. Smaller companies are going to have to figure out things took years to solve with WebRTC. Stuff like

* Congestion Control/Error Correction trade-offs and Latency

* Simulcast

* Renegotiation

* Rollbacks

* Capability Negotiation

I was always a big fan of ORTC: give flexibility to the power users, but keep a level playing field for small players.

> RTP over WebTransport datagrams

I don't feel strongly about QUIC vs S(RTP). But WebTransport doesn't force RTP, so it doesn't help unless I control everything, and bridging will get a lot harder. Right now it is nice that Reports/NACKs/etc. can cross protocols.


> Until then interacting with a 'WebTransport Media API' involves...

Until then, just use WebRTC for client/server video. P2P protocols work just fine in centralised contexts.


If you are looking for cross-platform FOSS P2P, Protocol Labs' (makers of IPFS) https://libp2p.io (P2P connections over TCP, UDP, QUIC, WebSockets, WebRTC), with implementations in various languages (Go, Rust, JavaScript) at varying degrees of feature completeness, is great to have in the toolbox as well. Unless of course you're sold on a solution like https://tailscale.com, who I believe are readying a userspace embeddable implementation of their P2P mesh WireGuard overlay network; that would be a compelling alternative too in the not-so-distant future, especially because of its security posture (though only the clients are FOSS).


+1 for libp2p. While I've found their public WebRTC stars (basically signaling servers) quite slow for now, js-libp2p is growing very quickly and is quite usable in my experience.


I agree that WebRTC is super exciting, but if you want to deploy it into production, you need to have a STUN server and a TURN server.

Until/unless we've all moved to IPv6 without any NAT, we'll be stuck needing STUN and TURN to make it work.


There are several free STUN services available, and you can fix a lot of connections with just STUN. STUN is effectively just an IP lookup, so it's very cheap to deploy.

TURN is of course an entirely different beast, because it's proxying all the data between clients. It's also potentially where some more complex video handling happens. The good news is that there are solid, open source TURN servers available to run on your own hardware. The downside is that running those servers for non-trivial numbers of clients gets resource-intensive pretty quickly.
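
Wiring these into a client is just configuration; a sketch with placeholder hostnames and credentials (you can list several entries, and ICE will try them all):

    const pc = new RTCPeerConnection({
      iceServers: [
        // Cheap: STUN only discovers your public address/port.
        { urls: "stun:stun.example.com:3478" },
        // Expensive: TURN relays all media, as a path of last resort.
        {
          urls: "turn:turn.example.com:3478",
          username: "user",
          credential: "secret",
        },
      ],
    });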


So would STUN allow a connection to be established between a user inside a LAN and an internet user?


In my experience you typically don't need STUN for that to work; the ICE flow of WebRTC will handle regular NAT [1].

It's typically when ICE can't determine the first few steps (e.g. your public address) that you need STUN to help you out.

TURN tends to be needed in hostile networking situations like double-NATed networks or very restricted networks like hospitals.

[1] - https://webrtcglossary.com/ice/


This is why I didn’t get really excited about WebRTC when I looked into it before.

And the problem isn't with WebRTC; it's with the current infrastructure of the net, which prevents some clients from just communicating directly without some element of centralization. And some of it's our own fault.

A cynical way of looking at it might be that WebRTC is being allowed because it helps scale the web while still being dependent on a central weak link of control.

But it is what it is.


When used in a client/server context where the server has a public IP address, WebRTC can be deployed entirely without STUN/TURN. The mechanism is peer-reflexive candidates, wherein the server learns the address and port of the client from the IP packet of the incoming ICE negotiation request. I have used this approach on production systems.

For peer to peer, you'd have a tough time avoiding STUN/TURN anyhow.


Only a specific form of low-quality NAT, one that is certainly often encountered but isn't at all common at the router or ISP level, is incompatible with WebRTC in a way that would cause you to need TURN.


> Until/unless we've all moved to IPv6 without any NAT

Hell will freeze over before that happens.

This is the bread and butter of ISPs. Without NAT they lose too many revenue sources.

They will hold on to their NATs forever, claiming something like user security, while pushing for IPv6 to reduce their operating costs (those IPv4 costs add up).


The majority of IPv6-capable ISPs are using prefix delegation to allow NAT-free operation. Heck, even phones are getting prefixes.


What revenue sources does an ISP get for having NAT? In Australia, you pay retail ISPs to provide backhaul from the national broadband provider. I pay $10 per month extra for a fixed public IPv4 address, the public /64 for IPv6 comes free. That's it.


Which ISPs do deploy/force NAT with IPv6? (I don't trust the industry enough to say there are none, but it'd be highly unusual)


WebRTC is also a decent P2P data protocol, and to have a P2P protocol like that enshrined in IETF standards means that P2P is now considered a "normal" thing on the Internet.

This in turn means that symmetric NAT will be considered sort of broken, which it should be, as it is horrible for anything P2P, or any novel protocol for that matter. Of course IPv6 is the ultimate long-term solution.


"Of course IPv6 is the ultimate long term solution."

I feel like this is similar to

"We'll have nuclear fusion in the next 30 years."

:)

However, if you look at the graph from Google - https://www.google.com/intl/en/ipv6/statistics.html

maybe it'll be here sooner than we think...


It would be nice if they broke that chart down by AS numbers, as I suspect much of that is the mobile carriers. Here is some data [1]; it looks like about 27k AS numbers comprise all of the IPv6 usage. I stand corrected: it looks like mostly mobile and Amazon for the US, and then the rest is several other countries that are primarily using IPv6.

[1] - https://www.cidr-report.org/v6/as2.0/#Aggs


We've never had practical fusion. We used to have frictionless P2P with talk, ytalk, and similar chat tools. Then NAT came along and ruined everything.


True story: I was giving a talk on ZeroTier (I'm the original author/founder) at a university. After explaining some of the networking, a student raised his hand and asked me how it could transfer data between devices without a "cloud."

This student didn't understand that it was possible for one device to talk to another device directly.


Nuclear fusion may be closer than you think too. We've been developing better and better superconductors that could yield more compact and more powerful electromagnets, which is a major bottleneck to making it practical.

ITER is huge and fabulously expensive, but it's intended to be more of an experiment/testbed that happens to maybe generate some power. It's not intended to be a practical reactor.


Side note: Try browsing the internet v6-only. It's pretty eye opening.


My understanding is that currently people set up a STUN/TURN server to relay WebRTC, so it's not really used as P2P but there is still some benefit to the protocol itself.

Moving to true P2P for WebRTC seems impossible: even if 80% of web users were on IPv6 without NAT (and even 80% seems a bit hopeless at this point, considering we've been transitioning to IPv6 for 20 years now), you'd still need a fallback configuration for the 20% who aren't. There doesn't seem to be a feasible gradual transition strategy.

This is not really my area of expertise, so I'd love to be proven wrong.


STUN doesn't relay. It assists with P2P connection setup. TURN relays, and is intended as the path of last resort. The vast majority of WebRTC traffic is P2P.


The vast majority of WebRTC traffic is in reality between a peer and an SFU (selective forwarding unit) on a server. There is almost no real P2P WebRTC traffic happening in reality, because most people don't have enough upload bandwidth to transmit their video to all participating peers.


> There is almost no real P2P WebRTC traffic happening in reality, because most people don't have enough upload bandwidth to transmit their video to all participating peers.

Many providers choose to use SFUs even when not needed, because they have features like recording that require one. I've gotten much better quality and latency on plain WebRTC than I have over Slack, Teams, or Hangouts, even on bad mobile networks.


That’s a design choice. The architecture doesn’t require a middlebox.


Most human interaction where videoconferencing applies happens in small groups.


There's still a single point of failure: the STUN server.

A comment below talks about some ideas that are new to me about how to navigate around NAT.


Why is a STUN server a single point of failure? You can specify multiple, and a STUN server does not need to keep any state at all; it is simply a way for a client behind a NAT to get their public IP/port, right?

STUN servers are cheap to run (they require very little processing and bandwidth as opposed to TURN), and setting up a few on each public cloud should be easy for anyone using ICE seriously. There are also public STUN servers still available, although I would not rely on them personally.


That's interesting, thanks for the info. I've never set up WebRTC with two STUN servers.

But still, at least how I read "single point of failure": it's reliant on a third party. Even with multiple third parties, it's not realistically trustable in the way owning the full stack would be. Though you can host your own, as someone else points out below.


Unless we get NAT-PMP or PCP in the browser I don't know if we have an alternative.

In the LAN I am excited about this thread https://discourse.wicg.io/t/idea-local-devices-api-lan-servi...


If you're running a web app (or similar), you can always run a STUN server too. It's not perfect, but it's a step in the right direction.

I'd love to see a pool of STUN servers available for this, with some randomization.


Well, you can give an array of STUN/TURN servers in the client without needing any special load balancer, so it's not really hard to do better than one point of failure.


IPv6?

You'll still need hole punching, but no "IP discovery".


Thank you for the effort you have put into "WebRTC For The Curious". I was looking for information on WebRTC and this looks like a good introduction.


Don't forget Google's which is probably the most robust: https://webrtc.org/ (source at https://webrtc.googlesource.com/src). Other libs, such as mediasoup, take pieces from this.


Updated! This is my list of "Other WebRTC implementations" from my FOSDEM talk; mea culpa :(

mediasoup is moving towards dropping Google's implementation (https://github.com/versatica/mediasoup/issues/344). Excited to see congestion controllers that are easily usable by others!


They discontinued that in 2022.


Might as well piggy-back on this: any hints for debugging WebRTC issues? I was on a call again today that had a weird mix of people not seeing some other people, sometimes only getting audio or only video, etc., where I'd have liked to pinpoint where the issue was coming from but couldn't.


If you’re using Chrome you can get some insight by opening chrome://webrtc-internals in another tab.

There is some documentation here: https://testrtc.com/webrtc-internals-documentation/ and, while trying to remember that link, I just came across https://chrome.google.com/webstore/detail/analyzertc-visuali... which looks promising for making the internals data more usable.
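
Programmatically, RTCPeerConnection.getStats() exposes much of the same data as webrtc-internals; a sketch that logs a few fields useful for this kind of triage (exact field availability varies a bit by browser):

    // Log loss/jitter per incoming track and the active path's RTT.
    async function logStats(pc: RTCPeerConnection) {
      const stats = await pc.getStats();
      stats.forEach((report) => {
        if (report.type === "inbound-rtp") {
          console.log(report.kind, "lost:", report.packetsLost,
                      "jitter:", report.jitter);
        }
        if (report.type === "candidate-pair" && report.state === "succeeded") {
          console.log("rtt:", report.currentRoundTripTime);
        }
      });
    }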


Thanks, will take a look.


It's not obvious that this is an issue with the WebRTC sessions themselves. If a server is responsible for muxing the streams, each client could be in a properly negotiated session with the server, and due to an internal software defect in how the server renders the streams for a client, the client could experience exactly the symptoms you describe.


It's P2P, not using an SFU.


I think WebRTC's success for video will be hugely dependent on which codecs can be used with it.

The gap between WebM and Zoom's proprietary scalable codec still makes or breaks a call.

I've seen news of AV1 working in real time up to 720p, but I haven't tried it myself yet.

That could make a difference for the open web.


Can’t you use WebRTC data channels to send and receive arbitrary data? I don’t know a whole lot about this, but I saw a WebRTC version of Jacktrip that seemed to do this.
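
Yes; the data-channel API surface is small. A sketch, assuming `pc` is an RTCPeerConnection whose negotiation included a data channel; the options shown trade reliability for latency, roughly what a low-latency audio app would want:

    // Unordered + zero retransmits approximates UDP semantics.
    const channel = pc.createDataChannel("audio", {
      ordered: false,
      maxRetransmits: 0,
    });
    channel.binaryType = "arraybuffer";
    channel.onopen = () => channel.send(new Float32Array(128).buffer);
    channel.onmessage = (e) => console.log("received", e.data.byteLength, "bytes");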


Aside: I've heard about web platform tests, but I never realized that it could be a viable alternative to MDN for representing browser compatibility information. In this case they have a pretty good collection of tests [1] that give you a picture of overall WebRTC compatibility. Of course MDN is still the gold standard and we on web.dev will always link there as our default choice when they do a good job representing browser compatibility data, but in this case they don't seem to have a good table representing the state of WebRTC at large [2] (happy to update the article if it's just an oversight on my part and they do have a good browser compat table somewhere that I missed). Specifically, it seems like the WebRTC API page [3] should have a list of the browser compatibility of each of the interfaces that are mentioned on the page.

[1] https://wpt.fyi/results/webrtc?label=experimental&label=mast...

[2] https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API

[3] https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API

Disclosure: I edited the web.dev announcement


I have tried using this API and found it incredibly technical and complex. Personally I find the PeerJS library an excellent wrapper that renders it usable.

Hopefully over time it will become more accessible.


> I have tried using this API and found it incredibly technical and complex.

It is meant to be relatively low level. It's like WebGL, in order to draw a simple square on the canvas it takes likes 50 lines of code just to bind attributes to buffers and stuff, link, compile your shaders and execute the program on the GPU. It takes 3 lines with Three.js. You are probably not meant to use it directly if you don't know the domain, but use a library.

So I'd argue that if every single Web API were like that, the web would have been a failure.

At the same time, it's also a great playground if you INTEND to learn some things that can be useful OUTSIDE the browser. For instance, WebGL knowledge can be directly transferred to any OpenGL application, because the API is similar to the C or C++ version of it.


I think the comparison to WebGL is apt. It's also worth adding that scaling WebRTC, both to calls with more than a few participants and to large numbers of users on a service, is still a bit of a specialized undertaking.

Partly because it's intended to be a really, really flexible API, the abstractions are pretty leaky. You end up learning a lot about video codecs, UDP and RTP networking, and browser implementation details as you build out WebRTC applications!

(I'm the cofounder of Daily.co, a YC company that tries to make WebRTC "just a few lines of code" for common use cases, and to provide the best possible application-level APIs and global infrastructure for advanced developers.)


The main achievement of the technology is to work around the shortcomings of other technologies. Complex might be the wrong word as it's not rocket science; convoluted sounds better.


I believe the Extensible Web Manifesto could give some historical context for low-level web APIs: https://extensiblewebmanifesto.org/


I think that what it's doing is sufficiently complex that the API is always going to be difficult. As long as abstractions like PeerJS exist I'm OK with that.


Any specific problems you recall?


I've done WebRTC cross-platform between iOS, Android, and web browsers (in the end only Chrome... I wasn't the web developer though, perhaps it was fixable) and I can tell you that even when you're using Chrome, an OS built by the creator of Chrome, and the WebRTC part of Chrome repackaged as an iOS library, there were many, many caveats to getting a solid connection. And the fun of iOS background modes while getting an incoming call. Of course the Google examples were all broken in their own little devious ways. And there is really jack shit on WebRTC on the web; it seems like everybody that figured it out just said "fuck it, I'm not giving this away for free" after spending possibly months getting it working.

Though some people here seem to complain about no audio or no video... I have to say, once we got it working it kept working. Sometimes the WebSocket we used for signalling dropped its connection, but the video call kept on trucking, even on some really bad WiFi!


I've dabbled with the WebRTC API over the years, and I think I understand, and can explain fairly clearly (to myself), the user's end of it.

Having said that, it's an absolute nightmare of an API that should probably be redone from scratch.


Any specific problems you recall?


AFAIU there needs to be more than one independent implementation to reach Recommendation status. Last time I checked, Chrome and Firefox both used the Google implementation [0]. Has this changed? Which other implementations exist?

[0]: https://webrtc.googlesource.com/src/


The W3C process does not strictly require independent implementations. It's a good safeguard against issues, and not having it opens the spec up to questions, but this has been covered elsewhere in the case of WebRTC:

Multiple independent protocol implementations exist (and the protocol part was handled at the IETF, through their processes), so interoperability at the protocol level has been shown even if browsers happened to pick the same one.

The W3C spec describes the wrapping JS APIs, and the sufficiency of that spec has been shown by the browsers independently building those layers around a WebRTC library. Especially given the working real-world use WebRTC already has, those two layers combined are a good argument that the spec is suitable.


Probably an issue that should've gotten addressed ahead of standardization, IMHO: https://github.com/w3c/webappsec-csp/issues/92


Can we now get support for it in Safari WebView on iOS?


Apple is the new Microsoft when it comes to foot-dragging; don't hold your breath.


Doesn't iOS Safari already support WebRTC?


Safari does, and SFSafariViewController does. WKWebView does not. Which means apps can't use WebRTC inside simple WebViews, and Chrome can't support WebRTC on iOS.


Oh, hmm. That's quite unfortunate…


Didn't we get support for it last month?


Here is the link to the WebRTC specification 1.0, published Jan 26, 2021: https://www.w3.org/TR/webrtc/


Jitsi made it into the announcement!


Anyone have a good open-source webrtc project? I’d like to donate this domain to the cause https://webrtc.io


Oh nice! Can't wait for streaming services like Twitch, Mux, or even Azure Streaming Services to accept WebRTC!


How far this has come... from a dream, over 10 years ago, to this. I know it still seems complicated in some regards, but the possibility of video over the web, P2P capabilities, etc., instead of having to use terrible apps, Apple stuff, QuickTime, etc. just to have some video conferencing... was a pipe dream.


AFAIK WebRTC is already widely adopted, so what does it actually mean for most people?


It still requires a signaling server.

Is it possible to avoid the signaling server in some way, so we can have a true P2P network inside a browser directly?


You'll have to solve the discovery and NAT traversal problems yourself, but it's perfectly doable; AFAIU the WebRTC protocol doesn't care how you solve those problems, just that you do. See e.g. this little demo: https://github.com/lesmana/webrtc-without-signaling-server
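
The gist of that approach, sketched for the offerer side (the answerer mirrors it); the only "signaling" left is moving one JSON blob by hand:

    // Gather all ICE candidates up front, then move the resulting
    // blob to the other peer by any channel you control.
    async function makeOfferBlob() {
      const pc = new RTCPeerConnection();
      pc.createDataChannel("chat"); // ensures there is something to negotiate

      pc.onicegatheringstatechange = () => {
        if (pc.iceGatheringState === "complete") {
          // Copy-paste / QR-code this to the other peer; no server involved.
          console.log(JSON.stringify(pc.localDescription));
        }
      };
      await pc.setLocalDescription(await pc.createOffer());
      // The answerer mirrors this: setRemoteDescription(offer),
      // createAnswer(), setLocalDescription(answer), hand the answer back.
    }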


If two browsers are inside the same LAN, are they able to communicate with each other directly without any server?


Yup. And if one of the peers is on a public IP, then generally no other servers are needed.


Or if both are on un-NATted IPv6, and can get open ports through their respective firewalls.

Specifically, you need to construct an object [1] that kind of looks like an SDP [2] message and pass it into the API; the demo applications generally get this from a signalling server, but you can build it from whatever input you want.

[1] https://developer.mozilla.org/en-US/docs/Web/API/RTCIceCandi...

[2] https://developer.mozilla.org/en-US/docs/Glossary/SDP


Looks like the only useful info in the SDP is the `a=fingerprint:sha-256` line and the `a=candidate:` lines for the actual IPs. Everything else can be hard-coded in both peers.

Is there a way to further strip down the long-ass sha-256 fingerprint? A 6-digit PIN should be sufficient for communication inside a LAN.
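
For illustration, pulling just those two pieces out of a local description could look like the sketch below; rebuilding a valid SDP around them on the receiving side is the hard-coded boilerplate part (and the reply below explains why the fingerprint itself can't shrink):

    // Extract the only non-boilerplate lines from an SDP blob.
    function extractEssentials(sdp: string) {
      const fingerprint = sdp.match(/^a=fingerprint:sha-256 .+/m)?.[0];
      const candidates = sdp.match(/^a=candidate:.+/gm) ?? [];
      return { fingerprint, candidates };
    }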


The fingerprint is part of the DTLS transport and is needed for crypto purposes. If you want to communicate in the clear and roll your own security, go ahead, but compared to the media you're going to be pushing down the pipe, the memory and bandwidth used up for certificate info is peanuts.


I always stayed away from WebRTC because of that tunnel thing, which seemed like a real pain in the a55.


Does it still leak IP addresses?


Here is the standards discussion on how to limit IP exposure:

https://tools.ietf.org/html/draft-ietf-rtcweb-mdns-ice-candi...

It is a trade-off, as this would also prevent applications like snapdrop.net from working on the local network.


No, the negotiation phase doesn't leak IP addresses anymore.


> WebRTC implementations are required to support both Google's free-to-use VP8 video codec and H.264 for processing video.

I understand that backwards compatibility is very important, but this adds to technical debt.


I don't know if people remember the fraught Codec War of 2014, but the choice of VP8 and H.264 both being "Mandatory To Implement" was the compromise that the IETF reached, despite much opposition from both sides.[0]

As often happens with wars, though, the outcome in 2014 wasn't enough to permanently settle the matter, and there was a further outbreak of fighting later between VP9 and HEVC. More recently, some are saying[1] that the Codec Wars are back, with the arrival of AV1 as a contender.

[0] https://bloggeek.me/winners-losers-webrtc-video-mti/

[1] https://bloggeek.me/av1-vs-hevc-webrtc-codec/


It is pretty nice to have both!

For beefier machines VP8 works great, and you don't need to worry about licensing costs. In the IoT/embedded space you usually just get H.264 (from a hardware encoder). If we didn't have both, I think it would shut out a lot of interesting use cases.
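
If you need to steer the choice per connection, the standard transceiver API lets you reorder codec preferences; a sketch that puts H.264 first, e.g. when the far end only has a hardware H.264 encoder:

    const pc = new RTCPeerConnection();
    const transceiver = pc.addTransceiver("video");
    const codecs = RTCRtpReceiver.getCapabilities("video")?.codecs ?? [];
    // H.264 first; everything else stays as a fallback.
    transceiver.setCodecPreferences([
      ...codecs.filter((c) => c.mimeType === "video/H264"),
      ...codecs.filter((c) => c.mimeType !== "video/H264"),
    ]);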


IMHO it would've been better if they had specified some clearly out-of-patent and simple-to-implement codec like H.261 or MPEG-1 as the minimal requirement.



