Hello Firefox, this is Chrome calling (chromium.org)
1386 points by robin_reala on Feb 4, 2013 | 178 comments

Yes, please, please, please kill Skype. Kill it mercilessly. Using it has been the worst experience I've ever had with an application.

I've been in a long distance relationship for a few years now and Skype has sadly been our main mode of video communication. The app crashes when I search chat history and can take upwards of 90% of my CPU, literally forcing me to shut down every other application I have running. The forums are full of complaints, all unanswered. We've tried gChat and the plugin either goes undetected or slows down my machine as well (although not as badly as Skype). I'm running a 2010 MBA, so I've got the resources.

So death to Skype, and open arms to WebRTC.

The one key element of a Skype Killer, which isn't present in WebRTC, is a prescribed infrastructure to support these communications. Skype is made up of users who are usually unaware that they are also routing calls for other users. Even more precarious (and absent from WebRTC) are the Skype supernodes, which no one is going to redo for WebRTC.

The things which we love about Skype, most notably the directory and ease of routing, are currently missing from WebRTC. Someone might come along and make them, but WebRTC is not inherently different from Skype's infrastructure, it's just new. Someone still has to build the supporting services to turn this into a Skype Killer.

The user supernodes were mostly retired after Microsoft bought Skype. See: http://arstechnica.com/business/2012/05/skype-replaces-p2p-s...

Retired and centralized are different terms. I think you'll find the supernodes are alive and well (just centralized in Redmond). The obvious implication is that it is very hard to monitor a decentralized infrastructure but trivial if the servers are local.

If that was your point, I apologize for being redundant.

It would seem to greatly lower the cost of building a Skype-killer, though, and that's a pretty big deal.

Sort of. Now you don't need to build client software, but that was never the real prowess or power of Skype, it was always the infrastructure.

Think of it this way. Skype has 2 things that WebRTC doesn't: Signaling and a Directory. Without those two components, you just have a phone without a phone number :/.

Lower barrier, yes. Easy? No, there are still significant costs.
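
To make the "phone without a phone number" point concrete, here is a minimal in-memory sketch of the two missing pieces. Everything in it (the `Directory` class, `register`, `lookup`, `relay`) is a hypothetical name for illustration, not part of WebRTC or Skype; a real service would sit behind a network protocol and also handle auth, presence and persistence.

```javascript
// Hypothetical sketch: a directory maps a stable user-facing name to a live
// delivery function, and "signaling" is just relaying opaque messages
// (offers, answers, candidates) between two registered names.
class Directory {
  constructor() {
    this.online = new Map(); // name -> callback that delivers a message
  }

  // A client announces itself and says how it can be reached right now.
  register(name, deliver) {
    this.online.set(name, deliver);
  }

  unregister(name) {
    this.online.delete(name);
  }

  // Directory half: is this person reachable at the moment?
  lookup(name) {
    return this.online.has(name);
  }

  // Signaling half: pass a message along without interpreting it.
  relay(from, to, message) {
    const deliver = this.online.get(to);
    if (!deliver) return false; // callee offline or unknown
    deliver({ from, message });
    return true;
  }
}

const directory = new Directory();
const bobInbox = [];
directory.register("bob", (msg) => bobInbox.push(msg));

// The sdp value is a placeholder; the relay never looks inside it.
const delivered = directory.relay("alice", "bob", { type: "offer", sdp: "..." });
const missed = directory.relay("alice", "carol", { type: "offer", sdp: "..." });
console.log(delivered, missed, bobInbox.length);
```

The code is the easy part. The cost is in running this for millions of users, which is the infrastructure argument above.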

Seems like Facebook is pretty close to solving signalling and a directory - in effect it already has them for their messaging service. If you enable chat on Facebook then people can already find you for chat and know if you're [reporting that you're] available.

FB is already in bed with Skype though, I think, so I'm not sure how that would pan out in practice.

There's a difference between phone numbers and the Facebook directory. One is a user-owned identity which can migrate between carriers, the other a walled garden. With a phone number, I have an identity which can be linked to many services including Facebook. With a Facebook login, I am beholden to Facebook as my directory.

I'm not saying that this is relevant, because for the majority of users it isn't. I'm saying this for the hackers who just watched Facebook deny Voxer. If Facebook can capriciously decide who I can and can't talk to, that's not really what I would call an identity. What Facebook gives you is a slice of an identity, the slice it thinks you ought to have.

For many that is enough; for others it is an insult.

Google twelephone

Have you tried Google Hangouts? I've had some success with that lately over Skype.

I would expect Google Hangouts to use WebRTC when available and use the Talk plugin only as a fallback.

I like Google Hangouts better than Skype considering most people have Gmail and they know how to use it. No need to create separate accounts or deal with any other installation hassles, which is especially good for older folks. I wish the plugin came pre-installed with the Chrome browser. And video quality could also use some work.

I love Google Hangouts, and prefer them to Skype, but they bring my laptop to its knees. Multicolored pinwheels everywhere.

Meetings.io was our go-to service for a while, it was just so insanely better than Google Hangouts and Skype. It showed so much promise, until sadly they were acquired and now it sucks.

I tried and it's a pain to install the plugin. It never works on my debian-sid.

It works flawlessly on Ubuntu, Fedora, Arch, etc. I have tried in all these distros myself.

It doesn't have fullscreen.

I'm the opposite. I don't want to use anything except Skype for voice chat. Nothing comes even remotely close in sound quality. If I had the same cpu/crash issues as you I'd be loudly cursing its name. Luckily I don't, and until something else can provide better quality I won't be switching.

Right, for voice it's fine. For video it's garbage. I don't quite see how you're the opposite.

“Nothing comes even remotely close in sound quality.” — Mumble?

With Skype I can play games with a group and put the mic in always on mode. No feedback, no audio from speakers, and no keyboard clicks. It's pretty damned amazing to be honest.

Ventrilo and Mumble do have far superior support for massive numbers of people in a chat room. If you're participating in large MMO raids or Eve Online type festivities then Skype doesn't have the necessary features. For my ~5 player parties Skype blows everything else out of the water and it isn't even close.

WebRTC incorporates the SILK audio codec Skype is using, too, except it can go to even higher bitrates. WebRTC should sound at least as good as Skype.

Mumble currently uses CELT and Skype currently uses SILK.

The two codecs have been merged/improved into Opus, which is one of the mandatory audio codecs for WebRTC.

But all three are moving to Opus, so very soon it's not going to be the codec itself that will differentiate audio quality between them.

I don't think it's about compression. I think Skype has invested most of their resources into, e.g., noise and echo cancellation, reducing environment noise, working around glitchy / clicky sound capture devices, latency, ...

I have about 5 guys that play, and the "chat room" design of Mumble makes a lot more sense for our use case (which could be different from yours). We hop in the server whenever we want, just to hang out, then use it for games when we are all in.

It's also secure, FOSS, cross-platform, and implements a consistent UI. Skype has none of these.

Mumble definitely is the best VoIP solution I've used in terms of sound quality.

Well, this is still in beta and already poses a threat to Skype.

Not saying Skype is perfect (it isn't), but WebRTC by itself doesn't come close to being a Skype substitute.

While stability for particular client-side bits (media, transport, etc) will now fall on the browser vendor, you're still going to depend on another service for all the niceties (media and signaling relay, contacts, presence, history, etc), and they could screw it up just as badly as Skype, or worse.

Skype could just use WebRTC as another transport -- there's no technical reason they can't.

You're correct that you'll depend on another service for all the niceties, but I for one am super excited about decoupling all of those services from the connection itself. Hopefully it'll lead to a lot more competition and innovation in the space.

At the very least I hope it forces IM apps like Skype and FaceTime to support WebRTC (that is possible for native apps, too, right?) so that in the future we get interoperability between them.

Wouldn't it be nice if WebRTC pushed video-calling into becoming like e-mail, and you could talk to anyone, regardless of what "video-calling app/service" they are using?

> Wouldn't it be nice if WebRTC pushed video-calling into becoming like e-mail

WebRTC doesn't define a signaling protocol. There are already protocols for that. SIP, XMPP/Jingle, etc. It would be nice if people started using them more. WebRTC only defines an offer/answer API (which is actually a big difference!)

So having two different services use WebRTC doesn't mean they'll be interoperable. (Unless the services intentionally make an effort to federate.) Any third-party attempt at interop between incompatible services is going to involve spoofing a connection to each service. While that's not impossible, WebRTC really isn't the interoperability panacea that people seem to think.
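
A rough sketch of that split: the four `RTCPeerConnection` calls below are the offer/answer API the browser gives you, while `sendToPeer` is a hypothetical stand-in for whatever signaling (SIP, XMPP/Jingle, a homemade relay) the service chooses. It assumes the promise-based form of the API; the callback-based form in early browser builds differs in shape but not in idea.

```javascript
// Caller side: create an offer and hand it to the app's own signaling.
// `pc` is expected to behave like an RTCPeerConnection; `sendToPeer` is a
// hypothetical function supplied by the application, not by WebRTC.
async function startCall(pc, sendToPeer) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToPeer({ type: "offer", sdp: offer.sdp });
}

// Callee side: apply the remote offer, then answer through the same channel.
async function acceptCall(pc, remoteOffer, sendToPeer) {
  await pc.setRemoteDescription(remoteOffer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  sendToPeer({ type: "answer", sdp: answer.sdp });
}

// Caller side again: finish the handshake once the answer arrives.
async function completeCall(pc, remoteAnswer) {
  await pc.setRemoteDescription(remoteAnswer);
}
```

Two services can both run exactly this and still not talk to each other, because nothing here says what `sendToPeer` puts on the wire or who it is addressed to.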

What matters most to me is video chatting that doesn't suck. If WebRTC is that and only that, I would be more than willing to use another chat app as a buddy list and for chat history.

Skype has to be killed just because it's a closed, non-interoperable network. WebRTC can be misused as well if closed, non-interoperable networks built on it pop up like mushrooms. If WebRTC is used to plug into federated XMPP/Jingle networks, then it'll be good.

I use Sococo's Teamspace, which I help write as my job. We're cognizant of cpu usage especially on laptops and pads. We're ok now, and our next release will use substantially less CPU.

Our codecs are WebRTC-derived, and it's pretty stingy with memory and CPU, so that's cool.

Confused; is it inappropriate to suggest other startups' products here? WebRTC etc are good technology, that's why we picked them up. All the issues mentioned by the poster are important to all of us in this space. Sococo is responsive, constantly doing customer experience surveys and trying to address quality.

Do you imagine Sococo is some giant corporation? It's a startup like so many here. I was the 3rd person in the company, the 1st Software Engineer, so I'm pretty proud of the progress we've made in the last couple of years. There are a lot of big players in our market, and we've done a good job of beating them technically even though we're small.

The funny thing is I remember Skype being great, back around the release of version 2 or 3. They just started nailing fat on everywhere they could, and haven't taken a moment to look back since.

By your own admission, Skype is the best video communication platform that exists but is far from perfect. It's hard to disagree. Why isn't someone creating something better?

Last I checked everyone was trying to create something better. Google, Apple, Facebook, Tencent, carriers, and many startups.

How do you compete with free?

With free.

With better.

Google+ Hangouts are infinitely better. Here's a writeup I did on the benefits: http://slant.co/topics/what-is-the-best-video-chat-service/o...

Another really important thing Skype can do is terminate calls on regular landlines. That lets Skype plug the gap from old to new, as its network grows.

Google has the infrastructure to make it happen via WebRTC, but it would need to be integrated with accounts and payment.

I'm not 100% sure, because I do audio-only rather than video, but I think Trillian will handle that for Skype, and it's a little bit more stable. I agree that Skype is one of the worst apps out there.

If both of you have Apple devices, I highly recommend FaceTime. In my experience, it has better video quality than Skype.

I feel like DataChannels are the best part of the WebRTC standard (or would be, if they were usable): a true p2p connection allowing transfer of arbitrary data without plugins or anything. This gives multiplayer games, file transfers, realtime chat and collaborative editors (let your imagination run wild...) and the only thing a server is required for is establishing the connection (and saving state). This functionality is much more exciting than simple video or audio chat.
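
As a sketch of what "arbitrary data" might look like in practice, here is how a file-transfer app could split a buffer into chunks and push them through a channel. `send` is the real `RTCDataChannel` method; `sendInChunks` and the 16 KiB chunk size are assumptions for illustration, since implementations differ on how large a single message can be.

```javascript
// Split a Uint8Array into fixed-size chunks and hand each one to the channel.
// `channel` only needs a send() method, as RTCDataChannel has.
function sendInChunks(channel, bytes, chunkSize = 16 * 1024) {
  let sent = 0;
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    // subarray creates a view over the same memory, so nothing is copied here.
    channel.send(bytes.subarray(offset, offset + chunkSize));
    sent += 1;
  }
  return sent; // number of messages handed to the channel
}
```

This ignores backpressure. A real app would watch the channel's `bufferedAmount` and pause when it grows, and the receiver needs some agreed way to know when the file is complete.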

I agree, and it's easy to tell just in the way you can describe it. The first thing video/audio chat brings to mind to anyone is 'Skype in the browser'. That's still cool and there is definitely more potential than that, but just having an arbitrary connection is a lot more open ended and exciting.

This is what really excites me. I want to build a P2P DHT in the browser, basically see if I can implement something like Freenet entirely in JS.

You might be interested in KadOH:


It's a JavaScript implementation of the Kademlia DHT. It doesn't support WebRTC as a transport yet, but they're working on it. The code looks pretty reasonable, and they seem to have some momentum going.
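
For anyone who hasn't looked at Kademlia: its core idea is that the "distance" between two IDs is their bitwise XOR, and lookups walk toward the nodes closest to the key. A toy sketch with small integer IDs follows (real implementations use much longer IDs, 160 bits in the original design; the function names here are made up and are not KadOH's API).

```javascript
// XOR metric on small non-negative integer IDs (toy-sized, up to 32 bits).
function xorDistance(a, b) {
  return (a ^ b) >>> 0; // >>> 0 keeps the result unsigned
}

// Return the k node IDs closest to `key` under the XOR metric.
function closestNodes(nodeIds, key, k) {
  return [...nodeIds]
    .sort((x, y) => xorDistance(x, key) - xorDistance(y, key))
    .slice(0, k);
}

const nodes = [0b0001, 0b0100, 0b0111, 0b1010, 0b1100];
const nearest = closestNodes(nodes, 0b0110, 2);
console.log(nearest.map((id) => id.toString(2).padStart(4, "0")));
```

Because the metric is symmetric, the nodes you consider close also consider you close, which helps routing tables fill in from ordinary traffic.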

Thanks! This looks really interesting.

Unfortunately, if this page is accurate, DataChannels don't really work yet:


A shame, because I have a use in mind for them. :)

Firefox has a full DataChannel implementation (SCTP-based), while Chrome has a partial one (RTP-based). They don't interoperate yet. I implemented most of the partial implementation and am working on the full implementation now.

I'm glad that you're excited about them. I am too. I hope to have it ready for you soon :).

Good to hear.

Out of curiosity - do you mean that you're writing a full SCTP-based implementation, or completing the RTP-based one? The standard draft specifies SCTP, right?

edit: this was addressed in another post.

It works on some alpha release of both Firefox and Chrome. There are already some file transfer apps if you're curious, although I haven't tested myself: https://github.com/lindstroem/FileTransfer and https://github.com/peer5/sharefest . Also, the folks from easyrtc ( https://github.com/priologic/easyrtc ) have already started working on adding DataChannels to their library.

A BitTorrent-like client in the native browser would be interesting... wonder where Opera is on this one.

I tried this less than a week ago on Linux platform and was unsuccessful. I am also waiting on this feature, want to do some cool stuff :-)

They recently appeared behind an about:flags flag. :)

Well, when I heard that WebRTC was in Chrome stable I thought this must surely include DataChannel, since it's simpler than a full-blown video stack. So although it being behind a flag is better news than it not being there at all, I was still pretty disappointed when I read that page :)

edit: also, I don't see it as an about:flags flag.

edit 2: I should just tunnel data through video then, what could possibly go wrong?

It's not quite that simple. We (I work with the Chrome team) already have an RTP-based data channel in Chrome (sort of what you call "tunnel through the video"), but it provides no congestion control or reliability. To provide those, the IETF has proposed SCTP as the standard. Firefox has already integrated SCTP into their implementation, and we're working to get it into Chrome. So, you can start using the API now, but it'll get much better soon.

SCTP is a pretty cool protocol. I'm actually planning on including it as part of a userland network stack I'm slowly working on.
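
To give a feel for what "no reliability" pushes onto the application, here is a toy receiver that uses sequence numbers to deliver messages in order and report gaps. This is not how SCTP works internally and not Chrome's code; `InOrderReceiver` and its message shape are hypothetical, and congestion control is not attempted at all.

```javascript
// Hypothetical receiver for messages shaped like { seq, payload } arriving
// over a channel that may drop, duplicate, or reorder them.
class InOrderReceiver {
  constructor(onDeliver) {
    this.next = 0;            // next sequence number we can deliver
    this.pending = new Map(); // seq -> payload, held until the gap closes
    this.onDeliver = onDeliver;
  }

  receive({ seq, payload }) {
    if (seq < this.next || this.pending.has(seq)) return; // duplicate, ignore
    this.pending.set(seq, payload);
    // Flush everything that is now contiguous.
    while (this.pending.has(this.next)) {
      this.onDeliver(this.pending.get(this.next));
      this.pending.delete(this.next);
      this.next += 1;
    }
  }

  // Sequence numbers we are still waiting for; a sender could retransmit these.
  missing() {
    if (this.pending.size === 0) return [];
    const highest = Math.max(...this.pending.keys());
    const gaps = [];
    for (let s = this.next; s < highest; s += 1) {
      if (!this.pending.has(s)) gaps.push(s);
    }
    return gaps;
  }
}
```

Even this much leaves out timers, acknowledgements and flow control, which is roughly the argument for getting them from a standard transport instead.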

Sorry, I forget not everyone is on the Dev channel, but I have this available:

Enable RTCDataChannel. Mac, Windows, Linux, Chrome OS, Android: Enable experimental RTCDataChannel for peer-to-peer data communications. Disable

Version 26.0.1397.2 dev

What sort of security is in place - e.g. what's stopping a pop-up ad from logging keystrokes and sending them to a remote endpoint?

This is possible with current technology: $(document).keypress(sendKeypressInfo)

$(document).keypress won't pick up keystrokes from across tabs or windows. So unless the popup is the active window, you should be safe from something like this.

Hmm, we are not talking about web page js here, we are talking about Chrome API js here. It's far more powerful.

Won't the Same Origin Policy limit the exposure via ajax?

They don't have to use ajax. They can load an image with an arbitrary url and pass the keypress data in the url parameters, or dynamically create a script tag, or create an iframe and submit a form in it, etc. The script tag method also lets them get data back from the remote endpoint, if the remote endpoint is kind enough to encode it as JSONP.

I think he's referring to a hostile script trying to bind to keydown -- usually you shove the banners in iframes to limit this possibility when you include external untrusted content. I assume the same holds true here, though.

The Same Origin Policy can be overridden by the site accepting the connection (http://www.w3.org/TR/cors/), so assuming that site is hosted by the attacker it wouldn't be helpful. If the site used (and the browser supported) a Content Security Policy (http://www.w3.org/TR/CSP/) you could restrict such outgoing connections.
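
For illustration, a policy along those lines could be assembled like this. The directive names (`default-src`, `img-src`, `script-src`, `connect-src`) come from the CSP spec; the `buildCsp` helper and this particular policy are just an example, and browser support for the header should be checked rather than assumed.

```javascript
// Turn { directive: [sources] } into a Content-Security-Policy header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

const policy = buildCsp({
  "default-src": ["'self'"],
  "img-src": ["'self'"],     // blocks image beacons to other origins
  "script-src": ["'self'"],  // blocks script tags pointing at other origins
  "connect-src": ["'self'"], // restricts XHR-style connections
});
console.log(`Content-Security-Policy: ${policy}`);
```

With a policy like this, the image and script-tag tricks mentioned above are limited to same-origin URLs, provided the page itself is the one setting the policy.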

I don't think this is an issue. The popup would only capture whatever keystrokes are typed into the popup (as dbaupp illustrated). DataChannels don't change the boundaries within which a webpage/Javascript runs.

Hopefully DataConnections will also be used for video and audio so that encrypted connections can be used.

(Just so you don't copy my mistake, I was incorrect when I called them DataConnections, the term is DataChannel.)

They already are. But sure, you could do that if you wanted.

It's true that the audio, video, and data are all encrypted. But it's not the case that you can send audio and video over the data channel. Well, you could send audio and video data, but there's no way to pipe audio and video data into the data channel. That's an interesting idea, but it would probably take a long time to figure out and become part of the standard, if it ever happened at all. We'd probably need a worthwhile use case to justify it.

My bad, I haven't played with this stuff in... jeez, nearly a year. I had a PoC with a signaling server written in Go, but it hasn't even been updated since the move from ROAP to JSEP and I haven't played with the compat js shim either...

Yeah, let's start calling it web 4.0 now. Truly exciting stuff.

Note that Chrome (or the application) chooses to reverse the self-view so that it acts like a mirror, whereas Firefox (or its application) chooses to display what the other person sees.

It's a hard choice to make.

Actually, not really. It's all HTML5 video elements on the page, so this is all you need:

    .mirrored { transform: scaleX(-1); }
And you can make a button to add/remove that class as needed.
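
The button handler can be about this small. `classList.toggle` is standard DOM; the `toggleMirror` name and the element ids in the comment are hypothetical and up to the page.

```javascript
// Flip the self-view by toggling the .mirrored class on a <video> element.
// Returns true if the class is now present, false if it was just removed.
function toggleMirror(videoElement) {
  return videoElement.classList.toggle("mirrored");
}

// In a page (element ids here are hypothetical):
// document.getElementById("flip").addEventListener("click", () => {
//   toggleMirror(document.getElementById("selfView"));
// });
```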

However, I do hope the implementations are consistent as to which way it flips the video.

Changing it is easy. Adding options is an easy way to avoid choices.

But Jabbles' point still remains: It's a hard choice to make.

I don't think it's a hard choice for the browser to make: send the data as it is captured by the camera. How that is displayed is something else, but as timdorr demonstrates, it's not difficult to toggle.

It's a hard choice for the human who has to pick one or the other and justify that decision.

In what way? The developer should have no problem justifying the decision. "It's what the camera sensor sees".

I absolutely understand the application in webchat and why you would want it to be mirrored at a page level, but I'm at a loss to understand why the browser implementation would try to reflect that.

"reflect" tsk boom!

I agree that it's a hard choice; I had to make this exact choice last week when working on a video chat service. Possibly because of the nature of the service (tutoring/teaching) it felt incredibly unnatural to see a non-mirrored image of myself. I tried it out both ways and it really did feel quite peculiar, and I had a hard time figuring out my own movements (kinda like trying to shave using two mirrors).

The last time I tried this it also flipped the controls. Is this still the case?

Are there any usability studies on video self-views? What are the precedents from other video conferencing software like Apple's FaceTime and Google Hangouts?

I expect it heavily depends on the application, context and user role. So it's a great thing to offer developers the choice of whether to mirror or not.

Why would you make it a mirror image? If they have anything written out it would be backwards.

Because people are used to mirrors. Even people who work around video cameras for a living mix up left and right on a daily basis.

You see a mirrored image of yourself and a non-mirrored image of the other party.

Why are they different?

Because then the thumbnail where you see your own face behaves more like a real-world mirror, which is easier to reason about if you want to adjust your position.

Well, it's encrypted, so that seems like a reasonable alternative to the potential of being spied upon when using Skype.

It may be encrypted -- but is it protected against traffic analysis?

The following papers are a good starting point:

[1] Guessing the URLs being browsed by users over an encrypted TLS session: https://research.microsoft.com/en-us/um/people/gdane/papers/...

[2] Guessing the co-ordinates of where a user is scrolling around on Google Maps over an encrypted TLS session: http://www.ioactive.com/pdfs/SSLTrafficAnalysisOnGoogleMaps....

[3] Guessing the content of encrypted VoIP conversations: http://www.cs.unc.edu/~amw/resources/hooktonfoniks.pdf

[4] Guessing communication paths on the Tor network with only a partial view of the network (not strictly related to encryption but the principles of traffic analysis are relevant): http://www.cl.cam.ac.uk/users/sjm217/papers/oakland05torta.p...

[5] Guessing passwords sent over the SSH protocol using keystroke timing analysis: http://users.ece.cmu.edu/~dawnsong/papers/ssh-timing.pdf

To explain further in the context of WebRTC, traffic analysis of the encrypted data could determine the pace and duration patterns of speech between multiple speakers. This could:

(a) narrow down or uniquely identify each party

(b) provide information on the mood of each party (are they interrupting other speakers more often than usual?)

(c) guess the nature of the call (information dump from one person to another, one person quizzing another person, ...)

(d) determine the languages used in the conversation

(e) guess demographic information from the conversation including approximate level of education/intellect, age, male/female, ...

(f) narrow down the physical location of speakers who are attempting to mask their identity through intermediate nodes
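
A toy version of the first step, to show why encryption alone doesn't hide this: if the codec sends smaller packets during silence, an observer who sees only timestamps and sizes can still recover talk spurts. The packet format, the size threshold and the `talkSpurts` function are all invented for illustration; real attacks like the one in [3] are far more sophisticated.

```javascript
// packets: [{ t: seconds, size: bytes }], sorted by time, one direction only.
// Returns [{ start, end }] intervals where packet sizes looked like speech.
function talkSpurts(packets, speechThreshold) {
  const spurts = [];
  let current = null;
  for (const { t, size } of packets) {
    if (size >= speechThreshold) {
      if (current) current.end = t;          // extend the running spurt
      else current = { start: t, end: t };   // a new spurt begins
    } else if (current) {
      spurts.push(current);                  // silence closes the spurt
      current = null;
    }
  }
  if (current) spurts.push(current);
  return spurts;
}

// Made-up observation: sizes only, no payload access needed.
const observed = [
  { t: 0.00, size: 20 }, { t: 0.02, size: 75 }, { t: 0.04, size: 80 },
  { t: 0.06, size: 78 }, { t: 0.08, size: 22 }, { t: 0.10, size: 21 },
  { t: 0.12, size: 90 }, { t: 0.14, size: 85 },
];
console.log(talkSpurts(observed, 50));
```

Run this on both directions of a call and you have who spoke when and for how long, which is the raw material for guesses like (b) and (c).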

So is Skype. Unless there's evidence that Google couldn't be compelled to comply with US law and honor a legally binding order to intercept a Chrome video call?

The intercept service you're talking about is CALEA if I'm not mistaken, and it's not yet clear that CALEA applies to IP communications.

VoIP is covered by CALEA, but it isn't yet clear if Video is covered. There's a bit of a raging debate about this in Telco circles. There are basically two arguments:

1) Companies that do not operate exchanges are not liable for CALEA compliance

2) CALEA compliance is not clear.

For the first argument, many folks interpret the law as only covering companies that have equipment inside of phone exchanges (CLECs and ILECs). There is a 3rd class of operator that is only IP with no equipment in the exchange. It is not yet clear if this 3rd class has CALEA as a requirement (Google is all IP).

On the second point, there's no clear documentation about acceptable formats for release. Can I send raw log files? Does it have to be a csv? None of this is clearly defined anywhere.

In short, it's a lot more tangled when it comes to video. I'm not certain the Feds could've gotten access to Skype monitoring without $MSFT buying Skype.

Yes, but Skype is not fully decentralized. Skype is no safer than any e-mail service.

Google wouldn't be able to intercept a peer to peer connection, isn't that one of the strengths here?

So if we encrypt stuff we are not spied on? Naive....

Encrypting data certainly increases the barrier to entry for spying.

No it doesn't, it means if somebody doesn't really want to spy on you they just will not waste their time....

Isn't that the definition of a raised barrier? Even if it isn't very high.

Yes it does. Adding encryption, even if it can be broken through sheer computational power, still increases the price tag associated with spying, therefore increasing the barrier to spying.
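
Some back-of-the-envelope arithmetic on that price tag. The guess rate below is an arbitrary assumption, not a claim about any real attacker, and this only models trying every key, not smarter attacks on a particular cipher; the point is how the cost scales with key length.

```javascript
// Years needed to try every key of a given length at a fixed guess rate.
function yearsToExhaust(keyBits, guessesPerSecond) {
  const secondsPerYear = 365 * 24 * 60 * 60;
  return Math.pow(2, keyBits) / guessesPerSecond / secondsPerYear;
}

const rate = 1e12; // hypothetical: one trillion guesses per second
for (const bits of [56, 80, 128]) {
  console.log(`${bits}-bit key: ~${yearsToExhaust(bits, rate).toExponential(2)} years`);
}
```

Every extra key bit doubles the work, so even a modest increase in key length moves an attack from affordable to absurd for anyone relying on brute force.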

Yeah, but what does it matter then? I can stand outside the bank safe all day and stare at it but unless I know the code I'm still just looking at a locked box. Strong encryption is the same way, feel free to stare at my VPN stream all day, it won't help you.

You fail to get my point, why do you think that the US has little issue with allowing encryption to be used these days?

How strong is your 'locked' box? It's only a bunch of 1s and 0s.

Maybe because they realize that treating encryption as munitions is totally unenforceable and strong encryption is basically public knowledge at this point? I know everyone has a cousin in the NSA that knows someone who knows someone who knows how to break Triple DES in 3 minutes, but there's no evidence that is actually real. Even if the NSA or whoever is years ahead of the public in math research (they aren't, BTW, these people would make far more money in the private sector) there's still no evidence to the smartest mathematicians and cryptographers that this is the case. The biggest threats to encryption are parallel computing, weak security, and possibly quantum computing in the future, not a backdoor.

Not to be pedantic but Triple DES could be broken relatively easily by the government, which is why it was replaced with AES. I completely agree with what you are saying in the post, I just wanted to bring up that Triple DES is considered insecure.

Sources: http://arxiv.org/ftp/arxiv/papers/1003/1003.4085.pdf http://delivery.acm.org/10.1145/360000/358718/p465-merkle.pd...

Unless the NSA/FBI/whatever is decades ahead in mathematical research from what is common in academia, or has computers that are many orders of magnitude faster than what we have today, well encrypted data is basically random data to anyone who lacks the keys. That's a pretty tough locked box.

"Unless the NSA/FBI/whatever is decades ahead in mathematical research from what is common in academia"

I wouldn't be surprised. Basically 9/10 people that study cryptography in the US end up working for the NSA. Look at the NSA's budget, and then look at how many cryptography professors there are.

However, that said, IF they managed to decrypt AES somehow, they are severely limited in what they can do with that information. Basically anything they do with the decrypted data has to in no way alert anyone that they have that capability.

So that data DEFINITELY can't be used in court, nor for a whole array of other more covert operations.

The NSA has not only got to be decades ahead of the rest of the world, it has to be completely certain that no-one else in the world will make the same discovery for at least another few decades.

It seems unlikely in the extreme that the NSA is both so far ahead of the rest of the world, and so sure that their discoveries will not be replicated for decades.

And who packages that box?

You're veering awfully metaphorical, but it might be relevant to point out that the crypto implementations in wide use today are typically open source, and the algorithms themselves are public, and widely reviewed by independent cryptographers who have no incentive to do anything sinister.

In fact, they have a good incentive not to do sinister things. If they do, and other cryptographers find out, they will lose a lot of standing in the academic community.

Imagine how much more awesome this will be when we also have IPv6. We will have no more need for things like ICE [1], just direct point to point communication. Oh, and multicast for giant video chat sessions.

[1] http://en.wikipedia.org/wiki/Interactive_Connectivity_Establ...

Multicast on the internet is a nice dream. Will it actually happen with IPv6? My understanding is there is a barrier to adoption: basically all routers involved need to be aware of the multicast initiation/join/leave protocol, and there's a bunch of old hardware/software out there that isn't aware. Is all IPv6-capable hardware capable of this (given that IPv6 came out in ... 1998)?

I had the impression that the biggest hurdle for multicast to happen on the internet is the insane memory requirements for routers to track all the multicast memberships (as well as being extremely open for abuse -- imagine one client subscribing to every multicast stream out there and never leaving, that would flood the incoming pipes for that ISP)

While I don't know anything about how multicast is specified, I would certainly hope that it is possible to implement in such a way that each router only needs to track the memberships of its direct neighbors. They would then themselves advertise that membership in order to serve their neighbor's request.

As for the abusive client, I say if the ISP has any sense of self preservation, they should be interested to police this themselves. For example, knowing how much bandwidth the client is paying for, they could refuse additional or even cut existing memberships of that client once that limit is reached.

The problem is that a router's direct neighbors may be subscribed to millions of multicast groups and nobody has enough TCAM to track those memberships.
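
A toy model of the state in question, to make the scaling argument visible. This is not how any real router or multicast protocol is implemented; `MulticastTable`, its methods and the group names are invented. The point is that the table needs an entry per active group, however few neighbors there are.

```javascript
// Hypothetical per-router membership table: group -> set of interested neighbors.
class MulticastTable {
  constructor() {
    this.groups = new Map();
  }

  join(group, neighbor) {
    if (!this.groups.has(group)) this.groups.set(group, new Set());
    this.groups.get(group).add(neighbor);
  }

  leave(group, neighbor) {
    const members = this.groups.get(group);
    if (!members) return;
    members.delete(neighbor);
    if (members.size === 0) this.groups.delete(group); // reclaim the entry
  }

  // Which neighbors should get a copy of a packet sent to this group?
  forwardTo(group) {
    return [...(this.groups.get(group) || [])];
  }

  entryCount() {
    return this.groups.size;
  }
}

// Only two neighbors, but state still grows with the number of groups.
const table = new MulticastTable();
for (let i = 0; i < 100000; i += 1) {
  table.join(`group-${i}`, i % 2 === 0 ? "neighborA" : "neighborB");
}
console.log(table.entryCount());
```

In software a map like this is cheap. The complaint above is that forwarding hardware has to hold the equivalent in a small, expensive lookup memory.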

IPv6 is also pretty much a requirement due to the whole multicast MAC address collision thing I believe.

That's never going to happen b/c there is too much legacy infrastructure and the benefit is marginal.

All current home routers are configured to do NAT, so you're not going to be able to have a Skype 2.0 be P2P b/c then you'd need to explain to millions of people how to reconfigure their routers.

Also P2P opens a whole new can of worms when it comes to security.

> That's never going to happen b/c there is too much legacy infrastructure and the benefit is marginal.

It is going to have to happen, or there will be no internet.

> All current home routers are configured to do NAT, so you're not going to be able to have a Skype 2.0 be P2P b/c then you'd need to explain to millions of people how to reconfigure their routers.

Current home routers will be replaced within a year or two. They do not last that long. Some people are starting to not have dedicated routers, instead renting equipment from the ISPs.

> Also P2P opens a whole new can of worms when it comes to security.

Firewalls have been around for a long time. We'll survive.

You're also assuming that there won't be NAT with IPv6; I don't know that that is a safe assumption.

"Current home routers will be replaced within a year or two."

Yeah right! My grandma isn't gunna be changing her router. And she'll be pissed if her stuff stopped working.

At the end of the day, if you have the choice between making a NAT friendly product that everyone can use and a non NAT friendly product that has some extra privacy - you'll be choosing the NAT friendly option.

"Firewalls have been around for a long time." The thing in windows that everyone disables?

NAT is transparent and foolproof. How can you ensure a 3rd party won't be able to connect to your fancy new P2P chat client? Security holes are pervasive in networking code, so without NAT and with a bit of sloppy code, the "bad guys" can at any time connect directly to your computer. Can you imagine the chaos that would happen if there was a zero day bug in a P2P Skype? With NAT 99% of your problems are gone. It's physically impossible to connect to a computer that doesn't have port forwarding.

Unless the US gov't comes in and enforces a switch to IPv6 (like it did with digital TVs and all those subsidized converter boxes), directly connecting to computers isn't going to happen.

NAT doesn't protect you from security problems, it just makes it harder to connect directly to you, requiring an intermediate exchange that is on an accessible server, resulting in more points that can be compromised.

It reduces, rather than increases security, since now the communication can be compromised by a security hole at either end (and NAT doesn't stop the machines behind it from being compromised) or at the exchange intermediating between them (which, most likely, neither party has any control over or detailed knowledge of the security practices in place on.)

And major ISPs are deploying IPv6 now: no mandate required.

"requiring an intermediate exchange that is on an accessible server"

Exactly. So the onus of security is pushed off solely onto the centralized intermediary. In my example it's the Skype servers.

They can very easily firewall and filter all the connections. They can have a much stricter filter than what you have on your computer. (ex: packets have to very strictly conform to a certain standard generated by the client side program)

Centralized servers are also more secure because you don't have any access to the server code and it becomes virtually impossible to look for exploits.

Also if any bug IS found, then patching it is trivial b/c it's at one central point. If worse comes to worst you just shut down the server and now all your clients are safe.

No, the onus of security isn't pushed off on to the intermediary. The communication can still be compromised by compromise of either endpoint. The intermediary is an _additional_ point of failure.

With P2P communications between Ann and Bob, a compromise of Ann's machine or Bob's machine compromises the communication.

With NAT preventing P2P communication between Ann and Bob and requiring them to communicate through intermediary Charlie who is publicly accessible, a compromise at Ann's, Bob's, or Charlie's location compromises the channel.

Systems can be compromised without hosting publicly-visible servers, as has been demonstrated in every remote browser-based exploit ever.

So, Charlie's system may be more secure than Ann or Bob's systems, but that doesn't matter because it doesn't _replace_ Ann and Bob's systems, which are still part of the communication channel. More points of vulnerability always means less security, even if the new point of vulnerability is, considered alone, more secure than the most secure existing node.
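A back-of-the-envelope version of this argument, as a sketch only: it assumes each machine is compromised independently, and the probabilities below are made-up numbers, not measurements.

```python
from math import prod


def channel_compromise_probability(node_risks):
    """Chance that at least one node on the path is compromised,
    assuming the nodes fail independently of each other."""
    return 1 - prod(1 - p for p in node_risks)


# Made-up illustrative risks: Charlie is far better secured than Ann or Bob.
ann, bob, charlie = 0.05, 0.05, 0.01

direct = channel_compromise_probability([ann, bob])
relayed = channel_compromise_probability([ann, bob, charlie])

print(f"direct (Ann <-> Bob):             {direct:.4f}")
print(f"relayed (Ann <-> Charlie <-> Bob): {relayed:.4f}")
```

With these numbers the relayed path comes out slightly riskier than the direct one, even though Charlie is the most secure node. Under the independence assumption, any node with nonzero risk can only raise the total.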

I'm quite pessimistic about the infrastructure, considering the speed of acceptance of IPv6.

It's good to see this moving forward. A few questions need to be addressed in order for this to become really useful.

How is XMPP/Jingle supposed to work over WebRTC? I.e. in order for the browser to connect to a standalone VoIP client for example (think of implementing Google Talk plugin in pure JavaScript). From what it looks like, this can require some support for this scenario in XMPP servers (not unlike they need something to support XMPP over WebSockets). So this is really an important piece of the puzzle which needs to be ready.

Interoperability with V/VoIP protocols will still require a translation layer somewhere. That can be on the browser, or on the server, but you will still need some server-side component to relay the signaling information through (eg, websocket proxy), or a browser-interoperability mechanism (for xmpp, BOSH). If the remote endpoint doesn't support WebRTC's flavor of SRTP, then you're going to have to relay media through your server.

If SDP offer/answer is still being considered for WebRTC, then that should simplify interop a bit since media negotiation shares a common denominator with most other V/VoIP protocols (which is really the only relevant part when we're talking about WebRTC). Of course, all that is moot if both endpoints can't agree on a codec to support. Opus is brand new, and no one cares about VP8. Hardware devices are not likely to be compatible anytime soon (without a transcoding layer in between), which leaves softphones.
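The "agree on a codec" step can be sketched roughly like this. It is a simplification of offer/answer negotiation: the function name and the codec lists are hypothetical, and real negotiation also covers payload types, clock rates, and parameters.

```python
def negotiate_codec(offered, supported):
    """Return the first offered codec the answerer also supports, else None.

    `offered` is in the offerer's preference order; names are compared
    case-insensitively.
    """
    supported_lower = {codec.lower() for codec in supported}
    for codec in offered:
        if codec.lower() in supported_lower:
            return codec
    return None


# Hypothetical capability lists, for illustration only.
browser_offer = ["opus", "PCMU"]
softphone = ["PCMU", "PCMA", "G722"]
desk_phone = ["G729"]

with_softphone = negotiate_codec(browser_offer, softphone)
with_desk_phone = negotiate_codec(browser_offer, desk_phone)
print(with_softphone)   # PCMU
print(with_desk_phone)  # None
```

A `None` result is the case where a transcoding layer in between would be needed.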

That's what I meant - most probably some server support will be required. I hope ejabberd and others will address this. Until then, WebRTC won't be useful for building web-based XMPP/Jingle clients.

Regarding codecs - most XMPP/Jingle clients support VP8 and Opus is catching up too. It's not like there are many of them around anyway. Farstream supports both if corresponding gstreamer components are present.

IIRC, ejabberd is agnostic about the XMPP messages going through, you should be able to use strophe.js to craft the proper Jingle packets from the browser's offer/answer session description.
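To give an idea of what "crafting the Jingle packets from the session description" involves, here is a rough sketch in Python rather than strophe.js. It only maps `m=` and `a=rtpmap` lines to Jingle RTP `<payload-type/>` elements. Transports, crypto, and everything else a real session-initiate needs are left out, and the `sdp_to_jingle` helper, the sample SDP, and the `sid` value are all made up for illustration.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "urn:xmpp:jingle:1"
RTP_NS = "urn:xmpp:jingle:apps:rtp:1"


def sdp_to_jingle(sdp, sid, action="session-initiate"):
    """Build a partial <jingle/> element from the media lines of an SDP blob."""
    jingle = ET.Element("jingle", {"xmlns": JINGLE_NS, "action": action, "sid": sid})
    description = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # "m=audio 9 RTP/SAVPF 111" -> one <content/> per media section
            media = line[2:].split()[0]
            content = ET.SubElement(jingle, "content", {"creator": "initiator", "name": media})
            description = ET.SubElement(content, "description", {"xmlns": RTP_NS, "media": media})
        elif line.startswith("a=rtpmap:") and description is not None:
            # "a=rtpmap:111 opus/48000/2" -> id, name, clockrate, optional channels
            pt_id, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            parts = encoding.split("/")
            attrs = {"id": pt_id, "name": parts[0], "clockrate": parts[1]}
            if len(parts) > 2:
                attrs["channels"] = parts[2]
            ET.SubElement(description, "payload-type", attrs)
    return jingle


# Made-up, heavily trimmed SDP for illustration.
sample_sdp = """v=0
m=audio 9 RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
m=video 9 RTP/SAVPF 100
a=rtpmap:100 VP8/90000
"""

jingle = sdp_to_jingle(sample_sdp, sid="example-sid")
print(ET.tostring(jingle, encoding="unicode"))
```

The resulting element would still have to be wrapped in an `<iq/>` stanza and sent over whatever BOSH or WebSocket connection the browser has to the server.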

Given codec and SRTP compatibility with clients, eg GTalk (which I don't know about), you should be good to go.

But ejabberd still had to enable support for BOSH or WebSockets first. Without it Strophe was of no use. So I assumed that a similar thing has to happen in this case in order to be able to route the Jingle stream from the browser over WebRTC. If the server modifications won't be needed at all - then of course it'll simplify things. Also, clients don't always use SRTP - there can be other methods like ZRTP for example.

ejabberd already supports BOSH. The rest is up to the client since it's P2P.

Fantastic news. I feel like the basic videochat model we see in WebRTC demos is only the beginning of the possibilities here.

When all browsers support WebGL and WebRTC I'll be fascinated to see what people more creative than myself can create.

Chrome now has WebGL enabled for everyone that supports it, which is a big leap towards getting the majority of users on compatible browsers.

Do you know if IE will implement it?

I have a terrible confession to make. I can't bring myself to be too excited about this, yet. I can already make a video chat from my Linux box running either Firefox or Chrome to my mother running IE. What, exactly, are the benefits of this?

No third party plugin is required to be installed.

I get that, though I do not necessarily get what makes that so awesome. Essentially, my browser is already a third party plugin to my computer.

I think I would be more excited if this were distinct plugins running in the respective browsers. Would be much more indicative of a truly free environment, instead of just two of the top contestants doing something.

It's so that if you tell your mother to get google chrome, she does not also have to get a separate plugin.

You aren't exactly making this sound more free. I would honestly prefer it if I could just buy her a large TV and we could set up video calls with it. (They already have cameras built in nowadays.) Hell, why stop at the TV. Her phone likely has all that is needed for this. Does this go any further to making that happen?

I guess I can see how it does. I mean, the idea is that this protocol can be used to communicate between two vendors. But, that can already be done. It is done, on a regular basis. What, exactly, makes this special?

At a technical features level? Absolutely nothing. This provides nothing to an end user that they couldn't already get. The only thing this changes, from an end user perspective, is the number of steps it takes to do things.

tl;dr: This is an evolution of browser standards, not a revolution in consumer software (from an average end-user POV)

From a developer POV, this is pretty cool because now it is (or at least, is well on its way to being) significantly easier to build an app that relies on real-time media shared between peers, with the only onus on the end user being having an up-to-date browser. The simplest and most obviously useful application of this is a basic VoIP tool.

I think this is ultimately my confusion. I'm at the point now where I'm actually usually advising family members on devices that rely less on browsers. Why not show a chat from gmail/whatever in a browser to a regular app on an android/ipad device? Even better, include another app in there. Just to really drive home how "open" this is.

Also, why is this stuff better than SIP related technology from a while back?

I believe gchat would do that, google hangouts might integrate with the ios/android app as well (I have not tried this).

The excitement here is that this is a non-flash solution, so technically it will work on any device that can run Chrome, regardless of whether or not it can / wants to run Flash. So in the near future this may make it into Chrome for Android and iOS.

WebRTC is a very necessary step in removing Flash's hooks into the modern internet; after this, there will be very few reasons to support it at all anymore. And that is a very good thing for speed, compatibility, and security.

I guess I'm just confused by two points. First, the odd belief that non-flash automatically equals good. Second, that non-flash for some reason necessitated "in browser." Why?

Neither of these automatically grants any additional security. If anything, it is just tying us to fewer vendors that can do this. I guess it doesn't matter, but I can recall a time when I was able to choose which application handled certain content type. We seem to be saying we do not want that for video now.

the "web" bit.

It's like how WebGL is very exciting, to people doing web stuff, even though it has nothing over other similarly named stacks.

Awesome technology but christ that conversation was awkward. Can't wait to play with it though.

I haven't really been keeping up on WebRTC, so forgive this fairly ignorant question.

Will the framework allow me to create a one-to-many connection? i.e. presentation mode where one person could broadcast their audio / video to many viewers? Or is it simply a 1:1 connection?

"Hello, Mike."

"Hello, Joe! System working?"

"Seems to be."

"Okay, fine."

Looks neat.

One thing I wonder about tech' like this, is that it is encrypted from you to the service, but there is no assurance of privacy.

Someone who runs a service like this can easily drop in on your calls and eavesdrop. Which is a legitimate concern if someone wanted to use this either in a corporate environment or for very private calling (e.g. husband and wife, doctor and patient, etc).

No current video tech' really offers much in the way of assured privacy. Skype used to but it has been largely rolled back since Microsoft took over.

My understanding is that it's possible to not have a 'service' and instead form connections from one person directly to another. That decentralization would be a privacy boon.

Well, without knowing the details of how they think two computers can reliably form a secure link without foreknowledge of each other, this doesn't exactly sound bullet proof. Self signed certs or something similar can only go so far. Hell, CA verified certs don't exactly go as far as most should be comfortable with.

Unfortunately, no: both browsers need to exchange volatile information to establish the connection. It includes information like how to traverse NATs to get to the other browser. You need some more centralized service to exchange that information and bootstrap the process.

As others have pointed out, the data doesn't go through a central service, but is actually p2p. The only thing a central service would be required for is establishing the connections so that the browsers can find each other on the internet.
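A toy model of that split, for what it's worth. This is not the WebRTC API: the `SignalingServer` and `Peer` classes, the addresses, and the `NETWORK` dict standing in for direct routing are all invented for illustration. Real WebRTC exchanges session descriptions and ICE candidates, and leaves the signaling transport up to the application.

```python
NETWORK = {}  # address -> Peer; stands in for direct routing between hosts


class SignalingServer:
    """Hypothetical rendezvous service: relays small setup messages only."""

    def __init__(self):
        self.peers = {}
        self.relayed = []  # everything the server ever gets to see

    def register(self, peer):
        self.peers[peer.name] = peer

    def relay(self, sender, recipient, message):
        self.relayed.append((sender, recipient, message["type"]))
        self.peers[recipient].on_signal(sender, message)


class Peer:
    def __init__(self, name, address, server):
        self.name = name
        self.address = address  # stands in for the connection details
        self.server = server
        self.remote = {}        # peer name -> address learned via signaling
        self.inbox = []
        server.register(self)
        NETWORK[address] = self

    def call(self, other):
        offer = {"type": "offer", "address": self.address}
        self.server.relay(self.name, other, offer)

    def on_signal(self, sender, message):
        self.remote[sender] = message["address"]
        if message["type"] == "offer":
            answer = {"type": "answer", "address": self.address}
            self.server.relay(self.name, sender, answer)

    def send(self, other, payload):
        # Goes straight to the other peer's address, bypassing the server.
        NETWORK[self.remote[other]].inbox.append((self.name, payload))


server = SignalingServer()
ann = Peer("ann", "198.51.100.7:5000", server)
bob = Peer("bob", "203.0.113.9:6000", server)

ann.call("bob")           # offer and answer go through the server
ann.send("bob", "hello")  # media/data goes peer to peer
bob.send("ann", "hi")

print(server.relayed)
print(bob.inbox, ann.inbox)
```

In this model the server's log contains only the offer and the answer, never the payloads. That is the sense in which the central service is needed just to bootstrap the connection.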

I believe WebRTC is peer to peer.

We are in the process of adding persistent chat and conferencing to Twelephone (http://twelephone.com). These features, in addition to our existing audio/video calling and presence using Twitter as a directory service, should put our free service near feature parity with Skype. We're using HTML5 WebRTC with encrypted peerconnections and soon datachannels. Stay tuned...

Very cool, but there's still a lot that each group needs to do to be truly interoperable. You can see a rundown of the issues on webrtc.org:


Most of it can be worked around, but for a web developer, there's still a lot of wiring that has to get untangled for it all to work as seamlessly as it appears in the video demos.

Definitely some cool technology, though.

What kind of video codec/protocol/format/platform/whatever does it use? I was having a devil of a time trying to get something up that's real time-ish, lately:


Does this mean direct peer 2 peer connection within browser? Under what conditions can there be direct peer-to-peer connection? For example is it possible to have a chat application that is purely peer-to-peer?

I was very amused at how Hugh Fenning just read the script the whole time.

I was really looking forward to this! Would it be possible to have more than one person connected in a html5 mmo game?

It looks like it works very smoothly and the fact that no account or additional software is needed is great!

It is also supposed to work Chrome to Chrome right? It says the room is full. Anybody get same error?

Looks great, is the Google Dart team working on getting this integrated?

I'm not following what this is. Can someone explain it like I'm five?

The first guy is talking to the second guy, even though they're very far apart. We used to have to use a special part of the computer for this, but now we can use the regular part that we use for most of our other stuff.

We also used to have to ask another person every time we wanted to talk to each other this way. Now, some of the time, we can talk to each other without asking him first.

Explaining it like you're ten might be more useful:

You know Skype? Well, now we can do that on the Internet [Ed: I know, but you're ten, so bear with me]. Also, we can send the video straight from one computer to another. We used to have to send it far away, to another computer, and then he would send it to the person we were talking to. We can also hide what we are saying from other people.

Pretend you're mailing a letter to Grandma. Before, we had to mail a postcard to our mean aunt, who would send it to Grandma. Now, we can put it in an envelope and send it straight to Grandma's house.

Remember when we videochat with Grandma using Skype? Well, for that to happen she has to use Skype and so do we. There is no other use for that program but to videochat.

Now, we can videochat using the same program we use to play those silly Flash games instead. And we don't even have to be using the same program, she can use Firefox, and we can still use Chrome.

This also opens new things you can do that you couldn't before. Given that Chrome or Firefox do a lot more than just videochat, we can now embed videochat in all the other things the browsers can do, like in games or live support.

Phew, just restored equilibrium by upvoting only this submission

FF is not worried. Chrome has a conflict of interest when it comes to ad blocking. FF will always do it better. And ad networks are a major infection vector, so more users are needing ad blocking than ever before.

This has nothing to do with the article. Though, since I just love me some flame bait, Google will never take ad block out of Chrome because someone would just fork Chromium and keep it in.

You are saying google is irrelevant to chrome? I disagree.

Are you saying google is irrelevant to firefox?

I know that Netscape open sourced its code and that became Mozilla/FF.

A final good bye to all those crappy flash-based chats.

That's cute.

Please don't let the DoJ see this video. They will want a backdoor at the protocol layer.
