Hacker News
YouTube's road to HTTPS (youtube-eng.blogspot.com)
282 points by seanwatson on Aug 1, 2016 | 93 comments



> We found that HTTPS improved quality of experience on most clients: by ensuring content integrity, we virtually eliminated many types of streaming errors

Wow, what were carriers doing to the streams?!


It's long been known that some carriers purposefully degraded YouTube quality to keep from saturating their interconnections. That's the reason[1] behind the Google Video Quality Report[2]. To combat this, YouTube does (did?) in some cases redirect you to a comparison of how well your ISP streamed content compared to others when there were streaming problems.

1: Well, I'm assuming, but it seems likely this was the main reason.

2: https://www.google.com/get/videoqualityreport/


Degrading video quality is not a bad idea if you have limited bandwidth.

This would help in cases such as airplane flights. One person watching HD cat videos is going to consume more bandwidth than 20 people doing work.

Now, assuming bandwidth increase is difficult to achieve, they would be forced to keep increasing the price instead.

Prioritizing non-video content is challenging if all content is encrypted.


Why not limit the bandwidth per user instead? Youtube itself will auto-adjust quality based on connection speed.


Exactly – this is both easier to implement and has the advantage of not incentivizing people to use something like a VPN to get more bandwidth at the expense of everyone else.


Per-user limits are easier to visualize than to implement. Consider that there are a bunch of packet gateways sitting behind a load balancer, and each HTTP session may end up on a different server. No single entity counts live bandwidth usage on a per-user basis, let alone controls it. Billing and metering are done per session, through logs. So from T-Mo's point of view it is much easier to detect an HTTP session as video and just throttle that session.


It is very easy to implement this (I have worked on such a limiter before). You pick the point of entry into your network (Wi-Fi connection, ISP connection), keep packet and byte counts for every such point of entry, and limit them.

ISPs have it especially easy, because they can be assured of being able to distinguish traffic from a given user (hardware control of the medium). It's a bit harder in wireless scenarios, since the client can spoof multiple different IDs, but it's hard for them to keep open a TCP connection under those conditions.
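
For illustration, a minimal token-bucket sketch of that counting scheme in Python — the rates, IDs, and drop policy here are all made up:

    import time

    # One bucket per point of entry; refill at the allowed rate, spend on packets.
    class TokenBucket:
        def __init__(self, rate_bytes_per_sec, burst_bytes):
            self.rate = rate_bytes_per_sec
            self.capacity = burst_bytes
            self.tokens = burst_bytes
            self.last = time.monotonic()

        def allow(self, packet_bytes):
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_bytes:
                self.tokens -= packet_bytes
                return True
            return False  # over the limit: drop or queue the packet

    buckets = {}  # entry-point ID -> TokenBucket (hypothetical 1 MB/s, 2 MB burst)

    def on_packet(entry_id, packet_bytes):
        bucket = buckets.setdefault(entry_id, TokenBucket(1_000_000, 2_000_000))
        return bucket.allow(packet_bytes)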


Eh? Even cheap home routers these days allow you to set QoS rules per device. Have I misunderstood something?


Just because your cheap home router does it doesn't mean it scales to thousands of users on one router. Some home routers are actually quite capable AND very unsaturated. I'm not trying to defend carriers, but it is an apples-to-oranges comparison.


We did that on low-end PC hardware 15 years ago for conference and guest networks (800+ simultaneous active users, LAN and WiFi).

I find it unlikely that Linux or FreeBSD have gotten less efficient since then, and the hardware has made enormous improvements, far in advance of the common uplink speed.


Are there thousands of people on a single flight?


And you thought legroom was bad now....


Cheap home routers with QoS only do OUTBOUND QoS.


Per-user limits are certainly way easier to implement than deploying a cluster of proxies doing on-the-fly transcoding of video streams(!)


My university limits my bandwidth to 20 Mbps for HTTP/HTTPS and lower for SSH. Are they in a different position from normal ISPs?


Really? T-Mobile should know your MAC address and IP and be able to rate limit based on that.


Let's say there are only 2 users. Then a lot of bandwidth would go unused if each has a hard cap.

Ideally, there would be a dynamic limit based on network utilization and type of content consumed.


Don't think hard cap. Think available bandwidth / number of users.
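
Sketched in Python, that's basically max-min fair share — users demanding less than the equal split free up capacity for the rest (numbers are hypothetical):

    # capacity and demands in Mbps
    def fair_shares(capacity, demands):
        shares = {}
        remaining = dict(demands)
        while remaining:
            per_user = capacity / len(remaining)
            light = {u: d for u, d in remaining.items() if d <= per_user}
            if not light:  # everyone left wants more than the equal split
                for u in remaining:
                    shares[u] = per_user
                break
            for u, d in light.items():  # satisfy the light users fully
                shares[u] = d
                capacity -= d
                del remaining[u]
        return shares

    print(fair_shares(100, {"video": 200, "email": 5, "ssh": 1}))
    # {'email': 5, 'ssh': 1, 'video': 94}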


The auto-adjustment is a problem for fast but traffic-limited connections: it detects that the connection is fast, switches to HD or even 4K, and the traffic allowance gets used up faster even if the user doesn't need the better-quality stream.


Not sure about other services, but at least Gogo was limited to about 500 kbps per user, last I heard.


> Why not limit the bandwidth per user instead?

Because that would require actual effort from ISPs, which would be in conflict with their current business practice that can be summarized as "to us, you are all equally worthless".


> One person watching HD cat videos is going to consume more bandwidth than 20 people doing work.

One HD stream probably uses more continuous bandwidth than those 20 people, so just limit that. Allow bursts, maybe.

> Prioritizing non-video content is challenging if all content is encrypted.

Good! It's none of their business what I am doing with my bandwidth!


The problem here is: from whose perspective is there "limited bandwidth"? To the ISP there is limited bandwidth available, but to the customer, the service they pay for is being purposefully degraded because the ISP doesn't want to deliver on what they've marketed.

> Now, assuming bandwidth increase is difficult to achieve, they would be forced to keep increasing the price instead.

And they should. That ISPs commonly use deceptive practices in pricing and delivering service is not a good thing; it reduces market information. There should be much more granular pricing based on what's actually delivered, but the major ISPs don't want to go that way, because then they would actually have to account for how what they deliver so often falls below what they market.

For example, when I moved into my brand new built house a couple years back and got a 25 Mbit Comcast connection set up, the following conversation happened:

Installer: Wow, you have the best signal I've ever seen, actually.

Me: Really? That's good. So what throughput am I seeing?

Installer: Let me check. (Installer does a speed/circuit test). About 14 Mbit.

Me: Didn't I order 25 Mbit?

Installer: Yes, but lots of things can affect that, such as line quality...

Me: (Having worked at a local ISP multiple times in the past for years, cuts him off, realizing the futility of this conversation). Okay, that's fine.

In what reality do "the best signal I've ever seen" and 56% of the advertised throughput coincide? (This was not because the connection was overused by others in the neighborhood, either; it was fairly consistent at 14 Mbit.)

> This would help in cases such as airplane flights. One person watching HD cat videos is going to consume more bandwidth than 20 people doing work.

So would separate tiers of connectivity. If you are doing business, you may be happy with a guaranteed minimum throughput, while other people (such as those streaming) might be fine to take up the slack or excess (since you can cache future video). We've had this for a long time through QoS.

> Prioritizing non-video content is challenging if all content is encrypted.

So don't prioritize based on content, prioritize based on connection.


FWIW, every time (about 6 times in 3 years I think) I've complained about slow connections to my ISP, with documentation from speedtest.net that I'm getting less bandwidth than promised, they've given me that month free of charge.


Sounds like there's some room here for a service that does speed tests and files complaints each month automatically.


Yeah, I thought about it, but we recently got the decent-speed option included almost-free with our cable TV (15 Mbps for $4/month), so I couldn't be bothered anymore. Used to be on 30 Mbps for $40/month.


In which case you do what should have been done all along: rate limit without regard to packet contents. People who still need better QoS than that can pay for a higher tier to get a bigger chunk of the pie. With effective rate limiting in place, the video bitrate can be dropped by the server as necessary to maintain the stream. The infrastructure in the middle doesn't have any reason to meddle further.


It's turning out that HTTPS is more important for content integrity than for security. It's not surveillance that's the big problem for static content. It's middle boxes designed with the delusion they're entitled to mess with the content.


> It's not surveillance that's the big problem for static content.

Why would you say this? Surveilling what URLs someone is accessing / content they are watching / books they're checking out of the library has been a major security issue, historically.


HTTPS doesn't conceal the domain, or the length, or the order of requests. It's less of an issue for Google, because so much stuff comes from one domain. For small sites, figuring out who read what isn't a hard problem. In practice, you can probably buy that info from an ad tracking service.
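
A toy sketch of that inference in Python — an observer matches the size of an encrypted transfer against a small site's known pages (the sizes here are made up):

    # public knowledge: fetch the site yourself and record response sizes
    page_sizes = {"/index.html": 14200, "/contact.html": 3100, "/report.pdf": 882000}

    def guess_page(observed_bytes, tolerance=0.05):
        # TLS adds some framing overhead, so match within a tolerance
        return [page for page, size in page_sizes.items()
                if abs(observed_bytes - size) <= size * tolerance]

    print(guess_page(14500))  # ['/index.html']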


You didn't qualify your claim as only pertaining to small sites. You simply said surveillance is not the big problem for static content. Given that most static content is served by large sites and the article is about Youtube, you haven't really supported that claim.


I'd temper that statement significantly. Content integrity is the visible consequence of improved HTTPS compliance and use.

The biggest factor about surveillance is that you are rarely aware you're being surveilled. Direct evidence is rarely present.

Case in point, Michael Lewis's Flash Boys: HFT trade intercepts weren't being overtly signalled; they were only evident when trades were structured such that they bypassed the opportunity to intercept intentions at the first market.


Attempting to inject ads? Reducing the size by applying JPEG compression? :)


When I had Sprint, they did in fact scale down and recompress all images as lower-quality JPEGs.

Some carriers even attempt to realtime transcode video into a lower bitrate. The implementations of this are universally poor and produce rather broken files.


The more modern implementation of this (I'm looking at you, T-Mobile) is just to throttle all connections to content providers. They rely on the content providers' own adaptive streaming technology to deliver the appropriate 480p video when presented with a constrained connection.


> The more modern implementation of this (I'm looking at you, T-Mobile) is just to throttle all connections to content providers.

If YouTube can't handle traffic shaping, YouTube is broken.


If your ISP applies artificial traffic shaping, your ISP is broken.


What's "artificial" shaping?

The average consumer ISP has an oversubscription ratio of 70-to-1.

Unless you want to pay 70x as much for your "100mbit" connection, there are going to be times when packets get dropped.

Isn't it better to drop packets fairly among subscribers? No shaping would result in whoever is using the most dominating everyone else.

This basic principle is still neutral; you don't have to shape based on destination / content provider.


I should be able to turn it off, but I like that T-Mo throttles my video. I only get 2.5 GB; I don't need 1080p.


You can turn it off, but it removes the zero rating. Up to you either way.


I'm on the light package so I don't even get the zero rating.


What the ISP is doing isn't any different from what any private network could do. If YouTube cannot handle that, YouTube is not playing well with standard network practices and is broken at a technical level.

Your ISP doing traffic shaping is a question of whether they should, not a technical question.


In this case, that's what "broken" means. Not that there is something technically wrong, but that they are doing things that they shouldn't.


That shaping should not be broken by HTTPS. The only thing that would be affected by HTTPS is carriers munging the bits.


Not ideal but that's a lot less intrusive.


The most common implementation of this appears to be from a company called Bytemobile which also injects some pretty buggy JavaScript into HTML content.

The good news is that you can disable all of their degradation by adding "no-transform" to your Cache-Control headers:

https://kornel.ski/en/proxies

“The standard HTTP header Cache-Control: no-transform which tells proxies not to modify responses was respected by all proxies I've encountered. This is amazing! It wins an award in the "Best Supported Obscure Header" category.”
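
For example, a minimal sketch with Python's stdlib (not specific to Bytemobile; any proxy honoring the header should leave the bytes alone):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b"bytes the proxy should not touch"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Cache-Control", "no-transform")  # the magic header
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8080), Handler).serve_forever()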


I used to think it was silly to use a VPN on a cellphone when not on WiFi. These days, not so much.


I grew up in rural Iowa with a poor Internet connection (satellite) that dropped packets if there were thick clouds, any snow or rain, even wind sometimes. Integrity checks would have helped a lot.


TCP already covers that sort of thing. What HTTPS adds is integrity checks against an intelligent attacker, not just random natural interference.


TCP only gives you 16 bits of checksum, which is really not enough. I used to work for a CDN which served downloads over HTTP, and when the object size grew to a few gigabytes, a non-negligible percentage of users ended up with corruption.

The SSL MAC is valuable even in the absence of enemy action.
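
A quick Python illustration of how weak 16 bits is: the Internet checksum is a ones'-complement sum of 16-bit words, so, for example, swapping any two aligned words goes completely undetected:

    def inet_checksum(data: bytes) -> int:
        if len(data) % 2:
            data += b"\x00"  # pad to a whole number of 16-bit words
        total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
        while total >> 16:  # fold carries back into the low 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    a = b"ABCDEFGH"
    b = b"CDABEFGH"  # first two 16-bit words swapped: corrupted, same checksum
    print(inet_checksum(a) == inet_checksum(b))  # True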


This is a really interesting 'real world' anecdote. Do you know of any related data available publicly? i.e. "on our network, X bits in Y terabytes end up with undetected corruption, meaning that approximately Z% of downloads of a 2GB file will have at least one bit error".


From http://noahdavids.org/self_published/CRC_and_checksum.html

In "Performance of Checksums and CRCs over Real Data" Stone and Partridge estimated that between 1 in 16 million and 1 in 10 billion TCP segments will have corrupt data and a correct TCP checksum. This estimate is based on their analysis of TCP segments with invalid checksums taken from several very different types of networks. The wide range of the estimate reflects the wide range of traffic patterns and hardware in those networks. One in 10 billion sounds like a lot until you realize that 10 billion maximum length Ethernet frames (1526 bytes including Ethernet Preamble) can be sent in a little over 33.91 hours on a gigabit network (10 * 10^9 * 1526 * 8 / 10^9 / 60 / 60 = 33.91 hours), or about 26 days over a T3.


Makes sense. Thanks for the informed correction.


The worst is when you hit bandwidth caps and they cut you off. Gotta love background downloads...


Remember those "download accelerator" programs? Even when we had dial-up (no weather interference over the air), we had corrupted downloads quite often.


I imagine this was a snarky comment about services like T-Mobile's Binge On. They limit video to 720p or 480i at a moderate bitrate for mobile. It's supposed to be opt-in, but I wouldn't be surprised if many mobile providers peppered the video stream with errors to knock down costly high-bitrate 1080p streams.


Opt in? It was opt out, unless that changed very recently. I had a call with their corporate office to plead the case for why my Plex server should also qualify, but they said it wouldn't.


Reminds me of the madness that mobile carriers are doing to the TCP stack.


Any links with technical details about this?


What a strange thing to automatically (I assume) add to the end of these blogposts:

> Sean Watson, Software Engineer, recently watched "GoPro: Fire Vortex Cannon with the Backyard Scientist."

> Jon Levine, Product Manager, recently watched "Sega Saturn CD - Cracked after 20 years."


Not automatically included :) I came across the Sega Saturn video a couple weeks ago from this thread: https://news.ycombinator.com/item?id=12074096


Ah interesting. So is this just something you guys do for fun at the end of your posts? Great example of dogfooding, especially coming from the big guys :)


At least they didn't watch anything embarrassing. My YouTube recommendations page is a dumpster fire.


TD-Linux: Hacker News Commenter - Last watched "Relaxing Background Videos Volume IIV: 4 Hours of a Peaceful Dumpster Fire (with sound)".


I've watched worse, such as 10 hour loops


Indeed, I had FUKKIRETA X 9 on my Watch it Again. I did.


Clearing YT history often helps.

I'd vastly prefer channel blocks.


I highly doubt that is automatically linking videos. Click the Sega one, it's linked to a specific spot in the video.


It would be interesting to know how much overhead HTTPS added for them, e.g., a comparison of CPU usage between HTTP and HTTPS on the server side, whether or not they are using hardware offloads, etc.

Netflix has some papers about their experience transitioning to HTTPS:

https://people.freebsd.org/~rrs/asiabsd_2015_tls.pdf

https://people.freebsd.org/~rrs/asiabsd_tls_improved.pdf

However, I expect the YouTube workload would be vastly different from the Netflix workload due to the sheer size of the YouTube catalog.


> Luckily, hardware acceleration for AES is widespread, so we were able to encrypt virtually all video serving without adding machines.

Seems that was answered in the article, no?
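
If you want a rough feel for that on your own hardware, here's a sketch using the third-party cryptography package (OpenSSL underneath picks up AES-NI automatically where available; throughput will vary by CPU):

    import os, time
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    aead = AESGCM(AESGCM.generate_key(bit_length=128))
    chunk = os.urandom(1024 * 1024)  # 1 MiB stand-in for video data

    start = time.perf_counter()
    for _ in range(256):  # encrypt 256 MiB total
        # a GCM nonce must never repeat under one key; random 12 bytes here
        aead.encrypt(os.urandom(12), chunk, None)
    elapsed = time.perf_counter() - start
    print(f"{256 / elapsed:.0f} MiB/s on one core")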


I would also be interested in that. What difference would you expect from the catalog size? I would guess that, aside from maybe the handshake, decoupling storage and encryption would not be worth it in any case where the storage medium is slower than AES-NI, and this would apply even more to YouTube than to Netflix, exactly because of the bigger catalog.


With a small catalog, you can pre-encode everything. That means that in the unencrypted case, media files can be sent directly from the kernel's page cache, as unmapped I/O, i.e., the data is not even touched by the CPU. But in the encrypted case, you're suddenly touching every byte with the CPU, thereby doubling your memory bandwidth requirements.

With a large catalog like YouTube's, I would expect that most of the titles are saved in some master format, and are encoded on the fly. That means that the data is already being touched by the CPU to re-encode it on the fly for whatever format is needed by a particular client.

My expectation is that if YouTube was re-encoding on the fly due to the catalog size, they would suffer less of a CPU hit than Netflix did, so SSL would be "easier" for them to implement.
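
Roughly, the contrast looks like this (a Python sketch with hypothetical file/socket setup; the Netflix papers upthread are about clawing some of this back by moving TLS into the kernel):

    import os, socket, ssl

    def serve_plain(sock: socket.socket, path: str):
        with open(path, "rb") as f:
            # kernel pushes pages straight from the page cache to the NIC
            os.sendfile(sock.fileno(), f.fileno(), 0,
                        os.fstat(f.fileno()).st_size)

    def serve_tls(tls_sock: ssl.SSLSocket, path: str):
        with open(path, "rb") as f:
            while chunk := f.read(256 * 1024):
                tls_sock.sendall(chunk)  # every byte read, encrypted, rewritten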


> encoded on the fly

No, and often their encoders go bad and produce invalid output (blocky/purple) in one of the resolutions/formats. There are ZERO options for fixing that other than deleting your video and reuploading (losing all the comments/upvotes). Even if you have 200K subs there is no way of contacting YT for help; maybe they will speak with you at 1 mil subs :/


>> What differnce would you expect from the catalog size?

The Netflix Open Connect appliances were maxing out the local SSD storage. YouTube isn't keeping as high a percentage of active content on the servers, so I imagine the CPUs aren't being kept as busy, or there's less CPU in their web-facing boxes. Netflix is using 28-core CPUs in some of their latest machines.


> In short, some devices do not fully support modern HTTPS.

I'd love to know which devices, and what "modern HTTPS" features they don't support.


Android < 4.1 doesn't properly support TLS 1.2.

PayPal was originally going to stop supporting non-TLS-1.2 connections (TLS 1.1, 1.0, SSL 2, SSL 3) in June; they've now pushed this to next year: https://devblog.paypal.com/upcoming-security-changes-notice/
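
On a modern Python, that kind of cutoff is a couple of lines of server config (a sketch; the cert paths are placeholders):

    import ssl

    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # rejects TLS 1.1/1.0 and SSL 2/3
    ctx.load_cert_chain("server.crt", "server.key")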


It's only fairly recently that Android got TLS 1.2 support, with 4.1 in 2012. The problem is that for years budget Chinese Android phones were shipping with 2.3, which was very light on resources and good for low-powered phones with minimal RAM.

I believe this changed with an effort to use less RAM and CPU in 4.4 or 5.0, but by then millions of these phones had been sold and are still out there. Heck, up until a year or two ago, these were on shelves in the US at budget places like Cricket. I think almost 10% of Androids in the wild still don't support 1.2. Maybe more, considering Google has limited snooping abilities in China and may not fully know the extent of these installs.

IE7/IE8/IE9 have a combined global market share of about 8%, too.


> The problem is that for years budget Chinese Android phones were shipping with 2.3, which was very light on resources and good for low-powered phones with minimal RAM.

Not just budget Chinese phones. Low-end Samsung phones and such, too.


Android 2.3 doesn't even support SNI for virtual hosts on SSL.
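
For reference, SNI is just the client naming the host inside the TLS handshake so one IP can serve many certificates — e.g., with Python's ssl module (a sketch; Android 2.3's stack simply never sends this field):

    import socket, ssl

    ctx = ssl.create_default_context()
    with socket.create_connection(("example.com", 443)) as raw:
        # server_hostname is the SNI value (and drives cert validation)
        with ctx.wrap_socket(raw, server_hostname="example.com") as tls:
            print(tls.version())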


On the desktop side, IE9, the last version on Vista, which is in extended support until April 11, 2017, only supports up to TLS 1.0.


My NAS has security settings for modern, intermediate, and old browsers; it says these correspond to the cipher suites the browsers support:

https://imgur.com/a/PmaUE


Those settings are likely to be derived from Mozilla's recommendations with the same names:

https://wiki.mozilla.org/Security/Server_Side_TLS

You can read the details, rationale, and information about supported clients there.


Love, love love love love LOVE my Synology!


Is there an 'Unencrypted user traffic by device type' [1] chart just for YouTube traffic, as opposed to all Google services?

Specifically, are non-HTTPS connections coming from desktops, mobile/device outside of an official app, or mobile/device through an official app?

[1] https://www.google.com/transparencyreport/https?hl=en#device...


So annoying when websites hijack pinch-to-zoom touch events and replace them with "load a random article".


> Today we added YouTube to Google's HTTPS transparency report. We're proud to announce that in the last two years, we steadily rolled out encryption using HTTPS to 97 percent of YouTube's traffic.

What's the point of the report if you only add things that are already in good shape?


> You watch YouTube videos on everything from flip phones to smart TVs

what year is this?!


I am less curious about the year and more curious as to how the hell flip phones are rendering video. Like 64p 3GPP or something realtime transcoded?


ASCII art.



Last I checked, YouTube still does not support encryption when broadcasting a live stream.


How'd finance decrease?


The irony, eh!



