Hacker News
YouTube requires the RC4 cipher for videos (productforums.google.com)
46 points by theandrewbailey on July 2, 2014 | 45 comments



Interesting. It looks like the page itself is encrypted with more standard crypto using AES [0], but the video itself is served from a connection using RC4 [1].

As far as I can tell from this article [2], attacks on RC4 are only theoretical at this point, but it seems reasonable that they should upgrade the video serving domain to be consistent with the page itself (though maybe it's done for performance reasons - either encrypting a large video stream costs a lot of CPU time or decrypting it on the client is too slow or CPU intensive for watching video smoothly, so perhaps they are sticking with RC4 for a good reason). Still, as my old crypto prof said, "Attacks only get better over time. They never get worse."

[0]: http://i.imgur.com/OcpziFa.png

[1]: http://i.imgur.com/57T8bay.png

[2]: https://community.qualys.com/blogs/securitylabs/2013/03/19/r...


Attacks on RC4 are publicly theoretical, but people have come forward to say that the NSA[1], and likely other intelligence agencies, can break RC4 in real time. Given what we know about the theoretical attacks, this does not stretch the imagination.

[1]: https://twitter.com/ioerror/status/398059565947699200


The attacks are not "theoretical". They are in fact trivial to implement. What you mean to say here is that the attacks are difficult to launch in practice. That is true: while the code to attack RC4 is simple, the attack generates a huge amount of bandwidth.


For the record, Google has been working on adding support for more secure ciphers [1] and the fix is being (has been?) pushed as we speak. Users already reported seeing support for AES-GCM (see Jas Per's scan[2]).

[1] https://news.ycombinator.com/item?id=7775052

[2] https://www.ssllabs.com/ssltest/analyze.html?d=r4---sn-uxaxo...


It looks like one AES-GCM and one RC4 ciphersuite, but with RC4 preferred?

"Support" for AES-GCM doesn't mean much if an RC4 cipher is enabled and preferred by the server.

1. There are production servers that are still using RC4 only.

2. Consequently, clients can't disable RC4

3. Consequently, servers that prefer RC4 but offer better ciphersuites are insecure because they will negotiate RC4 even with better ciphers available. Client workarounds might be possible (involving multiple handshakes for rc4-only servers) but aren't implemented?
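
A minimal sketch of point 3, assuming the server simply walks its own preference list and picks the first suite the client also offers. The `negotiate` helper and the two suite lists are illustrative and not taken from any real server configuration.

```python
# Toy model of TLS cipher suite selection; not a real handshake.
RC4 = "TLS_RSA_WITH_RC4_128_SHA"
AES_GCM = "TLS_RSA_WITH_AES_128_GCM_SHA256"


def negotiate(server_suites, client_suites, server_preference=True):
    """Return the shared suite that ranks highest for whichever side's
    preference order wins, or None if the two lists do not overlap."""
    first, second = (
        (server_suites, client_suites)
        if server_preference
        else (client_suites, server_suites)
    )
    for suite in first:
        if suite in second:
            return suite
    return None


server = [RC4, AES_GCM]   # hypothetical server that lists RC4 first
client = [AES_GCM, RC4]   # client prefers AES-GCM but still offers RC4

print(negotiate(server, client))                           # server order wins
print(negotiate(server, client, server_preference=False))  # client order wins
print(negotiate(server, [AES_GCM]))                        # client dropped RC4
print(negotiate([RC4], [AES_GCM]))                         # RC4-only server
```

With the server's order in force, the client lands on RC4 unless it removes RC4 from its own list, and removing it fails outright against an RC4-only server.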


> My guess is someone at Google read an outdated, incorrect blog post about the BEAST attack and saw the recommendation to force RC4. They then did so without bothering to learn anything about what they were doing.

Oh, well, no. Really, this is not something that can happen at Google. The security people working there are for sure not clueless and a similar decision is not taken so lightly at that scale.

What happened is that they looked into it more than OP, benchmarked it and decided that the performance overhead of a cipher different from RC4 was not acceptable for the video streams. Or they have hardware streaming those videos that only supports RC4.

It's not small play as OP seems to believe. I'm actually amazed that they enabled AES-GCM as a fallback. Btw, RC4 remaining preferred points more in the direction of performance reasons.


[deleted]


Test one of the content servers that the Youtube player connects to. It's a domain ending in googlevideo.com

Disabling all RC4 ciphers breaks the Youtube player, unless your browser supports TLS_RSA_WITH_AES_128_GCM_SHA256, which isn't widely supported.
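
For a client built on OpenSSL, dropping RC4 is a cipher string change. A sketch using Python's `ssl` module, assuming an OpenSSL build that accepts the usual `!RC4` exclusion; the cipher string is only one plausible choice and does not come from the linked thread.

```python
import ssl

# Start from the platform defaults, then exclude every RC4 suite.
ctx = ssl.create_default_context()
ctx.set_ciphers("HIGH:!RC4:!aNULL:!eNULL")

# get_ciphers() lists what this context would offer in a handshake.
offered = [c["name"] for c in ctx.get_ciphers()]
print(len(offered), "suites offered")
print(any("RC4" in name for name in offered))
```

A context configured like this cannot complete a handshake with a server that offers only RC4, which is the breakage described above.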


It's the content servers that require RC4. You probably checked the main site, not the servers that hold the video clips.


Honestly, I couldn't care less if my YouTube video stream was encrypted or not. So I'm on the side of the original poster. That said, Google obviously is against that. So I guess he should have just pushed for newer cipher support saying RC4 was not legally possible for his organization. Google adding support for newer ciphers and getting more users seems like a clear win for them and probably only takes their equivalent of updating a Chef or other infrastructure recipe since it is just a setting.


RC4 is faster and less CPU intensive, leading to a better user experience and lower costs, which is probably why they force it.

I know a lot of times we prefer arcfour (RC4) for backend "trusted" ssh connections because it is faster than the other ciphers.


tl;dr: AES using AES-NI is faster than RC4 on Intel processors released after 2009. 170% faster, actually [5]

"The RC4 is fast" meme is true, because RC4 uses simple math operations like modulo and bitwise AND. [1]. CPUs have always done these operations fast. Other crypto algorithms used by SSL, like 3DES, were deliberately designed to be slow in software. [2].

However, modern Intel chips have dedicated instructions for performing AES, called AES-NI [3]. For these chips, an AES operation is not broken down into a series of opcodes the CPU executes. The CPU just does it. AES-NI is actually faster than RC4. [4]

We need to kill the "RC4 is fast, everything else is slow" myth. "TLS has exactly one performance problem: it is not used widely enough." [5]

1- http://en.wikipedia.org/wiki/RC4#Implementation

2- DES was designed around bit swapping, not byte swapping, which is super easy to do in hardware and much slower to do in software.

3- http://en.wikipedia.org/wiki/AES_instruction_set

4- http://zombe.es/post/4078724716/openssl-cipher-selection

5- https://istlsfastyet.com/


I 100% believe you. All of my speed testing revolves around benchmarking ciphers with openssh. The fastest was arcfour and the second fastest was blowfish. After that was aes128... But my company used AMD for whatever reason. I suspect though that if client and server both have AES support built into the cpu or use a crypto chip that AES would be the fastest as you said.

My point, however, was just to counter the speculation that Google uses RC4 because they misinterpreted BEAST results.


Is AES-NI available for ssh connections? It would be interesting to see an updated benchmark like http://blog.famzah.net/2010/06/11/openssh-ciphers-performanc... if it is.


AES-NI is an extension of the instruction set of the CPU just like x64 extensions, or the older MMX, MMX2, SSE style extensions. They are opcodes that are hardwired into the CPU itself.

Software has to be compiled to take advantage of these additional instructions, much like how you can compile software for, say, i586 instead of i386 and get code that executes faster. So it all depends on how OpenSSH is compiled (and, to some extent, whether the OpenSSH source code has extra compiler flags/code sections).
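
The hardware half of that question can be checked before worrying about how OpenSSH was built. A small sketch, assuming Linux on x86, where `/proc/cpuinfo` carries a `flags` line per core; `has_aes_flag` is a hypothetical helper and the sample text is made up.

```python
from pathlib import Path


def has_aes_flag(cpuinfo_text: str) -> bool:
    """True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and "aes" in value.split():
            return True
    return False


sample = "processor\t: 0\nflags\t\t: fpu sse2 aes avx\n"  # made-up excerpt
print(has_aes_flag(sample))

cpuinfo = Path("/proc/cpuinfo")  # only present on Linux
if cpuinfo.exists():
    print(has_aes_flag(cpuinfo.read_text()))
```

A positive result only says the CPU has the instructions; whether the crypto library actually uses them still depends on the build.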


It should be. There's also chacha20 if you need a fast software cipher.


People only disable RC4 due to PCI audits, because PCI vendors think RC4 is practically broken. It's not. In practice, it's still secure: https://en.wikipedia.org/wiki/RC4#Security. The only way to practically attack RC4 is to obtain a "large number of TLS encryptions". How much?

"The number of connections/sessions needed to reliably recover these plaintext bytes is around 2^30, but already with only 2^24 connections/sessions, certain bytes can be recovered reliably."

This is even more than what is provided in a YouTube video.

You need 16 million connections with 1GB of data, _in_ _each_ _connection_ to practically attack RC4. Sure, treat that as a wakeup call, but let's be realistic about it. RC4 is only _theoretically_ broken, similarly to SHA1.
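
For scale, the two thresholds in that quote work out as follows. The connection rate is an assumption picked only to turn counts into wall-clock time; neither the quoted text nor this thread specifies one.

```python
# Connection counts taken from the quoted passage.
few = 2 ** 24    # "certain bytes can be recovered reliably"
many = 2 ** 30   # "reliably recover these plaintext bytes"

RATE = 100       # assumed connections per second, purely illustrative
SECONDS_PER_DAY = 86_400

for n in (few, many):
    days = n / RATE / SECONDS_PER_DAY
    print(f"{n:>13,} connections -> {days:6.1f} days at {RATE}/s")
```

At that assumed rate the lower figure is about two days of traffic and the higher one about four months.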

By comparison, OpenSSH still uses MD5 hashes for their private server and client keys. MD5 is practically broken. So, if you want to "do something", fix OpenSSH.


That is a really poor analysis. RC4 is not "secure". RC4 is terribly broken, it's just not broken in a way that's likely to be relevant to Youtube. You should probably get your crypto analysis from Crypto Stack Exchange, not from Wikipedia.

The last widely publicized attacks on RC4 used browsers and session cookies as targets. The attacks are practical --- in fact, they're among the simplest of the low-level cryptanalytic attacks (the flaws are statistical, and statistically trivial). Our implementations of the attacks clock in at hundreds, not thousands, of lines of code.

RC4 isn't a hair-on-fire problem and I wouldn't really worry too much about it at Youtube. But if you were operating a bank or a payment processor, I would tell you to make sure you weren't using RC4.

MD5 is broken in that you can generate collisions for it. But HMAC-MD5 is not broken; there is no theoretical attack on MD5 that breaks HMAC-MD5. It takes more than a simple collision to break HMAC.

The risk that systems that use HMAC-MD5 (and things like it) are taking is different from the risk that systems that use RC4 are taking. MD5 is broken, but HMAC-MD5 is actually (to the best of our knowledge, and with a very low margin) secure. RC4 is not actually secure in any scenario. But the cost/benefit of exploiting RC4 to recover the plaintext of a publicly-available streaming video is not great.
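
To make the MD5 versus HMAC-MD5 distinction concrete, here is a short sketch with Python's standard `hmac` and `hashlib` modules. The key and message are test case 2 from RFC 2202, not anything from a real system.

```python
import hashlib
import hmac

# Test case 2 from RFC 2202 (HMAC-MD5).
key = b"Jefe"
msg = b"what do ya want for nothing?"

# HMAC mixes a secret key into two nested MD5 computations.
tag = hmac.new(key, msg, hashlib.md5).hexdigest()
print(tag)

# A bare MD5 digest involves no secret key at all.
print(hashlib.md5(msg).hexdigest())

# Verification should use a constant-time comparison.
expected = hmac.new(key, msg, hashlib.md5).hexdigest()
print(hmac.compare_digest(tag, expected))
```

An attacker who can find MD5 collisions still does not know the key, which is why a collision alone does not forge an HMAC tag.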


Funny. I'm getting the analysis from Daniel Bernstein and other cryptographers, not Wikipedia. I'm only showing Wikipedia, because it cites the references:

http://blog.cryptographyengineering.com/2013/03/attack-of-we...

http://www.isg.rhul.ac.uk/tls/

http://link.springer.com/chapter/10.1007%2F978-3-642-19574-7...

RC4 is not practically broken. It requires too many connections (about 16 million, give or take), getting identical data with different keys, and it requires GB of data for analysis.

EDITED TO ADD: I agree we should move off it, but as I understand the current attacks, only small pieces of information can be retrieved from RC4 in a practical attack. With that said, the practicality of the attack is on the edge of feasibility, and it's only going to improve over time. So move, but don't lose any sleep over it.


I don't know how to respond to this. 16 million connections is nothing. Gigabytes of data is nothing. Home Internet users incur gigabytes of bandwidth expense to get Torrented copies of movies they don't even end up watching.

I don't know how much clearer I can say this: RC4 is breakable with attacks that take mere hundreds of lines of code, and the attacks are bounded only by your ability to generate millions of connections and gigabytes of transferred bytes. It's 2014.

I'm getting some of my analysis from Bernstein and Patterson too, but more of it from the experience of having implemented some of the attacks, and working directly with people who have implemented the rest of them. You should take my word for this and stop using RC4. The attack is not "on the edge of feasibility". Yes, the attacks have been leveraged to recover "small pieces of data" from a connection. That's because the data you care about in an HTTPS connection is small: you're trying to recover the session cookie.


With due respect, LOC seems like an awful metric to gauge protocol brokenness. I can write a one-liner that counts to 2^128, but that doesn't demonstrate that it's easy to count that high.


Which is why I didn't just cite LOC. There are attacks that require many millions of ciphertexts and that are difficult to implement (for instance, they might speed up a brute-force search). That's not what the RC4 attacks are; they're a simple statistical process that directly reveals plaintext bytes.


My highly optimized Nginx web servers, behind a load balancer, can't handle tens of millions of connections. If it got anywhere near that point, Nagios would be alerting of load, HTTPS being unresponsive, RAM filled, swapping to disk, no pong reply due to a saturated network, and a whole load of other issues. My HA Nginx setup will die long before you reach tens of thousands of simultaneous connections, let alone tens of millions. I'm guessing I'm not the only one.

Practical attack against my server? Nope. You'll kill it before you get anywhere.


Those connections needn't be concurrent.


And if not concurrent connections, it's not feasible given time constraints.


Could you be more specific? I don't know what you're trying to say here.


> By comparison, OpenSSH still uses MD5 hashes for their private server and client keys. MD5 is practically broken. So, if you want to "do something", fix OpenSSH.

Did you mean to say OpenSSL there? OpenSSH uses PAM. PAM can use MD5 but OpenSSH has no control or oversight over that, the distribution you're using normally determines the hashing algorithm. You can also trivially reconfigure PAM to use something stronger if you so wish.

If you meant to say OpenSSL then your point is still quite confusing. MD5 is used in a few places around OpenSSL (e.g. digests, signatures, etc.), none of which are weakened by MD5's weaknesses.


No. I meant OpenSSH. See a previous reply.


> By comparison, OpenSSH still uses MD5 hashes for their private server and client keys.

Can you clarify?


When you initiate a connection to an OpenSSH server, the server presents you with its public key, which is hashed with MD5, and the result is displayed. You _should_ verify with the server administrator that the fingerprints match, and that you are indeed connected to the server you intended rather than to a hijacked connection.

MD5 is resistant to 2nd preimage attacks, however. So technically, OpenSSH is fine using MD5 to display the fingerprint to you. With that said, I also don't see the problem moving to a stronger algorithm.
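
For reference, the fingerprint is just a digest of the decoded key blob. A sketch of the MD5 form and a SHA-256 alternative, assuming the usual one-line public key layout (`type base64-blob comment`); the helper names are hypothetical and the blob below is a placeholder, not a real key.

```python
import base64
import hashlib


def key_blob(pubkey_line: str) -> bytes:
    """Decode the base64 field of a 'type base64-blob comment' line."""
    return base64.b64decode(pubkey_line.split()[1])


def md5_fingerprint(pubkey_line: str) -> str:
    """Colon-separated hex MD5 of the key blob."""
    digest = hashlib.md5(key_blob(pubkey_line)).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def sha256_fingerprint(pubkey_line: str) -> str:
    """Unpadded base64 SHA-256 of the key blob, prefixed with 'SHA256:'."""
    digest = hashlib.sha256(key_blob(pubkey_line)).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


# Placeholder blob, NOT a real key; it only exercises the formatting.
fake_blob = base64.b64encode(b"\x00\x00\x00\x07ssh-rsa" + b"demo").decode()
line = f"ssh-rsa {fake_blob} user@example"

print(md5_fingerprint(line))
print(sha256_fingerprint(line))
```

Swapping the hash changes only what is displayed to the user; the key exchange itself is untouched.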


Um, no. People disable RC4 because it's rubbish.

If I may quote GCHQ - a "surprising [...] cryptologic breakthrough" was obtained a few years ago from NSA.

We don't know for certain if that referred to an RC4 break specifically, but from context, I think that is the most probable candidate - a bulk cipher they've never used themselves, but widely-deployed internationally (at the time, and still to a large extent) on a huge number of websites largely because of advice to mitigate the BEAST attack (a tricky, active attack) not by switching to TLS 1.2, but by switching to a stream cipher designed in the late 80s for the 8086 that is showing its age poorly. That was probably not good advice.

What we do know for sure is that it's in a poor state; worse than any other cipher deployed in mainstream crypto. RC4 is Considered Harmful.

Go back and review what you've encrypted with RC4 previously, and evaluate if an eavesdropper could have recorded it and saved it for later when they do have a practical attack, and if so, if that's a problem - because in the state it's in, that is not a theoretical concern, because we know some Nation State Adversaries (cough) certainly do that en masse as general practice.

The IETF TLS Working Group have actually had a big discussion about this already - resulting in a live Internet-Draft not merely deprecating, but actively prohibiting RC4's use in TLS, which obtained strong-to-unanimous support in the Working Group and in my opinion will very probably be adopted as a WG document: https://tools.ietf.org/html/draft-popov-tls-prohibiting-rc4-... - long story short, you MUST NOT use RC4 in TLS, don't even think about negotiating it, kill it with fire.

If you're still using it in anything else, consider this your I-told-you-so warning.

If you want a replacement, consider ECDHE-RSA-AES128-GCM-SHA256 or - if you want a modern stream cipher that's actually 256-bit secure yet faster in software than RC4 (unless you're on an 8086...) - ECDHE-RSA-CHACHA20-POLY1305: coming to an AEAD stream near you, and already with a draft implementation live in Chrome and on most of Google's properties thanks to agl's patches that more recently have become BoringSSL - but sadly not YouTube's video CDN yet, as the article notes.

I couldn't speak for OpenSSH, but I also think your information, or your software, may be out of date. SSHFP DNSSEC records, in particular, have had SHA256 on the Standards Track RFC for two years and change - https://tools.ietf.org/html/rfc6594


> You need 16 million connections with 1GB of data, _in_ _each_ _connection_ to practically attack RC4.

This isn't accurate.

There are two major attacks described in the recent literature on RC4.

The first targets single-byte biases in RC4. Simply: the distribution of key stream bytes at a given index is biased across all keys. By collecting a large number of ciphertext samples, you can match the bias in the ciphertext to the known biases in the key stream to recover plaintext.

These biases are only significant early in the key stream, say the first 256 bytes or so. Because they're only prevalent early in the key stream, you need to make a lot of connections.
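
The single-byte case is easy to reproduce at toy scale. The sketch below uses only the well-known bias in the second keystream byte (it is 0 roughly twice as often as a uniform byte would be, per Mantin and Shamir), not the full set of biases from the recent papers. The plaintext, sample count and seed are made up for the demonstration.

```python
import random
from collections import Counter


def rc4_keystream(key: bytes, n: int) -> bytes:
    """Plain RC4: key scheduling followed by n bytes of output."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) & 0xFF])
    return bytes(out)


rng = random.Random(2014)   # fixed seed so runs are repeatable
SECRET = b"hi"              # the same plaintext in every "connection"
SAMPLES = 20_000            # enough for this one strong bias

seen = Counter()
for _ in range(SAMPLES):
    key = bytes(rng.getrandbits(8) for _ in range(16))  # fresh random key
    ks = rc4_keystream(key, 2)
    seen[SECRET[1] ^ ks[1]] += 1  # attacker observes ciphertext byte 2 only

# Keystream byte 2 is 0 more often than any other value, so the most
# common ciphertext byte at that position is the plaintext byte itself.
guess, count = seen.most_common(1)[0]
print(chr(guess), count / SAMPLES, 1 / 256)
```

The biases at later positions are much weaker than this one, which is where the large connection counts come from.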

This attack isn't a great fit if your target is a user's HTTP session, for two reasons. One is that you need to make a ton of connections, and there's not (to my knowledge) an easy way to force that from the attacker's perspective. Even if there were, you would incur quite a bit of overhead from all the TLS handshakes. The second and more important reason is that the HTTP session cookie just isn't going to show up in those first 256 bytes.

The second attack targets double-byte biases in the RC4 key stream. Again, these biases are known and measurable across all keys. By collecting a large number of ciphertext samples, you can perform analysis to recover plaintext. This is a little more complex because there is interdependence between bytes, but still not so tough. (Search for "hidden markov model" and "viterbi algorithm".)

The double-byte biases show up at regular intervals over any given key stream. This is important! It means the number of connections is irrelevant to plaintext recovery. What matters is just the amount of ciphertext you can collect.

This makes it a much better choice to attack HTTPS. We can collect all the ciphertext we need from one TLS session (or three or four or five, it doesn't matter). And we can also exert much finer control over where the HTTP cookies show up. As an attacker running JavaScript in a user's browser, you have this ability.

As the research notes, neither of the outlined attacks is practical today. But they're both knocking on the door. What if higher bandwidth or browser changes allow an attacker to take samples more quickly? What if someone refines the algorithms outlined in the paper? To the latter question, there are additional multi-byte biases (Mantin, 2005) they don't account for in their attack. It doesn't seem far-fetched to suppose these attacks could be improved.

So, is RC4 a game-over vulnerability in the context of TLS? Of course not. But conscientious site operators should look to migrate away from it as soon as practicality allows.


I think we need to be very careful with how much work the word "practical" is doing in these sentences. Sometimes we say an attack is impractical because it brings the cost of finding a key or a collision down from "cosmically intractable" to "still outside the reach of the NSA". But the RC4 attacks are not like that: here, by "impractical", we mean that the attack isn't point/click simple.

('sdevlin implemented the double-byte bias attacks at Matasano, for what it's worth).


Is it just me, or does the linked thread read like a warning about the brain damage bureaucracy can do to people?

tl;dr

- A company wants to be PCI compliant.

- PCI forbids using RC4 encryption.

- YouTube now requires encryption and uses RC4 for the actual video stream (I guess for performance reasons).

- Company employee thinks that opening YouTube somehow damages their PCI compliance, because the site uses RC4.

- The company employee would rather have YouTube serve its site with no encryption, rather than use RC4, as somehow this would be better for their PCI compliance.

What am I missing here that makes this less absurd?


The OP in that thread made it quite clear in a subsequent post:

> We don't have a requirement to encrypt Youtube video because Youtube videos are outside the scope of the policies we have that mandate encryption. Those policies prohibit the use of RC4 and because browsers don't allow varying the cipher support on a per-site basis, the result is nothing may use RC4. Because we don't have to encrypt the video and because Google doesn't support anything we can use we're left with the options of either finding a way to allow Youtube videos without encryption, or block Youtube on secure subnets.


It sounds more like it's a practical sysadmin stuck in an impractical pointyhead-caused hell.

1. Youtube needs to work

2. It's not considered sensitive by the business (and I hope to god nobody uses Google credentials)

3. He can't change PCI-relevant settings or he'll get fired (and the business could potentially lose some sort of financial accreditation, or endanger a business partnership)

4. He can circumvent the requirements by disabling SSL for Youtube, which seems to be a loophole in PCI's laughable "security" requirements.

Sucks, but it happens. I'm sure he's polishing up his resumé.


Absurdity aside, I'm rather more concerned that a computer inside the PCI fence needs to be able to view YouTube videos. With ads. Malware tainted ads.


1. PCI compliance is a requirement for some types of business

2. PCI compliance requires that some connections are encrypted

3. PCI compliance requires that encrypted connections follow a set of guidelines, including no RC4 cipher

4. Rule #3 is not limited to encrypted connections mandated in rule #2; this allows people in the organization to assume that all encrypted connections are at least minimally 'secure', rather than 'some connections are secure but some are more secure than others'.

PCI compliance doesn't care about unencrypted connections, because they mandate that sensitive data travel over encrypted connections, so unencrypted connections aren't relevant to them at all.

So basically, PCI says 'no RC4 ciphers in your enterprise' (for stupid reasons) and does not allow for exceptions (for sane reasons). Google is using RC4 only (for what I assume are practical, if stupid, reasons).

It's kind of surprising, since Google (or at least, large swaths of Google) must be PCI compliant considering that they do payment processing, etc.


You would have to use a separate browser, with RC4 enabled, just to watch YouTube (and other non-secure things) and then a different browser without RC4 enabled to do everything else. And a way to prove in a PCI audit that you never use the RC4-enabled browser for anything secure.


Honest question, why would anyone want a youtube video stream to be encrypted?


For one of the same reasons you might want the various things you look at online to be encrypted: privacy. If the stream is encrypted, an eavesdropper can't tell which video you're watching. This might be even more important if that video is unlisted or private, since otherwise an eavesdropper could watch a video they wouldn't normally have access to.


googlevideo.com is a CDN. Up to what point do they control it?


Most (if not all) other Google properties have no trouble supporting better encryption. This is probably an oversight.


Or it's for performance reasons. Grandma's computer might have issues decoding Flash video in whatever codec it now uses (H.264, I assume) that's also wrapped in 128/256-bit AES and coming to her at HD quality.


It's probably for performance reasons.


I think I've had a word about this before. It's a completely separate CDN from 1e100, often colocated, and purely software.



