
Netflix HTTPS performance experiments for large scale content distribution - cpeterso
https://lists.w3.org/Archives/Public/www-tag/2015Apr/0027.html
======
semenko
The first thing that came to mind was @agl's discussion of TLS's overhead with
Gmail:
[https://www.imperialviolet.org/2010/06/25/overclocking-ssl.html](https://www.imperialviolet.org/2010/06/25/overclocking-ssl.html)

It's a bit hard for me to reconcile "<1% of the CPU load" (Google) with "up to
50% capacity hit" (Netflix).

~~~
jewel
It's surprising. They say that they are using the hardware AES support in the
CPU too. They do mention that they are no longer able to take advantage of
sendfile(), which let the bulk of the data transfer happen
completely in kernel space.

So, in their case, they had already optimized things to such a degree that
they were having a lot of throughput and very little CPU usage per byte. In
fact, it's possible that on their old system the data streams wouldn't have to
be touched a byte at a time, as they could be loaded into main memory via DMA
from the disk and once that finished they could be transferred to the network
card, again via DMA. Once sendfile() is called, no further context switches
are necessary for that request.
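
A minimal sketch of the difference, assuming Python's standard library rather than anything from the talk: `socket.sendfile()` hands the copy to the kernel (through `os.sendfile()` where the platform supports it), while the manual loop pulls every chunk into userspace first, which is where TLS encryption would have to happen. The temp file, payload size, and helper names are made up for illustration.

```python
import os
import socket
import tempfile

PAYLOAD = b"x" * 4096  # hypothetical stand-in for a chunk of a video file


def send_with_copy(sock, path, bufsize=1024):
    """Userspace path: read each chunk into a buffer, then write it out."""
    sent = 0
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            # With TLS, the encryption step would sit here, on data
            # that has already been copied into userspace.
            sock.sendall(chunk)
            sent += len(chunk)
    return sent


def send_with_sendfile(sock, path):
    """Kernel path: socket.sendfile() uses os.sendfile() where available
    and falls back to plain send() otherwise."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(PAYLOAD)
        path = tmp.name
    try:
        results = {}
        for name, sender in (("copy", send_with_copy),
                             ("sendfile", send_with_sendfile)):
            a, b = socket.socketpair()
            with a, b:
                sent = sender(a, path)
                a.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
                received = b""
                while data := b.recv(65536):
                    received += data
            results[name] = (sent, received == PAYLOAD)
        return results
    finally:
        os.unlink(path)
```

Both paths deliver the same bytes; the point is only where the copy happens, not a benchmark.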

Testing with `openssl speed -evp AES128` on an i5-4670K I'm able to get 6.7
gigabits of encryption out of AES-NI. I tried running two concurrent copies
and they each got that speed, which tells me that it's per-core. That's easily
enough speed to saturate a 10G connection.

It's quite possible that they have much older or slower hardware involved,
since when you're building a server like this normally the network or disks
are the limiting factor.

~~~
Freaky
> Testing with `openssl speed -evp AES128` on an i5-4670K I'm able to get 6.7
> gigabits of encryption out of AES-NI

Bit of a crappy test, though - tight loop over a single 8k buffer (used
in-place). It probably executes entirely in L1/L2.

They provide details of the hardware they're using:
[https://openconnect.itp.netflix.com/hardware/](https://openconnect.itp.netflix.com/hardware/)

The fanciest in their "IO-optimized" SSD-driven beast is a 2.7GHz 12 core Ivy
Bridge. You've got a 20% clock advantage, and Haswell reportedly reduced
AES-NI clock latency from 8 to 7 cycles for a ~14% improvement. Doing the math I
get 55Gbps for the lot.

That's without doing any packet processing, no waiting on memory accesses, no
interrupts, no context switching, just perfect scaling with a tight loop over
the same 8k buffer - and they're driving 4 10Gbps NICs. Not much left over.
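
One way to reproduce the 55Gbps figure, as a sketch: the exact formula isn't spelled out above, so treating the 20% and ~14% as straight multiplicative cuts to the measured per-core rate is an assumption, and the constant names are made up.

```python
MEASURED_GBPS_PER_CORE = 6.7  # the i5-4670K result quoted above, per core
CORES = 12                    # the 2.7GHz 12 core Ivy Bridge box
CLOCK_FACTOR = 0.80           # assumption: "20% clock advantage" as a 20% cut
ARCH_FACTOR = 0.86            # assumption: "~14% improvement" as a 14% cut
NIC_GBPS = 4 * 10             # four 10Gbps NICs


def estimate_aes_gbps():
    """Best-case AES throughput for the whole box under the assumptions above."""
    return MEASURED_GBPS_PER_CORE * CORES * CLOCK_FACTOR * ARCH_FACTOR


total = estimate_aes_gbps()
headroom = total - NIC_GBPS
print(f"{total:.1f} Gbps total, {headroom:.1f} Gbps above the NICs")
# prints "55.3 Gbps total, 15.3 Gbps above the NICs"
```

That headroom is the best case, before any of the overheads listed above are counted.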

~~~
xxs
>> It probably executes entirely in L1/L2.

L1!

------
Lx1oG-AWb6h_ZG0
I understand the need for HTTPS when browsing Netflix, but is there really any
benefit in encrypting the actual media stream?

The file is already encrypted for DRM purposes... if they want to go beyond
that, how about a system where each edge server gets a differently encoded
file from the master?

~~~
dantillberg
Without encrypting the video contents, anyone sniffing the data stream can
easily tell what you're watching, when you paused, or skipped, or re-played,
etc. They may not be able to steal your money with that data, but it still
exposes real, personal information.

(though... I was just thinking about this, and due to the use of variable-
bitrate encoding, the varied rate of data transmission over time (even though
encrypted) would give each video a fingerprint that an observer could use to
determine whatever it is that you're watching, anyway, albeit with greater
effort than simply inspecting the unencrypted stream)
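
A rough sketch of how such a fingerprint match could work, assuming the observer can measure the size of each transferred segment and already has per-segment sizes for candidate titles. Every number and title name below is invented; nothing here reflects real Netflix traffic.

```python
from math import sqrt

# Hypothetical per-segment sizes in kilobytes for three titles.
LIBRARY = {
    "title_a": [510, 480, 920, 300, 650, 700],
    "title_b": [400, 410, 395, 405, 800, 390],
    "title_c": [900, 300, 310, 880, 295, 905],
}


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def best_match(observed, library=LIBRARY):
    """Return the title whose size profile correlates best, plus all scores."""
    scores = {title: pearson(observed, sizes) for title, sizes in library.items()}
    return max(scores, key=scores.get), scores


# Hypothetical sizes seen on the wire: encryption adds some overhead to
# each segment but leaves the overall shape of the sequence intact.
observed = [412, 421, 407, 418, 815, 401]
title, scores = best_match(observed)
```

Correlation ignores a constant per-segment overhead, which is why the encrypted sizes still line up with the right title in this toy example.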

~~~
Lx1oG-AWb6h_ZG0
If the DRM key is transferred securely, how can an attacker tell what you're
watching without fingerprinting the entire Netflix library? You can't just
open up parts of the stream and parse it as a plain video file to my
understanding.

I'll give you the rewind scenario, but play/pause events are going to be
equally obvious in HTTPS. Besides, what exactly does the attacker learn? That
you were watching Netflix during that time? The HTTPS traffic to Netflix will
be an equally good indicator of that.

~~~
comex
Fingerprinting the entire Netflix library - especially if you limit yourself
to popular shows - probably isn't that hard. I concur with not really
understanding the point, though.

One possibly relevant factor is that shared encryption keys make it harder to
authenticate the data: you can't rely on an HMAC, and RSA verifying every
frame would be too slow. You can still do it efficiently with a hash tree, but
I bet Netflix currently isn't, and TLS allows browsers to enforce verification
on behalf of the user. Without verification, you could potentially exploit the
browser with malformed audio or video data, or freak the user out by injecting
some other video...
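
For reference, a minimal hash tree (Merkle tree) sketch, assuming SHA-256 and a root hash that reaches the client over a trusted channel. This is an illustration of the general technique, not anything Netflix or EME is known to do, and the function names are made up.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return all levels of the tree, leaves first, root last."""
    level = [h(b"\x00" + c) for c in chunks]  # 0x00 prefix marks leaves
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            # Simplification: duplicate the last node on odd-sized levels.
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks inner nodes
                 for i in range(0, len(level), 2)]


def proof_for(levels, index):
    """Collect the sibling hashes needed to recompute the root for one chunk."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(chunk, proof, root):
    """Recompute the path from one chunk up to the root and compare."""
    node = h(b"\x00" + chunk)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"\x01" + pair)
    return node == root


chunks = [b"frame-%d" % i for i in range(5)]  # hypothetical media chunks
levels = build_tree(chunks)
root = levels[-1][0]
```

Each chunk can then be checked with a handful of hashes instead of a signature per frame, so only the root needs to be signed or delivered securely.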

------
jluxenberg
EME stands for Encrypted Media Extensions:
[http://www.w3.org/TR/2015/WD-encrypted-media-20150331/](http://www.w3.org/TR/2015/WD-encrypted-media-20150331/)

------
eyeareque
I'm not familiar with AWS, but wouldn't a Cavium based card make this a lot
easier for Netflix?

~~~
justincormack
They do not use AWS for streaming, just for the control plane.

I was at the talk and they did not mention hardware offload, but there are a
lot of problems with having full control.

------
M2Ys4U
Maybe if they're having trouble with EME + HTTPS they should stop being
defective by design and abandon their mistake of using EME.

