
The End of Video Coding? - mfrw
https://medium.com/netflix-techblog/the-end-of-video-coding-40cf10e711a2
======
carbonatedmilk
Pull quote:

We run compute on the cloud and have no real-time requirements. I was asked by
the Chair, “How much complexity increase is acceptable?”

I was not prepared for the question, so did some quick math in my mind
estimating an upper bound and said “At the worst case 100X.”

The room of about a hundred video standardization experts burst out laughing.
I looked at the Chair perplexed, and he says,

“Don’t worry, they are happy that they can try out new things. People typically
say 3X.” We were all immersed in the video codec space, and yet my views
surprised them and vice versa.

~~~
carbonatedmilk
This makes complete sense to me: when you're going to stream an episode of
Stranger Things to 15 million people, who really cares what the one-time cost
to encode it is? Surely you'd take algorithmic complexity that was 100x or even
1000x the baseline if it bought you even a few kB of bandwidth savings per
stream?

~~~
lostcolony
You still care, since at some point the cost of the extra encoder-hours used
is going to outpace the bandwidth savings. I'm guessing she was mentally
estimating 'even with no bandwidth savings, how much margin do we have?', since
the amount saved was never quantified.
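
To make that trade-off concrete, here is a toy break-even sketch in Python.
Every number in it is a hypothetical placeholder (encode hours, compute and
CDN prices, per-stream savings), not a Netflix figure; the point is only that
a one-time encode cost is weighed against savings multiplied by audience size.

    # Toy break-even model: extra encode compute vs. delivery savings.
    # All numbers are hypothetical placeholders, not Netflix figures.
    baseline_encode_hours = 10        # CPU-hours for one title at baseline complexity
    complexity_multiplier = 100       # the "100X" from the quote
    cost_per_cpu_hour = 0.05          # $ per CPU-hour (assumed)
    cost_per_gb_delivered = 0.01      # $ per GB of CDN egress (assumed)
    viewers = 15_000_000              # streams of the title
    gb_saved_per_stream = 0.05        # ~50 MB saved per viewer (assumed)

    extra_encode_cost = baseline_encode_hours * (complexity_multiplier - 1) * cost_per_cpu_hour
    delivery_savings = viewers * gb_saved_per_stream * cost_per_gb_delivered

    print(f"one-time extra encode cost: ${extra_encode_cost:,.2f}")
    print(f"delivery savings:           ${delivery_savings:,.2f}")
    print("worth it" if delivery_savings > extra_encode_cost else "not worth it")

With these made-up numbers the delivery savings dwarf the extra compute, but
set gb_saved_per_stream to zero and the extra encode cost is pure loss, which
is the margin question above.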

~~~
derf_
There are a couple of things that make Netflix somewhat unique.

1) They have a fairly limited catalog (contrast with the constant ingestion of
new content by YouTube, for example)

2) The cheapest way for them to ensure they have sufficient capacity to handle
peak loads leaves them with a lot of extra compute during non-peak times. That
excess compute is essentially free.

~~~
ktta
>That excess compute is essentially free.

Since they host on AWS, wouldn't they just scale down during non-peak times
and save money?

~~~
majewsky
Afaik the AWS-hosted part does not include the CDN, which probably makes up
the vast majority of their infrastructure.

~~~
walrus01
Speaking as an ISP: Netflix traffic comes from the Netflix ASN and peering
session. Their globally distributed CDN/caching/video file storage servers are
not on AWS to the best of my knowledge. Netflix traffic does not come from our
peering with Amazon's ASN.

------
dingo_bat
It seems to me that despite the tech, Netflix video quality is really
horrible. YouTube is consistently much higher quality, even though Netflix's
own fast.com tells me my connection is capable of 75 Mbps downstream, which
should be enough for crisp 1080p. Multiple times I've been so frustrated with
Netflix's quality that I've started a movie, stopped it because it looked like
something out of a VCR, torrented the true 1080p version, and watched that.

~~~
c2h5oh
Most likely you're watching in 720p.

In Chrome, Firefox and Opera you're getting 720p max.

To get 1080p you need to be watching in Internet Explorer, Safari or on a
Chromebook.

To get 4K you need to be watching in Edge and have a 7xxx- or 8xxx-series
Intel CPU and an HDCP 2.2-enabled display.

Source:
[https://help.netflix.com/en/node/23742](https://help.netflix.com/en/node/23742)

~~~
nine_k
Do they offer any reasons for these limitations?

In particular, is it due to DRM requirements, or pure performance? I suspect
it's the former.

~~~
Crosseye_Jack
It's DRM. The Widevine configuration they use means content is decrypted and
decoded in software when you use Chrome or Firefox. When you use Edge, a
different DRM scheme is used that allows decrypting, decoding and rendering in
hardware, so Netflix offered content up to 4K in Edge with recent Intel CPUs.
(Last time I checked, Ryzen had only just come out with no onboard GPU, but
support for recent Nvidia GPUs was promised; it's been a while, so the
landscape may well have changed.) If you didn't have the latest Intel CPU it
fell back to an older version of PlayReady (fairly sure that's the brand name
of MS's DRM - on my phone and a bit lazy to look it up) that still supported
1080p.

See, in Widevine there are a number of "levels", the highest being when it can
decrypt, decode and push to the frame buffer all within a secure zone. This
cannot be achieved (at the moment, or at least at the time of my research into
the matter) with Widevine on desktop, so in such a setup Widevine will only
decrypt up to 720p content.

When running on Android and ARM this is possible and you can get 1080p, which
is why cheap Android-based TV sticks (even the old Amazon Fire TV sticks)
supported 1080p but your gaming rig running Chrome could not.

I don't work for Widevine, Google, Netflix or anyone else for that matter.
Just a nerd with too much time on my hands, so I look into this stuff. Any
corrections welcomed :-D

~~~
Crosseye_Jack
No longer able to edit, so replying to myself: taken from another post of mine
about Widevine 7 months ago where we were discussing why the Raspberry Pi
couldn't support 1080p Netflix
([https://news.ycombinator.com/item?id=15594460](https://news.ycombinator.com/item?id=15594460),
and a link to the comment chain to make it easier for anyone reading -
[https://news.ycombinator.com/item?id=15586844](https://news.ycombinator.com/item?id=15586844))

> As far as I understand it there are 3 security levels to Widevine, Level 1
> being the highest and Level 3 being the lowest.

> Level 1 is where the decryption and decoding are all done within a trusted
> execution environment (as far as I understand it, Google works with chipset
> vendors such as Broadcom, Qualcomm, etc. to implement this) and the output
> is then sent directly to the screen.

> Level 2 is where Widevine decrypts the content within the TEE and passes the
> content back to the application for decoding, which could then be done with
> hardware or software.

> Level 3 (I believe) is where Widevine decrypts and decodes the content
> within the lib itself (it can use a hardware cryptographic engine, but the
> RPi doesn't have one).

> Android/ChromeOS support either Level 1 or Level 3 depending on the hardware,
> and Chrome on desktops only seems to support Level 3. Kodi uses the browser
> implementation of Widevine (at least when Kodi is not running on Android),
> which seems to only support Level 3 (so decrypt & decode in software) and
> therefore cannot support hardware decoding. But that doesn't mean that
> hardware decoding of Widevine-protected content cannot be supported on any
> mobile SoC. Sorry if I gave that impression.

> When a license for media is requested, the security level it will be
> decrypted/decoded with is also sent, and the returned license will restrict
> the Widevine CDM to that security level.

> I believe Netflix only supports Level 1 and Level 3, which is why for a
> while the max resolution you could get watching Netflix in Chrome on a
> desktop was 720p, as I believe that was the max resolution Netflix offered
> at Level 3, and we had to use Edge/IE (iirc) to watch at 1080p, as it used a
> different DRM system (PlayReady). It is also why desktop 4K Netflix is
> currently only supported on Edge using (iirc) Intel gen7+ processors and
> Nvidia Pascal GPUs (I don't know if AMD supports PlayReady 3.0 on their
> GPUs, as I don't have one and so haven't really had the desire to
> investigate; I'm guessing that current Ryzen CPUs do not, as they currently
> don't have integrated GPUs).
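
As a rough illustration of the license-gating logic described above: the
Widevine security levels are real terminology, but the resolution caps below
are hypothetical stand-ins for whatever policy a service actually configures.
A minimal sketch:

    # Toy sketch of license-side gating by Widevine security level (L1-L3).
    # The resolution caps are hypothetical examples, not any service's real policy.
    MAX_HEIGHT_BY_LEVEL = {
        1: 2160,  # L1: decrypt + decode inside the TEE, secure output path
        2: 1080,  # L2: decrypt in the TEE, decode outside it
        3: 720,   # L3: decrypt and decode entirely in software
    }

    def max_licensed_height(client_security_level: int) -> int:
        """Highest video height a hypothetical license server would permit."""
        return MAX_HEIGHT_BY_LEVEL.get(client_security_level, 0)

    # Desktop Chrome's CDM reports L3, so the license caps playback at 720p;
    # an Android device with an L1 CDM could be licensed for 1080p or 4K.
    print(max_licensed_height(3))  # 720
    print(max_licensed_height(1))  # 2160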

------
zellyn
I see this:

    
    
      In the ITU-T VCEG and ISO/IEC MPEG standardization world, the
      Joint Video Experts Team (JVET) was formed in October 2017 to
      develop a new video standard that has capabilities beyond
      HEVC. The recently-concluded Call for Proposals attracted an
      impressive number of 32 institutions from industry and
      academia, with a combined 22 submissions. The new standard,
      which will be called Versatile Video Coding (VVC), is expected
      to be finalized by October 2020.
    

Can anyone with more industry knowledge chime in here? To me, that sounds a
lot like the kind of group that created the patent- and royalty-encumbered
stuff the AOM was created to avoid.

~~~
derf_
I have some industry knowledge. You are exactly correct.

------
ksec
>To address the _disconnect_ between researchers in academia and
standardization and the industry users of video coding technology,

This annoys me quite a bit, because it then lists out the so-called
large-scale video encoding from Facebook, Twitter, and YouTube, as if video
encoding is only done by OTT providers, and the video encoding done on iPhones
(consumers), in TV broadcast (the good old content distributors), in
livestreams of events from sports to the Olympics, etc., doesn't matter. It is
a very Silicon Valley mentality, and it shows in AOM / AV1. After all, they
are creating their own codec for their own use, while other industry codec
organisations have to take care of many "edge" use cases.

>So how will we get to deliver HD quality Stranger Things at 100 kbps for the
mobile user in rural Philippines? How will we stream a perfectly crisp 4K-HDR-
WCG episode of Chef’s Table without requiring a 25 Mbps broadband connection?

It is interesting that this 100 kbps bitrate and rural Philippines example
comes up, because it is the exact same example that Amazon's video specialist
Ben Waggoner mentioned on Doom9.

Shouldn't we be a little more realistic with the bitrate? We have 20 years of
experience and research, and yet we still don't have a single audio codec (two
dimensions) that performs better than MP3 at 128 kbps while using half the
bitrate. Opus only manages to slightly edge it out at 96 kbps, and that is
with selected samples. There is only so far we can go; 100 kbps is barely
enough for audio. And we have massive MIMO and 5G, both of which will bring an
immense capacity increase to current networks. There is so much in the
pipeline to further increase efficiency and capacity and to lower latency,
cost, and power. It is a little hard to imagine designing for 100 kbps video.

Currently YouTube streams 1080p AVC at ~2.2 Mbps, which already seems to be
fine with most people, especially at computer, tablet, or smartphone screen
sizes. HEVC can probably do similar quality at 1.5 Mbps, so VVC should be
aiming at below 1 Mbps. Netflix is doing 15 Mbps 4K streaming with HEVC (and
people are already complaining about the quality; I have no idea, I don't
watch Netflix). VVC should really be aiming at better quality at 8 Mbps. We
should aim at specific bitrates and resolutions, with real-world encoders as
anchors and specific quality targets to achieve.
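
For scale, here is the plain arithmetic behind those bitrates in terms of data
per hour of video; the figures are just the ones quoted in this comment, not
measurements of any service.

    # Data volume per hour of video at the bitrates discussed above.
    # Pure arithmetic on the figures quoted in this thread, not measurements.
    bitrates_kbps = {
        "100 kbps mobile target": 100,
        "YouTube 1080p AVC (~2.2 Mbps)": 2200,
        "HEVC 1080p estimate (~1.5 Mbps)": 1500,
        "Netflix 4K HEVC (~15 Mbps)": 15000,
    }

    for name, kbps in bitrates_kbps.items():
        # kbps -> bits/s -> bytes/s -> bytes/hour -> gigabytes/hour
        gb_per_hour = kbps * 1000 / 8 * 3600 / 1e9
        print(f"{name:32s} ~{gb_per_hour:.2f} GB/hour")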

~~~
hcnews
In most of your reply you have failed to account for the third world (and
frankly a decent chunk of the first world). It seems to me you don't have an
understanding of your user base. Please dig into how many people are actually
on HD-capable connections.

I am glad that Silicon Valley is driving it since they have users all over the
world, not just fiber-enabled users. If and when other people have a better
aim, vision and understanding of the problem, I think you should find a good
collaborative atmosphere.

In the meantime, let's not design for 100M users when we should be thinking
about 2-3B users.

~~~
kolpa
You aren't disagreeing with parent, who said that the 2-3B users don't have
HD-capable connections. The parent is saying that the industry should try
delivering SD-quality video before hand-waving about near-infinitely
compressible HD-video.

For example, one solution for low-bandwidth environments is to edge-cache it
(farther toward the real edge than "local ISP") and then spread it locally via
peer-to-peer short-distance radio (wifi, bluetooth, local cellular). This is
not a 100x data compression challenge; it's a local delivery infrastructure
challenge.

------
MayeulC
Are there transmission schemes that can transfer multiple independent streams
that add up for better quality?

It could be streams containing information about smaller sub-blocks, etc.

If you take a Fourier transform, the more coefficients you include, the more
faithful the reproduction is.
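
That intuition is easy to demonstrate. Here is a small numpy sketch, purely
illustrative (a real scalable codec does not split layers this way), in which
a base set of Fourier coefficients is refined by keeping progressively more of
them:

    # Illustrative only: reconstruct a 1-D signal from progressively more
    # Fourier coefficients, mimicking a base layer plus enhancement layers.
    import numpy as np

    rng = np.random.default_rng(0)
    signal = np.cumsum(rng.standard_normal(256))   # a smooth-ish test signal
    spectrum = np.fft.rfft(signal)

    def reconstruct(num_coeffs: int) -> np.ndarray:
        """Keep only the first num_coeffs low-frequency coefficients."""
        kept = np.zeros_like(spectrum)
        kept[:num_coeffs] = spectrum[:num_coeffs]
        return np.fft.irfft(kept, n=len(signal))

    for layers in (8, 32, 128):   # base layer, then two enhancement "layers"
        err = np.sqrt(np.mean((signal - reconstruct(layers)) ** 2))
        print(f"{layers:3d} coefficients kept -> RMS error {err:.3f}")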

Splitting the quality level into multiple independent streams could have
multiple advantages:

- better use of multicast, as viewers at different quality settings get the
same data, so you use approximately the bandwidth of one full-quality video
instead of full quality + medium + low...

- savings on storage space, for the same reason

- savings on transcoding time, as only one pass is needed

- better suited for distributed storage/transmission on a platform like
PeerTube, where every p2p client could contribute back instead of clustering
by quality.

I am not working in this field, but I know a fair bit about compression, and
this seems like a no-brainer to me. Is it already done? Where? Or did I
overlook some issues?

~~~
0x09
This is known as scalable coding and has been included as an extension or
profile in every ISO and ITU standard since at least MPEG-2; see for example
[https://en.wikipedia.org/wiki/Scalable_Video_Coding](https://en.wikipedia.org/wiki/Scalable_Video_Coding)
(this article seems to be specific to H.264).

~~~
MayeulC
Thank you a lot for the proper technical term. Do you happen to know if it is
widely used in web-based streaming applications (I would be surprised, as I
don't think I've encountered it before), if ffmpeg supports it (in the case of
H.264), and if AV1 includes such an extension?

------
lopmotr
"new techniques are evaluated against the state-of-the-art codec, for which
the coding tools have been refined from decades of investment. It is then easy
to drop the new technology as “not at-par.”

This seems to be quite a general technology problem that looks like it applies
to car engines, batteries, computer memory and any other highly optimized
mature technology. I wonder if there's some way to change incentives so others
get a chance. It happens in evolution in nature too and it's sometimes solved
by mass extinctions. Hopefully that's not the only way.

~~~
pishpash
That's what research programs were supposed to do. The problem occurs when
research is also evaluated against the same narrow criteria the industry
wants, so there is no room for new ideas to be funded through the long
maturation process.

------
vidanay
I want to see an encoding that is basically RLE using offsets and lengths into
the digits of π.

~~~
21
You mean, FLIC?

[https://en.wikipedia.org/wiki/FLIC_(file_format)](https://en.wikipedia.org/wiki/FLIC_\(file_format\))

~~~
vidanay
I don't see any references to pi in that.

------
shmerl
A post about innovation and moving beyond block-based encoding and no mention
of Daala's idea of lapped transforms?

~~~
cjensen
There is no shortage of new ideas for video encoding. The post is welcoming
new ideas and encouraging more of them, rather than being any kind of summary
of current tech.

~~~
shmerl
I guess there might be no shortage in general, but few of them are publicly
developed with the aim of making free codecs. Which other efforts are there
besides Daala?

~~~
kylophone
AV1

[https://en.wikipedia.org/wiki/AV1](https://en.wikipedia.org/wiki/AV1)

~~~
shmerl
It's still block-based. I was talking about the kind of new approaches the
blog post is referencing.

