
AV1 beats x264 and libvpx-vp9 in practical use case - runesoerensen
https://code.facebook.com/posts/253852078523394/av1-beats-x264-and-libvpx-vp9-in-practical-use-case/
======
vladdanilov
Meanwhile, there are 2.8 MB of images in the article, with the header image
alone being a 1.6 MB JPEG (!). They can be reduced by at least 50%, given that
the deringing filter works fine on high-contrast text/charts. What's more, the
optimal format for this type of image is PNG, not JPEG. After conversion and
optimization they take 889 KB [1], >3 times smaller, and doing the same on
properly resized originals without JPEG artifacts can bring them down to
100-200 KB.

[1]
[https://github.com/vmdanilov/optimage/files/1898955/images.z...](https://github.com/vmdanilov/optimage/files/1898955/images.zip)
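
A minimal sketch of the kind of conversion I mean, using Pillow (filenames
hypothetical): flat charts and screenshots with few colors compress far
better as palette PNG than as JPEG, and pick up no ringing around text.

    from PIL import Image  # pip install Pillow

    # Hypothetical filenames: quantize a flat chart/screenshot to a
    # 256-color palette and save as an optimized PNG.
    img = Image.open("header.jpg").convert("RGB")
    img = img.quantize(colors=256)   # palette mode suits flat graphics
    img.save("header.png", optimize=True)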

~~~
triangleman
Totally appropriate self promotion, I approve.

------
vanderZwan
> _Our tests were conducted primarily with Standard Definition (SD) and High
> Definition (HD) video files, because those are currently the most popular
> video formats on Facebook. But because AV1's performance increased as video
> resolution increased, we conclude the new compression codec will likely
> deliver even higher efficiency gains with UHD/4K and 8K content._

This remark about higher resolution reminded me of something I've been
wondering about frame rates, which are essentially also a higher resolution,
except across the dimension of _time_.

Assuming low to negligible noise from the camera (or CGI, which avoids the
problem altogether), shouldn't doubling from 30 FPS to 60 FPS produce less
than double the size increase? The reasoning being that the changes per frame
are smaller, leading to better predictions.

Is there any rule-of-thumb formula for expected size increase as frame rate
increases?

~~~
jd20
The delta encoding will be more efficient at a higher frame rate, but x264
also has a tweak which decreases the quality factor as the frame rate
increases. Why? The reasoning is that, since each frame is displayed for less
time, you don't have as much time to appreciate the details, and hence fewer
bits can be used while still maintaining the same perceptual quality.

I found this out the hard way when encoding time lapses: even if the delta
between frames is identical, the final choice of frame rate greatly affects
the output file size.
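
As for a rule of thumb: I don't know of a universal formula, but here's a toy
model (not x264's actual rate control) where bits per frame shrink with frame
duration to some power alpha; alpha is a made-up tuning constant.

    # Toy model: total size ratio when raising the frame rate, assuming
    # bits per frame scale as duration^alpha (alpha is made up).
    def relative_size(fps_new, fps_base=30.0, alpha=0.6):
        frames_ratio = fps_new / fps_base               # more frames to code
        bits_per_frame = (fps_base / fps_new) ** alpha  # each frame cheaper
        return frames_ratio * bits_per_frame

    print(relative_size(60))  # ~1.32x, well under double

With alpha=0 every frame costs the same (size doubles with frame rate); with
alpha=1 the size doesn't grow at all. Reality sits somewhere in between and
depends heavily on content.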

~~~
hnaccy
If all frames have lower quality, wouldn't it still be perceived regardless
of frame rate?

~~~
mAritz
I'm no expert, but I would say no. Each frame delta might carry less (or the
same) information, but since you're applying more deltas per second, the
overall quality comes back up.

~~~
vanderZwan
Wouldn't this lead to bigger error accumulation? At least, my gut feeling says
"less information per delta = larger error accumulation", and on top of that
we have more deltas per second.

------
keldaris
"Our testing shows AV1 surpasses its stated goal of 30% better compression
than VP9, and achieves gains of 50.3%, 46.2% and 34.0%, compared to x264 main
profile, x264 high profile and libvpx-vp9, respectively.

This sounds like AV1 is actually worse than H.265, at least at low-ish
resolutions. Am I misreading these results?

For context, I've moved to using H.265 (via the x265 encoder) for archiving
720p and 1080p video. At least for low bitrates (in the 500-1500 kbit/s range)
it gives me over 50% smaller filesizes compared to x264 high profile at
roughly the same quality. The encoding speed is much slower, of course, but
not by more than 10x on modern CPUs, typically less. I had hoped to move over
to some AV1 implementation in a few years, but these results make that look
doubtful. Are the supposed advantages of AV1 over H.265 confined to ultra high
resolutions and/or high bitrates?
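
To make the comparison concrete, here's a back-of-the-envelope reading of the
article's percentages as bitrate reductions at equal quality (my assumption
about how they compose):

    # Hypothetical 1000 kbit/s x264-high clip, using the article's gains.
    x264_high = 1000.0
    av1 = x264_high * (1 - 0.462)  # ~538 kbit/s (46.2% gain over x264 high)
    vp9 = av1 / (1 - 0.340)        # ~815 kbit/s (AV1 is 34% below VP9)
    x265 = x264_high * 0.5         # ~500 kbit/s, my own archiving result

If my x265 numbers hold up, x265 lands at or below AV1's bitrate here, which
is exactly what prompts the question.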

Please note that this is a purely technical question. I don't care about the
patent issues and for those that do, they've been covered at length in other
comments already.

~~~
tveita
"compared to x264 high profile at roughly the same quality"

Compared how? With what kind of source videos?

Unless you are matching SSIM on similar content I don't think you can
meaningfully compare your numbers to the ones in the article.

~~~
NelsonMinar
Me, I just take x264 video I downloaded off the back of a truck and re-encode
it to x265 with default settings in FFmpeg. It looks the same visually to me
on my TV, but the file size (including audio) is typically 25% of the
original. I realize this is the opposite of a carefully controlled codec
experiment; just saying that in practice it works pretty well.

If you're curious about x265 in practice, the pirate scene group PSA-Rips is
doing a good job recoding stuff to x265.
[http://psarips.com/](http://psarips.com/)

~~~
marcusjt
To be a little fairer you should transcode it to both x264 and x265 so that
the two outputs can be compared to each other (rather than comparing the
transcoded x265 to the x264 source as you did, which isn't quite the same
thing).

~~~
NelsonMinar
I did that experiment with a file I found online. The source x264 video stream
was 1600 MB. My x264 re-encode was 290 MB, and the x265 was 80 MB. (There's
also a significant saving from re-encoding the audio in ffmpeg, which is
included in the file sizes I reported.) In all cases I'm sure my result is
lower quality, but it's still more than acceptable for me, watching TV with
low expectations.
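
For anyone who wants to repeat it, the setup was roughly this (paths
hypothetical). Note that ffmpeg's defaults differ per encoder (CRF 23 for
libx264 vs. CRF 28 for libx265), so "default settings" is not an
equal-quality comparison:

    import subprocess

    # Re-encode the same source with both encoders at ffmpeg defaults.
    for codec, out in [("libx264", "out_x264.mkv"),
                       ("libx265", "out_x265.mkv")]:
        subprocess.run(
            ["ffmpeg", "-i", "source.mkv",
             "-c:v", codec,                  # video codec under test
             "-c:a", "aac", "-b:a", "128k",  # audio re-encode saves a lot too
             out],
            check=True)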

------
LeoPanthera
AV1's competition is x265, not x264.

~~~
lern_too_spel
AV1 is a codec, not an implementation. Its competition is the upcoming H.266.
VP9's competition was H.265, and libvpx-vp9's competition was x265.

~~~
niftich
The exact correspondence between the VPx-es and the MPEG codecs is not that
relevant, because old codecs don't go away when new ones come out. In fact,
hardware decoders are irrevocably baked into devices where consumption
happens, so anything H.264 or newer becomes difficult to displace -- like
Google's ham-fisted push [1] for VP8 and VP9 on YouTube for their own benefit,
burning through people's batteries in software instead of serving H.264.

But I'd argue that the MPEG codecs roughly slot in between each tier of VP8,
VP9, and AV1, because the MPEG codecs are tremendously flexible (as can be
seen in Netflix's tests [2]), have dozens of competing encoders of different
qualities, and dozens of profiles that constrain or unlock advanced features.
So in a way, it can be simultaneously true that AV1's competition is both
H.265 and whatever comes after H.265.

[1]
[https://news.ycombinator.com/item?id=13230676](https://news.ycombinator.com/item?id=13230676)
[2] [https://medium.com/netflix-techblog/more-efficient-mobile-
en...](https://medium.com/netflix-techblog/more-efficient-mobile-encodes-for-
netflix-downloads-625d7b082909)

~~~
TheForumTroll
How is serving better quality with less bandwidth "ham-fisted"? Both Netflix
and Google say the codec is better. If anyone was ham-fisted, it was Apple,
who fought to stop free codecs.

~~~
acdha
He explained that in the same sentence. An imperceptible quality improvement
is not worth an order of magnitude increase in CPU usage, and the network
savings aren’t enough to make up for dropped frames on hardware which is more
than a year old.

------
vbezhenar
I wonder about CPU decoding. I never cared about codecs, because every movie
played at the proper FPS on my 2013 MacBook Air. But recently I downloaded a
4K H.265-encoded movie, and it turned out the MacBook just wasn't able to
decode it at the proper speed. I hope that AV1 will be more efficient.

~~~
freeone3000
The determining factor is hardware decode support, not efficiency. H.264
compresses worse than H.265, but if that movie had been H.264, it would have
played back flawlessly thanks to the Intel hardware decoder.

~~~
TD-Linux
It would also play back flawlessly even without the hardware decoder, because
H.264 is so fast to decode nowadays. If AV1 decoding is fast enough, it can
let people use AV1 without buying new hardware.

------
formatkaka
Encoding time is 667x compared to VP9. How can it be used in production? 667
TIMES LONGER.

~~~
akmittal
How about decoding performance, which is much more important?

~~~
opencl
Decoding is going to have to be supported in hardware for any sort of mass
adoption anyway.

~~~
akmittal
Isn't the same true for encoding?

I don't see hardware support coming in the next 2-3 years, and even then not
everyone is going to change their hardware immediately. Good software
performance is quite important.

------
CharlesW
The interesting bits for me:

"Our testing shows AV1 surpasses its stated goal of 30% better compression
than VP9, and achieves gains of 50.3%, 46.2% and 34.0%, compared to x264 main
profile, x264 high profile and libvpx-vp9, respectively.

"However, AV1 saw increases in encoding computational complexity compared with
x264 main, x264 high and libvpx-vp9 for ABR mode. Encoding run time was
9226.4x, 8139.2x and 667.1x greater, respectively…"

~~~
niftich
Let that sink in:

'Encoding run time was 9226.4x, 8139.2x and 667.1x greater, respectively…'

The codec is good, but the encoder has a long way to go. This places it
squarely in "only worth it right now if you're serving the same video
millions of times" territory (someone graph this?).
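
A rough back-of-the-envelope, with every cost made up except the 667.1x and
34.0% figures from the article:

    # When does AV1's extra encode CPU pay for itself in bandwidth?
    cpu_per_core_hour = 0.05                      # $/core-hour, hypothetical
    bandwidth_per_gb  = 0.01                      # $/GB served, hypothetical
    vp9_encode_hours  = 10.0                      # per title, hypothetical
    av1_encode_hours  = vp9_encode_hours * 667.1  # from the article
    vp9_gb_per_view   = 1.0                       # hypothetical
    av1_gb_per_view   = vp9_gb_per_view * (1 - 0.340)  # 34% smaller

    extra_encode = (av1_encode_hours - vp9_encode_hours) * cpu_per_core_hour
    saved_per_view = (vp9_gb_per_view - av1_gb_per_view) * bandwidth_per_gb
    print(extra_encode / saved_per_view)  # ~98,000 views to break even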

It's a bit hyperbolic to claim that this is a 'practical use case', though
maybe at Facebook/Netflix/YouTube's scale it is. Either way, it will be
exciting to watch this space.

~~~
whatever_dude
I'd assume encoding was largely software-based, using a proof-of-concept
encoder, while the x264/VP9 encoders probably already take advantage of
hardware-acceleration shortcuts evolved over many years.

> the main focus of current AV1 development is on speed optimization to make
> it practical for use in production systems

Remember: first make it work, then make it right, then make it fast. Seems
they're only starting the 3rd step now.

~~~
niftich
libvpx, the VP9 encoder library used in this test, has no support for any
hardware encoder blocks for VP9 [1], so it does everything in software. There
are ways [2] to compile support into ffmpeg-with-libvpx that makes it able to
invoke the hardware encoder in recent Intel CPUs (Skylake or newer) [3][4]
(using vp9_vaapi), but it's doubtful that this was used, since their
command-line switches indicate nothing of the sort.
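
For reference, the hardware path would look roughly like this (a sketch
assuming a VAAPI-capable ffmpeg build and an Intel render node at the usual
device path); the article's command lines show nothing like it:

    import subprocess

    # Invoke ffmpeg's vp9_vaapi hardware encoder (Skylake or newer).
    subprocess.run(
        ["ffmpeg",
         "-vaapi_device", "/dev/dri/renderD128",  # GPU render node
         "-i", "input.mp4",
         "-vf", "format=nv12,hwupload",           # upload frames to the GPU
         "-c:v", "vp9_vaapi",                     # hardware VP9 encode
         "output.webm"],
        check=True)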

x264 is a software-only encoder that provides no hooks into hardware
acceleration. Their ffmpeg command line indicates that no hardware
acceleration hooks of ffmpeg were used.

[1]
[https://github.com/webmproject/libvpx/blob/master/CHANGELOG](https://github.com/webmproject/libvpx/blob/master/CHANGELOG)
[2]
[https://gist.github.com/Brainiarc7/24de2edef08866c3040805048...](https://gist.github.com/Brainiarc7/24de2edef08866c304080504877239a3)
[3]
[https://cgit.freedesktop.org/libva/commit/?id=fb57f5c15e72c3...](https://cgit.freedesktop.org/libva/commit/?id=fb57f5c15e72c39efeb2a10492a327110d6a06c8&utm_source=anzwix)
[4]
[https://en.wikipedia.org/wiki/VP9#Hardware_implementations](https://en.wikipedia.org/wiki/VP9#Hardware_implementations)

~~~
lallysingh
But let's remember that x264 has had a lot of time to find optimizations in
SIMD use, and time for CPUs to start optimizing for it as a use case.

------
StreamBright
Is this going to be supported natively on iOS and Android? Let's say I wanted
to start a new company that delivers video content to people. Could I use
this today as the primary format for that? The performance numbers look
great.

~~~
Mindwipe
No, because there's no hardware decoder support for it yet in common chipsets.
We're probably three years away from that, realistically.

------
ksec
From Doom9

Harmonic showed a comparison demo at NAB, pitting AV1, AVC, HEVC and JVET
against each other. At the equivalent bitrate of 1.9 Mbps they got the
following results:

Codec -- PSNR -- VMAF

AVC -- 32.7 -- 58
HEVC -- 36.7 -- 80
AV1 -- 37.2 -- 83
JVET -- 38.5 -- 88

------
JoachimS
(Several have asked, but I haven't seen a good response yet.) What are the
computational requirements for AV1 decoding compared to H.264, VP9, etc.? Are
they in the same order of magnitude, multiple orders worse, or lower?

~~~
Radle
They are nearly identical. But your current CPU does not have an integrated
hardware decoder for AV1, so in practice it might be slower for now.

------
ncmncm
I wonder if anyone has tried rendering directly to AV1 (or x264, or what-have-
you), e.g. in a video game. There is all this hardware devoted to turning a
few bytes of input into lots of pixels out. Maybe you could do less work
producing just the few bytes, and let dedicated hardware generate the pixels.
Of course you don't need the CPU to generate the bytes; code running on the
GPU would still do that, but with less work to do, you might get higher frame
rates, or better detail at the same rate.

You heard it here first. (That is, unless it doesn't work.)

~~~
msravi
> Maybe you could do less work producing just the few bytes

It takes _more_ work to produce fewer bytes, because you have to pack the
same information into fewer bytes. Entropy and all that.

~~~
ncmncm
Not if the information starts out in fewer bytes. Then, you're just
rearranging the bits into renderable order. "Chicken" is a very small number
of bits that would need to be expanded to more bits (leg down here, wing over
there, beak up here) before handing them over to the decoder to make actual
feathers. The representation in a game's scene list is overwhelmingly more
compact than the input to a decoder would be, and most of it barely changes
from one second to the next, as the pixels are all replaced dozens of times
over.

~~~
baseethrowaway
The information does not start out in fewer bytes. It starts out as all the
bytes that represent the 3D models of the entities you're interacting with in
your game, plus the textures to cover them, at least. That's usually more
bytes than needed to show a 32-bit 4K picture. The rendering process reduces
all that jungle in your VRAM to a few kB that are briefly shown on your
screen.

It seems like you don't know the basics of 3D graphics or of video encoding.
It is extra work, and would only save bandwidth needed to pump the picture to
the display, which is not an issue in 3D graphics (we have dedicated cables).

------
gok
Compression standard finalized in 2018 is more sophisticated than one from
2004? Say it ain’t so.

