I agree AOM has the best quality right now. But between rav1e and SVT-AV1 it's not so clear to me. When you look at the metrics, SVT-AV1 is clearly the winner. But when I do a side-by-side comparison (e.g. with an online comparison tool like https://svt.github.io/vivict/ ), SVT-AV1 blurs too much for my taste and doesn't have the same detail retention as rav1e; I find rav1e much more pleasant (my opinion). I hope at some point there will be a real human A/B testing study.
The fact that YouTube uses it only proves the increasingly obvious business model they're running in this space, namely monopolizing the encoding ecosystem by literally throwing money away to steer people clear of rav1e.
See some benchmarks here:
rav1e has a ways to go to match it.
That's just the UX bottleneck, but there's another elephant in the room: it's cost prohibitive. Encoding is so slow that we're still measuring speed in frames per minute for most software encoders. If I want to move my whole video encoding pipeline from h264 to AV1, I need around 100x the horsepower to encode. That's 100x the server costs, and as someone who's looking seriously at using AV1 for the video site I'm working on now, it's simply out of reach.
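For a sense of what that multiplier means in dollars, here's the back-of-envelope math. Every number below is an illustrative assumption, not a measurement:

    # Rough cost scaling for software encoding; all constants are assumed.
    X264_FPS_PER_CORE = 60.0  # assumed single-core x264 throughput
    AV1_FPS_PER_CORE = 0.6    # assumed AV1 throughput, ~100x slower
    CORE_HOUR_COST = 0.05     # assumed $/core-hour of server capacity

    def cost_per_source_hour(encode_fps_per_core, source_fps=30.0):
        """Server cost to encode one hour of source video."""
        frames = source_fps * 3600
        core_hours = frames / encode_fps_per_core / 3600
        return core_hours * CORE_HOUR_COST

    h264 = cost_per_source_hour(X264_FPS_PER_CORE)  # $0.025 per hour of video
    av1 = cost_per_source_hour(AV1_FPS_PER_CORE)    # $2.50 per hour of video
    print(f"h264 ${h264:.3f}/hr, AV1 ${av1:.2f}/hr, ratio {av1 / h264:.0f}x")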
Don't get me wrong, the steps that Netflix is taking with SVT-AV1 are amazing. We're seeing a huge improvement from the 500x-slower-than-h264 it was showing last year, but it still needs a huge amount of work before it's ready for prime time. I'm really hoping we see some early hardware encoding/decoding implementations for AV1, given the number of companies who support it.
YouTube has lots of AV1 encodes. You've probably watched AV1-encoded content without realising it. Here are a few examples. To verify that they're AV1 encoded, right-click on the video and select "Stats for Nerds". I'm playing them in Firefox 75 beta.
Halo trailer: https://www.youtube.com/watch?v=Fmdb-KmlzD8
Despacito music video: https://www.youtube.com/watch?v=kJQP7kiw5Fk
Porsche Taycan commercial: https://www.youtube.com/watch?v=92sXWVxRr0g
For some videos YouTube has AV1 encodes up to 480p, others are up to 1080p, and some are up to 4K.
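If you'd rather check from a script than from the player UI, listing the available formats works too. A minimal sketch, assuming youtube-dl (or a compatible fork) is installed; AV1 streams show an "av01" codec string in the format listing:

    # List the formats YouTube serves and keep only AV1 ("av01") entries.
    import subprocess

    def av1_format_lines(url):
        result = subprocess.run(["youtube-dl", "-F", url],
                                capture_output=True, text=True, check=True)
        return [line for line in result.stdout.splitlines() if "av01" in line]

    for line in av1_format_lines("https://www.youtube.com/watch?v=Fmdb-KmlzD8"):
        print(line)  # one line per AV1 stream, from 144p up to whatever exists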
Which is cool, but means other orgs that are only interested in cost savings may not want to jump in quite yet (though they should probably be preparing so they can switch it in when that point arrives).
An example video: https://vimeo.com/362164795
Hardware decoding is fine. Hardware encoding is fine if you want to live-capture something. But if you want to optimize for quality per byte, hardware encoders usually fall short, and I assume they'd fall even shorter with something as complex as AV1.
I'm more curious why encoders are all-CPU and not offloading some parts of the search to GPGPU code.
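For reference, the obviously data-parallel piece is the motion search: scoring many candidate motion vectors per block, each one independently. Here's a toy sketch of that inner loop (16x16 blocks, exhaustive +/-8 pixel search, SAD metric); the block size, search range, and brute-force strategy are all simplifying assumptions, and real encoders use much smarter searches:

    import numpy as np

    def best_motion_vector(cur_block, ref_frame, bx, by, search=8):
        """Exhaustive SAD search for one 16x16 block. Every candidate offset
        is scored independently, the part that maps naturally onto a GPU."""
        h, w = ref_frame.shape
        best, best_sad = (0, 0), float("inf")
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = by + dy, bx + dx
                if y < 0 or x < 0 or y + 16 > h or x + 16 > w:
                    continue  # candidate block falls outside the frame
                cand = ref_frame[y:y + 16, x:x + 16]
                sad = int(np.abs(cur_block.astype(np.int32)
                                 - cand.astype(np.int32)).sum())
                if sad < best_sad:
                    best_sad, best = sad, (dx, dy)
        return best, best_sad

    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, (64, 64), dtype=np.uint8)
    print(best_motion_vector(frame[16:32, 16:32], frame, 16, 16))  # ((0, 0), 0)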
Apple is a governing member of AOM (Alliance for Open Media).
A patent-free implementation of a hardware AV1 decoder is available right now.
Apple makes its own chips with significant revisions every year. Apple seems to like being on the forefront of mobile silicon and already stuffs its chips full of things orders of magnitude more elaborate than this.
Apple now runs their own streaming network and being able to offer AV1 encodes stretches their bandwidth further. Consider also that iPhone chips of today inevitably become the Apple TV chips of tomorrow.
But most importantly of all, there's no commercial or competitive incentive for Apple not to do it. All of its serious competitors license the patent encumbered codecs and would need to continue doing so indefinitely just like Apple.
License fees are only paid for Blu-ray / disc distribution and per-unit encoders / decoders.
But HEVC is already in silicon, and that's the only thing that matters in the long run.
Vendors kinda don't get why they need to waste silicon on AV1, and for whose money.
If that were the case, the AV1 1.0 spec would not have been postponed multiple times simply because the spec didn't have hardware decoder cost / performance in mind. And then there was the errata to version 1.0, which made it incompatible with the original AV1 1.0. Those vendors being members of AOM means literally just that: membership. They may or may not share any common interest.
This doesn't change the fact that HEVC got into silicon faster, and that was it.
I'm not an expert on HEVC or AV1 specifically, but in my experience the fundamental operations used by codecs are quite similar. You don't necessarily need significantly more silicon to add AV1 support.
Remember that power consumption is the main driver in IC development these days. It's not unreasonable to keep some dark silicon around to save power when needed.
Edit: it looks like the story is not opted into distribution, i.e. it should not be behind the paywall.
Can anyone from Medium chime in?
Maybe nobody there can figure out how to set up a web server? Maybe they get a kickback from Medium subscriptions? So many questions.
You can usually bypass the paywall with the archiver: https://archive.ph/gv8T6
Huh? It's their business, for goodness' sake.
> For estimating encoding times, we used Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz machine with 52 physical cores and 96 GB of RAM, with 60 jobs running in parallel.
Would love to see some comparisons with AMD at some point.
Is there currently an easy way to run this on multiple machines? Back in the day I'd probably have tried a single-system-image cluster like openMosix, but I'm not aware of anything similar for modern setups, either self-hosted or "hardware on tap".
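The usual workaround today is chunked encoding rather than a single-system-image cluster: split the source into segments, encode them on whatever machines you have over SSH, and concatenate afterwards. A minimal sketch of the fan-out step; the hostnames, the shared /mnt/video path, and the preset are all assumptions (tools like Av1an automate this properly):

    # Fan encode jobs out to SSH-reachable workers; everything concrete here
    # (host names, storage paths, preset) is an assumption for illustration.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    HOSTS = ["node1", "node2", "node3"]                  # assumed workers
    CHUNKS = [f"chunk_{i:03d}.y4m" for i in range(12)]   # pre-split segments

    def worker(host, chunks):
        for chunk in chunks:
            # SvtAv1EncApp's -i / -b / --preset options are real; paths aren't.
            cmd = (f"SvtAv1EncApp -i /mnt/video/{chunk} "
                   f"-b /mnt/video/{chunk}.ivf --preset 8")
            subprocess.run(["ssh", host, cmd], check=True)

    with ThreadPoolExecutor(max_workers=len(HOSTS)) as pool:
        # deal chunks round-robin: host i gets chunks i, i+3, i+6, ...
        futures = [pool.submit(worker, h, CHUNKS[i::len(HOSTS)])
                   for i, h in enumerate(HOSTS)]
        for f in futures:
            f.result()  # propagate any ssh/encode failure

The encoded chunks then get concatenated (e.g. with ffmpeg's concat demuxer or mkvmerge), provided the splits were made on keyframe boundaries.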
Here are some comparisons of various SVT-AV1 releases on Intel and AMD CPUs:
Why is it 1-pass? And, if I may use the word, "again"? Every time it is compared to libaom, it is always one pass.
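For anyone wondering what the 2-pass variant being left out looks like: with libaom's standalone encoder it's the usual vpxenc-style flow, where a first pass writes a stats file that the second pass consumes. A sketch is below; --passes, --pass, --fpf, and --ivf are real aomenc options, while the bitrate and cpu-used values are arbitrary:

    import subprocess

    common = ["aomenc", "input.y4m", "--ivf", "--cpu-used=4",
              "--end-usage=vbr", "--target-bitrate=1000",
              "--passes=2", "--fpf=aom_stats.log"]

    # Pass 1 analyzes the clip and writes aom_stats.log; pass 2 reads it and
    # produces the final bitstream (overwriting the throwaway pass-1 output).
    subprocess.run(common + ["--pass=1", "-o", "out.ivf"], check=True)
    subprocess.run(common + ["--pass=2", "-o", "out.ivf"], check=True)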