I'll have to see it to believe it, because av1 has a long way to go to become even remotely comparable to x265 in encode times, let alone superior.
With ffmpeg built from git, I can encode a 1920x1080 video file to x265 (with a boatload of parameters and options, via a custom threadpool I've written that can saturate all the cores regardless of input stream complexity or size) at 9.2fps on a 16-core 1950X with sufficient RAM.
The same harness powering ffmpeg's av1 encoder (not the fastest, they haven't switched to rav1e yet) does not manage 2fps (I'm letting it run to see what it ends up with, but it'll be a while for this short 3:13 video).
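To put those throughput figures in wall-clock terms, here is a back-of-the-envelope sketch. The 24 fps source frame rate is an assumption (the post does not state it), and 2 fps is used as an optimistic bound for the AV1 run, since the post only says it does not reach that.

```python
def encode_seconds(duration_s: float, source_fps: float, encode_fps: float) -> float:
    """Wall-clock seconds needed to encode a clip at a given encoder throughput."""
    if source_fps <= 0 or encode_fps <= 0:
        raise ValueError("frame rates must be positive")
    total_frames = duration_s * source_fps
    return total_frames / encode_fps


CLIP_SECONDS = 3 * 60 + 13   # the 3:13 clip mentioned in the post
ASSUMED_SOURCE_FPS = 24.0    # assumption: the post does not give the source frame rate

x265_seconds = encode_seconds(CLIP_SECONDS, ASSUMED_SOURCE_FPS, 9.2)
av1_seconds = encode_seconds(CLIP_SECONDS, ASSUMED_SOURCE_FPS, 2.0)  # optimistic bound

print(f"x265: about {x265_seconds / 60:.1f} min")
print(f"AV1:  at least {av1_seconds / 60:.1f} min")
print(f"slowdown: at least {av1_seconds / x265_seconds:.1f}x")
```

The slowdown ratio does not depend on the assumed source frame rate; only the absolute times do.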
As long as AV1 encoding speed (at a useful compression level) is within a small factor of HEVC's, it will not be a factor in its success. Licensing, quality, compression, and decoding speed are the things that matter.
sintel.y4m 1280x544 21G
FPS: 140.65, File size: 114M
FPS: 34.891, File size: 39M
So AV1 looks very promising, considering it's still in its infancy.
The only video options I'm using are:
-preset:v veryslow -crf:v 18
"It’ll be interesting to see if we find out more info and are able to test this encoder in the coming months."
>BBC, which has no skin in the game, claims AV1 is less efficient than HEVC
"I would call this test flawed as AV1 has consistently shown to perform better than HEVC."
Every article on AV1 that I have read is like this, except for https://codecs.multimedia.cx/2018/12/why-i-am-sceptical-abou... -- they are always blatantly cheerleading new advancements, like only being 5x slower to decode than VP9 or 10x slower to encode than x265 or whatever. But the advancements are not phrased like that, of course--you are never reminded that the competition continues to clobber AV1 in every aspect but filesize/quality efficiency.
The BBC does have skin in the game. They have many people in-house who have invested in HEVC and put their names behind that decision; furthermore, maybe you'll remember when the BBC spent a huge amount of money on Dirac, which seems to have continued to be extremely impractical in the medium to long term. Maybe the BBC doesn't have a spectacular track record for picking the winners in the video codec game. ;-)
Added: The BBC's comparison also seems to be between professionally-configured and tuned HEVC/VVC encoders, supported by their vendors; and what seems to be a default-configured libaom, with no consultation with the vendor.
Which competitor beats AV1 in licensing? VP9 matches it in licensing. No other codec beats it. Even Leonardo Chiariglione (founder and chairman of MPEG) says AV1's licensing has MPEG beat:
It's no surprise then that AV1 (which is basically VP10) uses more resources, but compresses more, and that Netflix and Intel are keen to replace the libvpx derived encoder with their own implementation of the standard.
Just because something is open does not mean it is meant to serve consumers' interests.
Google and Twitch also care about encoding time, but in a different way - they have strong use cases for real time encoding being possible.
Everyone cares about decoding efficiency - Google wants you to be able to watch more videos and thus more ads. If your battery is dead, you can't watch any ads. Netflix wants you to get hooked on more series. etc.
> Just because something is open does not mean it is meant to serve consumers' interests.
This is true in general, but I really don't see how it applies to this case at all. The big companies have incentives that actually do line up pretty directly with users' interests.
"We will be satisfied with 20% efficiency improvement over HEVC when measured across a diverse set of content and would consider a 3-5x increase in computational complexity reasonable."
Encoding time takes a backseat.
And yet Netflix is working on SVT-AV1 with Intel, which is one of the fastest AV1 encoders available:
How does that fit with your narrative?
Of course they do. Google wouldn't sell many Pixels if you could only watch half as much YouTube or Netflix on a single charge compared to, say, an iPhone. Netflix wouldn't sell many subscriptions if you could only watch half as much on a single charge as Hulu or Prime Video either. Your point about encoding is valid, but Google and Netflix are definitely incentivised towards low decoding complexity on the client.
1. The difference between $0 and even the smallest license fee is essentially infinite; just the friction to track and collect license fees would basically kill off the open source ecosystem. Also, it really rubs some people the wrong way that you can get a complete operating system and hundreds of apps for free but a single modern video codec used to cost money.
2. Codec patent licenses are sometimes kind of trolly; some codecs want to charge fees per minute of video instead of per encoder/decoder which is an accounting nightmare and feels abusive.
Let us begin with the fact that there are at least three different legal entities you have to talk with, each with a different scheme:
At least with H.264 you only had one.
Is this known to have ever been achieved?
Also all of this is applicable to countries where those patents are working, of course.
I don't have a problem with paying license fees; it is just that most of those companies don't agree on them.
I only wish the standard were made free for software encoding and decoding.
All hardware-accelerated encoders and decoders would be $0.5 per unit, and $0.3 for an encoder only or a decoder only, with no caps.
For mobile alone, that is anywhere between $360M and $600M, assuming all devices support it. If we include PCs, tablets, consoles, and all other accessories, it is up to $1B per year, and over the lifetime of the codec easily $10B+ in patent fees in total, split across all companies.
The consumer will be paying for it, and we all enjoy better video quality with smaller downloads. Unfortunately, the sweet spot for newer codecs tends to be at 4K or even 8K. I wish they would put a lot more focus on 1080p at 1 to 2 Mbps, which is where the vast majority of video on the Internet could settle.
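The mobile figures above can be sanity-checked with a few lines of arithmetic. This is only a sketch of the commenter's proposal: the per-unit fees and the $360M to $600M range come from the comment, while the unit count is not stated there and is inferred here by dividing one by the other.

```python
FEE_ENCODER_AND_DECODER = 0.50  # USD per unit, from the comment
FEE_SINGLE_FUNCTION = 0.30      # USD per unit, encoder-only or decoder-only


def annual_royalty(units: float, fee_per_unit: float) -> float:
    """Total yearly royalty for a number of shipped units, with no cap."""
    return units * fee_per_unit


def implied_units(total_usd: float, fee_per_unit: float) -> float:
    """Unit count that would produce a given royalty total at a given fee."""
    return total_usd / fee_per_unit


# Inferred, not stated in the comment: both ends of the quoted range
# correspond to the same number of devices per year.
low_units = implied_units(360e6, FEE_SINGLE_FUNCTION)
high_units = implied_units(600e6, FEE_ENCODER_AND_DECODER)

print(f"units implied by the low end:  {low_units / 1e9:.2f} billion")
print(f"units implied by the high end: {high_units / 1e9:.2f} billion")
```

Both ends of the range work out to the same yearly device count, so the spread in the quoted total comes entirely from which fee tier applies.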
To my eyes, there's a pretty decent chance that the Sisvel pool is an HEVC industry ploy to propagate FUD around AV1 and VP9 (suspiciously both, the latter seeming to have been the subject of no lawsuits, despite being deployed widely for about six years), to stop the bleeding w.r.t. HEVC licensing (which is still an absolute steaming dump in the lap of your legal department).
I'll believe it when I see the receipts, and even then, it'd probably be less of a minefield than HEVC. And to be fair Sisvel has not, in other industries, seemed to be a particularly insidious actor; but given the number of big fish who are obviously using VP9 encoders to do tens of billions of dollars of business without licensing anything from Sisvel, I can't see it as anything other than bluster.
It is highly unlikely that AV1 infringes on none of them, nor on any other patents.
However, there's a big giant gap between a valid patent and actually expecting to get royalties without even revealing which patents are supposedly being infringed on even now that AV1 is standardised and in production usage.
So do I think that AV1 is patent-free when all is said and done from a legal standpoint? No. Do I think it'll be royalty-free anyway? Yes.
> 1.3. Defensive Termination. If any Licensee, its Affiliates, or its agents initiates patent litigation or files, maintains, or voluntarily participates in a lawsuit against another entity or any person asserting that any Implementation infringes Necessary Claims, any patent licenses granted under this License directly to the Licensee are immediately terminated as of the date of the initiation of action unless 1) that suit was in response to a corresponding suit regarding an Implementation first brought against an initiating entity, or 2) that suit was brought to enforce the terms of this License (including intervention in a third-party action by a Licensee).
* JVC Kenwood Corporation
* Koninklijke Philips N.V.
* Nippon Telegraph and Telephone Corporation
* Orange S.A.
* Toshiba IPR Solutions, Inc.
If they intend to initiate any patent litigation (which will be a long time yet, not least because they won't even reveal which patents), I think they'll find themselves on the losing end very quickly, having waited this long while VP9/AV1 established themselves.
(Or a similar tort/breach. In Australia it would constitute a case of misleading or deceptive conduct under the ACL.)
Patent cases of this sort seem to broadly favour the litigant, so why stir the hornet's nest?
For the cost of some legal fees, a loss seems like a great value.
You won't have to wait long, more information will be coming out later this month. I manage the codec engineering team at Mozilla and we are co-hosting the Big Apple Video 2019 conference with Vimeo at their space on June 26th:
Have a look at our speaker list. Zoe Liu, the Co-Founder and President of Visionular, will be presenting the Aurora AV1 encoder, but this is just one of many talks that will look at the state-of-the-art in video technology. Hassene Tmar of Intel will give an overview of SVT-AV1.
The conference is free to attend, but please register. We will be live streaming the event with remote participation for those who can't make it.
You could have made that statement for all previous codec generations as well. We always buy comparatively small improvements with much higher computational costs. And Moore's law makes it always worth it. In the end filesize/efficiency is why we are doing this stuff in the first place.
They ran a similar comparison before and published more details. In that one, AV1 lost a little bit to x265 on objective PSNR but beat it on subjective evaluation.
In this latest result, they don't give as much info, but AV1 is closer in PSNR (ahead for 4k) while also being 25x faster than it was the last time, so presumably it would still win on subjective quality.
The parameter “--end-usage=q” was set to force fixed QP encoding according to the QPs in Table 1 and “--threads=1” was used to run the encoder in single-thread mode. The parameters “--passes=1” and “--lag-in-frames=0” were set to run AV1 in single pass mode without the possibility of looking ahead in the video sequence before encoding. Finally, the internal bit depth of the codec was set to 12 as typically used during the AV1 development. Finally, for all encoding technologies, each sequence was split into chunks of one Intra period (approximately 1 second, as defined in the RA configuration for HM and JEM), which allowed each chunk to be independently encoded in parallel. This coding configuration was adopted to reduce the overall time needed to encode with AV1, instead of encoding each 10 second sequence sequentially.
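The chunking scheme described in that quote can be sketched as follows. This is a minimal illustration, not the BBC's tooling: `split_into_chunks` is a hypothetical helper, and the 60 fps frame rate in the usage lines is an assumption, since the quote gives durations only.

```python
from typing import List, Tuple


def split_into_chunks(total_frames: int, frames_per_chunk: int) -> List[Tuple[int, int]]:
    """Split a sequence into half-open frame ranges [start, end).

    Each range is one intra period, so every chunk starts on a keyframe
    and can be encoded independently of (and in parallel with) the others.
    """
    if total_frames < 0 or frames_per_chunk <= 0:
        raise ValueError("need total_frames >= 0 and frames_per_chunk > 0")
    return [
        (start, min(start + frames_per_chunk, total_frames))
        for start in range(0, total_frames, frames_per_chunk)
    ]


ASSUMED_FPS = 60                      # assumption: not stated in the quote
sequence_frames = 10 * ASSUMED_FPS    # one 10-second test sequence
chunks = split_into_chunks(sequence_frames, 1 * ASSUMED_FPS)  # ~1 s intra period

print(len(chunks), "independent chunks, each opening with a keyframe")
```

The cost of this arrangement is that every chunk boundary forces a keyframe, which is the GOP-length issue raised in the replies.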
The BBC essentially forced the GOP size to 1s for libaom. In every benchmark I've run where HM and libaom are _not_ run in ridiculous modes, libaom has been about 30% better in BD-rate. This is consistent with many broadly reported, independent analyses, such as those from MSU, Facebook, etc. The second BBC evaluation, which at least enabled two-pass (the configuration AOM was developed in), shows much larger gains^^. They still did not address the inequity in GOP length. Even old codecs like H.264 can see a 10 to 20% gain on sequences from longer GOPs.
AV1 is far from perfect, as some of its features are very computationally complex to implement today, while HEVC and VVC have "better known" complexity for their features. But the BBC benchmark analysis is simply not accurate.
https://code.fb.com/video-engineering/facebook-video-adds-av... https://www.elecard.com/page/aom_av1_vs_hevc http://iphome.hhi.de/marpe/download/spie-2017.pdf https://bitmovin.com/av1-multi-codec-dash-dataset/
Edit: PS. The BBC evaluation is good for what they were benchmarking: will AV1 today beat HM for live streaming? And the answer is no. The Intel SVT encoder may change that soon. Libaom was not configured or tested for that. But this is an odd benchmark to put forward since, as explicitly mentioned, speed is not what _either_ code base was built for.
2) I also don't really see why this would inherently disadvantage AV1, though obviously it would dampen any efficiency gain. (Why would it dampen it? Because it's not like one encoder is going to draw keyframes way more efficiently than another encoder, and as you lower GOP size, way more of the filesize gets taken up by keyframes, proportionally).
3) One-second GOPs are pretty out there, but we do live in a world where two-second GOPs aren't that unusual, so it's not that crazy.
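The dampening argument in point 2 can be illustrated with a toy bit-budget model. Everything numeric here is an illustrative assumption, not a measurement: a keyframe is taken to cost 10 times an inter frame, both encoders are assumed to spend the same bits on keyframes, and the "better" encoder is assumed to save 30% on inter frames only.

```python
def gop_bits(gop_frames: int, key_bits: float, inter_bits: float) -> float:
    """Bits for one GOP: a single keyframe followed by inter frames."""
    if gop_frames < 1:
        raise ValueError("a GOP needs at least one frame")
    return key_bits + (gop_frames - 1) * inter_bits


def keyframe_share(gop_frames: int, key_bits: float, inter_bits: float) -> float:
    """Fraction of the GOP's bits spent on its keyframe."""
    return key_bits / gop_bits(gop_frames, key_bits, inter_bits)


def overall_saving(gop_frames: int, key_bits: float, inter_bits: float,
                   inter_saving: float) -> float:
    """Total bitrate saving when only the inter frames get cheaper."""
    base = gop_bits(gop_frames, key_bits, inter_bits)
    improved = gop_bits(gop_frames, key_bits, inter_bits * (1.0 - inter_saving))
    return 1.0 - improved / base


KEY_BITS, INTER_BITS, INTER_SAVING = 10.0, 1.0, 0.30  # illustrative assumptions

for gop in (24, 48, 120, 240):  # GOP lengths in frames
    share = keyframe_share(gop, KEY_BITS, INTER_BITS)
    saving = overall_saving(gop, KEY_BITS, INTER_BITS, INTER_SAVING)
    print(f"GOP {gop:3d}: keyframe share {share:.0%}, overall saving {saving:.1%}")
```

Under these assumptions the overall saving approaches the 30% inter-frame saving only as the GOP grows; with short GOPs the fixed keyframe cost eats into it. The model deliberately ignores any difference in intra coding efficiency between the two encoders.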
1) AV1 has significant advantages in its inter compression vs HM. AV1 also has explicit tools for dealing with short GOPs that were not tested. See: https://jmvalin.ca/papers/AV1_tools.pdf for inter prediction.
2) AV1 and HM both have a different set of tools for intra prediction and as such different coding efficiency on key frames.
3) Yeah, and two-second GOPs are really bad for video quality. 4-6s is far more common except in latency-sensitive applications.
I'll also point out that the reported MOS scores have several oddities, like an 8 Mbps AV1 stream being rated lower than its own 6 Mbps stream.
That reminds me of the old, hilariously broken ffmpeg AAC encoder where increasing the bitrate decreased the quality after a certain point, at least in my tests. So although my guess at what would be causing weird stuff like this would certainly be bad testing methodology by BBC rather than a quirk in the encoder, there's a non-zero chance that I'd be wrong.
However, I think libaom performance was the deciding factor for this. They say right there that they chose this arrangement because it would allow them to deliver 10s chunks with reasonable latency.
With SVT-AV1 and appropriate hardware, there's no reason they couldn't do 10s like they should, and in that case AV1 would probably shine.
Twitch (and others) are looking into optimizing for live streaming. A good comment with links from last year:
And the article nicely summarises everything I loathe about AV1 and Baidu. I can only hope VVC or EVC fixes the problems we have, or if not, let's hope they do better with AV2.
P.S. I am still not convinced Apple is fully on board with the Alliance for Open Media.
Twitch will be doing exactly that. They're using VP9 now:
They will be moving to AV1 in the future. They've contributed features to AV1 specifically for the low latency live streaming use case:
This is extremely unlikely. But a game changer if true.
https://aomediacodec.github.io/av1-spec/av1-spec.pdf is the most authoritative source.
"All superblocks within a frame are the same size and are square. The superblocks may be 128x128 luma samples or 64x64 luma samples. A superblock may contain 1 or 2 or 4 mode info blocks, or may be bisected in each direction to create 4 sub-blocks, which may themselves be further subpartitioned, forming the block quadtree."
"use_128x128_superblock, when equal to 1, indicates that superblocks contain 128x128 luma samples. When equal to 0, it indicates that superblocks contain 64x64 luma samples. (The number of contained chroma samples depends on subsampling_x and subsampling_y.)"
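As a rough illustration of what the two quoted passages imply, the sketch below counts how many superblocks cover a frame for each setting of `use_128x128_superblock`. The helper name `superblock_grid` is hypothetical, and the assumption that partial superblocks at the right and bottom edges each count as one is a simplification for illustration; the spec remains the authority.

```python
from typing import Tuple


def superblock_grid(width: int, height: int,
                    use_128x128_superblock: bool) -> Tuple[int, int]:
    """Return (columns, rows) of superblocks covering a frame of luma samples.

    Superblocks are square and all the same size within a frame:
    128x128 when the flag is set, 64x64 otherwise. Frame dimensions
    that are not a multiple of the size are rounded up here.
    """
    if width <= 0 or height <= 0:
        raise ValueError("frame dimensions must be positive")
    sb_size = 128 if use_128x128_superblock else 64
    cols = (width + sb_size - 1) // sb_size   # integer ceiling division
    rows = (height + sb_size - 1) // sb_size
    return cols, rows


for flag in (False, True):
    cols, rows = superblock_grid(1920, 1080, flag)
    size = 128 if flag else 64
    print(f"{size}x{size}: {cols} x {rows} = {cols * rows} superblocks")
```

Each superblock is then the root of the block quadtree described in the first quote, so the choice of size sets the largest block the encoder can pick.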