Microsoft Edge does support AV1, but weirdly only through a Microsoft Store extension [1], even though Chrome has support built in. This really sucks because in practice hardly any normal consumers will bother to install a strangely named extension, so web developers have to assume it's largely unsupported in Edge. Safari ties support to a hardware decoder, which I suppose is understandable to avoid accidental battery drain from using a software codec, and it means that in a few years' time support can generally be relied upon, once enough new hardware is in use. But that won't happen with Edge as things stand!
I think it's high time the web had a single audio and video codec choice that is both open and widely supported, which is why I've proposed support for AV1 and Opus for the Interop 2024 effort [2] [3].
Microsoft Edge USED to support AV1 through the extension, but disabled that support altogether as of v116. The "Can I use" website [1] has the up-to-date information on this.
Treat this as a rumor, but I saw a Microsoft developer on Mastodon hint that they had been having trouble with a patent troll (which may also explain the rather obnoxious extension workaround) and that they were dealing with it.
It's supposed to be royalty free, but there are patents on the techniques used in it, held defensively. If someone has a patent on a technique used in AV1, they could still demand royalties, and some patent trolls have been trying. Wikipedia has a section on it.
Wasn’t part of AV1’s charm that anyone trying to sue an AV1 licensee would be nuked from orbit by the legal expertise and patent trove of the collective AV1 companies / AOM?
Because the US legal system (and many others) are adversarial, people can still sue you on very flimsy grounds and you still have to respond or you lose by default.
It may very well be that the troll doesn't have a solid case, but you don't hook any fish if you don't get your line in the water.
That doesn't work on patent trolls, because all they do is sue people. They have no products of their own to create their own patent liability risk, so nothing keeps them in line. The solution to patent trolls is pretty simple yet impossible: make holding a patent contingent on having a product in the market that uses the patent, with a short grace period of a few years. If you have no product for sale that you can point to after the grace period, you lose the patent. Also make the patent holder at least 50% liable for any product abuse by licensees. Then their own exposure would stop them from patent trolling, as it should be if you are going to have a patent system like this.
Trolls don't like fighting in court, because most of their patents are utter junk. They rely on protection-racket intimidation: the courts are avoided and they get paid to go away.
So while fighting them in court is costly since it requires research of prior art and doing all the necessary work to invalidate those junk patents, it's still often a proper tactic to destroy them.
And AOM has deep pockets to bust them or show how it doesn't apply. It's an alliance for a reason - let them work together then. More than a thousand sounds like complete bs already.
Alas the USPTO is happy to grant patents on anything under the sun, so you'll never know when someone shows up with a patent for "simulating a television broadcast via decompressing stored video on a computer" or something similarly broad.
And searching for applicable patents opens you up to more liability, so you just gotta go in blind and hope for the best.
> ...which is why I've proposed support for AV1 and Opus for the Interop 2024 effort...
A very nice initiative!
Feels like we've been lagging behind on development in this area for some time. And with wider support (in both hardware and software), more actors should be able to enter this field in the future.
I really hope this gets enough traction in the near future!
I don't accept the vague claim of "patent trolls" as enough for Microsoft to bar support of AV1 in Edge.
This is extra bad when you consider what Microsoft does to pressure people into using Edge, despite Edge itself being literally Chromium with features disabled.
Don’t forget an image codec with HDR (10-bit) support!
Jesus wept, this is just a still frame from any of: h.265, AV1, or VP9. Those, or a JPEG XL file.
None of these work in general, especially outside of the Apple moat.
Oh, sure, you can buy a $4000 flagship Nikon camera that can output HDR files in HEIF format, but they won’t open on Windows and look garbled on Apple devices.
This is so stupid now that the iOS version of Adobe Lightroom can edit HDR photos and export them in three formats…
none of which can be viewed as HDR on any iOS device! I've never seen anything like it: software that is this fundamentally incompatible with the only OS it runs on!
I go on this rant approximately annually. It’s been about a decade. I expect to be stuck using SDR JPG for another decade at this rate.
> Safari ties support to a hardware decoder, which I suppose is understandable to avoid accidental battery drain from using a software codec
I still have a pretty deep dislike for Google for turning on AV1 for everyone in Chromium. It’s the ultimate “fuck you, I care more about my bottom line than your user experience”.
Edit: and clown HN rears its head again. I guess cutting users' battery life to a third is worth it as long as Google saves a little bandwidth?
Doesn't the Media Capabilities API [1] provide a way to determine if a codec is "power efficient" (presumably meaning hardware supported)? So then you can switch to another codec if AV1 isn't hardware-supported.
What I'm saying is in an ideal world web content should be able to detect whether some codecs are not hardware-accelerated, and so such workarounds should not be necessary. Of course, lots of naive web content might just check if it's supported and use it anyway... but surely the big sites like YouTube get this right?
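For illustration, here's a minimal TypeScript sketch of that kind of check, falling back from AV1 to VP9 when the decoder isn't reported as power-efficient; the codec string, resolution and bitrate are made-up example values:

    // Query the Media Capabilities API and prefer AV1 only when decode is
    // reported as power-efficient (i.e. most likely hardware-backed).
    async function pickCodec(): Promise<"av1" | "vp9"> {
      const av1Config: MediaDecodingConfiguration = {
        type: "media-source",
        video: {
          contentType: 'video/mp4; codecs="av01.0.08M.08"', // example AV1 codec string
          width: 1920,
          height: 1080,
          bitrate: 4_000_000, // bits per second, example value
          framerate: 30,
        },
      };
      const info = await navigator.mediaCapabilities.decodingInfo(av1Config);
      // info.supported: playable at all; info.powerEfficient: roughly "hardware decode"
      return info.supported && info.powerEfficient ? "av1" : "vp9";
    }

    pickCodec().then((codec) => console.log(`serving ${codec} stream`));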
Software decode has its uses - if you just want a small GIF-style looping clip, hardware support doesn't matter much, and it's nice to have one codec that can be relied upon to work everywhere.
How much silicon does it take to add an AV1 decoder to a chip? The areas Apple highlighted in their A17 release looked pretty substantial, but I wasn't sure if they were to scale.
I'm pretty sure that a video codec ASIC would share some building blocks across codecs, with per-codec parameterization, so I don't think Apple literally added a single "AV1 box" to the A17/M3 die.
I'm sure the usual suspects (Synopsys, Broadcom, ARM, Xilinx, etc.) would be happy to license something. But from what I can see all the big players make their own. I guess they're easy enough to implement yourself (as a big player) and important enough to not want to leave it in the hands of a third party.
There are also likely opportunities for additional efficiencies when you make a custom {en,de}coder for your system. I suspect (but haven't confirmed) that the typical Intel/AMD/Nvidia/Apple multi-function media engine isn't just a collection of completely independent encoder/decoder blocks for each codec but a kind of simplified specialized microcoded CPU with a collection of fixed-function blocks which can be shared between different codecs. So it could have blocks that do RGB->YUV conversion, Discrete Cosine Transforms, etc. and you can use the same DCT block for AV1, HEVC, and AVC. Maybe you can also create specialized efficient ways to transfer frames back and forth with the GPU, for sharing cache with the GPU, etc.
My understanding (now several years out of date) is that Arm decided not to pursue licensing a design, because every customer they talked to had requirements that were so different that they would have essentially been one-offs that each required significant additional engineering. I cannot speak to the others you mention.
I believe the team we worked with at Arm during AV1 standardization is no longer there, which is too bad. They were really great guys to work with.
Your suspicion is mostly correct, though obviously you cannot share too much of the DCTs as these must be bit-exact and are different for each of the standards. But especially things like the compressed tile cache for reference frames used in motion compensation are extremely complicated (to save memory bandwidth and power) and entirely shareable. The SRAM used for line buffers is also a lot of area and shareable. And so on.
It is a pure marketing slide. The M3 variants' floorplans don't look anything like that, as can be seen in other pictures of the dies.
That being said, it's pretty common to have dedicated silicon for video codecs. It normally takes the form of a little DSP with custom instructions to accelerate operations specific to the codec.
I agree that is how I'd expect it to be implemented, but I'm not sure how small it would be given the processing bandwidth we are talking about for 4k video.
I'm guessing this is a distinct region of the chip and not integrated with CPU/GPU since they scale up by replicating those blocks and wouldn't want to redundantly place that hardware. Having it separate also allows a team to work on it independently.
I think the relative size of the media engines is accurate in that slide, so then it comes down to how large the ProRes parts are in other chips. They are probably a couple of the unlabeled regions next to the performance cores in the M1 Pro die shot below, but I don't know which.
A GPU is also not a monolith. As you say, there are some functions that scale with the number of compute units, but others don't need to (e.g. display scan-out controllers); it would accordingly make sense to make the video coding functions part of the latter.
And video decoding/encoding is definitely at least GPU-adjacent, since it usually also involves scaling, color space transformations etc.
I've seen it. Sometimes because the DSP is hyperspecialized for a particular codec. Sometimes just because the SoC vendor bought the hard codecs from different sub vendors.
You can either have a fully dedicated core for codecs, or you can just put certain codec related operations (like DCT-related SIMD) in your main cores. Cryptographic acceleration tends to use the latter approach.
Video codecs usually come with GPUs, not CPUs. It is only on SoCs that this distinction gets fuzzier.
On a GPU you didn't have the option to interleave a normal program stream with specialized partial-decoding instructions: you put an encoded frame in and got a decoded frame back, and the media engine was a separate block from compute.
Though this is also changing; see Intel GuC firmware, which has (optionally) some decoding, encoding and processing based on compute.
According to the article, it's supported by every browser except Edge. It will be interesting to see who ultimately ends up making a better IE, Safari or Microsoft. So far, it seems Safari is winning, given the ever growing set of standards they don't support, but maybe this is Edge trying to catch up?
Which seems to be claiming that the software fallback has been suddenly yanked, temporarily breaking YouTube (which fixed it by serving VP9 instead), though maybe AV1 hardware decode still works?
Most of those that were discussed have been implemented. The new list is here [1].
What you see is that they implement different features, with Google obviously wanting the browser to be a full-blown OS. So they added things like MIDI support and letting web pages access the battery.
The problem with many of those features is that they have been shown by researchers to either (a) be insecure or (b) allow web pages to uniquely fingerprint your device. This is obviously anathema to everything Apple believes in.
Nothing that is relevant. I use Safari across all my devices and I have never had an issue, besides very specific things like flashing microcontrollers through a web interface, which I had to use Chrome for. That's basically irrelevant for the entire world. Safari is mega efficient and fast, and it actually cares about your privacy.
What is a "real" web browser? One would assume its an application that takes web content and parses it for user display so they can use it to browse the web...so Edge and Safari would surely be real, right?
I would add that broader AV1 support is also good news for low-latency applications like cloud gaming or streaming to VR headsets. HW-accelerated encoding/decoding improves upon HEVC on every metric, so you can get near real-time streaming with higher quality and lower bandwidth requirements.
This doesn't match my experience playing on Quest 3. Certainly I see better quality for lower bandwidth, but the client-side decoding actually adds more latency by the point it looks better than HEVC.
Color me the skeptic here, but which benchmark(s) are we talking about? Even h264 vs h265 is not a settled matter - if we truly consider every possible metric, including e.g. SW encoding.
Edit: resolution on that graph image is terrible but they've been sharing it for a while in slide decks so you can probably find better quality by following links from here:
SVT-AV1 still has a lot of visual quality issues. PSNR, SSIM and VMAF are useful metrics, but optimising for these won't get you the best encoder.
x264 didn't get its reputation for going after PSNR and SSIM.
The subjective results in testing follow a similar pattern though. Even with variations between the metrics and subjective scores, there's not really enough wiggle room for it to bridge the gap:
They had some psy optimisations that introduced "false" "detail" that the eye liked but metrics didn't.
Kind of like what AV1 does with film grain or various audio codecs do with filler audio that's roughly the right "texture" even if not accurate to the original signal.
edit: this is on top of the basics all working fast and well. You could argue that many competitors "overfit" to metrics and they had the wisdom or correct organisational incentives to avoid this.
I'm curious why it isn't a settled matter. In my experience, H265 files tend to strike really nice compression ratios (in the same vein, after an AV1 transcode, I'm typically left gobsmacked).
(Or were you talking more about latency? In that case I have to defer to someone with more knowledge.)
h265 has about 20% lower bitrate than h264 at a very similar perceptible quality, but encoding several variants (adaptive streaming) quickly becomes more taxing on the hardware, and support for decoding h264 in hardware is both more ubiquitous and less expensive. As a concrete example, the 2018 generation of Amazon Fire TV sticks supports h265 but gets really hot, so when an adaptive stream offers both h264 and h265, the Fire TV picks the former. We were experimenting with detecting Fire TV serverside to give it a h265-only HLS manifest (the cost savings on the CDN would be sweet), but ultimately decided against it - the device manufacturer probably had a legitimate reason, be it stability or longevity.
I don't quite understand the industry push for AV1. I appreciate that it's patent-unencumbered, but it makes very little sense from business perspective, as you still need to support h264 and/or h265 for devices that can't decode av1 in hardware (and let's agree that forcing software decoding for video should be criminal). So you add a third codec variant (across several quality tiers) to your stack, cost per minute (encode, storage) goes up, engineering/QA effort goes up... Where's the value? Hence my original question, is AV1 really that much better to justify all that?
Adopting AV1 isn't urgent, but it's a good long-term move. The sooner we implement hardware support in new chipsets, the sooner it will become as ubiquitous as H.264 is today.
As for the business perspective, major streaming services pay major dollars for transferring data. You could probably pay for every part of the AV1 project multiple times over with the money saved by a moderate lowering of Netflix's outbound data.
Keep in mind that standards move slowly, and codec standards even more so. The gold standard is still h264/AVC, which was finalized back in 2003. This is primarily because many appliances (set-top boxes, cameras, phones, TVs) use the absolute cheapest hardware stack they can get their hands on.
Compared to other standards in streaming media, I'd say that AOMedia has found adoption a lot quicker. h265 (HEVC) was all but DoA until, years after its introduction, Apple finally decided to embrace it. It is still by no means ubiquitous, mostly due to patent licensing, which significantly drives up the price of hardware in the single-digit-dollar price range.
Anecdotally, consider that Apple's HTTP Live Streaming protocol (until version 6) relied on MPEG2TS, even though Apple laid the groundwork for ISO/IEC 14496-12 Base Media File Format, aka MP4. The reason was that the chips in the initial iPhones only had support for h264 in mpeg2 transport streams, and even mp4 -> mp2 transmuxing was considered too resource intensive.
> h265 (HEVC) was all but DoA until, years after its introduction, Apple finally decided to embrace it
No? You're talking in terms of PC / phone hardware support only. HEVC was first released June 7, 2013. The UHD Blu-ray standard was released less than 3 years later on February 14, 2016 - and it was obvious to everyone in the intervening years that UHD Blu-ray would use HEVC because it needed support for 4k and HDR, both of which HEVC was specifically designed to be compatible with. (Wikipedia says licensing for UHD Blu-ray on the basis of a released spec began in mid 2015.)
>Anecdotally, consider that Apple's HTTP Live Streaming protocol (till version 6) relied on MPEG2TS
This sounds like you might be confusing MPEG2TS with the video encoding, when it is solely the way the video/audio elementary streams are wrapped together into a single container format. The transport stream was designed specifically for an unreliable streaming type of delivery vs a steady, consistent type of source like reading from a dis[c|k]. There is nothing wrong with using a TS stream for HLS that makes it inferior.
> There is nothing wrong with using a TS stream for HLS that makes it inferior.
Not wrong, but a bit surprising. As you mention, transport streams are designed to operate over unreliable connections (like satellite or terrestrial transmission). Reliability is not an issue with HTTP or TCP.
Other than being archaic, some disadvantages are that TS has somewhat more overhead than (f)MP4 and poor ergonomics for random access / on-the-fly repackaging, caused by the continuity counter, padding, PAT/PMT resubmission timing, and the PCR clock.
If it were up to me to design a streaming protocol like DASH, HLS, Flash, or SmoothStreaming I'd instantly choose mp4 (or plain elementary streams). I wouldn't even consider TS or PS unless some spec forced me to.
>Reliability is not an issue with HTTP or TCP.
We seem to be confusing what reliable means here. Yes, HTTP/TCP can reliably transmit data, in the sense that if packets are missed they will be resent, so you can be assured the data will eventually be delivered. However, that doesn't work well for real-time streaming of data that needs to be received in order and on time. That's why UDP was made.
>If it were up to me to design a streaming protocol like DASH, HLS, Flash, or SmoothStreaming I'd instantly choose mp4 (or plain elementary streams). I wouldn't even consider TS or PS unless some spec forced me to.
Well, it's a good thing we didn't have to wait for you to come around and design a streaming protocol, and that we've been able to use one for the past ~20 years with the technology that was available at the time. Perfect is the enemy of progress.
What I meant to point out as odd about Roger Pantos and co.'s decision to build HLS on top of Transport Stream containers is that Apple had already laid the foundation for MP4 with QuickTime.
Since HTTP live streaming was never about anything but HTTP, container capabilities like auto-synchronization offered by mpeg2TS were moot. It would therefore seem logical for Apple to build HLS upon what they already had with QuickTime + iso2 fragments. That was more or less the route Adobe/Macromedia had taken with Flash streaming.
Yet they chose mpeg2TS (initially only muxing AAC and h264). The reason historically seems to have been driven primarily by the capabilities of the iPhone hardware, which supported this out of the box! Separate transport streams for audio and video, WebVTT, and elementary stream audio were added much later, and fragmented MP4 was introduced only once HEVC was bolted on.
I'm all for favouring what exists over what's perfect; it's just odd that Apple chose to (initially, at least) regress to 90s technology while the rest of the world had already adopted a superior container.
> Compared to other standards in streaming media, I'd say that AOMedia has found adoption a lot quicker. h265 (HEVC) was all but DoA until, years after its introduction, Apple finally decided to embrace it.
Was it all but dead because people thought h264 was good enough, until 2.5K and 4K became more prominent in media consumption? It seems really useful if you're encoding at resolutions higher than 1080p, and it makes me less regretful that I have a bunch of recent hardware that didn't get AV1 hardware support :)
> Was it all but dead because people thought h264 was good enough, until 2.5K and 4K became more prominent in media consumption?
(Lack of) a compelling use case certainly played a role. Sure, reducing bandwidth is in itself a noble (and potentially profitable) goal, but why fix it if it ain't broken? I.e., the h264 infrastructure for HD was already there, working fine.
Another factor was that HEVC was full of new patents and the patent pool licensing costs hadn't yet settled.
memcpy is ±99% of it. The other 1% involves bit shifting (NAL unit reshuffling, avc3/avc1 fixups).
So indeed, repackaging mp4 <-> mp2 containers is pretty trivial. Nevertheless, Apple initially chose mpeg2TS because it conveniently allowed them to shove the reassembled media segments straight into the dedicated AV chip.
The H265 patent licensing situation is famously a mess and has been a big barrier to adoption. Except in circles where people worry less about that sort of thing: Warez, China, ...
The licensing shenanigans of H265 was a big motivator for creating AV1, a royalty free codec.
The person who benefits from a more efficient codec tends to be netflix/youtube (lower bandwidth costs), and they are far far removed from the chipmaker - market forces get very weak at that distance.
People never stopped using VP8. In fact, your screen sharing is probably wasting excessive amounts of CPU every day because there is no hardware support.
AV1 isn't particularly behind schedule compared with previous codec generations. We could and should have moved faster if everything had gone well, but Qualcomm in particular were being awkward about IP issues.
Luckily the effort behind the dav1d software decoder kept up the rollout momentum.
The AV1 ASIC takes up space on the SoC, so effectively it decreases the performance of other parts. This could be why some manufacturers have delayed including support for quite a while. Though Mediatek already had AV1 support three years ago.
Especially when Google and AOM promised a hardware encoder and decoder to be given out for free by 2018, implementation in many SoCs by 2019, and wide availability plus AV2 by 2020.
Well, the basic answer is that making an efficient hardware encoder and decoder, within the power budget and die space, all while conforming to the standard because you won't get much of a chance to correct it, and fitting it into the SoC design cycle, which is and always has been at least three years, is a lot harder than most software engineers at Google and AOM would have thought.
Apple has a tiny sliver of the patents in HEVC, and while we don't have the numbers I feel pretty certain they pay far more into the pool to ship HEVC in their devices than they get out of it. The same is doubly true of Qualcomm who aren't even a part of the pool.
HEVC was finalized in 2013. AV1 was finalized in 2018, and has just finally started getting a robust ecosystem of software and hardware.
It wasn't really all that slow in general, just slow on dedicated streaming hardware.
Basically, it was the push to 4K (and especially HDR) that caused HEVC to roll-out. In 2016 4K Blu-rays started coming out and they were all HEVC 10-bit encoded. It took a couple more years before dedicated streaming devices and lower-end smart TVs bothered to start including HEVC support as standard because at first 4K content was uncommon and the hardware came at a premium.
Now that it's mostly the de-facto standard, we see HEVC support in basically all streaming devices and smart TVs.
AV1 didn't have any sort of resolution change or video standard change to help push it out the way HEVC did, so it's basically rolling out as the parts get cheaper due to pressure from streaming giants like Google and Netflix rather than due to a bottom-up market demand for 4K support.
I didn't know about Blu-Ray being relatively prompt. But I still think HEVC adoption was slow in broadcast TV, which I would have thought was a shoo-in market.
Qualcomm isn't even a part of the HEVC alliance patent pool, so that theory doesn't hold. Indeed, the fact that Qualcomm is currently building AV1 support into their next chip (purportedly) puts them at risk of being sued because while AV1 is open, we all know how patents work and there are almost certainly actionable overlaps with the pool.
Apple ships probably more devices than anyone, and given that the patent pool is huge as mentioned odds are overwhelmingly that it costs them money to support HEVC / HEIC, not the reverse. That theory also is dubious.
Remember when everyone was yammering for VP8 support? Then it was VP9 support? Now it's AV1. Sometimes it takes a while to shake out. By all appearances AV1 is a winner, hence why it's finally getting support.
>Apple ships probably more devices than anyone, and given that the patent pool is huge as mentioned odds are overwhelmingly that it costs them money to support HEVC / HEIC, not the reverse. That theory also is dubious.
This is a nit that doesn't negate your main point: Apple may ship more complete devices than anyone, but Qualcomm makes up significantly more of the SoC manufacturer market share[1] at 29% vs Apple's 19%
Netflix (and Youtube? I forget) will push an AV1 stream if you have the support. This was even mentioned in Apple's show yesterday. So the egg is already there and the chicken is slowly coming, thankfully.
YouTube was the first to support it. They even went to war with Roku over it: Roku killed the YouTube TV app in retaliation for YouTube's mandate that all next-gen devices support AV1, so YouTube went ahead and embedded it inside the regular YouTube app.
Roku's latest devices do support AV1, so I guess either the price came down, they struck a deal, or Roku just lost to the market pressure after Netflix pushed for AV1 as well.
I think a lot of content creators really want AV1 because of the drastic reduction of file sizes. Streaming companies want it to catch on because of the drastic reduction in bandwidth use.
I thought Google was the main one behind AV1. Couldn’t they use their position as one of the world’s biggest video platforms to break that chicken egg loop?
They have. They literally threatened to pull their support from devices if they didn't implement the codec in hardware. Roku's spat with Google was a big-ish story when that happened.
I don't know how that can be viewed as a good thing.
Is it really _that_ hard to create a generic video decoding DSP whose firmware could be updated? Most codecs are very similar to each other. IIRC Texas Instruments used a multicore DSP to decode MPEG back in the 90s.
Or maybe we should have written codecs to be amenable towards GPU shader implementation...
The problem with a generic codec DSP is how fast do you make it? Newer codecs often require twice as much computation as older ones, so do you make the DSP twice as fast as you need today and hope that's enough to run future codecs? Meanwhile you're wasting transistors on a DSP that won't be fully used until years later.
To some extent the PS3 did this; the Cell SPEs were fast enough to implement Blu-ray and streaming video playback in software and they made several updates over the life of the PS3.
> Or maybe we should have written codecs to be amenable towards GPU shader implementation...
They are, but programmable GPU shaders are nearly always more power-expensive than fixed-function, purpose-specific silicon. It's why many key aspects of GPUs are still fixed function, in fact, including triangle rasterization and texture sampling/filtering.
I think, at least, that one of the biggest use cases for encode is game streamers (is this right?). They should have decent dGPUs anyway, so their iGPU is just sitting there.
Elemental wrote a GPU shader h264 encoder for the Radeon 5870 back in the day, marketed towards broadcasters who needed quality and throughput: https://www.anandtech.com/show/2586
Intel used to write hybrid encoders (that used some fixed function and some iGPU shader) for their older iGPUs.
So the answer is yes... if you can fund the right group. But video encoders are hard. The kind of crack developer teams that can pull this off don't grow on trees.
Shaders have little benefit for anything with "compression" in the name. (De)compression is maximally serial/unpredictable because if any of it is predictable, it's not compressed enough.
People used to want to write them because they thought GPU=fast and shaders=GPU, but this is just evidence that almost no one knows how to write a video codec.
The Elemental encoder was supposedly quite good, but it was a different time. There was no hardware encoding, and high res realtime h264 was difficult.
That's not really true; the motion estimation stage is highly parallel. Intel's wavefront-parallel GPU motion estimation was really cool. Real world compression algorithms are nowhere close to optimal partly because it's worth trading off a little compression ratio to make the algorithm parallel.
IIRC x264 does have a lookahead motion estimation that can run on the GPU, but I wasn't sure I could explain this properly.
That said, I disagree because while motion estimation is parallel, motion coding is not because it has to be "rate distortion optimal" (depending on your quality/speed tradeoff.) So finding the best motion for a block depends on what the entropy coder state was after the last block, because it can save a lot of bits by coding inaccurate/biased motion.
That's why x264 and ffmpeg use per-frame CPU threading instead (which I wrote the decoding side of) because the entropy coder resets across frames.
We have to make a distinction between the standard and implementations. AV1 has open source implementations which achieve really high compression. HEVC also has implementations which achieve really high compression, and do so faster than AV1, but all the good ones are paid, like MainConcept. [1] The open source HEVC implementations (i.e. x265) are unfortunately quite weak compared to their AV1 counterparts and do not achieve comparable compression.
So yes, the answer depends most on whether you care about licensing. Both in terms of royalities and also implementations.
Yes x265 is not in the conversation for even top 5 HEVC encoders, regardless of presets. [1] It can be especially surprising because x264 is easily the best AVC encoder. For whatever reason (patents, lack of browser support etc) there just hasn't been as much engineering effort put into x265.
Now things like NVENC are even worse in terms of compression. Any GPU accelerated encoder trades compression efficiency for speed. Even x265 with the slowest presets will demolish any GPU encoder in terms of compression, including the MainConcept paid one when it's run in GPU-accelerated mode. This is unfortunately not explained in GUIs like Adobe tools. They just have a checkbox or dropdown to select GPU acceleration, but don't mention that it's not just acceleration - a different algorithm will be used that can't achieve the best compression.
GPU accelerated compression can still be very useful for scenarios where you need speed (e.g. have a deadline) or just don't care about quality (e.g. will publish it only on social media where it will be recompressed anyway). However when you have time to wait and want top quality, the slowest CPU-only code path will always win.
--
[1] One good public resource is the Moscow State University page http://www.compression.ru/video/codec_comparison/index_en.ht... -- They do regular comparisons of various codecs and some results are available for free. A bit in HTML form and more in some of the PDFs. Deeper insights are unfortunately paid.
"GPU encoders" don't use the GPU themselves (shader cores), they use another tiny CPU attached to the GPU. GPGPU does not really help with compression much.
HEVC is great for giving you quality in a smaller file size, for your local video files (due to being supported by more cameras and phones so far). AV1 has wider support for streaming video online at a higher quality (due to more browser support so far). So, at the moment, they are really best in two different use cases (subject to change, of course).
> HEVC is great for giving you quality in a smaller file size, for your local video files.
This is a huge plus in the fansub anime scene. 10 years ago, the majority of anime was in H264 (720p/1080p 8-bit), normally around 1 GB for each 25-minute episode. If I wanted to watch one series, it would consume about 20 GB of space. Now the majority are in HEVC (1080p 10-bit), at about 300 MB per episode.
I don't think you can reduce the comparison to "better than," but AV1 certainly compresses to smaller files than HEVC. Maybe that's a better fit for your use case.
I think it makes sense for companies to start with decode though. That hits pretty much 100% of users--everyone watches video.
But only a small fraction of users actually create content and need accelerated encode. And Apple especially I think is unlikely to use AV1 for their video recording, given their investment in other formats for that use-case.
> And Apple especially I think is unlikely to use AV1 for their video recording, given their investment in other formats for that use-case.
I concur. The raison d'être for AV1 is (lack of) patent license royalties. These apply to user devices as well as services. Think Google: Android as well as YouTube cost fortunes in AVC/HEVC licenses, so here AV1 makes sense.
On the other hand, Apple sells expensive hardware and has no problem ponying up for those licenses. Soon after adopting HEVC they doubled down with Dolby Vision, which technically adds very little on top of the standard HDR features already available in HEVC and AVC but presents serious interop problems, all so that devices can come with shiny Dolby stickers.
Plus, unless you are streaming or producing a ton of video, most users can afford to wait a bit for software encoding (which is often better quality as well). So encoding is far less important than decoding.
As far as I can tell, Apple has always only supported decoding for non-MPEG codecs.
And their encoders (at least on macOS in the past) usually don’t yield results comparable to software or dedicated quality-optimized encoding ASICs, so if I wanted high quality at low bitrates I’d have to reencode offline anyway.
It would be nice to have it available for video conferencing or game streaming, though.
It's in the same ballpark. Both do considerably better than AVC (h264), but many direct comparisons between HEVC (h265) and AV1 compare apples to oranges. Sure, you can get 30% lower bitrate, but only at degraded quality levels or higher decode complexity.
Also note that HEVC had a considerable head start (5 years?), so performant encoders (or even energy-efficient decoders) took a while to catch up. Recent ffmpeg versions offer a lot of options; you'll find that even a basic comparison is PhD-level difficult ;-)
> Sure you can get 30% lower bitrate, but only at degraded quality levels or higher decode complexity.
Thank you for pointing this out. This thread is a mess of claims at the moment because this simple fact is under-recognized.
There are two accepted ways to compare codecs+settings: either (a) you perform a subjective comparison with the human eye using the same bitrate for both codecs, or (b) perform an "objective" metrics-based comparison where you match measured quality and compare the ratio of the bitrates.
If you're looking only at 1080p SDR 8-bit video, even h264 is already commonly used at bitrates that can approach transparency to the source (visually lossless to the human eye) when encoded well. For example, a typical Blu-ray bitrate of ~30 Mbps can achieve transparency when well-encoded for most sources.
The reason measures like "30%" are misleading is that if you try to match h264 performance at these high bitrates, you won't get anything close to 30% improvement (with HEVC over h264, or AV1 over HEVC). It can be negligible in a lot of cases. In other words, the improvement ratio from increasing the complexity of your media codec depends on the target quality of the encodes used in the test.
AV1 achieves significant improvements ("30%") over HEVC only at the lowest qualities, think YouTube or Twitch streaming. At high bitrates, e.g. something acceptable for watching a movie, the improvement can be much less or even insignificant, and at near-transparency a lot of AV1 encoders actually seem to introduce artifacts that are hard to eliminate. AV1 seems heavily optimized for the typical streaming range of bitrates, and claims about its supposed improvement over HEVC need to be understood in that context.
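To make approach (b) above concrete, here is a toy TypeScript sketch that interpolates each codec's rate-quality samples to a common quality target and compares bitrates; the data points are invented for illustration, and real comparisons use something like BD-rate over the whole curve rather than a single point:

    // Toy "match quality, compare bitrate" comparison from (bitrate, VMAF) samples.
    type RatePoint = { kbps: number; vmaf: number };

    // Linearly interpolate the bitrate needed to hit a target quality.
    function bitrateAtQuality(points: RatePoint[], target: number): number {
      const pts = [...points].sort((a, b) => a.vmaf - b.vmaf);
      for (let i = 1; i < pts.length; i++) {
        const lo = pts[i - 1], hi = pts[i];
        if (target >= lo.vmaf && target <= hi.vmaf) {
          const t = (target - lo.vmaf) / (hi.vmaf - lo.vmaf);
          return lo.kbps + t * (hi.kbps - lo.kbps);
        }
      }
      throw new Error("target quality outside measured range");
    }

    // Invented example measurements; replace with your own encodes.
    const hevc: RatePoint[] = [{ kbps: 2000, vmaf: 88 }, { kbps: 4000, vmaf: 93 }, { kbps: 8000, vmaf: 96 }];
    const av1: RatePoint[]  = [{ kbps: 1500, vmaf: 88 }, { kbps: 3000, vmaf: 93 }, { kbps: 6500, vmaf: 96 }];

    const target = 93;
    const saving = 1 - bitrateAtQuality(av1, target) / bitrateAtQuality(hevc, target);
    console.log(`Bitrate saving at VMAF ${target}: ${(saving * 100).toFixed(1)}%`);

Note how the saving you report depends entirely on which quality target you pick, which is exactly the point being made above.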
Depends on the encoder; this website provides easy-to-visualize data sets for various encoders at various settings:
https://arewecompressedyet.com/
AV1 encoders tend to have better VMAF score at a given bits-per-pixel.
Wow that was unexpected. I checked online and it does say production encoders are faster and the result is somewhat smaller (for same quality). What a time to be alive.
AV1 is more complex. Prediction can be more complex with things like "combined inter/intra" or "warped motion" or "overlapping block MC" compared to HEVC/VP9. Then there's additional postfilters like loop restoration, cdef and film grain that didn't exist in VP9 (just deblock - which also exists in AV1) or HEVC (deblock + sao). Entropy coding is more expensive than VP9 with per-symbol entropy updates (which HEVC also has). Bigger blocks are probably an overall win, but bigger transforms can be painful with many non-zero coefficients. And intermediates in 2D MC and bi-directional ("compound") prediction are 12-bit instead of 8-bit in VP9 for SDR/8bit video. This is more similar to HEVC. So overall, AV1 > HEVC > VP9 in terms of runtime complexity, this is expected, nothing you can do about it.
It looks like dav1d is (or was 2 years ago, maybe it's even faster now) on par with ffmpeg's hevc decoder in that aspect: https://youtu.be/wkZ4KfZ7x1k?t=656
You're right about VP9 though, definitely faster to decode, though there are trade-offs (video quality, encoding performance) when compared to HEVC.
I wonder if it's just because it's new. I remember years ago the first 1.x versions of libopus used way more CPU than Vorbis to decode, now they're comparable. (This was on a teeny little chip where decoding 1 Vorbis stream took a measurable amount of CPU)
I love AV1 for compressing my movies to 720p. I also convert any audio to Opus. I get to archive a ton of content on peanuts. The videos look great on my PC monitor or my phone and they come in at around 452MB (1hr 24m video).
Here's my script if you're interested in trying it out on your content.
And I just invoke it against a folder to recursively convert stuff.
.\av1-convert.ps1 -sourceDir 'D:\Movies to convert\'
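(The script itself isn't reproduced here, so as a rough sketch of the same idea: a recursive convert-to-720p AV1 + Opus pass in TypeScript/Node, assuming ffmpeg built with libsvtav1 and libopus is on PATH; the CRF/preset/bitrate values are illustrative, not the commenter's actual settings.)

    // Sketch of a recursive AV1 + Opus converter (not the original av1-convert.ps1).
    import { execFileSync } from "node:child_process";
    import { readdirSync, statSync } from "node:fs";
    import { join, extname } from "node:path";

    const VIDEO_EXTS = new Set([".mkv", ".mp4", ".avi", ".mov"]);

    // Recursively yield video files under a directory.
    function* walk(dir: string): Generator<string> {
      for (const name of readdirSync(dir)) {
        const full = join(dir, name);
        if (statSync(full).isDirectory()) yield* walk(full);
        else if (VIDEO_EXTS.has(extname(full).toLowerCase())) yield full;
      }
    }

    // Re-encode one file: downscale to 720p, SVT-AV1 video, Opus audio.
    function convert(input: string): void {
      const output = input.slice(0, -extname(input).length) + ".av1.mkv";
      execFileSync("ffmpeg", [
        "-i", input,
        "-vf", "scale=-2:720",                              // 720p, keep aspect ratio
        "-c:v", "libsvtav1", "-crf", "32", "-preset", "6",  // illustrative quality/speed trade-off
        "-c:a", "libopus", "-b:a", "96k",
        output,
      ], { stdio: "inherit" });
    }

    for (const file of walk(process.argv[2] ?? ".")) convert(file);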
As soon as there's something like an Nvidia Shield that can decode AV1, I will replace both of my Shields. So far nothing like that exists to my knowledge. Even Roku 4K Pros list "AV1" support in their specs, but they still trigger transcoding in Plex during playback.
As someone who also has a 2019 Shield TV Pro and is waiting for the "next best thing", one resource I've been keeping my eye on is androidtv-guide.com:
Wait, you convert to 720p and want to play that on a Shield Pro type of device? This might be ok in your current setup, but as soon as you upgrade the panel to 1080p or 2160p, you would want the source to be at the same resolution, or better.
AV1 hardware support is great and all, but what streaming services actually support it? Twitch did a pilot back in 2020 and the video quality was fantastic. They still haven't rolled it out.
Possibly worth noting that the encoding speeds for AV1 have improved out of all recognition over the past few years. Depending on what Twitch were using, even back in 2020 it might have been orders of magnitude slower than encoding other relatively modern formats. Today that is no longer true and in some circumstances encoding AV1 seems to be faster than almost anything else since H264. So if hardware decoding is also improving, it’s certainly possible that more services could potentially use AV1 now or in the near future.
(Source for the above is just some personal experiments. I happened to be doing a load of benchmarking with video codecs this weekend, as we’re hoping to start using AV1 for a website I run.)
Is this the new hardware treadmill? Every few years we need to switch codecs for minor bandwidth/quality gains to nudge us to buy new hardware that supports it?
Compared to H.264/AVC, you can get the same level of video quality in ~half the bandwidth (or double your quality for the same bandwidth).
Compared to H.265/HEVC, AV1 is royalty-free, so anyone can implement it without worrying about licensing (for HEVC there seem to be 3+ groups that need to be paid off).
The trade-off is that it is more computation intensive than H.264 (as is H.265).
Quite a lot actually. This codec is much more efficient at producing high quality video recording / streaming at much lower than normal bit rates when comparing to x264/5. Epos Vox has a good video describing the benefits: https://youtu.be/sUyiqvNvXiQ
H.264's most likely successor was HEVC. While Google and Mozilla strongly prefer VP8/VP9, most video content distributors are okay with paying the license fee for H.264: one patent pool, one license fee. HEVC's patent pool fragmented, so even after you pay one fee there might be another one, or worse, patent-trolling litigation. So non-broadcast companies are adopting AV1 to avoid HEVC when possible.
[1] https://apps.microsoft.com/detail/av1-video-extension/9MVZQV...
[2] https://github.com/web-platform-tests/interop/issues/485
[3] https://github.com/web-platform-tests/interop/issues/484