
H.264 is magic (2016) - dpeck
https://sidbala.com/h-264-is-magic/
======
jwr
The interesting thing about H.264 is that it achieves excellent compression
rates not by a single "miracle trick", but by combining multiple small-gain
techniques. Every technique lets you gain on the order of 1-3%, but when you
combine them all, you get fantastic results.

Another tidbit of (potentially) interesting information: it took many years
for H.264 to achieve its full potential, for two reasons:

1) Implementing a decoder is really, really complicated, especially if you
want it to run fast, and if you think a miracle "hardware decoder" solves the
problem, you need to remember that H.264 places heavy requirements on memory
bandwidth and caching (B frames), so it's not like you can just feed a stream
through a chip and get video frames on the other side.

2) Implementing an encoder is really, really complicated :-) — and there are
many ways to encode a video stream. H.264 delivers the tools, but does not
specify how you should use them. It took many years for encoders to get to a
state where most of H.264 is used efficiently.

~~~
tgtweak
It's a tremendous effort simply to ensure standards compliance. There are so
many knobs and dials on h264 encoders, and so many features in each profile,
that it's hard to appreciate the compatibility we have today across devices -
especially considering how much of it is in silicon.

I remember sponsoring a feature in x264 (the best software encoder) and the
level of complexity in that code is mind-bending. The patch was to allow
better CPU utilization when running with over 16 cores - something the authors
hadn't run into at the time and had no systems to test on. I did a simple
profile during encode, sent them the trace, in 2 days the patch was tested and
working and it hit main in a week's time. Boom, all 48 cores used.

Made me realize just how much talent is behind these technologies and I still
shake my head at how much money is generated with this and how little of that
makes its way back to the developers at the core of it.

~~~
da_chicken
> Made me realize just how much talent is behind these technologies and I
> still shake my head at how much money is generated with this and how little
> of that makes its way back to the developers at the core of it.

If it's any consolation, not much more makes it to the talented developers in
commercial products, either. It mainly ends up going to executives and sales.

~~~
winter_blue
I think in New York and California at least, developers are getting paid
closer to what their work is worth. Salaries are often hitting the $300k
ballpark for highly talented developers. Average pay itself is now around
$150k.

I think it's _primarily_ developers in other countries[1] that are getting
severely underpaid. Hopefully that changes in the coming years.

[1] The most egregious example might be India. Around 7 years ago, I was on a
plane from Kerala (a state in India), and the guy sitting next to me happened
to run a software consulting company (in Kerala). His customers were from
Japan and some other countries (can't remember), and his company wrote Linux
drivers for proprietary hardware. He hired kids right out of college. I asked
him how much he paid them, and he said without blinking, Rs. 15000 per month.
Type in "INR 15000 to USD" to Google, and you're in for a shock. It's around
$2600 per year, but probably closer to $3000 per year at the exchange rates
back then.

American developers doing the same job could earn up to 100 times that, and at
the very least 40 times that amount. I can't understand why on Earth salaries
are 40 to 100 times higher _for the exact same work_, and for work that could
be done by anyone anywhere on the globe, but I hope this massive difference in
pay converges and disappears soon.

~~~
spease
What kind of developers are you referring to? I’ve noticed a huge discrepancy
between companies and industries.

------
bscphil
Interesting introduction. Too bad it doesn't really get into some of the more
complicated things an advanced H.264 encoder like x264 is doing, e.g. adaptive
quantization methods. There's also a mistake or two, for example

> P-frames are frames that will encode a motion vector for each of the macro-
> blocks from the previous frame.

Actually, P-frames can use data from multiple previous frames, not just the
last one.

I think it's worth pointing out, as well, that even though we've technically
surpassed H.264 with newer codecs like H.265, VP9, and AV1, all of which are
roughly 20%-50% more efficient, this has required tremendous increases in
encoding complexity. H.264 is special - it seems to occupy a kind of
inflection point on the complexity / efficiency curve. It's _far_ more
efficient than previous codecs like Xvid, WMV, and so on, but at the same time
even a lot of underpowered devices from over a decade ago can easily play it.
We're not likely to see tradeoffs that good again in the video codec space.

~~~
clouddrover
When you control for image quality, VP9 outperforms H.264 in single-threaded
decoding:

[https://blogs.gnome.org/rbultje/2015/09/28/vp9-encodingdecod...](https://blogs.gnome.org/rbultje/2015/09/28/vp9-encodingdecoding-performance-vs-hevch-264/)

You get the same image quality at a lower bitrate and faster decoding.

VP9 encoding is much slower, but Intel's SVT-VP9 encoder is getting over 300
frames per second on the right hardware:

[https://www.phoronix.com/scan.php?page=news_item&px=SVT-VP9-...](https://www.phoronix.com/scan.php?page=news_item&px=SVT-VP9-Open-Source)

~~~
bscphil
That's an interesting claim. I have to say it doesn't match my experience at
all, e.g. when playing VP9 videos on Youtube. Several thoughts:

* I'm not sure what the "ffh264" decoder they're talking about is. My understanding is that pretty much everyone is using libx264 for both encoding and decoding. Maybe the latter is more performant?

* Using SSIM to pick "equal quality" files will probably not give very accurate results.

* SVT-VP9 is, if I'm not mistaken, a hardware encoder. One of the issues with these is that they require some significant tradeoffs to get better speed, so they'll have worse quality than a software encoder at the same bitrate. If you go with software you really pay the price when it comes to encoding time.

~~~
cornstalks
Some corrections:

ffh264 is FFmpeg's decoder. libx264 only does encoding, not decoding.

SVT-VP9 is a software encoder, specially optimized for Intel Xeon Scalable and
Intel Xeon D processors.

------
mistercow
Overall, this article is a pretty good overview of how lossy image/video
compression works. However, what the author describes as "quantization" is
actually a low-pass filter. With quantization, you do not simply zero out the
high frequency components. You "snap" them to specific intervals. For example,
if you have some data varying between 0 and 100, like [7, 39, 97, 42, 13],
quantizing that by a factor of 5 would give you [5, 40, 95, 40, 15]. This
gives you an approximation of the fine details, rather than simply throwing
them away.
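
To make that concrete, here's a tiny Python sketch of the uniform quantization
described above (my own illustration, not code from any actual encoder - real
codecs quantize DCT coefficients and store the small quotients rather than the
reconstructed values):

    def quantize(values, step):
        """Snap each value to the nearest multiple of `step`."""
        return [round(v / step) * step for v in values]

    data = [7, 39, 97, 42, 13]
    print(quantize(data, 5))   # [5, 40, 95, 40, 15]
    print(quantize(data, 25))  # [0, 50, 100, 50, 25] - coarser step, less detail survives

A bigger quantization step means fewer distinct values, which is what makes the
data cheaper to entropy-code afterwards.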

------
Wowfunhappy
> Let's say you've been playing a video on YouTube. You missed the last few
> seconds of dialog, so you scrub back a few seconds. Have you noticed that it
> doesn't instantly start playing from that timecode you just selected. It
> pauses for a few moments and then plays. It's already buffered those frames
> from the network, since you just played it, so why that pause? Because
> you've asked the decoder to jump to some arbitrary frame, the decoder has to
> redo all the calculations - starting from the nearest I-frames and adding up
> the motion vector deltas to the frame you're on - and this is
> computationally expensive, and hence the brief pause.

Uh, is that actually what's going on there? Because for some reason, seeking
is basically instant when I'm playing locally-saved videos, including ones I
downloaded directly from Youtube and didn't re-encode.

Actually, this is one of the reasons _why_ I download videos before watching
them, instead of using the normal Youtube player.

~~~
sp332
Yeah. VLC for example doesn't have a frame-by-frame backward seek, because
they argue it would blow up the memory consumption to keep all those frames
around just in case you want to jump backward. And since videos are mostly
hardware-decoded on the GPU these days, you'd be wasting all that memory in
your VRAM.

Here's one of many threads where Jean-Baptiste Kempf addresses the feature
request.
[https://forum.videolan.org/viewtopic.php?p=390778&sid=1a571d...](https://forum.videolan.org/viewtopic.php?p=390778&sid=1a571d0ef612114307fdd41229a6278f#p390778)

The amount of pausing is going to vary a lot by video. If you have an I-frame
followed by 250 P-frames, there will probably be a noticeable pause.
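
A rough sketch of the seek cost being described (illustrative Python, not how
any particular player actually implements it - decoding can only start at an
I-frame, so everything from there up to the target has to be decoded):

    def frames_decoded_for_seek(keyframe_positions, target_frame):
        """Count the frames a decoder must process to display `target_frame`."""
        start = max(k for k in keyframe_positions if k <= target_frame)
        return target_frame - start + 1

    # One I-frame followed by 250 P-frames, repeating:
    keyframes = range(0, 10_000, 251)
    print(frames_decoded_for_seek(keyframes, 500))  # 250 frames of work for one seek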

~~~
Wowfunhappy
It has been a _long_ time since I used VLC—I generally use mpv on Windows and
QuickTime X on Mac because I like their minimal UIs—but I remember it being
instant too!

Side note, using VRAM to make seeking instantaneous seems to me like a
perfectly good use of resources, at least if my device has the VRAM to spare.
On my desktop, I have a 1080 Ti, and it mostly sits idle when I'm watching a
video...

~~~
Jaruzel
You may want to revisit VLC. It now has the option for you to run it with a
minimal UI or even with no UI at all (just a video frame), and if you are
brave enough you can also design a custom UI for it.

The only thing I _really_ don't like about VLC is the traffic cone icon, but I
get that's their brand so I have to live with it.

~~~
StavrosK
I'm another mpv user and I switched from VLC to mpv just because mpv is plain
better than VLC. VLC would give me some slowdowns and would hang sometimes,
whereas mpv has always played everything perfectly.

~~~
stockavuryah
But VLC has many, many more options and possibilities, if you need them.

~~~
Wowfunhappy
This is kind of why I don't want to use VLC.

I want my media player to do exactly one thing: open the video or audio files
I tell it to open, and play them. Plus the option to pause, seek, enable
subtitles, or change audio tracks. Anything else is extraneous.

I'm something of a software minimalist. I've spent a lot of time finding and
hiding UI elements on my Jailbroken iPhone. If I don't need something I really
want it out of sight.

------
zeroimpl
Chroma subsampling is a terrible idea in the digital era. You should just
assign a different compression factor to the chroma channels instead. I don't
understand why we keep using this system of discarding 75% of the color data
BEFORE applying the lossy compression algorithm. (Well, I do know of one
argument - apparently it reduces the amount of CPU time required to compress
and decompress the data, but I think that's a pretty lame reason.)
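
For anyone who hasn't seen what 4:2:0 actually does, here's a minimal sketch
(my own illustration): each chroma plane keeps only one sample per 2x2 block
of pixels, which is where the "75% of the color data" figure comes from.

    def subsample_420(chroma, w, h):
        """Keep one chroma sample per 2x2 pixel block (here: simple averaging)."""
        out = []
        for y in range(0, h, 2):
            for x in range(0, w, 2):
                block = [chroma[(y + dy) * w + (x + dx)]
                         for dy in (0, 1) for dx in (0, 1)]
                out.append(sum(block) // 4)
        return out  # len(out) == w * h // 4

    plane = list(range(8 * 8))               # a fake 8x8 chroma plane
    print(len(subsample_420(plane, 8, 8)))   # 16 samples left out of 64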

~~~
Ace17
> You should just assign a different compression factor to the chroma channels
> instead [of subsampling chroma].

The good news is that we don't need to change the H.264 standard for this.

Supposing that:

- most of the already deployed embedded H.264 decoders support 4:4:4 profiles

- non-chroma-downsampled input content is available

Then, it's only a matter of changing encoder implementations to do exactly as
you say.

~~~
zeroimpl
I'm pretty sure there is plenty of 444 content. Movies are likely
filmed/edited at 444, they are provided to theaters at 444, but they are
generally only made available to consumers at 420.

I believe the reason is that most hardware decoders sold to consumers do not
support anything above 420. As Blu-rays and such are all currently 420, the
chip makers don't have much incentive to support 444. It's a shame because
your TV fully supports 444, and the difference can be huge on some content.

------
carlsborg
But it isn’t free, so it’s a pity it’s taking over the world. Chrome plays
h.264 / mp4 because they are licenced. Firefox counts on OS support (the
Cisco plugin only supports WebRTC calls), so you can’t legally watch MP4
videos on Linux.

WebM seems to be as good and free.

Can someone comment if I am mistaken about this?

~~~
suhail
It’s all capped at a max payment and not terribly expensive.

VP9 and its new versions will take over but it will be a while until it’s
implemented in hw fully.

~~~
carlsborg
Non-free standards are bad for open source and innovation in general.

~~~
tntn
The good thing about standards is everyone can have their own.

------
sytelus
It is pretty amazing that H.264 is not one algorithm that just popped out and
changed the world; instead it is an accumulation of tricks developed over
several decades.

I distinctly remember a few folks saying that you would need to violate the
laws of physics to transmit 1080p @ 60 fps over wifi. Now we are all doing it
many times over and billions of dollars are being made every year. This is one
of those unsung achievements worthy of Nobel-prize-level awards, but no one
seems to know who these people were.

I wonder if there is any detailed history of how various components of H.264
came together, who led this effort, and how projects were funded for such a
long time.

------
dpeck
Previously discussed:

[HN 2016] - [https://news.ycombinator.com/item?id=12871403](https://news.ycombinator.com/item?id=12871403)

~~~
userbinator
I wrote a comment there about frame differencing and wanting to try writing a
video decoder after having written a JPEG one. Now over 2.5 years later, I can
say that I did write a toy H.261 decoder, along with one for MPEG-1 and the
beginnings of another for H.262/MPEG-2 in my spare time, and it wasn't all
that difficult. In fact, because of the need to not only decode an image for
each frame, but to do it quickly enough to keep up with the framerate on the
limited hardware of the time, these early video codecs are in some ways
_simpler_ than JPEG --- e.g. all the Huffman tables are static, the
colourspace and subsampling are either fixed or have only a small number of
variations, etc. As evidence in support of this, my JPEG decoder is around 750
LoC (in C), while the H.261 one is slightly smaller at just under 700 LoC. The
MPEG-1 decoder is a bit more complex, at close to 1kLoC. I haven't finished
the H.262 one and it is more complex (interlacing is a pain...), but it's
probably doable in less than 2kLoC total for a decoder that understands both
MPEG-1 and 2/H.262 --- the latter is somewhat a superset of the former. I
wasn't really optimising for size, nor did I have a good idea of how much code
it'd take before I finished, so these should be quite "realistic" estimates of
complexity.

I've seen and given recommendations on here to write a JPEG codec as a
learning exercise, and a search of GitHub reveals plenty of others who've done
it; but the same situation doesn't seem to hold for video. Nonetheless, having
done it and realised it's not that much work (i.e. should be doable in a
weekend or two), I now recommend trying to write codecs for H.261 and MPEG-1
too. There's not that much media now in those two codecs, but if you get to
MPEG-2, you can experience watching DVDs using your own code.

~~~
jwr
The complexity in H.264 gets you later on, when you try to build a complete
(e.g. fully compliant) decoder. There are some killer features which seemed
like a good idea at the time, but which make the complexity staggering.

MBAFF is a good example (Macroblock-Adaptive Frame/Field encoding). What this
means is that the encoder can choose for every block whether it wants to
encode it as a single frame or two interlaced fields. Combined with B-frames
(and remember that B-frames reference frames both in the past and in the
future) this makes for really interesting memory access patterns, which then
kills your cache locality and murders performance.

------
mrob
One under-appreciated advantage of H.264: it's the last video codec where the
artifacts look obviously artificial. Modern codecs have artifacts that look
too much like real objects, so they take more brain power to ignore. After
watching low bitrate H.264 for a few minutes I stop noticing the artifacts,
which isn't the case with modern codecs.

~~~
jeffhuys
What codecs are you talking about? I'm interested to see these artifacts for
myself.

~~~
mrob
Comparison videos from: [http://video.1ko.ch/codec-comparison/](http://video.1ko.ch/codec-comparison/)

[http://video.1ko.ch/codec-comparison/videos/vp9_562.webm](http://video.1ko.ch/codec-comparison/videos/vp9_562.webm)

[http://video.1ko.ch/codec-comparison/videos/x264_562.mp4](http://video.1ko.ch/codec-comparison/videos/x264_562.mp4)

At 15 seconds, in the VP9 version, it looks like smoke is pouring off the top-
left metal pipe of the machine. It immediately draws my attention because it's
so out of place. But in the H.264 version, the whole image is noisy, so
nothing distracts me.

~~~
PuffinBlue
I'm not really sure which is which but the webm version looks way way way
better to my eye. Not even any comparison really.

------
mehrdadn
General question about video codecs: why are codecs tied to resolutions? Why
can't some codecs be used for resolutions larger than some maximum? This is
probably the one thing I never understood about codecs.

~~~
opencl
Mostly it's just because the standards say so. There's nothing about how
MPEG-2 works that inherently makes it impossible to encode 4K video with it,
the standard just says that MPEG-2 video shall not exceed 80Mbit/s, 1920
horizontal pixels, 1152 vertical pixels, or 62,668,800 luminance samples per
second.

The limits are just there for interoperability. As is, if you encode a video
that complies with MPEG-2 High Level, then you can be pretty confident that
any MPEG-2 High Level decoder will decode it.

Without the limit everything becomes a mess. You're writing a decoder, what
max res should you support? 4K? 8K? The people writing decoders don't agree
and then people trying to distribute these high resolution files find that
they work in some decoders but not others.
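
As a concrete illustration of how those limits get used, a quick sketch with
the numbers quoted above (just the three constraints mentioned, not the full
set of level restrictions in the spec):

    def fits_mpeg2_high_level(width, height, fps, bitrate_mbps):
        """Check a format against the MPEG-2 High Level limits quoted above."""
        return (width <= 1920 and height <= 1152
                and width * height * fps <= 62_668_800
                and bitrate_mbps <= 80)

    print(fits_mpeg2_high_level(1920, 1080, 30, 40))  # True  - 1080p30 fits
    print(fits_mpeg2_high_level(1920, 1080, 60, 40))  # False - 1080p60 blows the sample-rate limit
    print(fits_mpeg2_high_level(3840, 2160, 24, 40))  # False - 4K exceeds the frame-size limit

An encoder that stays inside those bounds can label the stream as High Level,
and any decoder that claims High Level support is obliged to handle it.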

~~~
mehrdadn
Thanks! What do you mean by "not being able to decode" something though? Do
you mean time-wise (not enough bandwidth to keep up) or do you mean the
algorithm would fail in a different way? The latter doesn't make sense to me,
and the former seems like it's entirely system-dependent? Like isn't it the
exact same problem if I have an underpowered computer?

~~~
Wowfunhappy
It means hardware decoders won't be able to decode the videos. (Some software
decoders that are overly rigid about following the spec may refuse as well,
but most won't, and none actually _have_ to from my understanding.)

You can absolutely use h.264 to encode 8K@120fps if you want to.

~~~
mehrdadn
Ohh hardware decoders! Makes sense.

------
jimbo1qaz
Chroma subsampling is a terrible idea for non-photographic images. People are
not "terrible" at color perception. I can clearly see its harmful effects on
screen captures, line art, anime, 2D video games, and even Super Mario 64's
hat.

Youtube's chroma subsampling makes colors bleed, turns Mario's hat into chunky
red blocks that only approximately fill its outline, leaves screen captures
discolored and grainy around text, and turns sharp colored anti-aliased lines
into a discolored mess.

(Mario's hat was pixelated on pannenkoek2012's emulated SM64 videos at native
640x480. Maybe newer video decoders antialias the chroma channel, so Mario's
hat is not pixelated but "merely" blurry.)

I upload oscilloscope videos with colored lines on black backgrounds, and
stopped using brightly colored lines partly because color was being blurred
and discolored by Youtube chroma subsampling.

------
gingabriska
I bought a laptop camera module and attached it to a USB cable, and it pegged
my Raspberry Pi's CPU at 90%; soon I started receiving temperature warnings
through that thermometer icon on the screen.

I wonder why webcams don't directly use the onboard graphics card.

Even the cheap webcams all suffer from the very same problem.

The reason I bought the laptop camera module is that I thought it would be
higher quality and lower price, and would be good for monitoring my 3D prints.

I've not found any Raspberry Pi camera with autofocus and with the IR filter
intact (I need it for full-light use and better colours during the daytime).

~~~
Jaruzel
Try the Microsoft LifeCam range. Autofocus, supported by Linux, really good
image.

------
viraptor
I like this high-level summary. One thing I'd change though is that the photo
only loses detail in the frequency domain. In the spatial view it actually
gains unnecessary detail/noise - it's not that the grill details disappeared,
it's that the whole image gained an anti-grill layer. (You can now see it on
the smooth part of the laptop.)
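
You can see that effect with a toy 1-D experiment (my own sketch using scipy's
DCT, not anything from the article): throw away the high-frequency
coefficients of a hard edge and the previously flat region picks up ripples.

    import numpy as np
    from scipy.fft import dct, idct

    # A hard edge: flat, then a jump (think grill next to smooth laptop surface).
    signal = np.array([0.0] * 16 + [1.0] * 16)

    coeffs = dct(signal, norm='ortho')
    coeffs[8:] = 0                        # discard the high-frequency components
    approx = idct(coeffs, norm='ortho')

    # The "flat" region is no longer flat - it has gained ringing.
    print(np.round(approx[:8], 3))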

------
nurettin
> You wouldn't say HHHHHHHHH. You would just say "10 tosses, all heads" -
> bam! You've just compressed some data! Easy. I saved you hours of mindfuck
> lectures. This is obviously an oversimplification

Oversimplification and exaggeration. Understanding Huffman encoding and the
deflate algorithm is just two five-minute articles away.

~~~
dTal
It's also not compressed at all - the English prose is 20 characters, twice as
long. Also the English prose requires at least ASCII, while the string of H
can be encoded with a single bit each. 160 bits > 10 bits.

To be fair, it is faster to _say_, out loud, because our verbal "character
set" is so much richer. A unit of pronunciation is roughly a syllable. "10
tosses, all heads" has only 5 syllables, while "ach ach ach..." takes 10. But
this insight is a bit much to ask of someone new to the concept of
compression.
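
Quick sanity check of those sizes (a throwaway Python snippet, counting 8 bits
per ASCII character and 1 bit per toss):

    prose = "10 tosses, all heads"
    tosses = "H" * 10

    print(len(prose), len(tosses))          # 20 characters vs 10 characters
    print(len(prose) * 8, len(tosses) * 1)  # 160 bits of ASCII vs 10 bits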

~~~
ljcn
It's difficult to describe but I read it more like converting the raw data (a
list of results) to a generator. Akin to the "compression" described in this
xkcd: [https://www.xkcd.com/1155/](https://www.xkcd.com/1155/)

------
IronWolve
Chroma subsampling is why 4K video played on a 1080p/1440p monitor looks so
much better/clearer/more detailed. Typical native 1080p encoding has less
chroma detail to save size.
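
A rough way to see why, assuming the usual 4:2:0 delivery format (my own
back-of-the-envelope sketch): the 4K stream's chroma planes land at exactly
the display's 1080p resolution, while a native 1080p stream only carries
960x540 of chroma.

    def chroma_plane(width, height):
        """4:2:0 stores each chroma plane at half resolution in both dimensions."""
        return width // 2, height // 2

    print(chroma_plane(1920, 1080))  # (960, 540)   - native 1080p stream
    print(chroma_plane(3840, 2160))  # (1920, 1080) - 4K stream: full-res chroma at 1080p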

------
thomascgalvin
I've seen this article before, and this part always stuck with me:

> I captured the screen of this home page and produced two files:

> PNG screenshot of the Apple homepage 1015KB

> 5 Second 60fps H.264 video of the same Apple homepage 175KB

> Eh. What? Those file sizes look switched.

> No, they're right. The H.264 video, 300 frames long is 175KB. A single frame
> of that video in PNG is 1015KB.

The article does go into how and why this is possible (tl;dr H.264 is lossy,
PNG is not), but for the difference in human-detectable quality, H.264 is
astonishingly more efficient.

~~~
ChrisSD
It's a good example but the difference doesn't quite have to be as dramatic as
the article makes out.

Screenshot tools rarely produce optimised PNGs. A quick test of the image in
the article shows that optipng (lossless) can bring the size down to ~570KB or
pngquant (lossy) to ~240KB. Zopfli could further compress the images produced
by optipng or pngquant, but it's a case of diminishing returns at that point.

To be clear the article makes a good comparison but I think it perhaps
overstates the case to make a point.

------
vagab0nd
I worked on this project where we needed to analyze the motion of objects from
the camera in real time. We were trying to find an algorithm that was fast
enough to run on the hardware, but to no avail.

One day, it dawned on me that since our hardware encodes the video in h264, we
should be able to get some of that info from the video itself. So we tried
extracting the motion vectors from the video and sure enough, it was good
enough for our use and we got it basically for free.
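
For anyone who wants to try the same trick, FFmpeg can expose the motion
vectors its H.264 decoder already computes. A minimal sketch (the file names
are placeholders, and this just draws the vectors on top of the video; for
programmatic access you'd read the frame side data through libavcodec
instead):

    import subprocess

    # Ask the decoder to export its motion vectors, then overlay them
    # with the codecview filter.
    subprocess.run([
        "ffmpeg", "-flags2", "+export_mvs", "-i", "input.mp4",
        "-vf", "codecview=mv=pf+bf+bb", "mv_overlay.mp4",
    ], check=True)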

------
Iv
FFT is the real magic there and has been since the 90s.

Convince me otherwise.

~~~
ChrisLomont
The FFT is lossless, so there’s no compression as a result of using it.

The compression is gained by perceptual modeling and dumping parts, and by
powerful entropy modeling, like CABAC and such.

Also more advanced things are usually used in modern compression: DCTs,
sliding window stuff, wavelets, CABAC stuff, motion prediction schemes, etc.,
none of which were used in the 90s (except DCT and rudimentary prediction).

~~~
acchow
"FFT is lossless" \- is this true without infinite precision values?

~~~
ChrisLomont
The FFT using real numbers is not lossless (except in certain numerically
coincidental cases that are not of much practical use), but there are lossless
approximations used routinely in compression to avoid error at this point.

For fixed input and output (say 8 bit to 8 bit data), it's easy to make
lossless FFTs, since the final truncation or rounding can be designed to never
lose information. So in these cases, even an FFT based on, say, IEEE 754
doubles can be made lossless when using integral input and output.

A similar example is converting 8 bit RGB to 0-1 floating point by dividing
the color channel by 255.0, which is lossy as real numbers since the resulting
floating point is truncated (except precisely for inputs 0 and 255). However,
multiplying the result by 255.0, and rounding appropriately, you can recover
each input without loss. So this is an example of a lossless transformation
that has lossy steps in between. This can be done in almost any case you need
to do it.
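
A quick check of that round trip (numpy sketch, using 32-bit floats as in the
example):

    import numpy as np

    bytes_in = np.arange(256, dtype=np.uint8)
    as_float = bytes_in.astype(np.float32) / np.float32(255.0)   # lossy as real numbers
    bytes_out = np.round(as_float * np.float32(255.0)).astype(np.uint8)

    print(np.array_equal(bytes_in, bytes_out))  # True: nothing was lost end to end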

FFTs in compression are not used to lose values, but to change signal domain,
where perhaps interesting and lossy things can be done.

Google lossless FFT or reversible FFT and dig around.

At the end of the day, however, it's not FFTs that are useful very much in
compression. They're only used for signal changes at best, and they've been
made obsolete for the most part by other techniques.

~~~
tntn
Can you give some details re:

> The FFT using real numbers is not lossless

The DFT over the reals is invertible, right? How can it be invertible but not
lossless?

~~~
ChrisLomont
> The FFT using real numbers is not lossless

FFTs use terms of the form e^(2 pi i k / n), and these terms cannot be
represented exactly except in rare cases with finite precision floating point
numbers (follows from Gelfand's Theorem).

Thus, as soon as you try to use or compute such a term, you've made an
approximation, losing information.

The transform is approximately invertible using finite precision, and if your
inputs are some fixed set you can do error analysis to ensure those terms come
back out via careful rounding.

But the FFT, and its inverse, involve infinitely precise real numbers that
cannot be represented as floating point.

The DFT fails for the same reason.

 _However_

As I explained above, if you have a limited set of inputs, say byte values
0-255, that get converted to floating point via these approximations, then the
inverse approximation is applied, _then the final values are appropriately
rounded,_ you can make the DFT and FFT (and almost any approximation
algorithm) lossless on this limited subset of inputs 0-255.

As a simple example, consider turning a byte color channel 0-255 into a 32 bit
float in 0.0f - 1.0f via dividing by 255.0. Now every one of the values except
0.0f and 1.0f is an approximation, since the only exactly representable floats
are dyadic (denominator a power of 2), and these denominators are 255 (not a
power of 2).

So this is lossy. But the limited inputs mean there are only 256 different
floats possible.

Now multiplying by 255.0, which is (in this case) not lossy on floats, puts
each back close to an integer (but not all are integers). Rounding to the
closest integer will restore the original 0-255 byte values.

So the roundtrip is lossless here. But the same transforms dividing by 255.0f
then multiplying by 255.0f are not lossless for all floating point inputs.

In each case, for any algorithm, one needs to design the inputs, outputs, and
transforms carefully to ensure it behaves as desired.

This is the tip of the iceberg when dealing with floating point algorithms :)

The DFT over the reals is not lossless. Using floats, for example, and
applying it to max_float overflows, so it cannot be inverted. It's not even
bit invertible for almost any set of inputs. It's only invertible on very
limited, specialized sets of inputs.

~~~
tntn
I find it very confusing that you seem to use "real number" and "floating
point number" interchangeably.

Like this:

> The DFT over the reals is not lossless. Using floats, for example, and
> applying it to max_float overflows, so cannot be inverted.

I don't see why you make statements about the reals based on what happens with
floats. They aren't the same, and the DFT exists independently of actually
computing it.

------
alfonsodev
Still, what got into all MacBooks after mid-2015 is a hardware HVEC encoder
(if I’m not mistaken). Anyone know why HEVC and not H.264?

EDIT: *HEVC not HVEC

NOTE: I found out HEVC is H.265 so my question might not make sense :)

~~~
zapzupnz
Fairly sure hardware H.264 decoding has been included with every Mac, usually
on the GPU, for years and years now. A couple years after the iPhone, I think.

~~~
alfonsodev
Decoding sure, but I was asking about encoding. My MacBook (mid 2015) comes
with "Intel Iris Pro 1536 MB" and takes 25% CPU to capture the screen with
QuickTime, while newer MacBooks >=2016 come with HEVC, which after some
searches I found out is also called H.265[1][2], and is not just in Macs[3].

[1] [https://support.apple.com/en-us/HT208238](https://support.apple.com/en-us/HT208238)

[2] [https://www.macrumors.com/guide/hevc-video-macos-high-sierra...](https://www.macrumors.com/guide/hevc-video-macos-high-sierra-ios-11/)

[3] [https://www.techspot.com/article/1131-hevc-h256-enconding-pl...](https://www.techspot.com/article/1131-hevc-h256-enconding-playback/)

EDIT: I see now the label 2016, so the opening question didn't make sense :)
sorry.

~~~
kalleboo
They also have encoding. The software might not be using it properly though.
[https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video](https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video)

I think screen recording is slow not because of the video encoding but because
it defeats standard GPU acceleration of the GUI in order to be able to capture
the contents.

~~~
namibj
It really shouldn't. You just feed the framebuffer both into the DMA output
and into the H.26x encoder. If the encoder is slow, use another framebuffer so
you don't _have_ to recycle the buffer after this frame and risk the encoder
not being done with it yet.

------
social_quotient
In the article near the top it says

“1080p @ 60 Hz = 1920x1080x60x3 => ~370 MB/sec of raw data.”

Can someone help me understand the 60 Hz part of this? Was it meant to be fps
or does it really mean Hz? And why?

~~~
syn0byte
The typical refresh rate of a computer display is 60Hz. In this context it's
interchangeable with framerate, but we think of it in terms of Hz because of
the monitors.
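
The arithmetic behind the article's number, with 3 bytes per pixel for 24-bit
colour (just restating the quoted formula in code):

    width, height, fps, bytes_per_pixel = 1920, 1080, 60, 3
    raw = width * height * fps * bytes_per_pixel
    print(raw, round(raw / 1e6))  # 373248000 bytes/s, i.e. ~373 MB/s uncompressed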

------
arendtio
Nice summary!

I just wish he hadn't used pounds for comparison... I mean, there are 3
countries left on this planet that don't use the metric system officially.
Even the scientists in the US use it. Reminds me of
[http://www.joeydevilla.com/wordpress/wp-content/uploads/2008...](http://www.joeydevilla.com/wordpress/wp-content/uploads/2008/08/king-henrys-foot.jpg)

But given how he managed to simplify such a complex technology to an easy-to-
read article I don't really want to criticize him for using the wrong unit
system.

------
nojvek
I have to say I really love Sid's style of writing. Humorous and entertaining
yet brief and to the point.

Hope there's more like this to come.

~~~
davidcollantes
Written in 2016, it is the only entry on the blog. Odds are slim, at least
coming from Sid.

------
mrmondo
I’d be interested to see this same analysis with H.265, given that it’s the
way forward and the standard for many devices and media.

------
PorterDuff
Thinking back on the different dead ends, I figure that H.264 is just a
reasonable evolutionary stage.

Now, Pixelon...that was probably magic.

~~~
viraptor
Somehow I missed this dot-com crash story. The number of things wrong with it
is amazing: [https://www.wired.com/2000/05/perilous-fall-of-pixelon/](https://www.wired.com/2000/05/perilous-fall-of-pixelon/)

~~~
doomrobo
If stories like this interest you, I highly recommend Nat Geo's Valley of the
Boom. I don't even like "history shows" but it's a funny and self-aware take
on SV in the 90s with solid acting and a pretty low budget.

------
_bxg1
Fascinating. I love these accessible dives.

------
Invictus0
Neat! I'd love to read another post like this about H.265.

------
montenegrohugo
What a nice article! Love finding these on hn

------
alanh
Don’t use blockquotes instead of section headings… please. This makes my skin
crawl, but more importantly, it messes up assistive technology such as rotor
and text-to-speech.

------
eternalban
I believe the "magic" is happening in the visual cortex.

------
etaioinshrdlu
We have the tech today for even wildly better compression, at least for
natural looking images. Using neural nets. There is a big computational cost
to them, but if history is any guide, that won't be an issue forever.

~~~
flud
Got any links to that?

~~~
sandos
It's sort of obvious if you see stuff like this:

[https://www.pcgamesn.com/nvidia/ai-gaugan-app-painting-neura...](https://www.pcgamesn.com/nvidia/ai-gaugan-app-painting-neural-network)

The neural network basically has a huge lookup table of "natural" looking
stuff. Just point into that table and get what you want out of it: grass,
trees, a crowd, etc and the NN will produce it to your specifications.

------
mojuba
Judging from the comments, the article mostly resonates with people who
already know how the codec works, but it doesn't seem to be very useful to
engineers who, like me, want to understand it in a bit more technical detail.
The author could have removed the casual conversation noise in the text (very
annoying in technical articles!) and filled it with some more useful bits
instead.

~~~
jannemann
I enjoyed the read and think that it is not a technical article but rather
entertainment. Maybe that is wrong too and it is both. Why should every
technical article by definition be written in boring language?

For technical details use the terms and search them in your favourite search
engine. There are literally tons of articles about every aspect of H.264.

~~~
mojuba
> For technical details use the terms and search them in your favourite search
> engine.

Thanks for lecturing me, but the article in question is neither too
entertaining nor too informative. That was my point. Other than that, of
course I can find better materials on the net.

