
How does a video codec work? - dreampeppers99
https://github.com/leandromoreira/digital_video_introduction#how-does-a-video-codec-work
======
userbinator
If you're interested in more "hands on" or practical video codec learning, I
recommend writing an H.261 decoder. Only two supported frame sizes, no
B-frames, and no intra prediction (effectively a subset of JPEG) make for a
simple yet possibly quite rewarding weekend project that can be completed in
around 700 lines of C (my attempt). Unfortunately not much existing media is
available in 261, but I think watching a video being decoded entirely by code
you wrote is a pretty fun experience, including all the weird and amusing
distortions you can see when debugging; and from there you can move on to
MPEG-1 with variable frame sizes and B frames (another weekend, assuming you
reuse much of the 261 exercise --- I ended up with 1k lines total to decode
MPEG-1), and that has somewhat more existing media you'll be able to watch.

Then you can try H.262/MPEG-2 and enjoy the intricacies of handling
interlacing as well as being able to decode DVDs and a lot of existing
content; and then there's H.263 which has intra prediction... I haven't gotten
past the first two largely for reasons of time and other things to play with,
but IMHO getting a basic implementation of a video decoder is not that hard
especially when you're working from a standard.

~~~
sparklingriver
Very cool idea! Is your code for this online anywhere?

~~~
userbinator
Sorry, no. But a search of GitHub reveals some others have.

------
unlinked_dll
One thing that I wish more folks did with DSP/media tech was to start with
concepts instead of diving into details first.

Like, "how does a video codec work?" should start with: what is the problem?
(reducing the bits per second required for a video stream, since it's big) and
how? (don't send superfluous detail, since it's either redundant or
imperceptible).

Then dive into the details of how color is represented, how images are
structured, how 2d signal transforms work, the principle behind the DCT as a
method of representing the same data with better energy compaction by
decorrelating different components, and how that can be used advantageously to
reduce the number of bits for a still image, then talk about shared data
between images, etc etc.
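
To make the energy compaction point concrete, here is a minimal sketch of a 1-D orthonormal DCT-II applied to a smooth row of pixel values. The sample values and the helper name `dct_ii` are made up for illustration, and real codecs use fast 2-D integer transforms rather than this direct formula.

```python
import math

def dct_ii(samples):
    """Orthonormal 1-D DCT-II of a list of numbers (direct O(n^2) formula)."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        # Scaling that makes the transform orthonormal, so energy is preserved.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(samples)
        )
        coeffs.append(scale * total)
    return coeffs

# A smooth ramp, like one row of pixels in a gentle gradient (made-up values).
row = [100, 104, 108, 112, 116, 120, 124, 128]
coeffs = dct_ii(row)

total_energy = sum(c * c for c in coeffs)
top_two_energy = sum(c * c for c in coeffs[:2])
energy_share = top_two_energy / total_energy

print([round(c, 2) for c in coeffs])
print(f"share of energy in the first two coefficients: {energy_share:.4f}")
```

For smooth input like this, nearly all of the energy lands in the first couple of coefficients, which is why the remaining ones can be quantized coarsely or dropped at little visible cost.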

I've noticed as a DSP guy that when I talk about things without concepts first
that everyone's eyes glaze over. Although it is nice that everyone thinks it's
black magic, good job security.

~~~
ofrzeta
To be fair the article does that. The above URL just links to an anchor in the
middle where it talks about codec implementation. For instance this quote from
the article

"We learned that it's not feasible to use video without any compression; a
single one hour video at 720p resolution with 30fps would require 278GB*.
Since using solely lossless data compression algorithms like DEFLATE (used in
PKZIP, Gzip, and PNG), won't decrease the required bandwidth sufficiently we
need to find other ways to compress the video.

...

In order to do this, we can exploit how our vision works. We're better at
distinguishing brightness than colors, the repetitions in time, a video
contains a lot of images with few changes, and the repetitions within the
image, each frame also contains many areas using the same or similar color."

[https://github.com/leandromoreira/digital_video_introduction...](https://github.com/leandromoreira/digital_video_introduction#redundancy-removal)
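
The "better at distinguishing brightness than colors" point in the quote is what chroma subsampling exploits. Below is a rough sketch, assuming the full-range BT.601 conversion used by JPEG/JFIF and a simple 2x2 averaging for 4:2:0; the tiny 4x4 test image and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr conversion, as used by JPEG/JFIF."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 block of a plane with even width and height."""
    h, w = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]

def count_samples(plane):
    return sum(len(row) for row in plane)

# Made-up 4x4 image: two reddish rows on top, two bluish rows below.
image = ([[(200, 30, 30)] * 4 for _ in range(2)]
         + [[(30, 30, 200)] * 4 for _ in range(2)])

ycbcr = [[rgb_to_ycbcr(*px) for px in row] for row in image]
y_plane = [[px[0] for px in row] for row in ycbcr]
cb_plane = [[px[1] for px in row] for row in ycbcr]
cr_plane = [[px[2] for px in row] for row in ycbcr]

# Keep brightness (Y) at full resolution, halve chroma in both directions.
cb_small = subsample_420(cb_plane)
cr_small = subsample_420(cr_plane)

samples_full = sum(count_samples(p) for p in (y_plane, cb_plane, cr_plane))
samples_420 = sum(count_samples(p) for p in (y_plane, cb_small, cr_small))
print(samples_full, samples_420)
```

With this layout the sample count drops by half before any transform or entropy coding is applied.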

~~~
lightedman
278GB is a bit off. The real number is higher than that, but not by too much
more, but they definitely messed up on their math somewhere. 1280 horizontal
pixels times 720 vertical pixels times 8 bits per pixel color channel times 3
color channels (RGB) times 30 FPS times 60 SPM (sec per min) times 60 MPH (min
per hour) divided by 8 bits to convert bits to bytes and finally divided by
1,024 bytes per kilobyte gives 291,600,000KB, exactly 291.6GB.

And nowadays, that kind of uncompressed video is more than feasible, at least
in countries with developed gigabit internet capability, as you'd need just
634 Mbit/s of bandwidth to handle an uncompressed 720p stream at 30 FPS. Local
storage on even first-gen SATA drives can hit double+ that.
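
As a rough check of the bandwidth figure, the raw bitrate can be worked out directly. This is a minimal sketch assuming 24 bits per pixel (8-bit RGB) and no blanking or container overhead; the constant names are made up for illustration.

```python
# Raw bitrate of uncompressed 720p30 video, assuming 8-bit RGB (24 bits/pixel).
WIDTH, HEIGHT = 1280, 720
BITS_PER_PIXEL = 24
FPS = 30

bits_per_second = WIDTH * HEIGHT * BITS_PER_PIXEL * FPS

mbit_decimal = bits_per_second / 1_000_000      # SI megabits (10^6 bits)
mibit_binary = bits_per_second / (1024 * 1024)  # binary mebibits (2^20 bits)

print(f"{bits_per_second} bit/s")
print(f"{mbit_decimal:.1f} Mbit/s (decimal)")
print(f"{mibit_binary:.1f} Mibit/s (binary)")
```

Under these assumptions the result is about 663.6 Mbit/s in decimal units, or about 632.8 Mibit/s with binary prefixes, so it fits within a gigabit link either way.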

Storage itself? Well, we've got the cloud, right? /s

~~~
EvanAnderson
You're mixing SI units and IEC units[1]. The article's measurement of 278 GB
(sic) is really 278 GiB.

1280 horizontal resolution x 720 vertical resolution x 3 8-bit bytes per pixel
x 30 frames per second x 3600 seconds per hour is 298,598,400,000 bytes per
hour. 1GiB (1,024 bytes x 1,024 KiB x 1,024 MiB) is 1,073,741,824 bytes.
Dividing that out gives ~278.09 GiB.

[1]
[https://en.wikipedia.org/wiki/Binary_prefix](https://en.wikipedia.org/wiki/Binary_prefix)
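
The three figures in this subthread (decimal GB, binary GiB, and the mixed convention) all fall out of the same byte count. Here is a small sketch of the arithmetic, assuming 3 bytes per pixel as above; the variable names are only illustrative.

```python
# One hour of uncompressed 720p30 video at 3 bytes per pixel.
WIDTH, HEIGHT = 1280, 720
BYTES_PER_PIXEL = 3
FPS = 30
SECONDS_PER_HOUR = 3600

total_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS * SECONDS_PER_HOUR

size_gb = total_bytes / 1000**3                # SI: 1 GB = 10^9 bytes
size_gib = total_bytes / 1024**3               # IEC: 1 GiB = 2^30 bytes
size_mixed = total_bytes / 1024 / 1000 / 1000  # 1,024 first, then 1,000s

print(f"{total_bytes:,} bytes")
print(f"{size_gb:.2f} GB (decimal)")
print(f"{size_gib:.2f} GiB (binary)")
print(f"{size_mixed:.1f} 'GB' (mixed convention)")
```

The byte count is the same in every case; only the divisor changes, giving roughly 298.60, 278.09, and 291.6 respectively.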

~~~
lightedman
I'm responding directly to the GB claim. I'm very well-aware of binary prefix
(in fact several of my other online names are those very words) and the
general industry-wide mix-up that gets used on the common consumer: start
with KiB and then just run that in plain units of 1,000 for MB and GB, which
is what I went with.

------
greggman2
As much as I like Apple it seems like they are in the back pocket of big media
when it comes to this topic. AFAICT Apple only supports h.264 and h.265
meaning 1.3 billion iOS devices have browsers that can't access open standards
for video.

Yet one more reason why Apple should be required to allow alternate browser
engines IMO. Some will claim it's a battery issue but Apple has the resources
to add hardware support for other codecs and I'm only guessing could already
handle it just fine with their current "Bionic" chips. I'm not sure what other
excuses can be dreamt up. I'm sure they'll follow in the comments below.

~~~
anovikov
Not sure what you mean, a stock iPhone plays a VP8 webrtc stream with no
issues?

~~~
greggman2
[https://www.webmfiles.org/demo-files/](https://www.webmfiles.org/demo-files/)

Play in Chrome (Mac/Win/Linux/Android) and Firefox (Mac/Win/Linux/Android) but
not Safari (Mac/iOS).

Here's a few more

[https://commons.wikimedia.org/w/index.php?search=webm](https://commons.wikimedia.org/w/index.php?search=webm)

It would be nice if Apple would support open standards. It's unclear what their
motive is for not supporting them.

~~~
Const-me
> It's unclear what their motive is for not supporting them.

Programmers don't work for free, Apple has to pay them. Motivation works the
other way, they need a motive to support some tech.

------
hamilyon2
I wish there was an article on how hardware decoding works. Detailed but
approachable.

I guess reading source code of drivers and popular open-source video players
will give me an idea, but it is too much work. Does anyone have a technical
article highlighting the differences between hardware decoders and the
important details?

~~~
rasz
AFAIK, nowadays hardware decoding is a "magic black box you shove a bytestream
into" affair. It's all hidden away in binary driver/firmware blobs.

------
vkaku
Good. Want more people to understand how compression works.

Things like motion estimation, encoding the differences in frames, all that
stuff is magic to the uninitiated. :) But this stuff is absolutely worth
learning and every engineer should show interest :)
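
For anyone curious what motion estimation and frame differencing look like in the small, here is a toy sketch: an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). The frames, block size, and search range are made up, and real encoders use much faster search strategies and sub-pixel precision.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )

def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def best_motion_vector(prev, cur, top, left, size, search):
    """Find the (cost, dy, dx) offset into prev that best predicts a block of cur."""
    target = get_block(cur, top, left, size)
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > h or x + size > w:
                continue  # candidate block would fall outside the frame
            cost = sad(target, get_block(prev, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best

def make_frame(top, left):
    """6x6 black frame with a bright 2x2 square at (top, left)."""
    frame = [[0] * 6 for _ in range(6)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 200
    return frame

prev_frame = make_frame(2, 1)
cur_frame = make_frame(2, 2)  # the square moved one pixel to the right

# Residual if we predict the block from the same position in the previous frame.
plain_cost = sad(get_block(cur_frame, 2, 2, 2), get_block(prev_frame, 2, 2, 2))
# Residual if we first search for where the block came from.
cost, dy, dx = best_motion_vector(prev_frame, cur_frame, 2, 2, 2, search=2)
print(plain_cost, (cost, dy, dx))
```

In this toy case the search finds the square one pixel to the left in the previous frame, so the encoder would only need to send a motion vector and an all-zero residual instead of the pixel differences.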

~~~
RivieraKid
Why is it worth learning? I have a long list of stuff that I want to learn but
codecs don't seem useful enough to be included there.

~~~
vkaku
Codecs teach you that the high fidelity of the data you store may not be as
important as you'd think it is, and sometimes you'd be able to process all of
that with a fraction of the compute power and cost involved.

------
DavidVoid
I can also recommend this article on just how impressive H.264 is at
compressing a file while retaining good quality.

[https://sidbala.com/h-264-is-magic/](https://sidbala.com/h-264-is-magic/)

------
doublerabbit
This is all very cool, but what annoys me the most is that, let’s say you
create video authoring software, it creates a piece of unique content and you
decide to use a codec that’s licensed; you’ve got to pay royalties. If you use
another it’s not supported by default. .FLAC for example or .oggv

To not be liable to lawsuits and to support your own you need to come up with
your own codec, resulting in your own platform and the circle starts once
again.

Why in this age can we not all just work together and make something that’s
usable by everything for everyone without any restrictions?

------
alexirae
A similar tutorial like this but for audio?

~~~
Tomte
[https://xiph.org/video/](https://xiph.org/video/)

------
peter_d_sherman
Great article + amazingly extensive collection of links!

------
sabujp
this is good, then cringed when I saw the diagram about DRM that had M$, for
msft. i guess the other companies aren't about making money at all costs?

------
setheron
Playing videos is so complicated.

