
A hands-on introduction to video technology: image, video, codec and more - manorwar8
https://github.com/leandromoreira/digital_video_introduction
======
barrkel
Video compression is not understood well enough throughout the whole stack
yet.

I recently got a 1080p projector for home use, so now movies / TV series in my
home are viewed on a 100" screen. Content is mostly from Netflix and Amazon
Prime Video.

Netflix does a really good job with encoding. I cannot say the same for Amazon
Prime Video; even with their exclusive (in UK) offerings, like American Gods
or Mr Robot, the quality of the encode is quite poor when viewed on a big
screen. Banding, shimmering blocky artifacts on subtle gradients, insufficient
bit budget for dimly lit scenes - once you become aware of the issues, it
becomes really distracting.

OTOH a really big screen is a fantastic ad for high quality high bitrate
content. Anything less than 2GB/hour is noticeably poor.

~~~
izacus
Depends on the content - HDR 4K content on both Netflix and Amazon is encoded
at a horribly low bitrate - about 12-15 Mbit/s, which makes it barely look any
better than 1080p content (for comparison, 4K BluRay discs are about
60-100 Mbit/s and look infinitely better).
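Converting those bitrates to storage per hour makes the gap concrete (a quick sketch; the numbers are the ones quoted above, and real encodes are variable-bitrate):

```python
# Back-of-the-envelope bitrate-to-storage conversion, using the figures
# quoted in this thread (illustrative; actual service bitrates vary).
def mbps_to_gb_per_hour(mbps):
    """Convert a video bitrate in Mbit/s to GB per hour (decimal GB)."""
    return mbps * 1e6 * 3600 / 8 / 1e9

print(mbps_to_gb_per_hour(15))   # 6.75 GB/h: high end of the 4K streaming range
print(mbps_to_gb_per_hour(80))   # 36.0 GB/h: mid-range 4K BluRay
print(mbps_to_gb_per_hour(4.5))  # ~2 GB/h: roughly the "noticeably poor" threshold above
```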

~~~
user5994461
This is probably not an accident.

A good broadband connection is limited to a sustained rate around 13-18 Mbps.

~~~
rasz
In the US, maybe. I'm getting my advertised Mbps 24/7, as long as the source
can keep up.

------
thomastjeffery
Remember when we did this ugly _interlacing_ thing, so that we could get a
higher (50/60fps) framerate?

When did we decide that 24/25/30fps was good enough? Now we have a Blu-Ray
standard that cannot handle greater than 30fps, and media corporations that
are unwilling to release content via any other medium.

Put that together with ever-increasing resolutions, and the number of pixels
an object moves between one frame and the next keeps growing, so video looks
more and more choppy.
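To put rough numbers on that (a sketch; the pan speed and resolutions are illustrative, not from any particular film):

```python
# How far does an object jump per frame? Consider an object panning across
# the full screen width in 2 seconds, at various resolutions and frame rates.
def pixels_per_frame(width_px, seconds_to_cross, fps):
    """Horizontal displacement per frame for a constant-speed pan."""
    return width_px / seconds_to_cross / fps

for width in (1920, 3840):
    for fps in (24, 60):
        print(f"{width}px wide, {fps} fps: "
              f"{pixels_per_frame(width, 2, fps):.0f} px/frame")
# At 24 fps a pan crossing a 4K screen in 2 s jumps 80 px every frame;
# at 60 fps the same pan jumps only 32 px - higher resolution makes the
# per-frame jumps bigger, higher frame rate makes them smaller.
```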

Frankly, this is a _much_ bigger problem than NTSC ever was. Even with content
(The Hobbit, Billy Lynn's Halftime Walk) being created at higher framerates,
users have no way to get the content outside of a specialized theater because
the Blu-Ray standard cannot handle it, and because people seem to honestly
believe that higher framerates look _bad_.

I suppose we can only hope that creators take better advantage of digital
mediums that _do not_ have such moronic, and frankly harmful, arbitrary
limitations.

~~~
bluedino
>> When did we decide that 24/25/30fps was good enough?

Everything looks weird at 48/60fps

Remember LoTR?

~~~
Houshalter
I think you mean the Hobbit, which was mentioned in the parent comment. People
didn't like it because they're so used to blurry, low-frame-rate movies. If
all movies were filmed like that, people would quickly get used to it, and
soon they wouldn't be able to stand 24 fps. It's objectively worse.

Similarly footage shot with digital cameras is often modified to look more
like film. There's no objective reason for it most of the time, people are
just used to the artifacts of film. Anything else "looks weird". Now stuff
like lens flare, shaky cameras, etc, are added to shots made entirely in CG!

I can get past that stuff because it doesn't make the quality that much worse,
but low fps definitely does. What's the point of having a super-high-resolution
4K display if the scenes displayed on it are incredibly blurry from a low
frame rate?

------
ccommsxx
Looks like this contains a bunch of Creative Commons (CC-BY-SA) content ripped
from Wikipedia without proper attribution. Please add the missing attribution:

[https://it.wikipedia.org/wiki/File:Pixel_geometry_01_Pengo.j...](https://it.wikipedia.org/wiki/File:Pixel_geometry_01_Pengo.jpg)

[https://en.wikipedia.org/wiki/Chroma_subsampling#/media/File...](https://en.wikipedia.org/wiki/Chroma_subsampling#/media/File:Colorcomp.jpg)

etc

~~~
dreampeppers99
I'm so sorry, that wasn't intentional! I tried really hard to put all the
references in a list:
[https://github.com/leandromoreira/digital_video_introduction...](https://github.com/leandromoreira/digital_video_introduction#references)

I'll fix these attributions, but also feel free to point out more, or even
open a PR.

------
profpandit
It's interesting to note that the architecture of the first ISO codec, MPEG-1,
is almost identical to the one we have today in H.265. That codec was
standardised in the early 90s, so this design has carried through for well
over 20 years. Most of the changes relate to the targeted parameters, such as
frame size, frame rate and bitrate. Only the last step, H.264 --> H.265, seems
to have added new features.

This is a very well written introduction

~~~
samstave
Just curious as I never bothered to think about this before, in all H.
Codecs/standards... what does the "H" stand for?

~~~
barrkel
It's an ITU spec naming convention - think of things like X.25. They're all
<letter>.<digit>+. The letters often aren't very mnemonic: each letter is a
large bucket classification, and H covers audiovisual and multimedia systems.

[https://en.wikipedia.org/wiki/ITU-T](https://en.wikipedia.org/wiki/ITU-T)

[http://www.itu.int/en/ITU-T/publications/Pages/structure.asp...](http://www.itu.int/en/ITU-T/publications/Pages/structure.aspx#H)

------
heydenberk
Xiph.org wrote fascinating stuff about video compression when working on their
next-generation codec, Daala
[https://xiph.org/daala/](https://xiph.org/daala/)

------
ilzmastr
This was also food for whiteboards in the show silicon valley. Compare:
[https://github.com/leandromoreira/digital_video_introduction...](https://github.com/leandromoreira/digital_video_introduction/blob/master/i/thor_codec_block_diagram.png)

with: [http://imgur.com/a/Sne89](http://imgur.com/a/Sne89)

~~~
bpicolo
Those don't really seem similar. Diagrams with boxes and arrows aren't
uncommon.

~~~
heydenberk
"Q" = Quantizer, "EC" = Entropy Coding, etc. It's definitely similar if you
look closer.

~~~
microcolonel
"LZW" -> Lempel–Ziv–Welch on the entropy codes doesn't really make sense
though (maybe that's why the word "stupid" is next to it).

------
city41
I liked that the green channel in Mario's picture was titled "Luigi". Nice
touch :)

------
kozak
The field rate of 60/1.001 Hz (≈59.94 Hz), and the situation where we are
stuck with it basically forever, is a shame upon the entire profession of
video engineers.

~~~
TeMPOraL
Any more background on the 60/1001Hz thing?

~~~
Houshalter
[https://m.youtube.com/watch?v=3GJUM6pCpew](https://m.youtube.com/watch?v=3GJUM6pCpew)
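For the impatient, the usual account: NTSC color had to coexist with the existing 4.5 MHz audio carrier, so the line rate was redefined, dragging the 60 Hz field rate down by a factor of 1000/1001. The arithmetic, using the well-known NTSC constants:

```python
# Where 59.94 Hz comes from: to keep the color subcarrier from beating
# against the 4.5 MHz audio carrier, the NTSC line rate was redefined as
# 4.5 MHz / 286 (it had been exactly 15750 Hz in black-and-white NTSC).
line_rate = 4.5e6 / 286          # ≈ 15734.27 Hz
field_rate = line_rate / 262.5   # 262.5 lines per interlaced field
frame_rate = line_rate / 525     # 525 lines per full frame

print(field_rate)  # ≈ 59.9401 Hz, exactly 60 * 1000/1001
print(frame_rate)  # ≈ 29.9700 Hz, exactly 30 * 1000/1001
```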

------
metaphor
Got really excited for a second thinking this was discussion on _transport_
technology as opposed to _encoding_.

~~~
minipci1321
What aspects of the transport technology are you specifically interested in?

~~~
samstave
Uh...assume I know absolutely nothing; can you start with a simple ELI5, and
then elaborate/point me off in the right direction with a seed foundation of
knowledge?

~~~
minipci1321
LOL, I'll try. Two major things to grasp about transport technology are a)
the "entry point" and b) the notion of time.

a) Multiple types of media data are encoded independently and then bundled
together in what essentially looks like an endless file (called a stream
file). So when given a chunk of such a file, the decoder needs to quickly
identify the nearest offset where it can begin decoding all the individual
media it needs simultaneously. This is called an "access point". Decoding
cannot start at any random place in the stream, as it generally requires
context (an access point lets decoding start with the context being empty for
all required media -- audio, video, graphics, subtitles etc). Stream file
formats (called containers) are designed to solve this by providing access
points to the decoder as easily and frequently as possible.
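A toy illustration of the first half of that problem, finding where packets even begin (a sketch using the MPEG transport stream convention of fixed 188-byte packets, each starting with the sync byte 0x47; note that packet alignment alone is not yet a full access point, which also needs a keyframe):

```python
# Scan a byte buffer for transport-stream packet alignment: an offset where
# several consecutive 188-byte packets all start with the 0x47 sync byte.
PKT = 188
SYNC = 0x47

def find_sync(buf, confirm=3):
    """Return the first offset where `confirm` consecutive packets align,
    or -1 if no alignment is found."""
    for i in range(min(len(buf), PKT)):
        if all(i + k * PKT < len(buf) and buf[i + k * PKT] == SYNC
               for k in range(confirm)):
            return i
    return -1

# Fake stream: 10 bytes of junk, then four aligned 188-byte packets.
stream = bytes(10) + b"".join(bytes([SYNC]) + bytes(PKT - 1) for _ in range(4))
print(find_sync(stream))  # 10
```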

b) A decoder, when driven by a running presentation device -- video screen,
audio amplifier etc -- is essentially a pump. The encoder can be looked at as
a pump too, when the two are separated by a network. If the decoder runs
faster than the source feeds it, it will drain the pipe and leave the
presentation device idle (which will be noticeable to the consumer). If it
runs slower, at some point it will be drowned in data from the source. So the
pumping rhythm needs to be kept identical between both ends. The most
practical way to synchronize the "piping clocksource" is via the stream file
itself (which has to carry clock samples for that). Again, different
containers solve this differently (some not at all).

EDIT: I didn't mention (it should go into b)) the effort to keep the pumping
throughput constant -- "constant bit rate". With the advent of transport
schemes that use point-to-point connections (as opposed to multicast
streaming), I believe this matters less now.

~~~
profpandit
Re: b) For video, the encoder contains a model of the decoder, including the
amount of buffering available to the decoder. The bit-rate controller at the
encoder uses this model to ensure that the decoder always has the right amount
of data in its input buffers. It also ensures that the information rate of the
channel is matched with that of the compressed stream in a live transmission
setting. The transport scheme, which operates at a layer below the codec,
therefore only needs to take care of delay and packet delivery/loss related
issues over the channel. Media is typically transmitted over UDP.
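The decoder-buffer model described above can be sketched as a leaky bucket (a simplification of the real VBV/HRD models; all numbers here are made up):

```python
# Minimal leaky-bucket model of the decoder's input buffer: bits arrive at
# the constant channel rate, and each frame interval the decoder removes one
# variably-sized compressed frame. The encoder's rate control must size
# frames so the buffer never underflows.
def simulate_vbv(frame_bits, channel_bps, fps, buffer_bits, initial_bits):
    fullness = initial_bits
    per_frame_input = channel_bps / fps
    for bits in frame_bits:
        fullness -= bits  # decoder pulls a whole frame out at once
        if fullness < 0:
            return "underflow"
        # channel refills the buffer; clamp at capacity for simplicity
        # (in the real HRD, overflow is also a conformance violation)
        fullness = min(fullness + per_frame_input, buffer_bits)
    return "ok"

frames = [120_000, 30_000, 30_000, 200_000, 30_000]  # an I-frame spike
print(simulate_vbv(frames, channel_bps=1_500_000, fps=25,
                   buffer_bits=500_000, initial_bits=250_000))  # ok
```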

------
0xelectron
This is really great. We seriously lacked a good introduction to video
technology.

------
alfg
This is awesome. I work in the VOD space, specifically in content protection,
and this is a great reference guide. I've been meaning to write a similar
guide for DRM.

------
rasz
The first example interlacing image is wrong: it shows a running dog with a
simulated division into scan lines, but does not take into account the timing
difference - that was one of the major sources of deinterlacing artifacts.
Alternating fields are 1/60 second apart in time.

[http://www.onlinevideo.net/2011/05/learn-the-basics-of-deint...](http://www.onlinevideo.net/2011/05/learn-the-basics-of-deinterlacing-your-online-videos/)
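That timing difference is easy to demonstrate (a toy sketch; sizes and edge positions are made up):

```python
# Two fields captured 1/60 s apart, then naively woven into one frame.
# A vertical edge moving right ends up "combed" because alternate scan
# lines saw it at different moments.
WIDTH, LINES = 16, 6

def field(edge_x, parity):
    """Scan lines of one field: 0s left of a moving vertical edge, 1s right."""
    return {y: [0] * edge_x + [1] * (WIDTH - edge_x)
            for y in range(parity, LINES, 2)}

top = field(edge_x=5, parity=0)     # even lines, sampled at time t
bottom = field(edge_x=8, parity=1)  # odd lines, sampled at t + 1/60 s

lines = {**top, **bottom}           # weave: interleave the two fields
frame = [lines[y] for y in range(LINES)]
for row in frame:
    print("".join(".#"[v] for v in row))
# Even rows show the edge at x=5, odd rows at x=8: the classic comb artifact.
```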

------
callesgg
I believe the next step in video compression will rely more on smarts like
object tracking and object recognition.

Machine learning becoming more and more popular will probably help :)

~~~
profpandit
They started along that direction with MPEG-4. But it didn't go very far then.

------
m1el
How much of that repo is protected by patents and cannot be reused?

------
alextooter
Amazing work.

