
MPEG1 Single file C library - phoboslab
https://phoboslab.org/log/2019/06/pl-mpeg-single-file-library
======
izacus
> This gross oversight in the overengineered (especially for its time) MPEG-PS
> and MPEG-TS container formats just leaves me dumbfounded. If anybody knows
> why the MPEG standard doesn't just provide a byte size in the header of each
> frame or even just a FRAME_END code, or if you have a solution for this
> problem, let me know!

Because the video encoding was created in 1988 and the mux format in 1995, when
large amounts of fast RAM were incredibly expensive and recording, transcoding
and processing devices didn't always even have a framebuffer to store a full
frame. Many MPEG-1, MPEG-2 and even MPEG4 AVC Baseline limitations become
very obvious when you consider that they were encoded on CPUs that might be
slower than 150MHz and decoded on devices which might only have a few
macroblocks' worth of storage for the decoded frame.

> Interestingly, if I interpret the source correctly, ffmpeg chose the second
> option (waiting for the next PICTURE_START_CODE) even for the MPEG-TS
> container format, which is meant for streaming. So demuxing MPEG-TS with
> ffmpeg always introduces a frame of latency.

I think the confusion here is because MPEG-TS was created for broadcast TV
streaming, not realtime streaming. Broadcast TV can easily be seconds behind
the source these days and has probably travelled at least once from
geostationary orbit so one frame really isn't something anyone cares about.
The more modern HLS/DASH formats tend to be even worse at this, with many
sources waiting for a full several-second long chunk to be complete before
transmitting it to the viewer's device.

~~~
keithwinstein
I think the MPEG people and the authors of pl_mpeg probably just differ on how
"very hairy" it is to represent the byte-by-byte state of a single-threaded
decoder. In other words, how hard is it to create a continuation object that
lets the decoder run out of buffer at any byte location, return to the caller,
and later resume from the same place when more bytes are available? The MPEG
people were mostly thinking about hardware decoders, but this is not _that_
hard in software -- and it's been done successfully by every major decoder
implementation afaik.

The page writes: "_if we're in the middle of decoding a video frame and the
buffer doesn't have any more bytes yet (e.g. because it's streaming from the
net) we would need to pause the decoder, save its exact state and later, when
enough bytes are available, resume it again. Of course this isn't particularly
difficult to achieve using threads, but if we want to stay single threaded it
gets very hairy._"

But I'm pretty sure the MPEG people would just say, "look, this is not that
hard and was a solved problem 20 years ago. We're not going to introduce a
1-frame delay at the encoder (which is what you want us to do by putting a
frame length in the PES header) but we're also not making you introduce a
1-frame delay at the decoder. Just do what libmpeg2 has been doing since 1999:
make a state object that represents the byte-by-byte evolving state of the
decoder. Update the state object as you go. If you run out of bytes in the
buffer, return STATE_BUFFER to the caller. When the caller gives you more
bytes later, use the state object to resume from where you left off."

"Here's what that object looks like for MPEG-2: [https://github.com/cisco-
open-source/libmpeg2/blob/master/li...](https://github.com/cisco-open-source/libmpeg2/blob/master/libmpeg2/mpeg2_internal.h#L67-L155)

Yes, it's not beautiful, and yes, it would be more elegant if these variables
were broken out into different scopes (current block, current slice, current
picture, current sequence), but this is basically what you're in for and it's
not that hard. And it avoids a 1-frame delay at either end. (And no, you don't
need a 1-frame delay just because you're demuxing from a transport stream
either -- just throw each TS packet into the video ES decoder as soon as you
get it.)"
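
To make the "state object" idea concrete, here is a minimal C sketch. It is not libmpeg2's code, and every name in it (`dec_state`, `dec_feed`, `DEC_NEED_BYTES`, `DEC_GOT_SEQUENCE`) is made up for illustration; `DEC_NEED_BYTES` plays the role STATE_BUFFER plays in the description above. The only format facts it assumes are the 00 00 01 B3 sequence header start code and the 12-bit width and 12-bit height that follow it. It handles just that one header, but the caller may cut the input at any byte:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DEC_NEED_BYTES, DEC_GOT_SEQUENCE } dec_result; /* hypothetical */

typedef struct {
    uint32_t shift;    /* last four bytes seen; survives across calls */
    int      hdr_left; /* sequence-header bytes still owed; 0 = scanning */
    uint32_t hdr;      /* header bytes collected so far */
    int      width, height;
} dec_state;

void dec_init(dec_state *s)
{
    assert(s != NULL);
    s->shift = 0xFFFFFFFFu;
    s->hdr_left = 0;
    s->hdr = 0;
    s->width = s->height = 0;
}

/* Consume up to len bytes. Returns DEC_NEED_BYTES when the buffer ran dry,
 * DEC_GOT_SEQUENCE once width and height are known. *used is set to the
 * number of bytes consumed, so the caller can pass the rest back later. */
dec_result dec_feed(dec_state *s, const uint8_t *buf, size_t len, size_t *used)
{
    assert(s != NULL && used != NULL);
    for (size_t i = 0; i < len; i++) {
        uint8_t b = buf[i];
        if (s->hdr_left > 0) {
            /* Resume point: we were in the middle of the header. */
            s->hdr = (s->hdr << 8) | b;
            if (--s->hdr_left == 0) {
                s->width  = (int)((s->hdr >> 12) & 0xFFF);
                s->height = (int)(s->hdr & 0xFFF);
                s->shift  = 0xFFFFFFFFu;
                *used = i + 1;
                return DEC_GOT_SEQUENCE;
            }
            continue;
        }
        /* Scanning: a start code may straddle two buffers, which is why
         * the shift register lives in the state object and not on the stack. */
        s->shift = (s->shift << 8) | b;
        if (s->shift == 0x000001B3u) {
            s->hdr_left = 3; /* 12 bits width + 12 bits height */
            s->hdr = 0;
        }
    }
    *used = len;
    return DEC_NEED_BYTES;
}
```

Feeding `00 00 01 B3 16` in one call and `01 20` in the next ends with width 352 and height 288, the same as passing all seven bytes at once. A real decoder has far more phases than two, but they hang off the same kind of struct.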

~~~
phoboslab
Thanks for the explanation! It hadn't occurred to me that an encoder would
spit out packets for part of a frame, while it's still encoding. For what it's
worth, ffmpeg's public API - at least the one I used in a previous project[1]
- only produces full frames.

The MPEG-2 state object you linked to looks a lot like the private data of my
decoder already[2]. I wonder if there's any restriction on when a packet may
be concluded. I.e. do MPEG-PS packets have to contain full slices, or can they
be cut off in the middle of a slice?

The "hairy" part with my current design would be to reproduce the call stack.
Again, if the decoder lived in its own thread, it would be a no-brainer.

> and it's been done successfully by every major decoder implementation afaik

As far as I can tell, ffmpeg's decoder does not allow for this. It always
searches for the next picture's START_CODE before starting to decode the
frame. Similarly, the libmpeg2 source you linked to doesn't seem to provide
any functionality to resume decoding from anywhere in the stream either!? The
NEEDBITS and DUMPBITS macros just assume there's always more data.

[1] [https://github.com/phoboslab/jsmpeg-vnc/blob/master/source/e...](https://github.com/phoboslab/jsmpeg-vnc/blob/master/source/encoder.c#L76)

[2]
[https://github.com/phoboslab/pl_mpeg/blob/master/pl_mpeg.h#L...](https://github.com/phoboslab/pl_mpeg/blob/master/pl_mpeg.h#L1775-L1825)

~~~
keithwinstein
Well.... now that I've gone to read the code again, you're absolutely right
that I was too hasty in calling it the "byte-by-byte" state. It's more like
the "slice-by-slice" state. libmpeg2 goes header-by-header, so, it can process
each individual slice without waiting for the next picture start code, but it
does buffer up a whole slice before starting any work.

If you just give it a single byte (or any number of bytes that doesn't include
some subsequent start code, including a sequence_end_code for the end of the
whole video), it just copies it to an internal buffer and then asks for more
until it sees the beginning of the next slice or some other header. That's why
NEEDBITS and DUMPBITS don't have to bail out in the middle -- by the time you
get there, you know they have a whole slice to play with. So, yes, libmpeg2
does go start-code (or sequence_end_code) by start-code -- but not a _picture_
start code.
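
The buffering strategy described here can be sketched roughly as follows. This is an illustration in C with invented names (`chunker`, `chunker_feed`, `CHUNK_MAX`), not libmpeg2's code: bytes are copied into an internal buffer until the next 00 00 01 prefix shows up, and only then is the completed unit reported. Whatever parses that unit afterwards never has to stop halfway.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_MAX 4096 /* assumed cap for this sketch */

typedef struct {
    uint8_t  data[CHUNK_MAX];
    size_t   len;   /* bytes buffered so far for the current unit */
    uint32_t shift; /* last three bytes seen, to spot 00 00 01 across calls */
} chunker;

void chunker_init(chunker *c)
{
    assert(c != NULL);
    c->len = 0;
    c->shift = 0xFFFFFFu;
}

/* Copy input into the internal buffer.
 * Returns 1 when a 00 00 01 prefix was seen: the unit that preceded it is
 *           c->data[0 .. *unit_len), valid until the next call.
 * Returns 0 when all n bytes were buffered and more are needed.
 * Returns -1 if the unit does not fit in CHUNK_MAX bytes.
 * *used is the number of input bytes consumed. */
int chunker_feed(chunker *c, const uint8_t *buf, size_t n,
                 size_t *used, size_t *unit_len)
{
    assert(c != NULL && used != NULL && unit_len != NULL);
    for (size_t i = 0; i < n; i++) {
        if (c->len == CHUNK_MAX) {
            *used = i;
            return -1;
        }
        c->data[c->len++] = buf[i];
        c->shift = ((c->shift << 8) | buf[i]) & 0xFFFFFFu;
        if (c->shift == 0x000001u) {
            *unit_len = c->len - 3; /* drop the prefix we just copied */
            *used = i + 1;
            c->len = 0;             /* next unit starts at its code byte */
            c->shift = 0xFFFFFFu;
            return 1;
        }
    }
    *used = n;
    return 0;
}
```

The first unit reported is whatever came before the first start code (possibly empty). Every later unit begins with the start code's value byte, so the caller can tell a slice from a picture or sequence header before decoding it.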

ffmpeg/libavcodec is a wrapper around like 75+ different decoders, so I'm not
too surprised if they have to go with a least-common-denominator interface.

In general an MPEG-2 TS or PS packet is just a fixed size packet and doesn't
have to be aligned with any ES syntax element. Typically the PES packets (the
much larger packets encapsulated in PS/TS packets) _do_ contain exactly one
video picture (i.e. the data_alignment_indicator is set on every video PES
packet), but even this isn't formally required. Note that the PES packet
header also includes an _optional_ length field that would do what you want
(but it's optional, in part to accommodate encoders that don't want to buffer
the whole image before starting to encode pixels).
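
For reference, the two header fields mentioned here sit in the first seven bytes of an MPEG-2 PES packet. The C sketch below reads them. `pes_info` and `pes_parse` are hypothetical names, it only accepts video stream IDs (0xE0 to 0xEF), and it assumes the MPEG-2 PES layout (MPEG-1 system streams use a different packet header). A length of 0 is how the "optional" part shows up on the wire: for video carried in a transport stream it means the length is not specified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  stream_id;
    uint16_t packet_length; /* 0 = not specified (video in a TS only) */
    int      data_aligned;  /* data_alignment_indicator */
} pes_info;

/* Parse the start of a video PES packet.
 * Returns 0 on success, -1 if the bytes are not a video PES header
 * in the MPEG-2 layout or there are fewer than 7 of them. */
int pes_parse(const uint8_t *p, size_t n, pes_info *out)
{
    assert(p != NULL && out != NULL);
    if (n < 7)
        return -1;
    if (p[0] != 0x00 || p[1] != 0x00 || p[2] != 0x01)
        return -1;                     /* packet_start_code_prefix */
    if ((p[3] & 0xF0) != 0xE0)
        return -1;                     /* not a video stream_id */
    if ((p[6] & 0xC0) != 0x80)
        return -1;                     /* MPEG-2 marker bits '10' */

    out->stream_id     = p[3];
    out->packet_length = (uint16_t)((p[4] << 8) | p[5]);
    out->data_aligned  = (p[6] & 0x04) != 0;
    return 0;
}
```

A demuxer that finds `packet_length == 0` has no byte count to rely on and must fall back to start codes, or to the next PES header, to find the end of the picture.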

You might be interested in our TS/PES demuxing code that wraps libmpeg2/liba52
and tries to maintain a/v sync in the presence of arbitrary corruption -- it's
more than half the size of your entire decoder!

[https://github.com/StanfordSNR/puffer/blob/master/src/atsc/d...](https://github.com/StanfordSNR/puffer/blob/master/src/atsc/decoder.cc)

------
Scaevolus
MPEG2 is mostly patent-free too now, though I'm not sure how much larger its
decoder would be.

~~~
keithwinstein
The core of libmpeg2 is 3,810 lines of C (not including any systems-stream
demultiplexer or audio decoder) versus about 2,400 for pl_mpeg.h. So not
dramatically larger.

On the other hand, for progressive-scan 30fps video on computers without a VBV
constraint and with a deterministic known decoder, I'm not sure any of the
extra features in the MPEG-2 spec are very helpful compared with MPEG-1.

------
jhallenworld
Do you have the encoder side of this also?

------
ksec
I have been wondering if we could build something on top of MPEG2, AC3 and
MP3: codecs whose patents have expired, so the result would be truly
patent-free. Which reminds me of Musepack, based on MPEG-1 Layer 2 [1]. Truly
amazing quality at the time, even compared to high bitrate AAC.

[1] [https://www.musepack.net](https://www.musepack.net)

------
commandlinefan
Love to see stuff like this. I wonder why he put all of the code in a header
file, though... I've never seen that done before; it seems like it would make
it impossible to invoke this from two separate source files?

~~~
newnewpdro
It's somewhat trendy these days, but the main reason I've found is that Windows
developers struggle with adding third-party dependencies to their projects
because their development environment sucks.

~~~
Impossible
Although this is getting downvoted because it is somewhat inflammatory (stops
just short of using "winblowz" and "M$"), there is some truth to it. Visual
Studio _does_ have a package manager but it's not widely used, and in practice
can't be used in a lot of environments that C/C++ Windows developers (often
game developers) are using. That said, I've seen stb_image used widely in
environments that are not Windows, because even if you have a package manager
and a build system that manages dependencies better than vanilla MSBuild, it's
still lower friction to include a single header file library. Last time I used
Xcode, dependency management was about as painful as Visual Studio, but I might
have been doing it wrong.
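
On the original question about calling a header-only library from two source files: the usual convention (the one the stb libraries use, and pl_mpeg follows it with `PL_MPEG_IMPLEMENTATION`) is that the header always exposes declarations, while the function bodies are compiled only in the one file that defines an implementation macro before including it. A toy sketch, with a hypothetical `tiny_lib.h` pasted inline so that it compiles as a single file:

```c
#include <assert.h>

/* This is the ONE source file that asks for the implementation.
 * Every other file includes the header without defining the macro. */
#define TINY_LIB_IMPLEMENTATION

/* ---- begin tiny_lib.h (hypothetical; normally pulled in with #include) ---- */
#ifndef TINY_LIB_H
#define TINY_LIB_H
/* Declarations: every file that includes the header sees these. */
int tiny_add(int a, int b);
#endif /* TINY_LIB_H */

#ifdef TINY_LIB_IMPLEMENTATION
/* Definitions: compiled only where TINY_LIB_IMPLEMENTATION is defined,
 * so the linker sees exactly one copy of each function. */
int tiny_add(int a, int b) { return a + b; }
#endif /* TINY_LIB_IMPLEMENTATION */
/* ---- end tiny_lib.h ---- */

/* Ordinary caller code, as it could appear in any file that includes the header. */
void tiny_self_check(void)
{
    assert(tiny_add(2, 3) == 5);
}
```

Defining the macro in two files would give duplicate-symbol errors at link time, which is the failure the question anticipates. Defining it in exactly one avoids that.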

------
VikingCoder
How long until someone emscripten's this, so it runs directly in JS in the
browser?

And how would that compare to others?

[https://jsmpeg.com/](https://jsmpeg.com/)

~~~
flohofwoe
Here you go :)

[https://floooh.github.io/sokol-html5/plmpeg-sapp.html](https://floooh.github.io/sokol-html5/plmpeg-sapp.html)

------
taneq
As an aside, it makes me happy that Bink is still around. I remember using
their Bink video codec for VJing way back in 2000 or whatever.

------
xmichael999
Excellent stuff, really excellent. Thanks for the additional links and cool
tangents too!

