JSMpeg – Decode It Like It's 1999 (jsmpeg.com)
225 points by mmcclure on May 7, 2019 | 58 comments

Keep in mind that MPEG-1 was designed to give acceptable performance on the computers of its time, so it is a far lower-complexity codec than those which came after it. That explains why current hardware can decode it acceptably even with JS overhead. The widely used standard codecs roughly have a history like this:

    H.261 - first standard DCT-with-motion-compensation codec
    MPEG-1 - B frames, variable rates and sizes
    MPEG-2/H.262 - interlacing, 4:2:2 and 4:4:4 subsampling, other minor features
    H.263 (FLV) - mainly low bitrate improvements, introduction of intra prediction
    MPEG-4 part 2 (DivX/Xvid, etc.) - more prediction modes, some very advanced and little-used features (3D shape coding?)
    H.264 - different transform, even more prediction modes and features
    H.265 - not really familiar enough with this one to say
Also worth noting that this isn't a full MPEG-1 decoder, since it doesn't support B-frames. The justification given in the documentation is "no modern encoder seems to use these by default anyway", but B-frames are what give MPEG-1 a significant compression advantage over its predecessor, H.261; so I'd consider this implementation to be closer to an H.261 with variable frame sizes and framerates, which is useful enough.

ogv.js does Theora, VP8, VP9, and AV1 video and Opus and Vorbis audio:



A demo which lets you select various codecs to try out. You can, for example, play 720p VP9 video in Safari with the WebAssembly build of the decoder:


H.265's biggest changes are probably a change from fixed macroblocks (basically a fixed grid) to a tree based structure called a coding tree (see this paper for details: http://citeseerx.ist.psu.edu/viewdoc/download?doi=

It includes 8k and HDR color support (in 10 bit in v1 via main 10 profile, and 12 and 16 bit in v2).

In addition it moved from 9 prediction modes to 26.

"MPEG-4 part 2 (DivX/Xvid, etc.)" - did you mean "MPEG-2 part 4 (DivX/Xvid)", hence the (misleading) filename extension "mp4"?

MPEG4 the video codec is https://en.wikipedia.org/wiki/MPEG-4_Part_2

mp4 the media container is https://en.wikipedia.org/wiki/MPEG-4_Part_14

MPEG-2 part 4 is some conformance testing specification.

I almost did a double take when I saw this project.

We used this library two years ago in HS for some low-latency VR streaming. And yes, it's a little CPU-intensive so our smartphones got rather hot.

We tried other streaming protocols (such as H.264) but all of them introduced noticeable latency that made our system disorienting. Only JSMPEG was fast enough for our purposes. It's a fantastic library for any low-latency streaming! Highly recommend it.


Anyways, self-plug for our old project: https://rmj.us/motorized-live-360-video/. Basically, the smartphone's gyroscope controls a remote video camera that streams a live feed to the user's headset.

Interesting project!

Curious what exactly you tried - H.264 is a codec, and there's a bunch of ways of delivering it to the client (HLS, WebRTC, some have built WebSocket-based streaming, ...), and I'd expect that the main latency is hidden there, not in the decoding?

Of course. FFMpeg was used consistently on our backend; we cycled through various codecs to find the best one for our purpose and changed the frontend library accordingly.

And I think you're right on the encoding latency. I believe that H.264 buffers a little bit before it makes a decision on how to compress the frames, whereas MPEG1 doesn't? I could be completely wrong but my gut is telling me that MPEG1 is basically independent, slightly-modified JPEG frames.

> I could be completely wrong but my gut is telling me that MPEG1 is basically independent, slightly-modified JPEG frames.

Not that simple, but not that far off.

Basically MPEG1 has key-frames (“JPEG” frames) and (forward) delta-frames (diffs).

From my understanding H264 has many improvements, including reverse delta-frames.

My guess is that disabling those in the encoder will improve real-time capabilities.
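To make the key-frame / forward-delta idea concrete, here is a toy sketch. It is not the MPEG-1 bitstream (no DCT, no motion compensation, no quantization); the function names and the flat pixel arrays are made up for illustration. It only shows that a forward delta depends on frames already received, so a decoder can emit each frame as soon as it arrives, whereas a frame that referenced a future frame would force the decoder to wait.

```javascript
// Toy model: a "frame" is a flat array of pixel values.
// "I" frames carry every pixel; "P" frames carry only the difference
// from the previous frame (a forward delta).
function encodeFrames(frames, keyInterval) {
  const out = [];
  frames.forEach((frame, i) => {
    if (i % keyInterval === 0) {
      out.push({ type: "I", data: frame.slice() });
    } else {
      const prev = frames[i - 1];
      out.push({ type: "P", data: frame.map((v, j) => v - prev[j]) });
    }
  });
  return out;
}

// Decoding only ever looks backwards, so nothing has to be buffered.
function decodeFrames(encoded) {
  const frames = [];
  let prev = null;
  for (const f of encoded) {
    const frame =
      f.type === "I" ? f.data.slice() : f.data.map((d, j) => prev[j] + d);
    frames.push(frame);
    prev = frame;
  }
  return frames;
}

const frames = [
  [10, 10, 10, 10],
  [10, 11, 10, 10],
  [10, 12, 10, 10],
];
const encoded = encodeFrames(frames, 3);
const decoded = decodeFrames(encoded);
```

In this toy the delta frames are mostly zeros, which is where the compression win over independent JPEG-like frames comes from.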

Ah gotcha, I think I was mixing up MPEG1 and MJPEG in my head.

I didn't realize that you could disable those features in the encoder. I'll have to look into x264 tunables that lower latency. Might be interesting.

Bingo, that's one of the tunings you can use for real-time communication. There are a lot of ways to tune an encoder, and getting optimal settings for each use case is always tricky.

libx264 (which is the H.264 software encoder everyone uses) has a mode optimized for zero latency (--tune zerolatency).
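For anyone driving ffmpeg from Node, a minimal sketch of what that could look like. The function name and the input/output values are placeholders; the flags are ffmpeg's spelling of the x264 options (`-tune zerolatency` instead of `--tune zerolatency`). The explicit `-bf 0` is an assumption added for clarity, since the zerolatency tune already disables B-frames. The sketch only builds the argument list and does not start a process.

```javascript
// Build an ffmpeg argument list for low-latency H.264 over MPEG-TS.
// `input` and `output` are whatever ffmpeg accepts (file, device, URL).
function lowLatencyX264Args(input, output) {
  return [
    "-i", input,
    "-c:v", "libx264",
    "-preset", "ultrafast",   // trade compression efficiency for speed
    "-tune", "zerolatency",   // no lookahead, no frame reordering
    "-bf", "0",               // no B-frames (already implied by the tune)
    "-f", "mpegts",
    output,
  ];
}

const args = lowLatencyX264Args("input.mp4", "udp://127.0.0.1:1234");
// To actually run it (assumes ffmpeg is on PATH):
//   require("child_process").spawn("ffmpeg", args, { stdio: "inherit" });
```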

Cool project, similar in spirit to something I did a few years back as well ( https://github.com/Matsemann/oculus-fpv ). It's pretty trippy watching real stuff live, but your "head" is somewhere else. Cool that it's even possible to experiment with in the browser now.

Oh shoot wow that's really cool! Honestly, it's such a weird feeling viewing through that secondary perspective. I'm glad that someone else also came up with the idea.

We had a lot of fun by having someone hold the device mounted on a pole, looking down on the viewer, and following them around the room. This provided a weird, quasi-video-gamey third-person perspective, like in GTA or some other.

A friend once walked towards my office while we were video chatting (he was on his phone). Seeing the video of our office hallway on my screen, and realizing he was walking to my office disorientated me a little.

And these guys did that with a car: https://www.youtube.com/watch?v=nIRUavithF8

That experience sounds like 'virtual embodiment'. Researchers are using it for a bunch of different things as it can be quite powerful. So powerful, in fact, that some of the researchers have been advocating for limiting access to the technology and establishing a 'code of ethics' around its use. Basically, how you see the world is tied to an internal mental model of your own body that establishes where each limb is, how big you are in relation to your environment, etc... and virtual embodiment seems able to alter that very easily. With lasting effects.

One man ported Grand Theft Auto V to VR, and he got really freaked out when he shot a man in-game...

Virtual embodiment is a bit different from regular VR and much more intense. Sure, I imagine there will be people that get freaked out in regular VR since fake violence is literally what people think real violence actually IS now. The fact that the unreal depiction doesn't resemble actual violence in any way except on a vague conceptual basis is totally ignored by most people. Luckily, though, things like PTSD are rooted in direct physiological effects rather than cognitive ones, so it won't cause too many problems in the long run.

Virtual embodiment requires a separate camera in addition to the VR. What they do is have you don a VR helmet, and in that you see the environment around you - but actually it's the camera mounted on your face. Then, however, the perspective begins to move. Without you moving yourself, the view you see moves, turns around, and you can see yourself. At this point, you basically feel disembodied. But then the perspective is slowly moved, with your real body still in view, into a 'virtual avatar' of some sort. From that point, everything you see is from that avatar's perspective, and as far as your brain is concerned, that IS your body.

They have used this to transfer the perspective of large, imposing men into the virtual bodies of small women, then they have large, imposing male VR characters come in and begin shouting at them. As they look down, they see their thin arms, their short stature, their lack of muscle, etc, and they get legitimately scared of the huge male figure confronting them. Tests afterward showed that the men who underwent this experiment (men who had previously been abusive to their partners) showed a marked improvement in their ability to recognize fear in the face of others, an impairment common to most abusers.

It's a fascinating topic, and one of the leading researchers using it, a Dr. Metzinger, has proposed that a VR Code of Ethics be considered. I don't personally think we really have the ability to competently form such a code since we're pretty early on and don't really know how things will affect people, but it is an issue that's being considered. Any time potential censorship of things like this is proposed, I always consider the case of actors in films and plays. They're already far more "immersed" and doing things more "interactively" than any technology is likely to enable us to do. They use real guns (loaded with blanks), shoot them at real people, see blood packs explode, see those real people they personally know crumple to the floor or wail in pain, etc... and they're fine.

We experimented with it five years ago trying to reduce the loading time of animations in a mobile game, but with the phones of the time we only managed to exchange waiting for the data with waiting for the devices to process the clips.

I guess nowadays this strategy would work much better, even considering that 4G is currently the norm.

Wow, that MPEG-1 encoded music video just hit me with a wave of nostalgia - not the song itself but the encoding!

I remember downloading similarly-encoded (and much worse) music videos at the time, pausing and resuming over multiple nights on services like KaZaa. Good times.

This is a really cool project and helped inspire me to write a "streamer" using ASM.js with FFMPEG on the front end and libav on the back end reading a UDP stream. You can send the TS packets over the Websocket as a data blob and decode them on the front with FFMPEG. I was able to get sub 100ms latency (probably a couple frames) on a local network. I wrote a demo which you can find here. https://github.com/colek42/streamingDemo
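A rough sketch of the receiving side of that idea, assuming the WebSocket message payload is a run of whole MPEG-TS packets. The 188-byte packet size, the 0x47 sync byte and the 13-bit PID layout come from the MPEG-TS format; the function name and the fake data are made up, and this is not taken from the linked demo.

```javascript
const TS_PACKET_SIZE = 188; // fixed MPEG-TS packet length in bytes
const TS_SYNC_BYTE = 0x47;  // every packet starts with this byte

// Split a binary chunk (e.g. a WebSocket message wrapped in a Uint8Array)
// into TS packets and read the 13-bit PID from each header.
function splitTsPackets(bytes) {
  const packets = [];
  for (let off = 0; off + TS_PACKET_SIZE <= bytes.length; off += TS_PACKET_SIZE) {
    if (bytes[off] !== TS_SYNC_BYTE) {
      throw new Error("lost sync at offset " + off);
    }
    const pid = ((bytes[off + 1] & 0x1f) << 8) | bytes[off + 2];
    packets.push({ pid, data: bytes.subarray(off, off + TS_PACKET_SIZE) });
  }
  return packets;
}

// Two fake packets: one with PID 0x100, one with the null PID 0x1fff.
const chunk = new Uint8Array(TS_PACKET_SIZE * 2);
chunk[0] = TS_SYNC_BYTE; chunk[1] = 0x01; chunk[2] = 0x00;
chunk[188] = TS_SYNC_BYTE; chunk[189] = 0x1f; chunk[190] = 0xff;
const packets = splitTsPackets(chunk);
```

In a browser, setting `binaryType = "arraybuffer"` on the WebSocket and wrapping `event.data` in a `Uint8Array` would produce the input this function expects.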

Not bad. I was able to run 1080p trailer of the movie in the following link with CPU ranging between 32% to 39% on my old MacBook Pro from year 2012.


I measured the same performance (~35% CPU) on my rMBP 2012 in Chrome under macOS.

On my new Lenovo X1X I measured 4% CPU usage with Chrome under Windows. Sure, the X1 has a new CPU, but I didn't expect this performance increase. Is this a Windows optimization?

After reading your comment, I tried disabling the "WebGL" option on that perf web page, and guess what? CPU usage jumped to 64%, twice what it was before.

I think in the case of the X1, the GPU memory must be really good and WebGL is able to offload computation to the GPU.

Similar thing for h265: http://strukturag.github.io/libde265.js/ Code: https://github.com/strukturag/libde265.js Difference is that this is using emscripten to compile the native code to JS. That was 5 years ago. Today, we would use WASM, of course.

Runs surprisingly smooth, even though a native decoder is clearly superior. However, there are still nice uses, like adding support for HEIF to the browser: https://strukturag.github.io/libheif/ Code: https://github.com/strukturag/libheif

Interesting how the FPS keeps rising over time, possibly the browser progressively optimising code hotspots?

Could this be used in ads instead of native players, circumventing browser limitations such as auto-play? Obviously, the implementation would have to get better, but perhaps WebAssembly would make it just bearable enough that ad-ridden sites would allow them to eat up your CPU?

> JSMpeg can decode 720p Video at 30fps on an iPhone 5S, works in any modern browser (Chrome, Firefox, Safari & Edge) and comes in at 42kb gzipped.

What is its power use, versus native decoding?

A few years ago video tags were very limited on iOS - they would only play fullscreen and couldn't be autoplayed (which would have been useful for GIFs, even MPEG1 is more bandwidth effective than GIF).

None of those restrictions remain now, so this is more just an interesting proof of concept at this point.

A lot I would think. It was using "150%" cpu on my 2015 MacBook Pro.

Excellent. Goes nicely with the Python implementation of ZFS from the other day.

Surely there must be a python interpreter written in Node we can use somehow.

Remember: We need full stack!


One disadvantage I discovered: it stops playing when you go to another tab

That can be an advantage based on what video you may be watching and where ;)

I've used this in a production project I can't really talk about. It performed super well and I was consistently impressed.

kb != kB.

This seems tremendously CPU intensive, my fan spun up after two seconds of playing video. The idea is interesting, but until it does it with GLSL shaders or something to get that sweet hardware acceleration, it's just a proof that it CAN be done, rather than a practical solution.

It'll use shaders if WebGL is supported and enabled: https://github.com/phoboslab/jsmpeg/blob/master/src/webgl.js

There is also a site to test features and their impact on performance: https://jsmpeg.com/perf.html

> rather than a practical solution.

I have a very practical solution for you: bypassing the autoplay restriction for ads.

Ohh that's great! Terrible! But great!

This is why videos are allowed to autoplay if muted.

Otherwise sites would use GIFs or something like this, which is much less efficient.

Well, this allows autoplay even if not muted :-P.


This would be very user hostile.

You mean user engagement will improve, right? :-P

Like every other form of advertising.

> it's just a proof that it CAN be done

Which is a perfectly valid reason to build something. But this project is 6 years old (almost to the day, according to Github) and has over 3700 stars, so it seems to be a bit more than a mere novelty.

> it's just a proof that it CAN be done, rather than a practical solution.

Thankfully this is Hacker News, not Practical News.

Straight from the minified version:

> JSMpeg.Renderer.WebGL.IsSupported()?new JSMpeg.Renderer.WebGL(options):new JSMpeg.Renderer.Canvas2D(options);

Not sure why you didn't check if it supported WebGL or not.
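The same pattern, written out as a standalone sketch. This is not JSMpeg's code; `pickRenderer` and the fake canvas objects are hypothetical and only exist so the snippet also runs outside a browser. In a page you would pass a real `<canvas>` element.

```javascript
// Try WebGL first, fall back to the 2D canvas context.
function pickRenderer(canvas) {
  let gl = null;
  try {
    gl = canvas.getContext("webgl") || canvas.getContext("experimental-webgl");
  } catch (err) {
    gl = null; // some environments throw instead of returning null
  }
  if (gl) {
    return { kind: "webgl", context: gl };
  }
  return { kind: "canvas2d", context: canvas.getContext("2d") };
}

// Stand-ins for a canvas with and without WebGL support.
const fakeCanvasWithGl = {
  getContext: (name) => (name === "webgl" || name === "2d" ? {} : null),
};
const fakeCanvasNoGl = {
  getContext: (name) => (name === "2d" ? {} : null),
};

const withGl = pickRenderer(fakeCanvasWithGl);
const withoutGl = pickRenderer(fakeCanvasNoGl);
```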

Not stressing my 5 year old laptop out too much.

Now compare the amount of data it consumes vs h264 for the same resolution/quality... This isn't going to fly. And, WebRTC IS supported on iPhones now, which kills the main area of use for it.

How much? Probably only 25-50% more? It's hard to guess, but I imagine it isn't more than a factor of 2.

H.264 is more like 5-10x more efficient at the same quality. MPEG-1 is 30 years old. The growth in available computational capacity, along with the tremendous investment in algorithmic research over that period, cannot be overstated.

I'd guess factor 4-10. Most video codec implementations are heavily optimized including use of assembly or SIMD intrinsics. So it's not only that JS is slower for the same code, there are also some optimized constructs that are not possible in JS.

anovikov and I are referring to filesize.

At least by a factor of 4 and by a factor of 10 in some conditions. MPEG-1 is very processor efficient, but in exchange for very low bandwidth efficiency. Today we watch full movies in 720p at decent quality in under 1GB, while in the late 1990s it took 2 Video CDs (also ~1GB) of MPEG-1 for a crappy, grainy, VHS-style movie. Nobody was much pissed because the alternative was VHS itself.
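A back-of-the-envelope check of the Video CD comparison above. The 2-hour runtime and the 352x240 Video CD resolution are assumptions not stated in the comment; the ~1GB figure is the commenter's. With the same file size and runtime the average bitrate is identical, so the difference shows up as pixels delivered per bit.

```javascript
// Average bitrate in kbit/s for a file of `bytes` lasting `seconds`.
function averageKbps(bytes, seconds) {
  return (bytes * 8) / seconds / 1000;
}

const movieSeconds = 2 * 60 * 60; // assumed 2-hour movie
const oneGigabyte = 1e9;          // "~1GB" from the comment

// Same size, same runtime: both come out at the same average bitrate.
const kbps = averageKbps(oneGigabyte, movieSeconds);

// Pixels per frame: 720p versus an assumed 352x240 Video CD frame.
const pixels720p = 1280 * 720;
const pixelsVcd = 352 * 240;
const pixelRatio = pixels720p / pixelsVcd;
```

Under these assumptions the 720p file carries roughly eleven times as many pixels per frame at the same bitrate, which lines up with the factor of 4 to 10 quoted above.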
