Keep in mind that MPEG-1 was designed to give acceptable performance on the computers of its time, so it is a far lower-complexity codec than those that came after it, which explains why current hardware can decode it acceptably even with JS overhead. The standard widely-used codecs roughly have a history like this:
H.261 - first standard DCT-with-motion-compensation codec
MPEG-1 - B frames, variable rates and sizes
MPEG-2/H.262 - interlacing, 4:2:2 and 4:4:4 subsampling, other minor features
H.263 (FLV) - mainly low bitrate improvements, introduction of intra prediction
MPEG-4 part 2 (DivX/Xvid, etc.) - more prediction modes, some very advanced and little-used features (3D shape coding?)
H.264 - different transform, even more prediction modes and features
H.265 - not really familiar enough with this one to say
Also worth noting that this isn't a full MPEG-1 decoder, since it doesn't support B-frames. The justification given in the documentation is "no modern encoder seems to use these by default anyway", but B-frames are what give MPEG-1 a significant compression advantage over its predecessor, H.261; so I'd consider this implementation to be closer to an H.261 with variable frame sizes and framerates, which is useful enough.
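To make the B-frame point concrete: a B-frame is predicted from both an earlier and a later reference frame, so the stream has to carry frames in a different order than they are displayed, and the decoder has to hold frames back and reorder them. Here's a minimal sketch of that reordering in Python. The frame labels and the decode_order helper are made up for illustration and have nothing to do with JSMpeg's actual code.

```python
def decode_order(display_order):
    """Reorder frames from display order into the order a decoder needs them.

    Each frame is a label such as "I0" or "B1": the first character is the
    frame type, the rest is its display index. A B-frame needs the *next*
    I/P frame decoded first, so every I/P frame is moved ahead of the
    B-frames that precede it on screen.
    """
    out = []
    pending_b = []
    for frame in display_order:
        if frame[0] == "B":
            # Cannot be decoded until the following reference frame arrives.
            pending_b.append(frame)
        else:
            # I or P frame: emit it, then the B-frames that were waiting on it.
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
    # B-frames left over at the end have no later reference in this list.
    out.extend(pending_b)
    return out


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(decode_order(display))
```

Without B-frames the two orders are identical, which is part of why a decoder that skips them is simpler.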
I almost did a double take when I saw this project.
We used this library two years ago in HS for some low-latency VR streaming. And yes, it's a little CPU-intensive so our smartphones got rather hot.
We tried other streaming protocols (such as H.264) but all of them introduced noticeable latency that made our system disorienting. Only JSMPEG was fast enough for our purposes. It's a fantastic library for any low-latency streaming! Highly recommend it.
---
Anyways, self-plug for our old project: https://rmj.us/motorized-live-360-video/. Basically, the smartphone's gyroscope controls a remote video camera that streams a live feed to the user's headset.
Curious what exactly you tried - H.264 is a codec, and there's a bunch of ways of delivering it to the client (HLS, WebRTC, some have built WebSocket-based streaming, ...), and I'd expect that the main latency is hidden there, not in the decoding?
Of course. FFmpeg was used consistently on our backend; we cycled through various codecs to find the best one for our purpose and changed the frontend library accordingly.
And I think you're right on the encoding latency. I believe that H.264 buffers a little bit before it makes a decision on how to compress the frames, whereas MPEG1 doesn't? I could be completely wrong but my gut is telling me that MPEG1 is basically independent, slightly-modified JPEG frames.
Bingo, that's one of the tunings you can use for real-time communication.
There are a lot of ways to tune an encoder, and getting the optimal settings for each use case is always tricky.
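For anyone wanting to try the two setups being compared, here's a rough sketch of the ffmpeg argument lists, built in Python so they can be handed to subprocess.run. Assumptions: ffmpeg is on the PATH and was built with libx264, and the input and output names are placeholders I made up. "-bf 0" turns off B-frames for the MPEG-1 encoder; for x264, "-tune zerolatency" is, as far as I know, the tuning that disables B-frames and lookahead buffering.

```python
def mpeg1_low_latency_args(src, dst):
    """ffmpeg arguments for MPEG-1 video in an MPEG-TS container, no B-frames."""
    return [
        "ffmpeg", "-i", src,
        "-f", "mpegts",
        "-codec:v", "mpeg1video",
        "-bf", "0",            # no B-frames, so no frame reordering delay
        dst,
    ]


def h264_low_latency_args(src, dst):
    """ffmpeg arguments for H.264 (libx264) tuned for low latency."""
    return [
        "ffmpeg", "-i", src,
        "-f", "mpegts",
        "-codec:v", "libx264",
        "-preset", "ultrafast",    # trade compression for encoding speed
        "-tune", "zerolatency",    # avoid encoder-side frame buffering
        dst,
    ]


# Hypothetical file names, only here to show the resulting command lines.
print(" ".join(mpeg1_low_latency_args("input.mp4", "out_mpeg1.ts")))
print(" ".join(h264_low_latency_args("input.mp4", "out_h264.ts")))
```

This only covers the encoder side; the transport (HLS, WebRTC, WebSocket) can easily add more delay than either codec.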
Cool project, similar in spirit to something I did a few years back as well ( https://github.com/Matsemann/oculus-fpv ). It's pretty trippy watching real stuff live, but your "head" is somewhere else. Cool that it's even possible to experiment with in the browser now.
Oh shoot wow that's really cool! Honestly, it's such a weird feeling viewing through that secondary perspective. I'm glad that someone else also came up with the idea.
We had a lot of fun by having someone hold the device mounted on a pole, looking down on the viewer, and following them around the room. This provided a weird, quasi-video-gamey third-person perspective, like in GTA or some other.
A friend once walked towards my office while we were video chatting (he was on his phone). Seeing the video of our office hallway on my screen, and realizing he was walking to my office disorientated me a little.
That experience sounds like 'virtual embodiment'. Researchers are using it for a bunch of different things as it can be quite powerful. So powerful, in fact, that some of the researchers have been advocating for limiting access to the technology and establishing a 'code of ethics' around its use. Basically, how you see the world is tied to an internal mental model of your own body that establishes where each limb is, how big you are in relation to your environment, etc... and virtual embodiment seems able to alter that very easily. With lasting effects.
Virtual embodiment is a bit different from regular VR and much more intense. Sure, I imagine there will be people that get freaked out in regular VR since fake violence is literally what people think real violence actually IS now. The fact that the unreal depiction doesn't resemble actual violence in any way except on a vague conceptual basis is totally ignored by most people. Luckily, though, things like PTSD are rooted in direct physiological effects rather than cognitive ones, so it won't cause too many problems in the long run.
Virtual embodiment requires a separate camera in addition to the VR. What they do is have you don a VR helmet, and in that you see the environment around you - but actually it's the camera mounted on your face. Then, however, the perspective begins to move. Without you moving yourself, the view you see moves, turns around, and you can see yourself. At this point, you basically feel disembodied. But then the perspective is slowly moved, with your real body still in view, into a 'virtual avatar' of some sort. From that point, everything you see is from that avatar's perspective, and as far as your brain is concerned, that IS your body.
They have used this to transfer the perspective of large, imposing men into the virtual bodies of small women, then they have large, imposing male VR characters come in and begin shouting at them. As they look down, they see their thin arms, their short stature, their lack of muscle, etc, and they get legitimately scared of the huge male figure confronting them. Tests afterward showed that the men who underwent this experiment (men who had previously been abusive to their partners) showed a marked improvement in their ability to recognize fear in the face of others, an impairment common to most abusers.
It's a fascinating topic, and one of the leading researchers using it, a Dr. Metzinger, has proposed that a VR Code of Ethics be considered. I don't personally think we really have the ability to competently form such a code since we're pretty early on and don't really know how things will affect people, but it is an issue that's being considered. Any time potential censorship of things like this is proposed, I always consider the case of actors in films and plays. They're already far more "immersed" and doing things more "interactively" than any technology is likely to enable us to do. They use real guns (loaded with blanks), shoot them at real people, see blood packs explode, see those real people they personally know crumple to the floor or wail in pain, etc... and they're fine.
We experimented with it five years ago trying to reduce the loading time of animations in a mobile game, but with the phones of the time we only managed to exchange waiting for the data with waiting for the devices to process the clips.
I guess nowadays this strategy would work much better, even considering that 4G is currently the norm.
Wow, that MPEG-1 encoded music video just hit me with a wave of nostalgia - not the song itself but the encoding!
I remember downloading similarly-encoded (and much worse) music videos at the time, pausing and resuming over multiple nights on services like KaZaa. Good times.
This is a really cool project and helped inspire me to write a "streamer" using asm.js with FFmpeg on the front end and libav on the back end reading a UDP stream. You can send the TS packets over the WebSocket as a data blob and decode them on the front end with FFmpeg. I was able to get sub-100ms latency (probably a couple of frames) on a local network. I wrote a demo which you can find here: https://github.com/colek42/streamingDemo
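For context on what "TS packets as a data blob" looks like on the wire: an MPEG transport stream is a sequence of fixed 188-byte packets, each starting with the sync byte 0x47 and carrying a 13-bit PID in the next two bytes. Below is a small Python sketch of splitting a received blob into packets and reading the PIDs. The helper names and the synthetic test packets are mine, not taken from the demo repo, which does this in the browser.

```python
TS_PACKET_SIZE = 188   # fixed size of an MPEG-TS packet in bytes
SYNC_BYTE = 0x47       # first byte of every TS packet


def split_ts_packets(blob):
    """Split a bytes blob into whole 188-byte TS packets.

    Raises ValueError if a packet does not start with the sync byte.
    Any trailing partial packet is ignored.
    """
    packets = []
    for offset in range(0, len(blob) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = blob[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte offset {offset}")
        packets.append(packet)
    return packets


def packet_pid(packet):
    """Return the 13-bit PID: low 5 bits of byte 1 followed by all of byte 2."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def make_packet(pid):
    """Build a synthetic, zero-filled TS packet for the demo below."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


blob = make_packet(0x0000) + make_packet(0x0100)
pids = [packet_pid(p) for p in split_ts_packets(blob)]
print(pids)  # [0, 256]
```

Because the packets are fixed-size and self-synchronizing, a WebSocket message can carry any whole number of them without extra framing.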
I measured the same performance (~35% CPU) on my rMBP 2012 in Chrome under macOS.
On my new Lenovo X1X I measured 4% CPU usage with Chrome under Windows. Sure, the X1 has a new CPU, but I didn't expect this performance increase. Is this a Windows optimization?
Could this be used in ads instead of native players, circumventing browser limitations such as auto-play?
Obviously, the implementation would have to get better, but perhaps WebAssembly would make it just bearable enough that ad-ridden sites would allow them to eat up your CPU?
A few years ago video tags were very limited on iOS - they would only play fullscreen and couldn't be autoplayed (which would have been useful for GIF-style clips, since even MPEG1 is more bandwidth-efficient than GIF).
None of those restrictions remain now, so this is more just an interesting proof of concept at this point.
This seems tremendously CPU intensive, my fan spun up after two seconds of playing video.
The idea is interesting, but until it does it with GLSL shaders or something to get that sweet hardware acceleration, it's just a proof that it CAN be done, rather than a practical solution.
Which is a perfectly valid reason to build something. But this project is 6 years old (almost to the day, according to GitHub) and has over 3700 stars, so it seems to be a bit more than a mere novelty.
Now compare the amount of data it consumes vs h264 for the same resolution/quality... This isn't going to fly. And, WebRTC IS supported on iPhones now, which kills the main area of use for it.
It's more like 5-10x better at the same quality vs. h.264. MPEG-1 is 30 years old. The available computational capacity along with the tremendous investment in algorithmic research in that period cannot be overstated.
I'd guess factor 4-10. Most video codec implementations are heavily optimized including use of assembly or SIMD intrinsics. So it's not only that JS is slower for the same code, there are also some optimized constructs that are not possible in JS.
At least by a factor of 4, and by a factor of 10 in some conditions. MPEG-1 is very processor-efficient, but in exchange for very low bandwidth efficiency. Today we watch full movies in 720p at decent quality in under 1GB, whereas in the late 1990s an MPEG-1 movie needed 2 Video CDs (also ~1GB) for a crappy, grainy VHS-style picture. Nobody was too pissed about it, because the alternative was VHS itself.
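A quick back-of-the-envelope check of that factor, comparing bits spent per pixel. The Video CD numbers (352x240 at 29.97 fps, about 1150 kbit/s of video) are the standard NTSC VCD parameters. The modern side is my assumption: a 100-minute 720p movie at 24 fps in a 1 GB file, counting the whole file as video. Bits per pixel is only a crude proxy for quality, so take the result as a sanity check, not a measurement.

```python
def bits_per_pixel(bitrate_bps, width, height, fps):
    """Average number of encoded bits available per displayed pixel."""
    return bitrate_bps / (width * height * fps)


# Video CD: standard NTSC parameters.
VCD_VIDEO_BPS = 1_150_000
vcd_bpp = bits_per_pixel(VCD_VIDEO_BPS, 352, 240, 29.97)

# Modern file: ASSUMED 100-minute movie, 1 GB (10**9 bytes), 720p at 24 fps.
# The audio track is ignored, which slightly overstates the video bitrate.
MOVIE_SECONDS = 100 * 60
FILE_BITS = 1_000_000_000 * 8
hd_bps = FILE_BITS / MOVIE_SECONDS
hd_bpp = bits_per_pixel(hd_bps, 1280, 720, 24)

ratio = vcd_bpp / hd_bpp
print(f"VCD:  {vcd_bpp:.3f} bits/pixel")
print(f"720p: {hd_bpp:.3f} bits/pixel")
print(f"VCD spends about {ratio:.1f}x more bits per pixel")
```

Under these assumptions the ratio lands between 7 and 8, inside the 4-10 range mentioned above, and that is before accounting for the 720p file also looking better.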