FFmpeg for browser and Node, powered by WebAssembly (ffmpegwasm.netlify.app)
274 points by MrRolling 3 months ago | 87 comments



FFmpeg is licensed under LGPL which can (albeit carefully) work on the web, but libx264 is GPL. I think that alone makes this project unusable in non-open-source projects? (For reference, GPL does not have any dynamic linking exception, so the copyleft terms apply as soon as you include it in your shipped binary in any way)

Aside from the OSS licenses, H.264 encoders/decoders require a patent license. Most users get around this by using an API (e.g. `<video>` tags, AVFoundation) - which means Google/Apple pay for it when they ship the decoder for Chrome/iOS. How does this project get around that requirement?


> I think that alone makes this project unusable in non-open-source projects?

I hope so. I mean, that's the only reason for GPL to exist, right? There are quite a few proprietary programs that decided to relicense as GPL in order to use a GPL library.


IANAL but you don't need to relicense your program to GPL just to use a GPL library. You can relicense to something that is compatible with GPL, like say the MIT license. Then the combined form has to be distributed under GPL terms. But when you one day replace the library with something not under GPL, you can distribute combined forms of those under MIT.


This must be why so many proprietary apps require you to manually download FFmpeg binaries during runtime. Thanks for the explanation!


That is correct. If you want to redistribute a mixture of different program components as one program, all the pieces have to have licenses that are compatible. If any of them are GPL, then the whole thing will be GPL; all the other licenses have to be cool with being part of a whole that carries a GPL umbrella license. MIT and BSD pieces are this way.


Yes, that's true. I mean that the whole work has to be under GPL. The exact terms of your own code are less important, as long as it is GPL compatible.


I think the original point of GPL was to incentivise companies/developers to contribute changes back to GPL licensed project. What is clear is that, minus a few exceptions, most large open source engaged companies avoid GPL because of the risk of compliance [1]. And many authors that originally licensed their projects as GPL see this and regret their choice [2][3]. This is why there is a trend away from copyleft licenses [4].

[1] https://opensource.google/docs/thirdparty/licenses/#restrict...

[2] https://lwn.net/Articles/478361/

[3] https://twitter.com/id_aa_carmack/status/1412271091393994754...

[4] https://opensource.com/article/17/2/decline-gpl


No, that was the point of LGPL (Lesser or Library GPL)

The point of GPL software is that it's to be used in a GPL system, where users retain modification rights to the system they've acquired, and to prevent proprietary systems from free-loading off of GPL work.


Yep, it's funny to see people complaining about it when it is literally working as intended.


Even LGPL seems to have quite some FUD around the compatibility with distributing resulting software in app stores.

This seems to be quite at odds with the (maybe mistaken?) idea of a license that allows embedding a component without "license contaminating" the entire project, but still requires publishing changes to the component itself.

I wonder if there is a license that actually achieves that, without restricting the ability to publish to app stores?


That's not the only idea behind the LGPL (v3 especially), so it's not surprising it doesn't fulfill that without restricting other things.

MPL might be more what you are looking for.


The original point of the GPL was to ensure that end-users retained their rights to the source code. See https://www.gnu.org/philosophy/free-sw.en.html#four-freedoms.


There’s the philosophical intent of the FSF and then there’s what individual project authors intended by adopting GPL.

I think OP correctly explains the perspective of some/many/or even most authors.

Arguably, the decline of GPL as the de facto license for OSS might indicate this viewpoint is at least an accurate perspective today, and may have been the underlying truth all along. Certainly, growing up, I think I cared more about the concept of contributing improvements back than about necessarily being able to have source to all derivative products, and I didn't trust corporations to give back (I still don't, but the situation doesn't seem quite so serious for most projects).


That's just one point; but note that the LGPL also ensures that users retain those rights.

Yet here is a difference between the GPL and LGPL.

A GPLed piece cannot be mixed together with pieces that have incompatible licenses.

This is true even if the pieces are mixed together in a dynamic way that respects their boundaries, so that users can rebuild the GPL-ed piece from sources, make modifications, and slide the modified piece back into the combination.

This is part of the point of the GPL, and specifically that point which is relaxed by the Lesser GPL, which is called "lesser" for some ideological reason.


This is a bit reductionist but a fun story for those that haven’t heard it. The GPL was created so rms could modify his printer driver: https://www.gnu.org/philosophy/rms-nyu-2001-transcript.txt .


Inspired by that situation, at least


Most "most large open source engaged companies " want to leach off open source and do not want to create end user products that are open source.

They focus on open source development tools, and libraries. Not end-user software products.

This is why they reject GPL because they want to be able to use the Tools and Libraries to package up a commercial software to sell to end users.

They do not actually support the principles of Free Software, it is simply away for them to lower development costs and off shift capital expenses while getting some good PR about being "open source engaged"


You can pay for a commercial license for x264 (or x265) if you don't want to touch the GPL


Oh wow I didn’t realize they did custom licensing (presumably for a good chunk of change). Good to know!


What’s also important to know is that the commercial license (IIRC) doesn’t include access to the patent pool. I presume that when going through the commercial license the x264/x265 people will walk you through or otherwise assist in the MPEG-LA stuff:

https://www.mpegla.com/

It trips people up: the commercial x264/x265 license provides alternate use terms on the implementation. The implementation, however, is still protected by various patents worldwide, and MPEG-LA is who you go through for access to the patent pool, which is supposed to include payment to all of the identified patent holders.


IANAL, but the patent licenses are required only if your device does not already have an H.264/H.265 license.

H.264 is fine as it is literally the MP3 of video. H.265 is a lot more complicated.


That is true but if you’re using libx264/libx265 you’re not using hardware encode/decode blocks that include the patent license so MPEG-LA is back in the picture.


Again, IANAL, but licenses are per device and not per IP block, so you don't pay twice for each and every component (hardware encode/decode in the CPU and GPU). And generally speaking, it is up to the OEMs and device manufacturers to have that done.


IANAL disclaimer as well but each component (hardware block, x264/x265 instance) is (IIRC) considered a separate "unit" [0].

[0] Slide 10, bullet one https://www.mpegla.com/wp-content/uploads/avcweb.pdf


How does one develop an understanding of all this? I would have completely missed this had I been the author of a similar project.


It is almost as if JavaScript/WebAssembly need a sandboxing feature that allows a library to be LGPL (or GPL?) compatible - legally callable from your own JavaScript source code.

I wonder if a legal sandboxing feature could be designed, perhaps even misusing technical words like header file, library, link, and object code to make the sandbox legally compatible?


What does “need” mean? I think it would be pretty messed up for browsers to ship a way to legally embed GPL code within closed-source binaries.

In a sense, though, that feature does exist — it’s called an HTTP request.


Or IPC? Does WebAssembly have something like Web Workers?


Not a lawyer but I’m pretty sure you can’t circumvent the GPL just by running the code in a different thread.


It would be a different process, not a different thread. For example executing GCC from a proprietary IDE doesn't seem to be a problem.


It's a problem if your IDE is designed to work only with gcc. It's not a problem, if it could work with any compiler and gcc is just one of the options.


My understanding is that it matters whether the GPL software is distributed with your program. If your IDE installed gcc, you would have to license it as GPL — even if it also installed other compilers. (It would be fine for your IDE to use a preexisting gcc installation obtained in some other fashion, though).


If I was a copyright lawyer, I would probably take a dim view of a defense that amounts to "it's not using your client's code, it's using code that directly uses your client's code."


With WebAssembly, it's running in a separate virtual machine. If that didn't insulate it, there would be no reason for the AGPL.


My understanding of the AGPL and its history is that it was meant to close a perceived loophole with SaaS: companies selling you access to services built on GPL'd code were not compelled to distribute their changes.

This is significantly different from a local virtual machine, of which the JVM is one: the code is still fundamentally being distributed to the client, which triggers the release clause in the original GPL. To the best of my knowledge, nobody has ever (successfully) claimed that executing JVM bytecode releases them from their obligations under the GPL.


What about GitHub Enterprise? Isn't that running it as a service, which happens to be located inside your machine?


That's a really good question!

Has anyone here used the self-hosted version of GitHub Enterprise? Did they really reimplement git?

Edit: Apparently GitHub uses https://libgit2.org/ which is "GPLv2 with a Linking Exception" (equivalent to LGPL)


My understanding of GHE is that it uses libgit2 internally, which in turn has a linking exception in its license (just like GCC and glibc). It’s unlikely that they’re violating GPL in that particular way.


GitHub Enterprise is proprietary software. To the best of my (external observer's) knowledge, it has no GPL code in it. If it does, then that's probably a licensing violation, but IANAL.


As I (not a lawyer) understand it, the spectrum here is tight/loose binding on a conceptual level, rather than the specific technology used to achieve it.

The static/dynamic linking concern seems like a red herring, as it reflects a specific technological instance of such a distinction, but in architectures that do linking differently than traditional Unix systems, it makes less and less sense.


Meh, you don't need any special features to use LGPL libs. As long as the user can recompile and swap out that library with their new version, you're done. This either means dynamic linking the library or making object files for your proprietary code available so that the user can link it.

Oh and this will never work for GPL.


I guess this could be used to remux the broken files that are spat out by the MediaRecorder API, which are missing the metadata needed for seeking. This has thus far been ignored / swept aside in Chrome [1], Firefox [2], and even the standard itself [3], whose design ignored the basic fact that encoding any file should include a "closing" stage (where metadata is written) before yielding it as a finished file. (A sketch of such a remux follows the links below.)

[1]: https://bugs.chromium.org/p/chromium/issues/detail?id=642012

[2]: https://bugzilla.mozilla.org/show_bug.cgi?id=1283464

[3]: https://github.com/w3c/mediacapture-record/issues/119
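
Assuming the ffmpeg.wasm 0.x API (createFFmpeg / fetchFile), a minimal sketch: stream-copying the recording forces a proper finalization pass, so the rewritten file gets the duration/cue metadata MediaRecorder omits.

    import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

    // Remux a MediaRecorder blob: no re-encode, just rewrite the container
    // so duration/cues are present and the file becomes seekable.
    async function fixSeeking(recordedBlob) {
      const ffmpeg = createFFmpeg({ log: true });
      await ffmpeg.load();
      ffmpeg.FS('writeFile', 'in.webm', await fetchFile(recordedBlob));
      await ffmpeg.run('-i', 'in.webm', '-c', 'copy', 'out.webm');
      return new Blob([ffmpeg.FS('readFile', 'out.webm')],
                      { type: 'video/webm' });
    }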


This is pretty neat, and possibly a really interesting way to show people what WebAssembly can do! I can see myself using this for a project I have been thinking of, and when I get around to giving it a go, it will be the first thing that comes to mind.


Big big question here: how can you pipe inputs to it, in the browser version? Say you want to transcode video right in the browser after capturing it with MediaRecorder, producing, say, HLS chunks. Getting output is easy, but how do you pipe input chunks in? All the examples for browser JS I could see only deal with ready files as input :/


There's an example of that here: https://blog.scottlogic.com/2020/11/23/ffmpeg-webassembly.ht...

<ctrl-f>Creating a Streaming Transcoder


It uses the emscripten File System API [0]. I think it is possible to pipe via stdin somehow, but I have no experience with any of this stuff, though I am interested in doing something similar myself; I don't want to transcode, but rather repackage input streams for browser consumption. (A rough sketch of the filesystem approach follows the links below.) I also found BrowserFS [1], which implements various filesystem backends in a compatible way AFAICT, although the XHR backend requires some index file [2] that I would like to avoid, and it's also not clear to me whether it will actually stream data and/or use range requests.

[0] https://emscripten.org/docs/api_reference/Filesystem-API.htm...

[1] https://github.com/jvilk/BrowserFS

[2] https://github.com/jvilk/BrowserFS/blob/a96aa2d417995dac7d37...
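
The sketch, again assuming the ffmpeg.wasm 0.x API (file names hypothetical): input "piping" is emulated by writing the captured bytes into the in-memory filesystem before the run, and the produced playlist/segments are read back out of MEMFS afterwards.

    import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

    // Segment a captured recording into HLS chunks entirely in the browser.
    async function toHls(recordedBlob) {
      const ffmpeg = createFFmpeg({ log: true });
      await ffmpeg.load();
      ffmpeg.FS('writeFile', 'in.webm', await fetchFile(recordedBlob));
      await ffmpeg.run('-i', 'in.webm',
                       '-c:v', 'libx264', '-c:a', 'aac',
                       '-hls_time', '4', '-f', 'hls', 'out.m3u8');
      // The playlist plus its out*.ts segments now live in MEMFS:
      return ffmpeg.FS('readFile', 'out.m3u8');
    }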


> Your browser doesn't support SharedArrayBuffer, thus ffmpeg.wasm cannot execute. Please use latest version of Chromium or any other browser supports SharedArrayBuffer

Sad face


MDN says SharedArrayBuffer is supported on Chrome, Edge and Firefox.

What browser are you using that doesn't support it? Safari?

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
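
Note that support is conditional: current Chrome and Firefox only expose SharedArrayBuffer on cross-origin isolated pages. A quick feature check before loading ffmpeg.wasm (a sketch):

    // SharedArrayBuffer is gated behind cross-origin isolation in current
    // Chrome/Firefox, and Safari doesn't expose it at all right now.
    if (typeof SharedArrayBuffer === 'undefined') {
      console.warn('No SharedArrayBuffer; fall back to server-side ffmpeg');
    }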


Safari disabled it due to speculative execution threats. Not sure why they haven't re-enabled it while Chrome has: https://bugs.webkit.org/show_bug.cgi?id=212069



I got that error on Safari desktop (14.1.2), yes.


I had a project I was working on where I wanted to generate downloadable videos from a canvas element. It all works in JavaScript; the problem is I can't mux the final mp4 with audio in the browser. I was hoping this could solve it, but it doesn't work on mobile, so I'll still have to use a server for that.
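
The mux step itself is short once ffmpeg.wasm does load; a sketch assuming the 0.x API, with hypothetical file names:

    import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

    // Mux a silent canvas-recorded video with a separate audio track,
    // stream-copying both so nothing gets re-encoded.
    async function addAudio(videoBlob, audioBlob) {
      const ffmpeg = createFFmpeg();
      await ffmpeg.load();
      ffmpeg.FS('writeFile', 'video.mp4', await fetchFile(videoBlob));
      ffmpeg.FS('writeFile', 'audio.aac', await fetchFile(audioBlob));
      await ffmpeg.run('-i', 'video.mp4', '-i', 'audio.aac',
                       '-c', 'copy', '-shortest', 'out.mp4');
      return new Blob([ffmpeg.FS('readFile', 'out.mp4')],
                      { type: 'video/mp4' });
    }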


I get the message with Safari on mobile.


Add this to module.exports.

    async headers() {
      return [
        {
          source: '/',
          headers: [
            { key: 'Cross-Origin-Embedder-Policy', value: 'require-corp' },
            { key: 'Cross-Origin-Opener-Policy', value: 'same-origin' },
          ],
        },
      ]
    },


What’s this meant to do?


There’s also an ffmpeg port to WebAssembly that can be used from Rust: https://github.com/jedisct1/rust-ffmpeg-wasi or linked to a Zig project targeting WebAssembly: https://github.com/jedisct1/tinyglitch


More of an introduction from when this was new in November 2020:

https://jeromewu.github.io/ffmpeg-wasm-a-pure-webassembly-ja...


I've tried using this for encoding short clips and it was 10-20x slower than native ffmpeg.


Do you think this is to do with wasm having no threading support at the moment?


ffmpeg has had a long history of optimizations, but no time has been spent targeting webgpu and what not.

It could be much faster, but it's not going to beat native without some sort of distributed behavior.


Interesting that wasm is opening the door for stuff like this. For compiled node modules with bindings, how does wasm compare to binaries? I.e. compiling to wasm instead of targeting a particular architecture. Seems like it could be a little neater.


I had a back and forth with some node developers on github a few years ago about this.

N-API (Node.js's native API) is a much richer API. It allows you to talk to V8 and interact with JavaScript objects directly. So, you can create native JavaScript objects from native code, attach custom (hidden) properties to existing JS objects, interact with the prototype chain, create promises, make native functions which call C directly, etc. All of this stuff is (currently) way more awkward from wasm. Even just moving data across the wasm-to-JavaScript divide is a hassle. Wasm compilers solve it for you, but usually by adding a big chunk of generated JS.

Between that and the VM slowdown, the code I've been working with lately ends up about 3x slower under wasm than when I just run it natively.

Wasm code also can't access the native system-level stuff: no filesystem access, no kernel APIs, etc. Though depending on who you ask, this might be a feature.

But wasm works everywhere, including in the browser, and from Python and Go. With N-API you need to compile your code separately for every operating system and CPU architecture pair you want to support. This is a huge pain when publishing an npm module. With wasm it's just: compile once, run anywhere. For that reason alone I wouldn't be surprised if wasm became the default recommended way to ship native modules with node soon, at least for modules which aren't super performance sensitive. And there's been talk of exposing the JS GC to wasm for a few years. Hopefully when that stuff lands, it'll get easier to marshal objects across the gap.
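
To illustrate the data-marshalling hassle, a sketch of the typical pattern in Node; the module name and its exports (alloc, sum) are made up for illustration:

    const fs = require('fs');

    async function main() {
      // Assume sum.wasm exports its linear memory plus alloc(len) and
      // sum(ptr, len); these names are hypothetical.
      const bytes = fs.readFileSync('sum.wasm');
      const { instance } = await WebAssembly.instantiate(bytes);
      const { memory, alloc, sum } = instance.exports;

      // No direct object passing: reserve space inside the wasm heap, copy
      // the bytes over manually, then hand over a raw pointer + length.
      const input = new Uint8Array([1, 2, 3, 4]);
      const ptr = alloc(input.length);
      new Uint8Array(memory.buffer, ptr, input.length).set(input);
      console.log(sum(ptr, input.length)); // plain numbers cross for free
    }

    main();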


> And there's been talk of exposing the JS GC to wasm for a few years. Hopefully when that stuff lands, it'll get easier to marshal objects across the gap.

You don't need a Wasm GC to do this. If you only need JS objects to pass through to, say, a host function, or to check whether one is null or not, then reference types, which are opaque external references, are enough: https://github.com/WebAssembly/reference-types/blob/master/p...

You can even do many more things if you export `Reflect` to WebAssembly: https://github.com/AssemblyScript/assemblyscript/blob/main/t...

Reference types are available almost everywhere already (in Safari they will be available after 15.0): https://webassembly.org/roadmap


It's fitting, how Fabrice Bellard's ideas and creations are coming together.


This is kind of cool. Probably a pretty reasonable thing to use if you need arbitrary codec decoding on a web page without having to worry about explicit browser support :)


This is a great tool. I wish there was something like this for GStreamer, as I use it more than ffmpeg.

FFmpeg and GStreamer, and the external encoder libraries, contain a lot of SIMD-optimized code.

Are there any benchmarks for how the ffmpeg wasm version compares to a native ffmpeg build in terms of decoding/encoding?

Also discussed here btw:

https://news.ycombinator.com/item?id=24987861


I am just afraid WebAssembly is another way for hackers to get in. Browser bugs are common and have even led to jailbreaks and other attacks in the past.

Probably a good idea to limit the type of code they can execute.


The speed on this is terrible, so seemingly the only valid use case is for sensitive files that need to be processed locally. Otherwise, it cannot be meant as an actual application: it is more complex than simply calling an installed executable, which requires nothing more than server support.

However, it does show that wasm is a capable compilation target, which is great!


I couldn't find any benchmarks on the site. How much slower is it?


Anywhere from 5-20x for me. I imagine that the speed is very sensitive to the individual clip.



Especially the war thing, uh-huh.


Now all we need is for someone to do a clean room implementation of the DRM spec.

Oh wait.


But why?


(2020)


ffmpeg to this day has and creates some of the scariest security problems you'll see. I love ffmpeg, but I fear this will all end in tears...


Could you point me to some examples or perhaps share some? I totally believe you, but we depend on ffmpeg (and we license the relevant codecs), and I wanted to understand what kind of problems we should be looking out for.


So this is a great step in the right direction since wasm is run in the JavaScript sandbox.


Is this still true?

I know it certainly was several years ago, and the problem was magnified by every video sharing site using it in their ingestion and transcoding pipelines. It's a single, incredibly powerful and versatile tool, to the point where it's often easier[ß] to fiddle around with command line parameters than to hit the underlying libraries directly.

And by design, in that kind of setup ffmpeg is processing vast quantities of untrusted inputs, coming in at all possible (and some impossible) combinations of video formats, container formats, invalid segment headers, bitstream corruptions and whatever else you can think of. If my memory serves me right, when M. Zalewski first joined Google, he worked on YouTube's video processing ... and to make their engine more robust, started to fuzz ffmpeg. I've heard people describe the result as like shaking a plum tree.

As a result of all these years of hardening, these days ffmpeg should be reasonably robust against malicious inputs. Now, if anyone is generating command lines for it and passing anything from user input to that - well, that's an open invitation to abuse.
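
From Node, for example, the usual mitigation is to pass an argv array so no shell ever parses the user's input; a sketch:

    const { execFile } = require('child_process');

    // execFile passes arguments as an array: no shell is involved, so a
    // filename like "x; rm -rf ~" stays a literal (weird) filename rather
    // than an injected command.
    function transcode(userSuppliedPath, done) {
      execFile('ffmpeg',
        ['-i', userSuppliedPath, '-c:v', 'libx264', 'out.mp4'],
        done);
    }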

ß: subjective, of course. But I've tried to look at using the libraries for something fairly straightforward and every single time it's been much more effective use of my time to just look up what the necessary ffmpeg CLI flags+arguments were.


You can use ß as a footnote marker? I've never seen that before.


I believe you can use any character your keyboard and/or chosen input method supports. I happen to like ß. From memory alone I've seen people use the +/- symbol, the upright cross, the double-cross and the cross product symbol, just on HN.


Do you have an example of a security problem that would still be an issue in the wasm and browser context?


Would wasm do anything for gaming? I don't expect Doom Eternal running at 165fps in my browser but what would realistically be possible in the coming few years with webgpu and all?


There's already 3D games that can run in browsers, if that's what you mean.

Unity and Unreal both have HTML5 as a first-class platform, I believe.

What webasm will do for games:

- Not much for graphics, those are limited by the graphics APIs

- Maybe allow number-crunching of 32-bit floats and integers to be faster because you don't have to fight JavaScript's native types being doubles

- Reduce overhead because the runtime doesn't have to keep track of its own GC, or things like falling back to a slow interpreted path (which takes some memory, even if the fallback never happens)

- Allow stuff like Unity's C# runtime to be more efficient, since it's just running the C# runtime directly in webasm instead of using something like asm.js where it has to sit atop the larger JS runtime and its GC

It's gonna be nice. There will be complaints, and backlash, but nobody could have prevented this, and it is a benefit to many use cases.


Epic deprecated support for exporting to HTML5 from the editor back in 4.23. They barely put any effort into reducing binary file sizes to make shipping a project on the web viable. Our startup is working on upgraded support for the engine, with plans to support WebGPU in the upcoming Unreal Engine 5.

We’re even working on WebXR support, which will play a key part in enabling the metaverse.


https://krunker.io is a pretty popular one made in Unity


AFAIK Krunker isn’t made in Unity. They’re even expanding into the web game engine space: https://krunker.io/helpdocs.html



