Show HN: Video Transcoder on the browser using WASM (modfy.video)
129 points by cryogenicplanet 3 months ago | 41 comments



Hello HN

Modfy is a purely browser-based, privacy-first video tool that can convert, compress, and otherwise process video without uploading your files.

I thought it would be cool to process video in the browser using WebAssembly and FFmpeg.

The project is also open source and based on https://github.com/ffmpegwasm/ffmpeg.wasm

Let me know what you think!


Thanks for building, cryogenicplanet!

Just wanted to add that I really like the "no-upload" programming paradigm. For example, I use browser-native WebGL rendering internally to automate generation of static digital art assets. All samples and finished images are transferred via HTTP. And it's easy to run on any device: just fire up Chrome and enter the URL.

I'd like to do the same for my video asset generation pipeline, where I currently do most of the ffmpeg processing on the host. A WASM build allows browser-based video generation on the fly ;)


Yeah, this is definitely an attempt to push that dream of not having to upload files or download software, and doing most if not all things in the browser itself.


How does the performance compare to running ffmpeg as usual? I’m guessing that WASM means you don’t have access to things like hardware acceleration?


I tested with a 64MB, 124s long, 1152x720 .MOV video converted to .MP4 and here's what I saw:

  modfy.video   - 258 seconds
  native ffmpeg - 23 seconds
This was using a 2020 MBP with a 2.3 GHz Quad-Core Intel Core i7 processor & 32GB of RAM. My local ffmpeg version was v4.3.1.
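
To put rough numbers on that gap, here is a minimal sketch that only reuses the two timings and the 124 s clip length quoted above. The function and variable names are illustrative, not anything from modfy.video or ffmpeg.

```typescript
// Timings quoted above, in seconds.
const clipSeconds = 124;
const wasmSeconds = 258;
const nativeSeconds = 23;

// How many times longer the slower run took than the faster one.
function slowdown(slow: number, fast: number): number {
  return slow / fast;
}

// Encoding speed relative to realtime (1.0 = one second of video per second).
function realtimeFactor(clip: number, elapsed: number): number {
  return clip / elapsed;
}

console.log(slowdown(wasmSeconds, nativeSeconds).toFixed(1));       // "11.2"
console.log(realtimeFactor(clipSeconds, wasmSeconds).toFixed(2));   // "0.48"
console.log(realtimeFactor(clipSeconds, nativeSeconds).toFixed(2)); // "5.39"
```

So on this one sample the browser run was roughly 11x slower and ran at about half of realtime, while native ran at a bit over 5x realtime. A single file on a single machine is not a benchmark, of course.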

So it looks like performance definitely takes a hit due to WASM, but I'm sure performance will improve, and this is technically a very impressive achievement. The UI is really nice, and the privacy benefits of remaining client side, whilst delivered through the Web are great (i.e. nothing to install locally).

I'm one of the co-founders @ Zamzar so really interested in this space. Congrats to @cryogenicplanet on this project - you've done a great job with it!


Can you compare them when running native single-threaded? WASM threading support isn't standard yet.


What video codec were you converting to?


H.264 in each case.


Isn't .MOV already .MP4?


No. MP4 is based on MOV. MP4 is typically going to have h.264 video with AAC audio. MOV is not limited in video codec, nor audio codec. Even though MediaInfo reports MOV as MPEG-4, they are not the same.


MP4 isn't limited in video codec or audio codec either, and it's quite common to have non-H.264 or non-AAC codecs in them. The only differences between MOV and MP4 are a few later-created extensions, and the difference in everyday use is negligible.


MP4 is an ISO-standardized version of where MOV was at the time it was made a standard. MOV is an Apple product that they are free to update as they see fit. Apple has since modified the QT container regarding things like the moov atoms for aspect ratio, color systems (709 vs 601 etc), and all of the other similar fiddly bits. These are the major differences between the two to me.


So .MOV is basically already .MP4.


In the sense MariaDB is MySQL


Both mov and mp4 are "containers", which can contain different kinds of codec. Both are usually h264 - mp4 especially - but they don't have to be. mp4 files can contain many different variants of mpeg-4, including hevc. mov files can contain virtually anything.
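
If you want to see which flavor of container a file claims to be, one rough approach is to read the major brand from the leading `ftyp` box: 4 bytes of box size, the 4 characters `ftyp`, then a 4-character brand. QuickTime files that carry this box use the brand `qt  ` (with two trailing spaces), while MP4 files use brands such as `isom` or `mp42`. The sketch below assumes the box is the first thing in the file, which is common but not guaranteed; older MOV files may have no `ftyp` box at all. The function names are made up for illustration.

```typescript
// Decode a byte range as ASCII text.
function ascii(bytes: Uint8Array, start: number, end: number): string {
  let out = "";
  for (let i = start; i < end; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// Return the major brand if the file starts with an `ftyp` box, else null.
function majorBrand(bytes: Uint8Array): string | null {
  if (bytes.length < 12) return null;
  if (ascii(bytes, 4, 8) !== "ftyp") return null;
  return ascii(bytes, 8, 12);
}

// Human-readable guess at the container family. It says nothing about codecs.
function describeContainer(bytes: Uint8Array): string {
  const brand = majorBrand(bytes);
  if (brand === null) return "unknown (no leading ftyp box)";
  if (brand === "qt  ") return "QuickTime (MOV)";
  return `ISO base media file, major brand "${brand}"`;
}
```

In a browser you could feed it the first 12 bytes of a picked file, e.g. `new Uint8Array(await file.slice(0, 12).arrayBuffer())`. Note this only identifies the container; the codecs inside are described elsewhere in the file.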


It is definitely slower than using ffmpeg normally, but the idea is that for less technical users it adds convenience, since they don't have to use a command line.

Yeah this version doesn't have hardware acceleration, but if I am not wrong WASM has hardware acceleration in development or coming soon.


Awesome work! The ease of use and privacy capabilities of this totally outweigh the somewhat slower performance. The safety one has with being able to do this in the browser, rather than having to download a tool that is not "sandboxed" and is capable of potentially running anything on your system, is amazing.

Regarding the slowness, can you comment on why you think WebAssembly is slower than native? I'm sure some remarks would be very helpful to the Wasm team. Thanks!


I am not knowledgeable about WASM. Can it enable threading in FFmpeg? That’s where FFmpeg shines...


Yes, SIMD as well, although they're not available on all browsers/devices.
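
As a rough illustration of the "not available everywhere" part: WASM threads are built on `SharedArrayBuffer`, which browsers generally only expose on cross-origin isolated pages (the `crossOriginIsolated` global). The sketch below checks just those two preconditions. It is a necessary check, not a complete one, it does not probe SIMD at all, and the `EnvLike` type and function name are invented for the example.

```typescript
// The subset of global-scope properties this check looks at.
interface EnvLike {
  SharedArrayBuffer?: unknown;
  crossOriginIsolated?: boolean;
}

// True only if the preconditions for shared WASM memory appear to be met.
function canUseWasmThreads(env: EnvLike): boolean {
  const hasSharedMemory = typeof env.SharedArrayBuffer === "function";
  const isolated = env.crossOriginIsolated === true;
  return hasSharedMemory && isolated;
}

// In a page you would pass the global object and fall back to a
// single-threaded build when the check fails, for example:
//   const threaded = canUseWasmThreads(globalThis);
```

Taking the environment as a parameter keeps the check easy to test outside a browser.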


I'm curious about using this to transcode youtube videos in the browser. Wondering if CORS is going to be an issue. Looks like I found another rabbit hole to keep me occupied for a while.


This is a real cool tool and I would love to see more simple tools client-side only. However I have serious concerns about the performance. When developing a solver (https://ricochetrobots.kevincox.ca/) I found that compared to the native performance running it in WASM was ~2x slower. (Running it on a phone compared to a laptop was another ~2x). However I doubt the native compiler was using large amounts of SIMD. IIUC video encoding often will use a large amount of SIMD instructions so I would expect the slowdown to be worse.

On the other hand SIMD support in WASM is in progress: https://www.chromestatus.com/feature/6533147810332672 (preview available on chromium)


Yeah I do think there is a performance bottleneck here with WASM being quite a bit slower than native performance. But I think the idea is that WASM performance is only going to get better as the technology keeps evolving, and that gap will be bridged.

I'd say the overall performance of the tool itself is a bit mixed. There are some tasks where it performs really well (trimming, converting photos to a montage) but some where it is horrible (transcoding to VP9).


I agree. It will keep getting better and I would much rather have these things available than not. It would be cool if the site provided the raw ffmpeg command line so that you could use it as a command builder and run it locally for large or batch jobs. Then you can choose between security and convenience or max performance.

The notes you mentioned sound like fundamental ffmpeg issues rather than problems with the implementation of this tool. It can usually trim without transcoding (much of) the file and encoding VP9 is notoriously slow.
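
A command builder along those lines could be quite small. This is only a sketch of the idea, not how modfy.video works: the `Job` shape and function names are hypothetical, though the flags themselves (`-ss`, `-i`, `-t`, `-c:v`, `-c:a`, and the `copy` codec) are standard ffmpeg options. Leaving both codecs unset produces a stream copy, which is the fast trim case; cuts then snap to keyframes rather than exact timestamps.

```typescript
// Hypothetical description of a job as the UI might collect it.
interface Job {
  input: string;
  output: string;
  videoCodec?: string;      // e.g. "libx264"; omitted means stream copy
  audioCodec?: string;      // e.g. "aac"; omitted means stream copy
  startSeconds?: number;    // trim start
  durationSeconds?: number; // trim length
}

// Build the argument list for the ffmpeg CLI.
function buildFfmpegArgs(job: Job): string[] {
  const args: string[] = [];
  // -ss before -i seeks in the input, which is fast.
  if (job.startSeconds !== undefined) args.push("-ss", String(job.startSeconds));
  args.push("-i", job.input);
  if (job.durationSeconds !== undefined) args.push("-t", String(job.durationSeconds));
  args.push("-c:v", job.videoCodec ?? "copy");
  args.push("-c:a", job.audioCodec ?? "copy");
  args.push(job.output);
  return args;
}

// Render a copy-pasteable command line (no shell quoting is attempted here).
function toCommandLine(job: Job): string {
  return ["ffmpeg", ...buildFfmpegArgs(job)].join(" ");
}

console.log(toCommandLine({ input: "in.mov", output: "out.mp4", startSeconds: 10, durationSeconds: 30 }));
// ffmpeg -ss 10 -i in.mov -t 30 -c:v copy -c:a copy out.mp4
```

File names with spaces or shell metacharacters would need quoting before the string is safe to paste into a terminal.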


>Then you can choose between security and convenience or max performance.

How does running a transcode job in a browser provide "security and convenience" vs a native app?


It does for non technical users or someone who wants to do a one off transcode job. Most people wouldn't be comfortable using FFmpeg in the command line.

And for the people who are comfortable with that, this provides some convenience in not having to look up commands (something I always do), but it's never meant to be a replacement for hardcore FFmpeg users.


No need to manually install anything, plus sandboxing. Honestly if I had a suspicious video file I might prefer the browser option so that an ffmpeg exploit would not have access to my machine.


I too would like to know if the performance shortcomings of WebAssembly are inherent to its design (i.e. the sandboxing mechanism), or are just due to a poor implementation.

Basic things like not being able to get stuff from CPU to screen without multiple memcpy's are all problematic too...


Not a compiler expert, but looking at the bytecode, it looked fairly limited, with a minimal number of opcodes.

Given it has to be universal, I imagine that where an LLVM bitcode op might translate to 1, 2 or 3 native opcodes, with WebAssembly you might end up with many more opcodes in the native output.

I mean, of course they will be able to make it better with time and effort, but I don't think we can expect a much bigger speedup relative to the native equivalent.

Note: this is all very speculative, but from looking at it, it looked great as an accelerator, especially for math stuff. I imagine it will be hard even to beat JavaScript code for generic stuff, given how much JavaScript has already been optimized for that.

If any compiler expert here who is familiar with it wants to correct my assumptions, please feel free to do so.

(Not even touching on issues like the ones you mention, such as memory copies, where the compiler cannot translate things in a more optimized manner, given it cannot speculate about the environment and optimize for it.)


I had a similar experience comparing DSP algorithms; native was faster across the board. It’s hard to get excited about WebASM if it’s slower.


This is very interesting! Would definitely be useful for non-technical users. I have a few UX concerns, though.

- By default, it's set to All Features instead of Basic Features

- The app repeatedly says it doesn't upload your files, yet there's a big button in the app saying Upload File. Perhaps "Your files are never uploaded" should be reworded to something like "Your files never leave your device"?

- It also was not obvious how to get started with the app. Perhaps some text could be added to the play button?


Or change the text on the button. 'Add Files', 'Import Files', etc would be equally useful while not confusing.


Thanks for the feedback, will definitely take this into account and iterate on it.

Someone opened an issue related to this too, so will update it on that issue https://github.com/Etwas-Builders/modfy.video/issues/88


This is pretty cool. I've also been playing around in this area lately and noticed there's still a drastic performance hit. But seems to work great for smaller files.

Question: Are you compiling FFmpeg to WASM, or FFmpeg's libav and interfacing with that directly?

Also for anyone curious, the author of ffmpeg.wasm[1] has an excellent guide on compiling FFmpeg to WASM:

https://itnext.io/build-ffmpeg-webassembly-version-ffmpeg-js...

[1] https://github.com/ffmpegwasm/ffmpeg.wasm


On the left side, the label says "Click or Drag to upload", and at the bottom there is a label saying "Your files are not uploaded anywhere".

If you have control over the left file picker, it might make sense to change the label to "Click or Drag to pick a video" or similar.

Really great to see this -- I once worked on a cloud-based video editor and the cost of running ffmpeg in the cloud was surprisingly prohibitive.


Thanks!

Already tracking the label issue here: https://github.com/Etwas-Builders/modfy.video/issues/88. Will resolve that tonight.

Another great thing about this product for me is that it costs almost nothing to run (thanks, Netlify).


Wow! I wish I saw this a couple of months ago when I set up server-side transcoding for a client ;_;


I tried running something like this on my own, but I got OOM errors pretty quickly. 4 gigs isn't a lot when it comes to video processing. Hopefully wasm64 and WebGPU will make it a lot better.
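
For context on the 4 gig figure: wasm32 uses 32-bit addresses, so linear memory tops out at 65,536 pages of 64 KiB, i.e. 4 GiB, and browsers may allow less. A back-of-the-envelope check could look like the sketch below. The assumption that input and output both sit in memory at once, and the 256 MiB working-set default, are guesses for illustration rather than measurements of any ffmpeg build.

```typescript
const WASM_PAGE_BYTES = 64 * 1024; // fixed WebAssembly page size
const WASM32_MAX_PAGES = 65536;    // 2^16 pages addressable with 32-bit pointers
const WASM32_MAX_BYTES = WASM_PAGE_BYTES * WASM32_MAX_PAGES; // 4 GiB

const MIB = 1024 * 1024;

// Rough estimate of whether a job could fit in wasm32 linear memory.
// ASSUMPTION: input, output, and a working set are all resident at once.
function likelyFitsInWasm32(
  inputBytes: number,
  estimatedOutputBytes: number,
  workingSetBytes: number = 256 * MIB, // guessed default, not measured
): boolean {
  const needed = inputBytes + estimatedOutputBytes + workingSetBytes;
  return needed <= WASM32_MAX_BYTES;
}

console.log(WASM32_MAX_BYTES);                                      // 4294967296
console.log(likelyFitsInWasm32(64 * MIB, 64 * MIB));                // true
console.log(likelyFitsInWasm32(3 * 1024 * MIB, 2 * 1024 * MIB));    // false
```

In practice the limit is often hit well before 4 GiB, so treat a `true` here as "not obviously impossible" rather than a guarantee.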


I think this is a result of the video having to be loaded into RAM to be processed. This is definitely something I want to try to fix by accessing the videos on the native file system, which would allow much larger files.


Is there a restriction on the size of the file or can it save the contents on the fly to disk like streamsaver?


Currently there is a restriction on file size, because the video has to be processed in memory and content isn't saved to disk on the fly, but that is definitely something I can try to add moving forward.


Good job on your release. Curious: how is your version different from the kagami or videoconverter versions? Those did file conversions and did not have streaming options, even though some patches showed how to do it. Does yours support streaming options, so that the streams can be saved to a file serially using streamsaver.js?



