Ffcms – FFmpeg's -filter_complex made simple (stryku.pl)
93 points by stryku2393 11 months ago | 26 comments



This is really awesome -- FFmpeg's filter_complex is powerful and appropriately named "complex" lol.

I built and sold a company based largely on FFmpeg, and this type of tool would've been really useful for understanding how FFmpeg's filter chain works. If you're interested in some non-trivial OSS examples, check out https://github.com/transitive-bullshit/ffmpeg-gl-transition and https://github.com/transitive-bullshit/ffmpeg-concat.

One non-obvious piece of advice I have for developers delving into ffmpeg's filter_complex is to try to keep the filter graphs as short and simple as possible. Creating complex filter graphs programmatically (as ffcms does in this case) is super powerful, but I've also found in practice that larger filter graphs often lead to random, impossible-to-debug errors. This could stem from anything: some combination of VM resource constraints, internal ffmpeg leaks that compound at scale, and so on.

My advice is to think carefully about what you want to use ffmpeg for. It's excellent at transcoding and handling minor filter graphs, but trying to get too crazy with complex filter graphs will lead you down a dark path that imho ffmpeg wasn't really meant for. If you can break up your complex filter effects into smaller, isolated ffmpeg commands, it may be slightly slower, but your rendering pipeline will be significantly more robust.
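
As a rough sketch of that trade-off (the filenames and filters here are just placeholders, not anything from ffcms):

    # everything in one filter graph -- fast, but harder to debug when something goes wrong
    ffmpeg -i in.mp4 \
      -filter_complex "[0:v]scale=1280:720,crop=1280:640:0:40,fade=t=in:d=1[v]" \
      -map "[v]" out.mp4

    # the same work as isolated steps -- an extra encode, but each step can be verified on its own
    ffmpeg -i in.mp4 -vf "scale=1280:720,crop=1280:640:0:40" intermediate.mp4
    ffmpeg -i intermediate.mp4 -vf "fade=t=in:d=1" out.mp4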


Glad to hear that you like it (:

Thank you for the links. I'll check them out.

Aaand of course, thanks for the advice. Personally, I've never been in a situation where `filter_complex` caused problems. But I've never done any weird filtering with it either, so you may be right.

I'm not saying that now, with ffcms, you should write JSONs that produce an A4-page-long ffmpeg command. If you're doing that, something is not right.

For now, it's just a tool that helps me with timelapse creation for my fiancée. Let's see how it evolves.


That's an awesome use case. I hope your fiancée appreciates all your hard work, haha.


"-f lavfi -i testsrc - declaration of an input. The input format is lavfi, the input file is testsrc. This is duplicated four times to imitate four different inputs."

You can feed a single input multiple times in a filtergraph, so you can make do with only one. In this case, it hardly matters, but the input packets will then be demuxed and decoded only once, unlike in the examples.
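
For instance, one way to reuse a single testsrc input is the split filter; a rough sketch (not necessarily the exact graph from the article):

    ffmpeg -f lavfi -i testsrc \
      -filter_complex "[0:v]split=4[a][b][c][d];[a][b][c][d]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[out]" \
      -map "[out]" -t 5 grid.mp4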


True, but I wanted to stick to the official example.


I spent quite a while trying to replace the syntax of FFmpeg's filter graphs with a tree structure of nodes. Then, about a week in, I remembered why you should never build a DSL over a DSL.

I personally find that the FFmpeg syntax doesn't need further abstraction. There are more than a few projects that try to replace the graph syntax, and they'll be stuck trying to keep pace with version creep.


> Then about a week in I remembered why you should never build a DSL over a DSL

Yes.. I was very excited to do the same, even automate it in a way.

It's a tool that is (IMO) impossible to automate. Making it easier is project-dependent, but it usually won't translate for someone else.


FFmpeg's built-in filtering has gotten a lot more advanced since I last looked. Is it possible to use this as a full/near-full AviSynth replacement?

Is it possible to compile AviSynth code down to a filter_complex statement?

Is anyone using VapourSynth?


Thank you. Very welcome addition. You could of course use a script for that, but that requires more effort and is more difficult for a lot of people to understand.

I hope there will be a similar structure (or DSL) for ImageMagick, as it has the same difficulties.


I can say from my own experience and others' that people usually keep these close to their chest; I consider mine a bible. In fairness, it's a tool, so what you do with it is up to you. It's really hard to organise a standard library even for video work, as each source is always a bit different. Given the amazing community around questions and answers, I suspect nobody wants to have a standard the way ImageMagick does. It really is a different beast.

Edit: With certain frames, certain 'standard' settings would not produce the same result, so it's really impossible to have one.

You might have one clip which requires two entirely different settings.


These FFmpeg Python bindings [0] provide a wrapper for filter_complex, which simplifies its use a lot!

[0]: https://github.com/kkroening/ffmpeg-python


That's actually awesome. Didn't know about this library, but I'll take a look for sure.


Love it. FFmpeg is so powerful, but so opaque. I think there could be a whole vertical of startups dedicated to tooling built to simplify using it. Looking forward to seeing this develop.


There definitely are. There are video transcode SaaS platforms which wrap something (maybe FFmpeg) in a nice REST API.


Too bad it's using JSON. There are lots of other, cleaner formats like YAML, RJSON, etc.


I feel like JSON is a fair choice these days. More people would dislike YAML, IMHO.


I started building a tool that simplifies video scripting [1], and the first version worked from JSON. Although this was aimed at a technical audience, people really struggled with it. It's difficult to add comments, and it ended up being very error-prone. My second choice was YAML, but this ended up even more error-prone due to whitespace issues.

My lesson is that JSON and YAML are great for machine consumption when people just need to do something once and leave it (e.g. config files), but far from ideal for stuff people need to edit over and over (such as script files).

I converted the input to Markdown (with some small extensions) and it made a huge difference. It's less fiddly, much easier for humans to edit, and the parser can point out errors much better. For some nice examples, check out https://github.com/videopuppet/examples/

In conclusion, YAML and JSON are great for machine-to-machine communication, or for human-to-machine use when things do not change often and aren't complex. For human-to-machine input that is more frequent, we should be kinder to our users.

[1] - https://www.videopuppet.com


"Consume YAML, emit JSON". I've been using this pattern for the last several years on most projects, ever since someone else on the internet recommended it. Since YAML is a superset of JSON, people can input either JSON or YAML (or mostly just JSON with JS style comments, like I do most of the time) and have simple JSON output.


JSON is really unfriendly for human editing. Notably, the lack of comments, and trailing commas being strictly forbidden.

JSON5 seems like a decent alternative to either.


I have a soft spot for https://jsonnet.org/ myself.

https://github.com/mbrt/gmailctl uses it for example.


Anything that's going to be handwritten shouldn't be using JSON; it's unnecessarily verbose to read and produce.

I personally prefer TOML.


I thought about TOML for a while. Went with JSON because I just wanted to start working.

Like I said in one of the comments, the final decision has not been made yet. JSON is not written in stone; plus, ffcms could support more than one language, e.g. JSON and TOML.


That looks very cool indeed


For representing a graph of processing nodes, none of these are great. Maybe something like the DTS format used by the Linux kernel would work, where the language parser itself would be able to check connections between processing nodes.


It took some looking but I think DTS=Device Tree Spec. https://elinux.org/Device_Tree_Usage


ffcms is not a finished tool. JSON is not written in stone. In fact, ffcms could support more than one language, e.g. JSON and TOML.



