
AutoFlip: an open-source framework for intelligent video reframing - jonbaer
https://ai.googleblog.com/2020/02/autoflip-open-source-framework-for.html
======
krisoft
"By detecting the subjects of interest, AutoFlip is able to avoid cropping off
important visual content."

And they manage to demo it with a scene where a lady talks to a guy in a
checkered shirt in front of a yellow background. The algorithm cuts off the
lady completely and only keeps the guy. I guess she mustn't have been
important. Next time she should try to look more like "interesting, salient
content".

~~~
rcthompson
Yeah, I noticed that. While it's possible this is the result of the machine
learning algorithm implicitly picking up a male bias in the data set that
causes it to rank men more highly in terms of importance, I don't know how
likely that is in practice. Maybe the explanation is as simple as the man
taking up more pixels, or a bias toward lighter colors (like the man's white &
blue shirt) while black colors (like the woman's clothing) are more likely to
be considered unimportant background. Maybe the algorithm considered both
approximately equally important and the result was essentially a coin flip.

Regardless, I agree that it's not a great look when the caption on that
example talks about how AutoFlip "detects the subjects of interest" and
"avoids cropping off important visual content". It's definitely not sending
the intended message.

~~~
drusepth
I interpreted it as just picking one visual entity to focus on for that
cropped clip for design reasons, since the letterboxes necessary to get both
entities in that portrait orientation would be huge and visually unappealing
[1].

[1] [https://i.imgur.com/yxxEnGe.png](https://i.imgur.com/yxxEnGe.png)
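
As a rough sanity check on how big those letterboxes would get, here is a minimal sketch. It assumes a 1920x1080 landscape source fitted whole into a 1080x1920 portrait frame, i.e. the two subjects span close to the full width. These numbers and the `letterbox_fraction` helper are illustrative assumptions, not anything taken from AutoFlip itself.

```python
def letterbox_fraction(src_w, src_h, dst_w, dst_h):
    """Fraction of the destination frame left as bars when the whole
    source frame is scaled to fit inside it without cropping."""
    # Uniform scale that makes the source fit in both dimensions.
    scale = min(dst_w / src_w, dst_h / src_h)
    used_area = (src_w * scale) * (src_h * scale)
    return 1.0 - used_area / (dst_w * dst_h)


# Assumed example: 16:9 landscape source into a 9:16 portrait frame.
frac = letterbox_fraction(1920, 1080, 1080, 1920)
print(f"{frac:.1%} of the portrait frame would be bars")  # prints 68.4%
```

Under those assumptions roughly two thirds of the screen ends up as bars, which is why picking a single subject may be the more watchable choice.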

------
ArtWomb
Results actually look really good. Having to worry about filming in only one
format, such as widescreen 4K, means you just have to worry about getting the
shot. I think it's good enough to build an emotion predictor vector.

------
dehrmann
Aside, but I've always wondered how pan and scan was done in practice for
converting film to VHS. It seems either tedious or sloppy.

~~~
btown
Pan and scan was historically done manually. Turner Classic Movies made a
great 5-minute documentary about the practice and how much of the filmmakers'
intention it loses. Definitely worth a watch:
[https://www.youtube.com/watch?v=5m1-pP1-5K8](https://www.youtube.com/watch?v=5m1-pP1-5K8)

------
sharpercoder
This is a more scalable solution than the Spectacle solution (record both
horizontal and vertical). I wonder if filmmakers will adapt their filming to
cater to this algo, i.e. leaving a bigger margin at the edges so that
AutoFlip gives even better results.

------
32gbsd
Anytime I see anything with the phrase "open source" + "google" I immediately
think programmer API trap.

