
Real time image animation in opencv using first order model - abhas9
https://github.com/anandpawara/Real_Time_Image_Animation
======
qchris
I'm a huge fan of this kind of practice, where the code for a paper is all
located in a single public repository with build instructions, along with
directions for how to cite it. Obviously, it's a little tough to do with some
more data-intensive sources (besides GH hosting limits, no one really wants to
download 100G of data if they're just trying to clone a repository), but this
kind of thing sets a high standard for reproducibility of published results.

~~~
dllthomas
> but this kind of thing sets a high standard for reproducibility of published
> results.

I think making the code available is good, but I think we should be careful
how we use the term "reproducibility". Pulling your repo and running it had
better give the same results, but it's not the same sort of thing as building
my own experimental setup according to a paper's specification. The latter
gives more room for variability such that successful replication speaks more
strongly to the robustness of the result, and also puts human brain power next
to each step of the process in a way where weirdness might be noticed.

Replication should probably involve reimplementation, if it's to carry its
traditional weight. In the event that we fail to replicate, though, having the
source code for both versions is likely to be hugely informative.

~~~
dekhn
I think this is a fair point, but in my experience, having a concrete replica
that people can start from (and compare to the paper) can make a year's
difference in speeding up progress.

Many times, I've read a paper, thought something was great, and then
implemented the paper and failed to reproduce the author's results. In the
cases where I've been able to compare my implementation to a reference on
github, I often find the paper doesn't match the code, or a subtle data
processing step was left out. Having a replica (a commit hash and a pointer to
versioned input data) can often make a huge difference in time.
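
To make the "commit hash plus versioned input data" idea concrete, here is a minimal sketch of a manifest that pins a run to a commit and to SHA-256 digests of its inputs. The names `make_manifest` and `verify_inputs` and the manifest layout are assumptions made up for illustration, not something taken from the linked repository.

```python
import hashlib
from typing import Dict, List


def sha256_of_bytes(data: bytes) -> str:
    """Hex SHA-256 digest of a blob of input data."""
    return hashlib.sha256(data).hexdigest()


def make_manifest(commit: str, inputs: Dict[str, bytes]) -> dict:
    """Record the code version and a digest for every input blob.

    `commit` is whatever identifier the repository uses (hypothetical here).
    """
    return {
        "commit": commit,
        "inputs": {name: sha256_of_bytes(blob) for name, blob in sorted(inputs.items())},
    }


def verify_inputs(manifest: dict, inputs: Dict[str, bytes]) -> List[str]:
    """Return the names of inputs that are missing or whose digest differs."""
    mismatched = []
    for name, expected in manifest["inputs"].items():
        blob = inputs.get(name)
        if blob is None or sha256_of_bytes(blob) != expected:
            mismatched.append(name)
    return mismatched
```

If `verify_inputs` returns an empty list, the data matches what the original run recorded, so any remaining difference in results is more likely to come from the code or the environment.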

~~~
dllthomas
Yeah, I'm certainly not saying it isn't advisable or even important. I'm just
saying it's not the same thing as replication.

------
rozgo
I'm working with the same model, but in a real-time pipeline developed with
GStreamer, Rust and PyTorch:

[https://twitter.com/rozgo/status/1255961525187235842](https://twitter.com/rozgo/status/1255961525187235842)

Live motion transfer test with crappy webcam:

[https://youtu.be/QVRpstP5Qws](https://youtu.be/QVRpstP5Qws)

~~~
mv4
Nice. I want to try something like this.

------
forgingahead
Very cool, reminds me of Avatarify, which is also based upon the First Order
Model work:

[https://github.com/alievk/avatarify](https://github.com/alievk/avatarify)

~~~
roomey
It looks the same, even the same images. I can only get 3 fps from Avatarify,
and that's with CUDA. Is this one faster?

------
egfx
Pretty cool. Reminds me of
[https://github.com/yemount/pose-animator](https://github.com/yemount/pose-animator)

I would use it if there was a JavaScript port.

------
bsaul
How can it generate teeth that look like they fit the picture ???

~~~
rozgo
This model is trained with short clips of human speech. There is enough
statistical information to "guess" how to fill the gap created by opened lips.
I'm still amazed how it conserves temporal coherence (what it looks like from
frame to frame).

------
sriram_malhar
Is no one else deeply afraid of this future?

~~~
api
Provenance and chain of custody is everything. It's always been important, but
now it's critical. Any audio or video without a solid chain of custody is now
suspect. Anonymous leaks are worthless as anything can be faked by almost
anyone with a PC.

Old and busted: "pic or it didn't happen."

New hotness: "in person witness or it didn't happen."

~~~
enchiridion
Do I smell a blockchain application?

~~~
api
Cue 10 ICOs for AuthenticityCoin type things, most of which just exit scam and
the rest of which don't actually work.

The real security hole for forgery is at the point of injection. Tracking a
forgery on a blockchain doesn't prove it's not a forgery.

One thought is a camera sensor that cryptographically signs (watermarks)
photos or video frames _on the sensor_ before they are touched by anything
else. It's not perfect since a highly sophisticated adversary could get the
secret key out of the chip, but it could definitely make it quite a bit harder
to fake photos. Nothing is ever perfectly secure. All security amounts to
increasing the work function for violating a control to some decent margin
above the payoff you get from breaking the control.

I could see certified watermarking camera sensors being used by journalists,
politicians, governments, police, etc.
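
A rough sketch of the per-frame signing idea, under loud assumptions: this uses an HMAC from the Python standard library as a stand-in for whatever a real sensor would do. A real design would more likely use an asymmetric signature, so that verifiers never hold the secret key. The function names and the choice to bind the frame index into the tag are illustrative only.

```python
import hashlib
import hmac


def sign_frame(secret_key: bytes, frame_index: int, pixels: bytes) -> bytes:
    """Compute an authentication tag over the frame index and raw pixel bytes.

    Including the index means a frame cannot be silently reordered or reused.
    HMAC-SHA256 is a symmetric stand-in for a sensor-side signature.
    """
    message = frame_index.to_bytes(8, "big") + pixels
    return hmac.new(secret_key, message, hashlib.sha256).digest()


def verify_frame(secret_key: bytes, frame_index: int, pixels: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = sign_frame(secret_key, frame_index, pixels)
    return hmac.compare_digest(expected, tag)
```

Any change to the pixels or to the frame's position makes verification fail, which is the property described above. As noted, it only raises the cost of forgery: whoever extracts the key can sign anything.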

~~~
kortex
This is a start. It can even be done steganographically, embedded in the
picture in a non-visual way, which is robust against compression and "social
media laundering" (term of art for uploading then downloading from social
media).

The problem is people just don't care. See "cheap fakes" like slowing down a
video of Pelosi and claiming she's drunk. People actually believe that
garbage. No amount of fancy math can fix that.

------
imron
Looks like the file mentioned in this step

> gdown --id 1wCzJP1XJNB04vEORZvPjNz6drkXm5AUK

Is no longer accessible (too many downloads in too short a time)

Edit: For anyone else with the same problem, the file in question is
"vox-cpk.pth.tar", which can be found in various places on the internet.

------
seesawtron
The Google Colab version is not really real-time, is that correct? It loads
pre-recorded video. I guess that is because it is not easy to add a real-time
feed from a camera into a browser notebook. Or what are the limitations there?

------
villgax
The paper & final models don't do justice to detailed outputs though, but
this is still a great model for datasets with no annotations per se.

------
karakanb
does anyone know if using this tool to generate a music video of famous
pictures singing a song would violate any copyrights? it seems like a fun
exercise.

------
sgroppino
very neat! You can crop and convert to mp4 using ffmpeg: ffmpeg -i test.avi
-filter:v "crop=250:250:260:0" out.mp4
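
For anyone scripting this, a small sketch that builds the same command as an argument list. ffmpeg's crop filter takes `width:height:x:y`, so the values above mean a 250x250 region starting 260 pixels from the left edge. The helper name `build_crop_command` is made up for illustration.

```python
from typing import List


def build_crop_command(src: str, dst: str, width: int, height: int, x: int, y: int) -> List[str]:
    """Build an ffmpeg argument list that crops `src` and writes `dst`.

    The crop filter's positional order is width:height:x:y.
    """
    if width <= 0 or height <= 0 or x < 0 or y < 0:
        raise ValueError("width/height must be positive and x/y non-negative")
    crop = f"crop={width}:{height}:{x}:{y}"
    return ["ffmpeg", "-i", src, "-filter:v", crop, dst]
```

Passing the resulting list to `subprocess.run(cmd, check=True)` avoids shell quoting problems around the filter string, assuming ffmpeg is installed and on the PATH.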

------
throwlaplace
one of the authors is at snap. inquiring minds want to know: will this soon be
available in snap camera?

------
mister_hn
Really cool, but I hoped to see C++ code for OpenCV, not python

