
ML turns video of a 360° turn into 3D model of a person - mikeyanderson
http://www.sciencemag.org/news/2018/04/watch-artificial-intelligence-create-3d-model-person-just-few-seconds-video
======
symisc_devel
Link to the paper:
[https://arxiv.org/abs/1803.04758](https://arxiv.org/abs/1803.04758)

~~~
neonate
And to the video:
[https://www.youtube.com/watch?v=nPOawky2eNk](https://www.youtube.com/watch?v=nPOawky2eNk)

------
llao
Oh how I hate marketing speech.

First of all, the title should include "video of a predefined 360° turn".

And then they say something along the lines of "average accuracy of about 5mm"
for joining the constructed modeled joints to their model, while you see the
body wobbling around happily.

This is an impressive demo, but gah!

~~~
dang
Ok, we'll give it a 360° turn above.

------
nitrogen
Structure from motion is an existing technique. What is the contribution of ML
in this case (it seems like joint positioning maybe?)?

[https://en.m.wikipedia.org/wiki/Structure_from_motion](https://en.m.wikipedia.org/wiki/Structure_from_motion)

~~~
ansgri
99%¹ of computer vision problems are 80% solved. The problem is, you need a
95+% solution to be practically useful.

Binocular stereo vision has just approached general applicability, and SfM is
mostly used in very constrained environments (traffic analysis) or with large
computational resources with manual correction (offline 3D mapping from aerial
data).

¹ Numbers are metaphoric only, based on experience in scientific and
industrial CV.

------
raghavkhanna
How is this ML? They use a CNN for foreground segmentation, a minor step in
their pipeline. But the major contribution seems to be putting the silhouettes
in a common reference frame. I sincerely hope sciencemag isn’t putting ML in
the title purely to jump on the bandwagon.

~~~
utkarshsinha
It's someone standing in front of a green screen. You don't need ML to find a
person's silhouette.
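
For a clean green-screen shot, a silhouette really can be pulled with plain
thresholding. A minimal sketch, assuming RGB frames as NumPy arrays; the
function name `green_screen_silhouette` and the `dominance` threshold are made
up for illustration and would need tuning on real footage:

```python
import numpy as np

def green_screen_silhouette(frame, dominance=40):
    """Return a boolean mask that is True where a pixel is NOT green screen.

    frame: H x W x 3 uint8 array in RGB order.
    dominance: hypothetical threshold; how far green must exceed both red
        and blue for a pixel to count as background.
    """
    rgb = frame.astype(np.int16)  # avoid uint8 wrap-around when subtracting
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    background = (g - np.maximum(r, b)) > dominance
    return ~background

# Tiny synthetic frame: green backdrop with a 2x2 grey "person" in the middle.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[...] = (0, 200, 0)
frame[1:3, 1:3] = (120, 120, 120)
mask = green_screen_silhouette(frame)
```

A fixed threshold like this tends to struggle with green spill, shadows, or
footage shot without a backdrop at all.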

~~~
seandougall
To be fair, they do have examples that aren’t chroma keyed; they just lead
with one that is.

Which is not to say that ML is necessary for this sort of computer vision
task, but I wonder if it yields better or sharper results than other
techniques?

~~~
extralego
Same. As someone who has spent an embarrassing amount of time keying and
tracking video footage over the years, I’m surprised ML isn’t being used for
this more often in studios by now.

------
egypturnash
As an artist, my first thought is _I wonder what happens if you try giving
this a series of drawings_.

~~~
make3
You'd probably need a lot of drawings. I wonder what sampling rate the thing
uses.

it's a cool idea though :)

~~~
seandougall
They say “standard” video is the source, so it would likely be on the order of
30 or 60 fps. Seems to be around a couple hundred frames, give or take, though
I suspect it could get _something_ out of fewer frames, and more would just
incrementally improve the model.

I would expect minor textural differences in a hand-drawn or painted source
would make it a lot harder to correlate points between frames, but it’s an
interesting idea to think about!

------
mtgx
This is what should give you pause before using face authentication technology
for anything.

~~~
haZard_OS
Can you elaborate?

~~~
toomuchtodo
Makes forging facial biometrics easier.

~~~
seandougall
In the case of Face ID, at least, you’d still have to transfer the
measurements into the physical world, in a way that fools a system that has
ostensibly been designed not to be fooled by masks.

~~~
toomuchtodo
Like a 3D printed model?

~~~
URSpider94
Doesn’t work for high quality face reco systems like iPhone X. You’d also need
to get the IR reflectance, as well as a sign of life from the eyes.

------
make3
I wonder if we will soon see a future where a director can fully edit the
positions and physical actions of the actors in post-production.

Basically, whole scenes will be transferred to believable 3D models
seamlessly, and you can reanimate parts of everything. I feel like that's
going to happen for sure, for big Hollywood productions at least (like the
Marvel stuff).

~~~
leohutson
This already happens a lot, most VFX heavy productions will have digital
doubles of the main cast, and they can be used for as simple a reason as
reframing a shot.

~~~
extralego
Your comment could give the impression this is drastically simpler to do
than it is in reality. This is considered something like the last frontier
of VFX, and there still remains a lot of work to be done.

While you’re essentially correct, it is currently an overwhelmingly manual
process. The amount of work and time necessary is substantial (some would say
outrageous), and exponentially higher for certain types of shots. Many shots
remain impossible or cost-defeating.

------
interfixus
It seems determined to put visible toes on everybody, no matter that they're
wearing socks.

Is this a bug or a feature?

~~~
RodgerTheGreat
I'm going to guess they start with a generic human model that includes all
limbs and extremities and then the "machine learning" process attempts to fit
that model to the silhouettes extracted from the video.
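
A toy sketch of that guess: fit a parametric template to a silhouette by
maximising overlap. Everything here is an assumption for illustration (an
ellipse stands in for the body model, grid search for the optimiser, and the
names `render_template` and `fit_template` are hypothetical); it is not the
paper's method.

```python
from itertools import product

import numpy as np

def render_template(shape, half_w, half_h):
    """Rasterise a centred ellipse, standing in for a projected body template."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    return ((xs - cx) / half_w) ** 2 + ((ys - cy) / half_h) ** 2 <= 1.0

def iou(a, b):
    """Intersection over union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def fit_template(silhouette, widths, heights):
    """Grid-search the template parameters that best overlap the silhouette."""
    return max(
        product(widths, heights),
        key=lambda p: iou(render_template(silhouette.shape, *p), silhouette),
    )

# Synthetic "observed" silhouette generated from known parameters.
observed = render_template((64, 64), 10, 25)
best = fit_template(observed, range(5, 16), range(20, 31))
```

Because the template always carries its full shape, anything the silhouette
doesn't contradict stays in the result, which would be consistent with toes
showing up regardless of socks.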

~~~
stochastic_monk
Which implies that the technique uses domain knowledge of people to make
assumptions about their morphology.

------
codetrotter
This is awesome. I hope someone will implement this as a piece of open source
software. Imagine the potential!

~~~
raghavkhanna
Source code seems to be available :)

[https://graphics.tu-bs.de/people-snapshot](https://graphics.tu-bs.de/people-snapshot)

~~~
bahmboo
From site: "We will provide access to the code and dataset soon."

------
meric
Could be used for VR phone calls between long distance couples.

