
Dynamic 3D Avatar Creation from Hand-Held Video - mmastrac
http://prostheticknowledge.tumblr.com/post/121197583491/dynamic-3d-avatar-creation-from-hand-held-video
======
markild
That was _seriously_ cool!

I tried browsing through the paper, but I'm still not entirely sure how real-
time this is. Are they using only a smartphone as both the capture and
rendering hardware?

~~~
pgeorgi
According to the video there's an offline training step, but after that it
seems that a (somewhat modern) smartphone should be good enough for rendering.

------
aw3c2
direct link to source:
[http://lgg.epfl.ch/publications/2015/AvatarsSG/index.php](http://lgg.epfl.ch/publications/2015/AvatarsSG/index.php)

submitted url adds nothing.

~~~
TeMPOraL
It adds cool gifs, which, frankly, seem to be an awesome form of TL;DR.

------
bsenftner
We've had a similar capability to this, minus the normal-map generation, for
nearly 2 years at the 3D Avatar Store. (www.3D-Avatar-Store.com) However,
we've been unable to raise the financing to build a commercial, scaled
infrastructure for video as input, so we built our current infrastructure for
single- and multiple-photo input. It's available now, as a web app and web API.

Our solution differs from this effort in that we use neural nets for multiple
stages of the pipeline: a face finder, a facial-feature finder, a 3D
reconstruction 'net, and then conventional C/C++ utilities to generate
rigged, ready-for-immediate-use geometry and texture. A nice aspect of
using neural nets is the ability to retrain our output 'net to generate any
geometry one wants. We currently generate 5 different types of avatars, with
our most popular being a Fuse 1.3 compatible character that plugs right into
the Mixamo auto-rigging pipeline.
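
The staged pipeline described above could be sketched roughly as follows. This
is a purely illustrative outline with hypothetical names and stub data; the
actual 3D Avatar Store implementation is proprietary and not shown in the post:

```python
# Illustrative sketch of a multi-stage avatar pipeline. Stubs stand in for
# the neural nets and conventional utilities described above; all function
# names and return values are hypothetical.

def find_face(image):
    # Stage 1: face-finder net -> bounding box of the face in the image.
    return {"bbox": (40, 40, 200, 200)}

def find_features(image, face):
    # Stage 2: facial-feature net -> 2D landmark coordinates inside the box.
    return {"landmarks": [(60, 80), (180, 80), (120, 160)]}

def reconstruct_3d(image, features):
    # Stage 3: 3D-reconstruction net -> mesh vertices plus a texture map.
    return {"vertices": [...], "texture": ...}

def rig_avatar(mesh, target="fuse_1_3"):
    # Stage 4: conventional utilities emit rigged geometry in the requested
    # output format; retraining the output net could swap the target geometry.
    return {"format": target, "mesh": mesh}

def build_avatar(image):
    # Chain the stages: each one consumes the previous stage's output.
    face = find_face(image)
    features = find_features(image, face)
    mesh = reconstruct_3d(image, features)
    return rig_avatar(mesh)
```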

Our system is trained on a data set of 300,000 laser scans of real people,
plus dozens of photos taken at the time of scanning with each photo capturing
a different angle, lighting condition, and type of camera lens.

A single photo does a remarkably good job: it takes only 0.5 seconds to
reconstruct the initial geometry. For an online user, the full process takes
3 to 8 minutes, depending on prior experience, to create a fairly realistic
full-body Fuse 1.3 compatible character. We provide multiple editing
interfaces, as users tend to provide a "selfie" (with bad perspective
distortion) or a poorly lit mobile phone / web cam photo (with splotchy
pixels). These interfaces provide means of correcting the facial-feature
locations, plus 3D deformations to correct for perspective distortion and for
features that are difficult to recover from a passport-style image, such as
the curve of one's profile.

We'd love to find a financial partner interested in developing our video
input capabilities. The video-as-input 'net has higher CPU requirements than
the photo-as-input 'net. Its results are remarkable: it tracks 36 faces
simultaneously per video feed, and can accept up to 4 HD video feeds. Each
tracked face is reconstructed in real time, 25 ms per face, holding whatever
facial expression the subject had at that moment.
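
Taking those figures at face value, a quick back-of-the-envelope check (my
arithmetic, not from the post) shows why the video 'net needs substantially
more compute than the photo 'net, and why the per-face reconstructions must
run in parallel to keep up:

```python
# Back-of-the-envelope throughput check using the figures stated above.
ms_per_face = 25       # stated reconstruction time per tracked face
faces_per_feed = 36    # stated faces tracked per video feed
feeds = 4              # stated maximum number of HD feeds

faces_total = faces_per_feed * feeds            # 144 faces across all feeds
serial_ms_per_pass = faces_total * ms_per_face  # 3600 ms if done serially

# A fully serial pass over every face would take 3.6 s -- far slower than
# any real-time video rate, so the faces must be reconstructed concurrently.
print(serial_ms_per_pass)  # 3600
```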

No doubt, this paper is a great step forward. 3D Avatars are also a very tough
market to crack: people submit remarkably poor-quality images, and you are
messing with people's visual identity. Even with actual photos, people will say
"that does not look like me", and that self-identification issue is compounded
with 3D Avatars. Plus, people want them for free, and they are one of the most
expensive digital objects to create.

~~~
lcswi
> 300,000 laser scans of real people

Blimey! What is the source of that dataset? What are its demographics?

~~~
bsenftner
Collected over a few years; the scans cover pretty much every age, gender, and
ethnic background.

------
geon
The necks bothered me. They seem to just rotate freely together with the head,
when they should be attached to the shoulders. It kind of ruins the illusion.

------
iamcreasy
Looks a lot better than L.A. Noire

------
freekh
Wow! This is incredible!

