
Real world location virtually recreated to scale in minutes [video] - Kroeler
https://nwn.blogs.com/nwn/2019/08/volumetric-mirror-world-mapping-6d_ai.html
======
ggambetta
Related question that I've had for a while. I have 25 years' worth of family
pictures, many of them taken at my childhood home. The house is no longer in
the family, and I desperately want a photorealistic 3D model.

I don't have a walkthrough video, let alone multiple videos, and I don't
believe reconstruction techniques like these can work on such sparse inputs.

What's my best approach here? I have partial floorplans, and I made a
reasonable SketchUp model (I counted bricks in the pictures to get accurate
measurements), but I'm nowhere near my goal, which is a complete,
photorealistic 3D model I can load into Unreal or Unity to make a VR
walkthrough (I can only imagine how that would blow my mum's mind!).

I've thought of outsourcing this, e.g. finding a talented game environment
artist or architectural modeller and paying them to do it, but I fear the
result would be "off". I dream of the perfect algorithm that will do this.

Any ideas?

~~~
jobigoud
I don't think there's any algorithm that can do that, and there won't be one
for some time.

I spent several weeks last year recreating my backyard in VR, aiming for a 1:1
mapping, using photogrammetry for capture and a standalone mobile headset that
allows walking around untethered (Lenovo Mirage Solo). Even with
photogrammetry, it wasn't accurate enough to place objects at exactly the
correct positions and confidently walk around without fear of hitting a tree
that sits 20 cm further to the side in VR than in reality. It took many rounds
of back and forth to get the alignment right. By the time I was done, the
tomatoes had grown so much that the VR environment was already wrong.

Outsourcing the reconstruction of lost/old places is a great idea. I found
that even just stereo panoramas of old apartments seen in VR (so without
positional tracking) trigger memories in a way that simple photos can't.

~~~
mLuby
It's definitely possible, if the photos are suitable. There was a burst of
interest over a decade ago specifically around reconstructing a 3D model of
Notre Dame from public tourist photos, so that might be something to look
into. Here's an example:

[https://www.youtube.com/watch?v=8rG0t1hS1ms](https://www.youtube.com/watch?v=8rG0t1hS1ms)
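The core technique behind those tourist-photo reconstructions is structure-from-motion: match features across photos, estimate the camera poses, then triangulate each matched pixel pair into a 3D point. As a rough sketch of just the triangulation step (toy cameras and points invented for illustration, assuming numpy is available):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: solve A X = 0 for the homogeneous
    3D point seen at pixel x1 by camera P1 and pixel x2 by camera P2."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # null vector of A
    return X[:3] / X[3]        # dehomogenise

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two toy cameras: one at the origin, one translated 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# Project a known 3D point into both views, then recover it.
X_true = np.array([0.5, 0.2, 4.0])
X_rec = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.allclose(X_rec, X_true))  # True in this noise-free toy case
```

Real pipelines (Meshroom, COLMAP, etc.) do this for millions of matches and add bundle adjustment to refine the camera poses, but the geometry is the same.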

~~~
tralarpa
I am wondering what has happened (or is currently happening) to this project:
[https://www.youtube.com/watch?v=Ur1Z72_LyTM](https://www.youtube.com/watch?v=Ur1Z72_LyTM)

~~~
ggambetta
OMG, that's incredible. I wanted to do something similar for my project:
whenever you stepped into a location where a picture was taken, the picture
would fade in, superimposed on the VR scene.

------
tmilard
I remember a Y Combinator-backed startup doing about the same thing.
[https://sendreality.com/](https://sendreality.com/)

The time to scan the place was a bit longer, but the quality looked better.
Let's not be fooled by speed: without a good 3D copy of the place you are
screwed, because our eyes can't tolerate gross mistakes.

~~~
Fission
That's us. [https://sendreality.com/318-main-st-a3d/](https://sendreality.com/318-main-st-a3d/)

The biggest challenge with 3D mapping like this isn't the tech itself, but
actually making it useful for end users. These demos are cool, but the
important thing is to build out the part of the software that makes it
useful/valuable for consumer/business applications. Otherwise, you should be
working on the tech in a research lab rather than at a venture-funded
business.

This is a trap that even we've admittedly fallen into at times.

~~~
trevyn
That's a great point, and an interesting example link for two reasons:

1) Real estate seems like a large market for this technology.

2) The specific space in the link looks like it would photograph well, but the
3D version reminds me that it has low ceilings and a lot of areas with limited
natural lighting. In other words, it does a worse job than photographs for the
purpose (selling the unit) because you can find the unflattering perspectives.

(Side note: the red dot is very confusing with respect to browser pointer
lock, and the whole thing seemed very awkward to navigate with a trackpad
until I discovered that WASD worked.)

~~~
Fission
The incentives of realtors are not aligned with those of the buyer, which
manifests in the cognitive dissonance you've just experienced.

On a different note: if possible, I'd love to chat about your experience with
the UI. Do you have an email I can reach out to?

------
Animats
There have been lots of programs to do that. This one isn't that good. Here's
an open source one.[1] And another one.[2]

Doing a room is the same photogrammetry problem, but inside-out. Autodesk had
that 10 years ago. The current product for that is ReCap.[3] You can work from
drone imagery if necessary.

[1]
[https://www.youtube.com/watch?v=R0PDCp0QF1o](https://www.youtube.com/watch?v=R0PDCp0QF1o)

[2] [https://www.youtube.com/watch?v=1D0EhSi-vvc](https://www.youtube.com/watch?v=1D0EhSi-vvc)

[3]
[https://www.autodesk.com/products/recap/overview](https://www.autodesk.com/products/recap/overview)

~~~
numlock86
> Here's an open source one.[1] And another one.[2]

I think both of these links refer to Meshroom/AliceVision. Very nice input,
though.

------
iamleppert
It makes for a cool demo, but all the applications that could benefit from
this need much higher-quality textures and better dimensional accuracy. There
are also certain cases where this fails: reflective or partially transparent
surfaces, lack of texture, etc.

~~~
dvasdekis
Object detection can 'normalise' the shape of your couch from a noisy input,
and the upscaling algorithms we've seen recently can clean up the textures.
It's only an aggressive short-term R&D effort away from being terrifying.

~~~
heavenlyblue
I can already see deep learning applied to the normal maps of the various
(coloured) surfaces at home.

You could then deep-filter the inputs and get a higher-resolution version of
the same.

------
scanny
This would be pretty handy for indoor mapping. Imagine a crowd-sourced 3D OSM
for malls, public buildings, office plans, etc.

That would be massively useful.

~~~
bigiain
Serious question. What are some examples you're thinking about of "massively
useful" applications of this?

I think it's a really cute tech demo, but I'm struggling to find obvious uses
that'd make the world a better place or generate huge piles of cash.

~~~
scanny
Fair point; for me, 'massively' would mean a significant amount of time and
confusion saved.

Whenever I want to catch a bus at an overseas metro station, find a shop in an
unfamiliar mall, or even find where a client's desk is in an unfamiliar
building, that would be fantastic.

The potential to generate a snapshot of an indoor space would be super handy.
Maybe even for places without an address, you could look at a timestamped
point cloud like that and see which building to go to.

Democratizing the capture of 3D space in a simple, user-friendly, and cheap
way would open up really interesting doors.

~~~
bigiain
Street View-type imagery/navigation solves that problem though, right?
There's no need for the depth map / 3D reconstruction going on here; just a
"walk-through" with a 360-degree camera (or multiple overlapping cameras)
capturing 2D lets you find the shop or the desk.

~~~
scanny
Potentially, but I'm sure there's value in zooming around in 3D before going
into street view for a closer look, as being able to look on from above is a
useful perspective. At least I can find street view disorienting at times.

Maybe it just comes down to personal preference. But having a finer scale of
3D, another level in Google Maps for example, seems like a logical next step.

------
jmpman
I expect Apple is working on this type of technology for their next iPhone.
What better use case for a rear-facing depth camera? (Yes, I understand this
demo was performed with a standard camera, but it would likely be better with
a depth camera.) The implications for their maps, home furnishing, AR, and
differentiation from Android are all very interesting, and Apple will likely
do it right. Want a high-resolution capture of a vase in your living room?
Just walk closer. Claim a space and mark it private? Sure. An option to turn
an elaborate house into basic ceilings/walls/floors for others, maybe.

------
awinter-py
the baseline photogrammetry software is already 'mostly good' if you're
scanning something featureful & matte like a wall of graffiti

where it chokes is specular reflections, curves, 'wobbly bits' (leaves on
trees), lots of duplicate features (leaves on trees), and things in parallel
planes (leaves on trees in front of a brick wall)

these are all solvable with better scene models, object models and feature
stats
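One concrete example of "feature stats" helping with duplicate features: Lowe's ratio test drops any match whose best and second-best descriptor distances are too close, which is exactly what happens with repeated leaves. A toy sketch with invented 3-dimensional descriptors (assuming numpy; real descriptors are 128-D SIFT vectors or similar):

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """For each descriptor in desc_a, find its two nearest neighbours in
    desc_b and keep the match only if the best is clearly better than the
    runner-up (Lowe's ratio test). Returns (index_a, index_b) pairs."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches

# A distinctive feature (like a patch of graffiti) matches unambiguously...
distinctive_a = np.array([[1.0, 0.0, 0.0]])
# ...but near-duplicate features (like leaves) are ambiguous.
leaf_a = np.array([[0.0, 1.0, 0.0]])
desc_b = np.array([
    [0.95, 0.05, 0.0],   # close to the distinctive feature
    [0.0, 0.98, 0.05],   # leaf candidate 1
    [0.0, 0.98, 0.05],   # leaf candidate 2, identical texture
    [0.3, 0.3, 0.9],     # unrelated
])

print(ratio_test_matches(distinctive_a, desc_b))  # keeps the match: [(0, 0)]
print(ratio_test_matches(leaf_a, desc_b))         # [] -- ambiguous, dropped
```

Dropping the ambiguous matches keeps the pose estimation sane, but it also means the leaves themselves contribute no 3D points, which is one reason trees come out as mush.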

looking forward to the next gen of structure from motion tech

------
stelonix
This has been possible for a good while; the most common algorithm is Parallel
Tracking and Mapping (PTAM). No stereo cameras are needed, but there are also
implementations that make use of depth sensors.

What interests me is for an open implementation of Street View: drones plus
crowdsourcing and we'd be free of requiring Google services for street
visualization. We already have open street maps, we could have open street
view too.

~~~
gauku
For a course project, I implemented DTAM, which does dense mapping and
tracking in real time (on a mid-range GPU) without needing any stereo vision.
It's really amazing.

~~~
stelonix
Hadn't heard of DTAM yet, but found a video and am very impressed! What GPU
did you use? Do you think it'd be feasible on mobile GPUs?

~~~
saganus
That sounds interesting.

Any links to further explain what DTAM is/how it works?

~~~
severine
DTAM: Dense tracking and mapping in real-time

> _DTAM is a system for real-time camera tracking and reconstruction which
> relies not on feature extraction but dense, every pixel methods. As a single
> hand-held RGB camera flies over a static scene, we estimate detailed
> textured depth maps at selected keyframes to produce a surface patchwork
> with millions of vertices. We use the hundreds of images available in a
> video stream to improve the quality of a simple photometric data term, and
> minimise a global spatially regularised energy functional in a novel non-
> convex optimisation framework. Interleaved, we track the camera's 6DOF
> motion precisely at frame-rate by whole image alignment against the entire
> dense model. Our algorithms are highly parallelisable throughout and DTAM
> achieves real-time performance using current commodity GPU hardware. We
> demonstrate that a dense model permits superior tracking performance under
> rapid motion compared to a state of the art method using features; and also
> show the additional usefulness of the dense model for real-time scene
> interaction in a physics-enhanced augmented reality application._

[https://ieeexplore.ieee.org/document/6126513](https://ieeexplore.ieee.org/document/6126513)
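In miniature, the photometric data term the abstract describes amounts to a per-pixel depth sweep: warp the pixel into a second view under each candidate depth and keep the hypothesis with the smallest intensity difference. Everything below (cameras, texture, numbers) is invented for illustration; the real system aggregates hundreds of frames, adds the spatial regulariser, and runs the optimisation on the GPU.

```python
import numpy as np

# Toy 1-D sketch of a per-pixel photometric data term: a fronto-parallel
# scene at depth Z_true is seen by two cameras separated by a baseline b,
# so view 2 is view 1 shifted by the disparity f * b / Z_true.
f, b = 50.0, 0.1          # focal length (pixels) and camera baseline
Z_true = 2.0              # actual scene depth

def texture(u):
    # some dense, non-repeating texture
    return np.sin(0.7 * u) + 0.3 * np.cos(2.3 * u)

def img1(u):
    return texture(u)

def img2(u):
    return texture(u + f * b / Z_true)  # the same scene, shifted

def best_depth(u, candidates):
    # Photometric cost of each depth hypothesis at pixel u: warp u into
    # view 2 assuming depth Z, then compare intensities.
    costs = [abs(img1(u) - img2(u - f * b / Z)) for Z in candidates]
    return candidates[int(np.argmin(costs))]

candidates = np.linspace(0.5, 5.0, 451)   # depth sweep in 1 cm steps
print(best_depth(10.0, candidates))       # recovers roughly Z_true = 2.0
```

With only two views and one pixel this is just brute-force stereo; DTAM's contribution is doing it densely for every pixel, over many frames, with regularisation to handle the textureless regions where the cost curve above would be flat.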

~~~
saganus
Awesome, thanks!

------
lemmox
I remember seeing some of this type of stuff with the Xbox Kinect. It's cool
that the form factor has moved to mobile (slowly swinging the Kinect around
the room was not ideal). I'm not following this space but from the video it
still looks like capturing good 3D models is a long way away. Being in a VR
world that looks like that would be terrifying.

~~~
tmilard
Yes I do agree.

------
rasz
Here is Jack Black demoing similar NSA technology in 1998
[https://www.youtube.com/watch?v=3EwZQddc3kY](https://www.youtube.com/watch?v=3EwZQddc3kY)

------
jvalencia
The possibilities for those with disabilities are remarkable. You wouldn't
have to map everything for it to be useful, just common routes.

------
throwaway13000
This is cool. I have a related question. Can I make 3D models of my 5-year-old
kid and 3D-print robots that walk/talk like them? Does current-day technology
even suffice for this? Or do I need to create 3D models now and then wait a
few years for technology to catch up?

~~~
insulanus
You could do it, but it would take forever. Here are some starting points for
the different technologies involved:

Here's how to make the walking robot:
[https://asimo.honda.com/](https://asimo.honda.com/)
[http://users.umiacs.umd.edu/~fer/cmsc828/classes/cse390-05-0...](http://users.umiacs.umd.edu/~fer/cmsc828/classes/cse390-05-02.pdf)

Here's how to make it look like them:
[https://www.creativebloq.com/features/deepfake-examples](https://www.creativebloq.com/features/deepfake-examples)
[https://www.youtube.com/watch?v=_9qs6JudXJg](https://www.youtube.com/watch?v=_9qs6JudXJg)

Here's how to make it talk like them: [https://github.com/CorentinJ/Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning)

~~~
throwaway13000
Thanks. Yes, based on the list you posted, it will take forever. Guess I have
to wait for someone to put a package together.

The deepfakes can only do a 2D model, not a 3D model yet.

Btw, why are people downvoting this?

