Does anyone else look forward to a game that lets you transform your house or neighborhood into a playable level with destructible objects? How far are we from recognizing the “car” and making it drivable, or the “tree” and making it choppable?
I work in the rendering and gaming industry and also run a 3D scanning company. I have similarly wished for this capability, especially the destructibility part. What you describe is still pretty far off for several reasons:
-No collision/poor collision on NeRFs and Gaussian splats: to have a properly interactive world, you usually need accurate character collision so that your character or vehicle can move along the floor/ground (as opposed to falling through it), run into walls, go through door frames, etc. NeRFs suffer from the same issues as photogrammetry in that they need structure from motion (COLMAP or similar) to give them a mesh or 3D output that can be meshed for collision to register against. The mesh from reality capture is noisy and is not simple geometry: think millions of triangles from a laser scanner or camera for “flat” ground where a video game would use 100 triangles. (There is a rough decimation sketch at the end of this list.)
-Scanning: there’s no scanner available that provides both good 3D information and good photorealistic textures at a price people will want to pay. Scanning every square inch of playable space in even a modest-sized house is a pain, and people will look behind the television, underneath the furniture, and everywhere else that most of these scanning videos and demos never go. There are a lot of ugly angles that these videos omit but that a player would see.
-Post processing: if you scan your house or any other real space, you will have poor lighting unless you took the time to do your own custom lighting and color setup. That will all need to be corrected in post-processing so that you can dynamically light your environment. Lighting is one of the things people most associate with next-generation games, and you will be fighting prebaked shadows throughout the entire house or area you have scanned. You don’t get away from this with NeRFs or Gaussian splats, because those scenes also have static, prebaked lighting in them.
-Object destruction and physics: I love the game Teardown, and if you want to see what it’s like to actually bust up and destroy structures that have been physically scanned, there is a plug-in to import reality capture models directly into the game with a little bit of modding. That said, Teardown is voxel-based and is one of the most advanced engines that has been built to do such a thing. I have seen nothing else capable of doing cool-looking destruction of any object, scanned or 3D modeled, without a large studio effort and a ton of optimization.
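To make the collision point above concrete, here is a rough sketch of the kind of decimation pass you end up needing before a physics engine can use a scan. The library choice (Open3D), the file names, and the target triangle count are illustrative assumptions only; even this just thins the triangle soup, it doesn’t give you the clean hand-authored geometry a game would normally ship.

    # Rough sketch: crush a noisy reality-capture mesh into a collision proxy.
    # Library (Open3D), file names, and triangle counts are illustrative only.
    import open3d as o3d

    mesh = o3d.io.read_triangle_mesh("scanned_room.ply")   # hypothetical scan
    mesh.remove_duplicated_vertices()
    mesh.remove_degenerate_triangles()

    # Millions of scan triangles -> a few thousand for collision queries.
    collision = mesh.simplify_quadric_decimation(target_number_of_triangles=5000)
    collision.remove_degenerate_triangles()
    o3d.io.write_triangle_mesh("collision_proxy.obj", collision)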
I think collision detection is solvable. And the scanning process should be no harder than 3D modeling to the same quality level; probably much easier, honestly. Modeling is labor-intensive. I'm not sure why you say "there’s no scanner available that provides both good 3D information and good photorealistic textures", because these new techniques don't use "scanners": all you need is regular cameras. The 3D information is inferred.
Lighting is the big issue, IMO. As soon as you want any kind of interactivity besides moving the camera, you need dynamic lighting. The problem is you're going to have to mix the captured, absolutely perfect real-world lighting with extremely approximate real-time computed lighting (which will be much worse than offline-rendered path tracing, which itself still wouldn't match real-world quality). It's going to look awful. At least until someone figures out a revolutionary neural relighting system, and we are pretty far from that today.
Scale is another issue. Two issues, really: rendering and storage. There's already a lot of research into scaling rendering up to large, detailed scenes, but I wouldn't say it's solved yet. And once you have rendering, storage will be the next issue. These scans will be massive, and we'll need very effective compression to be able to distribute large scenes to users.
You are correct; most of these new techniques are using a camera. In my line of work I consider a camera sensor a scanner of sorts, as we do a lot of photogrammetry and “scan” with a 45MP full-frame camera. The 3D inferred from cameras is pretty bad when it comes to accuracy, especially in dimly lit areas or where you dip into a closet or other enclosed space that doesn’t have a good structural tie back to the main space you are trying to recreate in 3D. Laser scanners are far preferable for tying your photo pose estimation to, and most serious reality capture for video games is done with both a camera and a $40,000+ LiDAR scanner. Have you ever tried to scan every corner of a house with only a traditional DSLR or point-and-shoot camera? I have, and the results are pretty bad from a 3D standpoint without a ton of post-processing.
The collision detection problem is heavily related to having clean 3D, as mentioned above. My company is doing development on computing collision on reality capture in a clean way right now, and I would be interested in any thoughts you have. We are chunking collision on the dataset at a fixed distance from the player character (you can’t go too fast in a vehicle or it will outpace the collision and fall through the floor) and have a tunable LOD that influences collision resolution, roughly along the lines of the sketch below.
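A stripped-down sketch of the idea (the grid size, radius, LOD rule, and the build_collision_chunk stand-in here are simplified placeholders, not our actual implementation):

    # Toy sketch of distance-chunked collision with distance-based LOD.
    # build_collision_chunk() is a hypothetical stand-in for whatever
    # meshes/decimates one chunk of the reality-capture data.
    CHUNK_SIZE = 4.0      # chunk edge length in meters (made-up)
    RADIUS = 2            # keep chunks within 2 cells of the player

    loaded = {}           # (ix, iz) -> (lod, collision mesh handle)

    def lod_for(dist_cells):
        # nearer chunks get finer collision, farther chunks get coarser
        return dist_cells

    def update_collision(player_x, player_z, build_collision_chunk):
        px, pz = int(player_x // CHUNK_SIZE), int(player_z // CHUNK_SIZE)
        wanted = {}
        for ix in range(px - RADIUS, px + RADIUS + 1):
            for iz in range(pz - RADIUS, pz + RADIUS + 1):
                wanted[(ix, iz)] = lod_for(max(abs(ix - px), abs(iz - pz)))
        # drop chunks that fell out of range
        for key in list(loaded):
            if key not in wanted:
                del loaded[key]
        # (re)build chunks that are missing or at the wrong LOD
        for key, lod in wanted.items():
            if key not in loaded or loaded[key][0] != lod:
                loaded[key] = (lod, build_collision_chunk(key, lod))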
My iPhone and my Apple Vision Pro both have LiDAR scanners, fwiw.
Frankly I’m surprised that I can’t easily make crude 3D models of spaces with a simple app presently. It seems well within the capabilities of the hardware and software.
Those LiDAR sensors on phones and VR headsets are low-resolution and mainly used to improve the photos and depth information from the camera. That's a different objective than mapping a space, which is mainly being disrupted by improvements coming out of the self-driving car and ADAS industries.
I feel like the lighting part will become "easy" once we're able to greatly simplify the geometry and correlate it across multiple "passes" through the same space at different times.
In other words, if you've got a consistent 3D geometric map of the house with textures, then you can do a pass in the morning with only daylight, one at midday with only daylight, one in the late afternoon with only daylight, and then one at night with artificial light.
If you're dealing with textures that map onto identical geometry (and assuming no objects move during the day), it seems like it ought to be relatively straightforward to train AIs to produce a flat, unlit texture version, especially since you can train them on easily generated raytraced renderings. There might even be straight-up statistical methods to do it (toy example below).
So I think it's not the lighting itself that is the biggest problem -- it's having clean, consistent geometry in the first place.
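As a toy version of the statistical idea: assume the passes are perfectly aligned in texture space and that lighting acts multiplicatively on a fixed albedo (both big assumptions). Then normalizing each capture's exposure in log space and averaging per texel leaves something like the unlit albedo, up to a global scale.

    # Toy de-lighting: estimate a flat albedo from N captures of the same
    # texture under different lighting. Assumes aligned UVs, static objects,
    # and purely multiplicative shading -- all big simplifications.
    import numpy as np

    def estimate_albedo(captures):                 # list of HxWx3 arrays in [0, 1]
        logs = np.log(np.clip(np.stack(captures), 1e-4, 1.0))
        logs -= logs.mean(axis=(1, 2, 3), keepdims=True)   # remove per-capture exposure
        albedo = np.exp(logs.mean(axis=0))                 # per-texel mean in log space
        return albedo / albedo.max()                       # normalize to [0, 1]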
Maybe a quick, cheap NeRF with some object recognition, 3D object generation, and replacement, so at least you have a sink where there is a sink and a couch where you have a couch, even if they might look a bit different.
Is there a Teardown mod that uses reality-captured models? Or is there any video of it? I have played the game once; the destruction was awesome. I want to see what it looks like done the way you described.
My parents had a floor plan of our house drawn up for some reason, and when I was in late middle school I found it and modeled the house in the Hammer editor so my friends and I could play Counter-Strike: Source in there.
It wasn't very well done, but I figured out how to make the basic walls and building, add stairs, add some windows, and grab some pre-existing props like simple couches, beds, and a TV, and it was pretty recognizable. After adding a couple of ladders to the outside so you could climb in the windows or onto the roof, the map was super fun just as a map, and doubly so since I could do things like hide in my own bedroom closet and recognize the rooms.
Took some work since I didn't know how to do anything but totally worth it. I feel like there has to be a much more accessible level editor in some game out there today, not sure what it would be though.
I thought my school had great architecture for another map but someone rightfully convinced me that would be a very bad idea to add to a shooting game. So I never made any others besides the house.
An interactive game is much more than just rendering. You need object separation, animation, collision, worldspace, logic, and your requirement of destructibility takes it to a completely different level.
NeRF is not that, it's just a way to represent and render volumetric objects. It's like 10% of what makes a game. Eventually, in theory, it might be possible to make NeRFs or another similar representation animated, interactive, or even entirely drivable by an end-to-end model. But the current state is so far from it that it isn't worth speculating about.
What you want is doable with classic tools already.
I've dreamed of that since I was a kid, so for nearly three decades now. It was entirely possible even then; it was just a matter of applying enough elbow grease. The problem is, the world is full of shiny happy people ready to call you a terrorist, assert their architectural copyright, or bring in the "creepiness factor" to shut down anyone who tries this.
There's also just the fact that 1:1 reproductions of real-world places rarely make for good video game environments. Gameplay has to inform the layout and set dressing, and how you perceive space in games requires liberties to be taken to keep interiors from feeling weirdly cramped (any kid who had the idea to measure their house and build it in Quake or CS found this out the hard way).
The main exception I can think of is racing simulators; it's already common for their developers to drive LiDAR cars around real-world tracks and use that data to build a 1:1 replica for the game. NeRF might be a natural extension of that if they can figure out a way to combine it with dynamic lighting and weather conditions.
Having destructible objects is in no way possible on contemporary hardware, unless you simplify the physics to the extreme. Perhaps I'm misunderstanding your statement?
Recognising objects for what they are has only recently become somewhat possible. Separating them in a 3D scan is still pretty much impossible.
Destructible environments have been a thing for like....a decade or so? There's plenty of tricks to make it realistic enough to be fun without simulating every molecule.
A whole lot of manual work goes into making destructible 3D assets. All told, I've put nearly a full work week into perfecting a breaking-bottle simulation in Houdini to add to my demo reel, and it's still not quite there. And that's starting with nice, clean geometry that I made myself! A lot of it comes down to breaking things up based on Voronoi tessellation of the surfaces, which is easy when you've got an eight-cornered cube, but it takes a lot more effort and is much more error-prone as the geometric complexity increases (crude sketch below). If you can figure out how to easily turn real-world 3D scans into simple, realistic-looking, manifold geometry that's clean enough for standard 3D asset pipelines, you'll make a lot of money doing it.
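For anyone curious, the crudest possible version of the nearest-Voronoi-seed idea looks something like this (trimesh, made-up seed count, not my actual Houdini setup). Note that it only splits the surface into open shells; capping those into watertight, simulation-ready solids is exactly the hard, error-prone part.

    # Crude "fracture" sketch: assign each face to its nearest random seed
    # point, Voronoi-style. The shards are open shells -- turning them into
    # watertight solids is the hard part.
    import numpy as np
    import trimesh

    def naive_shards(mesh, n_seeds=12, rng_seed=0):
        rng = np.random.default_rng(rng_seed)
        lo, hi = mesh.bounds
        seeds = rng.uniform(lo, hi, size=(n_seeds, 3))
        centers = mesh.triangles_center                    # one centroid per face
        owner = np.argmin(
            np.linalg.norm(centers[:, None] - seeds[None], axis=2), axis=1)
        return [mesh.submesh([np.flatnonzero(owner == i)], append=True)
                for i in range(n_seeds) if np.any(owner == i)]

    shards = naive_shards(trimesh.creation.icosphere(subdivisions=3))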
We've had destructible polygonal and voxel environments for a while now, yes. Destructible NeRFs are a whole other ball game; we're only just starting to get a handle on reliably segmenting objects within NeRFs, let alone animating them.
My statement applies even without the destructible environment part, even though that was already mainstream 23 years ago! See Red Faction. No, just making a real-life place a detailed part of a video game is going to cause the pushback I mentioned.