Show HN: SplatGallery – A Community-Driven Gallery for Gaussian Splats (splatgallery.com)
35 points by mackopes 1 day ago | 15 comments
I just launched SplatGallery, a platform where you can share and discover the best 3D models created with Gaussian Splatting.

It's at a super early stage and new models are coming in fairly often.

Would love to get feedback from the HN community on how to improve it!

HN readers interested in more splats might be interested in Polycam's discovery page [1]. Same idea, but limited to splats generated with Polycam (either in the app or with uploaded images). It's also a less filtered look at the technology: more a collection of neat things than a strict highlight reel.

[1] https://poly.cam/explore?type=splat&feed=trending


I love the Gaussian splatting that's going on. I also love the people pushing Gaussian splatting and generative AI. I really feel there is something there, but I'm not quite sure what yet. It's cool seeing this unfold, but I'm also worried it could turn into something like Photosynth, where it was a cool exercise but not much came out of it. I'd love input from someone involved in this tech who could blue-sky where it could be applied in interesting ways.

> like Photosynth, where it was a cool exercise but not much came out of it

It's hard to know what Photosynth could have become. It was shut down by Microsoft for unknown reasons. If it had been open source, it might have evolved and might still be around.

Gaussian Splatting has multiple implementations on both the training/generation side and the rendering side.


100%. I really think Gaussian Splatting as a method can offer a lot, but it's also not (yet) powerful enough to be broadly useful. Specifically, editing and relighting GS models is rather clunky.

I think SplatGallery is an exploration of how people use the technology, what it can be useful for today, and what the next step is.


Gaussian splatting has been my obsession lately =)

Hard to believe, but the main technique is just over a year old (built on the shoulders of giants). This is the seminal paper for it [1], and here's a three-hour video so you really understand it [2] (also a great how-to-really-read-a-paper-if-you-are-serious video). What's great is that a bunch of labs saw that and started building on it, so in the last three months there have been so many great improvements. And so much work done openly!

This has a great video roll [3] of some recent work, including use in construction and forestry.

If you have an Apple Vision Pro, try MetalSplatter [4] and you will get an idea of how OMFG this stuff is.

We are in such a rich time of compressing knowledge and reality into weird new representations!

I've been trying to evangelize it, but it takes so much foundation to understand how interesting it is. Many people (even software engineers and computer scientists) don't understand traditional 3D rendering pipelines, meshes/triangles, and lighting. So you have to explain that, then the concepts of spherical harmonics and Gaussians with affine transforms, and the miracle that happens when you rasterize millions of them at 800 fps, while the neural network approaches run at 2 fps.
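To make the spherical harmonics part concrete: each splat's color is a small stack of SH coefficients evaluated against the view direction. A rough sketch of that evaluation, restricted to degrees 0 and 1 and following the sign conventions common in 3DGS implementations (function and argument names here are illustrative):

    import numpy as np

    # Real spherical-harmonics constants for degrees 0 and 1
    C0 = 0.28209479177387814   # 1 / (2 * sqrt(pi))
    C1 = 0.4886025119029199    # sqrt(3) / (2 * sqrt(pi))

    def sh_color(sh, view_dir):
        """View-dependent RGB from degree-0/1 SH coefficients.

        sh: (4, 3) array, one DC term plus three degree-1 terms per channel.
        view_dir: (3,) unit vector from the camera toward the splat.
        """
        x, y, z = view_dir
        rgb = C0 * sh[0]                                  # view-independent base color
        rgb = rgb - C1 * y * sh[1] + C1 * z * sh[2] - C1 * x * sh[3]
        return np.clip(rgb + 0.5, 0.0, 1.0)               # +0.5 offset, as in common implementations

Higher degrees just add more terms per splat, which is why "more spherical harmonics" mostly means storing and evaluating more coefficients.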

In ideating and communicating the possibilities, I try to focus on the workflows...

Capture: we can sample the lighting and depth of scenes with our phones, fancy cameras, or drones; we can also use photogrammetry (algorithms/math to infer depth from photos). This isn't specific to 3DGS, but 3DGS makes it useful. So we have tech that lets us more easily capture objects and environments, edit them, and play them back. 3D capture has been around a long time (e.g. I worked on the Immersion Microscribe in the mid-90s [5], and this is what machine vision used to be about), but we didn't have techniques to infer structure from the point clouds.

Processing: magic math turns this into a bunch of Gaussians, each with a transformation matrix (bell curves of different shapes floating around), which together represent the structure. Literally, the components are spherical harmonics (color), density (alpha), variance, translation, and rotation. Scenes will have many hundreds of thousands to tens of millions of them.
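Concretely, a per-splat record might look like this. A minimal sketch (field names are mine), with the anisotropic "variance" stored the way most 3DGS implementations do it: per-axis scales plus a rotation quaternion.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Splat:
        position: np.ndarray   # (3,) translation: where the Gaussian sits
        rotation: np.ndarray   # (4,) unit quaternion: orientation of the ellipsoid
        scale: np.ndarray      # (3,) per-axis standard deviations: the shape
        opacity: float         # alpha/density in [0, 1]
        sh: np.ndarray         # (K, 3) spherical-harmonics color coefficients

        def covariance(self):
            """3x3 covariance Sigma = R S S^T R^T from quaternion + scales."""
            w, x, y, z = self.rotation
            R = np.array([
                [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
                [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
                [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
            ])
            S = np.diag(self.scale)
            return R @ S @ S.T @ R.T

A scene is just a huge flat array of these, with no connectivity or surface structure.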

I tend to mention how LLMs capture aspects of knowledge in a big sea of weights that computation gets applied to, and that this is, very abstractly, similar; researchers have also worked with Neural Radiance Fields. But what's great about Gaussian+transform is that you can actually get an intuition for what's going on, and the editors let you edit, filter, and prune the Gaussians. You can't do that with a neural network (have intuition and direct manipulation).
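That direct manipulation really is just array filtering. A hypothetical cleanup pass, assuming the splat fields above are stacked into flat numpy arrays (names and thresholds are illustrative):

    import numpy as np

    def prune_mask(opacity, scale, min_alpha=0.01, max_extent=1.0):
        """Boolean mask keeping splats that are visible and not absurdly large.

        opacity: (N,) alphas in [0, 1]; scale: (N, 3) per-axis extents.
        Dropping near-transparent or oversized "floater" Gaussians is a
        common cleanup step in splat editors.
        """
        visible = opacity > min_alpha
        compact = scale.max(axis=1) < max_extent
        return visible & compact

    # keep = prune_mask(opacity, scale)
    # positions, colors = positions[keep], colors[keep]

Try doing that to the weights of a NeRF.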

Rendering: those Gaussians are sampled and drawn at interactive rates. You can represent scenes or objects. You can commingle them with traditional 3D assets. It works very well on recent phones. So this tech is broadly available now for playback.
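Under the hood, playback is essentially a depth sort plus front-to-back alpha blending. A simplified single-pixel sketch of the compositing step (the real renderers do this per tile on the GPU):

    import numpy as np

    def composite_pixel(colors, alphas, depths):
        """Blend the Gaussians covering one pixel, nearest first.

        colors: (N, 3), alphas: (N,), depths: (N,) -- each 3D Gaussian has
        already been projected to 2D and evaluated at this pixel.
        """
        order = np.argsort(depths)             # front to back
        out = np.zeros(3)
        transmittance = 1.0
        for i in order:
            out += transmittance * alphas[i] * colors[i]
            transmittance *= 1.0 - alphas[i]
            if transmittance < 1e-4:           # early exit, like the tile rasterizer
                break
        return out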

The thing about splats is that they lack structure. They can be very ghostly and ethereal. So I think in the near term this tech will be fantastic for customization, personal object capture, and integration with scenes: not so much for simulation as for human communication. As noted, the forestry and construction videos hint at this. Also product displays on websites; here's a Shopify plugin [6].

I have so much to say about it, but will stop here =)

[1] https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

[2] https://www.youtube.com/watch?v=xgwvU7S0K-k&t=1s

[3] https://www.youtube.com/playlist?list=PLrhy9mGYkm0aZnjL-4OpO...

[4] https://apps.apple.com/us/app/metalsplatter/id6476895334

[5] https://revware.net/microscribe-portable-cmm/microscribe-i-p...

[6] https://bitbybit.dev


1. Support for WebXR - the first thing I want to do with a splat is view it in a headset

2. Support for any emerging formats that improve on .ply (https://github.com/nianticlabs/spz looks promising; see the .ply loading sketch after this list)

3. Better navigation controls (or at least document what you already support)
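On point 2: part of why .ply feels heavy for splats is that each one is stored as dozens of flat float properties. A minimal loading sketch, assuming the property layout used by the INRIA reference implementation and the third-party plyfile package:

    import numpy as np
    from plyfile import PlyData  # pip install plyfile

    def load_splats(path):
        """Read a 3DGS .ply, assuming the reference layout: x/y/z, opacity,
        scale_0..2, rot_0..3, f_dc_0..2 (plus f_rest_* for higher-order SH,
        skipped here)."""
        v = PlyData.read(path)["vertex"]
        xyz = np.stack([v["x"], v["y"], v["z"]], axis=-1)
        opacity = 1.0 / (1.0 + np.exp(-np.asarray(v["opacity"])))  # stored pre-sigmoid
        scale = np.exp(np.stack([v["scale_0"], v["scale_1"], v["scale_2"]], axis=-1))  # stored as log
        rot = np.stack([v[f"rot_{i}"] for i in range(4)], axis=-1)
        sh_dc = np.stack([v[f"f_dc_{i}"] for i in range(3)], axis=-1)
        return xyz, opacity, scale, rot, sh_dc

Formats like spz mostly win by quantizing those floats instead of shipping raw 32-bit values.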


Thanks a lot for trying it out and the feedback!

More formats are definitely on the roadmap. First I'd like to add full support for higher-order spherical harmonics, and then I want to look at other methods too. Maybe 2DGS and/or SMERF would make sense for comparison purposes.

Explanation of controls makes sense, too. I'll add that.

Regarding WebXR, that's currently not on my roadmap. However, if the demand is there, I'd consider adding it.


Thanks for making this, OP! It's a quick way to get people into viewing splats, and great to see the community being fostered.

* I discovered I can translate with ctrl-mouse, but maybe add WASD keys as well?

* I discovered "L" brings up a camera rotation display. The sliders rotates on weird axes (wrong space?), so not too helpful of a UX. Clicking the help magnifier says "Open Filter with CMD+SHIFT+L" but on Safari that opens the sidebar.

* A simple help display for either of those might help, too

EDIT: * Tried it on an iPhone 13 Pro... touch controls work great and it is so smooth and good-looking. In landscape mode, the header takes up a lot of space; a full-screen button would be great.


Thanks a lot for trying it out and the feedback!

- Adding better controls (and an explanation) makes sense!

- Good job discovering the L control. That's actually something that's not supposed to be public; I use that tool to find the correct orientation of splats before publishing them. Once I've found those values and saved them in the DB, there's no use for the tool anymore. When I make it available to users I'll definitely make the UX clearer.

- Good point about landscape mode. I'll fix that.


Please don't break standard browser behaviour.

The two category buttons at the top should be <a> tags, not buttons. They should change the URL history so the back button works and so I can alt-click to open in a new tab.


Good point. I'll fix this ASAP. I can imagine it's annoying when you can't alt+click.

Great work!

Are there any particular niches or applications where community-sourced splats are especially well-suited? I see them mentioned on HN from time to time, but I'm curious about their primary use case at the moment.


For me they are by far the best way to view a 3D scene captured from the real world. They are the real "volumetric photograph" and (especially in a VR headset) give you a strong feeling of being "back there".

360/180 panoramas are fun (I especially used to love the Google stereo panoramas that also contained an ambient audio loop from the moment of capture), but a splat scene is a whole different level.


Thanks! At the moment the intent is to showcase what the method is capable of (and what it isn't).

One of my future plans is to extend it to methods beyond Gaussian splatting, so that we can compare them across a wide range of captures with varying light conditions, different materials, etc.


I wonder if they may become part of a benchmark in the future.


