In the sizzle reel, the early waterdrop demos are beautiful but seem staged; the later robotics demos look more plausible and genuinely impressive. But referring to all of this as "4D dynamical worlds" sounds overhyped / scammy - everyone else calls 3D space simulated through time a 3D world.
> Genesis's physics engine is developed in pure Python, while being 10-80x faster than existing GPU-accelerated stacks like Isaac Gym and MJX. ... Nvidia brought GPU acceleration to robotic simulation, speeding up simulation speed by more than one order of magnitude compared to CPU-based simulation. ... Genesis pushes up this speed by another order of magnitude.
I can believe that setting up some kind of compute pipeline from a high-level language such as Python could be fast, but the marketing materials don't explain any of the "how". If it's real it must be GPU-accelerated, yet they almost imply that it isn't. Looks neat, hope it works great!
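For context on how "pure Python but fast" can be literally true: the kernels are written in Python syntax but compiled, with the interpreter only orchestrating - JIT frameworks like Taichi and Numba work this way and can emit GPU code. Here's a toy sketch of that orchestration-vs-kernel split, using NumPy's compiled kernels as a stand-in; the function and names are mine, not Genesis's:

```python
import numpy as np

def step(pos, vel, dt, g=-9.81):
    """One naive explicit-Euler step for a batch of particles.

    Python only dispatches here; the per-particle arithmetic runs in
    NumPy's compiled C kernels, not the interpreter. A GPU JIT takes
    the same idea further by compiling the whole kernel to device code.
    """
    vel = vel.copy()
    vel[:, 2] += g * dt               # gravity on the z component
    return pos + vel * dt, vel

# One 1 ms step for a million particles starting at rest:
n = 1_000_000
pos, vel = step(np.zeros((n, 3)), np.zeros((n, 3)), dt=1e-3)
# every particle picks up g*dt ≈ -0.00981 m/s of downward velocity
```

The point is that the Python-level cost is per *batch*, not per particle, so "developed in pure Python" says nothing either way about where the math actually executes.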
Given what's there today, especially the sizzle reel, I'm pretty dubious.
If the author drops an amazing generative text-to-sim system on top of this... THAT would be impressive - but it's effectively orthogonal to what's there - so I'm withholding excitement for now.
Take the time to read over the repo. It is not revolutionary. It is an integration of a bunch of third party packages (which are largely C/C++ libraries with Python wrappers, not "pure python"!). The stuff unique to Genesis is adequate implementations of well-known techniques, or integration code.
The backflip is awesome but plausibly explained by the third party RL library, and they include an example program which... runs a third party library to do just this.
The performance numbers are so far beyond real-world numbers as to be incoherent. If you redefine what all the words mean, then the claims are not comparable to existing claims using the same words. 43 million FPS means, if my math is right, you are spending ~70 clocks per frame on a 3 GHz processor. On a 4080 you would have ~500k clocks in the same period, but that implies 100% utilization with zero overhead from Amdahl's law. (Also, hi, Erwin - maybe you think these claims are 100% realistic for meaningful workloads, in which case I'll gladly eat crow, since I have a huge amount of respect for Bullet!)
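The back-of-envelope math above, spelled out - the 4080 figures (~9728 CUDA cores at ~2.5 GHz boost) are my assumptions:

```python
# Clock budget per frame implied by a 43M FPS claim.
fps = 43e6

# A single 3 GHz CPU core:
cpu_hz = 3e9
cpu_clocks_per_frame = cpu_hz / fps          # ≈ 70 clocks per frame

# Aggregate clocks across an RTX 4080 in the same 1/43M-second window,
# assuming ~9728 CUDA cores at ~2.5 GHz and (unrealistically) 100%
# utilization with zero dispatch overhead:
gpu_cores = 9728
gpu_hz = 2.5e9
gpu_clocks_per_frame = gpu_cores * gpu_hz / fps   # ≈ 565k, the "~500k" above

print(round(cpu_clocks_per_frame), round(gpu_clocks_per_frame))
```

Even under those generous assumptions, ~565k total GPU clocks per frame is a tiny budget for any nontrivial physics step.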
I can only judge what's released now, not a theoretical future release, and what's here now is something a really good developer could bang out in a couple of months. The USP is the really good spin around the idea that it's uniquely suited for AI to produce beyond-SOTA results.
> But referring to all these "4D dynamical worlds" sounds overhyped / scammy - everyone else calls 3D space simulated through time a 3D world.
In the research community, "4D" is a commonly used term to differentiate from work on static 3D objects and environments, especially in recent years since the advent of NeRF.
The term "dynamic" has long been used similarly, but sometimes connotes a narrower scope. For example, reconstruction of cloth dynamics from an RGBD sensor, human body motion from a multi-view camera rig, or a scene from video, but assuming that the scene can be decomposed into rigid objects with their individual dynamics and an otherwise static environment. An even narrower related term in this space would be "articulated", such as reconstruction of humans, animals, or objects with moving parts. However, the representations used in prior works typically did not generalize outside their target domains.
So, "4D" has become more common recently to reflect the development of more general representations that can be used to model dynamic objects and environments.
If you'd like to find related work, I'd recommend searching in conjunction with a conference name to start, e.g. "4D CVPR" or "4D NeurIPS", and then digging into webpages of specific researchers or lab groups. Here are a couple of interesting related works I found:
All that considered, "4D dynamical worlds" does feel like buzzword salad, even if the intended audience is the research community, for two main reasons. First, it's as if some authors with a background in physics simulation wanted to reference "dynamical systems"; but none of the prior work in 4D reconstruction/generation uses "dynamical", they use "dynamic". Second, as described above, the whole point of "4D" is that it's more general than "dynamic", so using both is redundant. "4D worlds" would be more appropriate IMO.
It's a feature of that field of science. I'm currently working in a lab that is doing a bunch of things that in papers are described as $adjective-AI. In practice it's just a slightly hyped, but vaguely agreed-upon-by-consensus, weird science-paper-English term, or set of terms. (In the same way that Gaussian splats are totally just point clouds with efficient alpha blending [only slightly more complex; please don't just take my word for it].)
You probably understand what this term is meant to describe, but spelling it out gives a bit of insight into _why_ it's got such a shite name.
o "4d": because its doing things over time. Normally thats a static scene with a camera flying through it (3D). when you have stuff other than the camera moving, you get an extra dimension, hence 4D.
o "dynamical" (god I hate this) dynamic means that objects in the video are moving around. So you can just used the multiple camera locations to build up a single view of an object or room, you need to account for movement of things in the scene.
o "worlds" to highlight that its not just one room being re-used over and over, its a generator (well its not, but thats for another post) of diverse scenes that can represent many locations around the world.