I work on a much easier problem (physics-based character animation) after spendi...

glial · 2025-11-15T05:45:44 1763185544

"We present Dreamer 4, a scalable agent that learns to solve control tasks by imagination training inside of a fast and accurate world model. ... By training inside of its world model, Dreamer 4 is the first agent to obtain diamonds in Minecraft purely from offline data, aligning it with applications such as robotics where online interaction is often impractical."

In other words, it learns by watching, i.e. by having more data of a certain type.

onlyrealcuzzo · 2025-11-14T14:03:44 1763129024

Is physics-based character animation an easier problem?

Almost any problem can be really hard depending on how many 9s of reliability you need.

Maybe there's more room for error in a lot of robotics applications than for your physics-based character animation?
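
The point about 9s can be made concrete with a little arithmetic. This is only a minimal sketch, assuming a task is a chain of independent steps that each succeed with the same probability; the function name `task_success_rate` and the 100-step figure are illustrative assumptions, not numbers from any real system.

```python
def task_success_rate(per_step_reliability: float, num_steps: int) -> float:
    """Probability that every step succeeds, assuming the steps are independent."""
    return per_step_reliability ** num_steps


if __name__ == "__main__":
    # Hypothetical 100-step task at two, three, and four 9s of per-step reliability.
    for reliability in (0.99, 0.999, 0.9999):
        rate = task_success_rate(reliability, 100)
        print(f"per-step {reliability}: whole task {rate:.3f}")
```

Under these assumptions, two 9s per step leaves the 100-step task succeeding only a bit more than a third of the time, which is why the required number of 9s can dominate how hard a problem is.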

golol · 2025-11-14T12:46:44 1763124404

I am pushing the optimism a bit, of course, but we can currently see many demos of robots doing basic tasks, and it seems quite easy nowadays to do this with the data-driven approach.

wordpad · 2025-11-13T21:53:06 1763070786

Why? Physics of large discrete objects (such as a robot) isn't very complicated.

I thought it was fast, accurate OCR that's holding everything back.

markisus · 2025-11-13T22:14:12 1763072052

The problem becomes complicated once the large discrete objects are not actuated. Even worse if the large discrete objects are not consistently observable because of occlusions or other sensor limitations. And almost impossible if the large discrete objects are actuated by other agents with potentially adversarial goals.

Self-driving cars, an application in which the physics is simple and arguably two-dimensional, have taken more than a decade to get to a deployable solution.

jcims · 2025-11-13T23:30:56 1763076656

I just grabbed a beer about ten minutes ago.

Next to zero cognition was involved in the process. There's some kind of hierarchy of thought in the way my mind/brain/body processed the task. I did cognitively decide to get the beer, but I was focused on something at work and continued to think about that in great detail as the rest of me did all of the motion planning and articulation required to get up, walk through two doorways, open the door on the fridge, grab a beer, close the door, walk back and crack the beer as I was sitting down.

Basically zero thought in that entire sequence.

I think what's happening today with all of this stuff is ultimately like me trying to play Für Elise on piano. I don't have a piano. I don't know how to play one. I'm going to be all brain in that entire process and it's going to be awful.

We need to learn how to use the data we have to train these layers of abstraction that allow us to effectively compress tons of sophistication into 'get a beer'.