A Revolution in How Robots Learn

m_ke · 2024-11-26T20:50:07 1732654207

I did a review of state of the art in robotics recently in prep for some job interviews and the stack is the same as all other ML problems these days, take a large pretrained multi modal model and do supervised fine tuning of it on your domain data.

In this case it's "VLA" as in Vision Language Action models, where a multimodal decoder predicts action tokens and "behavior cloning" is a fancy made up term for supervised learning, because all of the RL people can't get themselves to admit that supervised learning works way better than reinforcement learning in the real world.

Proper imitation learning where a robot learns from 3rd person view of humans doing stuff does not work yet, but some people in the field like to pretend that teleoperation and "behavior cloning" is a form of imitation learning.

ibash · 2024-11-26T23:25:03 1732663503

Would you mind sharing which readings you found most useful in that review?

chillel · 2024-11-27T00:18:49 1732666729

yeah, I'd love to see your list

ratedgene · 2024-11-26T22:25:16 1732659916

Hey, I wonder if we can use LLMs to learn learning patterns, I guess the bottleneck would be the curse of dimensionality when it comes to real world problems, but I think maybe (correct me if I'm wrong) geographic/domain specific attention networks could be used.

Maybe it's like:

1. Intention, context 2. Attention scanning for components 3. Attention network discovery 4. Rescan for missing components 5. If no relevant context exists or found 6. Learned parameters are initially greedy 7. Storage of parameters gets reduced over time by other contributors

I guess this relies on there being the tough parts: induction, deduction, abductive reasoning.

Can we fake reasoning to test hypothesis that alter the weights of whatever model we use for reasoning?

ratedgene · 2024-11-26T22:28:12 1732660092

Maybe I'm just complicating unsupervised reinforcement learning, and adding central authorities for domain specific models.

Animats · 2024-11-26T19:15:53 1732648553

A research result reported before, but, as usual, the New Yorker has better writers.

Is there something which shows what the tokens they use look like?

x11antiek · 2024-11-26T13:51:57 1732629117

https://archive.is/fsuxe

josefritzishere · 2024-11-26T22:02:54 1732658574

There's a big asterisk on the word "learn" in that headline.

codr7 · 2024-11-26T20:21:14 1732652474

Oh my, that has to be one of the worst jobs ever invented.