What if, instead of a video game, this were trained on video and control inputs from people operating equipment like warehouse robots? An automated system could then visualize the outcome of a proposed action, or series of actions, before executing it on the real equipment. You would still need a separate model or algorithm to propose control inputs, but the world model would give the system a way to validate and refine plans as part of a problem-solving feedback loop (sketched below).
> Robotic Transformer 2 (RT-2) is a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control
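For concreteness, here is one way that validate-and-refine loop could look: random-shooting model-predictive control wrapped around a learned video model. This is a minimal sketch, not RT-2's actual method; `WorldModel`, `score_outcome`, `plan`, and the frame/action shapes are all hypothetical placeholders.

```python
import numpy as np

class WorldModel:
    """Stand-in for a video-prediction model trained on (video, control) pairs."""
    def predict(self, frames: np.ndarray, actions: np.ndarray) -> np.ndarray:
        # A real model would roll the video forward conditioned on the actions;
        # this placeholder just repeats the last observed frame per action step.
        return np.repeat(frames[-1:], len(actions), axis=0)

def score_outcome(predicted_frames: np.ndarray, goal_frame: np.ndarray) -> float:
    # Toy objective: negative pixel distance between the final predicted
    # frame and an image of the desired end state.
    return -float(np.mean((predicted_frames[-1] - goal_frame) ** 2))

def plan(model, frames, goal_frame, horizon=8, n_candidates=64, action_dim=4):
    """Random-shooting MPC: sample action sequences, simulate each with the
    world model, and keep the sequence whose predicted outcome scores best."""
    best_actions, best_score = None, -np.inf
    for _ in range(n_candidates):
        actions = np.random.uniform(-1.0, 1.0, size=(horizon, action_dim))
        predicted = model.predict(frames, actions)
        s = score_outcome(predicted, goal_frame)
        if s > best_score:
            best_actions, best_score = actions, s
    return best_actions  # execute the first action, observe, then replan

if __name__ == "__main__":
    model = WorldModel()
    frames = np.zeros((4, 64, 64, 3))   # recent camera frames (hypothetical shape)
    goal = np.ones((64, 64, 3))         # image of the desired end state
    actions = plan(model, frames, goal)
    print("first planned action:", actions[0])
```

Executing only the first action and then replanning against fresh observations is what makes this a feedback loop rather than open-loop playback; the proposal step here is plain random sampling, which a stronger policy model could replace.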