benlivengood on March 27, 2023 | on: Do large language models need sensory grounding fo...

I'm not actually sure that's true. There is a lot of detail about the world represented in audio and video, and presumably large transformers could learn from the textures, shadows, articulated movements, the physics of how sounds are made, etc.