I didn't ignore any more context than you did, but I just want to acknowledge the irony that "context" (specifically, here, any sort of memory that isn't in the text context window) is exactly what these models lack.
For example, even the dumbest dog has a memory, a strikingly advanced concept model of the world [1], a persistent state beyond the last conversation history, and an ability to reason (that doesn't require re-running the same conversation sixteen bajillion times in a row). Transformer models do not. It's really cool that they can input and barf out realistic-sounding text, but let's keep in mind the obvious truths about what they are doing.
[1] "I like food. Something that smells like food is in the square thing on the floor. Maybe if I tip it over food will come out, and I will find food. Oh no, the person looked at me strangely when I got close to the square thing! I am in trouble! I will have to do it when they're not looking."
> that doesn't require re-running the same conversation sixteen bajillion times in a row
Let's assume a dog's visual system runs at 60 frames per second. If it takes 1 second to flip a bowl of food over, that's 60 data points of cause-and-effect data the dog's brain learned from.
Assuming it's the same for humans, let's say I take a 1-hour trip to the grocery store. That's 216,000 data points from one trip, not to mention auditory data, touch, smell, and even taste.
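For concreteness, here's that back-of-the-envelope arithmetic as a tiny Python sketch (the 60 fps figure is the assumption above, not a measured value):

    fps = 60                           # assumed visual "frame rate"
    bowl_flip_seconds = 1
    grocery_trip_seconds = 60 * 60     # one hour

    print(fps * bowl_flip_seconds)     # 60 cause-effect samples from one bowl flip
    print(fps * grocery_trip_seconds)  # 216,000 samples from one grocery trip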
> ability to reason [...] Transformer models do not
Can you tell me what reasoning is? Why can't transformers reason? Note I said transformers, not LLMs. You could make a reasonable (hah) case that current LLMs cannot reason (or at least not very well), but why are transformers as an architecture doomed?
What about chain of thought? Some have made the claim that chain of thought adds recurrence to transformer models. That's a pretty big shift, but you've already decided transformers are a dead end, so no chance of that making a difference, right?
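To make the "adds recurrence" claim concrete, here's a minimal sketch of autoregressive chain-of-thought decoding, assuming a stateless `model` callable that maps a token sequence to the next token (hypothetical, not any particular library's API). Each generated token is appended to the context, so the model's own output acts as carried-over state across otherwise independent forward passes:

    def chain_of_thought(model, prompt_tokens, max_steps=256, stop_token=0):
        context = list(prompt_tokens)      # the only "memory" available
        for _ in range(max_steps):
            next_token = model(context)    # one stateless forward pass
            context.append(next_token)     # output feeds back in: the recurrence
            if next_token == stop_token:
                break
        return context

Whether that loop counts as the kind of persistent state and reasoning the parent comment is asking for is exactly the point in dispute.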