A mixture of many architectures. LLMs will probably play a part.
As for other possible technologies, I'm most excited about clone-structured causal graphs[1].
What's very special about them is that they are apparently a 1:1 algorithmic match to what happens in the hippocampus during learning[2]. To my knowledge, this is the first time an actual end-to-end algorithm has been replicated from the brain in a field other than vision.
[1] seems to be an amazing paper, bridging past relational models, pattern separation/completion, etc. As someone whose PhD dealt with hippocampal-dependent memory binding, I've always regarded hippocampal modeling as one of the more advanced areas of the field. Thanks!
A fun little bit of trivia: Mammalian brains implement Gabor filters in the primary visual cortex (V1), as the first step of the visual processing pipeline.
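If you want to play with that idea, here's a minimal sketch of a Gabor kernel in Python (NumPy only; the parameter values and the simplified isotropic Gaussian envelope are just illustrative, not a claim about actual V1 tuning):

    # Minimal sketch: an orientation-tuned Gabor kernel, roughly the kind of
    # filter that models V1 simple-cell receptive fields. Illustrative only.
    import numpy as np

    def gabor_kernel(size=21, wavelength=6.0, theta=0.0, sigma=4.0, phase=0.0):
        """A sinusoidal grating at angle `theta`, windowed by a Gaussian."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        x_theta = x * np.cos(theta) + y * np.sin(theta)  # rotate coordinates
        envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))  # Gaussian window
        carrier = np.cos(2 * np.pi * x_theta / wavelength + phase)
        return envelope * carrier

    # A small bank at several orientations; convolving an image with each
    # gives crude orientation/edge maps, much like the first V1 stage.
    bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]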
> The LLM can't describe it, it doesn't see inside itself, however, it has many textbooks in its training dataset, so it will grab an answer from these textbooks because that's how people answer.
EDIT: I see now that you were referring to the answers it uses to justify the result, not the underlying computations. Sorry! You can disregard the original comment below; leaving it for completeness.
ORIGINAL COMMENT:
That's not how it works. Addition in LLMs is believed to function through different mechanisms depending on model size and architecture, but the single consistent finding across different models is that they generalize beyond the training data for at least those simple arithmetic operations.
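For what it's worth, the way that kind of generalization usually gets checked is simple: sample sums that are very unlikely to appear verbatim in training text and score exact matches. A rough sketch, with ask_llm as a hypothetical stand-in for whatever model API you're using (not a real library call):

    # Hedged sketch of a generalization check for addition. `ask_llm` is a
    # hypothetical placeholder, not a real API; swap in your own model call.
    import random

    def ask_llm(prompt: str) -> str:
        raise NotImplementedError("replace with a real model call")

    def addition_accuracy(n_trials: int = 100, digits: int = 6) -> float:
        correct = 0
        for _ in range(n_trials):
            a = random.randint(10 ** (digits - 1), 10 ** digits - 1)
            b = random.randint(10 ** (digits - 1), 10 ** digits - 1)
            reply = ask_llm(f"What is {a} + {b}? Answer with only the number.")
            correct += reply.strip() == str(a + b)
        return correct / n_trials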
> I understand Sergey Brin/et al had a grandiose goal for DeepMind via their Atari games challenge - but why not try alternate methods - say build/tweak games to be RL-friendly?
Because the ultimate goal (real-world visual intelligence) would make that impossible. There's no way to compute the "essential representation" of reality; the photons are all there is.
There is no animal on planet earth that functions this way.
The visual cortex and plenty of other structures compress the data into useful, semantic information before feeding it into a 'neural' network.
Simply from an energy and transmission perspective, an animal would use up all its energy stores processing a single frame if we were to construct such an organism based on just 'feed pixels to a giant neural network'. Things like colors, memory, objects, recognition, faces, etc. are all part of the equation, not some giant neural network that runs from raw photons hitting cones/rods.
So this isn't biomimicry or cellular automata - it's simply a fascination, similar to self-driving cars being able to drive with an image -> {neural network} -> left/right/accelerate simplification.
Brains may operate on a compressed representation internally, but they only have access to their senses as inputs. A model that needs to create a viable compressed representation is quite different from one which is spoon fed one via some auxiliary data stream.
Also, I believe the DeepMind StarCraft model (AlphaStar) used a compressed representation rather than raw pixels, but that was a while ago. So that part was already kind of solved.
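To make the contrast concrete, here's a purely illustrative sketch of the two input regimes (the sizes, and the random linear map standing in for a learned encoder, are made up):

    # Illustrative only: raw pixel input vs. a spoon-fed compact state vector.
    import numpy as np

    rng = np.random.default_rng(0)

    # Raw sensory input: a 64x64 RGB frame, ~12k numbers per step.
    pixels = rng.random((64, 64, 3))

    # "Spoon-fed" representation: a handful of engineered, semantic features.
    state_vector = rng.random(8)

    # A learned encoder (a random linear map here, as a stand-in) has to
    # discover a comparably compact code from the pixels on its own.
    W = rng.normal(size=(8, pixels.size)) / np.sqrt(pixels.size)
    learned_code = W @ pixels.ravel()

    print(pixels.size, state_vector.size, learned_code.size)  # 12288 vs 8 vs 8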
> simply a fascination, similar to self-driving cars being able to drive with an image
Whether to use LiDAR is more of an engineering question about the cost/benefit of adding modalities. LiDAR has come down in price quite a bit, so going without it looks less wise in retrospect.
Brains also have several other inputs that an RL algorithm trained from raw data (pixels/waves, etc.) doesn't have:
- Millions of years of evolution (hence things like walking/swimming/hunting are usually not acquired characteristics, even among mammals)
- Memory - and I don't mean the raw neural network weights. I mean concepts/places/things/faces and so on that are already processed, labeled, and ready to go.
- Also, we don't know what we don't know - how do cephalopods and humans differ in 'intelligence'?
I am not trying to pooh-pooh the Dreamer kind of work: I am just waiting for someone to release a game that actually uses RL as part of the core logic (Sony's GT Sophy comes close).
Such a thing would be so cool and would not (necessarily) use pixels as they are too far downstream from the direct internal state!
That depends entirely on whether you believe understanding requires consciousness.
I believe that the type of understanding demonstrated here doesn't. Consciousness only comes into play when we become aware that such understanding has taken place, not in the process itself.
As someone who generally opposes Bukele on an ideological level, I'm not even that mad about the mass sweeps themselves. The way things were going before, it was obvious a drastic solution was needed.
It's the CECOT conditions and the boasting about cruel treatment that I'm unhappy about. At this point it's plain torture, and I believe it's wrong to torture people, even if those people are themselves torturers or worse.
While it's certainly not human-level intelligence, I don't see how you could say they don't have any sort of intelligence at all. There's clearly generalization there. What would you say is the threshold?
The threshold would be “produce anything that isn’t identical or a minor transfiguration of input training data.”
In my experience, my AI assistant in my code editor can’t do a damn thing that isn’t widely documented, and it sometimes botches tasks that are thoroughly documented (such as hallucinating parameter names that don’t exist). I notice this when I reach the edge of common use cases, where extending beyond the documentation requires following an implication.
For example, AI can’t seem to help me in any way with Terraform dynamic credentials, because the documentation is very sparse and the feature barely appears in blog posts or examples online. My setup, where the variable is populated dynamically, isn't shown in any real example anywhere, and I get a lot of irrelevant nonsense suggestions on how to fix it.
AI is a great “amazing search engine” and it can string together combinations of logic that already exist in documentation and examples while changing some names here and there, but what looks like true understanding really is just token prediction.
IMO the massive amount of training data is making the man behind the curtain look way better than he is.
That's creativity, not intelligence. LLMs can be intelligent while having very little creativity (or even none at all). I don't believe one necessarily requires the other.
That was an extreme example to illustrate the concept. My point is that reduced/little creativity (which is what the current models have) is not indicative of a total lack of intelligence.
[1] "Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps" https://www.nature.com/articles/s41467-021-22559-5
[2] "Learning produces an orthogonalized state machine in the hippocampus" https://www.nature.com/articles/s41586-024-08548-w