Inferring neural activity before plasticity for learning beyond backpropagation (nature.com)
145 points by warkanlock 10 days ago | 49 comments





It has been clear for a long time (e.g. Marvin Minsky's early research) that:

1. both ANNs and the brain need to solve the credit assignment problem
2. backprop works well for ANNs but probably isn't how the problem is solved in the brain

This paper is really interesting, but is more a novel theory about how the brain solves the credit assignment problem. The HN title makes it sound like differences between the brain and ANNs were previously unknown and is misleading IMO.


> The HN title makes it sound like differences between the brain and ANNs were previously unknown and is misleading IMO.

Agreed on both counts. There's nothing surprising in "there are differences between the brain and ANNs."

But their might be something useful in the "novel theory about how the brain solves the credit assignment problem" presented in the paper. At least for me, it caught my attention enough to justify giving it a full reading sometime soon.


s/their might/there might/

Dang it, how did I miss that. Uugh. :-(


Are there any results about the "optimality" of backpropagation? Can one show that it emerges naturally from some Bayesian optimality criterion or a dynamic programming principle? This is a significant advantage that the "free energy principle" people have.

For example, let's say instead of gradient descent you want to do a Newton descent. Then maybe there's a better way to compute the needed weight updates besides backprop?
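To make the contrast concrete, here is a minimal sketch on a toy quadratic loss (my own example with made-up curvatures, not anything from the paper). A Newton step rescales the gradient by the inverse Hessian and lands on the minimum in one step, while plain gradient descent is still crawling along the flat direction. Backprop only supplies the gradient; the open question is whether the curvature-corrected update can be had more cheaply.

```python
# Toy loss f(x, y) = 0.5 * (A*x^2 + B*y^2): a deliberately ill-conditioned quadratic.
A, B = 1.0, 100.0  # assumed curvatures, chosen only for illustration

def loss(x, y):
    return 0.5 * (A * x * x + B * y * y)

def grad(x, y):
    return A * x, B * y

def gd_step(x, y, lr):
    gx, gy = grad(x, y)
    return x - lr * gx, y - lr * gy

def newton_step(x, y):
    # The Hessian is diag(A, B), so H^-1 * grad divides each component by its curvature.
    gx, gy = grad(x, y)
    return x - gx / A, y - gy / B

# One Newton step from (1, 1).
nx, ny = newton_step(1.0, 1.0)

# One hundred gradient-descent steps from the same start.
px, py = 1.0, 1.0
for _ in range(100):
    px, py = gd_step(px, py, lr=0.01)  # lr has to stay below 2/B = 0.02 to be stable

print("Newton loss after 1 step:", loss(nx, ny))
print("GD loss after 100 steps: ", loss(px, py))
```

On a real network the Hessian is neither diagonal nor cheap, which is why approximations to it are the interesting part.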


I'd be willing to be proven wrong, but as a starting point I'd suggest it obviously isn't optimal for what it is being used for. The performance on tasks of AI seems to be quite poor relative to the time spent training. For example, when AIs overtake humans at Baduk it is normal for the AI to have played several orders of magnitude more games than elite human players.

The important thing is backprop does work and so we're just scaling it up to absurd levels to get good results. There is going to be a big step change found sooner or later where training gets a lot better. Maybe there is some sort of threshold we're looking for where a trick only works for models with lots of parameters or something before we stumble on it, but if evolution can do it so will researchers.


> For example, let's say instead of gradient descent you want to do a Newton descent. Then maybe there's a better way to compute the needed weight updates besides backprop?

IIRC, feedback alignment [1] approximates Gauss-Newton minimization. So there is an easier way, that is potentially biologically more plausible, though not necessarily a better way.

[1] https://www.nature.com/articles/ncomms13276#Sec20
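For anyone who hasn't seen feedback alignment, a minimal sketch of the mechanism on a tiny linear network (sizes, learning rate, and the feedback values in B are hypothetical, picked for illustration). The only change from backprop is that the error is sent back to the hidden layer through a fixed random vector B instead of through the forward weights W2. This only shows the update rule, not the Gauss-Newton connection.

```python
# Tiny linear network: x (scalar) -> h (2 units) -> y (scalar), target t = 2 * x.
B = [0.5, -0.3]    # fixed feedback weights (hypothetical values), never trained
W1 = [0.0, 0.0]    # input -> hidden weights
W2 = [0.0, 0.0]    # hidden -> output weights
LR = 0.1
DATA = [0.5, 1.0, -1.0]

def forward(x):
    h = [w * x for w in W1]
    y = sum(w * hi for w, hi in zip(W2, h))
    return h, y

for _ in range(200):
    for x in DATA:
        h, y = forward(x)
        e = y - 2.0 * x
        # Backprop would compute the hidden error as W2 * e; feedback alignment uses B * e.
        delta_h = [b * e for b in B]
        for i in range(2):
            W2[i] -= LR * e * h[i]
            W1[i] -= LR * delta_h[i] * x

# A positive dot product means the forward weights have come to align with B.
alignment = sum(w * b for w, b in zip(W2, B))
print("prediction for x=3:", forward(3.0)[1])
print("W2 . B =", alignment)
```

The network still learns the mapping even though the backward pass never sees W2, because W2 gradually lines up with B.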


Second order methods, and their approximations, can be used in weight updating, too.

> The HN title makes it sound like differences between the brain and ANNs were previously unknown and is misleading IMO

There are no words in the title which express this. Your own brain is "making it sound" like that. Misleading, yes, but attribute it correctly.


"differs fundamentally", being in the tense that it is, with the widely known context that AI is "modeled after the brain", definitely does suggest that oh no, they got the brain wrong when that modelling happened, therefore AI is fundamentally built wrong. Or at least I can definitely see this angle in it.

The angle I actually see in it though is the typical pitiful appeal to the idea that the brain is this incredible thing we should never hope to unravel, that AI bad, and that everyone working on AI is an idiot as per the link (and then the link painting a leaps and bounds more nuanced picture).


The title does express that, due to context. An article in Nature with the title "X is Y" suggests that, until now, we didn't know that X is Y, or we even thought that X is definitely not Y.

The title of the paper is: "Inferring neural activity before plasticity as a foundation for learning beyond backpropagation"

The current HN title ("Brain learning differs fundamentally from artificial intelligence systems") seems very heavily editorialized.


As https://news.ycombinator.com/item?id=42260033 said, the difference is not a new discovery, not surprising, and not the focus of the paper.

Making the 'fundamental difference' the focus seems like laying the foundation for a claim that AI lacks some ability because of the difference. The difference does mean you cannot infer abilities present in one by detecting them in the other. This is similar to, and about as profound as, saying that you cannot say that rocks can move fast because of their lack of legs. Which is true, but says nothing about the ability of rocks to move fast by other means.


Not my area of expertise, but this paper may be important for the reason that it is more closely aligned with the “enactive” paradigm of understanding brain-body-behavior and learning than a backpropagation-only paradigm.

(I like enactive models of perception such as those advocated by Alva Noe, Humberto Maturana, Francisco Varela, and others. They get us well beyond the straightjacket of Cartesian dualism.)

Rather than have error signals tweak synaptic weights after a behavior, a cognitive system generates a set of actions it predicts will accommodate needs. This can apparently be accomplished without requiring short term synaptic plasticity. Then if all is good, weights are modified in a secondary phase that is more about asserting utility of the “test” response. More selection than descent. The emphasis is more on feedforward modulation and selection. Clearly there must be error signal feedback so some of you may argue that the distinction will be blurry at some levels. Agreed.

Look forward to reading more carefully to see how far off-base I am.


Theories that brains predict the pattern of expected neural activity aren't new (e.g. this paper cites work towards the Free Energy Principle, but not Embodied Predictive Interoception Coding works). I have 0 neuroscience training so I doubt I'd be able to reliably answer my question just by reading this paper, but does anyone know how specifically their Prospective Configuration model differs from, or expands upon, the previous work? Is it a better model of how brains actually handle credit assignment than the aforementioned models?

The FEP is more about what objective function the brain (really the isocortex) ought to optimize. EPIC is a somewhat related hypothesis about how viscerosensory data is translated into percepts.

Prospective Configuration is an actual algorithm that, to my understanding, attempts to reproduce input patterns but can also engage in supervised learning.

I'm less clear on Prospective Configuration than the other two, which I've worked with directly.


Thanks!

> In prospective configuration, before synaptic weights are modified, neural activity changes across the network so that output neurons better predict the target output; only then are the synaptic weights (hereafter termed ‘weights’) modified to consolidate this change in neural activity. By contrast, in backpropagation, the order is reversed; weight modification takes the lead, and the change in neural activity is the result that follows.

What would neural activity changes look like in an ML model?
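One way to picture it, as a rough sketch and not the paper's code: assume a predictive-coding-style energy (as I understand it, the kind of energy-based network the paper works with) on a two-weight chain x -> h -> y. With the input and the target both clamped, the hidden activity h is first relaxed to lower the energy while the weights stay fixed. Only afterwards do the weights move, using purely local errors, to consolidate that activity. The function names, sizes, and rates below are all made up.

```python
def relax_hidden(x, target, w1, w2, steps=100, rate=0.1):
    """Settle the hidden activity h with input and output clamped; weights stay fixed."""
    h = w1 * x  # start from the ordinary feedforward value
    for _ in range(steps):
        eps_h = h - w1 * x        # prediction error at the hidden unit
        eps_y = target - w2 * h   # prediction error at the clamped output
        # Gradient descent on E = 0.5*eps_h^2 + 0.5*eps_y^2 with respect to h.
        h -= rate * (eps_h - w2 * eps_y)
    return h

def learn_once(x, target, w1, w2, lr=0.5):
    h = relax_hidden(x, target, w1, w2)  # 1) neural activity changes first
    w1 += lr * (h - w1 * x) * x          # 2) weights then consolidate it, locally
    w2 += lr * (target - w2 * h) * h
    return h, w1, w2

x, target, w1, w2 = 1.0, 2.0, 1.0, 1.0
before = w2 * w1 * x                     # feedforward output before learning
h, w1, w2 = learn_once(x, target, w1, w2)
after = w2 * w1 * x                      # feedforward output after one update

print("relaxed hidden activity:", h)
print("output before:", before, "after:", after)
```

In this toy case the hidden unit settles between its feedforward value and the value that would produce the target, and the weight update then pulls the feedforward output toward the target.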


Paper actually says that they fundamentally do learn the same way, but the fine details are different. Not too surprising.

The post headline is distracting people and making for a poor discussion. The paper describes a learning mechanism that has advantages over backprop, and may be closer to what we see in brains.

The contribution of the paper, and its actual title, is about the proposed mechanism.

All the comments amounting to ‘no shit, sherlock’, are about the mangled headline, not the paper.


Oh hey, I know one of the authors on this paper. I've been meaning to ask him at NeurIPS how this prospective configuration algorithm works for latent variable models.

The title of this post doesn't seem to have any connection to the title or content of the linked article.

The comments here saying this was obvious or something else more negative are disappointing. Neural networks are named for neurons in biological brains. There is a lot of inspiration in deep learning that comes from biology. So the association is there. Pretending you’re superior for knowing the two are still different, contributes nothing. Doing so in more specific ways, or attempting to further understand the differences between deep learning and biology through research, is useful.

Looks amazing if it pans out at scale. Would be great if someone tried this with one of those simulated robotic training tasks that always have thousands or millions of trials rather than just CIFAR-10.

Some are surprised that anyone would make this point, either the title or the research.

It might be a response to the many, many claims in articles that neural networks work like the brain. Even using terms like neurons and synapses. With those claims getting widespread, people also start building theories on top of them that make AI’s more like humans. Then, we won’t need humans or they’ll be extinct or something.

Many of us who are tired of that are both countering it and just using different terms for each where possible. So, I’m calling the AIs “models”, saying “model training” instead of “learning”, and talking about finding and acting on patterns in data. Even laypeople seem to understand these terms with less confusion about them being just like brains.


> It might be a response to the many, many claims in articles that neural networks work like the brain. Even using terms like neurons and synapses.

Artificial neural networks originated as simplified models of how the brain actually works. So they really do "work like the brain" in the sense of taking inspiration from certain rudiments of its workings. The problem is "like" can mean anything from "almost the same as" to "in a vaguely resembling or reminiscent way". The claim that artificial neural networks "work like the brain" is false under the first reading of "like" but true under the second.


Brain-inspired, neuromorphic architectures are usually very different from neural networks in machine learning. They’re so different (and better) that people who know both keep trying to reproduce brain-like architecture to gain its benefits.

One of my favorite features is how they use local, likely Hebbian, learning instead of global with backpropagation. (I won’t rule out some global mechanism, though.) The local learning makes their training much more efficient. Even if a global mechanism exists (eg during sleep?), brain architectures could run through more training data faster and cheaper. Expensive step just tidies it up in shorter periods of time.
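To show what "local" means here, a small sketch using Oja's rule, which is my pick of a normalized Hebbian rule and not something claimed about any particular brain circuit. Each weight update uses only the input x, the unit's own output y, and the weight itself; no error is propagated back from anywhere else. The data and learning rate are made up.

```python
import math

# Toy 2-D inputs whose dominant direction of variation is along (1, 1).
DATA = [(2.0, 2.0), (0.2, -0.2), (-2.0, -2.0), (-0.2, 0.2)]

def oja_update(w, x, lr):
    """One Oja step: Hebbian growth (y * x) plus a decay term (y^2 * w) that bounds |w|."""
    y = sum(wi * xi for wi, xi in zip(w, x))  # the unit's output
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

w = [1.0, 0.0]  # arbitrary starting weights
for _ in range(50):
    for x in DATA:
        w = oja_update(w, x, lr=0.05)

norm = math.sqrt(sum(wi * wi for wi in w))
print("weights:", w, "norm:", norm)
```

The weight vector ends up pointing along the dominant direction of the inputs with roughly unit length, and every update could have been computed at the synapse itself.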

They are also more analog, parallel, sparse, and flexible. They have feedback loops (IIRC). Multiple tiers of memory integrated with their internal representation with hallucination mitigation. They also have many specialized components that automatically coordinate to do the work without being externally trained to. All in around 100 watts.

Brains are both different from and vastly superior to ANN’s. Similarities do exist, though. They both have cells, connections, and change connections based on incoming data. Quite abstract. Past that, I’m not sure what other similarities they have. Some non-brain-inspired ANN’s have memory in some form but I don’t know if it’s as effective and integrated as the brain’s yet.


Totally agree! The "fire together, wire together" approach to training weights is super easy to parallelize, and you can design custom silicon to make it ridiculously efficient. Back when I was a Computational Neuroscience (CN) researcher, I worked with a team in Manchester that was exploring exactly that—not sure if they ever nailed it...

Funny enough, I actually worked with Rafal Bogacz, the last-named author of the paper we’re discussing, during his Basal Ganglia (BG) phase. He’s an incredibly sharp guy and made a pretty compelling argument that the BG implement the multihypothesis sequential probability ratio test (MSPRT) to decide between competing action plans in an optimal way.
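For readers who haven't met the MSPRT, a minimal sketch of the decision rule itself, with no claim about how the BG would implement it. It assumes Gaussian evidence with equal priors; the means, the threshold, and the sample values are all hypothetical. Evidence is accumulated as log-likelihoods for each hypothesis, and a decision is made as soon as one hypothesis's posterior crosses the threshold.

```python
import math

def msprt(observations, means, sigma=1.0, threshold=0.9):
    """Return (chosen_hypothesis, samples_used), or (None, n) if still undecided.

    Hypothesis i says the samples are Gaussian with mean means[i] and std sigma.
    """
    log_lik = [0.0] * len(means)
    n = 0
    for n, obs in enumerate(observations, start=1):
        for i, mu in enumerate(means):
            # Shared normalizing constants cancel in the posterior, so they are dropped.
            log_lik[i] += -0.5 * ((obs - mu) / sigma) ** 2
        # Posterior under equal priors: a numerically stable softmax of the log-likelihoods.
        top = max(log_lik)
        weights = [math.exp(l - top) for l in log_lik]
        total = sum(weights)
        posteriors = [wt / total for wt in weights]
        best = max(range(len(means)), key=lambda i: posteriors[i])
        if posteriors[best] >= threshold:
            return best, n  # stop sampling as soon as one hypothesis is confident enough
    return None, n

samples = [1.1, 0.8, 1.3, 0.9, 1.0, 1.2, 1.0, 0.9]
choice, used = msprt(samples, means=[0.0, 1.0, 2.0])
print("chose hypothesis", choice, "after", used, "samples")
```

Raising the threshold trades speed for accuracy: more samples are needed before any hypothesis is accepted.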

Back then, there was another popular theory that the BG used an actor-critic learning model—also quite convincing.

But here’s the rub: in CN, the trend is to take algorithms from computer science and statistics and map them onto biology. What’s far rarer is extracting new ML algorithms from the biology itself.

I got into CN because I thought the only way we’d ever crack AGI was by unlocking the secrets of the best example we’ve got—the mammalian brain. Unfortunately, I ended up frustrated with the biology-led approach. In ten years in the field, I didn’t see anything that really felt like progress toward AGI. CN just moves so much slower than mainstream ML!

Still, I hope Rafal’s onto something with this latest idea. Fingers crossed it gives ML researchers a shiny new algorithm to play with.


No? They work like how people assumed the brain actually works. We still don't understand how the brain works. You're too early to even make this claim.

> Even using terms like neurons and synapses. With those claims getting widespread, people also start building theories on top of them that make AI’s more like humans.

Except the networks studied here for prospective configuration are ... neural networks. No changes to the architecture have been proposed, only a new learning algorithm.

If anything, this article lends credence to the idea that ANNs do -- at some level -- simulate the same kind of thing that goes on in the brain. That is to say that the article posits that some set of weights would replicate the brain pretty closely. The issue is how to find those weights. Backprop is one of many known -- and used -- algorithms. It is liked because the mechanism is well understood (function minimization using calculus). There have been many other ways suggested to train ANNs (genetic algorithms, annealing, etc). This one suggests an energy based approach, which is also not novel.


"Except the networks studied here for prospective configuration are ... neural networks. No changes to the architecture have been proposed, only a new learning algorithm."

In scientific investigations, it's best to look at one component, or feature, at a time. It's also common to put the feature in an existing architecture to assess the difference that feature makes in isolation. Many papers trying to imitate brain architecture only use one feature in the study. I've seen them try stateful neurons, spiking, sparsity, Hebbian learning, hippocampus-like memory, etc. Others will study combinations of such things.

So, the field looks at brain-inspired changes to common ML, specific components that closely follow brain design (software or hardware), and whole architectures imitating brain principles with artificial deviations. And everything in between. :)


I'm not sure what you're trying to say here. Hebbian learning is the basis for current ANNs. Spiking neural nets are again an adaptation of neural nets. The entire field is inspired by nature and has been a never-ending quest to replicate it.

This paper is an incremental step along that path but commenters here are acting as if it's a polemic against neural nets.


It is a good thing, as I do not much admire the human brain. You learn things slowly...

"AI and Human learn differently."

Obviously. So can the scraping grifters who claim that AI 'learns just like a human' please shut up and never inflict their odious presence on the rest of humanity again? And also pay 10X damages for ruining the Internet.


Brain learns through pain. LLMs learn through expending energy.

Surprise factor zero.

Wait, my brain doesn't do backprop over a pile of linear algebra after having the internet rammed through it? No way that's crazy /s

tl;dr: paper proposes a principle called 'prospective configuration' to explain how the brain does credit assignment and learns, as opposed to backprop. Backprop can lead to 'catastrophic interference' where learning new things ablates old associations, which doesn't match observed biological processes. From what I can tell, prosp. config learns by solving what the activations should have been to explain the error, and then updates the weights in accordance, which apparently somehow avoids ablating old associations. They then show how prosp. config explains observed biological processes. Cool stuff, wish I could find the code. There's some supplemental notes:

https://static-content.springer.com/esm/art%3A10.1038%2Fs415...


This is like expressing surprise that a photon doesn't perform relativistic calculations on its mini chalkboard.

A simulation of a thing is not the thing itself, but it is illuminating.

> pile of linear algebra

The entirety of physics is -- as you say -- a 'pile of linear algebra' and 'backprop' (differential linear algebra...)


I don’t think “differential linear algebra” really counts as “backprop”.

> Backprop can lead to 'catastrophic interference' where learning new things ablates old associations, which doesn't match observed biological processes.

Most people find that if you move away from a topic and into a new one your knowledge of it starts to decay over time. 20+ years ago I had a job as a Perl and VB6 developer, I think most of my knowledge of those languages has been evacuated to make way for all the other technologies I've learned since (and 20 years of life experiences). Isn't that an example of "learning new things ablates old associations"?


Is it replaced, or does it decay without reinforcement?

How can we distinguish those two possibilities?

Stuff like childhood memories seems very deeply ingrained even if rarely or never reinforced. I can still remember the phone number of our house we moved out of in 1991, when I was 8 or 9. If I’m still alive in 30/40/50 years time, I expect I’ll still remember it then.



No shit, really?

Was a study really necessary for this?

Do "AI" fanbois really think LLMs work like a biological brain?

This only reinforces the old maxim: Artificial intelligence will never be a match for natural stupidity


> Do "AI" fanbois really think LLMs work like a biological brain?

If you read the article you'd know two things: (1) the article explicitly calls out Hopfield networks as being more bio-similar (Hopfield networks are intricately connected to attention layers) and (2) the overall architecture (the inference pass) of the networks studied here remains unmodified. Only the training mechanism changes.

As for a direct addressing of the claim... if the article is on point, then 'learning' has a much more encompassing physical manifestation than was previously thought. Really any system that self optimizes would be seen as bio-similar. In both mechanisms, there's a process to drive the system to 'convergence'. The issue is how fast that convergence is, not the end result.


I did not read the article - but I guess it all depends on the level of abstraction we are talking about. There is a very abstract level where you can say that AI learns like a biological brain and there is a level where you would say that a particular human brain learns in a different way than another particular human brain.

Claims that LLMs work like human brains were common at the start of this AI wave. There are still lots of fanboys who defend accusations of rampant copyright infringement with the claim that AI model training should be treated like human brain learning.

It only learns like a human when I use it to rip-off other people's work.

"does not learn like human" does not mean "does not learn".

It is alien to us, that doesn't mean it is harmless.



