Prefrontal cortex as a meta-reinforcement learning system (deepmind.com)
243 points by godelmachine on May 14, 2018 | 29 comments



Very interesting.

Oversimplifying and ignoring a lot of important details, the key idea proposed by the authors is that the brain's phasic dopamine system is a model-free reinforcement-learning system that learns to train the prefrontal cortex as a more efficient model-based reinforcement-learning system -- a form of meta-learning which the authors accurately refer to as meta-reinforcement learning.

The "Results" section provides compelling evidence that the authors might be on to something. The authors show and discuss the outcomes of six different kinds of computer experiments in which a (relatively simple) meta-reinforcement learning software system is shown to learn and behave in ways qualitatively similar to, for example, monkeys and rats in equivalent lab experiments.

I'm still digesting the implications.

Highly recommended reading.


One implication for products, if this work pans out, is that the systems which "adapt to you" (learn your voice, your schedule, what your face looks like) won't have to change the weights of their network. I believe this could lead to much better systems because adjusting the weights of an already-deployed network is dangerous - you could make the system perform poorly. It's also simpler for engineers because you can scrap all that logic and just deploy one RL network.
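To make the "no weight changes" idea concrete, here is a toy sketch in Python. It is not the paper's architecture (which uses a trained recurrent network); it only assumes a hypothetical agent whose parameters (`FROZEN`) are fixed at deployment and whose per-user adaptation lives entirely in a small running state. All names here are made up for illustration.

```python
import math
import random

# Hypothetical frozen "weights": fixed at deployment, never updated afterwards.
FROZEN = {"decay": 0.9, "gain": 3.0}


def step_state(state, action, reward, params):
    """Fixed update rule: only the state changes, never the params."""
    new_state = [params["decay"] * s for s in state]
    # Map reward in {0, 1} to a signed signal in {-1, +1} for the chosen arm.
    new_state[action] += (1.0 - params["decay"]) * (2.0 * reward - 1.0)
    return new_state


def choose(state, params, rng):
    """Softmax policy over the current state."""
    logits = [params["gain"] * s for s in state]
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    return rng.choices(range(len(state)), weights=weights)[0]


def run_episode(arm_probs, steps, seed, params=FROZEN):
    """Play one two-armed bandit task, starting from a fresh state."""
    rng = random.Random(seed)
    state = [0.0] * len(arm_probs)
    actions = []
    for _ in range(steps):
        action = choose(state, params, rng)
        reward = 1.0 if rng.random() < arm_probs[action] else 0.0
        state = step_state(state, action, reward, params)
        actions.append(action)
    return actions


if __name__ == "__main__":
    # Two "users" with opposite preferences, served by the same frozen params.
    for probs in ([0.9, 0.1], [0.1, 0.9]):
        late = run_episode(probs, steps=200, seed=0)[100:]
        print(probs, "late picks of arm 0:", late.count(0))
```

The same frozen parameters are reused for both tasks; the only thing that differs per "user" is the state, which is thrown away and rebuilt each episode. In the paper's setup the update rule itself would be learned by the slower outer loop rather than written by hand as it is here.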


Sorry my head just exploded, but are you suggesting we have personal "prefrontal cortexes" used to customise general neural networks?


To simplify even further, would it be right to say that dopamine is not just a carrot being used to adjust weights but that the entire dopamine system actively learns how/when/why to use the carrot to effectively train?

if so, some interesting real world parallels i can see:

1/ this is why good teachers early on have such a profound effect. Teachers act as a dopamine system so good teachers teach meta learning via whatever they teach

2/ teaching the meta directly is difficult as the meta of the meta is too abstract. Which is why grammar/math may tend to feel out of touch. They are already the meta.

3/ drugs can mess up the dopamine system and throw the feedback loop out of whack. Even if you have the best resources, the “teacher” is now inept


Although the behavior of the monkeys and rats was simply selecting the correct image to receive a reward, so not an especially complicated task.


Well, it took 50+ years to solve this simple problem somewhat reliably.


No reason to oversimplify and ignore a lot of important details — they're doing that for you!


The original link was to the actual paper[a], not the blog post!

A moderator, dang, changed the link after I posted my comment; see his comment [b] elsewhere in this thread.

Had the original link been to the blog post, I agree, there would have been no need to simplify.

[a] https://www.nature.com/articles/s41593-018-0147-8

[b] https://news.ycombinator.com/item?id=17068449


I'm quite happy with the blog post. I don't have enough background to understand the paper.


Somewhat interesting: Bananas contain the ingredients necessary to produce dopamine.

I can see how this all plays out as a way for some apes to get some bananas. Even social structures, fairness and other things (I love this video - two monkeys getting unequal pay https://www.youtube.com/watch?v=meiU6TxysCg ) can be explained by a dopamine system that just likes to give us bananas and sex. And it's incentivized to conserve energy (also called being lazy) because this is better for survival. What's interesting is that humans have some tendency to do things that don't necessarily conserve energy. Maybe this is what makes us successful: sometimes we invest excessive energy in trying new things, which happens to let us survive better through innovation (which leads to genes that encode this behavior).

I like it, although it isn't necessarily what I call my meaning of life (I'm a dualist, because materialism is too damn dry). I like bananas, though (and yes, the role of bananas may be exaggerated here).

Thought long about it. From a materialistic standpoint there are only quantitative differences between an organism which has a sensor, some type of memory and an actuator, and human beings, although they sure look different. Both still follow the same principles.

When I further simplify, we are all just energy (and matter, which is just a form of it) following some first principles hallucinating our consciousnesses trying to evade the eventual entropy that we'll reach nevertheless because this is how the universe works.

- - -

A cool fact is that animals are capable of rational thinking (crows drop nuts to break them from approx. the minimal height necessary to achieve that - optimal energy usage. I'm pretty sure they don't even realize that their brains calculate this based on their experience), and I'm sure that all people act rationally w.r.t. their training data (some outliers like traumatic events and other life circumstances that differ from the average just change some weights in a way that looks irrational to outsiders). This is indeed difficult to defend because it depends on the semantics of the word "irrational", which is man-made after all.

Interesting resources for this topic:

a theory trying to explain human behavior and the emergence of consciousness using knowledge of psychology: https://unifiedtheoryofpsychology.files.wordpress.com/2011/1... (principles how the brain works)

https://unifiedtheoryofpsychology.files.wordpress.com/2011/1... (why culture and religion emerges)

https://www.youtube.com/watch?v=lyu7v7nWzfo (consciousness as a hallucination)

https://www.youtube.com/watch?v=HRVGA9zxXzk (a bird which can identify itself in a mirror, simple self-awareness - I like the fact that the brains of birds are more efficient because they are space-constrained to be able to fly better)

(the brain as a neural net with meta-learning capabilities that tries to guess what happens next and what it should do next. Emotions and some pre-wiring based on genetics mean that we don't start with a completely random brain structure, because being able to see and feel as soon as we get out of our mothers gives us a better chance of surviving).


anything that contains the amino acids tyrosine or phenylalanine has the ingredients for dopamine


> and yes, the role of bananas may be exaggerated here

Tried to be funny. It seems I've failed. It's not about that banana.

Point was that our social structures can be explained by some first principles e.g. individuals try to acquire enough resources (what a trivial assertion, but I guess that is the point of first principles) and that we can explain the origin of our social structures with them.

Fun fact: It is not very common for monkeys to eat bananas in the wild - they simply don't have them [1].

[1] http://www.businessinsider.com/wild-monkeys-do-not-eat-banan...

Totally OT: wild birds cooperate with humans to get some honey. I love such articles. https://www.npr.org/sections/thesalt/2016/07/21/486471339/ho...



Paper is behind a paywall. Sci-Hub doesn't have it. Does anyone have a link to the paper PDF?




So here's something interesting to note. The "buy this paper" price for this article is almost reasonable in and of itself. $22.00. In fact, for somebody who works in the field or even a tangentially connected field, they could probably justify spending the $22.00 for it.

But... hilariously, a one-year subscription to the journal is only $59.00 (for individuals, online only).

https://www.nature.com/neuro/subscribe

Something seems a bit out of whack when the price of one article is about 1/3rd of the price of the journal for an entire year.

I'm almost tempted to subscribe. The problem with this model though, is that it doesn't scale. Even at "only" $59.00, how many journals can one afford to subscribe to before the aggregate cost breaks the bank? sigh


It scales because nobody subscribes individually. If you can't access a journal, you get your department head to forward things up the chain so your institution can just add another line item to the millions of dollars they spend annually on journal subscriptions for every researcher and student.

AFAIK almost all of that money goes directly to Elsevier [1] who, well, they pretty much run a monopoly.

[1] https://en.wikipedia.org/wiki/Elsevier


What I mean is, for those people (like me) who don't have a "department head" (or a university at all) to take care of journal subscriptions, and who might be willing to subscribe individually - it doesn't scale beyond a journal or two.

At most, in years past, I subscribed to maybe a total of 10 journals, but that's when I was maintaining an IEEE membership and a lot of their journals are actually not very expensive, in terms of the marginal cost once you are already a dues paying member.

These days though, I find it hard to justify. I find myself wondering if the publishers would do a better job of providing a model that works for individuals, if they might not actually make more money in the end.


That is so darn insightful! So check it out, if there were a way to pay $20/mo for access to ~5 publications/mo from a list of hundreds/thousands, that would be fantastic! Netflix for research. I'd pay for it.

It wouldn't be terribly challenging to build. The tricks are in making the deals with the content owners, and getting the word out to users.

The total market is 137 Billion/year ( https://www.ibisworld.com/industry-trends/market-research-re... ) Not bad.

Looks like half the market is the government. Might be enough there to build a business on. Hard to say. Requires more research.


The trick is that you're making a deal with Elsevier who is either going to kill you or buy you, making this a non-starter for most VCs.

It's tremendously difficult to make money in a space this entrenched. If you go a layer down and try to cut out Elsevier (go to researchers directly) you're asking scientists to basically ignore career-defining opportunities in high-impact publications... so that you can make money and a small market of hobbyists or professionals without a research budget can subscribe to journals. Also, the amount of work that needs to be done around peer-review is obscene and Elsevier (and others) have built a ridiculous moat around related volunteer work that is hugely inefficient but constantly socially reinforced (reviewing can be a status symbol). At that point, the alternative is open access and free for everybody: why not publish there?

I'd be of the opinion that the only way to reasonably attack this space is to build an adjacent content management and distribution platform not focused on journals and edge your way in, like, say, classroom management or MOOCs.

Which, well, there's Top Hat [1] who raised at a $185M valuation last year [2]. Not sure if they care about journals all that much yet though, not a whole lot of insight into their business beyond the pop business news.

[1] https://tophat.com/

[2] https://www.bloomberg.com/news/articles/2017-02-15/top-hat-r...


Wow, you have a lot of knowledge in this area. Thanks! I wonder if, because Elsevier is likely the bigger company, a good play would be to build the business in partnership with them and then let them buy it. That way they don't have to take on the risk of developing it in-house, but can still reap the rewards.


With continued pressure from SciHub, I'm thinking this might not be a bad bet to make. Elsevier et al. might just be willing to jump at the opportunity, in order to save some of their business.

That said, I still think it's time to double down on supporting SciHub. Both scientists and the interested public like it.


It's priced that way precisely because they want you to go "uh, but chances are I'll want to read another paper or two so I might as well get a subscription", even though you probably won't find two more papers that aren't available elsewhere that you want to read enough to pay $22 for.


This is very likely not an accident but by design. For a good popsci read on similar topics I can recommend "Predictably Irrational, Revised: The Hidden Forces That Shape Our Decisions".



A more accessible discussion by the paper's authors is here at DeepMind:

https://deepmind.com/blog/prefrontal-cortex-meta-reinforceme...


We've changed to that from https://www.nature.com/articles/s41593-018-0147-8. Thanks!


Thanks!



