Reward Is Unnecessary [pdf] (tilde.town)



The approach that a perfect world model equals AI originates, at least as I encountered it, with Marcus Hutter. Hutter proved that perfect compression (= prediction of what will happen next) is the ideal way to drive an agent in reinforcement learning [1] (the choice of action being the likeliest high-reward action).

So a perfect world model is enough to win at reinforcement learning. Can you show that if you maximized reward in some RL problem, it means you necessarily built a perfect world-model?

[1] A Theory of Universal Artificial Intelligence based on Algorithmic Complexity, 2000 - https://arxiv.org/pdf/cs/0004001.pdf Side-note: since building a perfect model is uncomputable, this is all a theoretical discussion. The paper also discusses time-bounded computation and has some interesting things to say about optimality in this case.
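
To make the idea concrete, here is a rough Python sketch of that construction in miniature: given a model that returns the true distribution over (observation, reward) pairs, a depth-limited expectimax over actions is the whole agent. The model interface and parameters are made up for illustration; the real construction mixes over all computable environments and is, as noted, uncomputable.

    # Toy sketch: "perfect prediction => optimal action choice".
    # `model` is a hypothetical object exposing `actions` and
    # `predict(history, action) -> {(observation, reward): probability}`.

    def expected_return(model, history, action, depth, gamma=0.95):
        """Expectimax value of `action`, assuming `model` gives the true
        distribution over (observation, reward) for any history + action."""
        if depth == 0:
            return 0.0
        total = 0.0
        for (obs, reward), p in model.predict(history, action).items():
            next_history = history + [(action, obs, reward)]
            best = max(expected_return(model, next_history, a, depth - 1, gamma)
                       for a in model.actions)
            total += p * (reward + gamma * best)
        return total

    def act(model, history, depth=3):
        # With a perfect model, this greedy lookahead is the entire "agent".
        return max(model.actions,
                   key=lambda a: expected_return(model, history, a, depth))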


> So a perfect world model is enough to win at reinforcement learning. Can you show that if you maximized reward in some RL problem, it means you necessarily built a perfect world-model?

No? Because maximizing reward for a specific problem may mean avoiding some states entirely, so your model has no need of understanding transitions out of those states, only how to avoid landing in them.

E.g. if you have a medical application about interventions to improve outcomes for patients with disease X, it's unnecessary to refine the part of your model which would predict how fast a patient would die after you administer a toxic dose of A followed by a toxic dose of B. Your model only needs to know that administering a toxic dose of A always leads to lower-value states than some other action.
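
As a toy illustration of that point (my own construction, nothing from the thread or the paper): tabular Q-learning in a tiny chain MDP ends up behaving optimally while observing almost nothing about the dynamics inside the region its policy avoids, and with exploration turned off it would observe nothing at all.

    import random
    from collections import defaultdict

    ACTIONS = ["safe", "toxic"]

    def step(state, action):
        # "toxic" enters a bad region with its own internal dynamics
        # (bad0 -> bad1 -> dead); "safe" walks a short chain to a +1 goal.
        if state.startswith("bad"):
            return ("dead", -1.0) if state == "bad1" else ("bad1", -1.0)
        if action == "toxic":
            return "bad0", -1.0
        return ("goal", 1.0) if state == "s2" else ("s" + str(int(state[1]) + 1), 0.0)

    Q = defaultdict(float)
    seen = defaultdict(int)   # a crude learned "world model": observed transitions

    for _ in range(20000):
        s = "s0"
        while s not in ("goal", "dead"):
            a = random.choice(ACTIONS) if random.random() < 0.05 \
                else max(ACTIONS, key=lambda b: Q[(s, b)])
            s2, r = step(s, a)
            seen[(s, a, s2)] += 1
            future = 0.0 if s2 in ("goal", "dead") else max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += 0.1 * (r + 0.9 * future - Q[(s, a)])
            s = s2

    inside_bad = sum(n for (s, a, s2), n in seen.items() if s.startswith("bad"))
    print(f"{inside_bad} of {sum(seen.values())} observed transitions were inside the avoided region")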

I think a "perfect" world model is required by a "universal" AI in the sense that the range of problems it can handle must be solved by optimal policies which together "cover" all state transitions (in some universe of states).


I'm not really into AI, but I love that this person is posting their blog post as a LaTeX-formatted PDF on their personal page on a tilde server.

For those who don't know, a tilde server is a community-operated server distributing shell accounts (via SSH) to its members, and sometimes other services. See tildeverse.org for a small federation of such operators.


I like people trying out things, but at the same time I can't help but be disappointed that PDF was chosen as the presentation format.

I like my text to fit my screen/window rather than an arbitrary piece of paper.


I personally also despise PDF as a medium. I'm just really happy somebody is daring to defy established norms because they feel like it.

In an era of ultra-conformity on the web I find it refreshing to see that some people still use HTTP as a means to share documents of their choice, not just single-page applications.


I assumed that the LaTeX-formatted PDF was meant to give the impression that it's a paper, since the document also follows the norms of how papers are written (abstract, the authorial "we", references, etc.).


I think intelligence is an ability to solve problems. There are many types of intelligence and there are many types of problems. For example, there is the hardcoded, primitive intelligence in calculators, which enables them to solve math problems, and there is the high intelligence of humans, an emergent, naturally occurring product of evolution, which enables them (us) to solve very complex problems of all sorts.


But to solve a problem any more complex than 2 + 2 you need to have a model of the world, however simplistic.

For example, to count how many bananas are in a box properly, without weird edge cases, you need to know what a banana is and what it looks like, and what a box is and what it looks like.


As I said when speaking of Alien Life [0], it all comes down to chemistry, biology, and evolution when talking about naturally occurring intelligence.

When speaking about AI, as far as I understand it should be human-like in order to be called artificial intelligence. If it merely mimics human intelligence and solves only trivial problems, then it is nothing more than the calculator with primitive intelligence that I mentioned before.

The thing you are talking about is evolution: you acquire a model of the world step by step over millions of years of evolution. But today computer scientists are trying to hardcode human-like intelligence into computers, which is extremely hard considering how long it took for humans to become highly intelligent. I think a better approach would be to write evolutionary algorithms, which could perhaps yield human-like intelligence.
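
For what it's worth, here is a minimal sketch of what "write evolutionary algorithms" means in the simplest case, in toy Python. The genome, fitness function, and parameters are placeholders; this illustrates the mechanism only, and is obviously nowhere near evolving intelligence.

    import random

    # Toy genetic algorithm: evolve bit strings toward a stand-in fitness
    # function. A real attempt would score the behaviour of an agent built
    # from the genome in some environment.
    GENOME_LEN, POP_SIZE, GENERATIONS, MUTATION_RATE = 32, 100, 200, 0.02

    def fitness(genome):
        return sum(genome)  # placeholder objective: count of 1-bits

    def mutate(genome):
        return [1 - g if random.random() < MUTATION_RATE else g for g in genome]

    def crossover(a, b):
        cut = random.randrange(1, GENOME_LEN)
        return a[:cut] + b[cut:]

    population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        population.sort(key=fitness, reverse=True)
        parents = population[: POP_SIZE // 4]          # truncation selection
        population = parents + [
            mutate(crossover(random.choice(parents), random.choice(parents)))
            for _ in range(POP_SIZE - len(parents))
        ]

    print("best fitness after evolution:", max(map(fitness, population)))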

There is some secret ingredient in human evolution which made us the most intelligent species on Earth; no other species is even remotely close to being as intelligent as we are. I think nobody really knows what that secret ingredient is.

[0] https://news.ycombinator.com/item?id=27574320


I certainly do not know what the secret ingredient is, but it is quite plausible that a number of extinct species (Neanderthals, Homo erectus, maybe Homo habilis) also had it to some degree.


We made it because we were, and are, more cooperative and more social; in other words, we gather together in order to survive and work together. But other species such as bees or ants are also very cooperative and social, yet they are not highly intelligent, although they do have collective intelligence.

I think the anatomy of humans played a crucial role in our evolution. Hands are powerful tools which enabled us to create and develop other tools and technologies.


The limitation of world models, in the context of achieving artificial general intelligence, is not the validity or granularity or faithfulness of the model, but rather the limitation argued by Hubert Dreyfus, that computers are not in the world.

Whilst physics can be modelled, and hence kinematics and dynamics can be modelled, intelligence, in the human sense, is different. Intelligence, for humans, is sociological, and driven by biology.

Computers cannot parse culture because culture is comprised of arbitrary, contradictory paradigms. Cultural values can only be resolved in an individual context as an integration of a lifetime of experiences.

Computers cannot do this because they cannot feel pleasure or pain, fear or optimism, joy or sorrow, opportunity, disinterest or attraction. They cannot grow older, give birth, or die. As a consequence, they lack the evaluative tools of emotion and experience that humans use to participate in culture.

But wait, you may protest. Computers don't need to feel emotions since these can be modelled. A computer can recognise a man pointing a gun at it demanding money, which is as good as the ability to feel fear, right?

A computer can recognise faces, so surely it's only a small step further to recognise beauty, which is enough to simulate the feeling of attraction, right?

A computer won't feel sorrow, but it can know that the death of a loved one or the loss of money are appropriate cues, so that is as good as the feeling of sorrow, right?

The limitation of this substitution of emotions with modelling is that the modelling, and remodelling has to take place externally to the computer. In biological organisms that are in the world, each experience yields an emotional response that is incorporated into the organism. The organism is the sum of its experiences, mediated by its biology.

Consider this question: In a room full of people, who should you talk to? What should you say to them? What should you not say?

A computer can only be programmed to operate in that environment with respect to some externally programmed objective. e.g. if the computer were programmed to maximise its chances of being offered a ride home from a party, it might select which person to talk to based on factors such as sobriety, and an observation of factors indicating who had driven to the party in their own vehicle.

But without the externally programmed objective, how is the computer, or AGI agent to navigate the questions?

Humans, of course, have those questions built into the fabric of their thoughts, which spring from their biological desires, and the answers come from their cumulative experiences in the world.


I agree with what you are saying. As an experiment to give meaning to computers, I tried building a body for them: http://www.jtoy.net/blog/grounded_language.html


I don't see why you can't just give your "AGIs" actual emotions. The argument you're making doesn't make sense for emotions as the outcomes of our neurobiology, only for emotions as a kind of immaterial spirit inexplicably housed in meat-shells that need no explanation.


> One of the leading critics was the philosopher Hubert Dreyfus, who argued that computers, who have no body, no childhood and no cultural practice, could not acquire intelligence at all.

https://www.nature.com/articles/s41599-020-0494-4


Well, we could give them all of those things, or at least pretty good analogues.


Yeah giving them emotions already requires giving them bodies and socialization. Again, assuming you can do the thing, I don't see the further problem.


There is in fact artificial general intelligence and emotions, desires, etc, in computer worlds:

1) multiplayer games with "bots"

2) all these things, on a lower level, serve to make groups of entities communicate and cooperate. Even in solitary animals like cats, these emotions serve to facilitate cooperation, to produce offspring or to share territory optimally. There is no problem with creating that artificially: just have multiplayer environments with multiple artificial entities cooperating, with resource constraints, "happiness", "pain" and "sorrow" (a toy sketch follows below).

It's going to take a while before we see these entities compose poetry when their mate dies, but it'll go in the same direction.
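
For concreteness, a toy sketch of such an environment in Python, with made-up numbers. The scalar "drives" here are placeholders for the mechanism being described, not a claim that they are emotions.

    import random

    class Agent:
        def __init__(self, name):
            self.name, self.energy, self.pain = name, 10.0, 0.0

        def act(self, food_available):
            # Crude drive-based policy: eat when energy is low, give back when high.
            if self.energy < 6.0 and food_available >= 1.0:
                return "eat"
            return "share" if self.energy > 8.0 else "rest"

    def simulate(steps=100):
        agents = [Agent("a"), Agent("b"), Agent("c")]
        food = 20.0                                        # shared, constrained resource
        for _ in range(steps):
            for agent in agents:
                choice = agent.act(food)
                if choice == "eat":
                    food -= 1.0
                    agent.energy += 4.0
                elif choice == "share":
                    agent.energy -= 1.0
                    food += 0.5                            # contribute to the shared pool
                agent.energy -= 0.5                        # metabolic cost per step
                agent.pain = max(0.0, 3.0 - agent.energy)  # "pain" rises as energy runs out
            food += random.random()                        # slow regrowth
        return [(a.name, round(a.energy, 1), round(a.pain, 1)) for a in agents]

    print(simulate())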



> Computers cannot parse culture because culture is comprised of arbitrary, contradictory paradigms. Cultural values can only be resolved in an individual context as an integration of a lifetime of experiences.

Eh, GPT-3 is pretty good at imitating (some parts of) culture already.

> A computer can only be programmed to operate in that environment with respect to some externally programmed objective. e.g. if the computer were programmed to maximise its chances of being offered a ride home from a party, it might select which person to talk to based on factors such as sobriety, and an observation of factors indicating who had driven to the party in their own vehicle.

Have you ever tried debugging a program? Programs can do stuff that's really hard to predict, even if you wrote them specifically to be easy to predict (i.e. understandable).


This is a really interesting definition of intelligence as building a model of the world:

Solving intelligence is a highly complex problem, in part because it is nearly impossible to get any significant number of people to agree about what intelligence actually means. We eliminate this dilemma by choosing to ignore any kind of consensus, instead defining it as “the ability to predict unknown information given known information”.

To put it more simply, we define intelligence as a model of the world.


Uh, came here to make essentially the same remark, but as criticism. A mere world model, however perfect, is a really hollow definition of intelligence IMHO. It's the definition of a tool, at best. It's only with the introduction of goals that we get to things like taking action, planning, etc., which bring the whole thing to life.

So while I'd agree that a world model is necessary, I seriously doubt it's sufficient for anything that I'd call intelligence.


Well, among tools conditional probability is kind of the one ring to rule them all. Just give me some samples from P(plan | goal) and “things like taking action, planning, etc” are trivial (or really, part of the sampling process).
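
A rough sketch of what "samples from P(plan | goal)" could look like, via rejection sampling from a generative world model: roll plans forward and keep the ones that reach the goal. The model object and its methods are stand-ins for illustration, not any real API.

    def sample_plan(model, start_state, goal, horizon=20, tries=1000):
        """Return one plan sampled (approximately) from P(plan | goal reached).

        `model.sample_action(state)` draws from a prior over behaviour and
        `model.sample_next_state(state, action)` draws from the learned dynamics;
        conditioning on the goal is done by rejection."""
        for _ in range(tries):
            state, plan = start_state, []
            for _ in range(horizon):
                action = model.sample_action(state)
                state = model.sample_next_state(state, action)
                plan.append(action)
                if goal(state):
                    return plan
        return None  # no accepted sample; a real planner would be far cleverer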


You're trivialising planning given a model


Nope. In this definition of intelligence planning is part of the model. The model includes a probability distribution over all possible plans, just like GPT-3 includes a probability distribution over all possible news articles.


> planning is part of the model

Even if it was (which I doubt, except implicitly, which is not the same thing), there can be no planning without a goal.

It's a kind of mirrored Chinese Room fallacy: In that case, the complaint is that the performance of the system cannot be ascribed to any distinct part of the whole, concluding that the whole cannot perform. In this case, the performance of the system is falsely ascribed to one distinct part, ignoring the contribution of the other.


I think the world model is one step towards intelligence or one half of it. I've come to believe that the ability to _change_ your world model as new information comes into play is the other half.


Not sure what you mean by goals, because to a degree you don't need dynamic goals (e.g. goals that change throughout the lifetime of a system) for reactive behavior.


It may be necessary but not sufficient; that's somewhere to start, at least.


How will you train your world model? Cross-entropy loss? Oh, it's reward again.
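
To make the quip concrete (a toy illustration, nobody's actual training code): fitting a world model by maximum likelihood still optimizes a scalar signal, and the negative log-loss below plays the same structural role as a per-step reward being maximized.

    import math

    def cross_entropy(predicted_probs, observed_index):
        # Per-example cross-entropy: negative log probability of what actually happened.
        return -math.log(predicted_probs[observed_index])

    predicted = [0.7, 0.2, 0.1]          # model's distribution over the next observation
    loss = cross_entropy(predicted, 0)   # index 0 is the observation that occurred
    print("loss:", round(loss, 3), "  'reward' being maximized:", round(-loss, 3))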



