Hacker News
To create a super-intelligent machine, start with an equation (theconversation.com)
120 points by kr4 on Nov 28, 2013 | 84 comments

It's interesting that, while the system can learn lots of different games, you still have to give it a reward function specific to each game. This may seem obvious, and not much of a limitation; after all, lots of things that we think of as intelligent to varying degrees (different human beings, as well as members of other species) have wildly different ideas of what constitutes "reward", so that can't be an inherent part of the definition of intelligence.

But you don't have to explicitly tell a human what the reward function for Pac-Man is. Show a human the game and they'll figure it out. Which makes me wonder if, while there is some room for variability in reward functions, there might be some basic underlying reward computation that is inherent in intelligence. I can't find the link just now, but I read an article a few months ago (might've even been here on HN) about a system that demonstrated the appearance of intelligent behavior by trying to minimize entropy and maximize its possible choices for as long as possible within a given world model.
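
The "maximize its possible choices" idea can be sketched in a few lines of Python. This is only a toy illustration, not the system from that article: the grid, the walls, the horizon and the names `neighbours`, `reachable` and `best_move` are all made up for the example. The agent scores each legal move by how many cells it could still reach within a fixed number of steps, and picks the move that keeps the most options open.

```python
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def neighbours(cell, walls, size):
    """Legal cells one step away from `cell` on a size x size grid."""
    x, y = cell
    for dx, dy in MOVES:
        nxt = (x + dx, y + dy)
        if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in walls:
            yield nxt

def reachable(start, walls, size, horizon):
    """All cells reachable from `start` in at most `horizon` steps."""
    seen, frontier = {start}, {start}
    for _ in range(horizon):
        frontier = {n for c in frontier for n in neighbours(c, walls, size)} - seen
        seen |= frontier
    return seen

def best_move(pos, walls, size, horizon):
    """Pick the neighbouring cell that leaves the most future options open."""
    return max(neighbours(pos, walls, size),
               key=lambda cell: len(reachable(cell, walls, size, horizon)))

walls = {(1, 0), (1, 1)}  # hypothetical obstacles
choice = best_move((0, 1), walls, size=5, horizon=2)
print(choice)  # the move away from the walled-in corner
```

Note that no game-specific reward appears anywhere; the only "goal" is to avoid getting boxed in.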

Pac-Man was designed as a game for humans, with a priori knowledge of what kinds of things humans find rewarding. Thus the goal is obvious because it was designed to be similar to other human goals. Eat the food, don't get eaten. For this reason, it's not at all special that humans can determine the goal of the game.

Yeah. Try sticking a more abstract game like Go in front of a random person and see how that works out. Without being taught the rules, a human will have absolutely no idea how to proceed. This would put a human in pretty much the same boat as a computer.

A friend of mine had a meta-game he'd play with his step father. His step father would buy a new game, but not tell him the rules. They'd play this game until he figured it out and consistently trounced his step dad. Then his step dad would buy a new game.

Wow, that's a great idea. Sounds like loads of fun.

Secure the largest amount of territory and capture enemy groups? Seems pretty human :p

Not to Edward Lasker: "The rules of go are so elegant, organic and rigorously logical that if intelligent life forms exist elsewhere in the universe they almost certainly play go."

You got all that from looking at a 19x19 grid?

I guess that the reward systems that humanity has evolved are complicated and numerous. We've got the basics (food, shelter), the more complicated basics (sex with a suitable mate, companionship) and the million other factors - curiosity, intellectual challenge, positive and negative feedback, power, agency etc, etc....

My thoughts are that if they were to take such a direction with this AI, they'd give it the basics and let it evolve and learn its own complicated reward structure. When you're trying to get a monkey to play pac-man, you bribe him with a capful of ribena - he doesn't care about fun intellectual challenges, but sweet liquids motivate the hell out of him.

(This is the state of actual monkey research - ribena is monkey crack)

To start on this you would want a system that got rewarded based on how well it was able to predict aspects of its environment. This would have to go in hand with preferring stimulation, so you would need something like preference for inputs which maximize relative entropy with respect to its thus far learned model.
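
A rough sketch of what such a "stimulation" signal might look like, assuming discrete observations and using KL divergence as the relative entropy. The model, the candidate inputs and the name `kl_divergence` are hypothetical, chosen only to make the idea concrete.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits for two discrete distributions over the same outcomes."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical learned model: the agent's current belief about a binary sensor.
model = [0.5, 0.5]

# Hypothetical observed distributions from two candidate inputs.
candidates = {
    "familiar": [0.5, 0.5],  # matches the model, nothing new to learn
    "novel": [0.9, 0.1],     # disagrees with the model, so more "stimulating"
}

# Intrinsic reward: how far the observed distribution is from the learned model.
curiosity = {name: kl_divergence(obs, model) for name, obs in candidates.items()}
preferred = max(curiosity, key=curiosity.get)
print(preferred, round(curiosity[preferred], 3))
```

Once the model is updated to match the novel input, that input's score drops toward zero, so the agent would have to keep looking for things it cannot yet predict.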

That article on entropy minimization also claimed that a single equation could be the basis of a wide range of intelligent behaviours.


For tasks that do not reward us biologically (i.e. eating, sleeping), we ultimately depend on other people to give us reward, be that money, acceptance, praise, or whatever.

Yeah, this strikes me as similar to - and basically no more intelligent than - Eurisko. Such is the progress we have made on AI in 40 years.

No. AIXI is unique, in the technical sense of that word. It's similar to Eurisko in that there are goals and things. If you get any deeper, the similarity ends.

> AIXI now uses this model for approximately predicting the future and bases its decisions on these tentative forecasts. AIXI contemplates possible future behaviour: “If I do this action, followed by that action, etc, this or that will (un)likely happen, which could be good or bad. And if I do this other action sequence, it may be better or worse.”

What I want to know is: how long does it spend doing this? Does the duration of one of these cycles have to be manually chosen, or is it smart enough/self-aware enough to realise, okay, if I spend forever thinking about something, then nothing will happen?

I guess what I'm asking is, is the algorithm smart enough to eventually internalise its own model of itself?

That's a really good question because a sense of self is integral to our definition of consciousness. Presumably this machine is able to understand math, so it should be able to comprehend its own formula or source code.

> "is the algorithm smart enough to eventually internalise its own model of itself?"

As far as I understood, it's just random search for models ("For the planning component we use standard Monte Carlo"), and some kind of minimum description length (MDL) based method for model selection ("For the learning component we use standard file compression algorithms").

So the probability that it would, by Monte Carlo search, find a model that includes some kind of description for itself, I assume to be astronomically tiny.

(...And even if it did, it would have no way to know that it has stumbled upon a model of itself. Except the experimental performance of the model, which could or could not be better than some other models and variations.)
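
To make the "file compression as the learning component" part concrete, here is a minimal sketch of using an off-the-shelf compressor as a predictor. It is not the authors' implementation: `zlib` is just a stand-in compressor, and `compressed_len` and `predict_next` are hypothetical helpers. The idea is that the continuation which compresses best is the one most consistent with the regularities seen so far.

```python
import zlib

def compressed_len(data: bytes) -> int:
    """Size of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))

def predict_next(history: bytes, alphabet: bytes) -> int:
    """Pick the symbol whose addition makes the history compress best."""
    return min(alphabet, key=lambda s: compressed_len(history + bytes([s])))

history = b"ab" * 16  # a strictly alternating sequence that ends in 'b'
guess = predict_next(history, b"ab")
print(chr(guess))
```

A general-purpose compressor is a very blunt model (single-symbol differences are often only a byte), which gives some feel for why a description of the agent itself is unlikely to fall out of this kind of search.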

Interesting question.

I'm a developer for the linked site - if you're interested in posting the question in the comments I'll ping the author to see if he can answer.

Blogspam. Original source: http://theconversation.com/to-create-a-super-intelligent-mac...

(If you happen to have encountered Marcus Hutter's AIXI before, there is nothing new there compared with any other presentation of his ideas.)

Looks like the mods have changed the URL to the original source.

Am I the only one that gets a bit of a chill watching a UFAI play a game where it runs from its pursuers collecting resources until it gains the power to kill them?

To be honest, I don't know what kind of Pac-Man implementation that was. In the original game, each ghost has a personality [1]; the ones in the video seem to just wander aimlessly, so it doesn't really tell us much about the strength of the algorithm, because it's not the same game.

So, to comment on your dystopian prediction, I'm not sure this algorithm would do any better against humans, unless all we do is wander aimlessly too ;)

[1] http://www.webpacman.com/ghosts.html

More detailed information on how the ghosts behave: http://home.comcast.net/~jpittman2/pacman/pacmandossier.html...

Probably because they implemented the game themselves and took some shortcuts. But it still seems like pretty much the same game.

I doubt it would do better than humans because it's not very efficient, but a better AI running on the same principles could be smarter than even humans and outperform us outside the virtual world.

True, but it doesn't kill them very efficiently.

If your scale of Pac-man competence stretches from "my grandma" to "arcade addict", then the AI looks incompetent. But if your scale stretches from "rock" to "human", it looks worryingly competent.

Oh, it's not good at killing, yet. That's reassuring.

I highly recommend Marcus Hutter's book: "Universal Artificial Intelligence: Sequential Decisions Based On Algorithmic Probability" (http://www.amazon.com/Universal-Artificial-Intelligence-Algo... )

Beware, you need some serious maths and a board to follow him. In the end the book is quite rewarding.

Now to AIXI: I don't believe that having a formula solves the AI problem. Yes, you can model it, but in a real environment, I believe there are simpler models to try first. Remember: nature likes simplicity first, complexity when needed.

>This scientific field is called universal artificial intelligence

How many times do we need to re-define the same concept? Artificial Intelligence, Human Level Artificial Intelligence, Artificial General Intelligence, Strong Artificial Intelligence etc....

Let's pick one as a community and stick with it. I thought AGI was in the lead there recently, what with the conference, journal and high number of web searches, but apparently that isn't enough.

Beyond that, I would like to see how they make their algorithm narrow for the narrow application of the Pac-Man game while keeping it generalizable. My guess is that they don't, and this algo ends up being a "starting point" for narrow AI applications to rest on. In that case it's fine and interesting, but it doesn't pass the test of integrating specificity within a general learning model.

Because they are not the same concept. Universal Artificial Intelligence is not the same as Human Level Artificial Intelligence.

There is no reason to assume that human intelligence is universal. Our intelligence is biased towards picking berries and avoiding bears in a three-dimensional world.

> How many times do we need to re-define the same concept? Artificial Intelligence, Human Level Artificial Intelligence, Artificial General Intelligence, Strong Artificial Intelligence etc....

Every time the name gets co-opted to mean something else. These days "artificial intelligence" == "machine learning and natural language processing" which is most definitely not what TFA is about.

No. There is a perfectly good term which has been around for a very long time: Strong AI. In no context does this mean "ML + NLP" (which, by the way, AI itself hardly means). I think terms like AGI are just rebranding.

I don't think strong AI has quite the history you think it does. It's one of these later coinages. Also, it has separate quasi-related meanings in philosophy of the mind...

Strong AI is one of those hand-wavey things that makes you sound like you're talking about something well defined, when you actually aren't. Has always struck me as counter-productive.

Frustrating stuff.

I like computational intelligence since we still don’t really know what intelligence is and whether our kind of intelligence can be reproduced artificially (although there are some hints that it can be done).

Ok there's no conclusive proof yet as no one has been able to do it yet, but as far as I know no one actually debates this anymore. It would be incredibly surprising, to say the least, if the brain turned out to work completely outside the known laws of physics.

We don't know whether physical reactions in our brains allow for and depend on calculation using real numbers, which can only be approximated using our current computing technology.

> How many times do we need to re-define the same concept?

I sometimes think everybody's still afraid there'll be another AI Winter (and/or that the first one hasn't ended yet), and are therefore anxious to come up with a new name for what they're doing every so often as sort of a pre-emptive dodge.

I have been following this for a while and have taken the time to understand it in a little detail. While it does not provide the 'answers', it does frame the question of 'what is an intelligent machine?' in a very precise manner. It's interesting work, what it needs is for someone to now work out how to build much better models and plug them into the framework provided by AIXI.

> Let's pick one as a community and stick with it.

And when our assumptions turn out to be wrong, should we continue working on it even though we know it can never work? The problem is that every new term is forged with assumptions, because the very idea of what intelligence is remains the subject of ongoing debate. (I don't even like the one presented here, even though it's pretty good as far as these things go.)

I sympathize though, the nomenclature is polluted with the corpses of would-be giants and makes it hard to talk about what we can all easily recognize as the same broad-strokes effort.

Like others have said, this seems to be a generalization of the problem of reinforcement learning. For a good introduction to the subject, check out Reinforcement Learning by Sutton & Barto [1]. After reading the first few chapters, you'll be able to understand most of that equation.

[1] http://webdocs.cs.ualberta.ca/~sutton/book/the-book.html

Yeah just looks like Q-Learning to me :/

You just use the rewards to optimize the function that tells you the predicted reward at any stage for a given action. And then take the best actions.
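
For anyone who hasn't seen it, a minimal tabular Q-learning sketch in Python. The corridor environment, the hyperparameters and the function name `q_learning` are hypothetical, picked only so the example runs in a fraction of a second; this is plain Q-learning, not AIXI or its approximations.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor: reward 1 for reaching the right end."""
    rng = random.Random(seed)
    actions = (-1, +1)  # step left or step right
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore
            else:
                # Exploit the best-known action, breaking ties at random.
                a = max(actions, key=lambda act: (q[(s, act)], rng.random()))
            s_next = min(max(s + a, 0), n_states - 1)
            reward = 1.0 if s_next == n_states - 1 else 0.0
            best_next = max(q[(s_next, act)] for act in actions)
            # Move the estimate toward reward + discounted best future value.
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
            s = s_next
    return q

q = q_learning()
policy = [max((-1, +1), key=lambda a: q[(s, a)]) for s in range(4)]
print(policy)  # greedy action in each non-terminal state
```

The difference from AIXI is in what gets learned: here it is a table of action values for one fixed environment, whereas AIXI keeps a weighted mixture over all computable environment models and plans against that.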

Yep, Hutter's website provides this book as reading material and to build up to the whole idea.

For those having trouble getting past the hype to the actual content of AIXI, lesswrong has a nice article [1]. From the article:

"The AIXI formalism says roughly to consider all possible computable models of the environment, Bayes-update them on past experiences, and use the resulting updated predictions to model the expected sensory reward of all possible strategies."

I will add that the prior over models comes from a measure of model complexity, which is a reasonable way to construct a prior.
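
A toy version of that recipe, to show how the complexity prior and the Bayes update fit together. Everything here is an assumption for illustration: three hand-written deterministic predictors stand in for "all computable models", and the `length` field stands in for program length.

```python
from fractions import Fraction

# Hypothetical hypothesis class: each model predicts the next bit from the history.
models = {
    "always_0": {"length": 2, "predict": lambda hist: 0},
    "always_1": {"length": 2, "predict": lambda hist: 1},
    "alternate": {"length": 4, "predict": lambda hist: 1 - hist[-1] if hist else 0},
}

# Complexity prior: weight 2^-length, so shorter descriptions start out heavier.
weights = {name: Fraction(1, 2 ** m["length"]) for name, m in models.items()}

observations = [0, 1, 0, 1]
history = []
for bit in observations:
    for name, m in models.items():
        # Deterministic models: likelihood is 1 if the prediction matched, else 0.
        if m["predict"](history) != bit:
            weights[name] = Fraction(0)
    history.append(bit)

total = sum(weights.values())
posterior = {name: w / total for name, w in weights.items()}
print(posterior)
```

The two shorter models start with more weight but are eliminated by the data, leaving all the posterior mass on the longer model that actually fits.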

While I think the idea of connecting deep mathematical formalisms with actual AI goals is mostly hype, these formalisms are still interesting in themselves. There is something very intuitive about how AIXI works, but that doesn't mean it is going to be practical for a large range of problems.

[1] http://wiki.lesswrong.com/wiki/AIXI

I don't recommend reading anything on that website. That way madness lies.

I don't read lesswrong, but this article is worthy on its own merits. No memetic hazards are lurking in that particular article :-)

These are good generalized life lessons for humans as well.

Seek new knowledge, learn well, reason logically and inductively, recognize patterns and generalize, contemplate and plan for the future, converse and be social, survive, and strive for optimum health (my addition).

Very good advice whether or not you're an automaton.

Those words are pretty much meaningless. Of course everyone is already trying to survive and recognize patterns and all that.

1. Do we have to engineer how much it's being rewarded in each game?

2. What happens in the infinite case? With pacman, there are always a finite number of choices at each state.

3. If it doesn't have to be told the rules, then how does it make decisions? In theory, it may be able to learn how to play Jeopardy, but it may be way too inefficient in practice. Humans don't even start with a blank slate.

4. I don't think I ever consciously apply philosophical principles like ockham's razor when I problem solve or learn. It makes me a little uncomfortable that we're starting with a philosophy, rather than having the system discover things itself. I would be ok with it if there was some parallel between ockham's razor and physics (not the methodology of science).

> I don't think I ever consciously apply philosophical principles like ockham's razor when I problem solve or learn. It makes me a little uncomfortable that we're starting with a philosophy, rather than having the system discover things itself. I would be ok with it if there was some parallel between ockham's razor and physics.

Why do you think it is relevant what you do consciously? The only things that you do consciously are those things which your brain is ill-equipped to do. The vast majority of your thinking processes are subconscious, as are the principles that drive your conscious thinking. And I guarantee you, Ockham's razor is in there whether you realize it or not. When things get complicated do you purposefully look for a simpler solution? When trying to understand an unknown situation, do you start with something simple and add complexity as needed? Ockham's razor.

> I would be ok with it if there was some parallel between ockham's razor and physics.

... there's not?

EDIT: As an AI researcher, I'd be more interested in creating an artificial scientist than "artificial science." So what makes the scientist work? Ockham's razor is at the foundation of that.

About the countably infinite state space: with about 10^56 states, Pac-Man still remains a very challenging domain! Just to put things into perspective, my colleague recently ran an experiment that would exhaust 8 GB of RAM, but he could reduce this to 4 GB using random projections for dimensionality reduction.

Also, don't forget that this is a partially observable domain (with only local sensory information) with no a priori knowledge of the overall goal to achieve. It learns completely from scratch! If we find this task easy as human beings, it's because we have a lot of prior knowledge that we can transfer into this task.

Ockham's razor here isn't a "philosophical principle", it's math.

AIXI is doing a shortest-first search for predictors that match the observed environment, except that it's searching them all simultaneously in infinite parallel (that's why it's not computable) and what gets moved around is probability weighting. Ockham's razor describes the starting state of the weightings when there hasn't yet been any evidence: shorter predictors are given more weight.

Regarding 4:

A disciple of another sect once came to Drescher as he was eating his morning meal.

“I would like to give you this personality test”, said the outsider, “because I want you to be happy.”

Drescher took the paper that was offered him and put it into the toaster, saying: “I wish the toaster to be happy, too.”


Yes, this is all good, but who determines whether something is "good" or "bad"? That's the interesting part. Who sets the goals? And how do they score an intermediate situation on the way to achieving them?

Don't get me wrong, a machine that can achieve goals is really, really useful. After all, chess-playing programs use the alpha-beta algorithm to prune future positions intelligently, but to score the positions they still need human input. It may be, however, that they infer their OWN rules from past positions, with absolutely no human input. Then it becomes interesting. Still, the initial rules of the chess game have to be set down.
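
To show where the human input enters, here is a small alpha-beta sketch in Python. The nested-list tree and its leaf scores are hypothetical; in a real chess program the leaves would come from a hand-tuned evaluation function, which is exactly the human-supplied part.

```python
import math

def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax with alpha-beta pruning over a nested-list game tree.

    Leaves are numbers (the evaluation of a position); inner lists are
    positions whose children are the available moves.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record which leaves were actually scored
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # the opponent would never allow this line, so stop searching it
    return best

tree = [[3, 5], [2, 9], [0, 1]]  # hypothetical leaf scores
seen = []
value = alphabeta(tree, visited=seen)
print(value, seen)
```

Two of the six leaves are never evaluated, yet the result is the same as a full minimax search. The search itself is goal-agnostic; what counts as "good" lives entirely in the leaf scores.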

So while this is intelligence, this is not sentience.

I assume it gets reward for getting points in the pacman game (eating dots and ghosts) and presumably loses them if it dies or loses the game. There might also be a time factor involved so it doesn't waste time.

I'm really not entirely sure how its decision-making process works and I'm really curious to know. Because simulating every possibility would be ridiculous, but trying to predict the distant future based on some small action that happens in the present is also really difficult.

>So while this is intelligence, this is not sentience.

No one is claiming it is, and "sentience" is a really dubious concept itself.

This article is powerful. The idea of working to blend our philosophical theorems into mathematical modeling is brilliant and should be obvious.

Their definition of intelligence is also extremely precise and gives me a new way to look at the world.

Is AIXI capable of recognizing itself? If an AIXI agent were controlling a robot body, would it realize that the body's actions correlate with its own intentions? Would it pass the mirror test?

AIXI is capable of anything. Give it a utility function (and a pocket universe of computium to run itself) and it will learn behavior that maximizes that utility.

That said, things like the mirror test are so wrapped up in the construction of our own brain that it really doesn't make sense to apply them to alien intelligences, as you'd be unlikely to get a visible response for unrelated reasons. "That thing in the mirror is me, so what?" Actually that's a bit misleading - the AIXI will have no internal dialogue. It's not built that way. But give it any task, and it will learn to do that task. If that task is to recognize itself in mirrors, it will... but it will do so in a way that is markedly different from humans.

No. It would probably pick up on the correlation, but AIXI assumes that its only interactions with the universe are through its input and output channels, so it cannot recognize that a particular part of the universe is itself. It does not know how to formulate that hypothesis.

What is recognition? What are intentions? These terms are void of meaning outside of human context. So if intelligence is defined around those human-centred terms (that are even too vague to account for human-specific intelligence[1]), surely nothing will ever be considered intelligent but a human being.

[1]: https://en.wikipedia.org/wiki/The_Bell_Curve

Does anyone know of an overview for people who already know the relevant mathematics? Does their "English" definition have a corresponding mathematical definition?

How is this better than applying the minimum description length principle formalization of Occam's razor? (from 1978 [1])

I admit that I didn't spend time to really delve into it but nothing in this article strikes me as particularly ground breaking.

[1] http://en.wikipedia.org/wiki/Minimum_description_length

edit: innovative --> ground breaking

It's basically that, formulated into a reinforcement learning algorithm.

It's innovative because this combination, which is unique to Marcus Hutter's Ph.D. thesis, basically "solves" the problem of general AI... if given infinite computational resources. It's mathematically provable to be the best possible artificial general intelligence, with constant-time operation... it's just that the constant happens to be larger than the known universe.

That sounds useless, but it's not. What it shows is that artificial (general) intelligence is really an optimization problem, and it provides a (theoretical) benchmark for comparison, and directs current research into more practical algorithms which approximate that ideal within realistic space and time constraints.

Thanks, that sheds a bit more light on it.

To me it still sounds like "I solved the problem because I formalized it." The formalization is probably novel (since he got his PhD for it) but somehow I find it difficult to believe no one else had a similar idea before (it definitely sounds familiar to me, having learned about reinforcement learning, minimum description length, etc. before). I'll have to read up on it.

It's not constant-time. It's not even computable, even in theory, even with infinite resources.

True, but the approximations of it are (i.e. limiting the amount of running time each hypothesis has and limiting the number of hypotheses you test).

This is sort of like criticizing the concept of a Turing machine because no one has built one with infinite tape or running time.

I'm not criticising AIXI, I'm just pointing out that calling it "constant time" is about as wrong as you can get, in the world of time complexity. Especially for AIXI, but also (less so) for an approximation like AIXItl.

Further, you can show with information theory that limiting the number of hypotheses tested or the running time does not reduce generality so long as you test up to a certain amount for a problem domain (e.g. hypothesis sizes up to the maximum representable if the causally observable universe were converted into computium, and maximum number of steps equal to the number of possible combinatorial configurations of program state).

Of course as mentioned, this gives you constant factors equal to the size of the observable universe. No one said it was practical :)

"Constant time" is still wrong even in the approximations. Constant time means O(1). But your comment already refers to a running-time complexity dependence on several parameters of the input, such as the size of the observable universe.

Size of the observable universe is constant. And maximization by exhaustive search requires touching every possible state. So yes, it is O(1). This is explained in Marcus' Ph.D. thesis, I believe.

If he says so, I should admit I'm wrong, but maybe not just yet.

This way of thinking (size of the observable universe is constant) avoids the whole question of time complexity. To actually measure the time complexity of the algorithm, you need to consider inputs of different sizes. You could run exactly the same algorithm in a different universe (that is why it's called "universal" after all) and you'd get a different running time. The time complexity is then the relationship between the two running times and the universe sizes.

This idea is a generalization of Solomonoff's work on induction from the dawn of computing. Solomonoff was one of the three original discoverers of Kolmogorov complexity (Kolmogorov and Chaitin being the others). MDL is an attempt at a computable Kolmogorov complexity; however, creating a codebook is difficult, so MDL is no panacea. Hutter's work is broader, being concerned with the intelligence of an agent, and uses ideas like MDL and compression.

...applying the minimum description length principle...

Or the Minimum Message Length principle (from 1968 [1])

[1] https://en.wikipedia.org/wiki/Minimum_message_length

What if you gradually introduce more and more complicated games? Start with Pong, then Pac-Man, Super Mario World, etc. until you get to today's games, which are far more complicated and, more importantly, very realistic. Now that your AI is well trained, attach a camera to the computer and point the machine at the most complicated game of all: reality.

Good point: I don't think they've ever taken an agent which has been learning for a while in one environment and placed it in another. I don't know, but I would expect that the current version at least would have to fail hard, basically unlearning everything, before it started to succeed in the new environment. That's a weakness, if so.

How do humans avoid this? Because our brains are evolved to be plastic but not too plastic. We encounter new environments (like going from one level of Pac-Man to the next), but not new laws of physics (like going from Pac-Man to Mario).

The whole point is that the intelligence is "universal." By definition, we are talking about a thing which is able to transcend what it knows. So to me, the whole point is to introduce a variety of environments so that it learns the truly salient knowledge that is applicable across many games. And the most important of facts is this: stay alive. We might be able to bootstrap an ai up to a self-aware, self-preserving, rational agent which can adapt to any environment.

Can someone explain the equation in this article?

Yes. The core of AIXI is the part that figures out the nature of the universe in which it finds itself (a formalization of induction), and the algorithm for that is essentially "simulate all possible universes and see which ones give data that matches our observations, then apply Occam's razor and assume the simplest one is true".

It will not have escaped your notice that "simulate all possible universes" would require an infinitely powerful computer, and even finite approximations quickly become wildly intractable, making the algorithm unusable for anything significantly more complex than Pac-Man. Thus AIXI is interesting for philosophical purposes, being a mathematical formalization of intelligence, but not useful for engineering purposes.
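
For reference, this is the equation in the form Hutter usually gives it. The notation is assumed from his presentations rather than taken from this thread: $a$, $o$ and $r$ are actions, observations and rewards, $k$ is the current step, $m$ is the planning horizon, $U$ is a universal Turing machine, $q$ ranges over programs (candidate "universes") and $\ell(q)$ is the length of $q$ in bits.

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Reading right to left: the innermost sum keeps only the programs whose output matches the history for the chosen actions and weights each by $2^{-\ell(q)}$ (the Occam's razor part), the bracket is the total reward being maximized, and the alternating $\max$ and $\sum$ is the planning step, choosing each action to maximize expected reward over the possible observations.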

Here's a 5 min video of AIXI playing Pacman (the description provides details): https://www.youtube.com/watch?v=RhQTWidQQ8U

Look for YT videos with Marcus Hutter, they're excellent. He explains things very well and very systematically. Great stuff.

I'm surprised there's no reference to computational complexity. I can give a perfectly fine example of a machine that satisfies their definition but would not be considered intelligent because it's inefficient.

It certainly makes sense to have a number of if-checks, which an equation essentially is. The difficult part is feeding the equation with the correct information, and with the correct timing and sequence, etc..

What if we tried to mirror the evolutionary process by starting with an objective like survive, reproduce, etc?

That's the idea genetic algorithms, genetic programming etc. are based on.
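
A bare-bones genetic algorithm, for the curious. This is a sketch on the standard "OneMax" toy problem (maximize the number of 1 bits), with hypothetical parameters and a hypothetical function name `evolve`; "survival" here just means staying in the fitter half of the population.

```python
import random

def evolve(n_bits=20, pop_size=30, generations=60, mutation_rate=0.05, seed=0):
    """Evolve bit strings toward all ones; returns the best genome and a fitness log."""
    rng = random.Random(seed)
    fitness = sum  # OneMax: fitness is the count of 1 bits
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        survivors = pop[: pop_size // 2]  # selection: the fitter half survives
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_bits)  # single-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness), history

best, history = evolve()
print(sum(best), history[0], history[-1])
```

Because the survivors are carried over unchanged, the best fitness never decreases from one generation to the next. The catch for AI is the same as with reward functions: someone still has to decide what "fitness" means.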

Surely, it will be built faster if it is open source.
