But you don't have to explicitly tell a human what the reward function for Pac-Man is. Show a human the game and they'll figure it out. That makes me wonder whether, while there is some room for variability in reward functions, there might be some basic underlying reward computation that is inherent in intelligence. I can't find the link just now, but I read an article a few months ago (might've even been here on HN) about a system that demonstrated the appearance of intelligent behavior by trying to minimize entropy and maximize its possible choices for as long as possible within a given world model.
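I can't vouch for the details of that system, but the "keep your options open" flavor is easy to sketch. Here's a toy version (entirely my own construction, not from that article): an agent in a grid world picks whichever move leaves it the most reachable states a few steps out.

    # Toy "maximize future options" agent in a grid world (hypothetical
    # example). The agent prefers the move that keeps the largest number
    # of distinct states reachable within a fixed horizon.
    MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    WALLS = {(1, 1), (2, 1)}  # made-up obstacles

    def step(pos, move):
        nxt = (pos[0] + move[0], pos[1] + move[1])
        return pos if nxt in WALLS else nxt  # bump into a wall, stay put

    def reachable(pos, depth):
        """All states reachable from pos within `depth` steps."""
        frontier = {pos}
        for _ in range(depth):
            frontier |= {step(p, m) for p in frontier for m in MOVES}
        return frontier

    def best_move(pos, horizon=4):
        # Greedy choice: the successor with the most open futures wins.
        return max(MOVES, key=lambda m: len(reachable(step(pos, m), horizon)))

    print(best_move((0, 0)))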
My thoughts are that if they were to take such a direction with this AI, they'd give it the basics and let it evolve and learn its own complicated reward structure. When you're trying to get a monkey to play Pac-Man, you bribe him with a capful of Ribena - he doesn't care about fun intellectual challenges, but sweet liquids motivate the hell out of him.
(This is the state of actual monkey research - Ribena is monkey crack.)
What I want to know is: how long does it spend doing this? Does the duration of one of these cycles have to be manually chosen, or is it smart enough/self-aware enough to realise: okay, if I spend forever thinking about something, then nothing will happen?
I guess what I'm asking is, is the algorithm smart enough to eventually internalise its own model of itself?
As far as I understand, it's just random search for models ("For the planning component we use standard Monte Carlo"), plus some kind of minimum description length (MDL) based method for model selection ("For the learning component we use standard file compression algorithms").
So the probability that it would, by Monte Carlo search, find a model that includes some kind of description of itself is, I assume, astronomically tiny.
(...And even if it did, it would have no way to know that it had stumbled upon a model of itself - except through the experimental performance of the model, which might or might not be better than that of other models and variations.)
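To make that concrete, here's how I picture the two components fitting together - a rough sketch of my reading, not the authors' actual code; simulate(), the action set, and the byte-level model encoding are all placeholders:

    import zlib, random

    def mdl_score(model, history):
        # MDL-ish model selection: encode the history as the model's
        # prediction errors and see how well a stock compressor squeezes
        # it ("standard file compression algorithms"). Smaller is better.
        residuals = bytes((obs ^ model(i)) & 0xFF for i, obs in enumerate(history))
        return len(zlib.compress(residuals))

    def plan(model, state, actions, simulate, horizon=10, rollouts=100):
        # Vanilla Monte Carlo planning: estimate each action's value by
        # averaging returns of random rollouts under the learned model.
        def rollout(first_action):
            s, total = simulate(model, state, first_action)
            for _ in range(horizon - 1):
                s, r = simulate(model, s, random.choice(actions))
                total += r
            return total
        return max(actions,
                   key=lambda a: sum(rollout(a) for _ in range(rollouts)))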
I'm a developer for the linked site - if you're interested in posting the question in the comments I'll ping the author to see if he can answer.
(If you happen to have encountered Marcus Hutter's AIXI before, there is nothing new there compared with any other presentation of his ideas.)
So, to comment on your dystopian prediction, I'm not sure this algorithm would do any better against humans, unless all we do is wander aimlessly too ;)
I doubt it would do better than humans, because it's not very efficient, but a better AI running on the same principles could be smarter than humans and outperform us outside the virtual world too.
Be warned: you need some serious maths and a whiteboard to follow him. In the end, though, the book is quite rewarding.
Now to AIXI: I don't believe that having a formula solves the AI problem. Yes, you can model it, but in a real environment I believe there are simpler models to try first. Remember: nature likes simplicity first, complexity only when needed.
How many times do we need to redefine the same concept? Artificial Intelligence, Human-Level Artificial Intelligence, Artificial General Intelligence, Strong Artificial Intelligence, etc.
Let's pick one as a community and stick with it. I thought AGI was in the lead there recently, what with the conference, the journal, and the high volume of web searches, but apparently that isn't enough.
Beyond that, I would like to see how they specialize their algorithm for the narrow application of the Pac-Man game while keeping it generalizable. My guess is that they don't, and this algo ends up being a "starting point" for narrow AI applications to rest on. In that case it's fine and interesting, but it doesn't pass the test of integrating specificity within a general learning model.
There is no reason to assume that human intelligence is universal. Our intelligence is biased towards picking berries and avoiding bears in a three-dimensional world.
Every time the name gets co-opted to mean something else. These days "artificial intelligence" == "machine learning and natural language processing" which is most definitely not what TFA is about.
I sometimes think everybody's still afraid there'll be another AI Winter (and/or that the first one hasn't ended yet), and is therefore anxious to come up with a new name for what they're doing every so often, as a sort of pre-emptive dodge.
And when our assumptions turn out to be wrong, should we continue working on it even though we know it can never work? The problem is that every new term is forged with assumptions, because what intelligence even is remains the subject of ongoing debate. (I don't even like the one presented here, even though it's pretty good as far as these things go.)
I sympathize though, the nomenclature is polluted with the corpses of would-be giants and makes it hard to talk about what we can all easily recognize as the same broad-strokes effort.
You just use the rewards to optimize a function that tells you the predicted reward of a given action in any state, and then take the best actions.
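In other words, something like plain tabular Q-learning. A minimal sketch (hyperparameters made up):

    import random
    from collections import defaultdict

    # Q[state, action] is the learned estimate of future reward.
    Q = defaultdict(float)
    alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount, exploration

    def choose(state, actions):
        if random.random() < eps:                       # explore occasionally
            return random.choice(actions)
        return max(actions, key=lambda a: Q[state, a])  # otherwise exploit

    def update(state, action, reward, next_state, actions):
        best_next = max(Q[next_state, a] for a in actions)
        # Nudge the prediction toward (reward + discounted future value).
        Q[state, action] += alpha * (reward + gamma * best_next - Q[state, action])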
"The AIXI formalism says roughly to consider all possible computable models of the environment, Bayes-update them on past experiences, and use the resulting updated predictions to model the expected sensory reward of all possible strategies."
I will add that the prior over models comes from a measure of model complexity, which is a reasonable way to construct a prior.
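For reference, the full expression from Hutter's papers looks roughly like this, where l(q) is the length of program q - the 2^-l(q) factor is exactly that complexity prior:

    a_t = \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m}
          (r_t + \cdots + r_m) \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}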
While I think the idea of connecting deep mathematical formalisms with actual AI goals is mostly hype, these formalisms are still interesting in themselves. There is something very intuitive about how AIXI works, but that doesn't mean it is going to be practical for a large range of problems.
Seek new knowledge, learn well, reason logically and inductively, recognize patterns and generalize, contemplate and plan for the future, converse and be social, survive,
and strive for optimum health (my addition).
Very good advice whether or not you're an automaton.
2. What happens in the infinite case? With Pac-Man, there's always a finite number of choices at each state.
3. If it doesn't have to be told the rules, then how does it make decisions? In theory it may be able to learn how to play Jeopardy, but it may be way too inefficient in practice. Humans don't even start with a blank slate.
4. I don't think I ever consciously apply philosophical principles like Ockham's razor when I problem-solve or learn. It makes me a little uncomfortable that we're starting with a philosophy, rather than having the system discover things itself. I would be OK with it if there were some parallel between Ockham's razor and physics (not the methodology of science).
Why do you think it is relevant what you do consciously? The only things you do consciously are those things your brain is ill-equipped to do. The vast majority of your thinking processes are subconscious, as are the principles that drive your conscious thinking. And I guarantee you, Ockham's razor is in there whether you realize it or not. When things get complicated, do you purposefully look for a simpler solution? When trying to understand an unknown situation, do you start with something simple and add complexity as needed? Ockham's razor.
> I would be OK with it if there were some parallel between Ockham's razor and physics.
... there's not?
EDIT: As an AI researcher, I'd be more interested in creating an artificial scientist than "artificial science." So what makes the scientist work? Ockham's razor is at the foundation of that.
Also, don't forget that this is a partially observable domain (with only local sensory information) and no a priori knowledge of the overall goal to achieve. It learns completely from scratch! If we find this task easy as human beings, it's because we have a lot of prior knowledge that we can transfer to it.
AIXI is doing a shortest-first search for predictors that match the observed environment, except that it's searching them all simultaneously in infinite parallel (that's why it's not computable) and what gets moved around is probability weighting. Ockham's razor describes the starting state of the weightings when there hasn't yet been any evidence: shorter predictors are given more weight.
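A toy version of that weighting scheme, with three hand-rolled predictors standing in for "all predictors" (the infinite parallelism is, of course, the part you can't actually have):

    # Each predictor returns P(next bit = 1 | history). Its prior weight
    # is 2^-length: the Ockham starting state described above. Observing
    # evidence multiplies each weight by the likelihood it assigned.
    predictors = {
        "all-zeros": (3, lambda h: 0.01),
        "all-ones":  (3, lambda h: 0.99),
        "alternate": (5, lambda h: 0.5 if not h else (0.99 if h[-1] == 0 else 0.01)),
    }
    weights = {name: 2.0 ** -length for name, (length, _) in predictors.items()}

    def observe(history, bit):
        for name, (_, predict) in predictors.items():
            p1 = predict(history)
            weights[name] *= p1 if bit == 1 else (1.0 - p1)

    history = []
    for bit in [0, 1, 0, 1, 0, 1]:
        observe(history, bit)
        history.append(bit)

    print(max(weights, key=weights.get))  # "alternate" wins as evidence piles up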
A disciple of another sect once came to Drescher as he was eating his morning meal.
“I would like to give you this personality test”, said the outsider, “because I want you to be happy.”
Drescher took the paper that was offered him and put it into the toaster, saying: “I wish the toaster to be happy, too.”
Don't get me wrong, a machine that can achieve goals is really, really useful. After all, chess-playing programs use the alpha-beta algorithm to prune future positions intelligently, but to score the positions they still need human input. If, however, they were to infer their OWN rules from past positions, with absolutely no human input, then it would become interesting. Still, the initial rules of the chess game have to be set down.
So while this is intelligence, this is not sentience.
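For anyone curious about the pruning mentioned above, here's a bare-bones alpha-beta sketch; moves(), apply_move(), and evaluate() are placeholders you'd supply, and evaluate() is exactly the hand-written human input in question:

    import math

    def alphabeta(state, depth, alpha, beta, maximizing, moves, apply_move, evaluate):
        # Classic alpha-beta: skip branches that can't affect the result.
        if depth == 0 or not moves(state):
            return evaluate(state)          # the human-supplied scoring function
        if maximizing:
            value = -math.inf
            for m in moves(state):
                value = max(value, alphabeta(apply_move(state, m), depth - 1,
                                             alpha, beta, False,
                                             moves, apply_move, evaluate))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break                   # opponent would never allow this line
            return value
        value = math.inf
        for m in moves(state):
            value = min(value, alphabeta(apply_move(state, m), depth - 1,
                                         alpha, beta, True,
                                         moves, apply_move, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value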
I'm really not entirely sure how its decision-making process works, and I'm really curious to know. Simulating every possibility would be ridiculous, but trying to predict the distant future based on some small action in the present is also really difficult.
>So while this is intelligence, this is not sentience.
No one is claiming it is, and "sentience" is a really dubious concept itself.
Their definition of intelligence is also extremely precise and gives me a new way to look at the world.
That said, things like the mirror test are so wrapped up in the construction of our own brains that it really doesn't make sense to apply them to alien intelligences, as you'd be unlikely to get a visible response, for unrelated reasons. "That thing in the mirror is me, so what?" Actually, that's a bit misleading - the AIXI will have no internal dialogue. It's not built that way. But give it any task, and it will learn to do that task. If that task is to recognize itself in mirrors, it will... but it will do so in a way that is markedly different from humans.
I admit that I didn't spend the time to really delve into it, but nothing in this article strikes me as particularly ground-breaking.
edit: innovative --> ground-breaking
It's innovative because this combination, which is unique to Marcus Hutter's Ph.D. thesis, basically "solves" the problem of general AI... if given infinite computational resources. It's mathematically provable to be the best possible artificial general intelligence, with constant-time operation... it's just that the constant happens to be larger than the known universe.
That sounds useless, but it's not. What it shows is that artificial (general) intelligence is really an optimization problem; it provides a (theoretical) benchmark for comparison and directs current research toward more practical algorithms that approximate that ideal within realistic space and time constraints.
To me it still sounds like "I solved the problem because I formalized it." The formalization is probably novel (since he got his PhD for it), but somehow I find it difficult to believe no one had a similar idea before (it definitely sounds familiar to me, having learned about reinforcement learning, minimum description length, etc. before). I'll have to read up on it.
This is sort of like criticizing the concept of a Turing machine because no one has built one with infinite tape or running time.
Of course as mentioned, this gives you constant factors equal to the size of the observable universe. No one said it was practical :)
This way of thinking (treating the size of the observable universe as a constant) avoids the whole question of time complexity. To actually measure the time complexity of the algorithm, you need to consider inputs of different sizes. You could run exactly the same algorithm in a different universe (that is why it's called "universal", after all) and you'd get a different running time. The time complexity is then the relationship between the two running times and the universe sizes.
Or the Minimum Message Length principle (from 1968).
How do humans avoid this? Because our brains evolved to be plastic, but not too plastic. We encounter new environments (like going from one level of Pac-Man to the next), but not new laws of physics (like going from Pac-Man to Mario).
It will not have escaped your notice that "simulate all possible universes" would require an infinitely powerful computer, and even finite approximations quickly become wildly intractable, making the algorithm unusable for anything significantly more complex than Pac-Man. Thus AIXI is interesting for philosophical purposes, being a mathematical formalization of intelligence, but not useful for engineering purposes.
Look for YouTube videos with Marcus Hutter; they're excellent. He explains things very well and very systematically. Great stuff.