
Programmer Creates An AI To (Not Quite) Beat NES Games - npongratz
http://techcrunch.com/2013/04/14/nes-robot/
======
sethbannon
FYI the video is worth a full watch. The author of the paper is painfully
funny.

~~~
shurcooL
I can confirm it was well worth my 16 mins.

This really makes me want to go back to my game and play some more with AI.
Although, at the same time I realize that's the mistake I made - making the
game/platform for creating AI. It really takes a toll. This guy uses existing
NES games, which allows him to focus on the AI and even generic game learning.
Brilliant.

You can clearly see the advantages of AI in the short-term play (think in the
order of milliseconds, exact frame-by-frame button presses) over anything a
human could ever achieve.

Imagine combining the short-term button-mashing of said AI with the long-term
planning of human players...

~~~
jerf
You're looking for the pleasingly-symmetric-to-AI acronym IA, for Intelligence
Amplification: <http://en.wikipedia.org/wiki/Intelligence_amplification>

------
shmageggy
Much better article here:
[http://www.wired.co.uk/news/archive/2013-04/12/super-
mario-s...](http://www.wired.co.uk/news/archive/2013-04/12/super-mario-solved)

The video is long but worth all 16 minutes. I like how he isolated such a dead
simple heuristic for success in video games. It's not surprising, in
retrospect, that "numbers going up" would correlate with winning, but to prove
it in practice is pretty neat.

~~~
jstanley
He isn't very explicit about it, but the "time travel" that is mentioned
basically allows it to go back in time if things turn out badly. This is what
allows it to make such unbelievable trick moves: in short timespans it is
almost a brute-force search, but over the longer term it can't backtrack as
much.

If that makes sense.
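
To make the idea concrete, here is a minimal sketch of short-horizon brute force with savestates. It is a toy, not the author's code: the `step` function, the `PITS` layout, and the scoring rule are all made up for illustration. Because `step` is a pure function, any state value doubles as a savestate, and "time travel" is just restarting every candidate plan from the same saved state.

```python
from itertools import product

# Hypothetical toy "emulator" (not the NES): the state is (x, alive).
PITS = {4, 7}                      # landing on these positions kills the player
BUTTONS = ("right", "jump", "wait")

def step(state, button):
    """Advance one frame. Pure function, so a state is also a savestate."""
    x, alive = state
    if not alive:
        return state
    if button == "right":
        x += 1
    elif button == "jump":
        x += 2                     # a jump skips over one cell
    return (x, x not in PITS)

def score(state):
    x, alive = state
    return x if alive else -1

def best_plan(state, horizon):
    """Try every button sequence of length `horizon` from the same savestate."""
    best = None
    for plan in product(BUTTONS, repeat=horizon):
        s = state                  # "go back in time" to the saved state
        for button in plan:
            s = step(s, button)
        if best is None or score(s) > score(best[1]):
            best = (plan, s)
    return best

def play(frames, horizon=3):
    state = (0, True)
    for _ in range(frames):
        plan, _ = best_plan(state, horizon)
        state = step(state, plan[0])   # commit one move, then search again
    return state
```

The cost grows as `len(BUTTONS) ** horizon`, which is why this kind of search is only exhaustive over short timespans.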

~~~
acchow
Brute force search with time in one dimension is equivalent to this "time
travel", isn't it?

~~~
jstanley
Yes, that is essentially what is happening.

------
raldi
Sounds to me like the next big improvement would be to give the algorithm a
concept of boredom: If the numbers haven't gone up in a while, consider that
very bad. Try progressively more crazy things.

For example, unpausing Tetris. Or having Mario run left.
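
A rough sketch of what that could look like, assuming a made-up `patience` threshold and a simple "pick a random input more often the longer we've been bored" rule. None of these names come from the paper.

```python
import random

class BoredomTracker:
    """Counts frames since the objective last reached a new high."""
    def __init__(self):
        self.best = float("-inf")
        self.bored_frames = 0

    def update(self, objective):
        if objective > self.best:
            self.best = objective
            self.bored_frames = 0
        else:
            self.bored_frames += 1
        return self.bored_frames

def pick_input(candidates, evaluate, bored_frames, rng, patience=30):
    """Usually greedy; the more bored, the more often it tries something random."""
    craziness = min(1.0, bored_frames / patience)   # hypothetical schedule
    if rng.random() < craziness:
        return rng.choice(candidates)
    return max(candidates, key=evaluate)
```

With `bored_frames == 0` this is plain greedy selection, and once `bored_frames` reaches `patience` every pick is random.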

~~~
skore
> Try progressively more crazy things.

Teaching a computer various concepts of 'crazy' must be tons of fun! :-D

------
ok_craig
Pausing Tetris to avoid losing was hilarious.

~~~
niggler
"The only way to win is not to play"

~~~
acchow
"The only winning move is not to play."

The author is humorous, poetic, and brilliant.

~~~
keeran
and possibly a fan of WarGames :)

~~~
alan_cx
Exactly what I thought. I even think it was a clever way to end, and a point
he might have been making as a sort of subtext.

~~~
acchow
I've never seen WarGames. After reading the synopsis, the subtext is
undeniably intentional.

~~~
alan_cx
30 year old movie spoiler alert!!!!

War Games: Very, very basically, from deep, deep memory...

A computer about to launch a global nuclear war is played at tic-tac-toe to
learn it can't win. It deduces that the only way to win a nuclear war is not to
start the thing in the first place. Any other strategy is a loss, or what we
call mutually assured destruction.

It's obviously more complicated than that. Even though it's 30-odd years old,
it's worth a watch.

~~~
Ntrails
<http://i.imgur.com/fSBsN7c.png>

want to play?

~~~
alan_cx
Nice :)

But no.

------
jaredsohn
Direct link to paper: <http://www.cs.cmu.edu/~tom7/mario/mario.pdf>

Direct link to video:
[http://www.youtube.com/watch?feature=player_embedded&v=x...](http://www.youtube.com/watch?feature=player_embedded&v=xOCurBYI_gY)

------
jal278
Just to note that the AI has complete access to the NES's RAM (and also
backtracks whenever it gets into trouble), which is not quite fair to compare
to other approaches to AI or playing against a human. I'd be pretty damned
good at battleship if I could see your board and rewind whenever I missed.

~~~
kappaloris
yes, but the point is also that the program doesn't really know the goal, it
just tries to increase bytes that look like counters. that's the fun thing.

you should compare it with a game whose rules you don't know, being able only
to understand your score. like trying to play spider solitaire on windows
without knowing how the game works (it allows you to undo your moves).

~~~
jal278
i agree that it is an interesting result, however having access to
_everything_ is weird from an AI point of view and really muddies any
conclusions to be drawn; the headline seems to make it sound like this is a
breakthrough in AI, when it is using a cheat that won't apply to any sort of
real situation.

notice that in your example of solitaire, _knowing_ the values of hidden cards
really changes the nature of the game and makes playing the game much less
strategic

~~~
shrikant
Not sure why TechCrunch made it seem that way ("a breakthrough in AI"), when
the author explicitly says he's submitted this to SIGBOVIK 2013 -- an April 1
conference that usually publishes fake research. (<http://sigbovik.org/2013/>)

It is clearly an April Fool's hack.

~~~
hamidnazari
Good, I'm not the only person who suspected that.

This is from the paper. "I tried again, and it was a huge breakthrough: Mario
jumped up to get all the coins, and then almost immediately jumped back down
to continue the level! On his first try he beat 1-2 and then immediately
jumped in the pit at the beginning of 1-3 just as I was starting to feel
suuuuuper smart (Figure 7). On a scale of OMFG to WTFLOL I was like
whaaaaaaat?"

------
jamesjporter
Video is hilarious and worth watching the whole thing as others have
mentioned, but there's some great stuff in the paper also. He mentions some
really interesting future directions, e.g. obviating the need for the training
data set of a human playing the game by having the computer initially try
different sets of random inputs and continuously improving on those that do
the "best" according to its objective function. The bit about the computer
trying to play "dinosaur coloring" is fantastic also.
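
As I read that future direction, it amounts to something like hill climbing over input sequences. A minimal sketch under that assumption, with a made-up `objective` standing in for "run the game and score the result":

```python
import random

BUTTONS = ("left", "right", "jump", "wait")
MOVES = {"left": -1, "right": 1, "jump": 0, "wait": 0}

def objective(inputs):
    """Hypothetical stand-in for running the game: net progress to the right."""
    return sum(MOVES[button] for button in inputs)

def improve(length=20, generations=200, seed=0):
    rng = random.Random(seed)
    # Start from purely random inputs instead of a human recording.
    best = [rng.choice(BUTTONS) for _ in range(length)]
    for _ in range(generations):
        candidate = list(best)
        candidate[rng.randrange(length)] = rng.choice(BUTTONS)  # change one frame
        if objective(candidate) >= objective(best):
            best = candidate            # keep it only if it is no worse
    return best
```

The real objective in the paper is learned from memory, so this only shows the "try random inputs, keep what scores best" loop, not the scoring itself.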

------
keithwhor
The day has come, "AI rage quits, passes Turing test."

------
anonfunction
My favorite part was how the program finds glitches and uses them ruthlessly.

~~~
blauwbilgorgel
At university we had a contest with programmable tanks. Many wrote elaborate
algorithms and tried to find the optimal strategy, but the winner was very
crude: It tried to get a little bit more life than its opponent, and if it was
ahead it would immediately start spawning threads, until the play server
couldn't handle it anymore and crashed -- Winning by default.

~~~
keeperofdakeys
I had a similar kind of assignment, using a java system called robocode. I
ended up writing a system that could track a tank moving in a static arc,
while moving itself, which worked surprisingly well.

------
fiatmoney
Then they taught it to make paperclips, and ended up with one of these:

<http://wiki.lesswrong.com/wiki/Paperclip_maximizer>

~~~
Houshalter
Until it finds a glitch in the matrix that allows it to pause real life.

~~~
anonymfus
Brute-forcing all states of an NES game could turn up a vulnerability in the
NES emulator. By filling the emulator's memory with FFs through changed
settings, it could then make a mess of the accessible file system or, some life
spans of the universe later, find vulnerabilities in the OS, hardware or
connected network.

------
smd4
"This game is mostly a go to the right and jump and throw hammers at birds..."
Classic delivery. I love this guy.

~~~
acchow
He made me laugh repeatedly.

------
pmb
If your numbers are going up, it means you are having more fun!

<http://radgeek.com/gt/2006/04/28/antieconometrix_comix/>

It's too bad that games that need more lookahead - Tetris, etc. - are not
amenable to this approach.

~~~
chc
It sounds to me like it's basically how deterministic a game is that
determines whether this works. For a Tetris with a defined order that the
pieces appear in, I bet this algorithm could do pretty well.

~~~
jerf
For the purposes of this AI player, all NES games are fully deterministic.
Even what hardware the NES had that could be used for random number generation
is under the control of the emulator and being run deterministically. There's
no "random" in the usual sense here.
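
A toy illustration of the point, assuming a made-up game whose "random" piece generator is a small LCG stirred by the player's inputs. The constants and `BUTTON_CODES` are placeholders, not anything from a real NES game.

```python
BUTTON_CODES = {"wait": 0, "left": 1, "right": 2, "rotate": 3}

def lcg(state):
    """Tiny linear congruential generator standing in for in-game randomness."""
    return (state * 1103515245 + 12345) % 2**31

def run(inputs, seed=42):
    """The piece sequence depends only on the seed and the inputs."""
    rng_state = seed
    pieces = []
    for button in inputs:
        rng_state = lcg(rng_state ^ BUTTON_CODES[button])  # inputs stir the RNG
        pieces.append(rng_state % 7)                       # one of 7 piece types
    return pieces
```

Replaying the same inputs from the same starting state always yields the same pieces, so a search that rewinds can effectively peek at what is coming.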

------
marvin
Pardon me if I'm really ignorant of the AI field here, but doesn't this look
like the beginning of more general artificial intelligence? The organic feel
of some of these moves is almost scary to me.

~~~
wmf
Keep in mind that it's using time travel to do those moves. In the real world
time travel is not currently available.

~~~
raldi
When a baseball player swings a bat, in anticipation of the ball arriving just
in time to get knocked out of the park, do you consider that time travel?

Is the baseball player's anticipation of the future really all that different
from what this program is doing?

~~~
Falling3
Yes, it's incredibly different! A huge problem in AI is internally modelling
the outside world. This completely sidesteps that.

~~~
raldi
Maybe, but that doesn't really have anything to do with the "time travel"
question.

~~~
Falling3
Of course it does. Models are useful because they have predictive power. If
you have time travel, you don't need to predict anything.

------
jmyc
Related: genetic algorithms to play video games. I can't find the exact one I
read about before, which was for Super Mario Bros., but searching for "mario
genetic algorithm" leads to a number of papers, videos, &c.

~~~
ihsw
There are a[1] couple[2] on YouTube.

[1] <http://www.youtube.com/watch?v=qYluZRwrw9w>

[2] <http://www.youtube.com/watch?v=DlkMs4ZHHr8>

------
nokya
I love the fact that a newspaper can still publish this without fear of losing
viewers. It's like eating at a restaurant that everyone qualifies for dog food
but... well...we go anyway :)

------
aaron695
Very funny video and worth watching.

But it's not even close to beating NES Games. I'm not an expert in AI but I'm
not sure it actually achieves anything.

I'd consider it more an art project than a computer science project.

~~~
TallboyOne
You must not be very fun at parties.

~~~
yew
Depends on what sort of parties you like.

------
gnarbarian
Using A* does a much better job: <http://www.youtube.com/watch?v=DlkMs4ZHHr8>

Look at that AI play. It's godly.

~~~
jitl
The paper uses an approach general enough for any game on the NES, which is
what makes it exceptional. The humor is killer, too.

------
hhaidar
The ending is very poetic in a way.

------
wiremine
More authors should make videos like this. Well done!

------
nate_martin
I wonder how this would handle an NES golf game.

~~~
rcfox
In "NES Open", you earned money depending on how you placed. So, given long
enough jumps back in time, there is a way to track an increasing score.

Either way though, I think the results would be _sub-par_.

~~~
anonfunction
slowclap.gif

------
bicknergseng
The only winning move... is not to play.

------
michaelochurch
Uninformed speculative question here: could the algorithm be improved with
some sort of momentum term in the input selection?

It seems to succeed with move-to-the-right (position counter) but fall into
what looks like Brownian motion on Pac-Man. It seems to me that putting
momentum into its strategy (and changing when the objective function declined)
might reduce the implicit branching tree.

What it simulates, then, is the tendency for people to fall into "a groove"
and depart from it when their objective function ceases increasing.
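
One way to picture it, purely as a sketch: give the previous input a small bonus when scoring candidates, so ties and near-ties resolve toward "keep doing what you were doing". The `bonus` value and the function name are made up.

```python
def choose_with_momentum(candidates, evaluate, last_input, bonus=0.5):
    """Score each input, adding a small bonus for repeating the previous one."""
    def biased(button):
        return evaluate(button) + (bonus if button == last_input else 0.0)
    return max(candidates, key=biased)
```

When some other input beats the previous one by more than `bonus`, the groove is abandoned, which roughly matches "depart from it when the objective function ceases increasing".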

~~~
raldi
I'd really like to see a fuller analysis of what Pac-Man was doing there. As I
understand the algorithm, it should have seen into the future where the dot
gets eaten, and not turned back.

But maybe the program failed to discover the bytes which hold the score, and
so it didn't think eating dots was a particularly notable activity.

~~~
LordLandon
Maybe it gets distracted increasing the pac-scent? There's certainly a lot
more of that around in memory

------
camus
i hope i live until the day when robots can beat games without any prior
knowledge of the game logic, well actually it would be quite scary

~~~
pbhjpbhj
Scary indeed. Those robots might be very good at ... Thermonuclear War.

------
repler
... sooo, that's not really a great application of a GA. Perhaps not even a
great implementation.

Akin to making things out of Legos at best. (ie: "because!")

~~~
gjm11
It's not an application of a GA at all. Before dismissing something, it's best
to understand it.

------
drakaal
Considering there are bots to beat you at Call Of Duty, and Quake, this is not
such a feat.

I can't find a reference but I recall an Emulator for NES that used to be able
to be player 2 for something like 40 NES games.

~~~
icebraining
I think you missed the point somewhat. Those bots were coded just for those
games, with human-made models and strategies adapted to them.

This bot, on the other hand, understands nothing about the games themselves
besides the input keys and the score number(s). It's playing blind.

~~~
drakaal
Having written real AIs, this ain't one.

4 directions and two buttons. You could point a crappy machine learning algo
at it, and it would figure it out. That is far fewer variables to manipulate
than what a Quake bot has to deal with.

I would bet money I could zap a pigeon every time it died in mario and it
would learn to be a master in 12 hours. It won't master CoD anytime soon.

~~~
jaredsohn
How would it figure out what it means to be "winning"? It might be possible,
but would require using OCR to read the 'score' (and guessing as to what is meant
by the score) and would also need some way to determine if the goal is to "go
to the right", etc. (also possible). Doing this would be interesting and more
practical in that it could work with games where you do not have access to the
memory, but seems like it would require more work in interpreting how the
game's state has changed and wouldn't allow going back in time.

A Quake bot is easier to develop because it is written for a specific game.

I accept that bet for training a pigeon to play SMB :), although the scenario
you described isn't comparable (it is only one game and you would be giving
the pigeon extra feedback about how to play it. It still seems hard and I
think it would be notable if you achieved it.)

~~~
drakaal
This AI has access to all of the ram of the game.

There are Guitar Hero Bots that "OCR" (your words) the screen to play the
guitar. Not such a hard trick to read a score or see a turtle. (you know we
have self driving cars right?)

~~~
jaredsohn
I think trying to understand what is going on in a game by looking at the
screen would be more challenging than trying to understand by looking at the
game's memory if you want to write a bot that can play any generic game.

The examples you gave are each specific to particular domains.

This whole topic reminds me of how it is much easier to develop natural
language processing for use in a particular domain compared to generically.

~~~
drakaal
I write "generic" Natural language processing for a living.

Also "nintendo" is a particular domain. When you only have a "language" of 8
directions, A, and B, and only time to deal with in response to a screen that
side scrolls in only one direction the programming is easy.

A chess engine has more potential decisions for a bishop than this has for a
given move. And unlike the NES, the chess engine has to adjust for changes
in the behavior. Given the same input the NES would make the same choices. You
can macro through most of the games.

~~~
polymatter
I think 2 things need to be emphasised.

1\. This was a paper submitted for a joke conference. It is in no way a giant
leap in AI. It is something that some nerds find amusing. Like an anti-Rube
Goldberg machine. It is trying to accomplish something quite complex using a
ridiculously stupid approach.

2\. It's stupid because it is looking at the memory state as just a list of
numbers that change during the gameplay. It has no knowledge of the actual
game. It doesn't parse the memory to figure out game state or anything. It
doesn't even have any generic idea of what games involve. It figures out "oo,
the number that starts at location 243 goes up in the training data. I'll mash
buttons to try to make memory location 243 go up".
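
A much-simplified sketch of that "find the numbers that go up" step, assuming the training data is just a list of RAM snapshots. The paper's actual objective is more involved than this; `rising_addresses` and `objective` are names invented here.

```python
def rising_addresses(snapshots):
    """Addresses whose byte never decreases and ends higher than it started."""
    rising = []
    for addr in range(len(snapshots[0])):
        values = [snap[addr] for snap in snapshots]
        never_down = all(a <= b for a, b in zip(values, values[1:]))
        if never_down and values[-1] > values[0]:
            rising.append(addr)
    return rising

def objective(ram, addresses):
    """Crude score: the sum of the bytes that went up during training."""
    return sum(ram[addr] for addr in addresses)

# Three fake 3-byte snapshots: only address 0 steadily goes up.
training = [bytes([0, 5, 9]), bytes([1, 5, 3]), bytes([2, 5, 4])]
print(rising_addresses(training))
```

Note that nothing here knows what address 0 means; it could be a score, a position, or a timer.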

