
AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code - mtuncer
https://www.theverge.com/tldr/2018/2/28/17062338/ai-agent-atari-q-bert-cracked-bug-cheat
======
dpflan
This reminds me of AI research using NES Games. The AI eventually became
proficient at completing Mario levels, and along the way it discovered novel
strategies for survival, obtaining points, and finishing levels.

> Check out this timestamp to watch the machine "cheat":
> [https://youtu.be/xOCurBYI_gY?t=9m55s](https://youtu.be/xOCurBYI_gY?t=9m55s)

> Researcher's site about the project:
> [http://www.cs.cmu.edu/~tom7/mario/](http://www.cs.cmu.edu/~tom7/mario/)

> The Paper: _The First Level of Super Mario Bros. is Easy with Lexicographic
> Orderings and Time Travel...after that it gets a little tricky._ :
> [http://www.cs.cmu.edu/~tom7/mario/mario.pdf](http://www.cs.cmu.edu/~tom7/mario/mario.pdf)

~~~
pbhjpbhj
Lol, that YouTube video: at the end, the AI pauses the game of Tetris forever
so as not to lose.

~~~
maxander
Another case where “the winning move is not to play.”

~~~
gkilmain
I have a friend who I play chess against and every time he's about to lose he
offers me a draw. And will continue offering me a draw until I win. Not happy
to see the AI performing like that bitch MukyMuky.

~~~
notahacker
In repeat games your friend's strategy might actually be counterproductive,
assuming he's playing against people that are rational enough to figure out
that he always starts offering a draw round about the time he expects to lose,
but not so good at chess that they don't sometimes miss opportunities for a
quick checkmate...

I'd be disappointed if an AI chess computer invariably let me know that I had
a better path to winning the game than it did.

~~~
gkilmain
That is exactly what happened. He played a friend of mine who's much lower
rated. Eventually the lower-rated player found himself in a winning position,
but when the draw was offered he assumed he must be missing something and
accepted it. So it worked once.

------
personjerry
> It’s not the most powerful or widely used form of AI at the moment, but it
> is making something of a comeback. The ability to crack Q*bert could be read
> as a good omen that evolutionary algorithms are going to be very useful in
> the future.

Wow, that's quite a jump to make.

~~~
mnx
This sounds like me at the end of every school essay: a forced and over-broad
conclusion just to get a "proper" ending.

~~~
tclancy
@#$&%!

------
andyjohnson0
The title seems misleading to me. The AI isn't finding bugs by somehow
examining the game's source code, it's trying random gameplay and exploiting
any advantages that emerge. That it's finding previously unknown bugs seems to
be almost entirely down to trying things that human players wouldn't think to
do.

~~~
tantalor
You're confusing the bug (unintended behavior) with its cause (bad code).

~~~
BatFastard
We called them "Unintended features", and they were usually quite popular with
users.

------
nopinsight
The case is an example of wireheading [1] and illustrates the difficulty of
eliciting behaviors we _actually_ desire from complex systems we do not fully
understand.

[1]
[https://wiki.lesswrong.com/wiki/Wireheading](https://wiki.lesswrong.com/wiki/Wireheading)

Another lesson: Evolutionary algorithms are really hard to control. Using
neural networks developed through evolutionary algorithms means that we are
employing a mostly opaque (though not entirely black) box created by a
mechanism we can't mentally keep track of in detail. I hope they are not
deployed to control any critical systems until we get a much better grasp of
them.
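For illustration, here is a minimal sketch of the kind of evolutionary loop
being discussed: mutate a champion's weights, score the mutants, keep the best.
Everything in it is a toy simplification (the fitness function just rewards
weights near 1.0, standing in for an emulator returning a game score), not the
actual method from the article:

```python
import random

def evaluate(weights):
    # Toy stand-in for an emulator rollout: fitness peaks when every
    # weight equals 1.0 (a real system would return the game score).
    return -sum((w - 1.0) ** 2 for w in weights)

def evolve(n_weights=4, pop_size=20, generations=50, sigma=0.1, seed=0):
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(n_weights)]
    for _ in range(generations):
        # Build a population of Gaussian mutations of the champion...
        population = [[w + rng.gauss(0, sigma) for w in best]
                      for _ in range(pop_size)]
        # ...and keep whichever candidate (including the old champion)
        # scores highest, so fitness never decreases.
        best = max(population + [best], key=evaluate)
    return best

champion = evolve()
```

The "opacity" point above is visible even here: the loop only ever sees the
scalar score, so any glitch that inflates the score is indistinguishable from
legitimately good play.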

~~~
nopinsight
Has anyone been able to comprehensively state all the essential human values
for a general AI to follow? Thankfully, we do not yet have an operational AGI,
and it is still quite a bit away from reality. (The narrow AIs we are using do
not pose much of a problem because they are limited in capability.)

------
raverbashing
Well, how do you say what's cheating or not? It works, and it increases the
evaluation score.

In this case, one possible workaround to "cheating" would be to reduce the
control precision, add some jitter to the control inputs, or change the goal
function. But I'd say that if it's being done solely using the intended
controls, it's not cheating (as opposed to changing memory or using a debug
'cheat code').

Still, even in real sports some "cheating" is allowed (see Fosbury Flop)
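The jitter workaround could be sketched like this (the action names and the
toy plan are made up for illustration; a real setup would perturb the inputs
fed to an emulator):

```python
import random

ACTIONS = ["left", "right", "jump", "noop"]

def jittered(actions, flip_prob=0.05, seed=42):
    """Randomly replace a small fraction of an action sequence, so an
    agent can't rely on the frame-perfect inputs glitch exploits need."""
    rng = random.Random(seed)
    return [rng.choice(ACTIONS) if rng.random() < flip_prob else a
            for a in actions]

plan = ["right"] * 20 + ["jump"]
noisy_plan = jittered(plan)
```

A strategy that still scores well under this noise is presumably using the
game's intended mechanics rather than a single precision-dependent glitch.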

~~~
tboughen
If it’s not technically cheating, it could be described as gamesmanship.

From Wikipedia
[https://en.m.wikipedia.org/wiki/Gamesmanship](https://en.m.wikipedia.org/wiki/Gamesmanship)

“Gamesmanship is the use of dubious (although not technically illegal) methods
to win or gain a serious advantage in a game or sport. It has been described
as "Pushing the rules to the limit without getting caught, using whatever
dubious methods possible to achieve the desired end".”

~~~
sincerely
Another term for this is "angle shooting".

------
NicoJuicy
I've always found this a good project for demonstrating AI:
[https://xviniette.github.io/FlappyLearning/](https://xviniette.github.io/FlappyLearning/)
(based on neuroevolution). Speed it up for faster results.

------
camgunz
Can we put AI to work on proving that we live in a simulation? I would never
enter/exit my apartment 38 times alternating between forwards, backwards and
each side, but an AI would. Maybe then all the walls start flashing and then
we'll know!

~~~
AnIdiotOnTheNet
What would it possibly matter? If I told you tomorrow that the entire universe
as you know it is running on some extra-dimensional alien computer, how
exactly is your life changed? Is it any more or less meaningful? Will your
suffering be any less painful, your happiness any less joyful?

Besides, how would you even tell the difference between a bug in the
simulation and legitimate physics? I mean, look at electron tunneling.

~~~
shakna
> Is it any more or less meaningful? Will your suffering be any less painful,
> your happiness any less joyful?

My happiness won't change, but I would be excited.

If we are indeed in a simulator, then I would be compelled to create or join
an effort to attract the attention of a being outside the simulator. Not for
worship, but discourse.

To be able to communicate with something outside of what we had perceived as
reality, yet which would be no less real, would be an amazing opportunity.

~~~
AnIdiotOnTheNet
I admit, that would be exciting and interesting... but also probably
impossible. Communication requires shared context, and it is likely whatever
our experience of reality is bears no relationship whatsoever to theirs.
Imagine Super Mario Bros. is a simulation that hosts intelligence. Do you
think Mario interprets the data that ultimately becomes pixels on a screen
anything like the way we do?

------
Semiapies
So, it's basically working as a goal-oriented fuzzer.

~~~
hatsunearu
Fuzzers are like bug/anomaly/new state finding-oriented reinforcement learning
programs, so yeah, in a way :P

------
tabtab
So it can become a dirty cheat just like a human. AI _is_ getting more
"natural" after all.

