
TensorKart: self-driving MarioKart with TensorFlow - pickle27
http://kevinhughes.ca/blog/tensor-kart
======
JonnieCache
In contrast, here's what is effectively an oracle machine playing mario kart:
[https://www.youtube.com/watch?v=ZBNgbJ5hXtQ](https://www.youtube.com/watch?v=ZBNgbJ5hXtQ)

(Amazingly detailed) info:
[http://tasvideos.org/5243S.html](http://tasvideos.org/5243S.html)

~~~
rl3
I like how it just glitches itself to an almost instant win on half of the
maps.

~~~
dclowd9901
These are Tool-Assisted Speedruns. That means it's a human player using
things like slow motion, memory dumps and other mechanisms to play a perfect
game. It's more an example of what humans can do when augmented with computers
than of an AI discovering those glitches itself.

~~~
hkmurakami
The most amazing run I've seen so far was an RTA (realtime time attack) of
Mega Man 2. A human player is manually triggering collision glitches and
writing over memory with a sequence of inputs, and as of 2016 the RTA time is
faster than the initial TAS records.

~~~
sgrove
Do you have a link? Sounds like it'd be a very interesting watch.

Edit: I found one that has an example (I think) around 7:38
[http://www.nicovideo.jp.am/watch/sm13963118](http://www.nicovideo.jp.am/watch/sm13963118)
- the collision detection pushes Mega Man into the wall and he jumps between
different sections. Very interesting indeed!

~~~
hkmurakami
Sorry, I think the one I was thinking of was Mega Man 2. I did find a link
for you; it starts at 2:36.
[http://www.nicovideo.jp/watch/sm23825129](http://www.nicovideo.jp/watch/sm23825129)

Also you might be interested in the Final Fantasy 6 memory overwrite bug that
was discovered in 2016 as well. It uses the Window Color menu settings as the
data reference.

Btw, regarding the Mega Man 2 RTA, there's an even more ridiculous collision
bug being used around 11:30:
[http://www.nicovideo.jp/watch/sm28321223](http://www.nicovideo.jp/watch/sm28321223)

------
adyus
Congrats on finishing the project! As you've already linked at the bottom of
your post, it's possible that OpenAI could've solved most of your I/O issues.

One thing I'd suggest is exploring a reward function, instead of using only
pre-recorded training data. That is, give the AI a goal to complete (in this
case, finish the race) and let it learn by itself!
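
For example, the reward could be as simple as "how much progress along the
track did you make this step", plus a bonus for finishing (a rough sketch; the
function and its arguments are hypothetical):

    # Hypothetical reward shaping for an RL agent on MarioKart.
    # prev_progress / curr_progress would be the fraction of the track completed.
    def compute_reward(prev_progress, curr_progress, finished, crashed):
        reward = curr_progress - prev_progress   # reward forward progress
        if finished:
            reward += 100.0                      # big bonus for finishing the race
        if crashed:
            reward -= 10.0                       # penalty for going off the track
        return reward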

~~~
Drdrdrq
I would love to learn how to do that - any suggestions?

EDIT: to clarify: what should I google for?

~~~
adyus
Here's what I could find in a couple minutes:

[https://github.com/openai/universe-starter-agent](https://github.com/openai/universe-starter-agent)

OpenAI's example universe agent. Remember that while their goal is an agent
that works in any and all environments (read: games), you could certainly
optimize yours just for MarioKart.

~~~
Drdrdrq
Thanks, looks promising! Can't wait to try it! :)

------
cr0sh
This is pretty cool. As someone who is currently working on the second
project (traffic sign recognition) of the Udacity "Self-Driving Car Engineer"
nanodegree, using TensorFlow, it's interesting to me how the "standard" MNIST
CNN seems adaptable to so many other use cases.

For the project I'm currently working on, I'm using a slightly modified form
of LeNet - which isn't too different from the TF MNIST tutorial; after all,
recognizing traffic signs isn't much different from recognizing hand-written
numbers...

...but "driving" a course? That seems radically different to my less-than-
expert-at-TensorFlow understanding, but that is only due to my ignorance.

I'm glad that these examples and demos are being investigated and made public
for others - especially people learning like myself - to look at and learn
from.

~~~
halflings
From the post:

> Later, I switched to use Nvidia’s Autopilot...

So I guess he didn't use the MNIST CNN model.

~~~
cr0sh
However, if you look at the code:

[https://github.com/SullyChen/Autopilot-TensorFlow/blob/master/model.py](https://github.com/SullyChen/Autopilot-TensorFlow/blob/master/model.py)

You can see that it follows much the same pattern as the LeNet CNN for MNIST:
a few (ok, more than a few!) convolutional layers followed by a few fully
connected layers.

Maybe you could call it a "follow-on", or perhaps just an ANN pattern:

Conv -> Conv -> Reshape/Flatten -> FC -> FC -> FC

(disregarding activations and such)

...which is really the lesson of the LeNet MNIST CNN - at least, that's my
takeaway.
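
Something in that spirit, as a rough Keras-style sketch (layer sizes here are
illustrative only, not pulled from the Autopilot repo):

    import tensorflow as tf

    # Conv -> Conv -> Flatten -> FC -> FC -> FC, regressing a steering value
    # instead of classifying a digit. Layer sizes are illustrative only.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(24, 5, strides=2, activation='relu',
                               input_shape=(66, 200, 3)),
        tf.keras.layers.Conv2D(36, 5, strides=2, activation='relu'),
        tf.keras.layers.Conv2D(48, 5, strides=2, activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(100, activation='relu'),
        tf.keras.layers.Dense(50, activation='relu'),
        tf.keras.layers.Dense(1),  # steering angle, not a class label
    ])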

~~~
halflings
You're right, that does look similar... I expected this to be based on some
type of RNN!

------
rl3
The inevitable follow-up article that delves into training offensive banana
peel usage should be interesting.

------
jostmey
Quote: "Driving a new (untrained) section of the Royal Raceway:"

So the author did a proper test of the model by scoring it on an unseen track
to make sure it generalizes! This is very awesome!

~~~
ska
How did we get from "bare minimum sensible testing" to "This is very
awesome!"? Are things that bad on average?

~~~
kevinwang
There's probably a broad range of people in the hn community.

~~~
ska
Fair point.

NB: generalization should be one of the first things, if not _the_ first
thing, you think about and plan for in any ML project.
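
E.g. hold out an entire track for validation instead of splitting frames at
random (a hypothetical sketch; the samples list and its fields are made up):

    # Hold out a whole track for validation rather than random frames; otherwise
    # near-duplicate frames leak between the train and validation sets.
    train_samples = [s for s in samples if s.track != 'RoyalRaceway']
    val_samples   = [s for s in samples if s.track == 'RoyalRaceway']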

------
bduerst
Personally I think the most impressive thing here isn't that you created a
self-driving MarioKart, but that you trained TensorFlow on screenshots of your
desktop.

I feel like that could be a good next step - a universal neural net model
that, once you've mapped the inputs, will learn to play any video game that's
on your screen.
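
The capture side is pretty approachable, too - something like this would get
you frames to feed the model (a rough sketch using the third-party mss
library; the window coordinates are made up):

    import numpy as np
    from mss import mss  # third-party screen-capture library, one of several options

    # Grab the emulator window region as a numpy array, ready to feed to a model.
    region = {'top': 100, 'left': 100, 'width': 640, 'height': 480}  # made-up coords
    with mss() as sct:
        frame = np.array(sct.grab(region))  # BGRA pixel data of the current screen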

~~~
jboggan
Especially since the hard work of increasing the screen resolution has already
been done.

Also, bravo on including the stupid little bugs that gave you trouble. It
always sustains me, when working on a hard project, to know that a self-driving
video game was blocked by a missing newline in a C HTTP request. It makes me
step back and laugh at the ridiculous complexity of what we take for granted
in our day-to-day work.

------
cjmcqueen
Best part, "With this in mind I played more MarioKart to record new training
data. I remember thinking to myself while trying to drive perfectly, “is this
how parents feel when they’re driving with their children who are almost 16?”"

------
bitL
It's basically one of the projects these days in Udacity's Self-Driving Car
nanodegree, under "Behavioral Cloning" ;-)

------
jordigh
I was ready to be impressed by an AI that could consistently beat the game's
own AI, blue turtle shells and all. Oh well, it's still pretty impressive to
be able to drive the easiest course without opponents.

------
sakabaro
Check out also MarI/O, very impressive:
[https://www.youtube.com/watch?v=qv6UVOQ0F44](https://www.youtube.com/watch?v=qv6UVOQ0F44)

------
nartam11
How are the original computer opponents able to play MarioKart?

~~~
taway_1212
1. The AI in games has access to internal representations of game state and
does not have to recognize it from pixels on screen. This is a massive
difference.

2. The logic is usually a bunch of (human-authored) scripts consisting of
if-else spaghetti.
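
A caricature of what such a script might look like (entirely hypothetical;
none of these names come from any actual game engine):

    # Caricature of a scripted CPU driver: it reads the engine's internal state
    # directly, no pixels involved. Every name here is made up.
    RUBBERBAND_GAP = 50.0   # how far behind the player before "catch-up" kicks in
    ITEM_RANGE = 10.0

    def cpu_driver_update(kart, track, player):
        kart.steer_toward(track.next_waypoint(kart.position))  # follow baked-in waypoints
        if player.track_progress - kart.track_progress > RUBBERBAND_GAP:
            kart.speed *= 1.2                                   # rubberband boost
        if kart.has_item and abs(kart.track_progress - player.track_progress) < ITEM_RANGE:
            kart.use_item()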

~~~
bryondowd
Also, the AI opponents don't have to play by the same rules. They go by fun >
fairness to keep things interesting. That's why you normally can't keep a huge
lead on AI opponents, because they "rubberband" back up to you faster than
they should be able to.

Wouldn't surprise me if they don't even 'drive' in any sense while off-screen,
just increment some abstract position relative to the track length. But I
don't know this for a fact.

~~~
dclowd9901
"Wouldn't surprise me if they don't even 'drive' in any sense while off-
screen, just increment some abstract position relative to the track length.
But I don't know this for a fact." This seems unlikely, especially given how
item pickup zones operate. Since an item box disappears for a short period of
time after someone drives over it, it's imperative that the position of the
CPU player who drove over it, and the one that comes after that (and
inherently gets no item) is represented accurately. Even off-screen, AI
continues to collect and utilize items.

Then again, maybe this is just done by cheating simply with an RNG.

~~~
JonnieCache
This is indeed what they do. Mario Kart 64 is known for having the most
egregious rubberbanding ever.

------
eli_gottlieb
Personally, I'm just a little impressed that you can train an active agent to
play a game using old-fashioned supervised learning on screen states and
controller states rather than relying on "action-oriented" learning techniques
like reinforcement learning, online learning, or even a recurrent model.

It really shows how _simple_ many control tasks actually are!

~~~
tobilarscheid
This is exactly what I wondered about. So what exactly is the function you are
training for? Is it basically like "if the screen (showing the track) looks
like this, apply these controls"?

~~~
PeterisP
A more accurate description of the function would be "given this picture of
the screen, what is the most likely key my author was pressing in this
situation" - no goals, no values, no optimization, but simply learning to
imitate the actions performed by a human.

Coincidentally, one of the neural network components in AlphaGo did pretty
much the same, i.e. it attempted to guess what a human player would usually
play in a given situation, purely based on the image and nothing else.
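
In code, the entire training objective boils down to roughly this (a hedged
sketch; predicted_controls and human_controls stand in for whatever tensors
the author actually used):

    import tensorflow as tf

    # Behavioral cloning in one line: the loss only asks how close the predicted
    # controller state is to what the human pressed on that frame.
    loss = tf.reduce_mean(tf.square(predicted_controls - human_controls))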

------
tjfontaine
Next, have it upload its race results to kartlytics

~~~
TomAnthony
Yeah - that was a very cool project!

------
xigency
I'm interested in knowing why the Python and C components communicate over
HTTP, beyond reading about the bugfix. Wouldn't it be easier to use sockets,
files, or some other mechanism to integrate the two languages?

Just something to think about as a developer. I would imagine that on a local
machine, using HTTP as the protocol might add latency.

~~~
haikuginger
This was my initial reaction as well; it seems like a raw socket or even
embedding a Python interpreter would be better ways to go.
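
Something like this would be the raw-socket version on the Python side (a
hypothetical sketch; the port and wire format are made up):

    import socket
    import struct

    # Push the predicted controller state to the input plugin as a fixed-size
    # binary struct instead of an HTTP request. Port and format are made up.
    with socket.create_connection(('127.0.0.1', 8082)) as sock:
        # joystick x/y as floats, three buttons as single bytes
        sock.sendall(struct.pack('ffBBB', 0.5, -0.1, 1, 0, 0))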

------
CM30
Pretty interesting, I must say. I have to admit, though, that I kind of
expected the self-driving AI to be trying to win Grand Prix or Versus races
instead of doing well in Time Trials. But hey, I can see how that would be
utterly painful to set up, especially given how many times you get hit by
items or rammed off the track in the more recent games.

~~~
bisby
Step 1 is to make the AI find an ideal path through the course.

Step 2 is to make the AI figure out how to return to the ideal path when
other players are stealing its items or shelling it.

Step 3 is to make the AI figure out how to counter-attack and slow down the
opponents.

Step 4 is OH GOD WE TAUGHT THE AI HOW TO ATTACK RUN FOR YOUR LIVES.

~~~
bryondowd
Step 2.5 would be to make the AI figure out how to evade opponents' offensive
moves, minimize their effect, or stop opponents from initiating them in the
first place. That would be the most interesting bit to me. It would be neat to
see an AI intentionally stay in 2nd place with an item at the ready until the
home stretch, to avoid being blue-shelled.

~~~
bisby
Intentionally stay in second, unless it has reason to believe it can hold 1st
place even after getting blue-shelled.

But yes - the point being, self-driving is a feat of its own, and competing
with opponents is a whole different ballgame with its own set of challenges.

------
TomAnthony
It would be very interesting to see how well this does with more training
data, especially with multiple players.

------
holografix
This is very cool, and I think if Kevin spends a bit of time learning
reinforcement learning it could be amazing.

It seems like a lot of people doing reinforcement learning on video games get
bogged down in training on raw pixels only... it would take a tremendous
amount of data to make the driver recognise when and where to use certain
power-ups, but if you encoded this as an explicit variable, wow, it could be
really cool.
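
Something like this, roughly (a hypothetical Keras-style sketch; all shapes
and names are made up):

    import tensorflow as tf

    # Feed hand-encoded game state (e.g. which power-up you hold) alongside the
    # raw pixels, so the network doesn't have to learn it from the image alone.
    image_in = tf.keras.Input(shape=(66, 200, 3))
    item_in = tf.keras.Input(shape=(8,))              # one-hot of the held power-up
    x = tf.keras.layers.Conv2D(24, 5, strides=2, activation='relu')(image_in)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Concatenate()([x, item_in])   # image features + game state
    out = tf.keras.layers.Dense(4)(x)                 # e.g. steer, throttle, brake, use item
    model = tf.keras.Model([image_in, item_in], out)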

I believe this is fundamentally how we humans learn with so few examples.
Other humans "encode features for our brain to track" by telling us how it
should be done and what information to prioritise.

------
ramzyo
This is really cool, and any reason to bring this game back into my life is
warmly welcomed

------
tomrod
I love this!

I'm working (albeit very slowly, as a beginner) on a similar project with
Geometry Dash and Python. You're a great inspiration!

------
gm-conspiracy
I appreciate the write-up. Thank you!

------
dylanbfox
great write up! this is awesome

------
jondiggsit
No power slide? Failure.

