
Deep Reinforcement Learning: Pong from Pixels - Smerity
http://karpathy.github.io/2016/05/31/rl/
======
keyle
Fascinating.

Many years ago, when I wanted to become a programmer and I didn't know
anything about code, I used to fantasize and be amazed by programs. Code was
like dark magic.

This is how I feel today about machine learning. Neural networks, liquid state
machines. It's wonderful voodoo to my eyes.

I hope one day I get to work in that field; it seems so useful for solving big
world problems. I have noticed a definite rise in articles about it being
written and shared on HN lately, which is great.

For those completely in the dark, I found this library to have great wiki
pages about the basics of neural network programming. Great read, I recommend
it. [https://github.com/cazala/synaptic/wiki/Neural-Networks-101](https://github.com/cazala/synaptic/wiki/Neural-Networks-101)

~~~
jddjdbdbd3
Oooh, liquid state machines/reservoir computing, what delicious voodoo. Look up
optical reservoir computing, what wonderful nonsense. Smash the thing into a
million random pieces and look out for the pieces that happen to make your
problem easy. What. But it works.

------
bgalbraith
In my opinion, Reinforcement Learning is one of the most exciting areas of
research in machine learning and AI right now. It is going to play a major role
in creating AI that can make decisions in dynamic environments.

A great introduction to the topic is the book Reinforcement Learning: An
Introduction by Sutton & Barto. You can find the official HTML version of the
1st edition and a PDF of a recent draft of the 2nd ed. here:
[https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html](https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html)

------
grenoire
In almost all networks playing real-time games, there's very high jitter in the
inputs. Even when the agent is moving straight, it's always very keen on doing
some wiggling with the other keys.

My question is: Is it possible to eliminate that by further training? Naively
you could drop 'stupid' inputs, but I assume that may also mess with the
machine's understanding.

~~~
almostarockstar
Notice that the network's output is stochastic, a sample from a probability
distribution, not a fixed value. You could certainly tweak the output sampling
function to reduce jitter.

Further training may also reduce it, but technically it might not, since the
jitter doesn't cause any reduction in reward. The best approach would likely be
to alter the reward function to discourage jittery play... but then again,
there is little point here, because jitter does not reduce fitness.

I suppose where this is important is in robotics where jittery movement might
actually be dangerous, or wear down hardware. In that case, you could
certainly use an output smoothing function and tweak the reward.
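To make the "tweak the output sampling function" idea concrete, here is a
minimal sketch, assuming a Pong-style Bernoulli policy that outputs a single
probability of moving UP. The action ids (2=UP, 3=DOWN) and the temperature
knob are illustrative, not from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_action(p_up, temperature=1.0, deterministic=False):
    """Turn the policy's UP-probability into an action.

    Pure sampling (temperature=1.0) reproduces the wiggling; lowering
    the temperature sharpens the distribution, and deterministic=True
    removes the randomness entirely (greedy action).
    Action ids are hypothetical: 2=UP, 3=DOWN.
    """
    if deterministic:
        return 2 if p_up >= 0.5 else 3
    # sharpen (or flatten) the probability before sampling
    logit = np.log(p_up / (1 - p_up)) / temperature
    p = 1 / (1 + np.exp(-logit))
    return 2 if rng.random() < p else 3
```

Greedy selection kills jitter completely but also kills the exploration the
training relies on, so you would typically only do this at play time, not
during learning.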

~~~
folli
Maybe you could introduce a (small) penalty for every keystroke. This might
select against unnecessary movement and thus reduce jitter.
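That kind of reward shaping is simple to express. A minimal sketch, where the
no-op action id and the size of the penalty are made-up constants that would
need tuning against the actual game score:

```python
def shaped_reward(env_reward, action, noop_action=0, move_penalty=0.01):
    """Subtract a small cost from every movement action.

    The agent then only moves when movement is expected to buy more
    than the penalty. If move_penalty is too large the agent will
    refuse to move at all, so it must stay well below the game reward.
    """
    if action != noop_action:
        return env_reward - move_penalty
    return env_reward
```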

------
d33
In the context of AI and gaming, I definitely recommend this series of three
YouTube videos:

[https://www.youtube.com/watch?v=xOCurBYI_gY](https://www.youtube.com/watch?v=xOCurBYI_gY)

Some games are played better than a human would play them.

~~~
symmetricsaurus
These videos by Tom7 are indeed great. They don't use deep learning, though;
the method is something quite different [1].

Maybe you could use the objective functions of Tom7 to determine which moves
are good or bad in order to train a policy network?

[1]:
[http://www.cs.cmu.edu/~tom7/mario/mario.pdf](http://www.cs.cmu.edu/~tom7/mario/mario.pdf)
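The suggestion above (score moves with an objective function, then train a
policy network on those scores) amounts to a policy-gradient update with the
score change used as the advantage. A tiny illustrative sketch for a logistic
policy over two actions; every name here is made up for illustration, nothing
is taken from Tom7's paper:

```python
import numpy as np

def policy_gradient_step(theta, features, action, advantage, lr=0.01):
    """One REINFORCE-style update for a Bernoulli (two-action) policy.

    'advantage' would come from the objective function: positive if the
    move increased the score/progress metric, negative otherwise.
    action is 1 (e.g. UP) or 0 (e.g. DOWN).
    """
    p_up = 1 / (1 + np.exp(-theta @ features))
    # gradient of log pi(action | state) for a logistic policy
    grad = (1 - p_up) * features if action == 1 else -p_up * features
    return theta + lr * advantage * grad
```

Moves judged good by the objective function get pushed up in probability,
moves judged bad get pushed down, which is exactly the mechanism in the
article with the game score swapped out for Tom7's progress metric.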

------
Pica_soO
A neural network specializes in one particular problem set? Can you not create
meta-neural networks that reconnect the specialized networks, or grow new
ones?

