
NeuroEvolution – Flappy Bird - adtac
https://xviniette.github.io/FlappyLearning/
======
w23j
This is one of those fantastic "all that is done with so little code?!"
moments.

Amazing that this can be done in 300 lines of perfectly readable JS without
any libraries. And the author apparently wrote it in 2 days. Great intro to
genetic algorithms and reinforcement learning. Universities should teach like
that.

Edit: Somebody has asked about the big picture how the thing works:

You use a neural network to decide at each step whether to flap the bird's
wings. That is the only output. The input is only the bird's height
(y-position) and the height of the next hole.
([https://en.wikipedia.org/wiki/Feedforward_neural_network](https://en.wikipedia.org/wiki/Feedforward_neural_network))

The question now is how to find the correct weights for the net.

That net is not trained in a traditional supervised way like with gradient
descent (which would be more complicated). Instead it uses a genetic
algorithm to find new weights for the neural networks of future generations.
That is, it uses some of the best individuals, some randomly generated ones,
and some created by breeding from existing individuals.
([https://en.wikipedia.org/wiki/Genetic_algorithm](https://en.wikipedia.org/wiki/Genetic_algorithm))

And that's basically it in this case.
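
To make the big picture concrete, here is a minimal sketch (my own, not the author's actual code) of both pieces: a tiny 2-input feedforward net whose single output decides whether to flap, and one GA generation step that keeps the best genome, breeds children by crossover and mutation, and tops up with random newcomers.

```javascript
const sigmoid = x => 1 / (1 + Math.exp(-x));

// One hidden layer of 2 neurons, no biases for brevity; the weights are
// stored as a flat array -- this array is the "genome" the GA evolves.
function decide(genome, birdY, holeY) {
  const [w1, w2, w3, w4, w5, w6] = genome;
  const h1 = sigmoid(w1 * birdY + w2 * holeY);
  const h2 = sigmoid(w3 * birdY + w4 * holeY);
  return sigmoid(w5 * h1 + w6 * h2) > 0.5; // flap or not -- the only output
}

const randomGenome = () =>
  Array.from({ length: 6 }, () => Math.random() * 2 - 1);

// One GA generation: sort by fitness (score in the game), keep the best
// individual verbatim, breed most of the rest from the top two parents,
// and fill the remainder with fresh random genomes.
function nextGeneration(scored, popSize, mutationRate = 0.1) {
  scored.sort((a, b) => b.fitness - a.fitness); // note: sorts in place
  const next = [scored[0].genome.slice()];      // elitism
  while (next.length < popSize * 0.8) {
    const [a, b] = [scored[0].genome, scored[1].genome];
    const child = a
      .map((w, i) => (Math.random() < 0.5 ? w : b[i]))                      // crossover
      .map(w => (Math.random() < mutationRate ? w + Math.random() - 0.5 : w)); // mutation
    next.push(child);
  }
  while (next.length < popSize) next.push(randomGenome()); // fresh blood
  return next;
}
```

The exact proportions (elites, children, randoms) are my assumptions; the real library has its own ratios, but the shape of the loop is the same.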

~~~
placebo
These are exactly the gems that I am constantly in search of, and I'm
thrilled every time I come across one.

This is going to turn out to be a rant, but sometimes it feels like there's a
culture of solving problems by a philosophy of "let's throw more money at it,
more technology, more people, more unnecessary abstraction layers, complexity
and bloat - it's bound to be solved eventually". And with enough firepower,
it usually is - but at what cost... I'm not sure what the source of the
problem is though, and it's not only with code. Many times I struggle to
understand certain concepts via professional literature, and when I finally
understand them, I'm puzzled as to why they are taught in a way that, if I
didn't know better, I'd assume was a deliberate attempt to confuse the reader
instead of making the learning simple, intuitive and fun.

Exactly the same applies to code I've had to refactor and sometimes rewrite
over the years. Obviously, some of it has to do with trying to get a product
out in crazy schedules which requires making compromises, but I can tell the
difference between compromise and "I don't really care" attitude, and I'm
talking about the latter. It's a pity though - not only for the one that has
to sort out and sometimes clean up the mess, but also for the one that creates
it. Things are so much more fun when you care and are passionate about what
you do, that I'm sorry for those who are missing out on it.

End of rant. Not sure it belongs here but had to get it out of my system :-)

~~~
taeric
Sounds like you are noting that brute force is a very effective technique,
combined with explaining being a tough skill.

It is seductive to want things to be elegant. It is costly to wait for the
elegant solution.

~~~
placebo
If that's all it was, then I wouldn't bother to bring it up. What I'm saying
is that more focus on the beauty of a solution (e.g. robustness, scalability,
flexibility, _simplicity_) can be more financially sound in the long run than
brute force. Of course, I'm not holding my breath for this to become the
norm, since using brute force to "get what we want now" is popular because it
gets fast results (and this is not limited to the software industry) - I'm
just suggesting the real costs come later.

As for explaining being a tough skill - what I find in common with the
previous point is the lack of stress on keeping things as simple and obvious
as possible. If you truly understand something, you should be able to convey
that understanding assuming you really want to - it just takes investment,
usually by asking yourself why it is obvious to you, then putting yourself in
the other party's shoes and then finding the shortest and clearest path to
bring them from where they are to where you are. In my experience it's not a
tough skill - just requires introspection into your own understanding and not
assuming anything about what the other person understands.

~~~
taeric
I was not trying to dismiss what you said. If anything, I was condensing it
into my understanding.

The basis is not that brute force is better. The essence is most of us are not
seeking an elegant solution to a problem. We are seeking a solution to a
problem. Often, just getting that answer is all that matters. Finding a more
concise way to get it is something I fully agree that someone should be trying
to do. And, in the long term, it is a huge boon if it is found. For most
tasks, though, the original solution is all that was needed.

------
gioele
Interestingly, the updates are done on the basis of only 2 external inputs
[1]: the height of the bird in the screen and height of the aperture in the
next pipe. Using only these two parameters the neural network decides whether
to flap or not.

I would have expected at least the horizontal distance from the next pipe as
well...

[1] [https://github.com/xviniette/FlappyLearning/blob/gh-pages/ga...](https://github.com/xviniette/FlappyLearning/blob/gh-pages/game.js#L147-L150)
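
For reference, the linked lines boil down to something like this (a hedged reconstruction; the names are mine, not the actual game.js identifiers):

```javascript
// Two values, normalized to [0, 1] by the screen height, fed to the
// network every frame -- that really is the entire sensory input.
function networkInputs(birdY, nextHoleY, screenHeight) {
  return [birdY / screenHeight, nextHoleY / screenHeight];
}
```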

~~~
fenomas
Wow, that's really unexpected.

This isn't my field, but given the simplicity of the inputs and network, and
the way commenters are seeing the demo achieve perfect play after anything
between 2 and 200 generations, it makes me wonder if this isn't more of a
brute-force search than actual learning?

That is, it smells like there's a "correct" set of neuron values - where any
genome within some tolerance of those values wins forever, and any other
genome dies quickly. If that's the case, the system can't really evolve
towards a solution, can it? It would just cycle randomly through lots of
genomes that die immediately, until by pure chance one lives forever. I only
tried the demo a few times but that's what it looked like it was doing.

~~~
maaaats
"All" evolutionary algorithms are basically a local search with smart
heuristics. Where a local search is a brute-force where you move in small
directions based on feedback on where you are in the solution space.

~~~
fenomas
I understand how the algorithms work. What I'm suggesting is that the demo
seems to behave like a hill-climbing algorithm that's been unleashed on a
terrain that's flat everywhere except the solution.

~~~
maaaats
Ah, I see what you mean now. Yeah, the problem space seems a bit simple. I
never see any "learning" before it suddenly achieves perfect play.

------
JoshTriplett
At generation 83, a single bird emerged that successfully navigated through
several hundred columns (current score 100000+) and showed no signs of
failing.

It took a few dozen generations to find versions that would make it through a
few columns if those columns had gaps without too much vertical distance
between them. Somewhere in generation 60-70, I could see versions figuring out
how to transition between heights, but failing by overshooting, or not
accounting for gravity on the far side of the gap. But once it figured out how
to transition between distant heights, it had something that kept working.

GIF of the successful bird:
[https://imgur.com/a/A0li8](https://imgur.com/a/A0li8)

~~~
iMerNibor
Took 230 generations for me to do anything, then it just went from dying
within the first 3 pipes to mastering it within 4 generations.

~~~
oxalorg
On my first run, 2 individuals learnt(?) perfect play in 3 generations.

~~~
eric_h
On my second run, 3 individuals seemed to have learned perfect play by gen 8,
but after successfully navigating through dozens of pipes, two of them died
and one continued indefinitely.

------
ChicagoBoy11
This reminds me of a section (I think it might be the beginning) of "Surely
You're Joking, Mr. Feynman!" where he recounts fixing radios as a youngster,
and the outrage of a client of his that he at times would just sit and stare
at the radio. Apparently the guy went around town telling people about this
incredible thing he had witnessed -- this kid he knew fixed radios "by
thinking."

For every incredibly complicated algorithm designed to solve some really huge
problem and deployed at massive scales, there are another thousand little
problems at a far lower scale. I did a project at a school once using a GA to
automatically create class rosters based on gender balance, student social
networks, and grade distribution. This used to be a two-week long guidance
counselor project at the school. Now they press a button.

Everyone in this forum clearly understands the impact that software is having
on the world. But nevertheless, I still think we underestimate the impact that
a more computationally literate society with more readily available computing
power can have. I think examples like this excellently written piece of code
highlight that.

Here is one -- admittedly very bright -- person who can, from the comfort of
his browser, come up with such a fascinating and ingenious solution to this
computational problem. Yes, it is being used for flappy bird, but this could
easily extend to a whole set of problem domains that he/his company/his
community could be facing. Yes, it is flappy bird, but this alone
could serve as the motivation for an incredible course in a high-school
dealing with topics from coding to the theory of evolution. "The Evolution of
Cooperation" had a similar effect on me -- Axelrod uses nothing but very
elementary Algebra to make such a strong and insightful argument that got me
thinking like nothing ever had before.

Here's hoping we make this kind of "play", and this sort of thinking, as
widespread as possible in society.

------
m3adow
When I first watched it, nothing changed for about 200 generations (thanks
for the x5 speed-up). I wanted to comment on that, but decided to reload
first. Now my 12th generation has been flapping for a good two minutes while
I'm writing this comment.

It seems machine learning was only involved the second time.

~~~
taneq
Evolution is generally characterized by 'punctuated equilibrium'; it sounds
like you just started on a plateau.

------
wapz
Can anyone give me insight on where the code learning is? I cloned the
repository and after a quick glance I don't see where the "meat" is.

~~~
cedricblack
It is using a genetic algorithm, so the learning lies in the mutation and
crossover of chromosomes, as well as selection of the fittest individuals.

[https://github.com/xviniette/FlappyLearning/blob/gh-pages/Ne...](https://github.com/xviniette/FlappyLearning/blob/gh-pages/Neuroevolution.js#L172)
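
A rough sketch of the three operators the comment names, separated out (not Neuroevolution.js's actual code; this assumes the network weights are flattened into a plain array per individual):

```javascript
// Selection: keep the fittest genomes.
function select(scored, eliteCount) {
  return [...scored]
    .sort((a, b) => b.fitness - a.fitness)
    .slice(0, eliteCount)
    .map(s => s.genome);
}

// Crossover: each weight of the child comes from one of the two parents.
function crossover(a, b) {
  return a.map((w, i) => (Math.random() < 0.5 ? w : b[i]));
}

// Mutation: with some probability, jitter a weight by a random amount.
function mutate(genome, rate = 0.1, scale = 0.5) {
  return genome.map(w =>
    Math.random() < rate ? w + (Math.random() * 2 - 1) * scale : w
  );
}
```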

~~~
Fannon
It's also using a simple neural network, which is the target of the genetic
algorithm, if I understood it right. I haven't seen this combination often -
does that make sense in general, or is this just interesting as in playing
around with those concepts?

~~~
stefs
It's a pretty standard procedure to train NNs through GAs, but usually not
very efficient (e.g. compared to backpropagation).

In some cases you might lack an easy way to calculate a fitness score from
the NN's performance, which is needed to run the GA.

I tried training a simple NN with a stupid hill climber some time ago but
quickly hit a roadblock even with very few neurons, because of local minima
... or maybe bugs.

I guess for more complicated problems the pure GA training method might just
not be "cost effective" (i.e. a time/quality tradeoff).

~~~
maaaats
> _It's a pretty standard procedure to train NNs through GAs, but usually not
> very efficient (e.g. compared to backpropagation)._

Different applications, though. Backpropagation in the normal sense needs
inputs and expected outputs (i.e. lots of training data), while a GA/EA
learns to solve the task without explicit wrong/correct actions, just the
score at the end.
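
The contrast can be sketched like this, where `playGame` is an assumed black-box that runs one genome through the game and returns only its final score:

```javascript
// The GA sees exactly one number per individual -- no per-frame labels.
function evaluate(population, playGame) {
  return population.map(genome => ({ genome, fitness: playGame(genome) }));
}

// Supervised training, by contrast, would need labeled pairs for every
// frame, something like { inputs: [birdY, holeY], target: flapOrNot }.
```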

------
Novex
Hmmm.. I made the pipes move up and down and now the neural network isn't
doing so well :P

I've added "vertical velocity of next pipe" and "distance until next pipe" as
inputs as well as upped the hidden layer neurons to 4 - anyone have any other
ideas to improve its performance?
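
For what it's worth, the extended input vector described above might look something like this (the field names are hypothetical, not actual game.js fields):

```javascript
// Four normalized inputs: the two originals plus the two additions
// mentioned above (pipe's vertical velocity, horizontal distance).
function extendedInputs(bird, pipe, screen) {
  return [
    bird.y / screen.height,           // original input 1: bird height
    pipe.holeY / screen.height,       // original input 2: hole height
    pipe.verticalVelocity,            // new: pipe's up/down speed
    (pipe.x - bird.x) / screen.width, // new: distance until next pipe
  ];
}
```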

------
RickS
As somebody who's never looked into this kind of task before, I couldn't
believe it was powered by a ~300 line file and not some massive library.

Really awesome.

------
SCHiM
Ahh yes, this is very interesting, although I agree with many of the posters
here that the model is probably too simplistic even for demo purposes.

For programmers there are even more striking examples of AI out there,
complete with code, documentation and explanations.

Check this out:

[http://www.primaryobjects.com/2013/01/27/using-artificial-in...](http://www.primaryobjects.com/2013/01/27/using-artificial-intelligence-to-write-self-modifying-improving-programs/)

This sparked in me a months-long fascination with genetic algorithms solving
programming puzzles from a few sets of 'initial memory states' and 'desired
memory states'.

The articles start with very simplistic generated programs, and end with a
sophisticated model able to generate guess-the-number games using loops,
conditionals and handling input from users.

------
james_a_craig
Generation 7 produced a perfect result here. I think the underlying problem is
just too easy, really.

------
staticelf
This is really cool! I am at generation 17 and it already shows no signs of
failing.

------
reustle
This is really cool, and I'm nowhere near smart enough to fully understand
how it works, but I'm wondering: is the "computer" blind in this case? It
doesn't seem to understand, even after ~50 runs, that it needs to aim for the
opening. It seems to simply rely on luck, hoping that some birds line up. And
if one lines up a few times, it will get further, so it is learning
something, but it still doesn't seem to see the change the next time.

Edit: Seeing the new comments here, it seems like it only starts learning
after you refresh once or twice. Very cool!

------
hathers
Hit 1M points, 73 generations! Little guy just keeps on going.
[http://imgur.com/a/HFwRC](http://imgur.com/a/HFwRC)

------
NicoJuicy
Where can i learn more about NeuroEvolution or similar algorithms?

It's amazing this fits in so few LoC, but I don't see the total picture.. Any
books you can recommend?

~~~
w23j
The big picture in this case is:

You use a neural network to decide at each step whether to flap the bird's
wings. That is the only output. The input is only the bird's height
(y-position) and the height of the next hole.
([https://en.wikipedia.org/wiki/Feedforward_neural_network](https://en.wikipedia.org/wiki/Feedforward_neural_network))

The question now is how to find the correct weights for the net.

That net is not trained in a traditional supervised way like with gradient
descent (which would be more complicated). Instead it uses a genetic
algorithm to find new weights for the neural networks of future generations.
([https://en.wikipedia.org/wiki/Genetic_algorithm](https://en.wikipedia.org/wiki/Genetic_algorithm))

And that's basically it in this case.

------
XaspR8d
I got a perfect evolution... on generation 1. Sooo I guess who needs genetics
when you're born perfect?

(Naturally I came to the comments because I thought this was BS)

------
JasonSage
I accidentally left this open in another tab for several hours and it got to
over 550k score on generation 35. That was fun to see!

~~~
jeanlucas
Here on generation 33 the games started to last longer, more than 100k score.
Curious.

EDIT: actually generation 33 is still happening, even on "5x" mode, 260k
points and counting. Looks like it totally got how the game works.

------
guiomie
The first time I ran it, it seemed to have "learnt" perfectly after 4
generations ... Now I'm at 16, still no luck.

------
michaelsbradley
For fun, try replacing the setTimeout usage in game.js with calls to
setImmediate[+]. My gen 57 birdie (x4) is increasing its score by approx 2,000
points/sec.

[+]
[https://github.com/YuzuJS/setImmediate/](https://github.com/YuzuJS/setImmediate/)

------
pokpokpok
Awesome! If anyone is interested in this, I wanted to show off my similar
neuro-evolution toy, pond:
[http://maxbittker.github.io/pond/](http://maxbittker.github.io/pond/)

These agents live in a world where they're trying to find and eat food.

------
thomasdd
Cool. 16 generations and that's it!

------
michaelsbradley
Exercise: can you find in the Neural Network Zoo the closest match for the
Machine Learning code that powers this demo?

[http://www.asimovinstitute.org/neural-network-zoo/](http://www.asimovinstitute.org/neural-network-zoo/)

------
ReverseCold
My 3rd generation flappy bird seems to be invincible.

[http://i.imgur.com/Zz82dp4.png](http://i.imgur.com/Zz82dp4.png)

Will update when/if it dies.

EDIT: How lucky did I just get? Still going at 100k+

------
a_imho
Gen 37 on Chromium.

Didn't work on FF, maybe it has to do with my plugins/privacy settings.

------
turtleofdeath
Watched this for a solid three minutes and flappy birds never made it past the
third column. Painful.

Edit: generation 25 got 10k points and was continuing onward when I stopped.
Beautiful.

------
NicoJuicy
If I'm not mistaken, the deciding factor should be how many pipes it passed.

But here it is: "should it flap the wings or not".

What's the difference? Is this a better way?

------
Lich
When I closed it (played some games, then checked on it after about 1.5
hours), it was generation 22, Alive 2/50, Score 561,000.

------
NicoJuicy
Any implementation/advice on how to speed up the algorithm by doing one or
two rounds of supervised learning, or how to implement this?

------
wruza
We need another genetic network that will play against successful birds,
placing gates in a way that will kill them.

Robot wars, all that!!!

------
ishaanbahal
Wow, this was so cool. Generation 17, two birds, 18K points, still going. I
want to learn machine learning now...

------
ericmo
Has anyone tried on Firefox? I'm running FF49 and in gen 50 it has less than
400 max score.

------
nhatbui
Generation 48. What about you guys?

~~~
girvo
Generation 10, which was weird, but got to 70k points before I closed it. I
wonder why mine was so much lower, sheer luck?

~~~
icelancer
Yep. A good metaphor for evolution in vivo.

------
amelius
Does it still learn if there is only 1 bird left in the simulation?

------
anotheryou
haha, generation 3 and already at score > 10000

one lone hero

------
turtleofdeath
Can this be used on, say, Mario Bros?

~~~
notgood
You mean this script? Or machine learning in general? If the latter, then it
has already been done:
[https://www.youtube.com/watch?v=qv6UVOQ0F44](https://www.youtube.com/watch?v=qv6UVOQ0F44)
(great educational video)

~~~
jeanlucas
Looking at the author's GitHub profile, he made a lib to abstract the
artificial intelligence and used it in some other games.

------
monoid
Didn't the original Flappy Bird have randomized tube placement?

------
alpineidyll3
Fantastic post

------
baboun
Together as one, only the strong will survive.

