Hacker News
A pile of matchboxes that can learn [video] (youtube.com)
71 points by ColinWright on Nov 15, 2017 | 14 comments



I really like the physicality of this method of teaching people about ML. Oftentimes people get lost in all the abstraction, especially if they haven't been trained to be comfortable with an abstract concept on top of an abstract concept.

If you want to play against MENACE you can do so here: http://www.mscroggs.co.uk/menace/.


As mentioned in the video, I managed to get MENACE to resign on the first turn after beating it just under 40 games in a row.


I first "encountered" MENACE in a children's book on robotics after I got my first city library card when I was in the second grade, around age 7.

The explanation of how it worked, from what I recall, was very brief, and I didn't understand it much at all.

But it was my first introduction to the concept - outside of my experiences with "science fiction" of the time - that an inanimate "machine" could actually learn. This really ignited my passion for computing and robotics, something I have carried with me since.

It ultimately led me to becoming a software engineer, and to exploring machine learning and artificial intelligence over the years as well.


When I was a kid I made a tic-tac-toe-playing matchbox automaton. It was described in one of Martin Gardner's books.


Why is this better than drawing a state diagram on paper?


For you or me, maybe it isn't. But for someone whose background isn't computing, or who is a more physical learner (say, my kids), this is a pretty easy-to-follow introduction to the idea of reinforcement learning that is intuitive and naturally leads to further discussion.


Because it's tangible and interactive and approachable and memorable. It's something that anyone (read: everyone who doesn't know what state diagrams are) can visualize and understand within a few minutes at a fun little booth at a public science fair.


The matchboxes are a state diagram, so if someone can't comprehend a paper diagram they can't comprehend this either.


I know people for whom this is much more understandable than state diagrams. They are equivalent under an appropriate isomorphism, but that doesn't mean that they are equally easy to understand for everyone.

They aren't. You made this assertion:

    ... if someone can't comprehend a
    paper diagram they can't comprehend
    this either.
I believe you are wrong.


Because it demonstrates model-free reinforcement learning without using a computer!


A 15 min video. Hmmm. Maybe somebody who watched it wants to recap?


It plays tic-tac-toe. There are 304 matchboxes, one for each possible state of the board[1]. Each matchbox contains colored beads; each color corresponds to a possible next move.

You take the state of the board, find the corresponding matchbox, draw one bead at random out of the box, and make the move.

Depending on whether the game is a win, loss, or draw, you add or remove beads in the matchboxes used during the game, increasing or decreasing the odds of making those moves again.

After about 150 games, the matchboxes almost always draw against a competent opponent.

There is roughly a 1/10 chance that the matchboxes get into a bad state where a matchbox ends up empty. This does not happen in the video.

[1] In this scenario, the matchboxes always go first. Letting it go second slightly more than doubles the number of matchboxes.
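The procedure above is simple enough to sketch in code. This is a minimal, hypothetical Python rendering of the bead-draw and reinforcement steps; it is not the original MENACE design. For brevity it skips the symmetry reduction that brings the box count down to 304 (so it creates more "boxes"), uses assumed bead counts and reward sizes, and sidesteps the empty-box failure mode mentioned above by never letting a move's bead count drop below one.

```python
import random

# Board: tuple of 9 cells, each ' ', 'X', or 'O'. The matchbox player
# is 'X' and always moves first, as in the video.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

class MatchboxPlayer:
    def __init__(self):
        # One "matchbox" per board state seen: a dict of move -> bead count.
        self.boxes = {}

    def _box(self, board):
        if board not in self.boxes:
            free = [i for i, c in enumerate(board) if c == ' ']
            self.boxes[board] = {m: 4 for m in free}  # assumed initial beads
        return self.boxes[board]

    def choose(self, board):
        # Draw one bead at random; a move's probability is proportional
        # to its bead count.
        box = self._box(board)
        beads = [m for m, n in box.items() for _ in range(n)]
        return random.choice(beads)

    def learn(self, history, result):
        # Reinforce every (state, move) pair used this game.
        delta = {'win': 3, 'draw': 1, 'loss': -1}[result]  # assumed rewards
        for board, move in history:
            box = self.boxes[board]
            box[move] = max(1, box[move] + delta)  # clamp: no empty boxes

def play_game(player, opponent_move):
    board, history, turn = tuple(' ' * 9), [], 'X'
    while True:
        if turn == 'X':
            move = player.choose(board)
            history.append((board, move))
        else:
            move = opponent_move(board)
        board = board[:move] + (turn,) + board[move + 1:]
        w = winner(board)
        if w == 'X':
            player.learn(history, 'win'); return 'win'
        if w == 'O':
            player.learn(history, 'loss'); return 'loss'
        if ' ' not in board:
            player.learn(history, 'draw'); return 'draw'
        turn = 'O' if turn == 'X' else 'X'

def random_opponent(board):
    return random.choice([i for i, c in enumerate(board) if c == ' '])
```

Training is then just playing repeatedly, e.g. `for _ in range(150): play_game(p, random_opponent)`; bead counts drift toward the moves that led to wins and draws.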


This is very similar to the technique described in Fred Saberhagen's short story "Without a Thought."


There's a textual explanation here: http://www.mscroggs.co.uk/blog/19



