
Learning by playing - stablemap
https://deepmind.com/blog/learning-playing/
======
samuell
Been occasionally pestering family and some colleagues for the last six months
to a year about the idea that the "playfulness" of children is why they learn
so quickly, and that this is relevant to ML :P

Children build up a continuously more sophisticated "hypothesis" of how the
world works, and are always inspired (by some neural process?) to explore the
limits of that hypothesis by perturbing things in new little fun ways ...
throw away that ball ... upwards, downwards, sideways ... bite it, taste it,
sit on it, stand on it.

I guess there is some randomness to the new experiments chosen to try, but
they are also based on the continuously improved understanding of the world.
Realizing that a ball is bouncy opens up a world of new "fun" experiments to
try, and so on.

Well, pretty obvious to every parent of course :D ... but in any case a really
major and important area to learn from, I think.

------
QasimK
From my understanding they can train the agent to accomplish a fairly complex
task, by first "training" the agent to be able to accomplish simpler tasks
that are perhaps only slightly related to the final goal. If you want to learn
to run, first learn the more basic skills of balancing, standing up and
walking. The agent appears to decide what task it wants to try to pursue, but
still receives signals about all the tasks available to it, and has a way of
planning and following through on what future tasks to carry out.
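
To make that concrete, here is a minimal toy sketch of the idea as I understand it. It is not the paper's implementation: the 1-D world, the task names ("stand", "walk", "run"), the tabular Q-learning and the epsilon-greedy scheduler are all assumptions made up for illustration. What it does show is the two pieces described above: the agent picks one task to pursue, but every step yields a reward signal for all tasks, so all of them learn from the same experience.

```python
import random

ACTIONS = (-1, +1)                          # step left or right on a line
GOALS = {"stand": 1, "walk": 3, "run": 5}   # hypothetical tasks; "run" is the main one
MAX_POS = 5

def all_task_rewards(pos):
    """Reward for every task at once: 1.0 if pos has reached that task's goal."""
    return {task: 1.0 if pos >= goal else 0.0 for task, goal in GOALS.items()}

def train(episodes=200, steps=10, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # One Q-table per task, all updated from the same stream of experience.
    q = {t: {(p, a): 0.0 for p in range(MAX_POS + 1) for a in ACTIONS} for t in GOALS}
    # Scheduler's running estimate of the main-task return after pursuing each task.
    score = {t: 0.0 for t in GOALS}
    for _ in range(episodes):
        # Scheduler: epsilon-greedy choice of which task to pursue this episode.
        if rng.random() < epsilon:
            task = rng.choice(list(GOALS))
        else:
            task = max(GOALS, key=lambda t: score[t])
        pos, main_return = 0, 0.0
        for _ in range(steps):
            # Act according to the chosen task's Q-table (epsilon-greedy).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[task][(pos, a)])
            nxt = min(MAX_POS, max(0, pos + action))
            rewards = all_task_rewards(nxt)
            # Every task learns from this transition, not just the one being pursued.
            for t in GOALS:
                best_next = max(q[t][(nxt, a)] for a in ACTIONS)
                q[t][(pos, action)] += alpha * (rewards[t] + gamma * best_next - q[t][(pos, action)])
            main_return += rewards["run"]
            pos = nxt
        # Credit the chosen task with how much main-task reward it led to.
        score[task] += 0.1 * (main_return - score[task])
    return q, score
```

Calling `q, score = train()` returns the per-task Q-tables and the scheduler's scores. The point of the shared update loop is that time spent practising "stand" still produces data the "run" table can learn from.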

I wonder if there's any way to automatically generate useful, simple goals?

~~~
stokasto
Co-author of the paper here. That is a good high-level summary of the approach
:).

Generating useful "simple"/"low-level" goals automatically is indeed an
interesting avenue for pushing this further.

~~~
craigus
Hi Tobias, thanks for dropping into the thread.

When this kind of software becomes mature enough that a consultant can install
a robotic arm on a factory line, and quickly (several hours) train it to do
the job of a factory line worker, there will be a massive economic incentive
to do so.

How far do you think we are from this level of maturity? What are the
remaining steps required to reach that level?

------
zappo2938
I know this has to do with machine learning for robots, but play was central to
Plato's Socratic method, and I vaguely remember that in the Republic it was
suggested that children start to learn by playing. Perhaps we should call him
the philosopher Play-Doh. (Sorry, I couldn't resist. I'll show myself the
door.)

~~~
posterboy
Wait a second, it's pronounced Plateau, like what a learning agent would reach
eventually.

The long O is obviously French. And actually it's transliterated Platon
(Πλάτων), but who knows, maybe the English knew that ν (nu) and υ (upsilon)
are easy to confuse. I was joking, but hey, who knew that reading the wiki
article on "ν" would teach me that it was sometimes optional, used to link two
words.

------
dreamling
Truly fascinating, I really commend the article for being easy to... grasp.
The virtual animations and the
video (https://www.youtube.com/watch?v=mPKyvocNe_M)
of the real life robo arm really made the concept crystal clear.

I bet it was exciting to see SAC-X graduate to each concept.

Did it ever surprise you?

