
Neuromorphic computing implementation of pole balancing - kleebeesh
http://neuromorphic.eecs.utk.edu/pages/research-demos/
======
obstbraende
They use evolutionary search to discover spiking neural networks whose
response dynamics can solve a control task. This is a fascinating approach,
but one that I've only ever seen as a means to do theoretical neuroscience: A
way to obtain interesting spiking networks whose dynamics we can study in the
hope of developing mathematical tools that will help understand biological
networks.

But here, from the claims in the post and the lab website, it sounds as if the
goal lies in application: creating better, more efficient controllers. This
comes across as a little detached from the applied machine learning
literature. At the least, I missed a comparison to reinforcement learning
(which has a history of learning to solve this exact task with simpler
controller designs and most likely shorter search times) and also to non-bio-
inspired recurrent networks.

One more point: Even if I follow along with the claim that 'deep learning'
approaches don't have memory (implying recurrent networks aren't included in
that label), I want to point out that this particular task setup, with
positions/angles as well as their rates of change provided, can be solved by a
memoryless controller. It would have done more to highlight the strengths of
the recurrent network approach if a partially observable benchmark task had
been used, e.g. feeding positions and angles only. Much more difficult,
high-dimensional tasks, e.g. in robotic control, are tackled in the (deep)
reinforcement learning literature, among others.
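
To make that concrete, here is a minimal sketch of such a memoryless controller (hand-picked illustrative gains and textbook cart-pole dynamics, not anything from the linked work): static feedback on the observed angle and its rate, with no internal state at all, is enough to bring the pole back upright when rates are part of the observation.

```python
import math

def cartpole_step(state, force, tau=0.02):
    """One Euler step of the classic cart-pole dynamics (textbook constants)."""
    x, x_dot, theta, theta_dot = state
    g, m_cart, m_pole, half_len = 9.8, 1.0, 0.1, 0.5
    total = m_cart + m_pole
    temp = (force + m_pole * half_len * theta_dot**2 * math.sin(theta)) / total
    theta_acc = (g * math.sin(theta) - math.cos(theta) * temp) / (
        half_len * (4.0 / 3.0 - m_pole * math.cos(theta)**2 / total))
    x_acc = temp - m_pole * half_len * theta_acc * math.cos(theta) / total
    return (x + tau * x_dot, x_dot + tau * x_acc,
            theta + tau * theta_dot, theta_dot + tau * theta_acc)

def memoryless_controller(state):
    """Static feedback on the observed angle and angular rate -- no memory.
    The gains (40, 6) are hand-picked for illustration; the cart itself drifts
    because position is left uncontrolled (a full LQR would add position gains)."""
    _, _, theta, theta_dot = state
    return max(-10.0, min(10.0, 40.0 * theta + 6.0 * theta_dot))

state = (0.0, 0.0, 0.05, 0.0)  # start with the pole tilted 0.05 rad
for _ in range(300):
    state = cartpole_step(state, memoryless_controller(state))
print(abs(state[2]))  # pole angle has settled back near zero
```

With rates removed from the observation, this controller no longer has enough information, which is exactly where a recurrent approach would earn its keep.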

------
imaginenore
Two-pole balancing (2009):

[https://www.youtube.com/watch?v=fqk2Ve0C8Qs](https://www.youtube.com/watch?v=fqk2Ve0C8Qs)

Double inverted pendulum balancing (2015), a much harder task:

[https://www.youtube.com/watch?v=8t3i2WPpIDY](https://www.youtube.com/watch?v=8t3i2WPpIDY)

Double inverted pendulum balancing with a physical cart (2011), a much much
harder task:

[https://www.youtube.com/watch?v=B6vr1x6KDaY](https://www.youtube.com/watch?v=B6vr1x6KDaY)

Triple!!! inverted pendulum balancing with a physical cart (2011), a much much
much harder task:

[https://www.youtube.com/watch?v=cyN-CRNrb3E](https://www.youtube.com/watch?v=cyN-CRNrb3E)

------
Animats
The single inverted pendulum balancing problem has been solved using neural
nets, fuzzy logic, and nonlinear control theory. It's a standard problem in
controls classes.

Here's a system learning how to do this.[1] Takes about 200 trials.

Here's the _triple_ inverted pendulum balancing problem, solved using
feedforward control.[2]

[1] [https://www.youtube.com/watch?v=Lt-KLtkDlh8](https://www.youtube.com/watch?v=Lt-KLtkDlh8)

[2] [https://www.youtube.com/watch?v=cyN-CRNrb3E](https://www.youtube.com/watch?v=cyN-CRNrb3E)

------
alexbeloi
Are there any non-evolutionary training methods for neuromorphic networks?

Currently, fixed architecture ANNs can solve the cartpole problem very quickly
already with Q-learning or policy gradient methods:
[https://gym.openai.com/envs/CartPole-v0](https://gym.openai.com/envs/CartPole-v0)
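
As a rough illustration of how cheap the full-state version of this task is, even plain random search over a fixed linear policy (a simpler baseline still than Q-learning or policy gradients) typically finds a balancing controller within a few hundred samples. A self-contained sketch with a toy cart-pole simulator and illustrative constants, not the Gym implementation itself:

```python
import math, random

def cartpole_step(state, force, tau=0.02):
    # Classic cart-pole dynamics, Euler-integrated (textbook constants).
    x, x_dot, theta, theta_dot = state
    g, m_cart, m_pole, half_len = 9.8, 1.0, 0.1, 0.5
    total = m_cart + m_pole
    temp = (force + m_pole * half_len * theta_dot**2 * math.sin(theta)) / total
    theta_acc = (g * math.sin(theta) - math.cos(theta) * temp) / (
        half_len * (4.0 / 3.0 - m_pole * math.cos(theta)**2 / total))
    x_acc = temp - m_pole * half_len * theta_acc * math.cos(theta) / total
    return (x + tau * x_dot, x_dot + tau * x_acc,
            theta + tau * theta_dot, theta_dot + tau * theta_acc)

def episode_length(weights, max_steps=200):
    # Bang-bang policy: push right if the weighted state sum is positive.
    state = (0.0, 0.0, 0.05, 0.0)
    for step in range(max_steps):
        force = 10.0 if sum(w * s for w, s in zip(weights, state)) > 0 else -10.0
        state = cartpole_step(state, force)
        if abs(state[0]) > 2.4 or abs(state[2]) > 0.21:  # off track / pole down
            return step
    return max_steps

random.seed(0)
best = 0
for trial in range(500):
    weights = [random.uniform(-1, 1) for _ in range(4)]
    best = max(best, episode_length(weights))
    if best == 200:
        break
print(trial + 1, best)  # samples used, best episode length
```

That a memoryless four-weight policy found by blind sampling solves the task is the baseline any fancier controller search should be measured against.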

It seems like some kind of neuromorphic network is going to be necessary for
the long-term AI 'dream', but there really needs to be something better than
evolutionary algorithms for training; those just don't scale.

~~~
parkerm7
Learning algorithms remain a big issue in neuromorphic computing. So far, we
have had some success with evolutionary optimization algorithms. The nice
aspect of EO at this stage of the game is that it can reveal useful
patterns and network features which could guide later learning systems.

I am working on a project right now which will require a more complex network
interacting with many real world sensors, so it should be interesting to see
how EO performs for such a problem. Our EO has been demonstrated to scale on
parallel machines as large as ORNL's Titan, but I readily acknowledge that
much more work needs to be done with learning algorithms.

------
mooneater
"A Deep Learning system does not have a temporal component" but isnt that what
recursive deep learning is for.

And how do they calibrate low/middle/high.

~~~
kleebeesh
Roughly speaking, low/middle/high are just buckets, each of which represents a
third of the distribution for position, velocity, or angle.

We attempt to illustrate that in this image:
[http://neuromorphic.eecs.utk.edu/images/graphics/Inv-Examples.jpg](http://neuromorphic.eecs.utk.edu/images/graphics/Inv-Examples.jpg)
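
In code, the bucketing is roughly this (an illustrative sketch, not our actual implementation): split each variable's observed distribution into terciles and report which third a new reading falls into.

```python
def tercile_bounds(samples):
    # Boundaries at the 1/3 and 2/3 points of the sorted samples.
    s = sorted(samples)
    n = len(s)
    return s[n // 3], s[(2 * n) // 3]

def bucket(value, bounds):
    lo, hi = bounds
    if value < lo:
        return "low"
    return "middle" if value < hi else "high"

# Hypothetical pole-angle samples, in radians.
angles = [-0.2, -0.1, -0.05, 0.0, 0.02, 0.05, 0.1, 0.15, 0.2]
b = tercile_bounds(angles)
print(bucket(0.12, b))  # -> high
```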

~~~
mooneater
Ok so this looks like Fuzzy Logic? Or does the system automatically determine
the low/middle/high boundaries?

------
philipov
I understand that some of the inputs don't affect the behavior and so don't
connect to anything, but why do some of the synapses not connect to anything?
In particular, what is the meaning of the two nodes above and below the
outputs? They are on long chains that appear meaningless. Why did the network
grow there; is it just evolutionary noise?

~~~
gbruer
Unless I'm looking at it wrong, the synapse above the output takes input from
2 up and 2 to the left, which is a path that leads back to an input.

But you're right that the synapse right below the outputs has no input, which
doesn't make any sense. The original generated network had a neuron with no
inputs at (6,12) just to the left of the outputs, left over from the
evolutionary algorithm like you suggested. We ran a program that was supposed
to prune the network by removing the evolutionary noise that had no effect on
the outputs. It looks like that program just removed any neurons with no
inputs, leaving behind that meaningless path of synapses.
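
An illustrative sketch of the bug (toy names, not our actual code): a single pass that drops only input-less neurons leaves the rest of a dead chain dangling, whereas keeping only nodes that lie on some input-to-output path removes the whole chain.

```python
def prune_single_pass(nodes, edges):
    """Buggy: drop nodes with no incoming edge, in one pass."""
    has_input = {dst for _, dst in edges}
    keep = {n for n in nodes if n in has_input or n.startswith("in")}
    return keep, [(s, d) for s, d in edges if s in keep and d in keep]

def prune_reachable(nodes, edges, inputs, outputs):
    """Keep only nodes reachable forward from inputs AND backward from outputs."""
    def reach(starts, adj):
        seen, stack = set(starts), list(starts)
        while stack:
            for nxt in adj.get(stack.pop(), []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen
    fwd, bwd = {}, {}
    for s, d in edges:
        fwd.setdefault(s, []).append(d)
        bwd.setdefault(d, []).append(s)
    live = reach(inputs, fwd) & reach(outputs, bwd)
    return live, [(s, d) for s, d in edges if s in live and d in live]

# "in -> a -> out" is useful; "dead -> c1 -> c2 -> out" is evolutionary leftover.
nodes = ["in", "a", "out", "dead", "c1", "c2"]
edges = [("in", "a"), ("a", "out"), ("dead", "c1"), ("c1", "c2"), ("c2", "out")]

kept, _ = prune_single_pass(nodes, edges)   # c1, c2 survive: dangling chain
live, _ = prune_reachable(nodes, edges, {"in"}, {"out"})  # chain fully removed
print(sorted(kept), sorted(live))
```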

------
musesum
How was the wiring between neurons and synapses created?

~~~
gbruer
There's a brief description here:
[http://neuromorphic.eecs.utk.edu/pages/research-overview/#programming-via-evolutionary-optimization](http://neuromorphic.eecs.utk.edu/pages/research-overview/#programming-via-evolutionary-optimization).

Basically, the inputs and outputs are fixed, random neurons and connections
are added, the networks are all tested, the better ones are changed and/or
combined, and then repeat until a "good" network is found, where good in this
case means balancing the pole for a long time without running into the walls.
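
The loop is roughly this (an illustrative sketch on a stand-in problem, not our actual EO code), where "fitness" plays the role of "how long the network balanced the pole":

```python
import random

def evolve(random_candidate, mutate, combine, fitness,
           pop_size=20, good_enough=0.9, max_gens=100):
    # Generic evolutionary loop: score the population, keep the fitter half,
    # refill by combining and mutating survivors, repeat until good enough.
    population = [random_candidate() for _ in range(pop_size)]
    for _ in range(max_gens):
        scored = sorted(population, key=fitness, reverse=True)
        if fitness(scored[0]) >= good_enough:
            return scored[0]
        survivors = scored[:pop_size // 2]
        children = [mutate(combine(random.choice(survivors),
                                   random.choice(survivors)))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)

# Toy stand-in problem: evolve a bit-string toward all ones.
random.seed(1)
target_len = 16
best = evolve(
    random_candidate=lambda: [random.randint(0, 1) for _ in range(target_len)],
    mutate=lambda g: [b ^ (random.random() < 0.05) for b in g],     # rare bit flips
    combine=lambda a, b: [random.choice(p) for p in zip(a, b)],     # uniform crossover
    fitness=lambda g: sum(g) / target_len,
    good_enough=1.0)
print(sum(best))
```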

~~~
musesum
Interesting. This could map onto a cellular automata. Do the connections
evolve from left to right? Or do you start with a static size universe?

~~~
gbruer
The grid size and the location of the inputs and outputs are fixed throughout
the process. Are you suggesting the connections could grow left to right as it
evolves? I think that'd be interesting, but I'm not sure if it would be
useful.

