AI Overcomes Stumbling Block on Brain-Inspired Hardware (quantamagazine.org)
55 points by pseudolus on Feb 18, 2022 | 20 comments



Having started my PhD originally in neuromorphics before I switched lanes: no, they didn't, at least not the important one.

They still perform gradient descent using a GPU. I love BrainScaleS, but until we have analog/neuromorphic training, the elephant in the room of "why not make an ASIC for the pretrained model" remains. We can do robust training on GPU already.

There is interesting work being done with predictive-coding-based training that might fix it, but as far as I know it isn't there yet.


Thanks for your comment, one of the authors here…

Fully self-learning systems are certainly one of the overarching goals of our field. Unsurprisingly, there are many challenges to be solved along the way.

> why not make an ASIC for the pretrained model

Our paper does not really touch the topic of deployment (except for the study on post-deployment degradation of the circuits, maybe). Model-specific ASICs, however, would likely not pose an economically viable solution.

> We can do robust training on GPU already.

We certainly can! Deploying those trained models on novel, "imperfect" hardware is the challenge.


I hope I did not cause offense; for neuromorphics this is a wonderful paper, and it's important to do basic research like this! I'm just a bit jaded after 5 years of following the literature and seeing most papers sidestep what I see as the big road block.

We can now train sparse, quantized, robust neural networks which are already specified in terms of primitives for which ASIC macros can easily be designed. If we are going to make a new chip anyway, IP like this is the benchmark I compare against in my mind.
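
As a concrete example of what "train quantized" means in practice, the usual trick is fake quantization with a straight-through estimator. A minimal PyTorch sketch (the number of levels is an arbitrary placeholder, not any particular chip's precision):

    import torch

    class FakeQuant(torch.autograd.Function):
        """Round weights to a small, ASIC-friendly set of levels in the
        forward pass; pass gradients straight through in the backward pass."""

        @staticmethod
        def forward(ctx, w, levels=15):
            scale = w.abs().max() / (levels // 2) + 1e-8
            return torch.round(w / scale) * scale

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output, None  # straight-through estimator

    # Quantize on the fly during training: the optimizer keeps full-precision
    # weights, while the loss only ever sees the quantized ones.
    w = torch.randn(128, 128, requires_grad=True)
    w_q = FakeQuant.apply(w)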

If we want flexibility, FPGAs are being integrated with modern CPUs and will allow you to program precise weights if you want them, making it more feasible to do complex tasks.

So this is regarding the point of ASICs. I don't want to bash your paper, but to me this is the competition to beat, and why I reacted to the title given by Quanta with context that I think is important for people not familiar with the literature.

I fully believe neuromorphic or neuromorphic-inspired inference engines will (continue to) have their place.

As for the deployment of robust weights to imperfect hardware, an ex-colleague of mine started this line of research when I did my internship at IBM: https://www.nature.com/articles/s41467-020-16108-9

So I meant robust in this sense: robust to deployment on real devices.
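
The core idea in that line of work, roughly, is to perturb the weights during training so the learned solution tolerates device variations. A minimal PyTorch sketch of that idea, not the exact recipe of the linked paper; the multiplicative Gaussian noise model and its scale are made-up placeholders:

    import torch
    import torch.nn as nn

    class NoisyLinear(nn.Linear):
        """Linear layer that perturbs its weights with multiplicative Gaussian
        noise on every forward pass, mimicking device-to-device variability of
        analog hardware. The noise scale is an arbitrary placeholder."""

        def __init__(self, in_features, out_features, noise_std=0.05):
            super().__init__(in_features, out_features)
            self.noise_std = noise_std

        def forward(self, x):
            if self.training:
                # Fresh noise each step, so the network learns weights whose
                # function is insensitive to these perturbations.
                noise = 1.0 + self.noise_std * torch.randn_like(self.weight)
                return nn.functional.linear(x, self.weight * noise, self.bias)
            return super().forward(x)

    # Toy usage: train as usual, only the layers change.
    model = nn.Sequential(NoisyLinear(784, 128), nn.ReLU(), NoisyLinear(128, 10))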


To your point of on-chip analog neuromorphic training, there is some recent work with one of the same authors [1] on event-based backprop in spiking neural networks. So far they only have simulations, but this is likely an important step toward fully integrated, scalable training of SNNs on neuromorphic hardware.

[1] https://www.nature.com/articles/s41598-021-91786-z
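
For anyone wondering how backprop is made to work with non-differentiable spikes at all: the common workaround in simulation is a surrogate gradient, sketched below in PyTorch. Note that [1] instead derives exact event-based gradients; this is only the simpler, more widespread approximation, and the constants are illustrative:

    import torch

    class SurrogateSpike(torch.autograd.Function):
        """Heaviside spike in the forward pass, smooth 'fast sigmoid'
        derivative in the backward pass."""

        @staticmethod
        def forward(ctx, v_minus_thresh):
            ctx.save_for_backward(v_minus_thresh)
            return (v_minus_thresh > 0).float()

        @staticmethod
        def backward(ctx, grad_output):
            (v,) = ctx.saved_tensors
            beta = 10.0  # surrogate steepness, an arbitrary choice
            return grad_output / (1.0 + beta * v.abs()) ** 2

    spike = SurrogateSpike.apply

    # One step of a leaky integrate-and-fire layer; decay and threshold are placeholders.
    def lif_step(x, v, w, decay=0.9, thresh=1.0):
        v = decay * v + x @ w   # leaky membrane integration
        s = spike(v - thresh)   # non-differentiable spike, surrogate gradient
        v = v - s * thresh      # soft reset after a spike
        return s, v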


This is part of the research I find more exciting, but the challenge is to actually make this work for things which aren't MNIST. I might be wrong on this, but I haven't seen any novel learning rules deal with Fashion-MNIST or CIFAR so far. MNIST can be solved based on image statistics alone and is a bad check in this regard: almost everything can learn MNIST.


One of the co-authors here. Demonstrating inference on a neuromorphic chip can already be challenging, especially if it contains analog components. This is a way to make this kind of hardware "useful" on a given task. Of course learning on chip is the holy grail, but given that this kind of hardware has potential latency and power-consumption advantages for small (event-based) data, a general way of optimising the weights is pretty cool, I think.


So what advantages does implementing spiking NNs in hardware have over implementing non-spiking NNs in hardware? Usually people say "better power efficiency", but I have never seen apples-to-apples comparisons. Is a mixed-signal chip running a spiking NN actually more efficient than a mixed-signal chip running a non-spiking (traditional, GEMM-based) NN? Have such comparisons been done in the literature? If not, where is this claim coming from?

Also, why do people try to implement SNNs in hardware, when they don't work well in software? Shouldn't we first try to figure out how the brain actually does it (processes information), and only then try to build expensive specialized hardware for it?


The chip under discussion is able to do both (analog matrix multiplication and SNN operation). Briefly, there are a bunch of "nano" devices which are more amenable to spike-based operation. Moreover, analog computation is hard to scale up, so the layer-wise digitization, weight loading, and communication wipe out a lot of the potential benefit in the case of analog matrix accelerators. Part of the advances in recent years make SNNs work "well enough" in software, especially considering the relatively smaller overall investment in them.


But the BrainScaleS chip was built to run spiking ops, so even when it does analog matmul, it's not optimized to do it exclusively and end to end, right? How about we compare it to a chip that was designed to perform analog matmul, for example this one: https://www.mythic-ai.com/product/m1076-analog-matrix-proces...

If we measured the forward-pass time to run something like ResNet-50 on ImageNet, taking into account any accuracy degradation, and compared it to what BrainScaleS can do with the SNN equivalent of ResNet-50, that would be interesting.

Don't get me wrong, what you did there is nice (chip in the loop with SNNs), I'm just struggling a bit to understand the motivation. What does "works well enough" mean? Shouldn't it work much better than anything else to deserve building custom hardware for it? Especially if regular matmul-based NNs work better and might actually run faster and be more power efficient (when run on state-of-the-art custom hw)?

I mean, this would be a no-brainer :) if you told me "this is how our brain works, and we want to emulate it in hardware to speed up neuroscience experiments", but that's just not true, is it? We don't know how the brain processes information, not even such basic things as how the information is actually encoded, or what kind of computation a neuron performs.

Or if you don't care about the brain, it would make sense if the SNN algos produced state-of-the-art results and everyone wanted to run them on their iPhones. Or ok, if not state-of-the-art results, at least good results with the best speed/efficiency. But if you have neither the best results nor the best hw performance, I'm really scratching my head here...


Let's see: first of all, this is a research project, not an attempt to build a commercial product (yet). Indeed it is not currently optimised to do analog matrix multiplication particularly well, but what prevents it from performing better is well understood and relatively easy to fix. Of course we are aware of commercial efforts, but this is a pretty long-running research project, so not all of those necessarily existed when it was set in motion. As an example, wafer-scale integration was accomplished on BrainScaleS well before Cerebras pushed for it; clearly they have executed far better on building a commercial product around it.

Our motivation is to build large-scale accelerated neuromorphic hardware and to prove that it can be useful. It is not particularly efficient or even feasible to train SNNs with GPUs over large timescales, so eventually we will need to use on-chip learning. This paper could be seen as an intermediate step; it's useful to know that the hardware can be optimised, at the very least as a baseline for further experiments.

Applications are not our primary concern at the moment; for the most part we believe that once we have identified the right algorithm(s), and hardware able to support the implementation of those algorithms, it will be possible to apply them to many problems. For ANNs, backpropagation had been figured out a long time ago, but the recent successes only started after 2010. To be clear, for SNN inference our chip is far faster and has far better latency than a GPU, and is roughly 10x faster than Intel's Loihi. Application areas for that are admittedly niche, especially as long as we can't scale to far more neurons.


Yes, it is pretty cool, please don't think I'm bashing your paper. If anything, I'm bashing Quanta and pop science for overhype, and the literature in general for mostly avoiding the difficult question in favor of the more tractable one, while fully understanding why.

As I said in a sister comment, I fully believe in neuromorphics inspired inference engines. It's just that we have some of them already, and while your paper is novel, people should take this in the appropriate context.


The second elephant in the room: don't split training from inference. There is no such thing as pure inference in biological systems. All these efforts are futile as long as we do not get rid of gradient descent.


I think one needs to let go of those questions. This is fundamental research that might yield enormous benefits. Yes, there are more efficient ways currently(!) available, but frankly that doesn't matter yet. We are now busy figuring out the fundamentals.

I personally think this research yielded a very cool insight: namely, that you can fix the problem of decreased performance when transferring a model learned on a supercomputer to a neuromorphic chip. This is very cool.


I addressed this in some sister comments (including similar work that was transferred to other types of devices), but specifically for your comment: in my view, science has higher standards for presentation and communication than start-up submarine posts, and the fact that we are still relying on GPUs to train, and that other avenues to do this exist, is relevant to assess it. I am happy about basic research being done and do not think every paper needs to set a SotA or change the world. But overhype is exactly why every paper needs to try selling itself as such: if you don't go viral and make a big impact in mainstream media, your career might suffer. Hence, instead of being able to say "we made something cool, a lot of work left to be done", papers need to conjure up a paradigm shift out of every publication (again, not bashing the authors, just the system).


Are you intimate with the literature the way GP is?

I studied a different field (application of machine learning to brain-machine interfaces), but I would (and still do) regularly see completely mundane research presented as something groundbreaking by a person/institution seeking clout.

I've actually pointed out the bullshit a couple of times here, and received very similar responses to yours.

I think it's a case of cutting-edge, not-widely-deployed technology seeming really exciting to someone hearing about it for the first time, even if the research itself does nothing particularly new compared to a few years ago.


I feel your pain, seriously! I worked 2 years on deep learning for crowd counting and 99% of all papers (and corresponding press releases) were utter crap. Super minor improvements were presented as the big new thing. But that's how the field is: everyone is trying to get tenure, so people need to scream loudly from the trees about what awesome shit they are doing.

I for one just learned to ignore it and to look at the merit of a paper and what it contributes.


More generally, this is an amazing reference if you're interested in the field:

Csaba, G., & Porod, W. (2020). Coupled oscillators for computing: A review and perspective. Applied Physics Reviews, 7(1), 011302.

Did you know Von Neumann posthumously patented a non-Von Neumann architecture based on coupled oscillators?


The idea would be great if it could scale to 100B weights. In-the-loop training is cool: forward prop using the neuromorphic hardware, network update using PyTorch.
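
The pattern, roughly: the forward pass is a black box measured on the chip, while the backward pass falls back to an idealized software model of the same layer. A toy PyTorch sketch of that pattern; chip_forward here is a hypothetical stand-in (just a noisy matmul), not the actual BrainScaleS API:

    import torch

    def chip_forward(x, w):
        """Hypothetical stand-in for running a layer on the chip; here just a
        matmul plus noise to mimic analog imperfections."""
        return x @ w + 0.01 * torch.randn(x.shape[0], w.shape[1])

    class ChipLayer(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, w):
            ctx.save_for_backward(x, w)
            return chip_forward(x, w)  # "measured" activations from hardware

        @staticmethod
        def backward(ctx, grad_out):
            x, w = ctx.saved_tensors
            # Backward uses the idealized model, so hardware imperfections
            # enter training only through the forward pass.
            return grad_out @ w.t(), x.t() @ grad_out

    w = torch.randn(784, 10, requires_grad=True)
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
    loss = torch.nn.functional.cross_entropy(ChipLayer.apply(x, w), y)
    loss.backward()  # PyTorch computes the update; the chip only did the forward pass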

But for now, they only get 98.7% on MNIST; state of the art is 99.91%, a 14x lower error rate. You have to try hard to get under 99%, just take a look. Maybe it's because they can't use backprop. Backprop is so powerful it's hard to beat.
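
To spell out the error-rate arithmetic:

    ours, sota = 1 - 0.987, 1 - 0.9991  # error rates: 1.3% vs 0.09%
    print(ours / sota)                  # roughly 14.4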

https://paperswithcode.com/sota/image-classification-on-mnis...

https://www.kaggle.com/c/digit-recognizer/leaderboard


One of the authors here…

Scaling certainly is one of the next big challenges; the current network sizes severely limit our inference performance.

Just to clear things up: Our circuits were actually trained via backprop. This is what allowed us to reach performance levels very close to equivalently sized but simulated SNNs (and even rather close to the accuracy of ANNs of the same size).


Seems like this is also a mathematical explanation of why Einstein's brain was so powerful. He was overparameterized in a similar fashion, with more connections per neuron than the average brain.



