Hacker News
Hardware Architectures for Deep Neural Networks [pdf] (mit.edu)
172 points by blopeur 16 days ago | 17 comments

I've been wanting to know how electronics' power efficiency compares to biological neurons, and this paper gives a clue. The most efficient hardware it mentions is the GV100 "Tensor Core", at 400GFLOPS/W for FP16.

If a typical neuron requires 10^6 ATP per activation [1], and if it takes 30.5 kJ/mol to charge ATP [2], and if a typical neuron has 100 axons, each of which is contributing one FLOP, then I _think_ a human neuron is about 500 times as efficient as a GV100 [3], at 200,000 GFLOPS per watt.

[1] https://www.extremetech.com/extreme/185984-the-human-brains-...

[2] https://en.wikipedia.org/wiki/Adenosine_triphosphate


[3]

  10E6 ATP = 1 activation = 100 FLOP
  30.5 kJ = 1 mole ATP = 6E23 ATP
  1 kJ = 0.28 Wh
  2E5 GFLOPS = 1 W

  GV100 "Tensor Core":
  4E2 GFLOPS = 1 W
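
For anyone who wants to check the arithmetic in [3], here is a minimal Python sketch. It uses only the numbers quoted above; the function name is mine, and the 100 FLOP per activation figure is the comment's own assumption, not a measured value. One thing it makes visible: the 2E5 GFLOPS/W result matches "10E6" read literally as 1e7 ATP per activation, whereas the 10^6 in the text would give roughly 2E6 GFLOPS/W, so the script prints both readings.

```python
AVOGADRO = 6e23            # molecules per mole, rounded as in the comment
ATP_KJ_PER_MOL = 30.5      # kJ/mol figure quoted in the comment
FLOP_PER_ACTIVATION = 100  # the comment's assumption, not a measurement
GV100_GFLOPS_PER_W = 400   # FP16 figure the comment quotes from the paper


def neuron_gflops_per_watt(atp_per_activation):
    """Estimate GFLOPS per watt for a neuron under the comment's assumptions."""
    joules_per_atp = ATP_KJ_PER_MOL * 1e3 / AVOGADRO
    joules_per_flop = atp_per_activation * joules_per_atp / FLOP_PER_ACTIVATION
    # 1 W = 1 J/s, so FLOPS per watt is the same number as FLOP per joule.
    return 1.0 / joules_per_flop / 1e9


# Two readings of the ATP count: 10^6 (the text) and 10E6 = 1e7 (the footnote).
for atp in (1e6, 1e7):
    gflops = neuron_gflops_per_watt(atp)
    ratio = gflops / GV100_GFLOPS_PER_W
    print(f"{atp:.0e} ATP/activation: {gflops:.2e} GFLOPS/W, {ratio:.0f}x GV100")
```

The 1 kJ = 0.28 Wh line in [3] is not needed here, because working in joules gives FLOP per joule directly.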

A neuron has only one axon, though it may branch to many different places.

It's more useful to look at synaptic operations. 1 FLOP per synaptic activation is a vast underestimate of what happens computationally in the brain. Synaptic activations have many homeostatic side effects.

One also cannot discount the cost of the memory needed for the tensor core.

Keep in mind the energy costs of the supporting hardware for that brain, though.

Computers have supporting hardware too - powerplants (plus related infrastructure) and cooling systems, at minimum.

Humans consume 0.7-1.3 kW per capita. A network of these chips scaled to human brain capacity would consume around 10 kW. It is really not a big difference. Our robot overlords are going to be quite efficient!

Well, keep in mind that you should never really use the advertised number as your comparison point, because that's more the "highest that you'll ever see" rather than the "average reliable performance". Not to mention they assume everything magically is in the caches as far as memory is concerned, which is absolutely untrue.

Yes, there are many confounding factors and sources of uncertainty on both sides -- here are a few I can think of:

* memory isn't accounted for

* FLOP per activation is estimated at 100

* neurons can vary by a factor of ~1000 in how much energy they require to activate, based on their size and speed

* ATP: I'm not really sure what to make of the energy here; I used 30kJ / mol. Wikipedia says "The hydrolysis of ATP into ADP and inorganic phosphate releases 30.5 kJ/mol of enthalpy, with a change in free energy of 3.4 kJ/mol."[1] What's the efficiency of our digestive & recharging systems -- maybe 5% - 30%?
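
To get a feel for how much two of these uncertainties move the result, here is a rough Python sketch. It starts from the ~500x figure upthread and scales it by a recharge efficiency (the 5% to 30% guess above) and by a per-neuron energy factor (up to the ~1000x spread mentioned above). Both parameter names are hypothetical, and treating the two effects as simple multipliers is my assumption, not something the paper or the sources state.

```python
BASELINE_RATIO = 500.0  # neuron vs. GV100 advantage claimed upthread


def adjusted_ratio(recharge_efficiency, energy_scale=1.0):
    """Scale the baseline advantage by two of the uncertainties listed above.

    recharge_efficiency: assumed fraction of input energy that ends up in ATP.
    energy_scale: this neuron's energy per activation relative to the
        "typical" neuron behind the baseline (hypothetical parameter).
    """
    if not 0 < recharge_efficiency <= 1:
        raise ValueError("recharge_efficiency must be in (0, 1]")
    if energy_scale <= 0:
        raise ValueError("energy_scale must be positive")
    return BASELINE_RATIO * recharge_efficiency / energy_scale


# The 5% and 30% endpoints guessed above, for a "typical" neuron.
for eff in (0.05, 0.30):
    print(f"efficiency {eff:.0%}: about {adjusted_ratio(eff):.0f}x GV100")

# A neuron needing 1000x the typical energy, with lossless recharging.
print(f"1000x energy: about {adjusted_ratio(1.0, energy_scale=1000):.1f}x GV100")
```

Under these assumptions the advantage shrinks to somewhere between 25x and 150x for a typical neuron, and the most expensive neurons could come out below the GV100.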


How do you figure an "axon contributes one FLOP" if it's spiking? Seems to me a _spike train_ (that is, a long series of activations) would be equivalent to a floating point operation, not a single spike.

My estimate is undoubtedly a case of "Doesn't realize the difficulty of the question, therefore naively tries to answer it."

I think you have a much more sophisticated understanding of neurons than I do!

The human brain uses 10^5 times less power than a von Neumann architecture, according to a Caltech paper from the '90s.

Do you have a specific reference?

This Mead paper[1] suggests "...a factor of 10 million more efficient than the best digital technology that we can imagine"...which I interpreted as a gross approximation at the chip level assuming a 10nm process, not considering any particular architecture, and not including the roughly 2-3 orders of magnitude of reduced efficiency at the system level.

[1] http://authors.library.caltech.edu/53090/1/00058356.pdf

That was the paper I was thinking of (thanks), but maybe the figure is from an IBM True North paper. I vaguely recall 10^-5 for brain efficiency, 10^-4 for True North vs. von Neumann.

There is also a paper from the same authors with about the same content:

"Efficient Processing of Deep Neural Networks: A Tutorial and Survey" by Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, Joel Emer


Link seems broken!

It's a 32MB PDF, as it's image heavy, and takes a while to download from their servers.

Not broken for me - it just needs a while to load; the file is huge.

290 image-heavy pages. smh
