
The Dark Silicon Problem and What It Means for CPU Designers (2013) - plainOldText
http://www.informit.com/articles/article.aspx?p=2142913
======
Phillipharryt
From the article "The heat generation per unit area of an integrated circuit
passed the surface of a 100-watt light bulb in the mid 1990s, and now is
somewhere between the inside of a nuclear reactor and the surface of a star. "

I can't tell if this is hyperbole or not. It amazes me, but no amount of
googling is coming up with a useful answer. Is anyone able to confirm or deny
it for me?

~~~
andlier
After a quick napkin calculation + googling, it seems the sun radiates around
20 kW of power per square centimeter at its surface. So it's not entirely
infeasible that a cooler star, or a nuclear reactor, is closer to the typical
10-100 W/cm2 of a modern CPU/GPU. Still some orders of magnitude off from our
closest star. (Hope the calculation is correct.)

~~~
Cybiote
The temperature of the sun at its surface is ~5772 K. To get power per unit
area, use the Stefan-Boltzmann law: j = σT^4, so σ * (5772 K)^4 ≈ 6294 W/cm^2.
Dividing the sun's luminosity (total power) by its surface area will also give
a similar value.

An 815 mm^2, 250 W GPU comes out to 250 W / 8.15 cm^2 ≈ 31 W/cm^2.
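The two figures above are easy to reproduce in a few lines. The 815 mm^2 / 250 W die is the hypothetical GPU from this comment, not a specific product:

```python
# Napkin check: Stefan-Boltzmann flux at the sun's surface vs. the power
# density of a hypothetical 815 mm^2, 250 W GPU die.
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)

def blackbody_flux_w_per_cm2(temp_k):
    """Radiated power per unit area of a blackbody, in W/cm^2."""
    return SIGMA * temp_k**4 / 1e4  # 1 m^2 = 1e4 cm^2

sun_flux = blackbody_flux_w_per_cm2(5772)  # ~6294 W/cm^2
gpu_flux = 250 / 8.15                      # 815 mm^2 = 8.15 cm^2, ~31 W/cm^2

print(f"Sun surface: {sun_flux:.0f} W/cm^2")
print(f"GPU die:     {gpu_flux:.1f} W/cm^2")
```

So the article's claim sits between these two numbers, which is exactly what it says.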

~~~
frozenport
Doesn't that only include radiated heat? If I touched it, it would be hotter.

~~~
saagarjha
It wouldn't be hotter (the temperature wouldn't change), but it would transfer
more heat to you.

------
JudasGoat
The author states "For every watt of power the CPU consumes, it must dissipate
a watt of heat." It was always my understanding that in electrical devices
(with the exception of heaters), the amount of heat produced was inversely
proportional to the efficiency of the device. So is it really true that all
energy provided to the CPU or SoC is dissipated as heat?

~~~
Elrac
The author's sentence is essentially a truism, because fiddling with
information _as such_ and in theory doesn't use up any energy. As practically
implemented in current CPU electronics, though, information needs to be
communicated from point A to point B as a change in voltage. To convey that
voltage change means having to move some amount of electrical charge into or
out of the tiny capacitor that is a transistor's gate, via the non-perfect
conductor that is the doped-silicon trace between the two points. Resistance
saps part of the energy moving those electrons around and converts it to heat.
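The charge-the-gate-through-a-resistor picture above is usually summarized by the first-order CMOS dynamic-power model P ≈ α·C·V^2·f. A minimal sketch, with all numbers (activity factor, switched capacitance, voltage, frequency) purely illustrative:

```python
# First-order CMOS dynamic-power model: each switching event charges or
# discharges gate/wire capacitance C through resistive interconnect, and
# that energy ends up as heat. P_dynamic ~= alpha * C * V^2 * f.

def dynamic_power_watts(alpha, c_farads, v_volts, f_hz):
    """alpha: fraction of total capacitance switching per cycle."""
    return alpha * c_farads * v_volts**2 * f_hz

# e.g. 100 nF of total switched capacitance, 1.0 V supply, 3 GHz, 10% activity
p = dynamic_power_watts(alpha=0.1, c_farads=100e-9, v_volts=1.0, f_hz=3e9)
print(f"{p:.0f} W")  # ~30 W of dynamic power, all dissipated as heat
```

This is also why lowering the supply voltage helps so much: power falls with the square of V.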

~~~
dnautics
> fiddling with information _as such_ and in theory doesn't use up any energy.

That's not true at all. Any time you destroy information (an AND gate, for
example, can destroy information) you must dissipate energy.
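The bound itself (Landauer's principle) is easy to compute: erasing one bit at temperature T costs at least k_B·T·ln 2. The 300 K below is just an assumed room-temperature figure:

```python
# Landauer's principle: erasing one bit of information at temperature T
# dissipates at least k_B * T * ln(2) of energy. Tiny compared to what real
# gates burn, but nonzero.
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_limit_joules(temp_k):
    return K_B * temp_k * math.log(2)

print(f"{landauer_limit_joules(300):.2e} J per erased bit")  # ~2.87e-21 J
```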

~~~
Elrac
Chances are you know more about this than I do, and I'd be interested to learn
more.

If you're willing to explain, I wonder how an AND gate destroys information.
Certainly the output of a 2-input AND gate carries less information than both
inputs together. But unless the gate is destroying the input signals, which it
isn't, I don't see how information is being destroyed.

~~~
dnautics
[https://en.wikipedia.org/wiki/Landauer%27s_principle](https://en.wikipedia.org/wiki/Landauer%27s_principle)

------
blackflame7000
Why don't they start making CPUs 3-dimensional, like a cube with 6
"processors" (each with multiple cores) as its sides, with the pins on the
opposite sides of the cube wall? Seems to me more internal volume might allow
for more clever heat distribution channels.

~~~
deepnotderp
1. Heat dissipation is now ~n x m times worse, where n is your transistor
layer count and m is the increased thermal resistance.*

2. Power delivery is now ~n times worse, where n is your transistor layer
count.*

3. Connections between chips are very slow, power-hungry, and expensive.
Fabrication of "monolithic" 3D is temperature-wise painful and usually results
in crummier transistors.

With that being said, innovative 3D integration methods in specific
applications can help a lot. Shameless plug: we at Vathys do this for deep
learning chips.

* to a first order of approximation
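Points 1 and 2 can be sketched numerically. All figures below (base power density, thermal resistance, the m factor) are illustrative assumptions, not data from this thread:

```python
# First-order sketch of points 1 and 2: stacking n transistor layers
# multiplies power density by ~n (same footprint, n times the logic), and
# buried layers also see extra thermal resistance (factor m) from the
# silicon above them.

def stacked_power_density(base_w_per_cm2, n_layers):
    return base_w_per_cm2 * n_layers

def junction_temp_rise(power_w, r_th_k_per_w, m_factor):
    # delta_T = P * R_th; m_factor models the added thermal resistance
    return power_w * r_th_k_per_w * m_factor

print(stacked_power_density(50, 4))       # 200 W/cm^2 for a 4-layer stack
print(junction_temp_rise(250, 0.2, 1.5))  # 75 K rise vs. 50 K single-layer
```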

~~~
tormeh
The heat can be tackled in part by pumping water through holes in the CPU. I
believe it was IBM that came up with this. Can't tell if it's feasible or not.

~~~
deepnotderp
Yes but microfluidics limits how thin each die can be.

~~~
agumonkey
What about stacked heat vias?

~~~
deepnotderp
What are those? Do you mean something like thermal "dummy" vias?

~~~
agumonkey
yeah, something imprinted in all layers so you could evacuate heat

------
bogomipz
I had a couple of questions about this bit of history mentioned in the
article. I'm hoping someone could shed some light on this:

>"You can emulate floating-point arithmetic by using integer instructions—but
taking 10–100 times as long."

Exactly how is/was floating point arithmetic emulated using only integers?

Why is that range a full order of magnitude wide? Is this dependent on the
precision, I'm guessing?

~~~
phkahler
We had floating point when programming in BASIC on old 6502 8-bit computers.
There were software routines for doing the math. You know, multiply the
mantissas, add the exponents... If someone gave you pointers to a couple of
4-byte chunks of data and told you to write code to do a floating-point
multiply on the contents using only C char variables, what would you write?
That's why it's 100 times slower than a nice modern fmul. It wasn't quite that
bad; they could use the carry flag, which isn't available in C.
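The "multiply the mantissas, add the exponents" recipe can be sketched with only integer operations. This toy format (unsigned 16-bit normalized mantissa, power-of-two exponent) is a simplification for illustration, not the actual Woz/Rankin format:

```python
# Software float multiply using only integer ops.
# Toy format: value = mantissa * 2**exponent, mantissa normalized to
# 16 bits with the top bit set.

def fp_mul(m_a, e_a, m_b, e_b):
    """Multiply two (mantissa, exponent) pairs; mantissas 16-bit, MSB set."""
    m = m_a * m_b  # 32-bit product of the mantissas
    e = e_a + e_b  # exponents add
    # Renormalize: shift the product back down to 16 bits
    while m >= 1 << 16:
        m >>= 1
        e += 1
    return m, e

def to_float(m, e):
    return m * 2.0**e

# 3.0 = 0xC000 * 2**-14, 2.5 = 0xA000 * 2**-14
m, e = fp_mul(0xC000, -14, 0xA000, -14)
print(to_float(m, e))  # 7.5
```

On an 8-bit CPU every one of those multiplies and shifts is itself a loop over bytes, which is where the 10-100x slowdown comes from.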

~~~
sehugg
Yep, here's the old Woz/Roy Rankin 6502 code for log, exp, conversions, and
basic math:
[http://www.6502.org/source/floats/wozfp1.txt](http://www.6502.org/source/floats/wozfp1.txt)

~~~
bogomipz
What a wonderful bit of history your link is. Thanks for sharing.

------
bogomipz
I had a question - the author states:

>"The most obvious is the instruction decoder, which is near the start of the
pipeline, and is responsible (in the loosest possible terms) for passing the
inputs to each of the execution units."

Why would it be "in the loosest possible terms"? Isn't this "precisely" the
job of the decoder?

~~~
kabdib
IIRC in many chips the decoder stage translates native instructions into
micro-ops, which are RISC-like and the main food for execution units. The
translation is not necessarily a simple one (one native instruction is often
more than one micro-op, and it's possible to collapse multiple native
instructions -- especially stuff like prefixes -- into one or more micro-ops).

~~~
bogomipz
Ah OK, that makes sense. This is likely what the author means here by
"loosely." Cheers.

------
jarym
We just need a clueless CIO to turn up and ask if 'Cloud' will solve the
problem :D

~~~
melbourne_mat
I love the cynicism :-)

~~~
jarym
hehe - judging from the downvotes I got, not everyone did!

