
"It’s done in hardware so it’s cheap" - dmit
http://www.yosefk.com/blog/its-done-in-hardware-so-its-cheap.html
======
csense
Slightly off topic, but it's widely accepted among physicists that the act of
computation expends energy [1]. Thus, there are actually limits to how much
the cost of a given computation can be reduced, regardless of how cleverly we
build the computer, or what we build it out of (silicon, DNA, fiber optics,
whatever) [2].

[1] <http://en.wikipedia.org/wiki/Landauer%27s_principle>

[2] If we're willing to use algorithms that don't destroy information, or
willing to operate at arbitrarily low temperatures, as I understand it there's
no theoretical limit to how small we can make the energy costs, but these
restrictions seem highly impractical.
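
For a sense of the numbers, here is a minimal C sketch of the Landauer limit itself, kT ln 2 per bit erased; the 300 K room temperature is an assumption chosen only for illustration:

    /* Minimal sketch: the Landauer limit, kT ln 2 per bit erased.
       T = 300 K (room temperature) is assumed for illustration. */
    #include <stdio.h>
    #include <math.h>

    int main(void) {
        const double k_B = 1.380649e-23;    /* Boltzmann constant, J/K */
        const double T   = 300.0;           /* assumed room temperature, K */
        const double eV  = 1.602176634e-19; /* joules per electron volt */

        double e_min = k_B * T * log(2.0);  /* minimum energy to erase one bit */
        printf("Landauer limit at %.0f K: %.3e J (%.4f eV) per bit\n",
               T, e_min, e_min / eV);       /* ~2.87e-21 J, ~0.018 eV */
        return 0;
    }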

~~~
kragen
While this is true, current hardware is several orders of magnitude away from
the Landauer limit, so there's still quite a bit of room for building
computers more cleverly before we confront it.

However, reversible algorithms are not in fact particularly impractical. They
require a somewhat different way of thinking about things, but they're
dramatically easier than, e.g., getting a quantum-computing speedup.

Similarly, cryogenic computation is entirely reasonable, especially in space.
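
As a minimal sketch of that different way of thinking: any in-place update of
the form x += f(y) destroys no information, because it can be undone exactly
by x -= f(y). The mixing function and values below are made up purely for
illustration.

    /* Minimal sketch of a reversible update: x += f(y) erases no
       information, since x -= f(y) undoes it exactly.  f() is an
       arbitrary made-up mixing function, used only for illustration. */
    #include <stdint.h>
    #include <stdio.h>

    static uint32_t f(uint32_t y) { return (y * 2654435761u) ^ (y >> 16); }

    int main(void) {
        uint32_t x = 0xDEADBEEF, y = 0x12345678;

        x += f(y);                                  /* forward step: mix y into x */
        printf("mixed:    %08x\n", (unsigned)x);

        x -= f(y);                                  /* inverse step: recover x */
        printf("restored: %08x\n", (unsigned)x);    /* prints deadbeef again */
        return 0;
    }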

~~~
ars
"several orders of magnitude"???

We are an order of magnitude worth of orders of magnitude away from that
limit.

Specifically: 5.539×10^11 times as much energy (for the i7 920, picked
randomly).

http://www.wolframalpha.com/input/?i=130+w+%2F+%280.0178+electron+volt+*+82300+MIPS+%2F+instruction%29
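
The same ratio can be reproduced offline with a small C sketch; the
one-bit-erased-per-instruction assumption comes from the query above, not
from anything about the chip itself:

    /* Reproduces the ratio in the Wolfram Alpha query: the i7 920's 130 W TDP
       vs. the Landauer cost of its ~82300 MIPS, assuming one bit erased per
       instruction (an assumption of the comparison). */
    #include <stdio.h>

    int main(void) {
        const double eV       = 1.602176634e-19;  /* joules per electron volt */
        const double tdp      = 130.0;            /* i7 920 TDP, watts */
        const double mips     = 82300.0;          /* i7 920, million instructions/s */
        const double landauer = 0.0178 * eV;      /* ~kT ln 2 at room temp, J/bit */

        double ideal = landauer * mips * 1e6;     /* watts at the Landauer limit */
        printf("ratio: %.3e\n", tdp / ideal);     /* ~5.54e+11 */
        return 0;
    }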

~~~
kragen
Thanks for doing the calculation! I'm not checking it right now, but it's in
the right range. However, in the tradeoff between shortest time of execution
of serial programs and power efficiency per instruction, the i7 is way over at
the shortest-time-of-execution extreme. Common GPUs are about a thousand times
as power-efficient, and many embedded microcontrollers are nearly as power-
efficient as GPUs. Check out the Bitcoin hardware performance Wiki pages for
hard data.

------
robomartin
As someone who has done extensive work in image processing using custom
hardware, I am not really sure what he is talking about. Is this intended to
suggest that software is cheaper than hardware? Or that it has performance
advantages over specialized hardware? Not sure.

It's tough to beat smartly designed specialized hardware in image processing.
Some of the things I've done would require ten general-purpose computers
running in parallel to accomplish what I did in a single $100 chip. So, yes:
less cost, higher data rate, reduced thermal load, reduced physical size,
lower power requirements, etc.

Maybe I don't get where he is going with this?

~~~
masterzora
It is tough to pull a coherent thesis from this piece, but it seems to be close
to "be aware of the costs involved because, although specialised hardware can
be useful in many situations, it is not a magic wand you can wave." The way it
opens, he seems to imply that he is used to people excusing slow/inefficient
ideas by handwaving that doing it in hardware will be fast, without doing any
real critical evaluation of what gains hardware can actually bring. The fact
that some systems can see real gains from specialised hardware does not serve
as a counterargument if this is in fact his thesis.

~~~
_delirium
You can get a somewhat better idea of where he's coming from by reading his
previous series (linked in the post), which responds to people arguing that
high-level languages would be faster if hardware were designed for them. In
his view the Lisp-machine idea of HLL-specialized hardware design rarely pans
out vs. just RISC with an optimizing compiler. This piece seems to apply the
same critique to algorithms more generally: "we'll just do it in hardware"
isn't a magic win, because the problem isn't always an impedance mismatch with
the hardware that can simply be fixed by choosing different hardware.

The places he suggests you _can_ get a win seem sensible: 1) cases where the
cost of dispatching instructions and handling intermediate results dominates,
in which case a CISC-ish specialized instruction implemented in silicon may be
a win over stringing together simpler operations; and 2) cases where you can
get extra parallelization in hardware that isn't available through general-
purpose instructions (e.g. doesn't map nicely onto SSE-style instructions).
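
As a toy illustration of case 1 (a stand-in example, not taken from the post):
population count done by stringing together shift/mask/add operations, versus
the GCC/Clang-specific builtin, which typically compiles to a single POPCNT
instruction on x86-64.

    /* Toy illustration of case (1): a chain of simple operations vs. one
       specialized instruction.  Population count is only a stand-in example;
       __builtin_popcountll is GCC/Clang-specific. */
    #include <stdint.h>
    #include <stdio.h>

    /* "Stringing together simpler operations": one shift/mask/add per bit. */
    static int popcount_loop(uint64_t x) {
        int n = 0;
        while (x) { n += (int)(x & 1); x >>= 1; }
        return n;
    }

    int main(void) {
        uint64_t v = 0xF0F0F0F0F0F0F0F0ull;
        printf("loop:    %d\n", popcount_loop(v));          /* many dispatched ops */
        printf("builtin: %d\n", __builtin_popcountll(v));   /* usually one instruction */
        return 0;
    }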

~~~
Spooky23
So I guess the premise is that you have two extremes: a minimal instruction
set (RISC) and a complex one (a CPU designed to run Python, for example).

IMO, as is often the case, the answer lies in the middle. Look at the
tremendous impact that adding AES acceleration features to x86 processors has
on applications that require encryption.
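
For concreteness, a minimal sketch of what that acceleration looks like from
C: with AES-NI, one full AES round is a single instruction (_mm_aesenc_si128).
The block and round-key values below are dummies, and the code assumes a CPU
with AES-NI and a compiler flag like gcc/clang -maes.

    /* Minimal AES-NI sketch: one full AES round in a single instruction.
       Requires a CPU with AES-NI and e.g. gcc/clang with -maes; the block
       and round-key values are dummies chosen only for illustration. */
    #include <stdio.h>
    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <wmmintrin.h>   /* AES-NI intrinsics */

    int main(void) {
        __m128i block = _mm_set1_epi32(0x01234567);  /* made-up 128-bit state */
        __m128i rkey  = _mm_set1_epi32(0x1a2b3c4d);  /* made-up round key */

        /* ShiftRows + SubBytes + MixColumns + AddRoundKey, one instruction */
        block = _mm_aesenc_si128(block, rkey);

        unsigned char out[16];
        _mm_storeu_si128((__m128i *)out, block);
        for (int i = 0; i < 16; i++) printf("%02x", out[i]);
        printf("\n");
        return 0;
    }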

~~~
_delirium
That seems to be what he's arguing, actually: that you can get hardware
speedups if you carefully target very specific things that a parsimonious
addition of hardware features can enable.

