
The Hardware Lottery - blopeur
https://arxiv.org/abs/2009.06489
======
MayeulC
I thought this was going to be about the silicon lottery.

Key points:

* For a new software approach to be recognized as valuable, it needs appropriate hardware to exist at the right time

* As Moore's law fades away, we can no longer rely on steady performance improvements for generalist CPUs, so specialized HW will make a comeback

An interesting note:

> Hardware is only economically viable if the lifetime of the use case lasts
> more than 3 years

I admit I stopped reading at section 5.

~~~
marcosdumay
The next point is that, given the path dependency, the odds of the current
choices for machine learning being optional are very low and the evidence is
preponderantly against it.

I think that was mean to be the main point of the article, but I do think
section 5 is way more interesting than 6.

~~~
KorfmannArno
optional or optimal?

~~~
marcosdumay
Optimal

That wasn't the only type there, but this one makes a lot of damage.

~~~
KorfmannArno
Lol, you're on a roll. Hard to type eh...

------
optimalsolver
Papers Explained has a rebuttal of some of the points:

[https://www.youtube.com/watch?v=MQ89be_685o](https://www.youtube.com/watch?v=MQ89be_685o)

------
tinktank
Interesting article, I enjoyed it. The HFT/low-latency community has been
riding this insight for many years; they call it "mechanical sympathy" and
tend to favour algos/data structures etc. that are designed with a
particular hardware architecture in mind.

~~~
bmc7505
[https://mechanical-sympathy.blogspot.com](https://mechanical-sympathy.blogspot.com)

------
wheybags
I've always felt that VLIW might have been a victim of this. I'm not super
knowledgeable on it though, apart from being familiar with the disaster that
was Itanium.

~~~
londons_explore
VLIW suffers from lack-of-abstraction.

I.e., you can make some great hardware and software which perform really
efficiently today. But if you later try to make a v2 of that system and want
to reuse the same software, then sorry - you're out of luck!

Also, while compilers today are very complex, they are still far from what's
necessary to really make good use of a VLIW machine.

If someone manages to solve the above two things, then VLIW will 100% wipe the
floor with current architectures. Being able to throw out all logic trying to
get parallelism out of a serial instruction stream would have massive power
and area improvements for the same computation throughput.
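The versioning problem above can be sketched with a made-up example (the bundle format, slot count, and load latency are all invented for illustration): a VLIW compiler statically packs independent operations into one wide instruction, so the binary bakes in the original hardware's timing.

```
; hypothetical 3-slot VLIW machine with 2-cycle loads
; bundle = { slot0 | slot1 | slot2 }, all slots issue together each cycle

{ load r1, [a]    | load r2, [b]   | nop }   ; start both loads
{ nop             | nop            | nop }   ; wait out the 2-cycle load latency
{ add r3, r1, r2  | load r4, [c]   | nop }   ; r1, r2 now ready

; A "v2" chip with 3-cycle loads would execute the add one cycle too
; early: the schedule is encoded in the binary, so the program must be
; recompiled, whereas an out-of-order CPU rediscovers the schedule at
; run time and old binaries keep working.
```

This is the lack-of-abstraction point: the instruction stream is no longer a portable contract, it is a cycle-accurate description of one specific chip.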

~~~
ncmncm
A perceived need to run old binaries seems apparent here.

With a lot of code in use being Java and Javascript compiled JIT, the need to
run somebody else's binaries is beginning to fade.

Similarly, the high cost of memory page systems and context switches is an
artifact of executing unvetted code. When everything untrusted is compiled in
a way that it physically cannot breach memory boundaries, the need for the OS
and programs to live in separate memory spaces evaporates, along with its
overhead.

------
MichaelZuo
Calling the progression and availability of technology a “lottery” is
definitely a confusing term. It implies concepts totally unrelated to how
hardware development actually works.

Hardware and software co-evolve! There is no such thing as people creating
hardware ex nihilo. This reads more like a researcher venting about his
limited budget and development capacities.

~~~
ncmncm
Her. And you miss the point.

Obviously they co-evolve, and the consequence is deeply sub-optimal systems,
as a result of essentially random, momentary conditions.

~~~
MichaelZuo
Thanks. Yes, it is possible that the systems are deeply sub-optimal, though
calling ‘hardware’ by itself a lottery is still misleading. ‘Computer
lottery’, ‘hardware-software lottery’, or ‘systems lottery’ would all make
more sense, though that would make for a less catchy title.

Though since not all the optimization parameters are known for complex systems
with multiple stakeholders, or at least not known outside of a select few,
i.e. what academic researchers face, we shouldn’t be too hasty in saying it’s
definitely deeply sub-optimal.

For example, there could be unknown, private criteria that have been highly
optimized for.

~~~
ncmncm
Until ~2002-5 we optimized for 8080 backward compatibility, at ruinous expense
to performance. We still maintain it, albeit with less ruinous effect, except
in GPUs and phones, which today do the majority of the computation, albeit to
little effect.

Those last maintain backward compatibility with a different, 1980s, design,
albeit with another break at 64 bits that sacrificed the worst aspects.

Essentially all mainstream processors emulate what amounts to a hypertrophied
PDP-11, in order to produce good scores on benchmarks coded in C. A better
language that does not attempt to model C might be able to use better
processor designs that C cannot fully exploit, but we have no practical way to
do the experiment.

------
GistNoesis
Hardware and software co-evolve. But they are also co-evolving with the
structure of our economy.

Today machine learning is used inside servers to provide multiple responses
at the same time.

This batching exists because it's easier to make money by selling plenty of
low-quality decisions (ads) rather than a few good-quality ones.

But to grasp the bigger picture, there is also the fact that silicon chips are
hard to develop in a DIY fashion. Our economic models have made it so that the
whole silicon industry is based on trade secrets to create barriers and
incremental improvements following Moore's Law for more than fifty years.

You see, to build a chip, you need to have everything perfect: pure sand
crystals, dangerous chemicals, very small features. Everything engineered to
the atom and orchestrated to perfection. It makes a great product to sell
for years.

But you see, this perfection has a price. Everything must lie flat in 2D.
Everything must be built. And this is where the flash crash happens. Because
the alternative route is vastly superior and evident in hindsight. So vastly
superior and evident that the secret is harder to keep. We are even purging
our biological ecosystem to keep the secret.

The technological future of computing is in self-assembling nano computing
units. You tell the computing units how to build more of themselves. That's
how nature has done it for millions of years. Science fiction calls them
nanites. You can even reuse existing DNA factories. Or you can bootstrap from
scratch. When you see that a typical virus is around 32 kB, how can you
imagine that, with the right resources, you can't write a self-generating 3D
grey-goo liquid?

If you need more computing power it's just a matter of giving it more energy
and material. Chemistry scales a lot better. And in the battle of exponential
curves, it's a winner takes all market.

Computing power is a resource the same way oil, rare-earth minerals, and
steel are. It has been kept under control for fifty years. You need to
understand that there is a balance to be struck between enjoying the benefits
of technology and the stability of the economy.

That's why Moore's law died, we killed it to keep control.

