
Chip advances require fresh ideas, speakers say - bcaulfield
https://www.eetimes.com/document.asp?doc_id=1332959
======
narrator
I think computing technology stagnating will probably be one of the unexpected
stories of the next decade. Look at Intel. Cannon Lake (10nm) was supposed to
be released in 2016 and has now been pushed back to mid-2018. Will they even be
able to get the yields to ship in volume by then? A weird consequence of this
will be the rest of the world catching up in semiconductor fab technology.

Also, we are starting to see weird things happen, with GPU/processor and memory
prices going up as inflationary pressures outrun advances in chip technology.
We might start to see commodity computer products hoarded as a store of value.

~~~
matt4077
CPU performance stagnation has already been the (somewhat unexpected) story of
the last decade: computers have simply become fast enough for almost all users,
so progress has mostly been invested in power saving.

Advances have mostly been made in storage speed (SSDs), battery performance,
and display quality (resolution, colours, and lately refresh rate), plus
vector processing for machine learning.

~~~
gambler
_> Computers have simply become fast enough for almost all users_

I think this is a misconception. Computers have been "fast enough for users"
since the mid-eighties. Anything is fast enough if you cannot imagine it
working better. What happened is that companies came up with more advanced
software products for consumers, and that drove the demand for faster consumer
hardware.

The reality is that all the cutting-edge computing today requires a modern
$600 GPU. The difference is that in the past we would have expected this kind
of hardware to "trickle down" to normal users in a couple of years. Today we
don't. Instead of marketing better computers, companies market cloud services.
It's a really shitty phenomenon. We're seeing the reversal of the PC
revolution.

~~~
earenndil
> Anything is fast enough if you cannot imagine it working better

Exactly. Is memory cheap enough that I can load the entire contents of my hard
disk into it at a reasonable price? Is Optane (or something similar) cheap
enough that syncs from memory back onto something persistent happen close to
realtime? Are processors fast/parallel enough that I can actually start
_running_ all the programs I will ever want to run and just switch to them
when necessary? Do I have enough cores that even if most of my running
processes start glitching out and using 100% CPU, the rest of them will still
run buttery smooth? Add to that that monitor resolutions will keep increasing,
that details displayed even at lower resolutions will only get more intricate,
and that things currently on the high end can only get cheaper.
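
For the RAM question, here's a quick back-of-envelope sketch in Python. The
prices below are rough assumptions for the current (early-2018, mid-spike)
market, not quotes:

    # Rough cost of holding a whole disk in RAM.
    # Prices are assumed ballpark figures, not quotes.
    DISK_GB = 1024        # assume a 1 TB drive
    DRAM_PER_GB = 9.00    # assumed USD/GB during the DRAM price spike
    SSD_PER_GB = 0.30     # assumed USD/GB, for comparison

    print(f"RAM to mirror the disk: ${DISK_GB * DRAM_PER_GB:,.0f}")
    print(f"Same capacity as SSD:   ${DISK_GB * SSD_PER_GB:,.0f}")
    print(f"DRAM/SSD price ratio:   {DRAM_PER_GB / SSD_PER_GB:.0f}x")

So "reasonable price" is still roughly a 30x premium over flash today.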

------
baybal2
My prognosis, as a bit of an insider popping in and out of Shenzhen:

All guns are pointed at _memory_

Memory is an uncompetitive industry, a cash cow unseen in history, comparable
only to oil. The SEL empire is built not on top of Galaxy Notes, but on a pile
of memory chips.

The easiest way to get an order-of-magnitude improvement right away is to put
more memory on die, closer to the execution units, and eliminate the I/O
bottleneck, but no memory company will sell you the secret sauce.
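
To see why, here's a minimal sketch of the arithmetic (both bandwidth figures
are illustrative assumptions, not measurements):

    # Why on-die memory wins: a kernel that does little math per byte
    # is limited by memory bandwidth, not by the execution units.
    # Both bandwidth figures below are assumed, illustrative numbers.
    GB = 1e9

    def stream_time_s(bytes_moved, bandwidth_bytes_per_s):
        """Lower bound on runtime for a purely bandwidth-bound kernel."""
        return bytes_moved / bandwidth_bytes_per_s

    data = 8 * GB          # one pass over an 8 GB working set
    off_chip = 25 * GB     # assumed: roughly one DDR4 channel
    on_die = 400 * GB      # assumed: a wide on-die memory interface

    print(f"off-chip DRAM: {stream_time_s(data, off_chip) * 1e3:6.0f} ms")
    print(f"on-die memory: {stream_time_s(data, on_die) * 1e3:6.0f} ms")

The execution units sit idle either way; the only lever is how fast the bytes
arrive.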

Not only is memory made on proprietary equipment, but decades of research have
been done entirely behind the closed doors of the Hynix/SEL/Micron triopoly,
unlike in the wider semi community, where even Intel's process leaks out a bit
in their research papers.

SEL makes a lot of money not only by selling you the well-known rectangular
pieces, but also by effectively forcing all top-tier players to buy its fancy
interface IP if they want to jump on the bandwagon of the next DDR generation
earlier than others:
[https://www.design-reuse.com/samsung/ddr-phy-c-342/](https://www.design-reuse.com/samsung/ddr-phy-c-342/).
This makes them want to keep the memory chip a separate piece even more.

Many companies have tried to break the cabal, or to work around it, but with no
results. Even Apple's only way around this was to put a whopping 13 megs of
SRAM on die.

Changing the classical von Neumann style CPU for a GPU or a trendy neural-net
streaming processor changes little when it comes to _hardware getting
progressively worse_ at running synchronous algorithms because of memory
starvation.

You see, the first-gen Google TPU is rumored to have a severe memory
starvation problem, as do embedded GPUs without the steroid-pumped memory
buses of gaming-grade hardware.

When the PS3 came out, its outstanding results on typical PC benchmark tasks
were wrongly attributed to its 8 DSP cores, even though they were not used in
any way. It was all due to the machine reverting to skinnier, synchronous-
operation-friendlier memory. The amazing SPU performance was thanks to that
too: DSP-style loads benefited enormously from the nearly synchronous memory
behaviour.
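
A toy illustration of what "synchronous-friendly" means here: in a dependent
pointer chase, each load must complete before the next can start, so latency,
not bandwidth, sets the pace. (Python interpreter overhead swamps the hardware
effect, so treat this purely as a sketch of the access pattern; on real
hardware the chase pays the full memory latency on every step.)

    # Dependent loads vs. independent loads (access-pattern sketch only;
    # Python overhead hides the real hardware gap).
    import random
    import time

    N = 1_000_000
    nxt = list(range(N))
    random.shuffle(nxt)            # nxt[i] points somewhere random

    # Dependent chain: load i+1 cannot issue until load i returns.
    t0 = time.perf_counter()
    i = 0
    for _ in range(N):
        i = nxt[i]
    t_chase = time.perf_counter() - t0

    # Independent streaming pass over the same data: fully overlappable.
    t0 = time.perf_counter()
    total = 0
    for x in nxt:
        total += x
    t_stream = time.perf_counter() - t0

    print(f"pointer chase: {t_chase:.3f} s, stream: {t_stream:.3f} s")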

~~~
jacksmith21006
Can you provide a source on the memory starvation for the TPUs? Also, are we
talking generation 1, generation 2, or both?

It did not seem to be an issue in the paper on the first generation.

[https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu](https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu)
(An in-depth look at Google's first Tensor Processing Unit)

~~~
pjscott
In the paper on the first-generation TPU, section 7: they estimate there would
have been impressive gains in speed, both absolute and per watt, if they'd had
enough design time to give it more memory bandwidth:

[https://drive.google.com/file/d/0Bx4hafXDDq2EMzRNcy1vSUxtcEk...](https://drive.google.com/file/d/0Bx4hafXDDq2EMzRNcy1vSUxtcEk/view)
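
Section 7 frames this with a roofline model. Here's a minimal sketch of that
calculation, using the approximate TPUv1 figures from the paper (~92 TOPS peak
at 8-bit, ~34 GB/s of DDR3 bandwidth; treat both as rough):

    # Roofline model: attainable throughput is capped by either peak
    # compute or bandwidth * operational intensity (ops per byte moved).
    # Figures are approximate TPUv1 numbers from the paper.
    PEAK_OPS = 92e12      # ~92 TOPS peak (8-bit)
    BANDWIDTH = 34e9      # ~34 GB/s DDR3 on the shipped board

    def roofline(intensity_ops_per_byte):
        """Attainable ops/s for a kernel with the given ops-per-byte."""
        return min(PEAK_OPS, BANDWIDTH * intensity_ops_per_byte)

    for oi in (10, 100, 1000, 10000):
        ops = roofline(oi)
        print(f"{oi:>5} ops/byte -> {ops / 1e12:6.2f} TOPS "
              f"({100 * ops / PEAK_OPS:5.1f}% of peak)")

The ridge point lands around 2700 ops/byte, so anything less memory-friendly
than big dense matrix multiplies sits far below peak; that's why more
bandwidth would have helped so much.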

------
jacksmith21006
I thought one of the more interesting ideas was the Jeff Dean paper on using
the TPU for more traditional CS operations, like index lookups.

I'd like to see this progress and to get more data on the difference in power
required using a CPU and traditional algorithms versus a TPU and new
algorithms like the ones Jeff outlined.

[https://arxiv.org/abs/1712.01208](https://arxiv.org/abs/1712.01208) The Case
for Learned Index Structures
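
For anyone who hasn't read it, the core idea is small enough to sketch: learn
a model that maps a key to its approximate position in a sorted array, record
the model's worst-case error at build time, and do a bounded local search
around the prediction. The paper uses staged hierarchies of small neural nets;
the linear fit below is just the simplest possible stand-in:

    # Minimal learned-index sketch: linear model + bounded local search.
    # The real paper uses staged models; this is the one-model toy case.
    import bisect

    def build(keys):
        """Fit position ~ a*key + b and record the max prediction error."""
        n = len(keys)
        mean_k = sum(keys) / n
        mean_p = (n - 1) / 2
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
        var = sum((k - mean_k) ** 2 for k in keys)
        a = cov / var
        b = mean_p - a * mean_k
        err = max(abs(i - (a * k + b)) for i, k in enumerate(keys))
        return a, b, int(err) + 1

    def lookup(keys, model, key):
        a, b, err = model
        guess = int(a * key + b)
        lo = max(0, guess - err)
        hi = min(len(keys), guess + err + 1)
        i = bisect.bisect_left(keys, key, lo, hi)
        return i if i < len(keys) and keys[i] == key else None

    keys = sorted(x * x for x in range(1000))   # smooth, learnable data
    model = build(keys)
    print(lookup(keys, model, 250_000))         # 500**2 -> prints 500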

------
gambler
No shit. We're using a computer architecture envisioned in the 1940s. Our
main processing units are made by a near-monopoly. The situation is so bad
that to do any real computing we need to put another computer (a GPU) inside
our computer. If you think about it, it's patently absurd.

As far as grand new ideas go: the Fleet architecture by Ivan Sutherland.

As far as master-of-the-obvious ideas go: make a hybrid unit that contains
flash, RAM, and an FPGA-based processor. Persistent by default. Connect a lot
of them and get rid of disks, caches, and memory bottlenecks. Do something
similar to what XMOS does for peripherals (simulate them) to simplify hardware
even further.

You could even combine the two ideas above. Make a computer where programming
isn't just about instructions, but about re-configuring hardware and
information routing within the system.

------
blackrock
Maybe it's time to fuse the CPU and memory together?

Fuse an i7 processor with 32 GB of RAM. That should be more than sufficient
for normal consumer needs at this time.

You could still allow for additional memory, but it would function as level-2
memory that is slower to access.

~~~
theandrewbailey
AMD has been shipping GPUs with HBM (on-package RAM) for years. Recently, AMD
and Intel struck a deal to put that GPU+HBM combination in a CPU package.

[https://wccftech.com/intel-kaby-lake-g-amd-4gb-hbm-gpu-pictured-vega-11/](https://wccftech.com/intel-kaby-lake-g-amd-4gb-hbm-gpu-pictured-vega-11/)

When HBM first came out, I mused about how much faster and more efficient a
CPU+HBM chip could be. I wonder what's stopping it from happening.

------
anonytrary
The chips of the future may be designed nothing like the chips of the past.
This will call for more than just basement circuit rehashing. The future of
chips will rest on novel solid-state and condensed-matter research.

------
tejasmanohar
Friendly reminder that "The Free Lunch Is Over: A Fundamental Turn Toward
Concurrency in Software" [0] was published in 2005 (~13 years ago).

Thinking aloud here: is the lull in chip design due to the shift in focus from
hardware to software after the boom of internet companies circa 2000? Just
compare entry-level EE vs. CS job openings or pay, for example. Or maybe it's
just that, like markets, R&D operates in cycles? Regardless, it would be
interesting to see a comparison of the $$$ invested in R&D on each side over
the last _n_ years.

[0]: [http://www.gotw.ca/publications/concurrency-ddj.htm](http://www.gotw.ca/publications/concurrency-ddj.htm)
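
The paper's point fits in a few lines: serial code stopped getting faster for
free, so the work has to be spread across cores explicitly. A minimal,
self-contained illustration (the workload is a toy, purely for shape):

    # "Free lunch is over" in miniature: a CPU-bound task fanned out
    # across cores with multiprocessing.
    from multiprocessing import Pool
    import time

    def work(n):
        """A deliberately CPU-bound chunk of work."""
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        chunks = [2_000_000] * 8

        t0 = time.perf_counter()
        serial = [work(n) for n in chunks]
        t_serial = time.perf_counter() - t0

        t0 = time.perf_counter()
        with Pool() as pool:           # one worker per core by default
            parallel = pool.map(work, chunks)
        t_parallel = time.perf_counter() - t0

        assert serial == parallel
        print(f"serial: {t_serial:.2f} s, parallel: {t_parallel:.2f} s")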

------
mhkl
I suggest looking at Mill Computing. They are designing a new CPU that aims
for the performance of an Intel chip, the security of a mainframe, and the
power usage of a Raspberry Pi.

~~~
boznz
Interesting, but I don't see anything about physical hardware or timelines in
the articles. Can someone do a TL;DR summary?

