
The Tyranny of the Clock – Ivan Sutherland (2012) [pdf] - mr_tyzic
http://worrydream.com/refs/Sutherland%20-%20Tyranny%20of%20the%20Clock.pdf
======
francoisLabonte
In the networking world, Fulcrum used asynchronous logic to build some very
low-latency switch chips for switches and routers. The Alta switch chip was
the last of that generation.

[http://www.hotchips.org/wp-content/uploads/hc_archives/hc23/...](http://www.hotchips.org/wp-content/uploads/hc_archives/hc23/HC23.19.6-Networking/HC23.19.620-Frame-Pipeline-Davies-Fulcrum-proceedings.pdf)

Intel acquired Fulcrum and has not shipped a new product since. One can
speculate that they were acquired in part for their experience and tools for
designing asynchronous pipelines.

In the DSP world, Octasic makes DSPs that use asynchronous designs:

[http://www.octasic.com/technology/opus-dsp-architecture](http://www.octasic.com/technology/opus-dsp-architecture)

~~~
wmf
The Fulcrum/Intel FM10000 was released but it didn't get much love.

------
phkahler
>> Imagine what software would be like if subroutines could start and end only
at preset time intervals. “My subroutines all start at 3.68 millisecond
intervals; how often do yours start?”

Mine start at 50 microsecond intervals. I've worked on stuff with shorter and
longer intervals. Sometimes we have lists of tasks that need to run at
different rates, so scheduling becomes a real pain. Welcome to the world of
real-time embedded software in high-performance systems. The same thing
applies: we make sure the worst-case execution time is within the allowed
interval and use a master clock to sync everything up.
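
The shape of it is something like this minimal C sketch (the task names and
rates are illustrative, not from any real system):

    #include <stdint.h>

    /* Hypothetical cyclic executive: a hardware timer fires every 50 us
       (the "master clock") and tasks run at multiples of that base rate.
       Each task's worst-case execution time must fit within its slot. */

    static volatile uint32_t tick;      /* incremented by the timer ISR */

    void timer_isr(void) { tick++; }    /* wired to a 50 us hardware timer */

    static void task_50us(void)  { /* WCET must stay under 50 us */ }
    static void task_500us(void) { /* runs every 10th tick       */ }
    static void task_10ms(void)  { /* runs every 200th tick      */ }

    int main(void)
    {
        uint32_t last = 0;
        for (;;) {
            while (tick == last) { }    /* wait for the next tick edge */
            last = tick;
            task_50us();
            if (last % 10 == 0)  task_500us();
            if (last % 200 == 0) task_10ms();
        }
    }

Miss your window once and the whole schedule is suspect, which is why the
worst-case analysis matters more than the average case.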

~~~
jacquesm
I've done quite a few of these embedded systems with real-time constraints,
and your summary is quite accurate. The good part for me is that once you have
things nailed down, they (usually, and if not then you're really in for a long
night) don't shift, and what works will continue to work reliably.

This is in contrast to non-real-time systems, which tend to just freeze for
random periods of time (sometimes seconds or even minutes) without any
apparent cause. That's something that really puzzles me about today's
software+hardware. In theory it should all be faster than ever, but in
practice I spend as much or even more time waiting for my computers than I
ever did in the past.

Maybe I'm just more impatient but I don't believe that's the reason here.

Real time should be the norm, not the exception, just like encrypted
communications should be the norm, not the exception.

Computers should respond without noticeable latency to user input at all
times.

~~~
daveloyall
Remember moving your cursor through menu options in an NES game? (Or using an
8-bit word processor?) Computers should be like that. Consumers shouldn't
accept anything else.

Responsive interfaces allow users to develop muscle memory.

I used to type complicated sequences of commands into my Commodore 64 to
perform common actions. (Wow, I didn't know I wanted a $HOME with some scripts
in it. Now I know!)

When I'd make a typo in one of those sequences, it would commonly be quicker
for me to reset the machine and start from the top.

If I performed the same action twice (with a reset in between) and got a
different result, I could logically conclude that I had a hardware problem.
(Not every HW problem is permanent. Heat and grounding problems can both be
fixed, unless some threshold is crossed...)

Anyway, I figure that Wintel denied grandma that kind of computer because they
learned from the hobbyists that neophyte users with quality hardware+software
will exceed the creators in skill in less time than it takes to design the
next-gen rig--and a couple of genius users will start building their own out
of impatience!

There's no profit in quality hardware+software combinations.

------
wmf
Asynchronous logic is significantly more power efficient, so it may be one
approach to "save Moore's Law" (for one generation perhaps). But it would
probably require some company that really cares about power efficiency,
doesn't care about industry best practices, and is willing to risk hundreds of
millions in R&D.

~~~
B1FF_PSUVM
People have been poking at it for over 30 years now.

I'm chalking it up to "it's the future, and always will be..."

~~~
marcosdumay
People have been poking at every architecture improvement for 30 years.
Moore's Law makes it clear that the only winning move is to go down the most
popular path.

Now that Moore's Law is on its way out, people can actually try new things,
and discover what pays off and what does not.

~~~
Razengan
This could also be a chance to figuratively reboot computing tech, and start
anew from different fundamentals: quaternary, biological, photonic...

What may have been hard to implement 30-40 years ago may be easier now with
current technology. Some of these could definitely supplement existing
binary/boolean silicon in certain domains, if not replace it, like using
actual brains in AI-as-a-Service for image recognition and so on.

------
effie
This reminded me of one of Gustafson's arguments for changing how numerical
computations are done: the currently used principles of hardware architecture
result in hardware wasting lots of energy and time, mostly in the process of
getting numbers from RAM to the CPU and back. It seems more people already
realize this, which is good. I hope to see some general-purpose hardware
inspired by these ideas of efficient computation.

~~~
trsohmers
The inefficiency you are talking about is not due to the fact that there is a
synchronous clock (within each individual "block"; there already has to be
some async logic bridging the different clock domains of DRAM and the
processor). The waste in getting numbers from RAM to the register file is
primarily due to the hardware-managed cache hierarchy, which we are addressing
at REX Computing, with John Gustafson as an advisor.

~~~
sliverstorm
Waste? Or overhead in exchange for having a cache?

~~~
trsohmers
Obviously tooting my own horn here, but our solution replaces hardware-managed
caches with software-managed ones... You get larger, lower-latency "caches" of
SRAM, which also use less power moving data from RAM to the register file
through them, since we control the hierarchy at a very fine granularity.
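
As a toy illustration of the difference in C (the scratchpad array and
dma_copy stand-in are hypothetical, not our actual API):

    #include <stdint.h>
    #include <string.h>

    /* Software-managed "cache": the program stages each tile of data into
       fast local SRAM explicitly, so data movement is predictable and no
       energy goes to tag lookups or evictions the software didn't ask for.
       Here the scratchpad is a plain array and dma_copy wraps memcpy; on
       real hardware they'd be an SRAM region and a DMA engine. */

    #define SCRATCH_WORDS 1024
    static int32_t scratch[SCRATCH_WORDS];   /* stand-in for on-chip SRAM */

    static void dma_copy(void *dst, const void *src, size_t bytes)
    {
        memcpy(dst, src, bytes);             /* stand-in for a DMA engine */
    }

    int32_t sum_buffer(const int32_t *dram_buf, size_t n)
    {
        int32_t total = 0;
        for (size_t off = 0; off < n; off += SCRATCH_WORDS) {
            size_t chunk = n - off < SCRATCH_WORDS ? n - off : SCRATCH_WORDS;
            /* Stage one tile from DRAM into local SRAM, explicitly. */
            dma_copy(scratch, dram_buf + off, chunk * sizeof scratch[0]);
            for (size_t i = 0; i < chunk; i++)
                total += scratch[i];         /* all reads hit local SRAM */
        }
        return total;
    }

The point is that every byte moved is a byte the program asked for: no tag
comparisons, no speculative fills, no evictions the software didn't schedule.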

------
peter_d_sherman
Handshake Technology:
[http://www.ispd.cc/slides/slides_backup/ispd06/8-2.pdf](http://www.ispd.cc/slides/slides_backup/ispd06/8-2.pdf)

Warning: slightly commercial in nature, but there is some good information
about how it works starting on page 4. Worth reading from there.

------
EdwardCoffin
A few months ago there was another discussion here of an older article of his
on the same topic [1].

Archive.org has some of their old FLEET architecture papers and slide decks
[2].

[1]
[https://news.ycombinator.com/item?id=11425533](https://news.ycombinator.com/item?id=11425533)

[2]
[https://web.archive.org/web/20120227072220/http://fleet.cs.b...](https://web.archive.org/web/20120227072220/http://fleet.cs.berkeley.edu/)

------
daveloyall
Hm, I came up with this idea independently, 5 < years < 10 ago, after reading
the first third of _Code: The Hidden Language of Computer Hardware and
Software_.

Neat!

I just figured that you could redesign common ICs so that they had a new wire
akin to the "carry" bit. I called it the 'done' wire, and I figured you could
just tie it to the CLK of the next IC. Ya know? So 'doneness' would propagate
across the surface of the motherboard (or SoC) in different ways depending on
the operation it was performing. Rather than the CLK signal, which is
broadcast to all points...

(I know that my idea is half baked and my description is worse. I'm glad I
found this PDF!)
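
In software terms, the chaining I pictured looks something like this toy C
sketch (made-up stages, obviously):

    #include <stdio.h>

    /* Toy model of the 'done' wire: each stage, when it finishes,
       triggers the next stage directly, instead of every stage
       waiting on a globally broadcast CLK edge. */

    static void stage_store(int v)  { printf("result: %d\n", v); }
    static void stage_double(int v) { stage_store(v * 2); } /* done -> next */
    static void stage_add(int a, int b) { stage_double(a + b); }

    int main(void)
    {
        /* 'Doneness' propagates: the adder finishes, which "clocks"
           the doubler, which "clocks" the store. No global clock. */
        stage_add(2, 3);
        return 0;
    }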

I knew the big advantage would be power savings. I called the idea 'slow
computing', and I envisioned an 8-bit-style machine that would run on solar or
a hand crank and be able to pause mid-calculation until enough power was
available... Just like an old capacitor-based flash camera can flash more
frequently when you have fresh batteries in it.

You'd just wire the power system up with the logic. Suppose an adder fires a
"done" at some other IC. Now, put your power system inline, like a MITM...
When it gets the "done", it charges that capacitor (a very small one? :) ),
and only when enough power is available does it propagate the "done". ...Maybe
the "done" powers the next IC. I dunno.

As I said, half baked. Glad to find out that I'm not the only one that dreamed
of 'clockless', though!

~~~
Cyph0n
The big issue with the done signal you're referring to is how you generate it.
In other words, how does the circuit "know" that it has finished executing?

There are several options. One is to simply add a delay element to each
circuit, matched to the circuit's worst-case delay. Another is to use a
circuit-level handshaking protocol, loosely similar to the handshakes used in
TCP.

It's not an easy thing to tackle, and it leads to performance loss in the long
run relative to a synchronous design.
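
For intuition, here is a software analogue of a four-phase
request/acknowledge handshake between two stages, in C11 with threads (real
async circuits do this with wires and latches; the data values are made up):

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    /* Four-phase handshake: the sender raises req with data valid, the
       receiver raises ack once it has latched, the sender drops req, the
       receiver drops ack. No clock decides when a transfer happens.
       (The matched-delay alternative instead delays a copy of req by the
       logic's worst-case delay and treats that as "done".) */

    static _Atomic int req, ack;
    static int data_wire;            /* must be stable while req is high */

    static void *sender(void *arg)
    {
        (void)arg;
        for (int v = 1; v <= 3; v++) {
            data_wire = v;                       /* drive the data       */
            atomic_store(&req, 1);               /* raise req            */
            while (atomic_load(&ack) == 0) ;     /* wait for ack         */
            atomic_store(&req, 0);               /* drop req             */
            while (atomic_load(&ack) == 1) ;     /* wait for ack to drop */
        }
        return NULL;
    }

    static void *receiver(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 3; i++) {
            while (atomic_load(&req) == 0) ;     /* wait for req         */
            printf("latched %d\n", data_wire);   /* latch the data       */
            atomic_store(&ack, 1);               /* raise ack            */
            while (atomic_load(&req) == 1) ;     /* wait for req to drop */
            atomic_store(&ack, 0);               /* drop ack             */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t s, r;
        pthread_create(&s, NULL, sender, NULL);
        pthread_create(&r, NULL, receiver, NULL);
        pthread_join(s, NULL);
        pthread_join(r, NULL);
        return 0;
    }

Either way, the "done" has to be generated by something you added: the
adder's outputs alone don't announce that they have settled.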

~~~
daveloyall
I don't understand. It's clear to me when an adder is 'done'. Hm, so I'm
guessing it does get more complicated than that. :)

~~~
Cyph0n
Yes, you're right: in the case of an adder it's pretty straightforward. It can
get complicated in other circuits.

------
Cyph0n
For those interested in asynchronous circuit design, this group is one of the
best in the world in the field.

[http://www.cs.columbia.edu/async/](http://www.cs.columbia.edu/async/)

~~~
grandalf
Any recommended links or papers to get a sense of the state of the art?

------
gradschool
There doesn't seem to be any sign of recent activity on the Asynchronous
Research Center site affiliated with the article. Is anyone aware of currently
active academic or industrial research groups in this field?

------
Ericson2314
I would personally love to try designing some fancy asynchronous stuff, but I
got the impression that current FPGAs would make this difficult.

------
tomgreen000
Do the ARM AMULET research efforts, which came out of Manchester University,
fall under this?
[https://en.wikipedia.org/wiki/AMULET_microprocessor](https://en.wikipedia.org/wiki/AMULET_microprocessor)

------
lasermike026
Just forwarded this off to my EE colleagues.

I'm for approaches that may be superior overall.

------
unixhero
I appreciated the read.

Didn't understand it though.

