
Maybe Clockless Chip Design's Time Has Come - throwaway000002
https://www.semiwiki.com/forum/content/5196-maybe-clockless-chip-designs-time-has-come.html
======
andmarios
This field has a well established name: asynchronous logic/circuit.

The author of the article failed to mention this term even once, instead
repeating the term “clockless” as if he were on some kind of branding spree.

Furthermore, one of the article's tags was “apple”, despite the content, AFAIK,
not being affiliated with Apple in any way.

~~~
drewm1980
Apple's in there as part of the dev history, though it just sounds like they
hired up some of the engineers working on this to work on other projects.

------
parma
Another very interesting async design is GreenArrays GA144: "This very
powerful and versatile chip consists of an 18x8 array of architecturally
identical, independent, complete F18A computers, or nodes, each of which
operates asynchronously. Each computer is capable of performing a basic ALU
instruction in ~1.5 nanoseconds for an energy cost on the order of 7
picojoules". Programmed in Forth.
[http://www.greenarraychips.com/index.html](http://www.greenarraychips.com/index.html)

~~~
ant6n
That would be 667 MHz at 4.7 milliwatts for people who believe in different
units.
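
For anyone who wants to check the conversion, here is a minimal sketch of the arithmetic. It uses only the two figures quoted from GreenArrays above (about 1.5 ns and about 7 pJ per instruction); the variable names are made up for illustration, and the result is per node, not for the whole chip.

```python
# Figures quoted from the GreenArrays description above (both approximate).
instr_time_s = 1.5e-9     # ~1.5 ns per basic ALU instruction
instr_energy_j = 7e-12    # ~7 pJ per instruction

# Instructions per second is the reciprocal of the time per instruction.
rate_hz = 1 / instr_time_s

# Power is energy per instruction divided by time per instruction.
power_w = instr_energy_j / instr_time_s

print(f"{rate_hz / 1e6:.0f} MHz, {power_w * 1e3:.1f} mW")  # 667 MHz, 4.7 mW
```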

~~~
i336_
That is _impressive_ power consumption.

I'll admit I'm not especially familiar with the field, but that sounds quite
exceptional to me.

Is there anything out there offering this sort of price/performance ratio
that's easily accessible, ideally programmable using open-source hobbyist
tools?

------
throwaway000002
Not that the article is particularly detailed, but squinting at the clock
diagram for Wave Semi, it appears to be some kind of token passing, where
the token is the clock.

I guess this question is for the semi- folks out there: given any n-bit
boolean function, for smallish n, is there a way to implement the function so
that the output is rock-stable (within reason) and only flips if, given an
input flip, the corresponding function value flips?

Because, if not, I don't see how this clock passing system won't require
factoring in logic delay.

I'm sure a lot of details have been elided. Also, if it isn't apparent, I have
no logic design knowledge.

All I know is that, from an outsider's perspective, having a global clock is
ludicrous. When will designers finally kill it?! Us software people have had
to kill the concept of intrinsic clocks with our consensus protocols for a
while now...

~~~
javcasas
"Given any n-bit boolean function, for smallish n, is there a way to implement
the function so that the output is rock-stable (within reason) and only flips
if given an input flip the corresponding function flips?"

Yep, there is a way. So you are designing your logic circuit as usual:

1\. For each input, decide the output

2\. Simplify the function using a Karnaugh Map
[https://en.wikipedia.org/wiki/Karnaugh_map](https://en.wikipedia.org/wiki/Karnaugh_map)

3\. From the Karnaugh Map, implement it with AND-OR gates.

When you are doing the Karnaugh Map you have to ensure that every pair of
adjacent 1s (or 0s, depending on whether you are doing standard or inverse
logic) is covered by a common group. You will use more gates to implement the
same logic. Or you can use fewer gates, and wait for the output to stabilize,
you know, with a clock.

Update:

Wikipedia shows an example:
[https://en.wikipedia.org/wiki/File:K-map_6,8,9,10,11,12,13,1...](https://en.wikipedia.org/wiki/File:K-map_6,8,9,10,11,12,13,14_anti-race.svg)

This has a race condition on the inverse implementation, and thus can generate
glitches if used in an asynchronous circuit (but it is fine on a synchronous
one):
[https://upload.wikimedia.org/wikipedia/commons/archive/0/02/...](https://upload.wikimedia.org/wikipedia/commons/archive/0/02/20071022024729!K-map_6%2C8%2C9%2C10%2C11%2C12%2C13%2C14_anti-race.svg)

We can fix it by adding another Karnaugh group, which makes it glitch-free, so
it can then be used in asynchronous circuits:
[https://upload.wikimedia.org/wikipedia/commons/archive/0/02/...](https://upload.wikimedia.org/wikipedia/commons/archive/0/02/20100810221056!K-map_6%2C8%2C9%2C10%2C11%2C12%2C13%2C14_anti-race.svg)
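
The grouping rule can be checked mechanically. Below is a rough sketch, assuming the function is the one named in the Wikipedia file (minterms 6, 8, 9, 10, 11, 12, 13, 14 over A, B, C, D with A as the most significant bit). The product terms AC' + AB' + BCD' and the extra term AD' are my own reading of that minterm list, not something stated in this thread, and the check covers the sum-of-products form only, not the inverse implementation. The helper names are hypothetical.

```python
from itertools import combinations

N = 4  # variables A, B, C, D (A is the most significant bit)
MINTERMS = {6, 8, 9, 10, 11, 12, 13, 14}

# Each product term is a pattern over ABCD:
# '1' = variable true, '0' = variable complemented, '-' = variable absent.
MINIMAL = ["1-0-", "10--", "-110"]   # AC' + AB' + BCD' (assumed minimal cover)
HAZARD_FREE = MINIMAL + ["1--0"]     # same function plus the redundant term AD'

def covers(term, m):
    """True if the product term evaluates to 1 on input combination m."""
    bits = format(m, f"0{N}b")
    return all(t == "-" or t == b for t, b in zip(term, bits))

def evaluate(cover, m):
    """Value of the sum-of-products expression on input combination m."""
    return any(covers(t, m) for t in cover)

def static1_hazards(cover, minterms):
    """Pairs of adjacent 1-cells that no single product term covers.

    When one input flips between such a pair, one AND gate turns off while
    another turns on, so the OR output may briefly glitch to 0.
    """
    hazards = []
    for a, b in combinations(sorted(minterms), 2):
        if bin(a ^ b).count("1") != 1:
            continue  # not adjacent: more than one input differs
        if not any(covers(t, a) and covers(t, b) for t in cover):
            hazards.append((a, b))
    return hazards

print(static1_hazards(MINIMAL, MINTERMS))      # [(10, 14), (12, 14)]
print(static1_hazards(HAZARD_FREE, MINTERMS))  # []
```

Both covers compute the same truth table; the extra term only bridges the two transitions where the minimal cover hands over from one AND gate to another.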

~~~
signa11
> 2\. Simplify the function using a Karnaugh Map
> [https://en.wikipedia.org/wiki/Karnaugh_map](https://en.wikipedia.org/wiki/Karnaugh_map)

> 3\. From the Karnaugh Map, implement it with AND-OR gates.

oh dear, that brings back so many memories. however, for modern day
optimizations with millions of primitives, kmaps are totally unfeasible.
binary decision diagrams aka BDDs, no not the other one ;), are what
make things somewhat reasonable.

edit: if you are really curious about these things, then the text by Zvi
Kohavi (Switching and Finite Automata Theory) is an _excellent_ primer.

~~~
javcasas
Sure, I will not use Karnaugh maps when I design a processor. They are awesome
for teaching logic gates, but useless for building big logic. It's like
trying to build a skyscraper with just bricks and mortar. Given enough
effort you will succeed, but you are likely to die of old age long before.

------
brudgers
Asynchronous circuits:
[https://en.wikipedia.org/wiki/Asynchronous_circuit](https://en.wikipedia.org/wiki/Asynchronous_circuit)

~~~
throwaway000002
There's a brilliant bit in the wikipedia entry:

The Caltech Asynchronous Microprocessor (1988) was the first asynchronous
microprocessor (1988). Caltech designed and manufactured the world's first
fully Quasi Delay Insensitive processor. During demonstrations, the
researchers amazed viewers by loading a simple program which ran in a tight
loop, pulsing one of the output lines after each instruction. This output line
was connected to an oscilloscope. When a cup of hot coffee was placed on the
chip, the pulse rate (the effective "clock rate") naturally slowed down to
adapt to the worsening performance of the heated transistors. When liquid
nitrogen was poured on the chip, the instruction rate shot up with no
additional intervention. Additionally, at lower temperatures, the voltage
supplied to the chip could be safely increased, which also improved the
instruction rate—again, with no additional configuration.

Amazing. I want this right _now_!

~~~
bjwbell
Chips today have thermal clock rate scaling. Thermal control must be
explicitly disabled to not get clock rate scaling.

If you haven't read AnandTech, they have articles covering this, since clock
rate scaling increases the difficulty of benchmarking.

------
nraynaud
I think there is a technological path for gradually introducing asynchronous
design into the clocked world: you can give asynchronous blocks clocked inputs
without problem, and you can gate their outputs at their worst-case propagation
time without problem either.

I think with such an easy technological path, there is no intrinsic reason not
to have them around us, even exposed as clocked systems. If someone told me
today that there already are asynchronous blocks in a famous silicon chip I
would certainly not be surprised.

For example, I'm not a silicon guy so I'm guessing, but adders are generally
exposed as one-cycle instructions in software, yet they still have to
propagate the carry across the width of the word, so I guess adders are simply
a tree of gates that propagates asynchronously, and we know the carry
propagates faster than the clock tick.

~~~
rdc12
That sounds like you are hinting at some form of prefix adder. They are quite
neat in that their delay scales logarithmically with N, the bit length.
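
To make the logarithmic scaling concrete, here is a small bit-level sketch of one particular prefix adder, the Kogge-Stone scheme. That choice is mine for illustration; the thread does not name a specific design, and the function name and return shape are hypothetical. It counts the combining levels, which is what grows as log2(N) instead of N.

```python
def kogge_stone_add(a, b, n=8):
    """Add two n-bit integers with a Kogge-Stone style prefix network.

    Returns (sum mod 2**n, carry_out, number_of_prefix_levels).
    """
    # Per-bit generate (both inputs 1) and propagate (exactly one input 1).
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]

    # After the loop, G[i] is the carry out of bit positions 0..i.
    G, P = g[:], p[:]
    levels = 0
    d = 1
    while d < n:
        G_next, P_next = G[:], P[:]
        for i in range(d, n):
            # Combine the span ending at i with the span ending at i - d.
            G_next[i] = G[i] | (P[i] & G[i - d])
            P_next[i] = P[i] & P[i - d]
        G, P = G_next, P_next
        d *= 2          # span doubles each level, hence log2(n) levels
        levels += 1

    # Sum bit i is propagate XOR carry-in, where carry-in is G[i - 1].
    total = 0
    for i in range(n):
        carry_in = G[i - 1] if i > 0 else 0
        total |= (p[i] ^ carry_in) << i
    return total, G[n - 1], levels

print(kogge_stone_add(200, 100))   # (44, 1, 3)
```

For 8 bits the network needs 3 levels and for 64 bits it needs 6, whereas a plain ripple-carry chain has a worst case proportional to the word width.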

------
n00b101
I recently asked an Intel chip designer if asynchronous circuits could be a
way to deal with the stalling of Moore's Law, and he was adamant that
asynchronous circuits are a "fantasy." His argument was that even though
synchronous clocking uses about 20% of the energy on a modern chip, with
asynchronous you still need to pass around synchronization tokens, which would
double the energy required and, worse, would be on the critical path.

------
BooneJS
Fulcrum Microsystems was a Clockless ASIC startup that was acquired by Intel
in 2011 for their Ethernet switch silicon technology.
[http://newsroom.intel.com/community/intel_newsroom/blog/2011...](http://newsroom.intel.com/community/intel_newsroom/blog/2011/07/19/intel-to-acquire-fulcrum-microsystems)

------
DigitalJack
I remember this company from about 10 years ago. I don't remember the company
name, but the logic was called NULL Convention Logic then.

~~~
nickpsecurity
Theseus Research. They're in my other comment in this thread with other links.

------
softbuilder
See also FLEET[1]. I'm surprised this hasn't gained more traction (or at least
hype). I saw Sutherland give a talk about it in 2011/2012 or so.

[1][https://inst.eecs.berkeley.edu/~cs152/fa06/lecnotes/async.pd...](https://inst.eecs.berkeley.edu/~cs152/fa06/lecnotes/async.pdf)

------
nickpsecurity
I recently dropped a lot of links to asynchronous cell libraries, chips, and
so on in this comment:

[https://news.ycombinator.com/item?id=10621937](https://news.ycombinator.com/item?id=10621937)

Just in case pros want to comment on any of it or think it's cool.

------
theon144
>They also had to invent a new type of gate – one that switches based on the
sum of the number of its input that are at logic 1.

What? Isn't that just AND? I wasn't much the wiser from looking at the
diagram provided, so I might be missing something.

~~~
akovaski
Looking at the waveform graph on the site, it looks like it turns the output
to 1 when both inputs are 1, and turns the output to 0 when both inputs are 0
(In other words, it assigns the output to the inputs when the inputs are in
agreement). Though, that's just a guess.
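
The behaviour guessed at above matches what is usually called a Muller C-element: the output follows the inputs when they agree and holds its previous value when they disagree. Whether that is the actual gate in the article is an assumption on my part. A tiny simulation (class and function names are hypothetical) shows how it differs from a plain AND:

```python
class CElement:
    """Two-input gate with memory: follows the inputs only when they agree."""

    def __init__(self, initial=0):
        self.out = initial

    def step(self, a, b):
        if a == b:
            self.out = a   # inputs agree: output takes their value
        return self.out    # inputs disagree: output holds its old value


def run(inputs, initial=0):
    """Feed a sequence of (a, b) pairs through a fresh C-element."""
    gate = CElement(initial)
    return [gate.step(a, b) for a, b in inputs]


inputs = [(0, 0), (1, 0), (1, 1), (0, 1), (1, 1), (0, 0), (0, 1)]
c_trace = run(inputs)
and_trace = [a & b for a, b in inputs]

print(c_trace)    # [0, 0, 1, 1, 1, 0, 0]
print(and_trace)  # [0, 0, 1, 0, 1, 0, 0]
```

The two traces differ at the fourth step: AND drops to 0 as soon as one input falls, while the C-element keeps its 1 until both inputs have returned to 0. So it is not just AND, because the output depends on history as well as on the current inputs.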

------
exabrial
Didn't some company have a clockless ARM core available a few years back?

~~~
davelnewton
Yes, about ten years ago:

[http://www.eetimes.com/document.asp?doc_id=1299083](http://www.eetimes.com/document.asp?doc_id=1299083)

There were chips before that, too.

------
AYBABTME
I wonder how real time systems would work without a clock to synchronize
everything.

~~~
nraynaud
Actually, you would want your system to react at the speed of its input (the
world as seen from its point of view), not at the speed of its clock. And if
you actually wanted a clock (say for an alarm clock in the morning), you could
still input a clock to your system; that wouldn't entail gating the whole
system and having clock domains and that stuff.

~~~
davelnewton
That's not necessarily true, especially when interfacing with mechanical
components.

------
ertyuiopas
Leakage is the big problem in deep submicron. You can't keep a state without
periodically refreshing it (cf. DRAM). Also, asynchronous circuits are very
prone to metastability. If these problems were easily solved, we'd have seen
this old idea be widespread by now.

