
What happened to clockless computer chips? - CapacitorSet
http://stackoverflow.com/questions/530180/what-happened-to-clockless-computer-chips
======
gradschool
The quote from Ken Stevens, whose opinion I respect, makes a compelling
argument, and certainly Intel has a lot of expertise in asynchronous design,
but I'm wondering if its position is similar to that of Kodak inventing
digital imaging in the 1980s. That is, there would be less money for Intel to
make in a future where asynchronous design prevails, so there's no incentive
for them to develop it. With a properly executed tool chain, asynchronous
design would be a comparable skill to software development (that is, less of
an elitist activity than it is now) and the circuits would be more likely to
work on the first try because a whole class of hardware bugs wouldn't be a
thing anymore. Any comments or am I just a plonker for believing this?

~~~
pjc50
> circuits would be more likely to work on the first try because a whole class
> of hardware bugs wouldn't be a thing anymore

This reminds me of 4GL hype.

I'd say it's the other way round. People have trouble learning synchronous
logic design because everything happens in parallel, but it's a set of well-
understood building blocks, whereas asynchronous design is entirely free of
familiar reference points and there isn't quite the same set of standard
idioms.

Yes, you get rid of setup/hold violations, but you're going to get a whole
_new_ class of problems, with the added complication that "timing closure" is
now a moving target.
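
For the uninitiated, here's a toy sketch of the setup/hold check in question;
the window widths and arrival times are invented numbers, not real device
parameters:

    # Toy setup/hold check for a single D flip-flop (invented window widths).
    SETUP_NS = 0.2  # D must be stable this long *before* the clock edge
    HOLD_NS = 0.1   # ...and this long *after* it

    def check_edge(clock_edge_ns, data_change_ns):
        """Classify one data transition relative to one clock edge."""
        if clock_edge_ns - SETUP_NS < data_change_ns < clock_edge_ns:
            return "setup violation"
        if clock_edge_ns <= data_change_ns < clock_edge_ns + HOLD_NS:
            return "hold violation"
        return "ok"

    print(check_edge(10.0, 9.95))  # changed 0.05ns before the edge: setup violation
    print(check_edge(10.0, 9.0))   # settled well before the edge: ok

In synchronous design the tools check every such path against one clock; in
an async design the equivalent constraints are local and shift as the design
changes, which is what makes timing closure a moving target.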

Async design _in simulation_ is not some great hidden secret; it's just a
trackless jungle. People are welcome to play with it, and if you find a great
way of teaching async design I'm sure we'd all like to hear about it.

~~~
kruhft
> synchronous logic design because everything happens in parallel, but it's a
> set of well-understood building blocks, whereas asynchronous design is
> entirely free of familiar reference points and there isn't quite the same
> set of standard idioms. ... Yes, you get rid of setup/hold violations, but
> you're going to get a whole new class of problems, with the added
> complication that "timing closure" is now a moving target.

Sounds like the argument about the problems of, and differences between, typed
and untyped programming languages. One set of problems is solved by typing, at
the cost of expressiveness when the type system is lacking and/or cumbersome.

I'm thinking async hardware is the untyped language of HW design, but one that
also takes off the safety guards built up from years of experience in
synchronous design.

------
jacknews
"The answer is that although the chip ran three times as fast and used half
the electrical power as clocked counterparts..."

I thought another advantage was that it modularized chip design to some degree
- i.e. you can improve individual sections to run much faster without having
to make the entire chip run at that speed -
[http://www.cs.virginia.edu/~robins/Computing_Without_Clocks....](http://www.cs.virginia.edu/~robins/Computing_Without_Clocks.pdf)

In any case, in a similar vein there is TTA
([https://en.wikipedia.org/wiki/Transport_triggered_architectu...](https://en.wikipedia.org/wiki/Transport_triggered_architecture))
which I think Ivan Sutherland was also involved in, but maybe I'm wrong about
that.

~~~
nradov
Clock multipliers are already used to make individual sections run faster.

------
imode
one of the main issues in asynchronous design is gate delay, i.e. the delays
of individual logic gates compounding inside a circuit.

there are ways of dealing with this: delay-insensitive circuitry, QDI
circuitry, micropipelines...

if you're interested in the "state of the art", as I learned it, look no
further than Principles of Asynchronous Circuit Design[1], a fantastic book. a
PDF[2] is also available. another fascinating piece of work is
Micropipelines[3] by none other than Sutherland.

the main benefit of asynchronous architectures is their ability to be
expressed in software. the primitives used (as you'll read) function very much
like traditional software constructs based around control and dataflow. this
is very easy when your entire architecture is based around pipelines, which
can choose to hold and "re-transmit" data.

asynchronous design isn't dead or disadvantageous. we just forgot about it for
some reason.
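
as a software analogy (my own sketch in Python, not taken from the book):
first a Muller C-element, the basic state-holding gate in many async designs,
then a two-stage "pipeline" where blocking queues of size 1 stand in for the
request/acknowledge handshake.

    # Muller C-element: output follows the inputs when they agree,
    # otherwise it holds its previous value.
    class CElement:
        def __init__(self):
            self.out = 0

        def step(self, a, b):
            if a == b:
                self.out = a
            return self.out

    ce = CElement()
    print(ce.step(1, 0), ce.step(1, 1), ce.step(1, 0), ce.step(0, 0))  # 0 1 1 0

    import queue, threading

    def stage(inbox, outbox):
        # one pipeline stage: a full queue is an un-acknowledged request,
        # so backpressure plays the role of the acknowledge wire.
        while True:
            item = inbox.get()       # wait for a "request" from upstream
            if item is None:         # sentinel: drain and stop
                outbox.put(None)
                return
            outbox.put(item + 1)     # "compute", then request downstream

    a, b, c = queue.Queue(1), queue.Queue(1), queue.Queue(1)
    threading.Thread(target=stage, args=(a, b)).start()
    threading.Thread(target=stage, args=(b, c)).start()
    for x in (10, 20, None):
        a.put(x)
    while (y := c.get()) is not None:
        print(y)                     # -> 12, 22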

[1]:
[http://www.springer.com/us/book/9780792376132](http://www.springer.com/us/book/9780792376132)
[2]:
[http://www.orbit.dtu.dk/files/2775719/imm855.pdf](http://www.orbit.dtu.dk/files/2775719/imm855.pdf)
[3]:
[https://pdfs.semanticscholar.org/b840/ae4b928964eff41206f89f...](https://pdfs.semanticscholar.org/b840/ae4b928964eff41206f89f0620820ea161d3.pdf)

~~~
mycall
Can typical FPGAs support async circuits?

~~~
consp
Yes. Is it practical? No. You'd probably have to calculate (and know) all
interconnect path delays for all solutions you want to try. You'd need to take
into account not just the available gates but also their individual delays.
I'd stick with emulation for now, though there are async 8051 processors which
are extremely interesting due to their low power usage.[1]
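
To give a flavour of that bookkeeping, here's a sketch with invented
post-route numbers: in a bundled-data style the "request" signal has to arrive
after the slowest data bit, so every routed path delay matters.

    # Hypothetical post-route delays; real numbers come from the FPGA tools.
    data_path_delays_ns = {"d0": 1.8, "d1": 2.3, "d2": 2.1, "d3": 2.6}
    request_delay_ns = 3.0   # matched delay line carrying the request
    MARGIN = 1.15            # safety factor for process/voltage/temperature

    worst = max(data_path_delays_ns.values())
    if request_delay_ns >= worst * MARGIN:
        print(f"ok: request ({request_delay_ns}ns) covers the worst data "
              f"path ({worst}ns) with margin")
    else:
        print("bundling constraint violated: request may outrun the data")

And you'd have to redo this check for every bundle, after every re-route.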

[1]:
[http://ieeexplore.ieee.org/document/4519392/](http://ieeexplore.ieee.org/document/4519392/)

------
david-given
Chuck Moore's F18A Forth processor is async, and claims to execute a basic
instruction in about 1.5ns, which works out to roughly 700 MIPS (1 / 1.5ns is
about 667 million instructions per second).

[http://www.greenarraychips.com/home/documents/greg/PB003-110...](http://www.greenarraychips.com/home/documents/greg/PB003-110412-F18A.pdf)

------
n00b101
I asked an Intel chip designer about this and his opinion was that
asynchronous processors are a "fantasy." His reasoning was that an
asynchronous chip would still need to synchronize data communication within
the chip. Apparently global clock synchronization accounts for about 20% of
the power usage of a synchronous chip. In the asynchronous case, if you had to
synchronize every communication, the cost of communication would double.

~~~
HelloNurse
What do you mean by "synchronizing data communication within the chip"? For
example, is there something that can be "synchronized" in a ring oscillator,
the simplest kind of unclocked logic?
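
For concreteness, a toy model of the ring: an odd number of inverters in a
loop has no stable state, so it oscillates at a rate set only by the gate
delays (all numbers here are arbitrary).

    GATE_DELAY = 1.0        # arbitrary time units per inverter
    N = 3                   # must be odd, or the loop settles

    state = [0, 1, 0]       # initial values chosen to seed a travelling edge
    t = 0.0
    for _ in range(12):
        # each node becomes the inversion of the node driving it
        state = [1 - state[(i - 1) % N] for i in range(N)]
        t += GATE_DELAY
        print(f"t={t:4.1f}  nodes={state}")
    # every node toggles with period 2 * N * GATE_DELAY, with no clock anywhere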

~~~
pjc50
I think this has been garbled, but he's referring to synchronisation across
clock and power domains.

Normal D flip-flops require that the inputs are not changing at the moment the
clock edge arrives. If you violate this you get "metastability" and data loss.
Special structures are needed when you move data from a fast-clocked area to a
slower one. On processors, the core usually runs at one (maybe variable!)
speed while the peripherals and DRAM are at a lower speed (what used to be
called the "front side bus").

As to the application for async, maybe he's right and maybe he isn't. There
would have to be synchronisation to fixed external bus speeds, but 20% seems
very high as a proportion of power consumption.

------
jcoffland
Anyone here working on async chips these days?

~~~
pjc50
I used to work at a startup
([https://www.linkedin.com/company/azuro](https://www.linkedin.com/company/azuro))
that was commercialising the founders' work on asynchronous design. They
eventually sold to Cadence.

Asynchronous design is a _hard_ sell; we scaled it back to just applying some
async techniques to the clock tree of synchronous designs for moderate
performance/power improvements.

There are three problems we found:

\- the existing toolchain is synchronous-orientated, so you'd have to replace
all of it and retrain your staff.

\- the chip developers and their managers tend to be older and more
conservative than the software world. They're also potentially large teams
(Intel are obviously _huge_). So the retraining is going to be difficult and
expensive.

\- it's risky. New toolchain and newly trained staff? There are going to be
bugs in the tools and errors in the design. Worse, there will be _new types_
of bug
that people aren't good at diagnosing. These will take weeks to resolve and a
couple of wafer sets. That's a very expensive proposition!

If async is to get a foothold it will be, like ARM, by starting at the low end.
Low power consumption is an obvious pitch for microcontrollers, and the
simpler designs will be less risky.

~~~
phkahler
I'm starting to feel like a broken record, but... this is a perfect
opportunity for RISC-V.

~~~
pjc50
Why do you say that?

I can't see anything about RISC-V (or any ISA!) that makes a difference to
those three points.

~~~
phkahler
It's a very simple instruction set. Whatever the challenges are with async
design and verification, the task will be simplified by a smaller/simpler
processor design. The open source nature of it also means any company or
researcher is free to do what they want with it. Design tool companies for
example could create that alternative workflow for async and promote it by
saying "look at the results we got on RISC-V" and make all the comparisons
they want. They could also release the design if they wanted. It's hard to see
any of that happening with something like ARM or x86.

~~~
pjc50
The smaller ARM cores are quite simple, and as mentioned upthread there was
already an async ARM - AMULET.

Tool complexity isn't really to do with size or complexity of design, although
size affects runtime. It's a question of how accurate the physical modelling
is and how well manufacturers trust an OK from the tools. And whether the
engineers trust the tools and can use them effectively.

There aren't all that many design tool companies, either. Remember, I worked
for
one. The Cadence/Synopsys duopoly is quite strong for the usual reasons.

Fundamentally what you're asking is for someone to make a quarter-million-
dollar+ bet on async. It's easy to say "sure it'll be great" when it's not
your money.

(Maybe the easy way to do it is to build an app for sending "yo!" to your
friends, raise the $1.5m VC, and spend it on silicon instead...)

(Less snarky edit: you know that async isn't a magic dust to apply to existing
designs, and that someone would have to write an entirely new core targeting
RISC-V in an async style?)

~~~
phkahler
>> Less snarky edit: you know that async isn't a magic dust to apply to
existing designs, and that someone would have to write an entirely new core
targeting RISC-V in an async style?

I have no doubt that will eventually happen. There are lots of RISC-V designs
already and more underway. If async is really a win, someone wanting to prove
that will do a RISC-V chip with it.

------
joe563323
Just a dumb question: will it affect the instruction set of the CPU? Does
programming change fundamentally with async CPUs?

~~~
nine_k
Modern x86 with its out of order execution is already pretty "async". Little
changed for programmers, except those writing very low-level code, or
compilers.

~~~
joe563323
Does this mean not all instructions are treated equally? Do some instructions
need to make an extra check before or after execution, or will some
instructions be deprecated and new ones added?

------
lightlazer
Could asynchronous designs improve performance of future CPUs if and when
Moore's Law hits physical limits?

~~~
qznc
> when Moore's Law hits physical limits?

It already did. Sophie Wilson said [0] it's 28nm forever. Scaling further
makes no economic sense (unless you really need the space, e.g. in
smartphones).

[0]
[https://youtu.be/_9mzmvhwMqw?t=34m4s](https://youtu.be/_9mzmvhwMqw?t=34m4s)

~~~
strictnein
> It already did. Sophie Wilson said [0] it's 28nm forever.

That's not quite what her slide said. It's that the transistors on 14nm are
_currently_ more expensive than those at 28nm, although that may change.

And then she states that only some things will make sense to do at less than
28nm. But a lot of the really big players are already at 14nm or will be there
very shortly. Apple, Intel, Samsung, AMD and Nvidia are at 14nm now, either
for their newest products or ones to be introduced later this year.

~~~
deepnotderp
fwiw, 14/16nm is now cheaper than 28nm due to wafer price cuts and improved
yields

------
mukundmr
I think it is a great idea for new devices. Hopefully someone is able to make
it viable commercially.

------
mozumder
Wave pipelining would be a more practical alternative to fully asynchronous
design. It keeps the design synchronous (which helps existing simulation
workflows) while removing internal pipeline registers, and their clock load,
for power and area reduction.
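
A back-of-envelope illustration with invented delay numbers (a simplified
form of the real constraint, which also has to account for clock skew and
rise/fall variation):

    # Conventional pipelining: the clock period is set by the *total* delay
    # between registers. Wave pipelining: several data "waves" coexist in the
    # logic, so the period is set by the *spread* between the fastest and
    # slowest paths instead.
    d_max = 5.0    # ns, slowest combinational path through the block
    d_min = 3.8    # ns, fastest path
    t_setup = 0.2  # ns, register setup time

    conventional = d_max + t_setup
    wave = (d_max - d_min) + t_setup
    print(f"conventional: {conventional:.1f}ns/cycle")
    print(f"wave-pipelined: {wave:.1f}ns/cycle "
          f"(~{d_max / wave:.1f} waves in flight)")

The catch is that the fastest paths have to be deliberately slowed (delay
balancing) to keep the spread small.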

------
chemmail
Nearly all chips have frequency scaling nowadays, so they're already close to
async tech. Intel has Turbo, and XFR from AMD steps it up even more. You can
only do so much with general-purpose processors; after that point you just
gotta go ASICs.

~~~
static_noise
Variable clock frequency is something very different from being clockless.

Clockless architecture is also not specific to general-purpose processor
design.

