
Why It’s So Hard to Create New Processors - Lind5
https://semiengineering.com/why-its-so-hard-to-create-new-processors/
======
artemonster
This article is spot on. Verifying hardware is a much more complex task than
designing it. I've spent ~1.5 years verifying a custom CPU core re-
implementation (next gen) that was almost binary-compatible with the previous
gen; we had our own gcc backend and LOTS of legacy programs written in
assembly. And even then it was a nightmare. Even after tons of time spent
covering the most obscure cases, our FPGA platform, running our multi-task FW,
would often crash: cue hours of back-stepping through FPGA trace dumps, where
you have a total recording of the program counter, load-store interface, and
internal registers for around 100k clocks, and you have to trace back the
execution to see where the fault occurs. Usually it would turn out to be some
weirdly obscure combination of a recently-fired interrupt, some pipeline
stall, and execution of some weird instruction. Fixed in 3 seconds when found.
Goddamn that was exciting. I wonder if there are any people here on HN who
work on the advanced HW verification techniques currently being developed in
the industry?

~~~
thechao
Yep — and I’ve already said too much. It upsets me how secretive this industry
is; I truly believe it’s at least 30 years behind the curve (and falling
further) compared to SW. I also firmly believe it is the EDA vendors who sell
this mindset because $$$.

~~~
sweden
As a hardware designer (well, mostly a hardware verification engineer), I feel
that this kind of comment is a pure misconception from software people that
gets echoed time after time on HN.

I generally see a lot more effort and better-quality tools in hardware
verification than in software validation. The hardware design and verification
industry is not really 30 years behind the curve compared to software
development.

Hardware and software are two very different problems with their own set of
constraints. It's actually more often that you see software barely tested
being deployed in the wild to millions of users rather than hardware. Hardware
bugs tend to get more attention because you can't really deploy a patch to fix
the issue.

~~~
rcxdude
Having worked with both HDLs and software tooling, I don't think it's too far
off the mark. HDL tooling is terrible software: it's hard to use, buggy,
expensive, opaque, and the support is terrible unless you're a huge customer.
Hardware verification and development still works through sheer effort, and
because there's a strong business incentive to get things right the first time
due to the huge costs of a respin. With better tools this could be done faster
and cheaper with similar reliability, but perhaps there isn't enough to be
gained to make it worth the effort (and either way I don't think the companies
currently working on the tooling are capable of making anything else: this
would need disruption to really make a difference, and the barriers to entry
are huge).

~~~
tails4e
Hardware design flows are complex and do have bugs, but not in the way it
sounds. In general, HDL synthesis produces a design that matches what the user
wrote, so a logic bug in a synthesis engine is rare. The bugs exist in the
'other stuff', like trying to convince the tool to be efficient about how it
creates the design, or smart about how it closes timing. Getting a design
through the tools and working is easy; getting an optimal design is hard.
HDLs themselves have issues, but there is a wealth of quality-checking tools
built around them that mitigate those issues, and indeed mitigate/flag poor
coding from inexperienced designers (to a degree). What's interesting is that
FPGA tools, as much as they are derided, are much better than their ASIC
counterparts in terms of functionality and user-friendliness.

------
StillBored
It's probably better stated as: it's not hard to create a new processor, any
more than it is hard to create a toy OS.

The hard part is creating something that is competitive with top-of-the-line
commercial processors that have thousands of man-years of R&D poured into
them. It's not just verification, but the huge effort that goes into eking out
another couple percent on something like a branch predictor, or optimizing
some "edge case" that turns out to be a significant portion of a benchmark if
it's not done correctly. Then there are all the general optimizations that
give you a 10% uplift here and there. Worse yet, if you go with something that
doesn't have a large installed software base (x86/arm/power?), you're going to
be spending crazy amounts of effort doing compiler+application optimizations
as well.

------
buzzert
If anyone is interested in this sort of thing, I would highly recommend
checking out the documentary Rise of the Centaur. It’s about a company based
in Austin, TX, that I previously hadn’t heard of, making an x86-compatible
CPU.

They show a lot of the verification process throughout the film, including an
exciting moment when the chip boots Windows for the first time.

~~~
kqr2
On Amazon prime video: [https://smile.amazon.com/Rise-Centaur-Glenn-
Henry/dp/B01FSZU...](https://smile.amazon.com/Rise-Centaur-Glenn-
Henry/dp/B01FSZU6FK)

~~~
7373737373
Couldn't watch it outside of the US but it's also on Vimeo:
[https://vimeo.com/ondemand/riseofthecentaur](https://vimeo.com/ondemand/riseofthecentaur)

------
alain94040
For our software friends wondering what’s so special about processors: they
are the most parallel state-machine designs out there. Most other state
machines have fairly well-defined and narrow inputs and outputs. For
performance reasons, a CPU pipeline is the biggest collection of state
machines interacting with each other directly.

Therefore most formal methods blow up on CPU designs, and random coverage is
really hard to define and even harder to reach.
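The blow-up shows up even in a toy model. Here is a minimal stdlib-Python sketch (the machines and their update rule are invented purely for illustration, not real pipeline logic): coupled state machines can only be analysed together, as a product machine, and the product state space grows exponentially with the number of machines.

```python
from collections import deque

# Toy illustration: n coupled state machines, each with K local states.
# Machine 0 reacts to a 1-bit external input; every other machine reacts
# to its left neighbour, so no machine can be analysed in isolation.
K = 3

def step(states, inp):
    """One clock: machine 0 absorbs the input, the rest absorb a neighbour."""
    nxt = [(states[0] + inp) % K]
    for i in range(1, len(states)):
        nxt.append((states[i] + states[i - 1]) % K)
    return tuple(nxt)

def reachable(n):
    """BFS over every product state reachable from the all-zero reset."""
    start = (0,) * n
    seen, frontier = {start}, deque([start])
    while frontier:
        s = frontier.popleft()
        for inp in (0, 1):
            t = step(s, inp)
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return len(seen)

for n in (2, 4, 8, 12):
    print(f"{n} machines: {reachable(n)} reachable product states "
          f"(vs {n * K} states if each machine were analysed alone)")
```

Analysed independently, n machines cost only n*K states; coupled, a tool exploring the product machine faces up to K**n, which is the combinatorial wall formal methods hit on real pipelines.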

------
bem94
I try to keep a list of useful open source hardware verification tools:
[https://github.com/ben-marshall/awesome-open-hardware-
verifi...](https://github.com/ben-marshall/awesome-open-hardware-verification)

------
mhh__
Anyone working with this stuff: what are some good textbooks on verification?
Preferably with a bit of theorem proving too, but I'm not sure if that's in
the same area of study?

I'm coming from a physics background so I never really know where to start
when I inevitably start looking this stuff up at 4am.

~~~
Balgair
The super basics are here: [https://www.amazon.com/Art-Electronics-Paul-
Horowitz/dp/0521...](https://www.amazon.com/Art-Electronics-Paul-
Horowitz/dp/0521809266)

You'll first have to become 'fluent' in EE, but for a physicist, it's just
spending the time and getting used to things. Not terrible, long, but
straightforward.

As for what the article is talking about, you need to be trained in it.
Honestly, you have to apprentice with the Greybeards (they are mostly men, but
not always). There are other ways, like reading through Intel docs or the
manuals for ICs or digging through forum posts from 2003. But those guys in
the basement with funny newspaper clippings from the 80s or old xkcd printouts
are a _much_ better return on your time. They have tons of knowledge about
specific chips and machines, stuff that is nearly impossible to recite unless
prompted. You've just got to spend long lunches blabbering with them, despite
their strange political and societal views. Just listen to them, then write
down every little thing they say. They are _gold_ in terms of hardware.

~~~
rrss
Art of Electronics is a nice book, but pretty irrelevant to verification.

------
ur-whale
>“This brings a whole new set of challenges because they are speaking a
completely different language, both technically and mentally.”

Both are true, but they're an incomplete picture.

It's not just the mental models and the language but also the _culture_ that
is very different.

H/W guys are bred, born and raised in an environment that thrives on secrecy
and where nothing is ever free.

The way they transact with one another, the tools they use, and in the end,
the very thing they produce all exude that culture.

It is _exactly_ the software industry 40 years ago.

~~~
tsimionescu
It's not just HW; it's every big, niche industry today.

Networking for example is the same. If you want to test high scale network
equipment OR virtualized network functions, you will buy hundreds of thousands
of dollars worth of closed-source testing hardware, software and/or
professional services from one of a few big vendors. You will not let anything
about your algorithms and designs slip to the outside world, and neither will
your test vendors.

Edit: the same is true of most of the software world in general. Sure, you
have Microsoft and Google and many others collaborating on Linux, or releasing
Kubernetes, VS Code, Go and so on. But the core IP that is key to their
business? That is staying in-house, fiercely guarded, developed and tested by
an army of engineers.

The main difference is that there are far fewer well-defined software classes
that can be tested generally, so it doesn't make too much sense to look for a
'software testing' industry, like you can for hardware. There are some tool
vendors, but they offer far fewer guarantees, since it's hard to imagine a
product that could find a large proportion of the bugs in both the Haskell
compiler and World of Warcraft.

------
tyingq
_" Verification of a processor is different from the verification of other
pieces of IP, or even an SoC."_

Not sure I understand this. An SoC is a processor, plus more stuff (memory,
I/O), right? Is the idea that it might be easier because the "more stuff"
abstracts away some inner details?

~~~
rrss
The processor is usually already verified on its own before it is put into an
SoC.

Using the raspberry pi example from another thread: broadcom designed the
BCM2835 SoC, which included an ARM1176 core. Broadcom probably didn't do a ton
of verification for the ARM1176 core itself, since ARM already verified it.

------
bgorman
Have there been any attempts to use generative testing (e.g. QuickCheck) or
dependent types to verify processors? I am not quite sure how this would be
integrated into the synthesis of the processor, but it seems to be in line
with the general "declarative" approach to building RTL through VHDL that I
remember from undergrad.

~~~
9q9
There is a lot of random testing in processor design. To what extent you'd
call it property-based testing can be argued. Interesting factoid: Koen
Claessen, one of QuickCheck's inventors, also co-designed Lava, a circuit-
design DSL in the Haskell ecosystem.

If by dependent types you mean theorem provers, then they are used, but rarely
-- hand-verification doesn't scale to modern processors; usually you model-
check against some temporal logic formulas to show that the processor meets
its specification. If OTOH you mean using HDLs (= hardware description
languages) that use dependent types, then mostly not. Arm's ASL (=
Architecture Specification Language) has a tiny bit of dependency built in, to
reason about the length of bit vectors.
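To make the QuickCheck analogy concrete, here is a stdlib-only Python sketch of a property-based check against a golden reference model. The 8-bit "DUT" adder and its dropped-carry bug are entirely invented for illustration; real flows would use QuickCheck/Hypothesis or a SystemVerilog testbench, not this.

```python
import random

WIDTH = 8
MASK = (1 << WIDTH) - 1

def golden_add(a, b):
    """Golden reference: the architecturally correct 8-bit add."""
    return (a + b) & MASK

def dut_add(a, b):
    """Hypothetical buggy DUT model: the carry out of bit 3 is dropped,
    the kind of corner a handful of directed tests can easily miss."""
    lo = (a & 0xF) + (b & 0xF)
    hi = (a >> 4) + (b >> 4)            # bug: ignores carry from lo
    return ((hi << 4) | (lo & 0xF)) & MASK

def check_property(prop, trials=1000, seed=0):
    """QuickCheck-style driver: random inputs, return a counterexample
    if the property ever fails, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.randrange(256), rng.randrange(256)
        if not prop(a, b):
            return (a, b)
    return None

# Property: the DUT must agree with the golden model on every input pair.
cex = check_property(lambda a, b: dut_add(a, b) == golden_add(a, b))
print("counterexample:", cex)
```

The point of the sketch: the property is one line, and the random driver finds the carry bug almost immediately, whereas enumerating directed tests for every carry boundary by hand is what makes verification expensive.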

------
seemslegit
Genuine question - other than increasing the number of players, what is the
horizon of value for new processors to begin with? Is there anything more
impactful than squeezing out a bit more performance/watt?

~~~
artemonster
cost. It is stupidly expensive to include any sort of commercial MCU core(s)
in your chip. Anecdata: ARM won't even talk with small ASIC fabless companies,
even if they are willing to shell out the big upfront costs and per-chip
royalties that ARM demands. Having free or affordable alternatives is a great
driving force for the industry.

~~~
lechienquipete
> ARM won't even talk with small ASIC fabless companies, even if they are
> willing to shell out big upfront costs and royalties per chip that ARM
> demands

[Citation needed] There's literally zero upfront cost for a Cortex-M3 [0]

[0]
[https://developer.arm.com/products/designstart](https://developer.arm.com/products/designstart)

~~~
nrclark
That's pretty recent for ARM, as they face growing pressure from companies
switching to RISC-V for their internal MCUs.

------
Someone
I would say a modern CPU is a massively parallel program that has lots of
shared global state. No wonder that it’s hard to design them.

Or is that view too simple?

~~~
qppo
It's more like you have a massively parallel program that you compile ten
million times, but only five million come out of the process usable, and of
those five million, they may or may not have all the parts of the program you
intended in the binary. And you have very little insight into it while the
compilation takes place.

And you have to check/debug those 10 million compiler passes at various
stages, and each design change may require developing a new debugger or
disassembler from scratch to plug into the compiler at each stage of
compilation.

What I'm saying is that CPU designs aren't programs, because with programs you
can generally trust the compiler to be correct (compiler bugs exist, but
they're rare). In a CPU process you have to consider the physical impact of
the design on manufacturing, what yields you get, how the product is binned,
and so on.
There are feedback loops between the packaging, testing, and design teams to
alter the silicon before production ramps up to go to market. There are tons
of moving parts to the actual design process itself, let alone what is being
designed.

------
IshKebab
I don't think this is quite right. Based on my (admittedly limited)
experience, it takes a lot of work to design and verify a fast processor, but
unless your processor is very similar to existing ones (in which case why
bother?) it takes way, way more work to write all the software needed to
support it.

I guess everyone underestimates how long it takes to write software - even
hardware designers.

------
tyingq
Interesting when coupled with how many we are losing. It wasn't that long ago
that PA-RISC, Sparc, Alpha, Power, MIPS, etc all credibly competed with one
another and Intel. Now it's almost all x86-64 and ARM.

~~~
gok
ISAs are consolidating sure, but the interesting parts of the chips are also
consolidating. A few years ago several companies were designing new Arm cores;
now it's pretty much just Arm and Apple.

~~~
rrss
There are still other companies doing new core designs: Huawei/HiSilicon
(Taishan v110 in 2019), Marvell/Cavium (ThunderX2 in 2018), Samsung (M4 in
2019), Fujitsu (A64FX in 2019), Nvidia (Carmel in 2018).

If you count semi-custom cores derived from ARM designs, then add Ampere
Computing and Qualcomm as well.

~~~
gok
In the server space that's true, Arm cores are proliferating. But Samsung shut
their core group down and Qualcomm is moving in that direction.

------
qwerty456127
Another important question is why is it so hard to use the old ones.

------
Havoc
>Brute-force solutions to verification closure aren’t feasible.

Perhaps a fuzzy/AI type approach?

Yeah, it does seem like an intractable problem for sure

~~~
bem94
Hardware verification engineers call it "constrained random verification", but
it's basically fuzzing. This has been the backbone of most commercial hardware
verification flows for a long time.
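For readers unfamiliar with the term, here is a minimal stdlib-Python sketch of the idea. The three-"instruction" toy ISA and the interrupt-shadow bug are invented for illustration; real constrained-random flows use SystemVerilog/UVM testbenches, not Python.

```python
import random

# Constrained-random stimulus: instead of uniform random instructions,
# the generator is weighted toward the events most likely to hide bugs
# (here, interrupts), and every generated program is run on both the
# device-under-test model and a golden reference model.

OPS = ["add", "sub", "irq"]        # toy ISA: two ALU ops plus an interrupt
WEIGHTS = [4, 4, 2]                # constraint: keep interrupts frequent

def gen_program(rng, length=20):
    return [(rng.choices(OPS, WEIGHTS)[0], rng.randrange(16))
            for _ in range(length)]

def golden_run(prog):
    """Golden reference: every instruction takes effect."""
    acc, irq_count = 0, 0
    for op, arg in prog:
        if op == "add":   acc = (acc + arg) & 0xFF
        elif op == "sub": acc = (acc - arg) & 0xFF
        else:             irq_count += 1
    return acc, irq_count

def dut_run(prog):
    """Hypothetical buggy DUT model: a sub in the shadow of an interrupt
    is dropped (think: the pipeline flush eats the instruction)."""
    acc, irq_count = 0, 0
    for i, (op, arg) in enumerate(prog):
        if op == "add":   acc = (acc + arg) & 0xFF
        elif op == "sub":
            if i > 0 and prog[i - 1][0] == "irq":
                continue                     # the bug
            acc = (acc - arg) & 0xFF
        else:             irq_count += 1
    return acc, irq_count

rng = random.Random(42)
for trial in range(200):
    prog = gen_program(rng)
    if dut_run(prog) != golden_run(prog):
        print(f"mismatch found on trial {trial}")
        break
```

Because the generator is biased toward interrupts, the irq-then-sub sequence occurs often enough that the mismatch surfaces within a few programs, which is exactly the argument for constraining the randomness rather than sampling the instruction space uniformly.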

------
devit
Why is it supposed to be so complicated to do "processor verification"?

Why can't you simply upload the design to an FPGA, and then check that it can:

1. Boot all available operating systems (Linux, *BSD, Windows, etc.)

2. Successfully compile and run the test suites for a bunch of open-source
software (several languages like Rust have a standardized repository and
method to build and run tests, so this is very easy)

3. Correctly run stress-testing software (Prime95, etc.)

4. Correctly run several software unit tests that you write to exercise
instructions that may not be produced by LLVM/GCC

5. Correctly run tests you write to exercise specific processor/cache states

6. Properly handle fuzzed code without freezing the whole CPU (using afl-fuzz)

Start with the simplest possible in-order core so that you get it working very
easily, and then evolve to your desired end state with a series of small
commits. If the verification fails, use `git bisect` to find the offending
commit, insert any instrumentation you might need to detect the issue, and fix
it.

I don't see why you would need a specialized tool for that, or even what a
specialized tool could possibly do.

~~~
joosters
Have a look at the extensive errata Intel publish for their CPUs. There are
hundreds of mistakes in the chips’ behaviour, and yet each buggy CPU would
pass your set of tests with flying colours.

While you could never release a CPU that didn’t pass the tests you describe,
they don’t even begin to exercise all the corner cases for a chip. Multiplying
two specific numbers together, while the instruction crosses two memory pages,
when an interrupt arrives? How do you even test for that kind of thing?

~~~
wallacoloo
I feel the SW world is affected by some analogous bugs, though. Any sort of
race condition between two different threads accessing the same resource;
maybe throw in some other piece, like having the data always be valid unless a
third thread happens to free some downstream resource at the same time...

We know about techniques to reduce large classes of errors. Data races in
particular can be prevented statically by some languages. Other types of
“once in a blue moon” errors that happen as a result of two coupled systems
doing something in tandem can be reduced by introducing stronger boundaries
between the systems; then you can test each system independently and make sure
it works regardless of what the other system does (i.e. dependency testing, or
maybe even fuzzing).
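As a minimal stdlib-Python illustration of such a boundary (my own sketch, not from the thread): serialise all access to the shared state behind a lock, so the lost-update interleaving is excluded by construction rather than hunted for in testing.

```python
import threading

class SafeCounter:
    """Shared counter whose state is only reachable through the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:        # boundary: all mutation is serialised
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

counter = SafeCounter()
threads = [
    threading.Thread(
        target=lambda: [counter.increment() for _ in range(10_000)])
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)   # always 80000; an unlocked += could lose updates
```

Each component can then be tested on its own, which is the "stronger boundaries" point above: the interleaving that causes the blue-moon failure simply cannot occur.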

These approaches aren’t bulletproof, but I think they do illustrate a point:
that there _are_ techniques to reduce the likelihood of the errors you
highlight. Whether they do it at a competitive cost to existing industry
practices or not, I have no idea.

~~~
TheCoelacanth
Hardware is usually orders of magnitude more reliable than software, so what
makes you think that they aren't already using those techniques or something
better?

~~~
wallacoloo
Hmm? Maybe they do, I _hope_ they do. I was replying more specifically to this
part:

> Multiplying two specific numbers together, while the instruction crosses two
> memory pages, when an interrupt arrives? How do you even test for that kind
> of thing?

I.e. trying to dispel the idea that large systems are intrinsically difficult
to test.

