
Compiling my own SPARC CPU inside a cheap FPGA - ttsiodras
https://www.thanassis.space/myowncpu.html
======
tverbeure
For those interested in hacking in this:

The Pano G2 FPGA is a monster, but prices on eBay have gone up a lot. My
cheapest buy was 25 of them for $85 (including shipping!). They now go for
around $30 for a single unit if you’re lucky... or $200+ for a lot of many.

The Pano G1 (with VGA instead of DVI) is cheaper but has a much smaller FPGA,
though still large by hobby standards.

The benefit of the G1 is that all interfaces are working now, including
DRAM, USB, and Ethernet.

Last week, Skip Hansen got a full CP/M system running on one:
[https://github.com/skiphansen/pano_z80](https://github.com/skiphansen/pano_z80)

USB on the G2 is hard. A bunch of people have tried and failed.

~~~
Annatar
I never was a fan of the Zilog Z80 (its instruction set reminded me too much
of the idiotic intel 80x86 family), but seeing someone running the Control
Program for Microcomputers on a whopping 25 MHz Z80 is incredibly cool... just
imagine what kind of scene demos could be coded on this monster for the next
demo party...

I wonder if it would be possible to retrofit an MMU onto a Z80 design and port
a real UNIX to it? 25 or even 50 or even a 75 MHz UNIX server on a Z80
processor, that'd be a nice perversion...

~~~
tasty_freeze
> [the Z80's] instruction set reminded me too much of the idiotic intel 80x86
> family

That is because the Z80 was a (mostly) binary-compatible superset of the 8080
instruction set, and the 8086 (while not binary compatible) was intentionally
modeled after the 8080, such that there was a program that could take an 8080
program, do a 1:1 instruction mapping from 8080 to 8086, and produce a
working 8086 program.
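For a flavor of how mechanical that mapping was, here is a toy sketch of the
operand-renaming step (an illustration only; `translate_operands` is a
made-up helper, though the register pairings follow the documented
8080-to-8086 correspondence):

```python
# Toy sketch of the 1:1 register mapping a translator like Intel's
# CONV86 applied when converting 8080 assembly to 8086 assembly.
# (Illustration only; the real tool also handled mnemonics and flags.)
REG_MAP = {
    "A": "AL", "B": "CH", "C": "CL", "D": "DH",
    "E": "DL", "H": "BH", "L": "BL",
    "M": "[BX]",  # 8080's memory-at-HL pseudo-register becomes [BX]
}

def translate_operands(insn: str) -> str:
    """Rename the register operands of one 8080 instruction."""
    op, _, args = insn.partition(" ")
    if not args:
        return op
    renamed = [REG_MAP.get(a.strip(), a.strip()) for a in args.split(",")]
    return f"{op} {','.join(renamed)}"

print(translate_operands("MOV A,M"))  # MOV AL,[BX]
```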

> I wonder if it would be possible to retrofit an MMU onto a Z80 design

Zilog made a number of Z80 follow-on CPUs, and some have MMUs. It is amazing
how far they took the instruction set; the Z80380 could run Z80 code as-is,
but it also extended instructions to support 32b operations and had a mode
with a flat 32b linear address space. I don't recall ever hearing any consumer
product using it though.

~~~
qubex
Z80380 user’s manual, for those interested:
[https://www.manualslib.com/manual/1237352/Zilog-Z80380.html](https://www.manualslib.com/manual/1237352/Zilog-Z80380.html)

------
johndoe0815
For a complete SPARCstation 5 implementation able to run SunOS, Solaris, BSD,
Linux or NeXTstep, see [http://temlib.org/](http://temlib.org/) - this fits
on a Spartan6 XC6SLX45T FPGA, so you should be able to get it to work on the
larger Spartan 6 FPGAs (if you can manage to build a working memory
controller...).

It seems that the Panos are only available on eBay US. Any idea where to get
them in Europe without all the shipping and tax hassle?

~~~
jacquesm
That's one way to get a computer where you're actually sure what runs on it.

~~~
nickpsecurity
You have to trust the FPGA, its toolchain, and any peripheral hardware and
firmware with privileged access.

~~~
jacquesm
That's true in a technical sense but false in a practical one. An effective
attack using compromised FPGAs or toolchains would be super hard to carry
out undetected because of the degree of scrutiny the output would receive.
Besides, the attack itself would have to make substantial assumptions about
the way the FPGA would be wired into the resulting circuit. I'm not saying
it can't be done, but it would be extremely hard to carry out, and I'm not
aware of any such attacks ever being discovered in the wild.

> any peripheral hardware and firmware with privileged access.

That would be a more feasible vector. But it all still will be _much_ more
secure than your average computer with a BMC.

~~~
nickpsecurity
"An effective attack using compromised FPGAs or toolchains would be super
hard to carry out undetected because of the degree of scrutiny the output
would receive"

FOSS vulnerabilities countered the many-eyeballs argument a long time ago.
There are even fewer people who know how to review hardware for flaws. I'm
assuming it would be a targeted attack by default. That raises the bar.
However, they could also leave a trigger that looks like a hardware flaw in
the I/O interface. Intel has basically been doing that kind of subversion
with their ME flaws for some time now. Then, the only targeted part is just
how to aim what's already there.

"That would be a more feasible vector. But it all still will be much more
secure than your average computer with a BMC. "

True. Especially if you use an architecture like crash-safe.org or
Cambridge's CHERI. I already advocate secure CPUs on FPGAs with
dumb-as-allowed hardware if one can't get actual silicon. It also lets you
throw in extra reliability features.

------
alain94040
Great article, I hope it helps motivate more people to give hardware design a
try.

 _the truth is that most of the HW designers I know are editing inside their
Vendor-provided IDEs._

Maybe true for FPGA designers, but not for ASIC designers in my experience.

 _Another crazy difference I experienced was that builds are NOT
deterministic_

Yes, hardware generation (synthesis, but mostly optimization, placement and
routing) is not deterministic. SW people are starting to experience that
phenomenon with ML as well: you don't fully control what you get, but it
works.

~~~
bpye
With HW generation, given the same HDL and tools you should be able to get
reproducible builds, provided you supply the same seed at the very least.

~~~
tverbeure
In theory, you should. In practice, don't count on it.

------
ur-whale
This article is the perfect summary of everything that's wrong with the
existing HW development toolchains for FPGAs.

The best bit is the Windows-only version of the Xilinx gooware that in fact
installs a Linux VirtualBox VM on Windows to finally get to run the tools it
needs.

Oh, and yeah: there's a lame protection in there that checks it's running on
a specific VirtualBox VM with a specific MAC address.

Amazing (not in a good way).

------
LeonM
> Alas, I am told by my friends that DDR controllers are no joke; they are not
> the playground of bored SW engineers.

No, they definitely aren't funny. I've worked with FPGAs and DDR controllers
at college and they can be a big PITA. Even with DDR controller libraries
you can still run into all sorts of timing issues.

~~~
ur-whale
When a SWE really starts to understand how complex a modern DDR controller is
at the silicon level, two things happen:

1. You start to wonder just how very, very far your code runs from the
theoretical capabilities of the hardware, and you start experiencing
existential doubts.

2. You are overcome by a deep and immense feeling of gratitude towards
whoever managed to force all that complexity to remain hidden underneath the
simple memory abstraction you rely on to write day-to-day code.

~~~
jacquesm
I try really hard not to go down that particular rabbit hole, as it always
ends with me being 100% sure that software can't possibly work reliably, or
even at all. The degree to which we assume that our hardware is able to move
stuff from point 'a' to point 'b' a couple of billion times per second is
scary.

~~~
nickpsecurity
Although I don't _know_ hardware, I read lots of stuff about developing it
to get a better idea. Among the more interesting reads were slides about
each process shrink along with the challenges it brought, especially from
the beginning of the deep sub-micron era toward 28nm.

The impression I got, especially by 28nm, is that the hardware is inherently
broken in quite a few ways. They have to correct the masks with algorithms,
they do image recognition on circuits to spot patterns that act up, extra
latches, variance across the chip/wafer, aging effects... the list goes on.
It's a miracle they work at all.

These things are also why I only trust old nodes for security. Sort of.

------
peter_d_sherman
This article brings up an interesting question...

What other cheap hardware products contain FPGAs that are potentially
user-accessible?

I started a new message chain for this:

Ask HN: What other cheap hardware products contain FPGA's?

[https://news.ycombinator.com/edit?id=21305355](https://news.ycombinator.com/edit?id=21305355)

------
qubex
On a side-note of pedantry, we really need to stop using the term ‘
_compiling_ ’ in the context of FPGAs and HDLs. To ‘compile’ is to assemble a
dossier of documents and/or fill in forms - this is why Grace Hopper called
her automatic code generation contraption a ‘compiler’: because, quite
appropriately, it took the description of actions to be undertaken by the
machine and fleshed them out in a ritualistic fashion in lower-level
instructions.

HDLs and FPGAs have very different principles and objectives. The best term is
‘ _instantiate_ ’, because one creates an instance of a given hardware
description upon the substrate of gates provided by the array.

I’m sure I’ll be told I’m nit-picking, but those who do so would probably
recoil in horror at the _faux pas_ of some n00b saying a browser “compiles
HTML” and tell them the correct term is ‘ _render_ ’, and they’d be right.

Please, let’s be careful and deliberate about the terms we use, shall we?

~~~
ynx
I think you're not quite incorrect, but your nitpicking isn't so much
clarifying as redefining terms.

Lowering HDLs into logic that can be flashed onto LUTs does involve an
intermediate compilation step, even though it also involves
placing/relocating/routing elements on the FPGA, ultimately producing a file
to be flashed to the chip.
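As a sketch of what that lowering ultimately produces, this is roughly how a
boolean function gets packed into a LUT's configuration bits (simplified;
`lut_init` is a hypothetical helper, and real tools also optimize, place,
and route):

```python
def lut_init(f, n=4):
    """Pack an n-input boolean function into its 2**n-bit LUT
    configuration word, LSB-first, the way bitstreams encode it."""
    init = 0
    for i in range(1 << n):
        # bit k of the truth-table index i drives LUT input k
        inputs = [(i >> k) & 1 for k in range(n)]
        if f(*inputs):
            init |= 1 << i
    return init

# A 2-input AND dropped into a 4-input LUT:
print(hex(lut_init(lambda a, b, c, d: a & b)))  # 0x8888
```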

One could argue that 'compiling' is often conflated with 'linking' in
producing application binaries, and in fact 'linking' also involves
positioning/relocating elements in some fashion.

'Compiling' is suitable and conceptually compatible enough that it clarifies
rather than confuses.

~~~
qubex
It’s precisely this notion of ‘ _lowering_ ’, as you call it, that preoccupies
me. If unchecked, the idea that all forms of ‘lowering’ through abstraction
layers is a form of ‘compilation’ (of sorts) will become the norm, and we’ll
no longer have a unique term for what _compilation proper_ actually means. And
you’re entirely right in noting that nowadays ‘compiling’ has come to
encompass and conflate the conceptually very distinct steps of compilation,
linking, and assembly... a process I’m more comfortable with collectively
referring to as ‘building’.

It’s entirely possible that I might’ve (accidentally) redefined one or more
meanings, and if I have, I apologise. It’s almost four in the morning and
insomnia prevents me from sleeping but doesn’t necessarily maintain me at full
alertness (tomorrow/today will be hell).

------
rjsw
Another design to target at one of these boxes could be Milkymist [1], which
uses the LM32 CPU and various peripheral cores.

[1] [https://github.com/m-labs/milkymist](https://github.com/m-labs/milkymist)

------
Annatar
This is awesome!!! (Ultra)SPARC CPUs are a joy to code for in assembler, and
modern T3, T4 and T5s are number-crunching monsters.

How about synthesizing the GPL-licensed OpenSPARC T2 now?

[https://www.oracle.com/technetwork/systems/opensparc/openspa...](https://www.oracle.com/technetwork/systems/opensparc/opensparc-t2-page-1446157.html#t2-to-
use)

I'd love to have SmartOS backported to an FPGA-based, OpenSPARC T2, 19" 1U
rack-mountable server someday. Free hardware and software all the way.

~~~
wolfgke
> This is awesome!!! (Ultra)SPARC CPU's are a joy to code for in assembler,

Dr Jack Whitham disagrees:

> [https://www.jwhitham.org/2016/02/risc-instruction-sets-i-
> hav...](https://www.jwhitham.org/2016/02/risc-instruction-sets-i-have-known-
> and.html)

~~~
4ad
Of course anyone is free to disagree for any reason, but after having
written assembly for about 10 architectures (including compilers for several
of them), I definitely prefer SPARC to anything else. The current SPARC V9
ISA spec is about 1/10 the size of the arm64 ISA spec (I wrote compilers for
both). The instruction encoding is so simple I can assemble in my head.

There are some spectacularly annoying things like the stack bias, but those
are easy to hide in the assembler (and those are problems with the System V
ABI, not SPARC, embedded ABIs ignore them).

~~~
drudru11
Can you elaborate on “stack bias”? What are your thoughts about register
windows? I am open-mindedly curious as to why you prefer that assembly.
Also, what SPARC hardware are you running?

~~~
4ad
On SPARC V9 the stack and frame pointers don't point to the top of the
stack; they point 2047 bytes (0x7ff) away from it. SPARC has 13-bit signed
immediates, and much of that range would be wasted if the stack and frame
pointers pointed to the top of the stack. Having a stack bias allows more of
the immediate range to be utilized, at the cost of the assembler (and
runtimes) having to keep track of the bias.

The offset is chosen to be odd so as to trivially identify 64-bit code in
things like register window overflow trap handlers[2].
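The arithmetic behind that trade-off can be sketched like this (an
illustration of the numbers above, not real ABI code; the names and the
sample address are made up):

```python
# SPARC V9 load/store immediates (simm13) are 13-bit signed.
IMM_BITS = 13
IMM_MIN = -(1 << (IMM_BITS - 1))      # -4096
IMM_MAX = (1 << (IMM_BITS - 1)) - 1   # +4095
STACK_BIAS = 2047                     # 0x7ff

# Biasing %sp slides the one-instruction addressing window
# [%sp + IMM_MIN, %sp + IMM_MAX] so that more of the 13-bit range
# lands on live frame data instead of addresses the frame never uses.

def is_64bit_window(sp: int) -> bool:
    """The bias is odd on purpose: a 64-bit %sp is kept aligned, so
    the biased value has bit 0 set, and a trap handler can recognize
    a 64-bit register window with a single bit test."""
    return (sp & 1) == 1

aligned_top = 0x7FEF_F000  # hypothetical aligned stack address
print(is_64bit_window(aligned_top - STACK_BIAS))  # True
```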

When I say I prefer some assembly vs. some other assembly, I speak from the
perspective of a kernel/compiler writer. Realistically you will never use
assembly to write general-purpose code. You either target it from a compiler
(in which case it had better be easy to target), or write some specialized
runtime/driver code, in which case it had better have easy-to-understand
semantics and behavior. If you were to write general-purpose code in
assembly, say write assembly instead of C, an orthogonal CISC architecture
would be way easier to write code for. But nobody does that anymore. People
only use assembly when they have to (or target it from a compiler), and for
that particular case the features of the ISA that matter most are quite
different from what a general-purpose programmer would want.

SPARC really excels for me because of how easy it is to target and how easy
it is for me to understand when I am writing runtime code.

I have various SPARC hardware, the most interesting (and the one eventually
used for the Go port) being a single-digit serial number S7-2 system[3].

[1]
[https://docs.oracle.com/cd/E18752_01/html/816-5138/advanced-...](https://docs.oracle.com/cd/E18752_01/html/816-5138/advanced-2.html)

[2] [http://src.illumos.org/source/xref/illumos-
gate/usr/src/uts/...](http://src.illumos.org/source/xref/illumos-
gate/usr/src/uts/sun4v/ml/trap_table.s#597)

[3]
[https://www.oracle.com/servers/sparc/s7-2/](https://www.oracle.com/servers/sparc/s7-2/)

~~~
Annatar
I grew up around cracking and the software piracy scene on the Commodore 64
and Amiga, so I write UltraSPARC assembler for fun (and profit, since I've
always been able to utilize my knowledge and insights when landing job
gigs). Does pouet.net mean anything to you?

50% of servers in my basement data center are SPARC based.

------
tpmx
Hmm, yeah, that does look affordable. Lots of ads on eBay.

"Lot of 25 Pano Logic Thin / Zero Desktop Client Black w/ Power Supply

Buy now: US $170.00"

I wonder what the thinking was that led Pano Logic to put expensive FPGAs
inside these units, instead of some more typical cheap ARM SoC?

Edit: Ah, they operated in 2006-2012. I guess that was just before the rise of
the very cheap/fast SoCs.

~~~
lnsru
Exactly. These are far, far away from cheap FPGAs; Spartan 6 chips are just
old, not cheap. When thinking about cheap FPGAs I imagine those cheap little
parts from Lattice used for interface bridging.

Cheap ARM SoCs were on ARM’s roadmap back then. The Cortex-A9 was a huge
innovation in those days.

~~~
tpmx
I guess they were thinking they'd use very expensive FPGAs in the first
VC-funded iterations, and then move on to something cheaper. But they never
got there...

[https://www.bizjournals.com/sanjose/blog/2012/11/pano-
logic-...](https://www.bizjournals.com/sanjose/blog/2012/11/pano-logic-shuts-
down-leaving.html)

"... the company had doubled or tripled its revenue every year since 2008,
when it had about $1 million in sales.

...

The company was backed by about $38 million in funding from investors
including Goldman Sachs, ComVentures, Foundation Capital, Fuse Capital, and
Mayfield Fund."

~~~
panpanna
I think you are correct.

FPGAs are often used as a stepping stone towards cheaper ASIC or hybrid
solutions. An ASIC requires high volume and a working design, neither of
which you have early in a project.

------
jhallenworld
Well, for comparison, the cost of a current 100K LUT FPGA on a board is:

$250 for Xilinx Artix-7:

[https://store.digilentinc.com/arty-a7-artix-7-fpga-
developme...](https://store.digilentinc.com/arty-a7-artix-7-fpga-development-
board-for-makers-and-hobbyists/)

$100 for Lattice ECP5 (85K LUT):

[https://www.latticestore.com/products/tabid/417/categoryid/5...](https://www.latticestore.com/products/tabid/417/categoryid/59/productid/122774/default.aspx)

~~~
freemint
Also, Lattice only does LUT4 (their internal look-up tables have 4 inputs),
while the Artix-7 has LUT6. From what I heard, that means you need more than
1.6 times as many LUT4s as LUT6s. And that's not counting the fact that the
Artix-7 includes other hardware features which are sometimes wired into a
design.

------
panpanna
This use of expensive surplus hw reminds me of the nsa@home project

[http://nsa.unaligned.org/](http://nsa.unaligned.org/)

------
roryrjb
In one of the bash scripts in this article there's a lot of "|| exit 1".
You can get that behavior automatically with "set -e" at the beginning of
the script; this works for POSIX shell too. You can get extra safety in bash
specifically with something like "set -euo pipefail", which exits on errors,
including pipeline failures, and also on undefined variables.

~~~
Annatar
This will make the program instantly unportable, because not all Bourne
shells are POSIX-compliant. Please don't do that; || exit 1 is portable.

~~~
mlyle
Where are we going to come across a Bourne shell that doesn't like `set -e`? I
accept that this choice might have been dubious 20 years ago, but we don't
need to support SunOS 4.1.4 or SCO anymore...

We're talking about standards that were published in 1992-1994 and that the
overwhelming mass of the industry followed relatively quickly. To hold back 25
years later is nuts.

~~~
sitkack
I see this crop up as "wisdom" for the sake of some quality, as if that
quality were an absolute. What the parent is literally advocating is that we
never fix the mistakes of the past.

If you are generating a script that has to run everywhere, yes, have the
backend emit the verbose or tricky or unergonomic construct. But most of the
time, breaking portability to get something that is easier to understand, or
more regular, or that uses modern facilities is almost always the correct
solution.

~~~
Annatar
How is || exit 1 harder to understand?

With the exception of GNU autotools, I know of no backends which would emit
shell programs. People write programs in shells.

~~~
mlyle
????

Autotools is written in M4... M3/M4 were developed by K&R to help automate
generating mostly shell programs. In 1974-1977.

For someone who purports to be so oldschool, you're sure missing a lot.

~~~
Annatar
Please take a moment to pause and consider what you're writing. Someone here
wrote "let the backend generate correct code". I responded that people usually
write shell programs, not backends.

~~~
mlyle
You wrote that you know of no backends--- but we have a long history of macro
preprocessing/expansion of shell scripts.

M4 would excel at this adaptation...

Which isn't even necessary, because everything POSIX-conformant has "set
-e". And if POSIX isn't good enough, the Bourne shell back to 1978 has it
(probably earlier).

You are hilarious, bro. :D

~~~
Annatar
M4 is a generic preprocessor, not a backend shell program compiler.

And I'm not your bro.

