The Gray-1, a homebrew CPU exclusively composed of memory (2x-1.net)
293 points by jsnell on Feb 3, 2017 | 48 comments

Ah, memories of one of my very first lab assignments in digital circuits engineering class (back when computer science was still busy splitting off from applied mathematics and electrical engineering). We were given nothing but a breadboard, some chips with basic logic gates (NANDs, multiplexers/demultiplexers), an EPROM, access to an EPROM burner (w/ hex keypad!), some LEDs, switches, resistors, jumper wires and a power source.

The assignment was to build a traffic light simulator, set the whole thing running, change the traffic lights as a result of switch inputs acting as sensors and a simulated interval timer.

Some students were baffled by this (lectures hadn't caught up with lab assignments at that point): how could you build a small processor using only logic and an EPROM? There's no memory or registers to keep state!

This is what differentiates combinatorial logic from sequential logic: feedback. Use some of the EPROM's data outputs, along with logic gates and switch outputs (via the multiplexers/demultiplexers), as address inputs to the same EPROM.

Sweet memories of solving Karnaugh maps, Quine-McCluskey minimization, logic hazard mitigation, etc. Good times.
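The feedback trick described above can be sketched in a few lines of Python; the state encoding and ROM contents here are invented purely for illustration:

```python
# Sketch: a traffic-light state machine built only from a "ROM" (a lookup
# table) with its data outputs fed back as address inputs.
# States (invented encoding): 0 = green, 1 = yellow, 2 = red.

# Address = (current_state, sensor_bit); data = next_state.
ROM = {
    (0, 0): 0,  # green, no car waiting  -> stay green
    (0, 1): 1,  # green, car waiting     -> go yellow
    (1, 0): 2,  # yellow                 -> go red
    (1, 1): 2,
    (2, 0): 0,  # red                    -> back to green
    (2, 1): 0,
}

def step(state, sensor):
    """One clock tick: feed state (plus the sensor switch) back into the ROM."""
    return ROM[(state, sensor)]

state = 0
trace = []
for sensor in [0, 1, 0, 0, 0]:
    state = step(state, sensor)
    trace.append(state)
print(trace)  # [0, 1, 2, 0, 0]
```

The "register" is just the wire from data output back to address input; the ROM contents alone define the machine.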

I took a similar class. I loved it. We started out with a project that made LEDs perform a Cylon-style pattern [1]

For the final project we had to implement a multiplier (Booth's algorithm) using nothing except breadboards, wires, and a few very basic ICs with ANDs, NORs, etc.

I loved that class and it really helped me understand how a processor actually works.


> I loved it

You misremember ;)

I took a similar class, and like for everybody else I ever talked to, half of the wires were somewhat broken and caused intermittent connections. Solving the puzzle of building the system was fun, and quickly done - finding the &#(@ broken cables and replacing them was weeks of tedium.

I suppose in a way it was perfect preparation for a career in software engineering. Solve the core problem, and then spend significantly more time on actually making it work in a world that adamantly refuses to be pure and perfect.

My solution? Dumpster diving for wire.

> Quine-McCluskey minimization

haha, I'm remembering how I felt like I was getting away with murder because I took 4 courses in college that spent time covering truth tables and complex boolean statements, and 3 of those taught Quine-McCluskey. I felt like an adult visiting an elementary school. (briefly, until we finished that chapter)

I'm doing ECE in my first year at university now, and so far the labs are pretty boring, except for the one where we make our own CPUs in Xilinx tools and then load them onto an FPGA. We're way ahead with state machines etc. in the digital lectures.

Analogue is the hard part for me, I guess you didn't do so much of that in digital systems engineering.

Is this how Shenzhen I/O was done back in the day?

Tangential to the article itself: webrings! I forgot that those once existed!

Homebuilt CPUs webring's home: http://members.iinet.net.au/~daveb/simplex/ringhome.html.

If search engines had not gotten better as fast as they did, I suspect webrings would have been a much bigger thing. Who knows, maybe one day we'll see a revival of webrings as the ultimate anti-spam tool: monitor your neighbours in the ring and cut them out if they go spammy. It would scale.

Yeah, took me back to 1999.

There's quite a few awesome projects in the ring!

This is a great project, but the idea is not as exotic as it may sound. This is more or less how an FPGA works.

Edit: Should also add that there were commercial products built like this in the '80s and '90s.

Yep! Only difference is that FPGAs use much smaller lookup tables (LUTs) -- typically 4 or 6 bits in and 1 bit out. For comparison, this project's ALU EPROM has 21 bits in and 16 bits out.

Also, FPGAs have real registers. This project is sort of faking registers in some places by wiring output to input in some ROMs. (Which is a clever idea, but flaky and eats a lot of space in the ROM.)

> typically 4 or 6 bits in and 1 bit out

So these are reducers in algo talk? What about mapping more than 1 bit at a time? Chaining?

Routing networks in FPGAs are complicated, but essentially unrestricted. You can emulate a multi-bit lookup table by hooking up multiple LUTs to the same inputs, and you can emulate a wider lookup table with a tree of lookups and multiplexers.
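A toy Python model of that composition: two 4-input LUTs plus a 2:1 multiplexer emulating a 5-input lookup table. The target function (5-input parity) is arbitrary; this is a sketch of the technique, not real synthesis tooling.

```python
# Sketch: emulating a 5-input LUT with two 4-input LUTs and a 2:1 mux.

def make_lut(truth_table):
    """truth_table: dict mapping input-bit tuples to a single output bit."""
    return lambda *bits: truth_table[bits]

# Target function: 5-input parity, fully enumerated as a truth table.
parity5 = {tuple((n >> i) & 1 for i in range(5)): bin(n).count("1") % 2
           for n in range(32)}

# Split on the top input bit: one 4-LUT for x4=0, one for x4=1.
lut_lo = make_lut({k[:4]: v for k, v in parity5.items() if k[4] == 0})
lut_hi = make_lut({k[:4]: v for k, v in parity5.items() if k[4] == 1})
mux = lambda sel, a, b: b if sel else a

def lut5(x0, x1, x2, x3, x4):
    # The mux selects which half of the (virtual) wider table to read.
    return mux(x4, lut_lo(x0, x1, x2, x3), lut_hi(x0, x1, x2, x3))

assert all(lut5(*k) == v for k, v in parity5.items())
```

The tree generalizes: each extra input bit doubles the LUT count and adds one mux level.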

The lookup table part, sure, many FPGAs use that. The clever part of this was turning a ROM into a flip-flop.

FPGAs and co have not always come with registers. Back in the day this was more or less how it was done.

Are there any particular counterexamples you have in mind? The first reprogrammable logic device, the Altera EP300 [1], included a register in each macrocell. So did the first FPGA, the Xilinx XC2064 [2]. I'd be very surprised if there were any FPGAs with no registers.

[1]: http://www-inst.eecs.berkeley.edu/~cs294-59/fa10/resources/A...

[2]: http://www.applelogic.org/files/XILINXXC2018LCA.pdf

You can always simulate gates with memory. And by adding feedback, you can build memory with gates. Which means you can simulate non-fed-back gates with fed-back gates.

What was described with the ROM sounded more like a latch (transparent for one value of clock, opaque for the other). Did the author actually compose these into flip-flops? It read to me as if that wasn't the case, which just makes it even more cool.


(yeah I'm being a bit pedantic but FPGAs are pretty monolithic these days and contain stuff like registers, block ram and DSPs).

I'm in the process of writing a tutorial on this/hardware synthesis from first principles, but general-purpose computers are way less magical when you think of components such as the ALU as a mapping of binary inputs to binary outputs: a truth table. From there you can explicitly enumerate all possible inputs and outputs (for small word sizes at least) and go directly from said truth table to (inefficient) hardware.

Related: mov is Turing-complete [1] and X86 MMU fault handling is Turing-complete [2]

edit: formatting

[1] https://www.cl.cam.ac.uk/~sd601/papers/mov.pdf

[2] https://news.ycombinator.com/item?id=5261598

I suspect it would be difficult (impossible?) without relative position addressing modes and/or other tricks. A conventional stack would need some way to access an offset away from the base, but perhaps you could use self-modifying code. Perhaps relative from current position like brainfk.

All general computation needs either a direct or convoluted means of achieving conditions too. A tricky formula exploiting abs(x) could be used to produce {0, 1} but...

Right, my comment regarding truth tables was only in reference to combinatorial logic. That said, pure circuit models of computation exist, and are interesting from a theoretical standpoint at least. They don't translate to hardware, though. As you noted, doing arbitrary computation with (finite) circuits requires some type of feedback/sequential logic.

An amusing application of mov's Turing-completeness: https://github.com/xoreaxeaxeax/movfuscator

It's funny I was wondering recently if there is a fundamental difference between logic and memory. I mean it's obvious that memory devices can do logic, since after all Boolean tables are just data in the end, aren't they? And reciprocally transistors can be used to store memory (with a flip-flop for instance).

So is it true? Are logic and memory fundamentally equivalent?

Theoretically yes. One is stateful and the other is not, but you can always simulate one with the other. Practically no. DRAM is built with capacitors as its storage elements. Each bit has no feedback and requires refresh, which makes it slow. But it only requires one transistor per bit, so it's cheap and every bit takes up very little physical space. Static RAM, OTOH, (registers and fast cache) is built out of flip-flops. It has feedback so it doesn't need refreshing and it's fast. But every bit requires 4 or more transistors, so it's physically large per bit and expensive.

Can you elaborate on this:

>"Static RAM, OTOH, (registers and fast cache) is built out of flip-flops. It has feedback so it doesn't need refreshing and it's fast"

DRAM also has feedback, no? I mean, that's what differentiates it from combinatorial circuits, correct?

I understand SRAM is flip-flops vs. single-transistor bits. I guess I'm not sure why you are saying it has feedback to differentiate SRAM (registers) from DRAM.

The individual bits of DRAM don't use internal feedback; they work by storing charge on a capacitor that slowly leaks away. Their outputs don't help them maintain their state (except during a refresh cycle, and in that sense one could say they use feedback). The flip-flops in SRAM always have internal feedback that causes them to a) switch state quickly and b) maintain their state after switching. This is an example of positive feedback, as opposed to the more well-known negative feedback used in control systems. I was trying to say that there's more than one way to save state, and it's state that distinguishes memory from combinational logic, not feedback. https://en.m.wikipedia.org/wiki/Memory_cell_(binary)
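The positive feedback described here can be simulated directly. A small Python sketch of a cross-coupled NOR (SR) latch, iterating the gate equations until the outputs settle:

```python
# Sketch: positive feedback in a cross-coupled NOR latch. Each output is
# wired back into the other gate's input; iterating until stable models
# how the loop holds state with no clock and no capacitor.

def nor(a, b):
    return 0 if (a or b) else 1

def sr_latch(s, r, q=0, qn=1):
    """Iterate the feedback loop until stable; returns (q, qn)."""
    for _ in range(10):
        q_next, qn_next = nor(r, qn), nor(s, q)
        if (q_next, qn_next) == (q, qn):
            break
        q, qn = q_next, qn_next
    return q, qn

q, qn = sr_latch(s=1, r=0)                # set
assert (q, qn) == (1, 0)
q, qn = sr_latch(s=0, r=0, q=q, qn=qn)    # hold: feedback sustains the state
assert (q, qn) == (1, 0)
q, qn = sr_latch(s=0, r=1, q=q, qn=qn)    # reset
assert (q, qn) == (0, 1)
```

The "hold" case is the point: with both inputs low, only the loop itself keeps q at 1.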

I see, sure, that makes perfect sense. Thanks for clarifying.

As long as you're talking about non-sequential logic, then yes. You could describe any ROM data as a boolean expression mapping address to data lines, and you can iterate through all possible inputs of a boolean expression to generate an equivalent ROM image.

I've done exhaustive iteration like this to generate truth tables (with feedback -- some outputs wired to inputs) from a bunch of C if statements for controlling room combining in audio systems that supported truth tables but not code. It does work.
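A minimal Python sketch of the enumeration direction described above (the boolean expression itself is arbitrary):

```python
# Sketch: converting a boolean expression into an equivalent ROM image
# by iterating through every possible address.

def expr(a, b, c):
    return (a and not b) or c   # any combinational function works here

ADDR_BITS = 3
rom_image = bytes(
    1 if expr((addr >> 2) & 1, (addr >> 1) & 1, addr & 1) else 0
    for addr in range(1 << ADDR_BITS)
)

# Address 0b101 -> a=1, b=0, c=1 -> expression is true.
assert rom_image[0b101] == 1
```

Going the other way, each data bit of a ROM is itself a boolean function of the address lines, so the two representations are interchangeable for combinational logic.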

Those LUTs have implied indexing internally; I suspect something akin to indexing is essential.

Logic functions can be composed using any one of: NAND, NOR, XOR, IMPLY, or other units like full-adders etc.

All memory depends on some form of persistence and/or refreshing.

Sometimes this is a physical phenomenon like magnetism, electrical capacitance, or the afterglow in a Williams tube. Short-lived effects need to be refreshed.

Sometimes there is a feedback loop that continuously sustains latching, such as flip-flops.

Magnetic core memory could be less transient than, say, a mercury delay line, but needed to be rewritten after being read.

But regarding your question, I think so, if it is possible to implement one with the other (though memory usually has internal indexing).

Indexing is just multiplexing (more properly demultiplexing) and multiplexing/demultiplexing is just more combinatorial logic.

Yeah. But for a function with parameters there needs to be a way to associate values with parameters. There are different ways to do that, but even a static global would be accessed by address.

At the gate level, "accessing by address" is demultiplexing. It selects one of n output lines with log2(n) input signals. https://en.m.wikipedia.org/wiki/Multiplexer#Digital_demultip...
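A quick Python illustration of that point -- log2(n) select bits activating exactly one of n output lines:

```python
# Sketch: address decoding as demultiplexing. The select bits pick out
# exactly one of n one-hot output lines, which is all "indexing" is at
# the gate level.

def demux(select_bits):
    """select_bits: list of 0/1 values, MSB first. Returns n one-hot outputs."""
    n = 1 << len(select_bits)
    index = 0
    for bit in select_bits:
        index = (index << 1) | bit
    return [1 if i == index else 0 for i in range(n)]

assert demux([1, 0]) == [0, 0, 1, 0]   # select line 2 of 4
assert sum(demux([1, 1, 1])) == 1      # always exactly one line active
```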

The scroll hijacking is making it hard for me to properly read the article in Google Chrome.

+1. I cannot bear it and can only close the page.

You can implement any logic function with a large enough lookup table. That doesn't mean it's necessarily a good idea.

It wasn't done this way then, and isn't done this way now, not because it's impossible (or even that hard), but because it usually doesn't make sense compared to the alternatives. It's slow and burns a comparatively large amount of resources.

That said, the project is interesting precisely for the reasons that it renders these lessons concrete, and apparently was fun to implement.

...now back to my Minecraft RISC-V....

I'm actually gearing up for a RISC-V processor in Fallout 4. All of the design work is done, but building it with the vanilla logic gates (introduced in the Contraptions DLC) actually isn't possible. Luckily I already had a mod with "real" components when Contraptions was announced.

Assuming you weren't joking, how sophisticated is your design? (as in, "is it pipelined?")

I was joking :-)

I've made a few simple microcontrollers and microsequencers in Minecraft (as "micro" as you can get with a 1 m feature size) and I've often fantasized about bigger projects, but it's only been fantasy.

Still, it lends itself reasonably well to old NMOS-style layouts...

An interesting project. I did see one error in his description of how a register works. What he describes is the operation of a latch. A register does not change outputs when the clock goes low. It only changes on the rising edge of the clock. See the datasheets for 74373 vs 74374 to see the difference.

This is really neat. I have a question: how hard is it, and what is the procedure, for writing the ISA and the subsequent microcode to implement it? Do you do this in Verilog or an equivalent? And how about writing the assembler? I would love to hear more about what that entails in a homebrew setup like this.

Actually designing an ISA is voodoo, but doing the microcode is fairly easy. On a tiny little 4-bit design with only a few control lines you can do it by hand. Past a certain point, though, it is essentially impossible to do manually without going insane.

It isn't very sophisticated, but the project I mentioned above might satisfy some of your curiosity: https://hackaday.io/project/18859-risk-vee / https://github.com/cadpnq/risk-vee

My methodology was to define each microinstruction that made up every instruction (such as "move pc to memory address register"). I then took all of the control lines in my design and assigned them a bit position in the control store. From there it was just a matter of defining each control line symbolically and "assembling" each microinstruction by ORing the appropriate things together.

This looks great! I am going to follow your project. What emulator are you using to design this? I see all the assembler files are js?

Can you elaborate on this: "From there it was just a matter of defining each control line symbolically and 'assembling' each microinstruction by ORing the appropriate things together"?

This is basically your decoder then?

So far everything is in Logisim[1], but I'm actually thinking of writing an emulator so I can debug programs quicker. I started a Lisp interpreter in RISC-V assembly, but testing my code at 3 kHz (the max Logisim will run my design at) was too painful. The assembler and microcode generator are js+node.

risk-vee is a really basic (read: dumb) "common bus" design. All of the internal components of the CPU share the same 32-bit data bus. The heart of the control unit is a ROM that is 32-bits wide. Each bit is connected to one of the various things in the CPU. To make generating the microcode easier I defined a mask for each line.

Taking "move the PC to the memory address register" as an example: The register that holds the PC has an output enable line, and the memory address register has select/enable and both would need to be high. Pulling a few lines from microcode.js:

  pc_read =           parseBin("0000-0000 0000-0000 0000-0000 0100-0000");
  mem_address_write = parseBin("0000-0000 0000-1000 0000-0000 0000-0000");
So the microinstruction we want ends up being

  pc_read | mem_address_write
I feel like I'm simultaneously under- and over-explaining it.

[1] http://www.cburch.com/logisim/
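For illustration, the mask-OR style described above can be sketched in Python. The two bit positions below are read off the quoted parseBin strings; the names and everything else are just mirroring the commenter's description, not the actual project code:

```python
# Sketch of the mask-OR microcode style: each control line gets a bit
# position in the control-store word, and a microinstruction is just the
# OR of the lines that must be high.

PC_READ           = 1 << 6    # PC register output-enable
MEM_ADDRESS_WRITE = 1 << 19   # memory address register select/enable

# "Move PC to memory address register" = both lines high in one word:
microinstruction = PC_READ | MEM_ADDRESS_WRITE

assert microinstruction == 0b0000_0000_0000_1000_0000_0000_0100_0000
```

The decoder then needs no logic at all: the control ROM's data lines drive the control signals directly.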

I understand. Thanks for the explanation. I didn't know about Logisim, this is cool. I can't wait to play with this. Thanks for the link!

I've heard rumours that HP was developing something known as memgisters, where instead of moving data from memory to a CPU register, operations could be performed directly on memory.

Would be interesting to see how this would translate into a memristor-based CPU in the future.
