I don't know how CPUs work so I simulated one in code (2019) (djharper.dev)
181 points by azefiel 5 months ago | 47 comments



So, what this project misses, which is quite hard to capture if you think of gates as just on/off switches, is the fact that signals are not instantaneous and everything runs in parallel.

As the AND gate four gates up the chain switches, the NOT gate four gates down the chain starts to emit different, unstable signals, which may or may not be interpreted as a 1 or a 0 by the downstream gate.

That's the reason computers have a clock, to make sure all transistors in a given stage of a CPU reach a steady state before moving on to the next instruction.
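To make that concrete, here's a toy sketch in Go (the gate delays and the path are made up) of the constraint the clock has to satisfy: the period must be at least the delay of the slowest path through the stage's combinational logic.

    package main

    import "fmt"

    func main() {
        // Hypothetical per-gate propagation delays in nanoseconds.
        gateDelay := map[string]float64{"AND": 1.2, "OR": 1.1, "NOT": 0.7, "XOR": 1.5}

        // Made-up longest (critical) path through one pipeline stage, gate by gate.
        criticalPath := []string{"NOT", "AND", "XOR", "OR", "AND"}

        var pathDelay float64
        for _, g := range criticalPath {
            pathDelay += gateDelay[g]
        }

        // The clock period must exceed the critical-path delay (plus setup time,
        // ignored here), otherwise the next stage latches unstable values.
        fmt.Printf("critical path: %.1f ns -> max clock ~%.0f MHz\n", pathDelay, 1000/pathDelay)
    }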

This is why it's probably a good idea to work with an HDL instead of just trying to wing it.


The game Silicon Zeroes ( https://store.steampowered.com/app/684270/Silicon_Zeroes/ ) teaches this. It starts out with components computing their outputs instantaneously, but then introduces the concept of microticks, such that the output of a component is unstable until its inputs have been stable for long enough, so the clock speed must be adjusted according to the largest delay across all circuit paths. The game starts off with simple circuits but very quickly becomes about making a CPU, although the ISA is hard-coded into the game and very small.

Another game, Turing Complete ( https://store.steampowered.com/app/1444480/Turing_Complete/ , https://news.ycombinator.com/item?id=38925307 ), lets you build a CPU from basic gates and a much larger (and customizable) instruction set. It also has the concept of gate delays, though it doesn't visually show the unstable output as Silicon Zeroes does.


Turing Complete is fantastic! It's very much one of those games I'd say is like Kerbal Space Program, in the sense that you can be technical and have encountered all the concepts before, but bridging the gap to actually grokking what's happening intuitively is a leap you may never have made.


I found that Silicon Zeroes is a pretty crappy emulation of a Zachtronics game. IIRC each level has an intended solution and only gives you the exact parts to create that solution, even if a better solution would be possible with different parts you've used before.


While you're right that Silicon Zeroes is not the best Zach-like puzzle game, I actually found it did a better job of teaching me about building a computer than most similar games.

However, I played it years before Turing Complete was released, and I'd probably recommend that over SZ. But SZ does a better job at introducing some of the timing concepts.


The earlier levels are like that, but the later levels have a lot more leeway.


Propagation delay, besides being physically unavoidable, is actually necessary for things like latches (and thus memory) to work, since they rely on feedback loops. The https://en.wikipedia.org/wiki/Ring_oscillator is another example.


The thing that got me into CS was building a CPU in Minecraft redstone, which is surprisingly good at being a logic simulator.

You only had (back in my day) NOT (the torch) and OR gates (wires running next to each other), and everything was built out of those. Signal repeaters had delay, the gates had delay, so you naturally had to build a clock to synchronise it all.
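For anyone wondering why two primitives are enough: De Morgan's law gives you AND (and from there everything else) out of just OR and NOT. A quick sketch in Go rather than redstone:

    package main

    import "fmt"

    // Redstone-flavoured primitives: a torch inverts, wires feeding the
    // same point OR together.
    func not(a bool) bool   { return !a }
    func or(a, b bool) bool { return a || b }

    // De Morgan: a AND b == NOT(NOT a OR NOT b).
    func and(a, b bool) bool { return not(or(not(a), not(b))) }

    // Everything else (NAND, NOR, XOR, ...) follows the same way.
    func xor(a, b bool) bool { return and(or(a, b), not(and(a, b))) }

    func main() {
        for _, in := range [][2]bool{{false, false}, {false, true}, {true, false}, {true, true}} {
            fmt.Println(in, "AND:", and(in[0], in[1]), "XOR:", xor(in[0], in[1]))
        }
    }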

The main benefit that I enjoyed was that you could see the signals physically propagating from one end of your CPU to the other in real time (clock cycles were in the range of 1-4 seconds), flowing through instruction fetch, decode, dispatch, logic, writing back to memory, etc. Seeing signals slowly crawl from one end to the other naturally introduced you to pipelining (it even happens naturally if you increase the clock without thinking about what will actually happen: the next instruction starts decoding before the previous one is done, more parts of your CPU start lighting up at once, and oops, now you have a pipelined CPU).

Even the scales match; many learners are surprised that the actual ALU is the tiny thing in the corner that you can barely see, and all the giant stacks of rows you actually saw are memory and cache. Even in Minecraft, most of your CPU is not logic :)

It also really taught me why ASICs are so much faster: you could build a tiny, compact multiplier that multiplied hundreds of times faster than your giant CPU running a multiplication algorithm.

Looks like the community I learnt from is still around, actually: https://openredstone.org/, even if all the old forum posts seem to be gone now. There were some geniuses on that place building computers with multi-tier caches, full pipelining, out-of-order (albeit primitive) execution, SIMD-capable CPUs, all in redstone.


Shameless plug, I made this website which simulates Minecraft redstone as a hobby project many years ago:

https://mordritch.com/mc_rss/

Make sure to press the "Play" button at the top for the simulator to actually run.


The state of the art these days is https://github.com/MCHPR/MCHPRS

It's an implementation of the Minecraft server that compiles redstone into a graph representation and then does optimisations like constant folding, tree shaking, etc. Very cool project. If you ever see timelapses of Minecraft computers running Game of Life, Mandelbrot, etc., it's most likely with this project.
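Constant folding over a gate graph is less exotic than it sounds. A rough Go sketch of the idea (not MCHPRS's actual code or data structures):

    package main

    import "fmt"

    // A node is either a constant or a gate over other nodes.
    type Node struct {
        Op     string // "CONST", "NOT", "AND", "OR"
        Value  bool   // only meaningful for CONST
        Inputs []*Node
    }

    // fold replaces a gate with a constant when its inputs allow it,
    // e.g. AND(x, CONST false) -> CONST false. A compiler can then drop
    // the whole subgraph feeding the dead input (the tree-shaking part).
    func fold(n *Node) *Node {
        if n.Op == "CONST" {
            return n
        }
        for i, in := range n.Inputs {
            n.Inputs[i] = fold(in)
        }
        switch n.Op {
        case "NOT":
            if a := n.Inputs[0]; a.Op == "CONST" {
                return &Node{Op: "CONST", Value: !a.Value}
            }
        case "AND":
            a, b := n.Inputs[0], n.Inputs[1]
            if (a.Op == "CONST" && !a.Value) || (b.Op == "CONST" && !b.Value) {
                return &Node{Op: "CONST", Value: false}
            }
        case "OR":
            a, b := n.Inputs[0], n.Inputs[1]
            if (a.Op == "CONST" && a.Value) || (b.Op == "CONST" && b.Value) {
                return &Node{Op: "CONST", Value: true}
            }
        }
        return n
    }

    func main() {
        lever := &Node{Op: "CONST", Value: false} // a lever stuck off
        other := &Node{Op: "NOT", Inputs: []*Node{{Op: "CONST", Value: true}}}
        folded := fold(&Node{Op: "AND", Inputs: []*Node{lever, other}})
        fmt.Println(folded.Op, folded.Value) // CONST false
    }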


This is awesome, thanks a lot!


> That's the reason computers have a clock, to make sure all transistors in a given stage of a CPU reach a steady state before moving on to the next instruction.

Here I was thinking[1][2] the reason computers had clocks was merely a consequence of the synchronous architectures that characterize them.

[1] https://en.wikipedia.org/wiki/Metastability_(electronics)

[2] https://en.wikipedia.org/wiki/Quasi-delay-insensitive_circui...


What point are you trying to make?

You are correct that clock free designs exist. But calling it a mere consequence of sync design seems to be a misunderstanding of why sync design has a clock in the first place.


In the case of synchronous design patterns that handle metastability, it was to point out that the clock clearly isn't there "to make sure all transistors in a given stage of a CPU reach a steady state". The clock is there to invoke state transition, whereas achieving steady state is a function of satisfying setup/hold times; the former is fundamentally constrained by the latter.

In the case of QDI circuits, it was to point out that there exist CPUs which do not contain clocks, again challenging the assertion that the reason computers have clocks is "to make sure all transistors in a given stage of a CPU reach a steady state".


The purpose of synchronous designs is to minimize the overhead of the handshaking that asynchronous designs bring with them. There is no other meaningful difference between them, really.


The netlist simulation of visual6502.org [1] works like this: it basically runs the simulation in a loop until the node network reaches a steady state, which means the current clock cycle is 'done'. This is a C port of that inner loop [2].

[1] http://visual6502.org/JSSim/index.html

[2] https://github.com/mist64/perfect6502/blob/268d16647c6b9cb0c...
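The inner loop is surprisingly small. A rough Go sketch of the "recalculate until nothing changes" idea (not the actual visual6502/perfect6502 code):

    package main

    import "fmt"

    // Gate computes one node's value from the current node values.
    type Gate struct {
        Out  int
        Eval func(nodes []bool) bool
    }

    // settle re-evaluates every gate until no node changes, i.e. the
    // network has reached a steady state for the current clock phase.
    func settle(nodes []bool, gates []Gate) {
        for changed := true; changed; {
            changed = false
            for _, g := range gates {
                if v := g.Eval(nodes); v != nodes[g.Out] {
                    nodes[g.Out] = v
                    changed = true
                }
            }
        }
    }

    func main() {
        // Node 0 is an input; nodes 1 and 2 are a chain of two inverters.
        nodes := []bool{true, false, false}
        gates := []Gate{
            {Out: 1, Eval: func(n []bool) bool { return !n[0] }},
            {Out: 2, Eval: func(n []bool) bool { return !n[1] }},
        }
        settle(nodes, gates)
        fmt.Println(nodes) // [true false true]
    }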


>This is why it's probably a good idea to work with an HDL instead of just trying to wing it.

During my CE studies I got hooked heavily on FPGAs and HDLs, but the number of times I actually ran into a timing problem that only occurred on hardware because of these delays can be counted on one hand, and I've been playing around with this stuff since about 2018. The worst one was where I got clock domain crossing wrong while trying to get DDR3 RAM working for my master's thesis. It worked flawlessly in simulation. Took me weeks to find the wrong status signal.

Yes, they can be a serious problem. Yes, you should know about them. But no, it's not necessary.


I'm trying to think of how you could simulate the non-instantaneous/clocked aspect of it in code without using an HDL, so you could write a more accurate simulator in a familiar language. I'm thinking you'd want each gate to store its value on the current clock tick, then you'd have a RunTick() function or something which uses the previous values to calculate the new values on the current tick for everything, then stores those new values. I'm not sure how that'd work for an SR latch, a gated SR latch, or a D flip-flop, if you wanted to model those at a gate level; would you have to model those as primitives?

EDIT: After a bit of thinking and bouncing ideas off of ChatGPT, one approach might be to have an SR latch that gets initialized to some arbitrary valid output, which can then use that starting output to calculate future states. From there, you could build up more complicated elements.
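Something along the lines of what's described above, as a minimal Go sketch (all names invented): every gate reads the previous tick's values and writes to a separate "next" state, and the SR latch is seeded with a valid output so the feedback loop has somewhere to start.

    package main

    import "fmt"

    // Two-phase tick: every gate reads the previous tick's values and writes
    // into a separate "next" state, which is committed at the end, so the
    // evaluation order doesn't matter (a crude stand-in for gate delay).
    type Circuit struct {
        vals  map[string]bool
        gates []func(prev, next map[string]bool)
    }

    func (c *Circuit) Tick() {
        next := map[string]bool{}
        for k, v := range c.vals {
            next[k] = v
        }
        for _, g := range c.gates {
            g(c.vals, next)
        }
        c.vals = next
    }

    func main() {
        // An SR latch out of two cross-coupled NORs, seeded with a valid
        // state (Q=0, Qn=1) so the feedback loop has somewhere to start.
        c := &Circuit{vals: map[string]bool{"S": false, "R": false, "Q": false, "Qn": true}}
        c.gates = []func(prev, next map[string]bool){
            func(p, n map[string]bool) { n["Q"] = !(p["R"] || p["Qn"]) },
            func(p, n map[string]bool) { n["Qn"] = !(p["S"] || p["Q"]) },
        }

        c.vals["S"] = true // pulse Set
        c.Tick()
        c.Tick() // a second tick lets the feedback settle
        c.vals["S"] = false
        c.Tick()
        fmt.Println("Q =", c.vals["Q"], "Qn =", c.vals["Qn"]) // Q = true Qn = false
    }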


Or you can build a computer on a breadboard: https://www.youtube.com/playlist?list=PLowKtXNTBypGqImE405J2...

I did this, and not only was it great fun, it taught me loads about clocks and buses etc. that were all magic before.


But there exist clockless architectures.


Is clockless useful in the face of modern ASIC tools? Not that you were suggesting in either direction, but it's been something in the back of my head for a while now.

Sure, there's a slight power and latency advantage in avoiding flip-flops themselves, but every tool has slack stealing, and so the core advantage everyone seems to claim for clockless design (that is, the ability to have different latencies for all your different pipelines) isn't that unique anymore and hasn't been for a long time. Is it dead? Can I stop worrying about the async demons :)


There are also things like a CMOS Z80 coupled with static RAM, where you still deal with propagation delay, but you can single-step the clock and everything works like you expect. Mentioned as it would fit pretty well with the code model the linked project implements.


Discussed at the time:

I don't know how CPUs work so I simulated one in code - https://news.ycombinator.com/item?id=19969321 - May 2019 (172 comments)


Ben Eater did an excellent series on YouTube where he built a CPU from scratch on breadboards: https://m.youtube.com/playlist?list=PLowKtXNTBypGqImE405J256...

His channel has a lot of other really great stuff on general electronics as well.


One of the most enlightening courses I took in university back in the day was digital electronics. Not because I ever wanted to muck about with it, but because we actually got to build our own super-simple physical 8-bit CPU. We had registers and an ALU and RAM and eight output LEDs, and we got to write the microcode for the fetch-execute cycle. Clock? There was a physical switch you would toggle on and off to make it step through the cycles to slowly execute the program we wrote in our own machine code. Realizing that instructions are just a bit pattern saying which unit should write to the bus and which unit should read from the bus was quite eye-opening.
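That "bit pattern of who drives the bus and who latches from it" idea is easy to show in code. A made-up control word along those lines, as a Go sketch (the unit names are invented, not from any particular machine):

    package main

    import "fmt"

    // One microinstruction: each bit enables one unit's bus output or input.
    // The unit names are invented for illustration.
    const (
        PCOut  = 1 << iota // program counter drives the bus
        RAMOut             // RAM drives the bus
        AOut               // A register drives the bus
        ALUOut             // ALU result drives the bus
        PCIn               // program counter latches from the bus
        MARIn              // memory address register latches from the bus
        IRIn               // instruction register latches from the bus
        AIn                // A register latches from the bus
        OutIn              // output register (the LEDs) latches from the bus
    )

    // A fetch cycle is then just two control words: put PC on the bus and
    // latch it into MAR, then put RAM on the bus and latch it into IR.
    var fetch = []uint16{
        PCOut | MARIn,
        RAMOut | IRIn,
    }

    func main() {
        for i, word := range fetch {
            fmt.Printf("step %d: control word %09b\n", i, word)
        }
    }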


Old minicomputers had toggle switches on the front panel where you could set the program counter, enter machine code instructions, and other things, as well as step the clock.

https://thumbs.worthpoint.com/zoom/images2/1/0619/05/vintage...


Image no longer works. Reupload?


It's not my site. It's the front panel of a Texas Instruments model 980. It's from this page.

https://www.worthpoint.com/worthopedia/vintage-ti-980-ttl-ba...


I really love projects like this, where you learn a lot by just getting stuck in.

I built a simple 8-bit computer using a Z80 chip. You can read about it a bit more here https://www.jake-reich.co.uk/zx-jakey


Love to see people doing CPU projects!

I just recently wrote a JavaScript emulator [1] for a simple 8-bit CPU I had built with logic gates. Both were fun projects, but one took a weekend whereas the other took about 4 months. Having skimmed through the author's code, I totally understand why it took them so much longer to get theirs working. It's implemented with gate-level simulated logic gates, which is an impressive achievement. It's so much easier to just emulate whole components like I did.

[1] https://naberhausj.com/miscellaneous/8-bit-computer/page.htm...



This looks super cool. Thank you for the link


> "The only cheat bit is to get the keyboard input and display output working I had to hook up go channels to speak to the outside world via GLFW..."

What I learned from Ben Eater is that a non-cheaty solution would have been to write a simple UART serial "device" and then interact with the CPU via serial communication with a terminal.
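Roughly the memory-mapped version of that idea, as a Go sketch (the addresses and register layout are invented, not Ben Eater's or the article's):

    package main

    import "fmt"

    // A tiny memory-mapped "UART": the CPU reads and writes two made-up
    // addresses instead of the emulator calling out to host code directly.
    const (
        uartData   = 0xFF00 // read: next received byte, write: byte to transmit
        uartStatus = 0xFF01 // bit 0 set when a received byte is waiting
    )

    type UART struct {
        rx []byte // bytes "typed" at the terminal, waiting to be read
    }

    func (u *UART) Read(addr uint16) byte {
        switch addr {
        case uartStatus:
            if len(u.rx) > 0 {
                return 1
            }
            return 0
        case uartData:
            if len(u.rx) == 0 {
                return 0
            }
            b := u.rx[0]
            u.rx = u.rx[1:]
            return b
        }
        return 0
    }

    func (u *UART) Write(addr uint16, b byte) {
        if addr == uartData {
            fmt.Printf("%c", b) // "transmit" straight to the host terminal
        }
    }

    func main() {
        u := &UART{rx: []byte("hi\n")}
        // What the emulated CPU's load/store loop would boil down to:
        // poll the status register, then echo each received byte back out.
        for u.Read(uartStatus) == 1 {
            u.Write(uartData, u.Read(uartData))
        }
    }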


Articles like this remind me of that one time I wrote a 6502 simulator in Turbo Pascal (complete with a primitive text mode disassembler where you could view and edit the machine code stored in memory) as a semester break project. It's of course very far from the complexity of today's CPUs, but basically they still work the same, and all of the optimizations of modern CPUs (pipelining, various cache levels, parallel execution, prefetch, branch prediction etc. etc. etc.) should be transparent to the user - of course, if you want to eke out the last bit of performance (targeting a specific CPU), it helps to be aware of all of that, but you don't need it for understanding how a CPU works.


Neat, I've seen a few logsim systems. After watching Ben Eater's videos (and not having enough time to tinker with hardware) I built

https://git.sr.ht/~vhodges/cputhing


It can be fun to look at the various "achievements" or things that are generally possible for determined people (no geniuses allowed).

A CPU just happens to be one of them. Fascinating that such a seemingly complex marvel of innovation is just something that everyone can do.

Of course, there's lots of groundwork and you need to know "the trick" (e.g. that you can use transistors to make computation machines)... but it's easily within your grasp.


I was once shown a DOS-based CPU simulator back in the mid/late 90s.

From memory it showed instruction decode, execution, cache and memory.

Unfortunately I've never been able to find it, because all the google results are about running DOS games and/or DOSBox.


It is probably on one of the SIMTEL shovelware CD-ROMs. https://archive.org/details/SIMTEL_0692

A search engine won't help you, but a local LLM might.


Good suggestion, thanks.

ChatGPT came up with what I thought was a hallucination: "CPU Sim".

After some searching, I found a Facebook post[1] with a photo of it running. That looks mostly like what I recall. The screenshot is from a 1987 version.

I can also find references to a Windows version, but no copies of either. I thought there were more features to it, but perhaps I'm misremembering, or I'm also thinking of MikroSim, which Tomte mentions in the sibling comment.

[1] https://www.facebook.com/RMUserGroup/posts/pfbid0uVrCYMsWmJa...


Mikrosim maybe? The name is because it simulates microcode. That was a huge a-ha moment for me in school.

https://en.wikipedia.org/wiki/MikroSim

http://www.mikrocodesimulator.de/index_eng.php


Good find, thanks.

That one is looking more familiar the more screenshots I see of it, but I think I might be mixing memories of two programs - this, and "CPU Sim" which I mention in a reply to the sibling comment.


This is cool. Reminds me of my internship days when I had to build a custom 8086 emulator in Java, and I went down the rabbit hole of trying to figure out the best way to represent the architecture.


If, for fun, I wanted to train an ML model on a ton of CPU instructions (with each predicted state/label being the state of the registers), does anyone have any clue how to gather that kind of data?


QEMU isn’t cycle-accurate, but would be a good start (and probably good enough). Just run some benchmarks and whatnot there, and use a tracing tool like Cannoli to capture instructions.

If you need real instructions (without an emulator like qemu doing its own translation and messing up timing), you could use a simulator like Gem5. That’s a bit more work and a lot more compute per simulated instruction.


Maybe a game console emulator?


Reminds me of some scenes from the first book of The Three-Body Problem.


I also don’t know and thinking about this makes me claustrophobic because I enjoy programming.



