From their GitHub page:
>Traditional emulators on CPUs execute code sequentially. This is a tricky method of emulation because real hardware has many chips and all of them work in parallel...This requires a lot of CPU power to emulate even an old and slow retro computer. Sometimes even a modern CPU working at 100 times the speed of the retro computer is not enough, so the emulator has to use approximation, skip emulation of some less important parts, or assume some standard work of the emulated system without extraordinary usage.
> FPGA doesn't need high frequencies to emulate retro computers; it works at much lower frequencies than traditional emulators require. Since everything in FPGA works in parallel, it is no problem to handle any possible usage of the emulated system.
(Edited for formatting)
If you've actually decapped the original chips and duplicated them exactly in an FPGA, that's pretty cool. But otherwise it's just another approximation. The lower power requirements are nice, of course.
Say you have two implementations of an LED controlled by a switch: one which uses an FPGA and one which uses a microcontroller. The uC implementation must continuously poll peripherals connected to its GPIO pins at a set frequency; it must check the state of the switch, and then change the state of the LED. The FPGA, on the other hand, physically wires the switch to the LED; there is no lag when the state of the switch changes.
The FPGA implementation can be scaled to connect however many additional lights and switches you want (limited by the size of the fabric), with zero overhead lag. This is the parallelization benefit of FPGAs that you may hear about. For the uC implementation, you must add additional switches and lights to the polling loop, which brings down performance in linear time, O(n). This is the drawback of sequential processing.
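The polling loop described above can be sketched in a few lines (Python for illustration; the `read_pin`/`write_pin` functions are hypothetical stand-ins for a microcontroller's GPIO API):

```python
# Sketch of a microcontroller-style polling loop. Each switch/LED pair added
# to the list costs one more read and one more write per iteration: O(n) per
# poll, unlike the FPGA's wired-in-parallel approach.

def poll_once(pairs, read_pin, write_pin):
    """Copy each switch's state to its LED; cost grows linearly with len(pairs)."""
    for switch_pin, led_pin in pairs:
        write_pin(led_pin, read_pin(switch_pin))

# Toy in-memory "pins" standing in for real GPIO hardware:
pins = {"sw0": 1, "sw1": 0, "led0": 0, "led1": 1}
poll_once([("sw0", "led0"), ("sw1", "led1")],
          read_pin=pins.__getitem__,
          write_pin=pins.__setitem__)
print(pins["led0"], pins["led1"])  # LEDs now mirror the switches
```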
On the NES and SNES, the buttons are connected to a shift register (e.g. 4021). The CPU triggers a latch and then reads out the shift register one bit at a time.
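The latch-then-shift protocol can be modelled in a few lines (a behavioural sketch of a 4021-style register, not a cycle-accurate model):

```python
class ShiftRegister4021:
    """Behavioural model of a parallel-in/serial-out shift register (e.g. 4021).
    The console strobes the latch, then clocks out one button bit at a time."""

    def __init__(self):
        self.bits = [0] * 8

    def latch(self, button_states):
        # Parallel load: capture all 8 button states at once.
        self.bits = list(button_states)

    def clock(self):
        # Serial read: shift out one bit per clock pulse.
        return self.bits.pop(0) if self.bits else 1

reg = ShiftRegister4021()
reg.latch([1, 0, 0, 0, 1, 0, 0, 0])   # two buttons held when the latch fires
buttons = [reg.clock() for _ in range(8)]
print(buttons)  # [1, 0, 0, 0, 1, 0, 0, 0]
```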
See for example the tale of an absolutely wild mGBA investigation that was posted here a while ago:
"What happens if an interrupt gets raised between prefetch and the data load? Will it start prefetching the interrupt vector before the invalid memory access? I quickly mocked this up in mGBA, turned on interrupts in the test ROM, and sure enough it broke out of the loop. So I tried the same test ROM on hardware and…it did not break out of the loop. So there goes that theory. Eventually I realized something. You saw that asterisk earlier I’m sure, so yes, there is one thing that can happen in between prefetch and the memory access, but only if the memory bus gets queried by something other than the CPU between the prefetch and invalid memory access."
In general, software has a much lower cost of development than the cost of developing something for FPGA, and if you had something like a Verilog implementation of your emulator, it is not necessarily true that you need to run it on an FPGA—you can run it in software, or use it to verify a software implementation.
I think the real argument here for FPGAs is that some things are tricky to emulate with reasonable speed and accuracy in software. I don’t think the other arguments hold up—for example, arguments about latency—since the time scales involved are fairly generous (16ms to generate a frame of video).
Definitely not. Per your example, "what happens if an interrupt gets raised between prefetch and the data load" is not a question that the type of emulation can answer. You can implement hardware as well as software that gets details like this right or wrong. In both cases you usually need an extensive catalog of observations or a full description of the original hardware to correctly emulate it functionally.
But at least in the FPGA case you're directly simulating the behaviour of the discrete components such that their parallel, real-world interactions should tend toward accuracy. It should be similar to the step from HLE to LLE, where LLE is simpler, more accurate, and less hacky, but way less performant. LLE to FPGA would be a similar transition but without the performance penalty.
The particular example put forth was a good example, and it's worth responding to. If you can come up with a better example, I'd love to hear it.
My argument is that the timing of inputs and outputs to consoles is heavily quantized, which gives you a lot more freedom if you want to create an accurate software implementation--depending on whether your goals are to emulate existing games or serve as a platform for experimentation. For example, the NES has lots of "timing tricks" but they are internal to the console, and at the scales it takes to render a frame of video, you can do a lot of work in software.
> In Comments
> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.
I don't have a side in this discussion. I just stepped in to note one reason why you might have gotten the reply you did, since you specifically noted "Don’t know what you’re getting at." I was merely attempting to illustrate why I thought you might have received the response you did.
> My argument is that the timing of inputs and outputs to consoles is heavily quantized...
That's probably a good argument. That's not what you replied with though. If you had, I don't think you would have gotten the response you did.
>> > Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.
That's as much a point against your comment as against the person that responded to you. Responding to a specific example instead of the position is not responding to the strongest possible interpretation. Then you get responses like you did.
Again, I'm not taking a position on this. I only bothered to comment at all because you seemed to indicate you didn't know why you got the response you did. Honestly, I thought you were likely presenting it as a slight tangent, not an argument, since it's fairly common here for people to view any response as someone arguing the other side once they get slightly heated in an argument.
Oh, that's completely unnecessary. When I said, "Don't know what you're getting at," I was hoping that the commenter understood something that I didn't.
> That's probably a good argument. That's not what you replied with though. If you had, I don't think you would have gotten the response you did.
This is a good illustration of why the guidelines are there in the first place. Comment #1 was the weaker argument, comment #2 was the stronger comment. If you have a discussion with someone, you can get from comment #1 to #2 and beyond.
Telling me that if I had written comment #2 first is... not helpful! Because the process of writing comment #2 involved reading replies to comment #1!
To give you an example: I don't remember which game it was, but it was a platformer. When the character jumped, the game put the starting pitch of the jumping sound effect into the sound chipset, with the DSP doing the volume loss and pitch change on each cycle. The game then moved the character according to a certain value in one of the DSP's registers, so the game worked normally until you jumped; at that point the DSP emulation code did something like returning the original value or 0 (because 99.9% of games only ever write there), and the game crashed.
In the best world, I could use an emulator as a reference, but for now I have to keep the old hardware around for that reason and always make sure to test my code on it regularly. Even differences that have no impact on the entire software library are important in that sense.
For example, if I write an address to an OPL chip I normally have to wait a few cycles for the address change to be effected before I write a value. You can remove that limitation in an emulator and all software written for the original platform will still work, but the emulator will no longer be useful as a reference, because now I can write software that works in the emulator but not on the real system.
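That timing constraint can be modelled directly. Here's a toy sketch of a "strict" chip model that enforces the settle time; the cycle count and register numbers are illustrative, not taken from any OPL datasheet:

```python
class StrictOPLModel:
    """Toy model of a sound chip that needs settle time between the address
    write and the data write. SETTLE_CYCLES is a made-up figure."""

    SETTLE_CYCLES = 12  # hypothetical delay required after an address write

    def __init__(self):
        self.cycle = 0
        self.addr = None
        self.addr_written_at = None
        self.regs = {}

    def tick(self, n=1):
        self.cycle += n

    def write_address(self, addr):
        self.addr = addr
        self.addr_written_at = self.cycle

    def write_data(self, value):
        if self.cycle - self.addr_written_at < self.SETTLE_CYCLES:
            raise RuntimeError("data written before address change settled")
        self.regs[self.addr] = value

chip = StrictOPLModel()
chip.write_address(0x20)
chip.tick(12)          # software must burn cycles here, as on real hardware
chip.write_data(0x21)  # accepted: the required delay elapsed
```

An emulator that drops the settle check still runs all correctly written software, but code that skips the `tick(12)` would work under it and fail on hardware, which is exactly the reference problem.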
There's of course a use case for emulators that take these kinds of shortcuts for the purpose of running an existing software library, and that's how people were able to play SNES games on Pentium class hardware, but as a point of reference and complete functional preservation of the hardware platform itself, that isn't good enough.
It's similar with CPUs. CPU manuals from the 1970s and 1980s often specify the exact cycle counts for instructions while having simple bus interfaces and no caches. Coming up with cycle accurate hardware for these CPUs is not all that hard.
The Amiga is the last machine I ever had for which I was able to get complete specifications and schematics and understand completely. Once the 386/486 came along, documentation of SMM mode and other internal details became restricted. Modern computers have so much hidden firmware that understanding the entire system is virtually impossible. In contrast, vintage 8/16 (and some 32) bit machines can be completely understood by a single determined individual.
This is either incredibly badly explained or wrong. Verilog compiles to gate level.
Also, I think most decapping is done to read data tables out of ROMs, not to figure out what all the gates in a chip do.
FPGA cores are literal CPUs.
This might come down to the emulator also running on a modern OS, which cannot guarantee at all times smooth framerates, whereas a dedicated FPGA can promise you that it's not running much more than the actual core needed to emulate that system.
Especially on a 50/60Hz CRT monitor, the latency from the controller to pixels on the screen is noticeably lower, and things like screen scrolling and sprites are buttery smooth, just like on the original hardware.
Theoretically it would be possible to automate this with a couple things:
- USB electron microscope to image the transistor topology
- CV lib to identify connections and generate corresponding Verilog code
My understanding is that there are people who do it often enough that it is automated in the way you describe, but you still need someone with a lot of skill to spend serious time on it. Computer vision works wonders but there are errors which must be identified and fixed.
A lot of the chips people care about can just be done optically, no electron microscope needed.
One thing I noticed from owning a C64-Maxi is how much of the experience is the physical aspect - We interacted with these machines through keyboards of different designs and that was a very important part of the overall experience.
Running VALDOCS on an Epson QX-10 without a VALDOCS keyboard is not quite the right feel.
The USB controllers made for the recent mini systems (NES, SNES, Genesis, etc.) make great accessories for the MiSTer. Add a couple of USB arcade sticks and you can really play almost any classic retro game as it was meant to be played.
And then there are all the classic computer cores even including the PDP-1!
The Amiga core is fun, AGA, 2 MB chip, 384 MB fast. It supports hard disk images, so you can do a hard disk based Workbench installation and load games and demos practically instantly (and safely exit to Workbench) using WHDLoad.
Arcade cores are fun as well. Just like in childhood, but less hungry for quarters. :) Recently played with arcade Gauntlet core a bit for example.
They're basically similar efforts but approaching the problem from a different point of view.
Given how hard FPGA programming is, the work in MiSTer is quite magnificent.
But it is an amazing project. Instead of emulating, they actually rebuilt the old custom ICs (which 8-bit computers were full of) in an FPGA. Really impressive.
But in most cases, it is emulation, as the lead developer will attest.
"From my point of view, if the FPGA code is based on the circuitry of real hardware (along with the usual tweaks for FPGA compatibility), then it should be called replication. Anything else is emulation, since it uses different kinds of approximation to meet the same objectives. Currently, it's hard to find a core that can truly be called a replica – most cores are based on more-or-less functional recreations rather than true circuit recreation. The most widely used CPU cores – the Z80 (T80) and MC68000 (TG68K) – are pure functional emulations, not replications. So it's okay to call FPGA cores emulators, unless they are proven to be replicas."
But there's nothing wrong with emulation for preservation, until we get to a point where we can wide-scale clone these older chips down to the transistor level through analysis of delayered decap scans. And even then, emulation will be useful for artificial enhancements as well as for understanding how all those transistors actually worked at a higher level.
It's also not a total solution: by taking many more transistors to programmatically simulate just one, it limits the maximum scale and frequency of what can be supported. N64/PS1/Saturn support is not yet complete; it's still theoretical, though likely possible. Going beyond that is not possible at this time.
Software emulation and FPGA devices should be seen as complementary approaches, rather than competitive. The developers of each often work together, and new knowledge is mutually beneficial.
And yeah I hope we can easily order small batches of ICs (at big pitch of course) in a few years, in a similar way to how creating PCBs has become so simple now.
I mean I remember how much of a PITA it was in the 80s. Drawing on overhead sheets. All the acids and other chemicals. Drilling. And now we get super-accurate 10x10cm boards dual-layer, drilled, soldermasked and silkscreened for a buck a pop with a minimum of 10. Wow. I really hope this trend continues down to the scale of ICs (or that FPGAs simply get better/easier).
By the way, emulating a CPU is pretty easy and very accurate anyway. The big problem with accurate emulation is with some of the peripheral ICs, which used hard-to-emulate things like analog sound generators.
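To illustrate why plain CPU emulation is the straightforward part: at its core it's just a fetch-decode-execute loop with a table lookup per opcode. Here's a sketch for an imaginary three-instruction CPU (the opcodes and semantics are invented):

```python
# Minimal fetch-decode-execute loop for a made-up 8-bit CPU. Real cores add
# flags, addressing modes, and cycle counting, but the shape is the same.

def run(memory, steps):
    acc, pc = 0, 0
    for _ in range(steps):
        op = memory[pc]; pc += 1
        if op == 0x01:            # LDA imm: load accumulator
            acc = memory[pc]; pc += 1
        elif op == 0x02:          # ADD imm: add immediate, 8-bit wraparound
            acc = (acc + memory[pc]) & 0xFF; pc += 1
        elif op == 0x00:          # HLT: stop execution
            break
    return acc

program = [0x01, 0x05, 0x02, 0x03, 0x00]  # LDA 5; ADD 3; HLT
print(run(program, 10))  # 8
```

The analog peripherals have no equivalent of this neat opcode table, which is where the real difficulty lies.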
The limiting factor here is the amount of stuff you can throw into a single FPGA, correct?
So in theory, shouldn't it be possible to tie a bunch of FPGAs together, with two beefy ones being responsible for replicating CPU / GPU functionality, a couple smaller ones for sound and other "helper" processors, and some bog-standard ARM SoC to provide the bitstreams to the FPGAs and emulate storage (game cartridges, save cards) and input elements (mainly "modern" controllers)?
And the speed that you can get your design to run at. Something like the Game Cube (PPC750 @ 485 MHz) would be difficult to implement in an FPGA, for example.
If you take the SNES core, my software emulator has 100% compatibility and no known bugs, and synchronizes all components at the raw clock cycle level. It also mitigates most of the latency concern through a technique known as run-ahead. But it does require more power to do this.
But people keep presuming an improved accuracy that there's no basis for.
The caveat is that it doesn't always work, and it makes the power requirements even more unbalanced. Some might also see it as a form of cheating to go below the original game's latency. If you want to match the original game's latency precisely, FPGAs are the way to go right now for sure.
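Run-ahead can be sketched with a toy core that supports save states (the frame counts and state layout here are purely illustrative):

```python
import copy

class ToyCore:
    """Stand-in for an emulator core: the state is just a frame counter plus
    the last input it saw. run_frame() advances one emulated frame."""
    def __init__(self):
        self.state = {"frame": 0, "last_input": None}
    def run_frame(self, pad_input):
        self.state["frame"] += 1
        self.state["last_input"] = pad_input
    def save_state(self):
        return copy.deepcopy(self.state)
    def load_state(self, s):
        self.state = copy.deepcopy(s)

def run_ahead_frame(core, pad_input, lead=2):
    """Run `lead` extra frames with the new input, display the last one, then
    rewind so emulated time stays correct. Hides `lead` frames of the game's
    internal input latency, at the cost of emulating lead+1 frames per real
    frame -- hence the extra power requirement."""
    core.run_frame(pad_input)        # the "real" frame
    saved = core.save_state()
    for _ in range(lead):            # speculatively run ahead
        core.run_frame(pad_input)
    displayed = core.save_state()    # this is what the player sees
    core.load_state(saved)           # rewind to the true timeline
    return displayed

core = ToyCore()
shown = run_ahead_frame(core, pad_input="A", lead=2)
print(core.state["frame"], shown["frame"])  # 1 3
```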
> Computers - Classic
• Acorn Archimedes
• Acorn Atom
• Alice MC10
• Altair 8800
• Amstrad CPC 6128
• Amstrad PCW
• ao486 (PC 486)
• Apple I
• Apple II+
• Apple Macintosh Plus
• Atari 800XL
• Atari ST/STe
• BBC Micro B,Master
• Color Computer 2, Dragon 32
• Commodore 16, Plus/4
• Commodore 64, Ultimax
• Commodore PET
• Commodore VIC-20
• DEC PDP-1
• Jupiter Ace
• Laser 310
• Oric 1 & Atmos
• SAM Coupe
• Sharp MZ Series
• Sinclair QL
• TRS-80 Model 1
• Vector 06C
• ZX Spectrum
• ZX Spectrum Next
> Consoles - Classic
• Atari 2600
• Atari 5200
• Atari Lynx
• ColecoVision, SG-1000
• Gameboy, Gameboy Color
• Gameboy Advance
• SMS, Game Gear
• TurboGrafx 16 / PC Engine
> Other Systems
• Epoch Galaxy II
• Flappy Bird
• Game of Life
• TomyTronic Scramble
Seems pretty well documented. Considering the simplicity of the computer, it feels like it would be a relatively easy project to get it onto MiST.
But with some effort, it would be awesome.
MiSTer: Run Amiga, SNES, NES and Genesis on an FPGA - https://news.ycombinator.com/item?id=18721594 - Dec 2018 (30 comments)
Which it is not, but still very awesome.
Anyway - if one would try to actually use FPGAs for developing a new game - would there be any benefits?
I mean, programming it will probably be the main hurdle.
But if one would be good at it - and ignoring for a moment that the typical gamer does not have an FPGA around, would there be actually benefits when programming a new game?
I thought about it in combination with a normal gaming computer. Using the FPGA for complex simulations like physics? Would that make any sense?
The graphics card is specialised for this and has better cost/performance. Graphics cards basically are specialised gaming hardware optimised as much as possible for that use case.
Trying to teach games programmers to use FPGA tooling would be a spectacular disaster. Oh, and FPGA toolchain-targeting is to a specific device; you probably don't remember the early days when games would target a specific model of 3DFX card, but that's what you'd go back to. Upgrade your hardware and you need recompiled versions of all your games.
The one case where it might make sense is if you focus on latency to a brutal extent. DisplayPort on one side of the FPGA, wired game controller (NOT USB) on the other. A unique experience that's hard for normal systems to replicate.
I know that GPUs are mainly optimized for games, but aren't they still general-purpose computing units?
(which is why Nvidia put a special physics unit into their GPUs)
So the idea, as I understand it, is that with FPGAs your logic is directly in circuitry, which in theory beats general-purpose units.
So my actual question would have been whether this would allow for more complex simulations. Most advancement in games just seems to be in making them look prettier, not more complex, just faking complexity. But I really like the idea of an advanced voxel engine, for example.
But from your input it seems FPGAs, as of today, are probably not the way to get practical improvements there, and it's better to focus on the GPU.
Edit: it was erroneous of me to state the board was being sold at a loss; rather, I meant that the board was definitely being subsidized by companies such as Intel and their partners such as Panasonic. My mistake.
I also wasn't meaning to convey that the consumer Digikey pricing was the same as for large-volume manufacturers such as Terasic. Rather, I meant to demonstrate and agree with the OP on the astounding situation that MiSTer currently exists in, owing to the lack of economic viability for someone to produce a low-volume commercial FPGA emulation machine for a niche audience without any subsidization.
A better way to approach this is as follows: what’s the die size of an FPGA like this? What’s the production cost of the die? Then check the historic gross margin percentage of FPGA companies. Xilinx is around 68%, and that includes high-end products which carry the highest markups, unlike this cookie cutter thing.
That should give you a good ballpark number.
DigiKey charges what they do because nobody else is willing to sell these things in low volume, and they have very high inventory costs.
Anyone who thinks that Terasic sells these at a loss doesn’t have a clue about volume pricing of FPGAs. And as a special Intel partner, there’s little doubt that Terasic has access to this kind of pricing.
Not even a little bit. Retro gaming emulation accounts for basically zero revenue compared to other markets that Xilinx serves, like HPC, networking, and computer vision.
I have a couple of FPGA boards on their way to me in the mail which I intend to use for some homebrew video game projects. Besides getting the development environment working, it can be tricky outputting video from an FPGA because of the precise timing involved. I will have to look through these resources to see if there are any good tricks to use here.