MiSTer, an open-source FPGA gaming project (github.com/mister-devel)
343 points by tediousdemise 36 days ago | 107 comments

I don’t really know much about game emulation so I was curious about what differentiates this FPGA game project vs traditional CPU emulation.

From their github page [1]:

>Traditional emulators on CPUs execute code sequentially. This is a tricky method of emulation because real hardware has many chips and all of them work in parallel...This requires a lot of CPU power to emulate even an old and slow retro computer. Sometimes even a modern CPU working at 100 times the speed of the retro computer is not enough, so the emulator has to use approximation, skip emulation of some less important parts, or assume some standard work of the emulated system without extraordinary usage.

> FPGA doesn't need high frequencies to emulate retro computers; it works at much lower frequencies than traditional emulators require. Since everything in FPGA works in parallel, it is no problem to handle any possible usage of the emulated system.

[1] https://github.com/MiSTer-devel/Main_MiSTer/wiki/Why-FPGA

(Edited for formatting)

There was a good article from Ars Technica a decade ago that pointed out why you need so much more power to get perfect emulation. Exact emulation takes a lot of power because a few games use odd tricks that are hard to document and precisely reimplement in software. FPGA emulation gets around that by more directly emulating the hardware.


byuu wrote a good article about this. Unfortunately it's no longer available, but basically it should be self-evident that there's nothing inherently more accurate about hardware emulation.

If you've actually decapped the original chips and duplicated them exactly in an FPGA, that's pretty cool. But otherwise it's just another approximation. The lower power requirements are nice, of course.

On FPGAs (depending on the hardware mapping), you get the benefit of lower latency. I consider this to be timing accuracy.

Say you have two implementations of an LED controlled by a switch: one which uses an FPGA and one which uses a microcontroller. The uC implementation must continuously poll peripherals connected to its GPIO pins at a set frequency; it must check the state of the switch, and then change the state of the LED. The FPGA, on the other hand, physically wires the switch to the LED; there is no lag when the state of the switch changes.

The FPGA implementation can be scaled to connect however many additional lights and switches you want (limited by the size of the fabric), with zero overhead lag. This is the parallelization benefit of FPGAs that you may hear about. For the uC implementation, you must add additional switches and lights to the polling loop, which brings down performance in linear time, O(n). This is the drawback of sequential processing.
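To put rough numbers on the O(n) polling cost, here is a toy model of the microcontroller side only. It is a sketch, not a model of any real part: `CYCLES_PER_PAIR` and both function names are made up for illustration.

```python
# Toy model: a microcontroller polling n switch/LED pairs in a single loop.
# CYCLES_PER_PAIR is an invented cost, not a measurement of any real part.
CYCLES_PER_PAIR = 10


def poll_once(switches, leds):
    """Copy each switch state to its LED, one pair at a time.

    Returns the number of cycles the pass took in this toy model.
    """
    cycles = 0
    for i, state in enumerate(switches):
        leds[i] = state
        cycles += CYCLES_PER_PAIR
    return cycles


def worst_case_lag_cycles(n_pairs):
    """A switch that flips just after it was checked waits one full pass."""
    return poll_once([False] * n_pairs, [False] * n_pairs)


for n in (1, 10, 100):
    print(n, "pairs ->", worst_case_lag_cycles(n), "cycles of worst-case lag")
```

The lag grows linearly with the number of pairs. In the FPGA version each switch has its own wire to its own LED, so adding pairs adds fabric usage rather than delay.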

Most game consoles don’t do any of this, though. The gamepad is polled by software.

On the NES and SNES, the buttons are connected to a shift register (e.g. 4021). The CPU triggers a latch and then reads out the shift register one bit at a time.
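As a rough sketch of that latch-then-shift readout for an NES pad (simplified: it ignores the electrical polarity of the real signals, and the class and function names are hypothetical; the button order is the one commonly documented for the standard NES controller):

```python
# Order in which a standard NES pad shifts out its buttons after a latch.
BUTTON_ORDER = ["A", "B", "Select", "Start", "Up", "Down", "Left", "Right"]


class PadShiftRegister:
    """Toy model of a parallel-in, serial-out register such as the 4021."""

    def __init__(self):
        self.pressed = set()      # names of buttons currently held down
        self._bits = [0] * 8      # contents of the shift register

    def latch(self):
        """Capture all eight button states at the same instant."""
        self._bits = [1 if name in self.pressed else 0 for name in BUTTON_ORDER]

    def read_bit(self):
        """Shift one bit out; this model shifts a 1 in behind it."""
        bit = self._bits.pop(0)
        self._bits.append(1)
        return bit


def read_pad(pad):
    """What the game's input routine does: latch once, then read 8 bits."""
    pad.latch()
    return {name: bool(pad.read_bit()) for name in BUTTON_ORDER}


pad = PadShiftRegister()
pad.pressed = {"A", "Left"}
print(read_pad(pad))
```

The point for this thread is that the sampling instant is chosen by software (the latch), not by the wiring.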

This would be less about a user peripheral like the gamepad (which is obviously going to be read out exactly once per frame anyway) and more about getting subtle interactions between the CPU, memory/DMA, and specialized systems for graphics/audio correct. And not just correct after thousands of hours of work to smoke out the exact sources of specific title bugs, but correct essentially for free.

See for example the tale of an absolutely wild mGBA investigation that was posted here a while ago:

"What happens if an interrupt gets raised between prefetch and the data load? Will it start prefetching the interrupt vector before the invalid memory access? I quickly mocked this up in mGBA, turned on interrupts in the test ROM, and sure enough it broke out of the loop. So I tried the same test ROM on hardware and…it did not break out of the loop. So there goes that theory. Eventually I realized something. You saw that asterisk earlier I’m sure, so yes, there is one thing that can happen in between prefetch and the memory access, but only if the memory bus gets queried by something other than the CPU between the prefetch and invalid memory access."


I don’t think that the FPGA based emulation will be “correct essentially for free”. The FPGA emulation itself is a lot more labor-intensive to create in the first place. In order to make it work equivalently to original hardware, you will need to accurately understand how the original hardware works, which is something you would need for software implementations too.

In general, software has a much lower cost of development than the cost of developing something for FPGA, and if you had something like a Verilog implementation of your emulator, it is not necessarily true that you need to run it on an FPGA—you can run it in software, or use it to verify a software implementation.

I think the real argument here for FPGAs is that some things are tricky to emulate with reasonable speed and accuracy in software. I don’t think the other arguments hold up—for example, arguments about latency—since the time scales involved are fairly generous (16ms to generate a frame of video).

> but correct essentially for free.

Definitely not. Per your example, "what happens if an interrupt gets raised between prefetch and the data load" is not a question that the type of emulation can answer. You can implement hardware as well as software that gets details like this right or wrong. In both cases you usually need an extensive catalog of observations or a full description of the original hardware to correctly emulate it functionally.

Okay, sure— that was sloppy. Obviously you're going to have bug fixes and unexpected side effects, behaviours, whatever which need to be tracked down in both cases.

But at least in the FPGA case you're directly simulating the behaviour of the discrete components such that their parallel, real-world interactions should tend toward accuracy. It should be similar to the step from HLE to LLE, where LLE is simpler, more accurate, and less hacky, but way less performant. LLE to FPGA would be a similar transition but without the performance penalty.

This was given as an example, not for you to straw man about the gamepad.

Don’t know what you’re getting at. You seem to have taken offense, and I think it’s because you misinterpreted what I wrote.

Your comment seemed to be positioned as a refutation of the prior comment (it could have been meant as an aside, but without something to indicate that, it can be hard not to assume it's a counter). Unfortunately, if it was meant as a counter to the prior argument, countering just the example put forth does not actually address the argument put forth, so it looks like a straw man argument itself (that is, addressing an example, even if it was presented by others, without making any attempt to address the position itself).

> ... countering just the example put forth does not actually address the argument put forth ...

The particular example put forth was a good example, and it's worth responding to. If you can come up with a better example, I'd love to hear it.

My argument is that the timing of inputs and outputs to consoles is heavily quantized, which gives you a lot more freedom if you want to create an accurate software implementation--depending on whether your goals are to emulate existing games or serve as a platform for experimentation. For example, the NES has lots of "timing tricks" but they are internal to the console, and at the scales it takes to render a frame of video, you can do a lot of work in software.
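Some back-of-the-envelope arithmetic for "a lot of work in software". The NES figures are the commonly cited NTSC values; the 3 GHz host clock is purely an assumption for illustration, and a cycle count says nothing about how many host cycles an accurate emulator really needs per emulated cycle.

```python
# Commonly cited NTSC NES figures (approximate).
NES_CPU_HZ = 1_789_773       # CPU clock, about 1.79 MHz
NES_FRAME_HZ = 60.0988       # video frame rate

# Assumption for illustration only: a single 3 GHz host core.
HOST_HZ = 3_000_000_000


def cycles_per_frame(clock_hz, frame_hz):
    """How many clock cycles fit into one video frame."""
    return clock_hz / frame_hz


frame_ms = 1000 / NES_FRAME_HZ
nes_cycles_per_frame = cycles_per_frame(NES_CPU_HZ, NES_FRAME_HZ)
host_cycles_per_frame = cycles_per_frame(HOST_HZ, NES_FRAME_HZ)
host_cycles_per_nes_cycle = HOST_HZ / NES_CPU_HZ

print(f"frame time:                {frame_ms:.2f} ms")
print(f"NES CPU cycles per frame:  {nes_cycles_per_frame:.1f}")
print(f"host cycles per frame:     {host_cycles_per_frame:,.0f}")
print(f"host cycles per NES cycle: {host_cycles_per_nes_cycle:.0f}")
```

So the budget is well over a thousand host cycles per emulated CPU cycle, before accounting for the PPU and APU, which also have to be stepped.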

From https://news.ycombinator.com/newsguidelines.html

> In Comments

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

> If you can come up with a better example, I'd love to hear it.

I don't have a side in this discussion. I just stepped in to note one reason why you might have gotten the reply you did, since you specifically noted "Don’t know what you’re getting at." I was merely attempting to illustrate why I thought you might have received the response you did.

> My argument is that the timing of inputs and outputs to consoles is heavily quantized...

That's probably a good argument. That's not what you replied with though. If you had, I don't think you would have gotten the response you did.

>> > Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

That's as much a point against your comment as against the person that responded to you. Responding to a specific example instead of the position is not responding to the strongest possible interpretation. Then you get responses like you did.

Again, I'm not taking a position on this. I only bothered to comment at all because you seemed to indicate you didn't know why you got the response you did. Honestly, I thought you were likely presenting it as a slight tangent, not an argument, since it's fairly common here for people to view any response as someone arguing the other side once they get slightly heated in an argument.

> I was merely attempting to illustrate why I thought you might have received the response you did.

Oh, that's completely unnecessary. When I said, "Don't know what you're getting at," I was hoping that the commenter understood something that I didn't.

> That's probably a good argument. That's not what you replied with though. If you had, I don't think you would have gotten the response you did.

This is a good illustration of why the guidelines are there in the first place. Comment #1 was the weaker argument, comment #2 was the stronger comment. If you have a discussion with someone, you can get from comment #1 to #2 and beyond.

Telling me what would have happened if I had written comment #2 first is... not helpful! Because the process of writing comment #2 involved reading replies to comment #1!

Sure, and for hardware this old the FPGA could re-implement the CPU (which ran at <4MHz) and the shift register at almost the exact frequency (or whatever frequency is necessary to reproduce the original behavior) of the original as necessary. Then you can run the original software without modification and get much closer to identical behavior and performance characteristics with much lower resource utilization. Granted, it won't be nanosecond perfect, but it will be a lot closer than you'll get on any current CPU.

I don’t think reproducing the exact behavior here is necessary, useful, or even interesting. You could use a USB gamepad and get behavior that is close enough that humans can’t tell the difference.

While I don't know the current status, with MAME in the mid 2000s, an update that added support for certain games usually involved changes... that broke a different set of previously working games. I remember having 3 MAME versions installed to play all the games that I wanted. That kind of emulation hell is greatly reduced when you simulate every bus and chip of the original hardware.

To give you an example: I don't remember which game it was, but it was a platformer, and when the character jumped, the game put the starting pitch of the jumping sound effect in the sound chipset, with the DSP doing the loss and pitch change on each cycle. The game moved the character according to certain value in some register of the DSP, so the game was working normally until you jumped, then the DSP emulation code did something like returning the original value or 0 (because 99.9% of the games just write there), and the game crashed.

That's because MAME is distributed as a single executable which combines support for all the machines it emulates. If there was a mega 'multiple arcade machine FPGA core' that supported thousands of machines in one then you would need to do the same kind of installation management as different parts of that codebase matured. But when using FPGAs you normally have a separate core for every machine being simulated. By the emulation analogy, you're actually switching between many different 'MAME versions', or single-game emulators, when you switch between games on your FPGA.

For me, as an occasional hobby developer for old platforms, it's interesting. Subtle differences in how emulators work compared to the original hardware have sometimes resulted in nasty surprises.

In the best world, I could use an emulator as a reference, but for now I have to keep the old hardware around for that reason and always make sure to test my code on it regularly. Even differences that have no impact on the entire software library are important in that sense.

For example, if I write an address to an OPL chip I normally have to wait a few cycles for the address change to be effected before I write a value. You can remove that limitation in an emulator and all software written for the original platform will still work, but the emulator will no longer be useful as a reference, because now I can write software that works in the emulator but not on the real system.

There's of course a use case for emulators that take these kinds of shortcuts for the purpose of running an existing software library, and that's how people were able to play SNES games on Pentium class hardware, but as a point of reference and complete functional preservation of the hardware platform itself, that isn't good enough.

That's exactly what's happening. There are loads of projects going on right now decapping old chips and reverse engineering them. From old CPUs like the 6502 to the Amiga Alice chip. It's just a matter of time before most of the retro systems are fully reverse engineered and documented.

That's generally not what's happening with the MiSTer cores. I haven't looked at all of them, but the ones I have studied are very high level in implementation.

To the best of my knowledge, the Neo Geo core has certain chips based on a full decap and implementation in FPGA. Not the entire system though.

Neat. I'll have to look at that one.

FPGA cores are based on CPU emulators, so they aren't more accurate than the emulator, and decapping a chip doesn't help. FPGAs are written in languages like Verilog that are still high-level and do some probably misguided tricks to make it look like you're programming them with a vaguely C syntax.

No, they are not generally based on CPU emulators. FPGA emulators are more accurate than software based emulators, in part because they don't have to do gymnastics to awkwardly emulate things that are easily done in hardware. For example, the documentation for Amiga's custom chips specifies which cycles various DMA activity takes place. Replicating that in hardware is quite easy (just a few comparators wired up to a counter), but it's hugely expensive in software which leads to all kinds of nasty hacks to replicate behaviour games rely on.

It's similar with CPUs. CPU manuals from the 1970s and 1980s often specify the exact cycle counts for instructions while having simple bus interfaces and no caches. Coming up with cycle accurate hardware for these CPUs is not all that hard.

The Amiga is the last machine I ever had for which I was able to get complete specifications and schematics and understand completely. Once the 386/486 came along, documentation of SMM mode and other internal details became restricted. Modern computers have so much hidden firmware that understanding the entire system is virtually impossible. In contrast, vintage 8/16 (and some 32) bit machines can be completely understood by a single determined individual.

> FPGAs are written in languages like Verilog that are still high-level and do some probably misguided tricks to make it look like you're programming them with a vaguely C syntax.

This is either incredibly badly explained or wrong. Verilog compiles to gate level.

The difference between Verilog and gates is larger than the difference between C and assembly?

Also, I think most decapping is done to read data tables out of ROMs, not to figure out what all the gates in a chip do.

> FPGA cores are based on CPU emulators

FPGA cores are literal CPUs.

Meant to say "FPGA emulators" there, the last emulator project I worked on called them cores.

Unless you have seen MiSTer running (especially on a fixed-Hz old-school CRT monitor) vs. running that same thing emulated on a modern cycle-exact emulator. The difference is really noticeable.

This might come down to the emulator also running on a modern OS, which cannot guarantee at all times smooth framerates, whereas a dedicated FPGA can promise you that it's not running much more than the actual core needed to emulate that system.

Especially on a 50/60Hz CRT monitor, the latency from the controller to pixels on screen is noticeably lower, and things like screen scrolling and sprites are buttery smooth, just like on the original hardware.

Actually decapping the original chips is very much a thing. See for example Chris Smith's work mapping out the innards of the ZX Spectrum ULA - http://www.zxdesign.info/book/insideULA.shtml .

It is, but these cores are almost exclusively not being done that way. Not yet at least. I hope that they will be, that would be really awesome. I paid $1200 last year for the SNES PPUs to be decapped for this purpose, but it's a truly enormous undertaking to map out those chips and then recreate it in Verilog. You're talking thousands of hours of work per chip. If anyone reading this is able to help with that effort, please do let me know, we could really use the help.

Not that this is necessarily helpful to you in the short term, but it strikes me as a good problem for machine learning (going from die pictures to transistor schematic.)

By decapped, do you mean delidded?

Theoretically it would be possible to automate this with a couple things:

- USB electron microscope to image the transistor topology

- CV lib to identify connections and generate corresponding Verilog code

“Decapping” is a more intense version of delidding where you use chemical agents or something similarly extreme (laser, plasma, milling) to remove the package (ceramic, plastic).

My understanding is that there are people who do it often enough that it is automated in the way you describe, but you still need someone with a lot of skill to spend serious time on it. Computer vision works wonders but there are errors which must be identified and fixed.

A lot of the chips people care about can just be done optically, no electron microscope needed.

Ah, that’s a good distinction. I’d be pretty scared of damaging the hardware by doing that, but I’m sure there are some really experienced folks out there that would appreciate the hardware donation.

Decapping destroys the hardware.

I think the big differentiator is that it is easier to get predictable latencies with an FPGA, where you control almost everything, than with a general-purpose PC, which is not really that well optimized for hard real-time operation. So I believe "race the beam" style things are more easily accomplished with FPGAs, as is tight audio-video sync. Although the PC emulation scene has been doing some fairly incredible things too.

you can find it here https://archive.is/fWosI

Quite a few of the FPGA soft cores related to 8 bit gaming are reverse engineered from either schematics, decapped chips, or both. Or they take pains to at least use the same number of cycles for each instruction, access SRAM or DRAM in the same way, etc.

That's true, but I think it's also true that you could trim a bit more lag if you do it well.

I like to emphasise that this is about much more than gaming. This is hardware emulation of old computers. I would love to see it emulating other machines - if it can do an Atari ST, it'd probably be able to do a Xerox Star or an Apollo Domain.

One thing I noticed from owning a C64-Maxi is how much of the experience is the physical aspect - We interacted with these machines through keyboards of different designs and that was a very important part of the overall experience.

Running VALDOCS on an Epson QX-10 without a VALDOCS keyboard is not quite the right feel.

Nice thing about fpgas is that at least in theory it should be relatively easy to interface pretty much anything to them, so you could grab your favourite retro keyboard and connect it to MiSTer.

I wish we had easier/cheaper ways to build keyboards like that which could work with the emulation. A Symbolics without circle, square and triangle keys is not a Symbolics.

I regret not making the title “MiSTer, an open-source FPGA computing and gaming project,” maybe @dang can help.

I love my MiSTer build!

The USB controllers made for the recent mini systems (NES, SNES, Genesis, etc.) make great accessories for the MiSTer. Add a couple of USB arcade sticks and you can really play almost any classic retro game as it was meant to be played.

And then there are all the classic computer cores even including the PDP-1!

Yup, same! Can wholeheartedly recommend it for those who want something between emulation and real hardware.

The Amiga core is fun, AGA, 2 MB chip, 384 MB fast. It supports hard disk images, so you can do a hard disk based Workbench installation and load games and demos practically instantly (and safely exit to Workbench) using WHDLoad.

Arcade cores are fun as well. Just like in childhood, but less hungry for quarters. :) Recently played with arcade Gauntlet core a bit for example.

This whole setup is a game changer for me, I got a complete WHDLoad setup done in under 5 minutes after a bit of sleuthing on Internet Archive. Now I just need to get the proper cables to connect my Commodore 1084s.

Well, indeed, it literally is a game changer.

Thanks for mentioning the PDP-1 among other amazing cores! :)

One really great way to think of MiSTer is as a living documentation project that results in documented and working system hardware vs MAME which is focused on documenting working system software. It just so happens if you have working hardware you can run the software meant for it (MiSTer) and from the other end if you want software to work you need a pretty good idea what the hardware is supposed to be doing.

They're basically similar efforts but approaching the problem from a different point of view.

Given how hard FPGA programming is, the work in MiSTer is quite magnificent.

This isn't really new, right? I heard of this years ago.

But it is an amazing project. Instead of emulating, they actually rebuilt the old custom ICs (which 8-bit computers were full of) in an FPGA. Really impressive.

It is indeed an amazing project, especially its open source nature. It provides some impressive power savings and latency reductions that are very hard to match with general purpose CPUs.

But in most cases, it is emulation, as the lead developer will attest.


"From my point of view, if the FPGA code is based on the circuitry of real hardware (along with the usual tweaks for FPGA compatibility), then it should be called replication. Anything else is emulation, since it uses different kinds of approximation to meet the same objectives. Currently, it's hard to find a core that can truly be called a replica – most cores are based on more-or-less functional recreations rather than true circuit recreation. The most widely used CPU cores – the Z80 (T80) and MC68000 (TG68K) – are pure functional emulations, not replications. So it's okay to call FPGA cores emulators, unless they are proven to be replicas."

But there's nothing wrong with emulation for preservation, until we get to a point where we can wide-scale clone these older chips down to the transistor level through analysis of delayered decap scans. And even then, emulation will be useful for artificial enhancements as well as for understanding how all those transistors actually worked at a higher level.

It's also not a total solution: by taking many more transistors to programmatically simulate just one, it limits the maximum scale and frequency of what it can support. N64/PS1/Saturn has not yet been fully supported and is still theoretical, but likely, to be possible. Going beyond that is not possible at this time.

Software emulation and FPGA devices should be seen as complementary approaches, rather than competitive. The developers of each often work together, and new knowledge is mutually beneficial.

Ah ok I wasn't aware of this. I thought it was spot on.

And yeah I hope we can easily order small batches of ICs (at big pitch of course) in a few years, in a similar way to how creating PCBs has become so simple now.

I mean I remember how much of a PITA it was in the 80s. Drawing on overhead sheets. All the acids and other chemicals. Drilling. And now we get super-accurate 10x10cm boards dual-layer, drilled, soldermasked and silkscreened for a buck a pop with a minimum of 10. Wow. I really hope this trend continues down to the scale of ICs (or that FPGAs simply get better/easier).

By the way, emulating a CPU is pretty easy and very accurate anyway. The big problem with accurate emulation is with some of the peripheral ICs which used hard to emulate stuff like analog sound generators.

> It's also not a total solution: by taking many more transistors to programmatically simulate just one, it limits the maximum scale and frequency of what it can support. N64/PS1/Saturn has not yet been fully supported and is still theoretical, but likely, to be possible. Going beyond that is not possible at this time.

The limiting factor here is the amount of stuff you can throw into a single FPGA, correct?

So in theory, shouldn't it be possible to tie a bunch of FPGAs together, with two beefy ones being responsible for replicating CPU / GPU functionality, a couple smaller ones for sound and other "helper" processors, and some bog-standard ARM SoC to provide the bitstreams to the FPGAs and emulate storage (game cartridges, save cards) and input elements (mainly "modern" controllers)?

There's both a cost and a speed barrier to it. FPGAs are often used to design, simulate, and test modern circuits at sub-realtime speeds. No amount of FPGAs will get you a PS2 emulator at playable speeds right now, let alone a PS3/Switch emulator. PCs can do that today by taking shortcuts such as dynamic recompilation and idle loop skipping.

Hmm... looking at the frequencies and gate counts, I think PS2 is well within realm of possibility to run on a not-so-cheap FPGA (or several). But PS3 generation consoles definitely not.

> The limiting factor here is the amount of stuff you can throw into a single FPGA, correct?

And the speed that you can get your design to run at. Something like the GameCube (PPC750 @ 485 MHz) would be difficult to implement in an FPGA, for example.

Well, yeah, it's not replication if it's not an exact hardware replica, but the word "emulation" has very "software" connotations. I guess let's call it.. recreation? (That word is even in the quote above!)

"FPGA re-implementation" may be a better term

So it's not perfect but it's better than emulators...

In latency and power usage, yes. In compatibility and accuracy, no. Both are Turing complete, so there's nothing you can do with one that you can't do with the other.

If you take the SNES core, my software emulator has 100% compatibility and no known bugs, and synchronizes all components at the raw clock cycle level. It also mitigates most of the latency concern through a technique known as run-ahead. But it does require more power to do this.

I'm really curious where you got "better" out of the quoted text. Because it's not there or implied, but people keep reading this into anything about FPGA recreations of chips. There's nothing inherently better about doing emulation on an FPGA or a CPU, other than basically the amount of electricity involved in doing it.

But people keep presuming an improved accuracy that there's no basis for.

Lower latency is definitely a thing. With FPGA it's possible to 'chase the beam' like the original hardware, and have much reduced input latency from devices, etc. With an emulator you're going to be fighting the OS and the frameworks you built on top of. Even if you go "bare metal" (like my friend's BMC64 project which runs a C64 emulator like a unikernel on the RPi with no OS) you are still dealing with hardware built for usage patterns very different from the classic systems. You're always going to be one or more frames behind.

That is true. There are however techniques software emulators can use like run-ahead that can get you lower latency than even the original hardware on a PC: https://near.sh/articles/input/run-ahead

The caveat is that it doesn't always work, and it makes the power requirements even more unbalanced. Some might also see it as a form of cheating to go below the original game's latency. If you want to match the original game's latency precisely, FPGAs are the way to go right now for sure.
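For anyone who hasn't seen it, the general shape of run-ahead can be sketched with a toy core. Everything here is hypothetical (`ToyCore`, its one frame of built-in input lag, the function names); it illustrates the idea only and is not how any particular emulator implements it.

```python
import copy


class ToyCore:
    """Hypothetical core whose game logic has one frame of input lag."""

    def __init__(self):
        self.x = 0          # player position; this is what gets drawn
        self.pending = 0    # input sampled last frame, applied this frame

    def run_frame(self, pad):
        self.x += self.pending    # apply the input sampled one frame ago
        self.pending = pad        # sample this frame's input
        return self.x             # stand-in for the frame's video output

    def save_state(self):
        return copy.deepcopy(self.__dict__)

    def load_state(self, state):
        self.__dict__.update(copy.deepcopy(state))


def run_frame_with_runahead(core, pad, frames_ahead=1):
    """Advance one real frame, but show the frame `frames_ahead` later."""
    shown = core.run_frame(pad)        # real frame; shown only if frames_ahead is 0
    state = core.save_state()
    for _ in range(frames_ahead):
        shown = core.run_frame(pad)    # speculation: assume the input is held
    core.load_state(state)             # rewind so speculation never becomes real
    return shown


inputs = [1, 0, 0]
plain, ahead = ToyCore(), ToyCore()
print([plain.run_frame(p) for p in inputs])                # press shows up a frame late
print([run_frame_with_runahead(ahead, p) for p in inputs]) # press shows up immediately
```

In this toy, every displayed frame costs `1 + frames_ahead` emulated frames plus a save and a load, which is where the extra power goes. It also only helps when the game really has at least `frames_ahead` frames of internal lag.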

Run-ahead seems pretty cool, great technical write up. How would you compare this to the feature called frame-skipping that I often see implemented in software emulators?

Frame-skipping is just a speed hack of skipping rendering every other frame or so, and makes games very unenjoyable to play. It won't help with input lag at all.
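A minimal sketch of why frame-skipping doesn't touch input lag, using a made-up toy game loop (the function and its state are hypothetical): the game logic still runs every frame, only the drawing is dropped.

```python
def run_with_frameskip(inputs, skip=1):
    """Emulate every frame, but draw only one frame out of every (skip + 1)."""
    x = 0         # toy game state: a position driven by the pad
    drawn = []    # (frame number, position) pairs that were actually rendered
    for frame, pad in enumerate(inputs):
        x += pad                        # game logic still runs on every frame
        if frame % (skip + 1) == 0:
            drawn.append((frame, x))    # the costly rendering step, done less often
    return drawn


inputs = [0, 1, 0, 0]
print(run_with_frameskip(inputs, skip=0))   # every frame drawn
print(run_with_frameskip(inputs, skip=1))   # every other frame drawn
```

With `skip=1` the press on frame 1 is first visible on frame 2, so the result is never seen earlier than without skipping, and sometimes later.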

Agreed about chasing the beam. With a SNAC addon and a CRT TV, you can even hook up original light guns to the MiSTer and they work perfectly.

Probably the marketing copy for Super NT and similar products... harder to get people to part with hundreds of dollars if your pitch is "lower power draw and reduced input delay"

Yeah, it really is an amazing application for FPGAs—preserving computing and gaming history. The list of cores available for MiSTer is simply staggering:

> Computers - Classic

• Acorn Archimedes • Acorn Atom • Alice MC10 • Altair 8800 • Amiga • Amstrad CPC 6128 • Amstrad PCW • ao486 (PC 486) • Apogee • Apple I • Apple II+ • Apple Macintosh Plus • Aquarius • Atari 800XL • Atari ST/STe • BBC Micro B,Master • BK0011M • Color Computer 2, Dragon 32 • Commodore 16, Plus/4 • Commodore 64, Ultimax • Commodore PET • Commodore VIC-20 • DEC PDP-1 • EDSAC • Galaksija • Jupiter Ace • Laser 310 • MSX • MultiComp • Orao • Oric 1 & Atmos • SAM Coupe • Sharp MZ Series • Sinclair QL • Specialist/MX • TI-99/4A • TRS-80 Model 1 • TSConf • Vector 06C • X68000 • ZX Spectrum • ZX Spectrum Next • ZX81

> Consoles - Classic

• Astrocade • Atari 2600 • Atari 5200 • Atari Lynx • AY-3-8500 • ColecoVision, SG-1000 • Gameboy, Gameboy Color • Gameboy Advance • Genesis/Megadrive • SMS, Game Gear • MegaCD • NeoGeo • NES • Odyssey2 • SNES • TurboGrafx 16 / PC Engine • Vectrex

> Other Systems

• Arduboy • Chess • CHIP-8 • Epoch Galaxy II • Flappy Bird • Game of Life • TomyTronic Scramble

An interesting project would be to dig out some old cassettes from, let's say, a Commodore 64, try to load them into a present-day computer (by patching wires/cables?), and see if they run on this system. I remember writing, for example, a mining program to calculate overburden, volume, and tonnage at different slopes, for different rock types, etc. The science behind the calculations is still valid, but we could likely improve load times and calculation times.

I'm still waiting for the KENBAK-1 core.

Is there good documentation or ICDs out there that adequately describe the architecture? Looks like only 50 were ever made, and only 14 are believed to exist today.


Seems pretty well documented. Considering the simplicity of the computer, it feels like it would be a relatively easy project to get to MiST.

Old projects get reshared many times. It's always new to someone.

I'd really like to see a portable/handheld leverage this technology for on-the-go gaming.

The Analogue Pocket[1] is exactly this (albeit proprietary). Out of the box it recreates GB, GBC, and GBA using the Altera Cyclone-V platform.

[1] https://www.analogue.co/pocket

I wonder, why is there no DIY Analogue Pocket-style MiSTer project? Is the DE10-Nano too large or inefficient for this?

The DE10-Nano itself is a bit large for a handheld device, and hasn't been optimized for power consumption. (It's designed as a development board, not as a component of a finished product.) There's nothing stopping someone from using the Cyclone-V SoC in a handheld device, though.

I’d reckon it’s the same reason that there isn’t much of a custom laptop scene. The open ended nature of stuffing a screen, battery, and input peripherals into a chassis seems an order of magnitude more difficult than just making a headless box to plug into your TV.

But with some effort, it would be awesome.

Physical design is also a lot more important. Getting "feel" just right is very hard and expensive, especially when it comes to game controllers.

I think it is because the motivation for retro console FPGA simulation is to play competitive action games sitting down, with highly accurate timings: the “there is a 5-frame window after this triple input sequence” kind of thing. Pocket gaming happens with much more relaxed timings, so there is less demand for low-latency, cycle-accurate simulation.

The limited market is probably covered by the Odroid Go / GPD XD / RG350M etc. MiSTer leverages an off-the-shelf FPGA board that would require a lot more work in a handheld form.

It was being worked on at one point. I forget who was doing it. They showed off mockups on SmokeMonster's streams, but I haven't seen much of it in over a year.

I built the Gameslab around this concept, but haven’t worked on it much lately.


Related thread from 2018:

MiSTer: Run Amiga, SNES, NES and Genesis on an FPGA - https://news.ycombinator.com/item?id=18721594 - Dec 2018 (30 comments)

My first thoughts when reading the headline were: how to integrate FPGA power into making new games.

Which it is not, but still very awesome.

Anyway - if one would try to actually use FPGAs for developing a new game - would there be any benefits?

I mean, programming it will probably be the main hurdle.

But if one were good at it, and ignoring for a moment that the typical gamer does not have an FPGA around, would there actually be benefits when programming a new game?

I thought about it in combination with a normal gaming computer. Using the FPGA for complex simulations like physics? Would that make any sense?

> Using the FPGA for complex simulations like physics? Would that make any sense?

The graphics card is specialised for this and has better cost/performance. Graphics cards basically are specialised gaming hardware optimised as much as possible for that use case.

Trying to teach games programmers to use FPGA tooling would be a spectacular disaster. Oh, and FPGA toolchain-targeting is to a specific device; you probably don't remember the early days when games would target a specific model of 3DFX card, but that's what you'd go back to. Upgrade your hardware and you need recompiled versions of all your games.

The one case where it might make sense is if you focus on latency to a brutal extent. DisplayPort on one side of the FPGA, wired game controller (NOT USB) on the other. A unique experience that's hard for normal systems to replicate.

I have never really done low-level programming, so excuse my ignorance. (For example, I was not aware that "FPGA toolchain-targeting is to a specific device", which is a very big hurdle, maybe only allowing something like this in game consoles.)

I know that GPUs are mainly optimized for games, but aren't they still general-purpose computing units? (Which is why Nvidia put a special physics unit into their GPUs.)

So the idea, as I understand it, is that with FPGAs your logic is directly in circuitry, which in theory beats general-purpose units.

So my actual question would have been whether this would allow for more complex simulations. Most advancement in games just seems to be in making them look prettier, not more complex, and in faking complexity. But I really like the idea of an advanced voxel engine, for example.

But from your input it seems that FPGAs, as of today, are probably not the way to get practical improvements there, and that it is better to focus on the GPU.

You can't just ignore the fact that the average gamer doesn't have an FPGA. For such a game you would have to ship the FPGA with the game, making it prohibitively expensive.

Well, you could ignore it, if you would be developing a new gaming console with an integrated FPGA unit.

It's just never worked out to justify itself to the stakeholders who would approve such a thing. The standard gaming hardware configuration has always tended towards a little bit of general purpose computing capability plus a very focused specialization in graphics. And the standard incentive structure of game production tends towards a rapid deployment of assets authored in general-purpose content creation software, not a game driven by new algorithms. These things drive away from the experimental places where FPGAs could shine - unique ways of creating audio and video signals, customized inner loop optimizations, etc. The first thing everyone will be asking is "OK, but how do I make it portable?"

Sure, you’re completely right, but as a hobby project it would be pretty neat!

MiSTer is an amazing phenomenon. The MiSTer itself is just an Intel FPGA devkit, which many believe to be sold at a loss (because it's a training tool and not Intel's main source of FPGA revenue). The amazing thing is the aftermarket for addons. There are many possible combinations of addon boards that add RAM with deterministic latency, USB hubs, cooling fans, cases, retro controller ports, etc. All custom-made for this ecosystem.

It is definitely being sold at a loss, the Cyclone V SOC being used costs more than the entire development board.[0] I wonder if Intel will ever take notice due to MiSTer's growing popularity and quit subsidizing the board.

[0] https://www.digikey.com/en/products/detail/intel/5CSEBA6U23I...

Edit: it was erroneous of me to state the board was being sold at a loss; rather, I meant that the board was definitely being subsidized by companies such as Intel and their partners such as Panasonic. My mistake. I also didn't mean to convey that the consumer Digikey pricing was the same as the pricing for large volume manufacturers such as Terasic. Rather, I meant to demonstrate and agree with the OP on the astounding situation that MiSTer currently exists in, owing to the lack of economic viability for someone to produce a low volume commercial FPGA emulation machine for a niche audience without any subsidization.

There is absolutely no way they’re sold at a loss. Your DigiKey price of $245 proves this, because a factor of 10 is a good starting point as a ratio between volume and one-off DigiKey pricing of any type of complex silicon.

A better way to approach this is as follows: what’s the die size of an FPGA like this? What’s the production cost of the die? Then check the historic gross margin percentage of FPGA companies. Xilinx is around 68%, and that includes high-end products which carry the highest markups, unlike this cookie cutter thing.

That should give you a good ballpark number.

DigiKey charges what they do because nobody else is willing to sell these things in low volume, and they have very high inventory costs.

Digikey pricing is not indicative of actual volume pricing, especially for FPGAs, where parts are often many times overpriced when bought from distributors. I doubt the board is sold at a loss; it's probably sold at a small profit, not that it's really significant for a low volume dev board.

The price of a DE10-Nano is $135 ($115 for academic use.)

Anyone who thinks that Terasic sells these at a loss doesn’t have a clue about volume pricing of FPGAs. And as a special Intel partner, there’s little doubt that Terasic has access to this kind of pricing.
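To put rough numbers on the argument in this subthread, here is a back-of-the-envelope sketch in Python. It only uses figures quoted above ($245 DigiKey price, $135 board price, 68% gross margin) plus the factor-of-10 rule of thumb, which is an assumption rather than a known Terasic or Intel price. The 68% margin is a company-wide figure, so the implied production cost is a ballpark at best.

```python
# Figures quoted in the thread; the 10x ratio is a rule of thumb, not real pricing.
DIGIKEY_UNIT_PRICE = 245.00  # one-off DigiKey price of the Cyclone V SoC (USD)
VOLUME_RATIO = 10            # assumed ratio of one-off price to volume price
BOARD_PRICE = 135.00         # DE10-Nano retail price (USD)
GROSS_MARGIN = 0.68          # company-wide gross margin cited for Xilinx


def estimated_volume_price(one_off_price: float, ratio: float) -> float:
    """Estimate the per-chip volume price from a distributor one-off price."""
    return one_off_price / ratio


def implied_production_cost(selling_price: float, gross_margin: float) -> float:
    """Back out cost from price, using margin = (price - cost) / price."""
    return selling_price * (1.0 - gross_margin)


chip_price = estimated_volume_price(DIGIKEY_UNIT_PRICE, VOLUME_RATIO)
chip_cost = implied_production_cost(chip_price, GROSS_MARGIN)
left_for_rest_of_board = BOARD_PRICE - chip_price

print(f"estimated volume chip price:    ${chip_price:.2f}")
print(f"implied chip production cost:   ${chip_cost:.2f}")
print(f"left for the rest of the board: ${left_for_rest_of_board:.2f}")
```

Under these assumptions the chip would be a small fraction of the board price, which is consistent with the board not being sold at a loss. The conclusion is only as good as the assumed ratio.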

I just got my first FPGA and I have no idea where to start. Any recommendations so I'm not overwhelmed?

By any chance, is the AMD acquisition of Xilinx also targeted towards the emulation market for their gaming processors? In the future, would they have the emulated bitstream present for different iterations of the gaming consoles?

> By any chance, is the AMD acquisition of Xilinx also targeted towards the emulation market for their gaming processors?

Not even a little bit. Retro gaming emulation accounts for basically zero revenue compared to other markets that Xilinx serves, like HPC, networking, and computer vision.

What about backward compatibility of games in consoles?

My buddy recently did a really nice video overview of the MiSTer project: https://www.youtube.com/watch?v=-IP0k3GatHE


I have a couple of FPGA boards on their way to me in the mail which I intend to use for some homebrew video game projects. Besides getting the development environment working, it can be tricky outputting video from an FPGA because of the precise timing involved. I will have to look through these resources to see if there are any good tricks to use here.

It can be tricky outputting video from FPGAs? Quite the opposite, this is where they shine.
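For anyone wondering what the "precise timing" amounts to, here is a minimal Python model of the two counters a video core typically runs. It uses the standard 640x480 @ 60 Hz VGA timing; the function names are made up for illustration, and a real design would be written in an HDL and clocked by a PLL, not by software.

```python
# Standard 640x480 @ 60 Hz VGA timing, in pixel clocks (horizontal) and lines (vertical).
PIXEL_CLOCK_HZ = 25_175_000
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # pixel clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # lines per frame


def signals(x: int, y: int) -> tuple[bool, bool, bool]:
    """Return (hsync_active, vsync_active, visible) for counter position (x, y)."""
    hsync = H_VISIBLE + H_FRONT <= x < H_VISIBLE + H_FRONT + H_SYNC
    vsync = V_VISIBLE + V_FRONT <= y < V_VISIBLE + V_FRONT + V_SYNC
    visible = x < H_VISIBLE and y < V_VISIBLE
    return hsync, vsync, visible


def step(x: int, y: int) -> tuple[int, int]:
    """Advance both counters by one pixel clock, wrapping at line and frame end."""
    x += 1
    if x == H_TOTAL:
        x = 0
        y = (y + 1) % V_TOTAL
    return x, y


refresh_hz = PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)
print(f"{H_TOTAL} x {V_TOTAL} clocks per frame, refresh = {refresh_hz:.2f} Hz")
```

In hardware, both counters and the comparisons run on every pixel clock with no jitter, which is why this kind of fixed-timing output maps so naturally onto an FPGA.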
