Indeed. I took a computer engineering class in undergrad. The capstone project was implementing from scratch a multi-pipeline RISC CPU (including the ALU and a very basic L1 cache) in Verilog, which we flashed to FPGAs and then programmed to play checkers using hand-compiled C. The FPGA was easier to flash and debug than the Spartan-6 mentioned in TFA, but was significantly more expensive as well.
It was a brutal class, but it totally demystified computers and the whole industry in a way that made me feel like I really could understand the "whole" stack. Nothing scares me any more, and I no longer fear "magic" in software or hardware.
That was a difficult yet extremely rewarding class. My wife, then girlfriend, still remembers that semester because she barely saw me.
IIRC, bonus points were given to the team with the highest clock speed. I didn't win, but I seem to remember mine being somewhere in the 18MHz range and the winner in the low to mid 20s.
You sound exactly like my professor for computer architecture. “Computers are not magic!” He mentioned this at least once during every lecture, and my experience was similar to yours.
 Profiles of the Future, Arthur C. Clarke
The atoms below that are relatively straightforward, and the fact that they're made up of smaller building blocks is fine, I guess; it's increasingly irrelevant to how a computer works.
And going up, everything starts to get very non-magical as you turn response curves into binary signals and then string gates together.
Btw, I read your word "below" as "above".
(And as for the difficulty of making walls too thin lest electrons leak through: you don't need to invoke tunneling to explain that.)
Also, I never understood things like slew rates, noise, or gain bandwidth. Too high up the stack.
I started with GW-BASIC, then QBasic, and moved up to "Visual Basic for DOS" (ncurses-style UI stuff, similar to Turbo Pascal IIRC; I don't think many people even knew it existed, since Win 3.11 was already big, but I loathed it and only switched to it because of... Trumpet Winsock for the internet!). But that did not stop me from playing with driving serial (COM) ports or parallel (LPT) ports with printer escape sequences. Not really FPGA-level low (not even close), but DOS-based stuff was really easy to start hacking up!
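For anyone who never got to play with this: driving the parallel port under DOS was just a handful of port writes. A minimal sketch in Turbo C style (the 0x378 base address is the usual LPT1, but yours may differ; outportb/inportb come from dos.h):

    #include <dos.h>   /* outportb()/inportb() in Turbo C / Borland C */

    #define LPT_DATA    0x378  /* data latch                         */
    #define LPT_STATUS  0x379  /* bit 7 = BUSY, inverted in hardware */
    #define LPT_CONTROL 0x37A  /* bit 0 = STROBE                     */

    void lpt_putc(unsigned char c)
    {
        while (!(inportb(LPT_STATUS) & 0x80))   /* wait until printer not busy */
            ;
        outportb(LPT_DATA, c);                  /* put the byte on the data lines */
        outportb(LPT_CONTROL, inportb(LPT_CONTROL) | 0x01);   /* pulse STROBE... */
        outportb(LPT_CONTROL, inportb(LPT_CONTROL) & ~0x01);  /* ...and release  */
    }

    /* e.g. an Epson ESC/P "bold on" escape sequence: */
    void bold_on(void) { lpt_putc(27); lpt_putc('E'); }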
One day I was working at a different branch, filling in for someone on sick leave, and I spent an hour trying to get a PC I had just built to boot reliably. Every time, it crashed after a minute or so. I didn't get it; everything seemed fine.
Eventually it turned out the "cheat sheet" they had of the jumpers was upside down over there. Someone had copy/pasted the pictures to match the orientation of how they usually had the PC on the desk, rather than upside-down as I was used to. But the text was the right way up. So the whole thing looked the same as in our branch, except it wasn't.
It was a square block, so I hadn't noticed the orientation was different. Turned out I had the 100 MHz Pentium configured for 180 MHz. Oops. That wasn't even an officially supported speed of the motherboard, but the BIOS messages indicated it (which I only noticed afterwards).
As we didn't want to sell this CPU after the torture I had put it through, we decided to use it for a display box instead, and we tried to keep it running as long as possible by using compressed-air cans upside down to blow dry ice on it :D It actually ran reliably until the can ran out. Only later did I find out that liquid-nitrogen extreme overclocking was actually a thing :D
So during a long download, I didn't need all 40 MHz screaming along (and heating up the chip to the point that it needed a cooling fan -- a COOLING FAN, can you imagine a CPU running so fast it couldn't cool itself on ambient air?), so I decided to see if the clock generator jumpers were hot-pluggable.
Lo and behold, they were! I could reach in and seamlessly downclock the CPU to 8MHz (which was just one jumper-cap different than the 40MHz setting), which was still plenty to service the UART FIFO interrupt. Unplug the CPU fan too, which made the machine silent. Turn the monitor off, kick back in my chair, and take a catnap. The Telemate terminal software would play a little tune when a download finished, which would wake me up, I'd turn the monitor back on, open a DOS prompt, start unzipping the file, and then reach in and clock the CPU back up so the pkunzip process would finish in a timely manner.
It would do 50MHz but the upper half of RAM would disappear, so there weren't a lot of workloads appropriate for that configuration....
This feels super nitpicky, but I'm curious about your setup, and whether you're remembering the clock speed wrong or the fan was actually completely extraneous: in fact, neither of the common 40 MHz 486 parts, the Cyrix Cx486DX40 or the AMD Am486DX40, required a fan.
The first 486-class CPU that pretty much always ran with active cooling was the DX4/100. Even the DX2/66 could run fanless if you had half decent airflow.
But consensus among everyone _but_ the manufacturers was that additional cooling couldn't hurt. (A representative opinion can be found in Upgrading And Repairing PCs, whatever edition was current at the time.) Running right at the top of Tcasemax wasn't good for longevity in terms of electromigration within the chip itself, nor for the capacitors and other components in the neighborhood. Thermal goop wasn't commonplace yet, but the little heatsinks and fans sold like hotcakes (har!) at the local computer shows. Plain aluminum heatsink, clear (polystyrene?) fan, with a holographic "CRYSTAL COOLER" sticker on top. I still see the fans around, but without the shiny sticker.
The Am486DX-40 was my favorite chip. With a VLB video card (Trident 9400CXi) that worked well on the 40MHz bus, its pure pixel-pushing power ran rings around 33MHz-bus systems regardless of their core clock, and that included the P-75. I later got the impression that I lucked out with that Trident card, as almost everyone else with a 40 or 50MHz VLB machine had tales of woe and flakiness.
If you're not already familiar with it, you'll likely enjoy this trip down memory lane: https://redhill.net.au/ig.html
Yes, running a 40 or 50 MHz bus made a huge difference, especially if you could get VLB graphics running reliably on it. I'm into collecting and tinkering with 486-era machines for nostalgia's sake, and often I see things like DX2 or DX4 systems with plain cheapo 16-bit ISA graphics cards and think: such wasted potential...
Luckily I did not fry it, and after adding a cooler it worked just fine.
I don't remember if AMD or Cyrix CPUs were worse though.
The speed on my Apple FastChip is adjustable in real time too. It's neat to just dial a speed appropriate for the application at hand.
for other interested parties:
If you are interested in programming your Apple in assembly, you can ask nicely for your FastChip to include a 65816 processor. It's going to act like a 65802, due to hardware limitations, but otherwise, yeah, you get the 16-bit instructions to use.
I've not had any compatibility trouble with mine, which is a 65816.
It was fun to turn it on for games that used timing loops for frame rendering, to make the games twice as fast :)
They originated with "Turbo XT" class machines which ran an 8088 but at 8, 10 or 12 MHz -- faster than a real IBM PC/XT. Turbo on meant a faster machine, and turbo off meant 4.77 MHz -- fully compatible with timing-sensitive PC software.
Later, in the 386/486 whitebox PC era, some machines had the buttons wired wrong and now it's a meme that turbo made the computer go slower, but that was never true for systems built correctly.
The game was intended for a stock 7 MHz machine, pre-1990. And it was perfectly timed to that speed.
Sidebar: Ultima games are great. Did you see Nox Archaist?
I loved modifying CONFIG.SYS with a hex editor to translate the MS-DOS 6.2 (?) boot menu.
> With the introduction of CPUs which ran faster than the original 4.77 MHz Intel 8088 used in the IBM Personal Computer, programs which relied on the CPU's frequency for timing were executing faster than intended. Games in particular were often rendered unplayable. To provide some compatibility, the "turbo" button was added. Engaging turbo mode slows the system down to a state compatible with original 8086/8088 chips.
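For anyone wondering what that speed-dependent code actually looked like: often it was nothing more than a busy-wait loop whose count had been tuned by hand on a 4.77 MHz machine. A rough sketch (the constant is made up; real games calibrated it by eye):

    /* Delay "one frame" by spinning the CPU. The loop count was tuned
       on a 4.77 MHz 8088; run the same loop on a faster CPU and it
       finishes several times sooner, so the whole game speeds up. */
    void frame_delay(void)
    {
        volatile unsigned int i;   /* volatile: don't optimize the loop away */
        for (i = 0; i < 9000; i++)
            ;                      /* magic number, hand-tuned */
    }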
I never had such a button in my PCs (first one was a 386SX) but I did see it on other PCs and always wondered what it did... => today I finally found that out :P
Why? What you did is basically a burn-in test. All manufacturers torture their hardware by locking it in a hot room for several days at max speed to see if it fails. The basic theory is that if it's able to survive the torture test, then it's less likely to fail once it's been sold to the customer. Parts for things like space missions go through even more severe torture tests, where they're bombarded by radiation and every horrible thing you can imagine, and that actually makes the price go up!
Just for fun, I loaded Red Hat 5.2 onto the machine and it ran just fine. The syslog was full of bizarre errors, chattering the whole time too.
MS's reasoning was the low quality of the countless low-end power supplies, and maybe of the voltage regulator modules on mainboards, being 'unreasonably' stressed by load changes that fast.
Does anyone know of good write ups or explanations of what makes the 6502 so reliable and what competition it had in being chosen for medical applications?
I taught myself BASIC, assembler, graphics programming and game programming on that machine over a period of about four years of hacking around on it (including hand-commenting some significant chunks of the ROM). By the time I retired it for a shiny new Amiga 1000 in 1986, I'd upgraded it to 256K of bank-switched RAM with a soldered-in hack board, added four floppy drives and various I/O boards, learned OS-9 (a UNIX-inspired multi-tasking, multi-user OS) and hacked in my own extensions to the ROM OS (including adding my own new commands and graphics modes to the BASIC interpreter).
It started out as a lot of trial and error but, on later reflection, ended up being a surprisingly thorough grounding in computer science from which to launch my career. That 6809 machine was also the last time I really felt like I was aware of everything happening in a computer from interrupts to registers to memory mapping down to the metal.
Would be interesting to make a 32 bit Apple 2 style computer. Include a ROM for a means to boot, and leave everything else simple, with some nice slots. Could be a great development / learning machine.
One of the bigger challenges is integrating peripherals. I got bogged down trying to do SD Card interfacing. There are off the shelf bits of IP from Xilinx, etc. you can use to do this, but that sort of defeats the purpose of the exercise.
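For what it's worth, the part that bogs most people down is the SPI-mode initialization handshake, which is fussy but well documented. Roughly, in C (spi_xfer, cs_low and cs_high are hypothetical board-specific helpers; the fixed CRCs for CMD0 and CMD8 come from the SD spec):

    /* Hypothetical helpers: exchange one byte over SPI, drive chip select. */
    extern unsigned char spi_xfer(unsigned char out);
    extern void cs_low(void), cs_high(void);

    static unsigned char sd_cmd(unsigned char cmd, unsigned long arg, unsigned char crc)
    {
        unsigned char r;
        int i;
        spi_xfer(0x40 | cmd);
        spi_xfer(arg >> 24); spi_xfer(arg >> 16); spi_xfer(arg >> 8); spi_xfer(arg);
        spi_xfer(crc);
        for (i = 0; i < 8; i++) {            /* poll for the R1 response */
            r = spi_xfer(0xFF);
            if (!(r & 0x80)) break;
        }
        return r;
    }

    int sd_init(void)
    {
        int i;
        cs_high();
        for (i = 0; i < 10; i++) spi_xfer(0xFF);    /* 80 clocks with CS high */
        cs_low();
        if (sd_cmd(0, 0, 0x95) != 0x01) return -1;  /* CMD0: go idle          */
        if (sd_cmd(8, 0x1AAUL, 0x87) == 0x01)       /* CMD8: voltage check    */
            for (i = 0; i < 4; i++) spi_xfer(0xFF); /* discard rest of R7     */
        do {                                        /* ACMD41 until ready     */
            sd_cmd(55, 0, 0xFF);                    /* CMD55: app-cmd prefix  */
        } while (sd_cmd(41, 0x40000000UL, 0xFF) != 0x00);
        return 0;
    }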
I think modern machines started their slide into mind-boggling complexity when bus speed and CPU speed outstripped RAM speed. So much complexity and unpredictability is in all the infrastructure built around cache.
Something like an Amiga or Atari ST was still not hard to understand all/most of, despite being 16/32 bits.
I have a couple of those. The HYDRA was a lot of fun.
A search of the internet didn't turn up what I was looking for, but I'm very new at hardware work. Perhaps newer chips have such tight timing requirements that you can't work with them without using an SoC?
If you're just looking to breadboard up a computer but don't want to go back to 8-bit processors, the Motorola MC68000 / MC68008 used in the original Apple Macintosh is a 32-bit processor in a DIP package running at a manageably low frequency and can be found on eBay inexpensively.
No. The busses used to access peripherals and memory are not suitable for off-die use. (This goes for all ARM cores, not just Cortex-M0.)
It's quite a bit above the Apple ][ in terms of power.
(and shameless plug, my own 'remix' with better performance and more features: https://floooh.github.io/visual6502remix/)
[Edit: It's WebGL. I don't have WebGL in QubesOS :(.]
I wonder if it's just a function of the time. I imagine anything designed new now would use an ARM based microcontroller but likely when many of these systems were originally designed those were much less common and more expensive.
ARM Cortex M0/M0+ blows AVR out of the water, and is usually cheaper except for the very lowest end AVR parts. Generally will use less power, too. And that's assuming your unit counts are so high that firmware developer time is free.
Of course, it's getting impossible to find 5V VCC ARM parts, so that's something that would steer you towards AVR if having a 5V micro really makes your system a bunch simpler.
I ported an AVR code base to a Cortex M4 last year, and some of the inlined asm didn’t translate. I ended up having to use inlined C instead. So, my 120 MHz M4 chip struggled to do what a 90 MHz AVR did no problem.
AVR32 was neat, but has lost all commercial relevance.
There's an FPGA soft core called nextz80 that's supposed to do 4x more per clock cycle than a normal Z80.
"Works at up to 40MHZ on Spartan XC3S700AN speed grade -4) - performances similar or better than a real Z80 running at 160Mhz."
What struck me is how lean early tools and applications really are. Many are just usable at 1 MHz.
At 16 MHz, things are generally luxurious. Doing graphics, or writing, even running programs in BASIC all make sense and perform.
100 MHz is crazy! Frankly, one could add to the software, take advantage of the fast electronic storage available today, and get real things done.
I wonder just what people will end up doing on a BBC Micro or Apple 8-bit machine fitted with one of these.
I want one! Fun project.
The original article touches on this: the difficulty of interfacing with an Atari 8-bit, C64, etc.
Memory and peripheral access would be seriously wait-stated though. 50 cycles of action doesn't do you much good if memory is slow. Especially when you consider that programs for the 65xx made heavy use of zero page / direct page as an extra bank of pseudo-registers.
So you'd end up implementing some kind of cache, or memory mirroring, or just moving the whole of RAM in the FPGA... and then you start to wonder why you didn't just do the whole thing in FPGA as a C64 SoC.
All reads could be from fast RAM.
Then we have hardware registers and external DMA; those have to be handled specially.
And the Apple has slow RAM and fast RAM in a similar way. Really, to get the machine to run at 16 MHz, it's necessary to copy code into the fast RAM on board the card, leaving system RAM unused.
The Color Computer, Apple 2 and some others were made in a simpler way that did not interrupt the CPU for refresh and/or video access cycles. That makes projects like this easier.
However for "highly integrated" home computers like the Atari 8-bitters and C64 I guess this wouldn't be of much use, because most games and demos depend on proper CPU timing, even when not accessing memory mapped IO regions (for instance in wait-loops to get to the right raster position before reprogramming the video output).
This ends up becoming a very fun design problem when you do it with integrated circuits!
Correct, and even more - the Commodore 64 used a 6510 processor, not a 6502 processor. They're similar but there are significant differences.
Beyond that, a 6510 isn't the only thing you really need to emulate a Commodore 64. You also need a SID chip (MOS 6581) for sound, a MOS VIC-II for display, and a number of other things.
It would be quite easy to modify the design to full 6510, which just has a few more pins dedicated to IO. The biggest issue is properly emulating the bank switching, which they have done for other hardware, but the 6510 has a more complicated scheme.
The bigger problem is that all the RAM is really used for "IO" (in theory anyway) on the C64, as the VICII can remap the character generator (font) location, where it pulls sprites from, and where the screen content is stored.
So a static memory map is insufficient if you want it to just plug into the CPU socket and work.
I always wondered in those days why the disk drives for 8-bit computers were so crazy expensive. In Holland they cost more than the computers they were meant for.
But only later I learned that they were basically another whole computer themselves. Plus the drive mechanism of course, which also wasn't cheap (but not nearly expensive enough to warrant the high price).
It was the same for the Atari 800XL I had, I never owned a commodore 64.
It came out before any of those other home machines, and yet had the cheapest floppy disk storage from 1978 onward. That was largely due to Steve Wozniak's brilliant disk controller design, which did away with everything but some simple glue logic and a couple of ROM chips, handling everything else in software.
Of course, the Apple II had real expansion slots, obviating the need for using a serial connection, too.
From what I can tell, while the Apple II family had a much higher up-front cost, the more serious you were about computing, the more the low-priced home machines with expensive peripherals worked against you in the long run.
OTOH, the time-critical hack that allowed it also made it nearly impossible for Apple to upgrade the II without breaking backwards compatibility. The only Apple II with a faster 6502 is the //c+, and that's because it has the crazy Zip Chip acceleration logic on the motherboard.
The mechanics were also somewhat expensive. In Brazil, an Apple II drive was often as expensive as an Apple II clone.
What makes the intelligent drives a great idea is how easy it is to emulate them - you emulate a nice protocol. When you have to emulate, say, an Apple II drive, you need to emulate the delays the drive mechanics introduce, as well as the head electronics, because the Apple II's 6502 is reading the head and assembling the bits. That's also why accelerating an Apple II requires you to slow it down for a longer time every time it accesses the IO region - because the disk needs to revolve in the exact time the 6502 takes to run some amount of code. With an intelligent peripheral, it doesn't matter you don't wait several seconds between commands, as long as you only issue them at the required speeds.
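To make the contrast concrete, emulating an intelligent drive boils down to a command dispatcher, roughly like this sketch (the command codes and helper functions are illustrative, not the actual Commodore protocol):

    /* Answering commands instead of modeling head-level bit timing.
       disk_image, send_bytes, recv_bytes and sector_offset are
       hypothetical helpers over an in-memory disk image. */
    extern unsigned char disk_image[];
    extern void send_bytes(const unsigned char *p, int n);
    extern void recv_bytes(unsigned char *p, int n);
    extern long sector_offset(int track, int sector);

    enum { CMD_READ_SECTOR = 1, CMD_WRITE_SECTOR = 2 };  /* made-up codes */

    void handle_command(int cmd, int track, int sector)
    {
        switch (cmd) {
        case CMD_READ_SECTOR:   /* no rotation delays, no head electronics */
            send_bytes(disk_image + sector_offset(track, sector), 256);
            break;
        case CMD_WRITE_SECTOR:
            recv_bytes(disk_image + sector_offset(track, sector), 256);
            break;
        }
    }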
The real hacker spirit is alive.
They're compatible enough that you can drop a 6510 into it with no problems (I tested that, to my parents' great despair). You can also swap the IO chips, I think, with various effects (you can at least drop the Amiga CIA chips into a C64 - you lose the realtime clock nobody uses, but gain timers).
Putting a 6502 into a C64 may or may not work, for some values of "work" - I don't recall what the default for the bank switching would be, but the tape drive certainly wouldn't work (the GPIO lines on the 6510 are used for bank switching the ROM, and for the tape). But it should be quite easy to make it work except for the tape drive. You just need to ensure the right voltage on 3 pins for the ROM bank switching (various software that expects to be able to change it will fail, though).
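For reference, those three lines are LORAM, HIRAM and CHAREN (bits 0-2 of the 6510 port at $0001), and the visible memory follows roughly this decode (a simplified sketch; Ultimax mode and the cartridge lines are ignored):

    /* Simplified C64 memory decode from the 6510 port bits:
       LORAM = bit 0, HIRAM = bit 1, CHAREN = bit 2. */
    enum src { RAM, BASIC_ROM, KERNAL_ROM, CHAR_ROM, IO };

    enum src decode(unsigned addr, unsigned char port)
    {
        int loram = port & 1, hiram = port & 2, charen = port & 4;

        if (addr >= 0xA000 && addr <= 0xBFFF && loram && hiram) return BASIC_ROM;
        if (addr >= 0xE000 && hiram)                            return KERNAL_ROM;
        if (addr >= 0xD000 && addr <= 0xDFFF && (loram || hiram))
            return charen ? IO : CHAR_ROM;
        return RAM;                 /* everything else (and all writes) */
    }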
A few games that glitch when there's too much stuff going on at the same time might run smoother.
Demos are likely to mostly not work because any remotely fancy effect tends to depend on much more precise timing, though.
It surprised the end users who’d been ignoring the original SysBeep(1) sounds the application previously used.
100MHz. The software you could run! Add a megabyte or so of full speed, pageable RAM expansion. Every computer language right up to C++ (if it works on the 8-bit Arduino it could be shoehorned into a fast 6502 - limited stack? Who cares, just do the big stack in software. Special zero page? Just use it as glorified CPU registers).
What's the point really? But awesome all the same.
Why wouldn't a faster machine with more RAM be a multitasking machine? (Obviously without extras you are limited regarding security etc, but plenty early multitasking machines didn't have that)
It's easy to write a scheduler for a 6502 as there's so little to save, though you'll need to be very careful about stack usage, and you might do better with a specialised scheduler (e.g. for C64 BASIC) as a lot of code you might want to run may store additional state in fixed locations.
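To illustrate "so little to save": the entire per-task CPU context is basically this (a sketch; the catch is that all tasks share the single 256-byte hardware stack at $0100-$01FF, which is where the care about stack usage comes in):

    /* Everything a 6502 context switch must preserve, modeled in C. */
    struct task_context {
        unsigned char  a, x, y;   /* accumulator and index registers */
        unsigned char  p;         /* processor status flags          */
        unsigned char  sp;        /* stack pointer into page $01     */
        unsigned short pc;        /* program counter                 */
        /* plus the task's slice of the shared hardware stack page,
           and whatever zero-page bytes it uses as pseudo-registers */
    };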
The Game Boy Color has a 16-bit address space and almost all its games are 1MB or larger (although that's ROM rather than RAM). The largest game is 8MB in size - which is managed as 512 banks of 16KB each.
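The trick is the mapper: the CPU only ever sees a fixed 16KB bank at $0000-$3FFF and a switchable 16KB window at $4000-$7FFF, and the cartridge's MBC5 chip holds a 9-bit register choosing which of the up-to-512 banks appears in that window. A sketch of the address math:

    /* MBC5-style ROM banking: a 9-bit bank register selects one of up
       to 512 x 16KB banks for the CPU's $4000-$7FFF window. */
    unsigned char read_rom(unsigned addr, unsigned bank,
                           const unsigned char *rom)
    {
        if (addr < 0x4000)
            return rom[addr];                           /* bank 0, fixed   */
        return rom[bank * 0x4000UL + (addr - 0x4000)];  /* switched window */
    }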
There were plenty of machines kitted out with that amount of RAM. S-100 bus and other multi-user systems in the '70s and '80s could handle dozens of simultaneous users. It's cooperative multitasking, not preemptive multitasking.
Someone who is more active in this field may have a more accurate and broader view than I do.
Is one of the most recent and, for me, significant developments. Note that for companies that use FPGAs, none of the above is considered a hurdle (though their engineers may have a different opinion), and that the hobbyist/hacker market for FPGAs is so insignificant compared to the professional one that the vendors do not care about catering to it.
I think the biggest development, though, is that there's enormously more off-the-shelf Verilog and VHDL, not just on OpenCores like 20 years ago, but also on GitLab, GitHub, and so on. Easy examples are CPUs like James Bowman's J1A, the VexRiscv design used in Bunnie's Precursor: https://github.com/SpinalHDL/VexRiscv (as little as 504 Artix-7 LUTs and 505 flip-flops), and Google's OpenTitan.
But from my POV the more interesting reason for using an FPGA is for things that aren't CPUs. For example, the SUMP logic analyzer and its progeny OLS https://sigrok.org/wiki/Openbench_Logic_Sniffer (32 channels at 200MHz), although I think both of these require the proprietary vendor tools to synthesize. I'm gonna go out on a limb here and guess that reliably buffering up data at 6.4 gigabits per second is not a thing that any CPU can do, even one that isn't a softcore; CPUs that run at speeds high enough to potentially do it invariably depend on cache hierarchies that monkeywrench your timing predictability.
As I said, though, I'm not active in the field, so all I know is hearsay.
I love people who throw themselves at a problem that has no real use, just to master the technology. Kudos!!
The board design is a masterpiece too. Really clean.
"Pentium overdrive": https://en.wikipedia.org/wiki/Pentium_OverDrive
Your suggestion is closer to a grid computer but even then I don't think an unmodified 6502 would be a great choice because the memory model (or lack thereof) would really restrict performance.
Intel did entertain a similar thought for a while, as far as I can understand: https://semiaccurate.com/2012/08/28/intel-details-knights-co...
The LAN controller used to make the Beowulf cluster would probably have more compute (and memory) than the 6502 itself.
The Intel cores in the linked article have distinct L1 data and instruction caches inside them, and associated L2 caches, which makes a big difference in comparison to the 6502.
(link pulled from the references section of the post)
If you were to disable the use of the on-chip RAM it'd be stalled far more than half the time, as it'd be unable to fetch instructions fast enough.
EDIT: Actually you're right that there's a problem with the bank switching here, since it tries to mirror the system RAM/ROM, and it won't be able to, as it has only 64K on-chip RAM. You could conceivably get it to work by designating the entire address space as an "IO area", but it'd totally kill performance.
"It may be possible and worthwhile to also support some slightly later machines: The Acorn BBC Micro, Atari 400 and 800, and maybe the Commodore C64 come to mind."
So support is definitely under consideration.
If you disable the RAM mirroring, all you need to make it compatible is to map the 6 IO pins to address $1. That "solves" the bank switching, but at the cost of killing performance totally as the chip will be starved for memory access most of the time.
Judging from his pictures, he's using a version of the Spartan 6 (XC6SLX9) that has 72KB on-chip RAM, though, so unless he's using any of the RAM for anything else, he could still mirror both the 64KB RAM and the KERNAL and BASIC ROMs. But he'd also need to keep track of various VICII registers to know which areas to designate as "IO areas" to pass through writes for, given the VICII can address memory "everywhere" for sprites, fonts and bitmap data depending on what you write to different registers. Since that can change at any time, it'd involve a lot of "fun" logic to flush data from the on-chip cache to the C64 memory whenever a register changes.
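Modeled in C rather than HDL, the write path of such an accelerator would need roughly this shape (a sketch; all names are made up, and the VICII's pointer registers are reduced to a single vic_watches() test that would be re-derived whenever the $D000 registers change):

    /* Every write lands in fast on-chip RAM, but writes to regions the
       VICII might fetch from (screen, font, sprite data) must also go
       out to the real C64 bus at slow-bus speed. */
    extern unsigned char fast_ram[65536];
    extern void slow_bus_write(unsigned addr, unsigned char v);
    extern int  vic_watches(unsigned addr);  /* derived from VICII/CIA2 regs */

    void cpu_write(unsigned addr, unsigned char v)
    {
        fast_ram[addr] = v;              /* always keep the fast copy   */
        if ((addr & 0xF000) == 0xD000) { /* IO page: pass through, and  */
            slow_bus_write(addr, v);     /* re-derive the watch regions */
            return;
        }
        if (vic_watches(addr))           /* VICII-visible RAM: mirror   */
            slow_bus_write(addr, v);
    }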
The control and protocol handling part of the chip was a modified 6502 core with a sort of MMU and single-cycle zero page registers. The whole thing was clocked at 33 MHz, probably making it the fastest 6502 in production at that time. Not that we sold that many of the devices...
As an old C64 scener, I really enjoyed being able to code the application SW using my favorite ASM tools. Though most of the code we actually compiled from C using cc65 and then hand-tuned to fit the mem constraints.
Today a simple Cortex M0+ MCU (with internal AES core) would be able to do what we did, and probably be smaller and require less power.
Modern Cortex M0+ chips are probably manufactured on 90 nm or 65 nm process nodes (or possibly even smaller - but then the die size becomes I/O-bound, though you can add more memory easily without driving up the die size). They have a much lower core supply voltage, much better I/Os, and low-power modes.
In our specific case, we used, I believe, a 250 nm or possibly even 350 nm ASIC process. And size in this specific case was also related to the package. We used a QFN. Today you can get an M0+ based MCU with a low number of exposed I/Os in a small BGA or WCP package that is just a few mm2.
The idea we had at the start was a chip small enough to fit inside the connector of a serial wire, requiring so little power that it wouldn't need external power (basically harvesting), fast enough not to reduce the bitrate, adding very low and fixed latency, and (after configuration) totally transparent as seen from the application. Basically a secure serial cable, reduced to an extra cable connector inserted between an IoT/SCADA device and its serially connected modem.
Due to the process node we didn't really get there. But today this is basically feasible using off-the-shelf MCUs.
I actually found a few of the chips and one of the cable connectors/dongles yesterday. So I still have a few 33 MHz 6502s ;-)
You can buy them still:
I like this idea. This may bring a new level of repairability for devices whose chips will, one day, no longer be manufactured.