Sweet 16: The 6502 Dream Machine (1977) (1emu.net)
82 points by jacquesm on March 2, 2015 | 20 comments

I've been working for the last two months on a personal project to build an 8-bit microcomputer based on AVR MCUs. Because these MCUs use a Harvard architecture, I implemented a 6502 VM to run programs from external SRAM.

But I digress; the relevant bits for this story are that I'm using CC65, a compiler for the 6502, so I can write programs in C for my toy computer.

It's impressive the amount of stuff CC65 does in its runtime just to implement C. Sure the 6502 is a beautiful processor (I loved implementing the VM!), but it is also very limited.

Just an example: CC65 implements its own stack in regular memory space because the actual stack of the 6502 is limited to 256 bytes!

Every time you pass a pointer as a parameter in a function call, the compiler loads the 16-bit address into A and X and pushes it onto its own stack, where the callee can access it as needed.

Look at the pushax implementation here: https://github.com/cc65/cc65/blob/master/libsrc/runtime/push...

All that just to "push value in a/x onto the stack". Amazing :)
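For anyone who wants to see the idea without reading 6502 assembly, here is a rough Python model of a software argument stack. It is only a sketch: the class name, `popax`, and the `0x8000` starting address are my own illustrative choices, and it assumes a downward-growing stack with the low byte at the lower address rather than transcribing the linked routine.

```python
# Rough model of a software argument stack for a CPU whose hardware stack is tiny.
# SoftStack, popax and the 0x8000 start address are illustrative assumptions.


class SoftStack:
    def __init__(self, mem, top):
        self.mem = mem   # flat 64 KiB address space
        self.sp = top    # 16-bit pointer; on a 6502 it would occupy two zero-page bytes

    def pushax(self, a, x):
        """Push the 16-bit value whose low byte is in A and high byte is in X."""
        self.sp = (self.sp - 2) & 0xFFFF             # assumed: stack grows downward
        self.mem[self.sp] = a & 0xFF                 # low byte at the lower address
        self.mem[(self.sp + 1) & 0xFFFF] = x & 0xFF  # high byte just above it

    def popax(self):
        """Return (a, x) and release the two bytes."""
        a = self.mem[self.sp]
        x = self.mem[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return a, x


stack = SoftStack(bytearray(0x10000), 0x8000)
stack.pushax(0x34, 0x12)   # push the pointer value 0x1234
```

The Python version hides the cost: on an 8-bit CPU the 16-bit pointer arithmetic and the indirect stores are each several instructions.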

EDIT: typo

The cc65 project also has a very excellent 6502 assembler, ca65, which I find quite nice when writing assembly by hand. Together with ld65 it implements a full unix-style module/library-archive stack for 6502 object code. The assembler's macro system is quite well developed, as well, to the point that I prefer using it over cpp.

Do you prefer it over KickAssembler? http://www.theweb.dk/KickAssembler/Main.php

ca65 is the only 6502 assembler I've used, but I have experience with other CPUs' assemblers and I must say I like ca65's macros a lot.

> Just an example: CC65 implements its own stack in regular memory space because the actual stack of the 6502 is limited to 256 bytes!

256? That's twice as much as all of a standard 8051's RAM, which is only 128 bytes! The 8051 is probably the other 8-bit Harvard MCU that's still in widespread use, and compilers have managed to make C run on it; external stack emulation is one of the things they do too. It also happens to be one of the very few architectures with an upward-growing stack (a push increments the stack pointer.)
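To make the stack direction point concrete, here is a small sketch contrasting the two push behaviours. The function names are made up for illustration; the behaviour modelled is the 8051 incrementing SP before storing, versus the 6502 storing at $0100+S and then decrementing S.

```python
# Illustrative helpers (names are my own); each returns the new stack pointer.


def push_8051(iram, sp, value):
    """8051-style push: increment SP first, then store (stack grows upward)."""
    sp = (sp + 1) & 0xFF
    iram[sp] = value & 0xFF
    return sp


def push_6502(mem, s, value):
    """6502-style push: store at $0100+S, then decrement S (stack grows downward)."""
    mem[0x0100 + s] = value & 0xFF
    return (s - 1) & 0xFF


iram = bytearray(128)                  # the 128 bytes of internal RAM mentioned above
sp = push_8051(iram, 0x07, 0xAB)       # 0x07 is the 8051's SP value after reset

mem = bytearray(0x10000)
s = push_6502(mem, 0xFF, 0xAB)         # stack page is $0100-$01FF
```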

The 8051 is an MCU, the 6502 is not!

Maybe my comment was confusing: I'm implementing a 6502 VM on an ATmega MCU, but my comment was about the 6502's limitations, and that I can relate to Wozniak's frustrations dealing with 16-bit data.

> still in widespread use

Wait, where, how? And why? (Honestly curious!)

There are many 8051 clones available. It's one of the cheapest microcontrollers you can buy, and no doubt popular because of the installed base plus the tools from Keil.

But it's certainly not the only Harvard architecture processor still in use. Microchip's PIC is the other big one; in fact it predates the 8051, having been available from General Instrument since 1976.

I'm finding 8051 variants integrated into SoCs for capacitive touch screens, RFID readers, and even in contactless "smart" cards.

Hah, my first thought on reading this was that the software stack implementation must slow the machine down horribly. Then I remembered that this was back in the day before cache memory and other optimisations existed, making the technique close to comparable to using the hardware-implemented stack.

Caches existed, just not in microprocessors. The IBM 360 model 85 introduced caches in 1968 [1]. The DEC PDP-11/70 had one in 1975 [2], the same year the 6502 debuted.

The Motorola 68020 was the first Motorola microprocessor with a cache in 1984, and the 386DX was Intel's first in 1985 [Wikipedia].

I agree, it's kind of shocking to read about the designs people could get away with before caches became important. It also illustrates why it's so important to continually re-evaluate any programming doctrine related to performance and data organization.

[1] http://www.bitsavers.org/pdf/ibm/360/funcChar/A22-6916-1_360..., see references to "buffer storage"

[2] http://www.pdp-11.nl/pdp11-70startpage.html

Not so - this routine actually takes a long time by comparison, and I've no doubt the execution will be shamefully slow :) To push 16 bits to the hardware stack on the standard 6502 takes 8 or 9 cycles (depending on how you retrieve the second byte); this routine takes 50 (as a round trip including the JSR).
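The arithmetic can be checked with a quick tally. The cycle counts below are the standard NMOS 6502 figures; the software-push instruction sequence is only my assumption of how such a routine is shaped (subtract 2 from a zero-page pointer, then store both bytes indirectly), not a copy of the linked file, and it assumes the no-borrow path where the branch is taken without crossing a page.

```python
# Standard 6502 cycle counts for the instructions used below.
CYCLES = {
    "JSR abs": 6, "RTS": 6, "PHA": 3, "PLA": 4,
    "LDA zp": 3, "STA zp": 3, "SEC": 2, "SBC #imm": 2,
    "BCS taken": 3, "LDY #imm": 2, "TXA": 2, "DEY": 2,
    "STA (zp),Y": 6,
}

# Pushing A then X onto the hardware stack.
HARDWARE_PUSH = ["PHA", "TXA", "PHA"]

# Assumed shape of a software 16-bit push, including the call and return.
SOFTWARE_PUSH = [
    "JSR abs", "PHA", "LDA zp", "SEC", "SBC #imm", "STA zp", "BCS taken",
    "LDY #imm", "TXA", "STA (zp),Y", "PLA", "DEY", "STA (zp),Y", "RTS",
]


def total_cycles(sequence):
    """Sum the cycle cost of a list of instruction names."""
    return sum(CYCLES[name] for name in sequence)


print(total_cycles(HARDWARE_PUSH), total_cycles(SOFTWARE_PUSH))  # prints "8 50"
```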

The Internet Archive has the November 1977 issue of Byte, containing this article on page 150, here: https://archive.org/details/byte-magazine-1977-11

I've posted this because I really like the elegant way in which he expanded the 6502 instruction set. Something similar could be done with just about any other processor.

If you like the "call a subroutine, subsequent data is interpreted as custom instructions" on the 6502, check out Super-Mon / NakedOS by Martin Haye. Very nicely done, incredibly minimal programming environment / OS for the Apple II: https://bitbucket.org/martin.haye/super-mon/wiki/Home
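A toy version of the trick, for anyone who hasn't seen it: the bytes after the call are fetched by an interpreter until a "return" opcode, and native execution resumes right after it. This is a sketch only. The three opcodes are loosely modelled on SWEET16's SET, ADD and RTN encodings, and `run_inline` is a hypothetical name, not anything from the article or from Super-Mon.

```python
def run_inline(mem, pc, regs):
    """Interpret bytes starting at pc; return the address where native code resumes."""
    while True:
        op = mem[pc]
        pc += 1
        if op == 0x00:                    # RTN: leave the interpreter
            return pc
        kind, n = op >> 4, op & 0x0F      # high nibble = operation, low nibble = register
        if kind == 0x1:                   # SET Rn, 16-bit constant (low byte first)
            regs[n] = mem[pc] | (mem[pc + 1] << 8)
            pc += 2
        elif kind == 0xA:                 # ADD Rn: R0 = R0 + Rn (16-bit wraparound)
            regs[0] = (regs[0] + regs[n]) & 0xFFFF
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")


# Bytes that would follow the call: SET R0,$1234; SET R1,$0001; ADD R1; RTN; then native code.
program = bytearray([0x10, 0x34, 0x12, 0x11, 0x01, 0x00, 0xA1, 0x00, 0xEA])
regs = [0] * 16
resume_at = run_inline(program, 0, regs)
```

On the real machine the interpreter finds `pc` by pulling the return address off the stack, which is what makes the custom instructions look like part of the native code stream.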

John Draper once asked a friend of mine if he wanted to go into the back of a van and execute an 1802 instruction. Can you guess which one?


Because of the 16-bit address bus, and the 8-bit data bus, the sixteen general purpose registers are 16 bits wide, but the accumulator (the so-called data register, or D-register) is only 8 bits wide. The accumulator, therefore, tends to be a bottleneck. Transferring the contents of one register to another involves four instructions (one Get and one Put on the HI byte of the register, and a similar pair for the LO byte: GHI R1; PHI R2; GLO R1; PLO R2). Similarly, loading a new constant into a register (such as a new address for a subroutine jump, or the address of a data variable) also involves four instructions (two load immediate, LDI, instructions, one for each half of the constant, each one followed by a Put instruction to the register, PHI and PLO).
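The four-instruction sequences described above can be modelled in a few lines. This is a minimal sketch: the class and helper names are invented for illustration, and only the register file and the 8-bit D register are modelled.

```python
class Regs1802:
    """Sixteen 16-bit registers plus the 8-bit D register (names are illustrative)."""

    def __init__(self):
        self.r = [0] * 16
        self.d = 0

    def ghi(self, n):          # Get HI byte of Rn into D
        self.d = (self.r[n] >> 8) & 0xFF

    def glo(self, n):          # Get LO byte of Rn into D
        self.d = self.r[n] & 0xFF

    def phi(self, n):          # Put D into HI byte of Rn
        self.r[n] = (self.r[n] & 0x00FF) | (self.d << 8)

    def plo(self, n):          # Put D into LO byte of Rn
        self.r[n] = (self.r[n] & 0xFF00) | self.d

    def ldi(self, value):      # Load immediate byte into D
        self.d = value & 0xFF


def copy_register(cpu, src, dst):
    """GHI src; PHI dst; GLO src; PLO dst. Returns the instruction count."""
    cpu.ghi(src); cpu.phi(dst); cpu.glo(src); cpu.plo(dst)
    return 4


def load_constant(cpu, dst, value):
    """LDI hi; PHI dst; LDI lo; PLO dst. Returns the instruction count."""
    cpu.ldi(value >> 8); cpu.phi(dst); cpu.ldi(value); cpu.plo(dst)
    return 4
```

Every byte has to pass through D, which is why the accumulator ends up as the bottleneck.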

Still being ported to old systems and used in, for example, the Oric-1/Atmos machines:


(Old computers never die, their users do!)

There is a 16-bit version of the 6502: the 65816. It would have been cool if the 65816 was an implementation of the Sweet16.

It supports both 16-bit and 8-bit mode. I find it quite annoying to have to keep track of which mode a register is in.
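Part of the annoyance is that the same opcode takes a different number of operand bytes depending on the mode. A small sketch of the bookkeeping, assuming native mode, where the M flag (bit 5 of the status register) selects the accumulator width and the X flag (bit 4) selects the index register width; the helper names are my own.

```python
M_FLAG = 0x20   # 1 = 8-bit accumulator, 0 = 16-bit
X_FLAG = 0x10   # 1 = 8-bit index registers, 0 = 16-bit


def rep(p, mask):
    """REP #mask: clear the selected status bits."""
    return p & ~mask & 0xFF


def sep(p, mask):
    """SEP #mask: set the selected status bits."""
    return (p | mask) & 0xFF


def immediate_size(p, uses_index):
    """Operand bytes for an immediate load such as LDA #imm or LDX #imm."""
    flag = X_FLAG if uses_index else M_FLAG
    return 1 if p & flag else 2


p = 0x30                  # both flags set: everything is 8-bit
p = rep(p, M_FLAG)        # REP #$20: accumulator becomes 16-bit, index stays 8-bit
```

An assembler or disassembler has to track these flags along the code path to know how many bytes follow an immediate opcode, since the CPU decides at run time.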
