
Dirty tricks 6502 programmers use - nurpax
https://nurpax.github.io/posts/2019-08-18-dirty-tricks-6502-programmers-use.html
======
teh_klev
Not to belittle the article, because it's definitely interesting. But as an
ex-BBC 6502 programmer, my nitpick here would be that the title should really
be named "Dirty tricks C64 6502 programmers use".

On the beeb we had our own set of tricks specific to the memory layout and ROM
of our beloved beige and black machines.

~~~
reaperducer
_title should really be named "Dirty tricks C64 6502 programmers use"._

In that case, probably _Dirty Tricks 6510 Programmers Use_ would be even
better.

~~~
vidarh
While that may be technically correct, the programmer-visible difference
between the MOS 6502 and the MOS 6510 is totally incidental in this case - the
6510 has a built-in 6/8-pin IO port (partially used for bank switching the
ROMs in the C64). Unless you touch the IO ports, they should behave
identically, down to the cycle timings of instructions and the same behavior
of undocumented opcodes.

In this case, the real C64-specific tricks are not 6510-specific; they depend
on the specific initialization done by the C64 ROM and on calling C64 ROM
routines.

~~~
tenebrisalietum
A real "dirty trick" is reading the actual RAM in memory locations 0 and 1 on
the 6510 (and not the I/O port values).

------
rusk
_> Entries were posted as Twitter replies and DMs, containing only the PRG
byte-length and an MD5 hash of the PRG file._

This is clever. So basically rather than getting bogged down reviewing
submissions you just pick a winner and then validate post-hoc! (because when
you win the hash of your code has to match the one you submitted)
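
A minimal sketch of that commit-then-reveal flow, assuming Python's standard `hashlib`. The helper names and the 6-byte sample file are hypothetical, and nothing here is the contest's actual tooling:

```python
import hashlib

def commitment(prg: bytes) -> tuple[int, str]:
    """Return the (byte length, MD5 hex digest) pair an entrant would post."""
    return len(prg), hashlib.md5(prg).hexdigest()

def verify(prg: bytes, posted_len: int, posted_md5: str) -> bool:
    """Check a revealed file against a previously posted commitment."""
    return commitment(prg) == (posted_len, posted_md5.lower())

# Hypothetical 6-byte "PRG": a $0801 load address followed by four bytes.
entry = bytes([0x01, 0x08, 0xA9, 0x00, 0x60, 0x00])
posted = commitment(entry)   # this pair is all that gets published up front
```

Only the winner ever has to reveal the file, and `verify(entry, *posted)` is the single check the organizer then needs to run.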

~~~
saagarjha
I wonder if you could brute force a particular solution with that information.

~~~
stjo
I would say no. 34 bytes is
256^34=7588550360256754183279148073529370729071901715047420004889892225542594864082845696
combinations, and even if you could easily narrow it down to only valid
programs you would still need to simulate it, which is way slower than
computing a hash.
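
The size of that search space can be sanity-checked with Python's arbitrary-precision integers. This is a rough sketch: the rate of 10^12 hashes per second is an assumption picked for illustration, not a measured figure.

```python
N_BYTES = 34
combos = 256 ** N_BYTES            # every possible 34-byte file
key_bits = 8 * N_BYTES             # the same count expressed as a key size
assert combos == 2 ** key_bits

HASHES_PER_SECOND = 10 ** 12       # assumed rate, purely illustrative
SECONDS_PER_YEAR = 365 * 24 * 3600
years = combos // (HASHES_PER_SECOND * SECONDS_PER_YEAR)

print(f"{N_BYTES} bytes = {key_bits} bits, {len(str(combos))} decimal digits")
print(f"roughly 10^{len(str(years)) - 1} years at {HASHES_PER_SECOND:,} hashes/s")
```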

~~~
saagarjha
Only a fraction of those would be reasonable programs, and you can test almost
all of them immediately by computing an MD5 hash.

~~~
userbinator
34 bytes is equivalent to bruteforcing a 272-bit key. It's already _physically
impossible_ to do that for a 256-bit key even if you ignore everything other
than incrementing the key counter itself:

[https://pthree.org/2016/06/19/the-physics-of-brute-force/](https://pthree.org/2016/06/19/the-physics-of-brute-force/)

~~~
saagarjha
But as I said, you’re not brute forcing the entire key space because you
likely have an idea of at least some of the bits.
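
A toy sketch of that partial-knowledge attack, assuming the attacker has guessed every byte except a few and knows which positions are uncertain. The 8-byte "program", the positions and the helper name are all made up for illustration:

```python
import hashlib
from itertools import product

def recover(template: bytes, unknown: list[int], target_md5: str):
    """Try every value for the unknown byte positions; return the match or None."""
    buf = bytearray(template)
    for values in product(range(256), repeat=len(unknown)):
        for pos, v in zip(unknown, values):
            buf[pos] = v
        if hashlib.md5(buf).hexdigest() == target_md5:
            return bytes(buf)
    return None

# Hypothetical 8-byte entry; the attacker only sees its length and MD5.
secret = bytes([0x01, 0x08, 0xA9, 0x2A, 0x8D, 0x20, 0xD0, 0x60])
target = hashlib.md5(secret).hexdigest()

# The guess is right everywhere except positions 3 and 5.
guess = bytes([0x01, 0x08, 0xA9, 0x00, 0x8D, 0x00, 0xD0, 0x60])
found = recover(guess, [3, 5], target)   # 256**2 = 65,536 candidates
```

Two unknown bytes are trivial, but each additional unknown byte multiplies the work by 256, so the approach only helps if almost the whole program can be guessed.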

------
Zelphyr
What, if any, modern products still use the 6502?

EDIT: That’s not a dig against the 6502. I still fondly remember learning BASIC
on my C64 and wish now I had ventured into Assembly with it. By today’s
standards it seems to have a simpler and more approachable instruction set so
I’m wondering if there aren’t products I could hack on to learn Assembly with
it. Or maybe I should just break out my old Commie.

~~~
Gibbon1
I'm unsure, but I think 6502s are available as cores for semi- and full-custom
ICs, where the processor core and memory are fully laid out. Bonus: they run
with a GHz clock, which gives you the ability to twiddle bits like mad.

So outside of retro computing you won't see a 6502 IC in the wild. But they
are likely buried deep in nondescript ICs.

~~~
fulafel
[http://www.6502.org/commercial](http://www.6502.org/commercial) lists some of
these.

------
fortran77
Thanks!

6502 is still my favorite architecture, even though I've done assembly
language programming (professionally!) on many platforms in the past 35 years.

~~~
bcook
Why is it your favorite?

~~~
eej71
I'm not the OP, but I always liked its simplicity. There's just enough space
to do something interesting without getting bogged down in too much
complexity.

~~~
fortran77
I think it's because it was my first. And it is simple. Yes there are many
addressing modes (like zero-page, and "absolute indirect", and "indexed
indirect") but there's only 56 instructions and you can learn it in one
afternoon.

And doing something useful with 56 instructions, 8-bits at a time, is like
solving a puzzle.

~~~
pvg
_8-bits at a time_

It's a detail that doesn't always come up in these threads but it's worth
remembering how belligerently 8-bit a 6502 is. Not only are there next to no
general-purpose registers but they're 8 bit and there are no pretend-two-
registers-are-one-16-bit-register instructions at all. You can't put an
address in a register. Compared to even other popular 8 bit CPUs of the time,
that's a bit metal.
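
To make that concrete, here is a small Python model of what a 16-bit addition turns into when only 8-bit add-with-carry is available. It is a sketch of the general technique, not 6502 code; the function names are hypothetical and decimal mode is ignored.

```python
def adc(a: int, b: int, carry: int) -> tuple[int, int]:
    """8-bit add with carry in; returns (result byte, carry out)."""
    total = a + b + carry
    return total & 0xFF, total >> 8

def add16(x: int, y: int) -> int:
    """Add two 16-bit values one byte at a time, low byte first."""
    lo, c = adc(x & 0xFF, y & 0xFF, 0)   # clear carry, then add the low bytes
    hi, _ = adc(x >> 8, y >> 8, c)       # add the high bytes plus the carry
    return (hi << 8) | lo                # wraps at 16 bits like the hardware
```

Every 16-bit pointer update in a program pays for both steps, which is part of why addresses tend to live in memory rather than in registers.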

~~~
ddingus
It is. There is always 6809 for a bit more civilized fun, IMHO.

~~~
NikkiA
Personally I preferred the 6800 family (6800, 6802, 6805) over the 6502, but
the 6809 always felt a little too far.

~~~
jacquesm
The 6809 is an amazing little processor, you can run multi-tasking and
relocatable code on it with relative ease. And with some bank switching magic
you can even do that with appreciable amounts of RAM for each task. It is also
one of the few instruction sets that is very predictable, if you know some
base formats then you can 'compose' instructions and they usually exist as a
valid opcode.

~~~
jonsen
“UniFLEX is a Unix-like operating system ... for the Motorola 6809”:

[https://en.m.wikipedia.org/wiki/UniFLEX](https://en.m.wikipedia.org/wiki/UniFLEX)

------
kazinator
My first 6502 program was self-modifying; I wrote it just before reading the
book chapter on using registers for indexing relative to a base address. That
book was _Programming the 6502_ by Rodnay Zaks.

I have some 1986-dated 6502 assembly code of mine in hard copy (on dot matrix
paper with the "holes" intact). I'm going to scan it one day and post.

~~~
vidarh
A lot more 6502 code was self modifying than necessary - I know a lot of
people (myself included) did not pick up zero-page indexed indirect/indirect
indexed address modes and instead kept using absolute x/y indexed and modified
the absolute part for larger loops. A large part of the reason why I didn't
learn about it until fairly late was that I mostly saw absolute x/y indexing
in the code I looked at to learn. It's interesting how many bad habits you'd
see in code like that, given e.g. the C64 ROMs were extensively dissected and
documented and published, and they used zero page all over the place.
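
The two approaches can be compared in a small Python model of a 64 KB address space. This is only a sketch of the addressing arithmetic: the code address `$C000`, the zero-page pointer at `$FB` and the fill range are assumptions chosen for illustration.

```python
def fill_self_modifying(mem: bytearray, code_at: int, base: int,
                        pages: int, value: int) -> None:
    """Model STA abs,X where the loop patches the instruction's own operand."""
    mem[code_at] = 0x9D                    # STA absolute,X opcode
    mem[code_at + 1] = base & 0xFF         # operand low byte, inside the code
    mem[code_at + 2] = (base >> 8) & 0xFF  # operand high byte, inside the code
    for _ in range(pages):
        for x in range(256):
            operand = mem[code_at + 1] | (mem[code_at + 2] << 8)
            mem[(operand + x) & 0xFFFF] = value
        # Next page: increment the high byte of the instruction itself.
        mem[code_at + 2] = (mem[code_at + 2] + 1) & 0xFF

def fill_indirect_indexed(mem: bytearray, zp: int, base: int,
                          pages: int, value: int) -> None:
    """Model STA (zp),Y where the pointer sits in zero page; code is untouched."""
    mem[zp] = base & 0xFF
    mem[zp + 1] = (base >> 8) & 0xFF
    for _ in range(pages):
        for y in range(256):
            pointer = mem[zp] | (mem[zp + 1] << 8)
            mem[(pointer + y) & 0xFFFF] = value
        # Next page: increment the high byte of the zero-page pointer.
        mem[zp + 1] = (mem[zp + 1] + 1) & 0xFF

CODE_AT, ZP_PTR, BASE, PAGES, VALUE = 0xC000, 0xFB, 0x0400, 4, 0x20
ram_a = bytearray(0x10000)
ram_b = bytearray(0x10000)
fill_self_modifying(ram_a, CODE_AT, BASE, PAGES, VALUE)
fill_indirect_indexed(ram_b, ZP_PTR, BASE, PAGES, VALUE)
```

Both calls fill the same four pages; the only difference is whether the 16-bit base address that gets bumped lives inside the instruction stream or in two zero-page bytes.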

~~~
ddingus
Raises hand. Yeah, me too.

When I first got zero page, I thought it looked like up to 128 address
registers, with only a couple cycle penalty.

But, like you said, a lot of code self-modified the easier absolute indexed
address mode instructions.

And it was right there, easy to see.

------
js2
Sorta related, some 6502-based demos:

[https://hackaday.com/2018/12/05/apple-ii-megademo-is-countin...](https://hackaday.com/2018/12/05/apple-ii-megademo-is-countin-cycles-and-takin-names/)

Apple II, but in the comments are links to a C64 demo, a IIGS demo, and a ZX
Spectrum demo (which is Z80, not 6502, but same era).

------
LarryMade2
There can be a significant trade-off between size and speed: the more tricks
you use to shave down bytes, the more complex the iterations usually become.

So assembly programmers may go for the more kludgy-looking code, because its
execution far outpaces the version optimized for byte count. I've heard of
such things in video timing and game loops.

~~~
jwr
This really depends on the specific architecture and the application. In some
cases, you will want to optimize mostly for size, so that your hotspots fit
entirely into I-cache. Modern CPUs spend most of their time waiting for data
(or instructions) to become available, so often computations are essentially
free.

~~~
gpderetta
The average IPC over a variety of loads is, IIRC, estimated to be ~1. So no,
most modern CPUs do not spend most of their time waiting for data.

