

The 6502 CPU's overflow flag explained at the silicon level - unwind
http://www.arcfn.com/2013/01/a-small-part-of-6502-chip-explained.html

======
jeffbarr
I loved that chip!

My first real programming job entailed writing a blazingly fast macro assembler
from scratch for the 6502, using a much slower (and non-macro) assembler from
the vendor (Ohio Scientific, for those of a certain age).

I simulated the proposed hashing algorithm in FORTRAN at the community college
I was attending, and found that it led to a lot of collisions for the base
6502 opcode set. When I pointed this out to my manager (who's still my friend
32+ years on), he made sure that I got my first raise.

I still have those listings and the original design documents around
somewhere.

------
raverbashing
Very interesting

And this is 30-year-old tech. The processor in your cell phone is about 1000x
faster than the 6502 (and much more capable).

To me the hardest part (apparently) is converting the electronic circuit to
the actual chip drawings. Not sure how this is done (how do you route it?). And
this was done _by hand_ for the 6502; the drawings were made the size of a desk
and reduced photographically. (IIRC)

~~~
melling
How did you come up with 1000x? With Moore's law we are only 3 orders of
magnitude better than a processor from the late 1970s?

~~~
zxcdw
The x1000 is a huge understatement. For example, CPUs these days are much more
efficient in terms of _cycles per instruction_ and, inversely, _instructions per
cycle_. Back then, multiplying two word-sized (8 bits back then) values took 24
cycles; these days we can do that in 12 cycles for 64-bit values. Because of
superscalar processing and thus instruction-level parallelism, we can typically
do 2-4 ALU operations in parallel (given that there are no data dependencies)
and thus increase the instruction throughput 2-4 fold. Then, because of SIMD
features and data-level parallelism, we can do the same operation on multiple
data (say, operate on a vector of 4 elements in a single cycle) and thus we
eliminate the need for repeated instructions.

This all gets a bit complicated in modern days because of memory access costs
and caches which try to alleviate the costs, but the idea is that modern CPUs
are likely to be around 10 times as fast _per-clock_ as the 6502, and because of
multiple cores and threads that value goes to something like 40-60. Add the
huge increase in clock speed and you're a bit south of x100,000 in the optimal
case.
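
The estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 10x per-clock figure comes from the comment, while the 4 to 6 cores, the ~1 MHz 6502 clock and the ~2 GHz modern clock are assumed round numbers chosen for illustration.

```python
# Back-of-the-envelope speedup estimate; every input is a rough assumption.
PER_CLOCK_GAIN = 10               # "around 10 times as fast per-clock" (from the comment)
CORES = (4, 6)                    # assumed core/thread counts giving the 40-60 range
CLOCK_6502_HZ = 1_000_000         # assumed ~1 MHz 6502
CLOCK_MODERN_HZ = 2_000_000_000   # assumed ~2 GHz modern CPU


def speedup(cores: int) -> int:
    """Combined speedup: per-clock gain x core count x clock ratio."""
    clock_ratio = CLOCK_MODERN_HZ // CLOCK_6502_HZ
    return PER_CLOCK_GAIN * cores * clock_ratio


low, high = (speedup(c) for c in CORES)
print(low, high)  # 80000 120000
```

With these inputs the low end lands at 80,000, which matches "a bit south of x100,000"; a faster assumed clock pushes it past that.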

I would hope every programmer would write some code on a C64 to really learn
how much RAM 64 KB really is. You can actually _waste_ some of it and in
some cases it really is "enough so that I don't have to optimize". :) Real
hard-core people would go with the VIC-20, which has only 5120 bytes of RAM, or
the Atari 2600 with 128 bytes of RAM. One could imagine there's nothing you can do
with them but oh boy how wrong one would be! Heck, a single tweet is 140
characters. And you can fit that in 128 bytes. You really can... :)
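
One way the 140-characters-in-128-bytes claim can work out is by packing each character into 7 bits instead of 8. The sketch below assumes the tweet is plain 7-bit ASCII (real tweets may not be), and `pack7`/`unpack7` are hypothetical helper names made up for this illustration.

```python
def pack7(text: str) -> bytes:
    """Pack 7-bit ASCII characters back to back, 7 bits each."""
    bits = 0    # bit accumulator
    nbits = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("this sketch only handles 7-bit ASCII")
        bits = (bits << 7) | code
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
        bits &= (1 << nbits) - 1  # keep only the bits not yet written
    if nbits:
        # Left-align the leftover bits in a final, zero-padded byte.
        out.append((bits << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack7(data: bytes, count: int) -> str:
    """Inverse of pack7; count is the number of characters to recover."""
    bits = 0
    nbits = 0
    chars = []
    for byte in data:
        bits = (bits << 8) | byte
        nbits += 8
        while nbits >= 7 and len(chars) < count:
            nbits -= 7
            chars.append(chr((bits >> nbits) & 0x7F))
        bits &= (1 << nbits) - 1
    return "".join(chars)


tweet = "x" * 140
packed = pack7(tweet)
print(len(packed))  # 123
```

140 characters at 7 bits each is 980 bits, which rounds up to 123 bytes, so under this assumption the tweet fits with 5 bytes to spare.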

~~~
mkup
These days we have kilobytes of RAM on Arduino and other AVR boards.

------
js2
This is great. If you enjoyed it, you'll probably really like the (oft
recommended here) "Code: The Hidden Language of Computer Hardware and
Software" by Charles Petzold.

<http://www.charlespetzold.com/code/>

------
awy
Also a great relevant read:

How MOS 6502 Illegal Opcodes really work <http://www.pagetable.com/?p=39>

I just finished a 6502 emulator, and I recommend that everyone write one. Lots
of fun and very interesting.

