And each 8-bit CPU had its specialties, and it wasn't uncommon that computers were designed around those specialties.
For instance when you look at the video memory layout of Z80 computers like the ZX Spectrum or CPC, those often have a weird non-linear arrangement, which only makes sense with the special 16-bit register pairs on the Z80.
E.g. when the 16-bit register pair HL is used as a video memory address, the memory layout was often such that H and L could basically be used as X and Y coordinates. E.g. incrementing L gets you to the next character on the same line, and incrementing H to the first pixel line of the character line below.
The KC85/4 (East German home computer) even had a 90 degree rotated memory layout of 320x256 pixels, 8 horizontal pixels grouped in a byte, so the video memory was a matrix of 40 columns by 256 lines.
Put the column (0..39) into H, and the line (0..255) into L, and you have the complete video memory address (or rather offset: H must be 0x80 + column, because video memory started at 0x8000). Increment or decrement H to move to the next or previous column, and L to get to the next or previous line.
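To make the H/L trick concrete, here is a minimal Python sketch of the address arithmetic described above. The function and constant names are made up for illustration, and the only facts it relies on are the ones stated in this comment (40 columns, 256 lines, video memory at 0x8000).

```python
VIDEO_BASE = 0x8000  # start of video memory, as stated in the comment
COLUMNS = 40         # 320 pixels / 8 pixels per byte
LINES = 256


def kc85_4_address(column: int, line: int) -> int:
    """Return the video memory address for a byte column and pixel line.

    Hypothetical helper: H holds 0x80 + column, L holds the line.
    """
    if not (0 <= column < COLUMNS and 0 <= line < LINES):
        raise ValueError("column must be 0..39 and line 0..255")
    h = (VIDEO_BASE >> 8) + column  # high byte of HL
    l = line                        # low byte of HL
    return (h << 8) | l


def next_column(address: int) -> int:
    """Model of INC H: same line, one byte column to the right."""
    return (address + 0x0100) & 0xFFFF


def next_line(address: int) -> int:
    """Model of INC L: same column, one pixel line down (wraps within the column)."""
    return (address & 0xFF00) | ((address + 1) & 0x00FF)
```

For example, kc85_4_address(5, 10) gives 0x850A, and the bottom-right byte of the screen, kc85_4_address(39, 255), is 0xA7FF.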
edit: messed up the number of columns, it's 40, not 80 :)
The instruction set was very limited (in comparison to i386 etc.) so the learning curve was not steep and writing in assembler was, in practice, no more time consuming than using C now - once you were in the flow.
However, due to the resource constraints you not only had to DRY everything out into functions, as you would with C today, but often also had to manually modify parts of a function's machine code by poking in new values before calling it, to change its behaviour without wasting memory on branches etc.
For large writes people would often move the stack pointer around as PUSH BC etc. were faster than LD at writing an address and inc(dec?) the pointer. I seem to remember IX/IY being avoided as much as poss as they were quite costly.
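For what it's worth, PUSH decrements the stack pointer, so a PUSH-based fill writes downwards through memory. Below is a rough Python sketch of why the trick paid off, using the documented T-state counts for PUSH qq (11), LD (HL),r (7) and INC HL (6). The helper names are hypothetical, and the comparison ignores loop overhead and the cost of saving and restoring SP (and disabling interrupts) around the fill.

```python
PUSH_QQ_T = 11  # PUSH BC/DE/HL: writes two bytes, SP decremented by 2
LD_HL_R_T = 7   # LD (HL),r: writes one byte
INC_HL_T = 6    # INC HL: advance the pointer


def push(memory: bytearray, sp: int, hi: int, lo: int) -> int:
    """Model of PUSH: high byte goes to SP-1, low byte to SP-2; returns the new SP."""
    sp = (sp - 1) & 0xFFFF
    memory[sp] = hi
    sp = (sp - 1) & 0xFFFF
    memory[sp] = lo
    return sp


def t_states_push_fill(n_bytes: int) -> int:
    """T-states to write n_bytes with an unrolled run of PUSH instructions."""
    if n_bytes % 2:
        raise ValueError("PUSH writes two bytes at a time")
    return (n_bytes // 2) * PUSH_QQ_T


def t_states_ld_fill(n_bytes: int) -> int:
    """T-states to write n_bytes with unrolled LD (HL),r / INC HL pairs."""
    return n_bytes * (LD_HL_R_T + INC_HL_T)
```

Under those assumptions a 256-byte fill costs 1408 T-states with PUSH against 3328 with LD/INC, so a bit more than twice as fast before overhead.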
EDIT: The code in the article's addendum is OK. The author finally caught up with Z80 style after much trial and error. This is what idiomatic Z80 assembly looks like.
Z80 assembly tends to be very easy to read because you don't have to keep a mental map of what lives in which register, A really is the accumulator and the only register that you can use to do anything more complex than inc or dec to.
And see how much more effective the 6502 set is when it comes to empowering the various registers. The 6502 does not need 'shift' codes (slow) either.
I've programmed both, and even though they both have their charm I would prefer to code the 6502 for the same problem (and I'd much prefer to use the 6809 over either).
Didn't have much trouble storing things in B/C/D/E/BC/DE, at least I don't remember it as a problem. The innermost loops had to get top priority when deciding what each register was used for, that's all.
That’s not my experience. I once ported a program from the 6800 (the mother of the 6502) to the Z80. It was very straightforward. But to my surprise the version on the much “fancier” Z80 turned out to be both larger and slower, even though the Z80 ran at a slightly faster clock speed.
I'm also influenced a lot by my personal circumstances: although the Z80 generally took more cycles than the 6502, I moved from a 1MHz 6502 (C64) to a 4MHz Z80 (CPC), so in general the Z80 felt faster.
That's the disassembly of the whole ZX Spectrum ROM which has the built in BASIC interpreter with the floating point support.
Also, the article explicitly mentions it does more loads and stores than necessary (“In real life, values from one expression will remain in registers for the next, and so won't need to be reloaded; the examples are all deliberately choosing the worst possible case.”)
Finally, writing to memory wasn’t as bad in those days as it is today (or rather: using registers wasn’t as fast as it is today). Writing to a fixed address, for example, only took twice as long as a register-to-register move (4 cycles vs 2 cycles on a 6502, if I googled that correctly)
The other thing is there is a trade off between number of registers and instruction size. With 8 bit machines you see that for instance where only certain addressing modes can be used with certain registers. You don't have enough instruction space to encode for every addressing mode for all the registers you have.
I had no qualms at all about using memory for variables when convenient. I didn't have to use the stack at all in any of the assignments, as 16 registers and a handful of DBs sprinkled through the code for more locations to write to were enough – the disassembler in the debugger didn't like code interspersed with data, though.
This is how a lot of microcontroller architectures work, and indeed there are C compilers for them too.
It's misleading to call the Z80 "terrifyingly slow" based on cycle counts without mentioning that it clocked higher than the 6502. E.g. the Apple II ran at just over 1MHz, while the ZX Spectrum ran at 3.5MHz (although with wait states for accessing the memory area shared by the graphics hardware).
There was a 16MHz Z80: Amstrad briefly built a machine on it in the 1990s (the PcW16).
EDIT: Oh, the certificate is with the product? Well, there might be some smaller players who currently can't compete with TI who could handle the production side.
Hardware-wise it should definitely be easy enough to create a cheaper calculator just as capable of meeting the requirements, no?
However IIRC most contemporary architectures were closer to the Z80, so the 6502 was considered impressive because it was competitive despite a much lower clock rate.
LD A,(ADR1)    ; LOAD OP1 INTO A
LD HL,ADR2     ; LOAD ADDRESS OF OP2 INTO HL
ADD A,(HL)     ; ADD OP2 TO OP1
LD (ADR3),A    ; SAVE RESULT RES AT ADR3
I should go dig out an emulator soon! (I usually have a game or two of "Chaos: The Battle of Wizards" every six months or so.)
I would have enjoyed that kind of framing much more than "lol the z80 sux!"
* Not saying the z80 was implemented with all nand, just providing the figure as a reference.
There’s also a set of “alternate registers” you could swap in with exx, which sometimes enabled faster arithmetic without hitting memory.
One notable thing here is they are writing a code generator and not actually programming in assembly. As a result their modules need to be more general purpose than if directly programming.
LDA ($40,X) ; if X == 0, and $40 is your pointer.
I've just had a quick look at an instruction cycle table and it seems that without indexing the IX/IY instructions took 4 cycles more than their HL equivalents, and with an index that increased to 12 more.
Ref from search returning http://www.z80.info/z80time.txt
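A few concrete entries make the 4-versus-12 pattern visible. This is a small Python sketch with T-state counts taken from the standard Z80 timing tables. The dictionary is a hand-picked sample rather than a complete table, and the helper name is made up.

```python
# T-states for a handful of HL instructions and their IX counterparts.
T_STATES = {
    "LD HL,nn": 10,   "LD IX,nn": 14,
    "INC HL": 6,      "INC IX": 10,
    "ADD HL,DE": 11,  "ADD IX,DE": 15,
    "LD A,(HL)": 7,   "LD A,(IX+d)": 19,
    "LD (HL),A": 7,   "LD (IX+d),A": 19,
}


def ix_penalty(hl_form: str, ix_form: str) -> int:
    """Extra T-states the IX form costs over the HL form (hypothetical helper)."""
    return T_STATES[ix_form] - T_STATES[hl_form]


# Pairs of (HL form, IX form) to compare.
PAIRS = [
    ("LD HL,nn", "LD IX,nn"),
    ("INC HL", "INC IX"),
    ("ADD HL,DE", "ADD IX,DE"),
    ("LD A,(HL)", "LD A,(IX+d)"),
    ("LD (HL),A", "LD (IX+d),A"),
]

for hl_form, ix_form in PAIRS:
    print(f"{ix_form:12} costs {ix_penalty(hl_form, ix_form):2} T-states more than {hl_form}")
```

The non-indexed forms only pay for the extra prefix byte fetch, while the (IX+d) forms also pay for fetching the displacement and adding it to the index register.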
That was during my schooling years of 10, 11 and 12 prior to going on to study engineering.
While studying engineering, I then got to work with 6502 assembler, and while I have no doubt that the earlier Z80 experience helped greatly, I still remember thinking that writing assembler for the Z80 seemed so much easier than coding for the 6502.
My first job out of college was writing Z80 assembly language at Cromemco. Later we ported everything to the 68000. We didn't write anything in a higher level language because it was too slow. In fact the entire CDOS and Cromix OSs were written by a single person. He originally wrote everything in C, but when he ran it, it was far too slow. He then rewrote everything directly in Z80 assembly language and kept the C code as comments. The raw C code was the only commentary in the code.
I wrote the graphics drivers for screen and printer and a WYSIWYG word processor. There were no floating point processors. All math was done in the registers (as stated in the article). You can still render a lot of graphics by converting your renderer to additions and multiplications or divisions by 2 (register shifts left and right). I was happy to find years later that the code I derived to render circles and arcs using only 1-bit steps and multiplication by 2 had also been derived by someone else and published in graphics books. You live within limitations when that is your only option.
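The textbook version of this idea is usually called the midpoint (or Bresenham) circle algorithm. Here is a minimal Python sketch of it, which is not a reconstruction of the Cromemco code, just the well-known published variant. It uses only additions, subtractions, comparisons and a left shift by one, all of which map directly onto 8-bit register operations. The function name is made up.

```python
def circle_points(radius: int) -> set:
    """Integer points on a circle of the given radius centred at (0, 0).

    Midpoint circle algorithm: walk one octant and mirror it eight ways.
    Only add, subtract, compare and shift-left-by-one are used.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    x, y = radius, 0
    err = 1 - radius  # decision variable
    points = set()
    while x >= y:
        # Mirror the current octant point into all eight octants.
        for px, py in ((x, y), (y, x), (-x, y), (-y, x),
                       (x, -y), (y, -x), (-x, -y), (-y, -x)):
            points.add((px, py))
        y += 1
        if err < 0:
            err += (y << 1) + 1        # 2*y + 1 via a shift
        else:
            x -= 1
            err += ((y - x) << 1) + 1  # 2*(y - x) + 1 via a shift
    return points
```

A caller would add the centre offset to each point before plotting. On a real Z80 the set would of course be replaced by writing pixels directly as they are generated.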
Still, compared to the 6502, this one had lots of comfort.
I think the author might have meant to write that only A can be indirectly written to or from memory, but even that isn't correct, as demonstrated by the code bits that retrieve and store from and to RAM at (HL) (and IY/IX+blah). The 8080 was able to directly read and write HL. The Z80 could do it for IX, IY, and IIRC BC and DE too.