Why the Z-80's data pins are scrambled (righto.com)
137 points by zdw on Sept 26, 2014 | 37 comments

> The motivation behind splitting the data bus is to allow the chip to perform activities in parallel. For instance an instruction can be read from the data pins into the instruction logic at the same time that data is being copied between the ALU and registers.

Essentially pipelining, several years before the RISC movement popularised it? Could the Z80 have been one of the first pipelined single-chip CPUs?

That was a very interesting article. I've tried staring at the Visual6502 chip images for a long time, and although I understand the principles behind how diffusion/polysilicon/metal layers are put together to form transistors, for some reason I feel absolutely lost trying to follow the connections and find the borders between the regions, especially when one layer is hidden beneath another.

Even looking at the NOR gate with its layout side-by-side I can't see much beyond the metal layer, despite it being partially transparent. I have no problems with transistor-level schematics, however. Is there some sort of trick to being able to easily read and follow the circuitry in die images and layout-level diagrams? It's like some people can read these and visualise/draw the schematic immediately.

The 6502 also has a similar 1-stage pipeline, and the concept was old already at that point, though not used much commercially outside of Cray's designs for CDC.

> Essentially pipelining, several years before the RISC movement popularised it? Could the Z80 have been one of the first pipelined single-chip CPUs?

I don't think you can call it a pipeline; they only state that the different parts of the instruction are being decoded in parallel, for instance which instruction will be executed and the register associated with the operation. There is no instruction pipeline.

There was a JPL probe years ago (can't remember which, and can't seem to find a reference) that had a radiation hardened memory IC with error correcting codes and a system to detect and correct the bit flips that were expected due to cosmic rays.

After launch, the number of unrecoverable errors (due to multiple bits flipped within the same codeword) was higher than expected. It turned out that someone had swapped some combination of address or data lines, which ended up changing the physical grouping of bits within the codewords. Some of the bits within a logical codeword were so close together that a single event was able to flip both of them, causing the error correction to fail.
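The failure mode described there (one event flipping two bits in the same codeword) can be demonstrated with a toy SEC-DED code. The sketch below is a textbook Hamming(8,4) code, not the probe's actual ECC: a single flip is corrected via the syndrome, but two flips leave an even overall parity with a nonzero syndrome, which is detectable yet uncorrectable.

```c
/* Toy Hamming(8,4) SEC-DED sketch (illustrative, not the probe's code).
 * Bit positions 1,2,4 hold Hamming parity, 3,5,6,7 hold data,
 * bit 0 holds overall parity. */

static unsigned parity(unsigned v) {            /* parity of all set bits */
    unsigned p = 0;
    while (v) { p ^= v & 1; v >>= 1; }
    return p;
}

static unsigned encode(unsigned data) {         /* 4 data bits -> 8-bit codeword */
    unsigned d1 = data & 1, d2 = (data >> 1) & 1,
             d3 = (data >> 2) & 1, d4 = (data >> 3) & 1;
    unsigned cw = ((d1 ^ d2 ^ d4) << 1) | ((d1 ^ d3 ^ d4) << 2) | (d1 << 3)
                | ((d2 ^ d3 ^ d4) << 4) | (d2 << 5) | (d3 << 6) | (d4 << 7);
    return cw | parity(cw);                     /* overall parity in bit 0 */
}

/* returns 0 and writes corrected data, or -1 on an uncorrectable
 * double-bit error */
static int decode(unsigned cw, unsigned *data) {
    unsigned s = parity(cw & 0xAA)              /* checks positions 1,3,5,7 */
               | (parity(cw & 0xCC) << 1)       /* positions 2,3,6,7 */
               | (parity(cw & 0xF0) << 2);      /* positions 4,5,6,7 */
    if (parity(cw))                             /* odd flips: fix position s */
        cw ^= 1u << s;
    else if (s)                                 /* even parity, bad syndrome */
        return -1;                              /* -> two flips, give up */
    *data = ((cw >> 3) & 1) | (((cw >> 5) & 1) << 1)
          | (((cw >> 6) & 1) << 2) | (((cw >> 7) & 1) << 3);
    return 0;
}
```

With the bit grouping changed by mis-wired lines, physically adjacent cells can land in one codeword, and the double-flip branch is exactly what the probe kept hitting.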

Back in the days of hand made printed circuits, I randomly assigned both data and address pins on a microprocessor circuit, and got everything onto a single side with just one or two jumpers.

I felt so clever. Then I remembered that the program in the ROM assumed a particular bit numbering, literally while my board was bubbling away in the ferric chloride. Oops.

Rather than re-design the board, I thought about writing a program to rearrange my binaries, or make a socket adapter for the EEPROM programmer. The socket adapter won out.

That's kind of how the TMS1000 microcontroller (used in the Speak n Spell) works. Instead of incrementing the program counter on each instruction (like every normal processor), they saved a few gates by using a linear feedback shift register. The result is the program counter goes through a pseudo-random but predictable sequence. So they just program the code into the ROM in the same sequence and everything works just fine. (Some day I'll write a blog post about this, since it's interesting to look at the silicon that does this.)
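The LFSR-as-program-counter trick fits in a few lines. This is a minimal model, not the TMS1000's actual circuit (the real counter's width and feedback taps may differ); it uses a maximal-length 6-bit LFSR, recurrence x^6 + x + 1, which visits all 63 nonzero states before repeating.

```c
/* Sketch of an LFSR program counter (TMS1000-style idea; the real
 * part's width and taps are not guaranteed to match). */

static unsigned next_pc(unsigned pc) {
    unsigned fb = ((pc >> 5) ^ (pc >> 4)) & 1;  /* feedback from top two bits */
    return ((pc << 1) | fb) & 0x3F;             /* 6-bit shift register */
}

/* Lay out a 63-step program so that fetching along the LFSR sequence
 * executes it in the intended order -- the "assembler" just follows the
 * same pseudo-random walk the hardware will take. */
static void assemble(const unsigned char *prog, unsigned char *rom) {
    unsigned pc = 1;                            /* any nonzero start state */
    for (int n = 0; n < 63; n++) {
        rom[pc] = prog[n];
        pc = next_pc(pc);
    }
}
```

Note the all-zeros state is a lockup (it maps to itself), which is why the sequence starts from a nonzero value and covers 63 of the 64 addresses.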

The CIC (copy-protection chip) in the NES used a tiny 4-bit MCU that also had an LFSR for a program counter:


That sounds killer. I can't wait to see what you post. One of the things I find fascinating is the limitations that each generation of engineers deals with in order to get the next problem solved, or the next product out the door. Those were the days when bits and megahertz were expensive.

This is a very weird and specific question I know, but I figured I'd ask anyway because you seem to have experience with this chip.

In 'Halt and Catch Fire,'[0] one of the characters loads his children's names onto a Speak n Spell's memory. He's portrayed as a very talented engineer, and a lot of the show seems to be pretty true to the tech.

My question is, would this be possible for someone with a lot of patience, experience, and a home workshop at that point, or is it an apocryphal story?

[0] An AMC TV series about an early PC startup in the 80s

I'd say its possible but unlikely that someone reprogrammed their Speak n Spell. The first tricky thing is the TMS5100 voice synthesis chip uses a complex LPC-10 encoding to encode the sound with a very low bit rate (1100 bits/sec). Basically it's modeling the filter characteristics of the vocal tract. So the first problem is you have to convert your audio signal into this representation, which is going to be really, really hard unless you have access to the TI system that does this conversion.

The second problem is the speech data is stored in a TMS6100 ROM which is kind of a strange chip: the 14-bit address is loaded 4 bits at a time, and then the ROM steps sequentially through memory from there. The point is that you can't reprogram this chip (since it's a ROM), and emulating it with a standard EPROM would be a big pain.
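A rough model of that addressing scheme, going only by the description above (the nibble ordering and interface details are assumptions for illustration, not taken from the datasheet):

```c
#include <stdint.h>

/* Hedged sketch of TMS6100-style addressing, not a cycle-accurate model:
 * the 14-bit address arrives 4 bits at a time (low nibble first here --
 * an assumption), after which reads step sequentially through the ROM. */
typedef struct {
    const uint8_t *rom;     /* 16 KB speech data image */
    unsigned      addr;     /* current 14-bit address */
    int           nibbles;  /* how many address chunks loaded so far */
} tms6100;

static void load_addr_nibble(tms6100 *c, unsigned nib) {
    c->addr |= (nib & 0xF) << (4 * c->nibbles++);
    c->addr &= 0x3FFF;      /* keep it to 14 bits */
}

static uint8_t read_next(tms6100 *c) {
    return c->rom[c->addr++ & 0x3FFF];  /* then step sequentially */
}
```

A standard EPROM has no such load-then-step interface, which is why emulating it takes extra glue logic rather than a drop-in replacement.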

I should point out that I don't have firsthand experience with these chips (apart from using a Speak n Spell years ago). But I happen to have been studying them in detail a couple weeks ago for random reasons.

For more information on this chipset, the datasheets are at http://www.datasheet-pdf.com/datasheet-html/T/M/S/TMS5100_Te... and http://www.ti99.com/exelvision/website/telechargement/tms610...

I just did a search and found someone who hacked new words into a Speak n Spell a couple years ago. But he needed to use a CPLD (like an FPGA) to simulate the ROM, and a Windows LPC encoding program, so this wouldn't have been possible in the 80s. http://furrtek.free.fr/index.php?a=speakandspell&ss=1&i=2

Hmm... his wife worked at TI as an engineer IIRC. So it's a tenuous, unlikely, made-for-tv kind of possibility indeed.

It's gotten me interested in circuit bending again, however.

Looked it up and that show is set in 1983 so it does seem rather implausible but not impossible of course. :) Here's a good run through using the Windows LPC encoding program. http://www.youtube.com/watch?gl=GB&v=wVDE-6TtmFQ

In the show, one of the characters works for TI, so has access to all kinds of internal tools for the task. It's not unrealistic to imagine they could do it; the details of how were omitted, though.

There's a good chance that something like the Speak n Spell during that era stored its program in mask-programmed ROM. That's where the program is fabricated into the IC itself, and specified when you order a batch of chips.

A friend and I were doing layout on a Z80 system. Dual-sided PCBs, so dense layout was (relatively speaking) a PITA. My friend just randomly hooked up both the address and data lines to the boot ROM. This freaked our boss out. He could understand the concept of rearranging data lines, but was concerned about rearranging address lines.

I wound up writing a C program that took in pristine "Intel HEX" (i.e. from the assembler) and swizzled both the address and data to correct for the board layout. I then spit out corrected Intel HEX. Simple, almost trivial.
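The swizzling itself boils down to two bit permutations. A minimal sketch (Intel HEX parsing omitted; the amap/dmap wiring tables are hypothetical examples, not anyone's real board):

```c
#include <stdint.h>

/* amap[i] / dmap[i] give the ROM pin that CPU line i actually reaches on
 * the (mis-wired) board. Storing permute(data, dmap) at permute(addr, amap)
 * makes the ROM read back correctly through the scrambled wiring. */

static unsigned permute(unsigned v, const int map[], int nbits) {
    unsigned out = 0;
    for (int i = 0; i < nbits; i++)
        if ((v >> i) & 1)
            out |= 1u << map[i];
    return out;
}

/* swizzle a whole image: in/out are 2^abits byte arrays */
static void swizzle(const uint8_t *in, uint8_t *out,
                    const int amap[], const int dmap[], int abits) {
    for (unsigned a = 0; a < (1u << abits); a++)
        out[permute(a, amap, abits)] = (uint8_t)permute(in[a], dmap, 8);
}
```

As long as both maps are true permutations, the transform is invertible, which is what makes the whole scheme safe for ROM contents (if not for the sanity of whoever maintains the board).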

Interesting that you were designing single sided PCBs. I certainly wouldn't have wanted to do that layout for anything at all complex. Hopefully you had high volume production to make it pay off.

OTOH, I'm surprised you opted for a socket adapter. Way too much hassle compared to a simple C program.

> I have been reverse-engineering the Z-80 processor using images and data from the Visual 6502 team.

Many moons ago, I heard a seventh-hand rumor that the guy doing the layout of the Z-80 chip had a nervous breakdown, because of the difficulty of the work.

I have no idea if there was any truth whatsoever to that, but I'm glad to find it's not a Langford Blit thing which maddens those who see it ;-)

The Computer History Museum's oral history of the Z-80 has some interesting stories about the layout (but doesn't mention any nervous breakdown). The Z-80 project hired some layout draftsmen, but they were slow and the chip was running behind schedule. So CEO Federico Faggin starts helping with the layout and ends up doing 3/4 of the chip layout himself after 3 1/2 months of 80-hour weeks. "You know, a CEO doing layout draftsman job was not something that would be normal, but that’s what I had to do, and I did it." By comparison, the simpler 8080 took six months of layout.


Having laid out a chip by hand, I can see that happening.

So, I'm thinking: to connect a memory chip you just don't care, and you can swap the lines however you want (as long as you connect the 8 data pins to the memory's 8 data pins).

For I/O you do care, of course, or you "just" shuffle all the data you want to write (which is a sure way of making someone go crazy).

For old fashioned RAM and ROM chips, you'd be OK, and you're probably generally OK with data pins.

Address pins are another matter. The Z80 had a built in static RAM refresh circuit, and there is some schtick about which addresses are refreshed as a group (rows or columns, I forget which). So, rearranging the address bus might result in some nasty surprises. And it might get even more interesting with more modern memory devices, which are way over my head.
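A quick simulation shows one of those nasty surprises. This sketch assumes a 7-bit refresh counter driven out on A0..A6 feeding a 4116-style DRAM with 7 row-address bits; the amap wiring table is a made-up example of a board permutation:

```c
/* If the board permutes address lines, the DRAM sees the Z80's refresh
 * counter through that permutation. Count how many distinct 7-bit rows a
 * full 128-cycle refresh pass actually touches. */

static unsigned route(unsigned v, const int amap[], int nbits) {
    unsigned out = 0;
    for (int i = 0; i < nbits; i++)
        if ((v >> i) & 1)
            out |= 1u << amap[i];  /* counter bit i lands on DRAM pin amap[i] */
    return out;
}

static int rows_refreshed(const int amap[]) {
    int seen[128] = {0};
    int count = 0;
    for (unsigned r = 0; r < 128; r++) {          /* the refresh sequence */
        unsigned row = route(r, amap, 8) & 0x7F;  /* row bits the DRAM latches */
        if (!seen[row]) { seen[row] = 1; count++; }
    }
    return count;
}
```

With straight wiring every row is hit; cross A6 with A7 and half the rows are never refreshed at all, and those cells quietly decay.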

On a whim, I got the Howard Sams book on the Z80 while I was in high school, around 1981, and I devoured it.

Modern DDR2 SDRAM buses are a bit more involved. They use the address lines as a command word for putting the chips into the correct "link training" mode at startup, selecting burst access lengths, enabling self-refresh mode, setting on-die termination values, et cetera, so they may not be swapped. Each "byte lane" of 8 data lines is allowed to have a different signal path length difference between clock and data (which is measured during training for compensation during operation), and signals may be swapped arbitrarily within the byte lanes.

Furthermore, the high-performance DDR3+ controllers typically hash the data word with the address so that when a repetitive data stream is transmitted it doesn't generate more EMI. (Some controllers also hash with a random seed gaining resilience against chilling the DIMMs of a running machine and reading them out on another machine in search of sensitive data.)
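A toy version of that scrambling idea, with an arbitrary multiplicative hash standing in for whatever polynomial a real controller uses (both the hash and the seed handling here are purely illustrative):

```c
#include <stdint.h>

/* XOR each data word with a mask derived from its address and a boot-time
 * seed, so a repetitive data stream no longer repeats on the wires (less
 * EMI) and a cold-boot memory dump on another machine sees noise unless
 * it knows the seed. Toy sketch, not a real controller's scrambler. */
static uint32_t scramble(uint32_t data, uint32_t addr, uint32_t seed) {
    uint32_t mask = (addr ^ seed) * 2654435761u;  /* Knuth multiplicative hash */
    return data ^ mask;
}
/* XOR is an involution, so the same call descrambles on the way back in */
```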

I find it really cool that any time you change the DIMMs in your computer, it essentially has to measure the length of the wires to the ICs on it. (I've found it less cool to have to manipulate timing values to compensate for deviation from PCB design rules, but thankful that it's possible. The fun of board bring-up.) If your BIOS has a "fast boot" option, mostly that means it remembers the wire lengths from last time so it doesn't have to do the measurement again every boot.

In the "good old days" it was certainly possible to do board design using 2x or 4x sized Bishop Graphics tape for lines, vias, etc. You applied them to transparent mylar sheets corresponding to board layers.

But now, sheesh! You need to carefully constrain the PCB CAD program so that all the lines match to within 0.1" or less. And, as you mention, that's just the tip of the design iceberg.

It's no longer possible to layout computers at low cost in a garage. Oh, well. Now hipsters sit around in open offices in SOHO and create silly apps.


Notice the reason why! In fact, I think the idea of a startup producing laptops targeted at developers has been mentioned before here on HN.

Not the kind of laptop the link I posted is talking about, which is x86-based. And I found the HN comment where the idea was mentioned BTW: https://news.ycombinator.com/item?id=7079053

Good lord my head is spinning now. For me, microcontrollers with built in memory pretty much eliminated all of those considerations, but it's fascinating to know the level of sophistication achieved today.

I totally agree about micros. They are wonderfully nice and tidy for small, well defined problems. I've had extremely satisfying success using them for communications glue, motor controllers and thermoelectric cooler controllers.

I'm in a chatty mood tonight, so I hope you'll excuse my indulgence of another few 'graphs on modern memory bus stuff I think is cool that I had to learn under product ship-date duress: Dynamic Termination.

Remember the high-school physics demonstration of the "reflected wave" where you grab one end of a rope that's tied off at the other end, give it a flick, watch the wave travel from your hand to the knot, then reflect back? Well kind of the same thing happens with voltage level on high speed data signals.

If your high school was fortunate enough in these (or those) fiscally troubled times to afford the equipment, the next demonstration tied off the same rope to some sort of damping spring that absorbed your wave's energy and didn't reflect it back at you. Those springs translate to "termination impedance" in electrical circuits.
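The rope analogy translates directly into the reflection coefficient at a load: gamma = (Zl - Z0) / (Zl + Z0), for a line of characteristic impedance Z0 terminated with Zl. A matched spring (termination) absorbs everything; a knot (open or short) reflects everything, with the short inverting the wave.

```c
/* Reflection coefficient at a load: 0 means matched (no reflection),
 * +1 an open circuit (full reflection), -1 a short (full, inverted). */
static double reflection(double zl, double z0) {
    return (zl - z0) / (zl + z0);
}
```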

So in order to avoid reflections, the address/control/clock lines of the memory bus "fly by" each and every DRAM IC until they reach the end of their path, where they "terminate" via resistors into a voltage potential half-way between the 1 and 0 levels (so that it's no more work to drive a 1 than a 0, or vice-versa), preventing reflections.

The data lines are wired directly from the memory controller to the DRAM chips so the DRAM chips have termination built-in to absorb the energy at the end of the line and prevent reflections.

Except that maybe there's more than one DIMM. DDR3 chips let you command them to connect or disconnect their termination resistors. So when you are accessing the closer-to-the-controller DIMM, the next one that's further out can apply termination. And when addressing the one that's furthest out, you disable on-die termination and rely on the actual resistor on the motherboard at the end of the bus.

My hard-learned lesson: the signals don't do so well when you get confused about which rank of chips is closer and which is farther away.

I understand getting signaling to work in the frequencies we use is hard (when I was in college a lot of people doubted we could go above 50 MHz), but I can't shake the feeling the complexities we have on modern computers are part of a very elaborate Monty Python-esque prank to make our work look like magic to muggles... :-)

My school couldn't even afford the rope. ;-)

I remember reading about the PCI bus, which I think is the first time I heard of a mainstream digital system requiring serious transmission line design.

Actually, one of the selling points of PCI in contrast to competing standards was that it was significantly easier and cheaper in terms of transmission-line design, due to reflected-wave switching (i.e. no termination); this essentially means that only the steady state is relevant for signaling. While not good practice by any measure, this means the PCI bus does not have to be physically a linear bus, but works with arbitrary trees that do not contain excessively long paths (see for example aftermarket right-angle multi-slot PCI brackets for SFF and rackmount systems, which bus most of the PCI signals for multiple slots together and only break out REQ#, GNT# and IDSEL through a few wires to the other original board slots).

DRAM refresh circuit, not SRAM. The "S" in SRAM means that it doesn't need to be refreshed.

Yes, you're right. Thanks!

IIRC 7-bit refresh. Good for 4116's; 4164's needed extra logic.

7 bit, indeed. But there were variants of the 64k DRAMs that only required 7 bit refresh, specifically for this use case.

Amazing analysis. Another reminder we're all standing on the shoulders of giants every time we whip out our phones carrying billions upon billions of gates...

It's giants all the way down.

While laying out a full-custom SPI bus for an IC, I had to do the same thing. It simply made sense in that case to sacrifice the programming model for layout simplicity.

