Later I did some 8086 assembly programming. I learned why a generation of programmers hates assembly... only a few special-purpose registers, segmented memory, a long list of non-orthogonal instructions, ... yuck
I blame the general aversion to assembly programming on the choice of the 8086 for PCs.
The auto-increment/decrement addressing modes were aware of the element size in bits. Thus you could have polymorphic subroutines where the same code could be used to sum an array of bytes, words, 13-bit quantities, or whatever size word you wanted (up to 32 bits).
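To make that concrete, here's a minimal C sketch of the idea (my own illustration, not the chip's assembly): because the "pointer" is a bit address and the increment is the element width, a single loop handles any field size.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustration only: sum `count` fields of `width_bits` bits (1..32) packed
 * back-to-back in a buffer. The bit-granular position plays the role of the
 * auto-incrementing address. */
uint32_t sum_bitfields(const uint8_t *buf, size_t count, unsigned width_bits)
{
    uint32_t sum = 0;
    size_t bitpos = 0;                       /* bit-granular "address" */
    for (size_t i = 0; i < count; i++) {
        uint32_t value = 0;
        for (unsigned b = 0; b < width_bits; b++) {
            size_t p = bitpos + b;
            value |= (uint32_t)((buf[p / 8] >> (p % 8)) & 1u) << b;
        }
        sum += value;
        bitpos += width_bits;                /* auto-increment by element size */
    }
    return sum;
}
```

The same call sums bytes (width 8), words (width 16), or odd sizes like 13 bits; only the width argument changes.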
Since the processor had built-in circuitry to help drive graphics displays, it took a high input frequency of something like 40 or 50 MHz and actually divided that down to run the processor at around 10 MHz or so. That's the opposite of what we're used to now, and it made the darn things look scary (well, scarier) from an emulation programmer's perspective.
The memory interface was designed to work with shift-register VRAM. The idea there being that the VRAM chips had a built-in shift register 512 pixels wide. The display circuitry would make use of it by having each line from the frame buffer dumped into the shift register as the raster was moving down the screen. And then the shift register would clock out the pixels as the raster moved across each line.
During VBLANK (when the video circuitry is waiting for the raster to return to the top of the screen) you could use special instructions to load and store the shift register, which would be as fast as any load/store operation. The entire frame buffer could be filled very quickly with any repetitive pattern, or erased entirely by copying some fixed line to all the others.
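In plain C terms (just an illustration, not the chip's shift-register instructions), the "copy one fixed line to all the others" erase looks like this:

```c
#include <stdint.h>
#include <string.h>

/* Clear (or pattern-fill) a frame buffer by replicating one prepared line.
 * Dimensions and layout are assumptions for the sake of the example. */
void fill_frame(uint8_t *fb, size_t line_bytes, size_t lines,
                const uint8_t *template_line)
{
    for (size_t y = 0; y < lines; y++)
        memcpy(fb + y * line_bytes, template_line, line_bytes);
}
```

On the real hardware each line copy was the nearly-free shift-register load/store described above.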
Lotta cool stuff in that beast.
There was also a built-in Bresenham line-drawing instruction.
Then there was the TMS34082 floating point coprocessor. I essentially wrote a primitive OpenGL'ish pipeline API. That was fun.
Wow... I really wish a general purpose processor like this made it into the PC. The best stuff doesn't always win in the marketplace.
And yes, Intel's instruction sets are so butt-ugly they make everybody else's look beautiful by comparison. I really wish almost any other chip had won the war for that reason. I'm encouraged by the rising profile of ARM.
I had lots of fun coding for it, initially with as86 and then TASM.
The only thing I hated with the x86 was trying to use AT&T syntax a few decades later.
I also coded for Z80, 68000 and MIPS.
Then you had real mode vs. protected mode, which used the same segment registers in completely different ways -- a clever way to be backwards compatible -- long story. In real mode you have a variety of different memory models: tiny, small, large, ... -- near and far pointers? -- ugh. 20-bit physical addresses from (segment << 4) + offset. Uggggllllyyyyy....
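For anyone who never had to do it, here's the real-mode address arithmetic in a few lines of C (a sketch; note it's an addition, not an OR, since the offset overlaps the shifted segment):

```c
#include <stdint.h>

/* 16-bit segment shifted left 4 bits, plus 16-bit offset, masked to the
 * 8086's 20 address lines (addresses wrap at 1 MiB). */
static uint32_t real_mode_phys(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFF;
}

/* Many segment:offset pairs alias the same byte, e.g.
 * real_mode_phys(0xB800, 0x0000) == real_mode_phys(0xB000, 0x8000) == 0xB8000 */
```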
Forget about the x87 floating point processor -- arguments go on an 8-level-deep stack -- really? I have to push and pop just to do floating point?
Now you have AMD64. But all that old stuff is still there, taking up transistors!
It was nice though: I wrote a GNU binutils backend for it and an X11 driver for the chip.
I loved 8086 assembly after coming from the Z80. And one of the reasons it was so fast was its minimal register set.
And 8086 was released 10 years earlier.
Well, nowadays you can use LLVM instead and achieve portability, at perhaps a small cost in efficiency.
The TMS9918 was the first chip to refer to overlaid graphical objects as "sprites." It was used in a multitude of early computers and game consoles, like the MSX1, ColecoVision, Sega's SG-1000, and of course TI's own 99/4 (where it was paired with a TMS9900 CPU). It also served as the basis for the video controllers in the Sega Master System and Genesis/Mega Drive.
The TMS9918 is also interesting because it was one of the few off-the-shelf video generator chips ever produced. (There were a couple others, like the MC6847 used by the CoCo, but almost every other home computer/console of the era used custom silicon for video generation.)
Yes, that's exactly how it worked. The 9900 only had one internal register, which pointed at the current "register bank" in main memory. I worked at TI in those days and wrote code for the 9900. It wasn't a crazy idea when the chip was designed; after all it made context switches completely free. But after the chip went into production, the speed differences between CPUs and DRAM started becoming obvious.
At that time memory could be nearly as fast as, or even faster than, the CPU; in fact the CISCs that were popular left the memory bus mostly idle while they executed instructions internally, which is what let the relatively memory-bandwidth-hungry RISCs become viable.
In fact I'd almost bet that, had memory always been slower than the CPU, RISC would've never been invented.
"So far so good - it seems unusual to our modern “memory is slow” mindset that the processor touches RAM every cycle, but this is from an age where processors and RAM were clocked at the same speed."
This page also shows that the 6502 happily did extra memory reads.
In fact, the memory had bandwidth to spare. http://www.6502.org/users/andre/osa/oa1.html#hw-io:
"The memory access is twice as fast as the CPU access, so that during Phi2 low the video readout is done and at Phi2 high the usual CPU access takes place"
I remember reading about a setup with two 6502s both running from the same single-ported RAM at full speed, but cannot find it.
I believe that the PDP-10 (well, some versions) had the first few memory locations equivalent to registers.
The AT&T Hobbit (a.k.a. CRISP) chip had a stack pointer and, essentially, aggressively cached memory around the stack. Once cached, stack-relative memory operations were as fast as registers. (The Apple Newton was going to use the Hobbit, but switched to ARM when it became clear that AT&T wasn't truly interested in committing to consumer-grade pricing for the CPUs.)
Essentially. Actually the registers were addressable as the first 16 addresses in memory on all models. On the PDP-6 and the first PDP-10 (the KA model) the registers were fast semiconductor devices (DTL, I believe, since it predated TTL) while the rest of memory was literally core (convenient for when the power went out, as happened occasionally in Cambridge -- whatever process was running died, since the registers were lost, but everything else was in core, so the machine could just be restarted).
Since they were addressable you could run code out of them, like bootstrap routines or some deranged TECO code I once wrote. On the other hand, any word of memory could be used as a stack pointer (two addresses fit in a word, so one half was the base and the other half the depth).
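Roughly, in modern terms (field names and widths are my approximation of that convention, not exact PDP-10 semantics):

```c
#include <stdint.h>

/* One 36-bit word modeled as two 18-bit halves: the right half addresses
 * the current top of stack, the left half tracks the depth/room. */
typedef struct {
    uint32_t count;   /* left half */
    uint32_t addr;    /* right half */
} stack_word;

/* A PUSH-like step: bump both halves, then store through the address. */
void push(stack_word *sp, uint64_t *memory, uint64_t value)
{
    sp->count = (sp->count + 1) & 0777777;   /* 18-bit halves wrap */
    sp->addr  = (sp->addr  + 1) & 0777777;
    memory[sp->addr] = value;
}
```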
It was quite a RISC-like, highly symmetrical architecture for its time and a pleasure to program. I still miss it.
So not totally crazy, but a severe performance limiter unless you can afford the complexity of the standard trickery.
TI did make a personal computer that competed with the likes of the VIC-20. It was sloooooow, but had nice (for the time) color graphics.
In the context of something vaguely like an SoC where you can make "0 page" memory registers fast with a small bank of high speed SRAM it can make sense, particularly for decoupling manufacturing defects or silicon production processes.
Of course for modern, potentially out of order and speculative branch predicting, pipelined instruction systems this is a horrid idea.
This made instructions simpler, as register instructions did not need a distinct addressing mode.
I always thought the workspace-pointer-to-register-set could make for some easy multitasking context switches. You just change the workspace pointer and immediately you're working in another context.
In practice it was slow though compared to processors with real registers.
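The scheme is easy to sketch in C (illustrative only, not TMS9900 code; the 9900's BLWP/RTWP instructions did the pointer swap plus PC/status save in hardware):

```c
#include <stdint.h>

/* Each task's "registers" R0..R15 are just a 16-word block in RAM, reached
 * through the workspace pointer. The CPU's real state is only that pointer
 * plus PC and status, so a switch is little more than changing the pointer. */
typedef struct {
    uint16_t *workspace;   /* the task's R0..R15 live at workspace[0..15] */
    uint16_t  pc;
    uint16_t  status;
} task;

static task *current;

void context_switch(task *next)
{
    /* No 16-register save/restore loop: the old task's registers are
     * already sitting in its workspace in memory. */
    current = next;
}
```

Of course, the flip side is exactly the slowness mentioned above: every "register" access is a memory access.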
Actually the 8088 really had a huge advantage: it was easy to port CP/M apps to the IBM PC. Even if the 68K had been ready to go, the 8088 was probably the better choice.
IIRC the 68k was an expensive chip, period. It was the chip you used if you had money to burn.
Fitting twice as many chips on the board is probably a pain too. (And suppose you go for the 2 x quarter capacity option - now you need 4x as many if you want the same amount of RAM!)
I think around 1979-1980 an Apple II with 16K was like $800. One with 48K was $1900. And they weren't adding any markup on the memory. Much different from today, where the cost of DRAM is a much smaller fraction of the total cost.
There is a slower kid in every family. :)
Note that both the 8086 (1978) and 68000 (1979) were introduced ahead of, respectively, the 8088 (1979) and 68008 (1982). Basically these 8-bitsters were probably kind of a cost reduction following a familiar pattern in the hardware industry: product catches on, then customers want to put it into more and more things that are cheaper and cheaper, with simpler boards, where big MIPS aren't needed.
(Those reasons are not mutually exclusive, of course.)
The 68000 would have been more of a problem since it moved from the matched memory and clock cycle scheme of the 6800 to a four-clock-cycle scheme with a complicated handshake. A special memory mode and two extra pins made it talk just fine to the 8-bit I/O chips. There was no need to wait for the 68008 for that.
One huge mistake that was made in the 8088 and 68008 (and I will suppose the TMS9980 as well, though I haven't checked) was that they didn't have a simple way to take advantage of page mode access in DRAMs like the original ARM did. If they had, the gap in performance compared to the 16 bit bus models would have been smaller.
The Mac came out three years later.
One can see reasons for C's design tradeoffs if you're worried about machines like that.
But even after you abuse every trick in the book it's hard to see how that machine isn't hobbled by its lack of memory.
I remember reading that the first IBM PC came with 16K. A former colleague of mine once reminisced about programming on a PC in the late 1980s that only had 256K of RAM (although I think that must have been an old or low-end machine).
Probably 16 64K×1 DRAMs.
> IIRC the 68k was an expensive chip period.
Fuzzy memory, but the 68000 was a 64-pin ceramic package. I remember comments that the IC testers of the day didn't have enough I/O to test them. That upped the cost as well.
That's true, but that doesn't change the principle. Other processors (like the 6809 for instance) used the same model and could relocate the 'zero' page.
> Most instructions (boolean and ALU) only leave results in the accumulator, not RAM.
Yes, but that's the way this is supposed to work. A, X and Y are scratch with the real results held in 'quick access' zero page variables.
> (The zero page is basically a parallel set of instructions with eight zeros in the upper 8 bits of the address).
Yes, and this was explicitly designed in such a way to offset the rather limited register file of the CPU.
In the 6809 it was called the 'direct' page, and in that form it was a lot more usable since you could do a complete context switch with a single load (which the 6809 operating system OS-9 used to good effect).
But of course there was only one ALU, and to use it you had to use the Accumulator register.
No, it couldn't. Even a NOP was 2 cycles. Memory access is at least 3 cycles for a zero-page read (read opcode, read immediate byte, read data), or more for the more complicated addressing modes.
Suppose it executes a zero page read instruction. It reads the instruction on the first cycle, the operand address on the second cycle, and the operand itself on the third. 1 byte per cycle.
(For a NOP, it reads the instruction the first cycle, fetches the next byte on the second cycle, then ignores the byte it just fetched. I think this is because the logic is always 1 cycle behind the next memory access, so by the time it realises the instruction is 1 byte it's already committed to reading the next byte anyway and the best it can do is just not increment the program counter.)
The solution is either to have a hybrid, such as 8-bit data and a 16-bit address, or to use some kind of memory management unit. So the 8088/8086 had a sort of segmented MMU built in, while many 8-bit computers added external MMUs to break the 64KB barrier (MSX1 machines could have up to 128KB of RAM, while the MSX2, still Z80A based, could have up to 4MB per slot).
And a lot of embedded 8051 designs used one of the 8 bit ports to extend the address space from 16 to 24 bits. I think both common C compilers for the 8051 supported that memory model.
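A hypothetical sketch of that banking trick (the names are mine, not a real 8051 compiler's far-memory intrinsics; the "hardware" is simulated with an array): the spare port latches address bits 16..23 and the normal 16-bit external bus supplies the rest.

```c
#include <stdint.h>

static uint8_t ext_memory[1UL << 24];   /* the full 16 MB external space */
static uint8_t bank_port;               /* spare 8-bit port driving A16..A23 */

static void write_bank_port(uint8_t bank) { bank_port = bank; }

static uint8_t xdata_read(uint16_t addr16)           /* MOVX-style access */
{
    return ext_memory[((uint32_t)bank_port << 16) | addr16];
}

uint8_t far_read(uint32_t addr24)
{
    write_bank_port((uint8_t)(addr24 >> 16));         /* select the 64 KB bank */
    return xdata_read((uint16_t)(addr24 & 0xFFFF));   /* offset within the bank */
}
```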
Also, if I remember the 68000 correctly, indirect addressing was a 16-bit register + 32-bit constant. Definitely not a 'pure' memory model. Though the 808x was far uglier.
The AMD 2900 went one better by having a huge register file; calling a function moved a pointer into a large stack of fast on-chip memory.
For its time the 9900 could have been revolutionary. But the architecture of the TI-99/4A was odd in its own right.
... the Motorola 68000 was a very successful chip!!
But it still never came anywhere near the sales volume of the x86. And it was abandoned by Motorola by the early '90s (when they started down the RISC road with the 88000 and then PowerPC).
There's virtually no inside information written down about the history of this chip, so this was pretty fascinating.
Towards the end, you could pick them up for $200, which was unheard of for that class of PC. Easy to see why that would draw an adoring audience while frustrating TI.
If IBM, and not Microsoft, had controlled MS-DOS, Windows, and so on, the computing world would now be a different environment.
IMHO probably a much worse world. Microsoft might have been the "evil empire", but IBM was the original.
There was no Resource Workshop for designing dialog boxes visually. You had to write all the statements in a text editor, and wouldn't see the appearance until compiling and running the program. This slowed down development.
While debugging, I recall trying to step into the GUI part of the application code. This locked up the computer so badly that we had to re-install OS/2, a tedious process requiring 22 diskettes.
The IBM documentation was so inscrutable that even for ver. 2 of OS/2, we had to refer to Microsoft's ver. 1 manuals. IBM's support gave an interesting insight into OS/2's downfall. My client had an expensive OS/2 support contract with IBM. The client's contacts were two guys in the PC support section. When I required technical support to answer some obscure and specialized questions about OS/2, it didn't make sense to get the client guys involved, even though they were supposed to be the only official go-between with IBM. So I called IBM with the client's authorization, and impersonated one of the support techs. After a few calls I slipped and used my name. The IBM guy was livid and tore a strip off me, saying that as an independent contractor, I had no business blah blah. Here I was just trying to get a customer up & running.
Authors of their own misfortune.
IBM also tried their luck at a Smalltalk-like IDE for C++, but it was too heavy for the typical hardware configurations of those days.
It was the fourth version of VisualAge for C++ for OS/2.
It's interesting to compare this to another article linked here about the downfall of the TI home computers. Apparently TI understood the value of software to the point where they threatened to sue unlicensed software vendors, which pretty much guaranteed that those vendors chose to write software for other platforms. Perhaps they would have had a better chance if they had opened up their platform?
Had IBM done that, we would be in a similar position regarding PCs.
On the other hand, maybe thanks to that single event, Atari ST or Amiga variants would still exist.
However, the industry seems to be turning back to those days.
Even FOSS won't help here, because each OEM just packages their own OS and SDK flavour and locks down the hardware.
Apple (Mac) was never open enough.
None of the UNIX companies would have ever made anything cheap enough to hit the mass-market until after Windows 95 came out.
Without Microsoft we could have ended up with some weird IBM world running OS/2. They might have even switched to Motorola eventually.
But Microsoft and Compaq made things open and cheap enough to get us to where we are today.
None of them would have gotten a big enough piece of the pie.
The PC only happened due to the way the OEM market was created, setting off a race to the bottom on computer component prices.
My other point being that, given the current state of the computer market with iDevices, Android, Chromebooks, IoT, hybrid tablets (aka netbooks), TVs, watches, ..., those OEMs are now trying to turn the remainder of the market toward that vision.
So besides their proprietary OSes, we get customised versions of their own forks of open source OSes (e.g. Huawei Linux, LG BSD, put your flavour here) on locked-down hardware, like those computer systems that had their OSes written in ROM.
Microsoft is still here, but the day they actually do lose the PC market, don't expect the "Year of Linux/BSD" to happen.
Having the ability to temporarily suspend protected mode while running older software allowed a smoother transition on the x86 path.
In this alternative world, only IBM would be producing PCs, just like Atari and Commodore produced their own products; there wouldn't be OEMs driving costs down and creating commodity computers.
I guess "ugly" is a fair assessment of the 8086 at the time. Certainly the 68000 was a much cleaner and orthogonal architecture. On the other hand, I'd rate the 8086 at least as good as if not better than other contemporary microprocessors such as the Z-80 and 6502. The 6809 was sweet but a 16 bit address space rooted it in the previous direction and the 68000 make it clear that the 6809 wasn't in Motorola's future plans. Sure, had IBM chosen the 6809 there surely would have been a compatible follow-on but I can't imagine even the stanchest IBMer to have that kind of hubris.
But calling MS-DOS ugly at the time would have been unfair. It was as capable as any other microcomputer OS in the home computer space at the time. It was widely proclaimed to be a rip-off of CP/M, so we might take that as a compliment, and if you look at TRS-DOS, Apple DOS, or whatever PETs used, it was just fine. It'd be unrealistic to suggest any mini or mainframe OS was an option, and Unix just wasn't there yet. If IBM had given Microsoft more lead time they might have gone with Xenix, which they did have out in 1981 for the Z8001. I'm not so sure IBM would have been interested, though, as they wouldn't have had an exclusive license to the OS.
Not to mention that the overhead of the operating system was an important consideration. The machines didn't have much capacity to waste and whatever you picked it still had to perform well on a system with only floppy drives. Maybe that in itself doesn't rule out Unix but it sure cramps the design space.
In both cases, CPU and OS, the ugliness really took off with the backward compatibility that had to be maintained. The 80286 was already being designed, so that drove the CPU side deeper into the weeds, and there was no way of bypassing MS-DOS compatibility once it anchored the marketplace. The only way forward was to improve the OS while keeping MS-DOS programs running, and the whole OS/2 debacle only delayed that upward path.
I mean, fair enough to say "ugly won" but some consideration should be given to the lay of the land when these long-term trends were set in motion.
But I hear this about Intel's instruction set at a number of tech sites.
Intel got their start making memory (SRAM). They had Moore and Noyce, as I recall.
Anyhow, I am not qualified to opine on instruction sets. I am curious as to where it went 'wrong.' I accept your declaration as being likely true - I've seen it echoed elsewhere. But, if you know, do you happen to know where (perhaps even why) it went wrong?
I have tried the mighty Google, by the way. It was no help. I may have not had the correct search terms. I am a mathematician, I only programmed (retired now) out of necessity. In fact, I hated computers for the longest time. I am just curious about the history of where they went so wrong that so many people complain about it.
As an erstwhile CP/M user, I have of course known Kildall's name for decades, but never had a sense of who he was as a person. This somewhat autobiographical recounting is very illuminating in that regard.
Everything ran much slower back then.
Google street view has confirmed the authors' opinion.