
What made the 1960s CDC6600 supercomputer fast? - kken
https://cpldcpu.wordpress.com/2020/02/14/what-made-the-cdc6600-fast/
======
dwheeler
I used and programmed the 6600, including in assembly language. They were
incredibly fast for the time at numerical calculation. I used them for
electronics simulations in SPICE, and they were great for that.

However, they had 60 bit words and no way to address data directly within a
word. By convention, characters were six bits long, stuffed in 10 characters
to a word. So while this machine was incredibly fast for its time for
numerical calculation, it was painful to do text manipulation. You had to pack
and shift characters into words, and unshift and unpack. You could do
interesting things with great cleverness, but it took a lot of work to do
simple things.
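
For the curious, the pack-and-shift dance went roughly like this; a Python sketch of the convention (illustrative only, obviously not how you'd write it in 6600 assembly):

```python
def pack_chars(codes):
    """Pack up to ten 6-bit display codes into one 60-bit word,
    leftmost character in the high-order bits, zero-filled."""
    assert len(codes) <= 10
    word = 0
    for c in codes:
        word = (word << 6) | (c & 0o77)
    return word << 6 * (10 - len(codes))  # left-justify short strings

def extract_char(word, i):
    """Pull out the i-th character (0 = leftmost) by shift and mask."""
    return (word >> 6 * (9 - i)) & 0o77
```

Every single character access is a shift and a mask like that, which is why even simple text handling took so much work.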

Thanks for the trip down memory lane.

~~~
tpmx
Cool!

This expands on the text processing issues:

[https://www.museumwaalsdorp.nl/en/history/computerhistory-
ba...](https://www.museumwaalsdorp.nl/en/history/computerhistory-background-
information/6400hwac/)

"There was no byte addressability. If you wanted to store multiple characters
in a 60-bit word, you had to shift and mask. Typically, a six-bit character
set was used, which meant no lower-case. These systems were meant to be
(super)computing engines, not text processors! To signal the end of a text
string, e.g. a sentence, two different coding techniques were invented. The
so-called 64 character set was the CDC-default. A line end comprised of two
(or more) null-“bytes” at the end of a word followed by a full zero word. The
63 character set, quite popular in the Netherlands and the University of
Austin, Texas, signalled the line termination by two (or more) null-“bytes” at
the end of a 60-bit word.

The Michigan State University (MSU) invented a 12-bit character set, which was
basically 7-bit ASCII format with five wasted bits per character. Other sites
used special shift/unshift characters in a 6-bit character set to achieve
upper/lower case."
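
If I read that right, the 63-set convention makes the end-of-line test just a mask on the low bits of a word. A toy Python sketch of the rule as described (word layout assumed to be ten 6-bit characters, leftmost in the high bits):

```python
def is_line_end_63(word):
    """63 character set: a line ends when the last two 6-bit 'bytes'
    of a 60-bit word are zero (display code 00)."""
    return (word & 0o7777) == 0  # low 12 bits = last two characters
```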

~~~
Someone
_”The 63 character set, quite popular in the Netherlands and the University of
Austin, Texas, signalled the line termination by two (or more) null-“bytes” at
the end of a 60-bit word.”_

So, did Dijkstra invent the null-terminated string? (What else links Austin
and the Netherlands in computing at that time?)

~~~
eesmith
I can safely say that the content at the site cannot lead us to conclude
that Dijkstra invented the null-terminated string.

[https://www.museumwaalsdorp.nl/en/history/comphistory/comput...](https://www.museumwaalsdorp.nl/en/history/comphistory/computer-
history-the-period-1974-1978/comp742e/) says:

> A Control Data 6400 system was installed in May 1974 ...

> The Laboratory, like all other computer centres in the Netherlands, had
> opted for the so-called 63 character set. Control Data only tested systems
> with the 64 character set in America. Unsatisfactory code or code from “new”
> programmers yielded one or more errors with almost every new release, which
> we corrected at the TNO Physics Laboratory and made public with a lot of
> misgivings through the Problem Reporting System mechanism (PSR). Every two
> weeks a set of microfiches with all complaints and solutions collected
> worldwide was sent to all computer centres by Control Data. At every release
> level, it was exciting whether the errors we found were the first to report
> or that our colleague from the University of Arizona was going to blame

That gives a mismatch between "University of Austin, Texas" and "University of
Arizona".

[https://en.wikipedia.org/wiki/Edsger_W._Dijkstra](https://en.wikipedia.org/wiki/Edsger_W._Dijkstra)
says he joined the University of Texas at Austin in 1984. In the 1970s he
lived in the Netherlands.

[https://en.wikipedia.org/wiki/Null-
terminated_string](https://en.wikipedia.org/wiki/Null-terminated_string) says
that the PDP-11 in 1970 had NUL-terminated strings ("Null-terminated strings
were produced by the .ASCIZ directive of the PDP-11 assembly languages and the
ASCIZ directive of the MACRO-10 macro assembly language for the PDP-10.")

This was when Dijkstra was at the Eindhoven University of Technology.

Therefore, whatever is described at the link cannot be used to conclude that
Dijkstra had anything to do with null-terminated strings.

~~~
Someone
Thanks. I searched a bit further, and found
[http://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EW...](http://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD466.html),
where Dijkstra criticizes how line endings were stored in the 64 character
set:

 _”Niklaus told a terrible story about CDC-software. With 10 six-bit
characters (from an alphabet of 63) packed into one word, CDC used the 64th
configuration to indicate "end of line"; when for compatibility reasons a 64th
character had to be added, they invented the following convention for
indicating the end of a line: two successive colons on positions 10k+8 and
10k+9 —a fixed position in the word!— is interpreted as "end of line". The
argument was clearly that colons hardly ever occur, let alone two successive
ones! Tony was severely shocked "How can one build reliable programs on top of
a system with consciously built-in unreliability?". I shared his horror: he
suggested that at the next International Conference on Software Reliability a
speaker should just mention the above dirty trick and then let the audience
think about its consequences for the rest of his time slice!”_

I don’t think that

------
protomyth
The T. J. Watson Jr memo gives a bit of insight into what the competitors
thought of the CDC6600
[https://www.computerhistory.org/revolution/supercomputers/10...](https://www.computerhistory.org/revolution/supercomputers/10/33/62)

~~~
rbanffy
Interesting. It looks typewritten, but it has proportional spacing. I'm not
aware of any typewriter of the time that could do that. The Selectric Composer
could, but it was released in 66 and the 6600 predates it by two years.

The different weights also make it look like a mechanical typewriter rather
than an electric one, which would be an odd choice for the office of TJW.

~~~
stonogo
All IBM Executive-series typewriters made after World War II featured
proportional spacing.

------
exmadscientist
Nice work! Any article that features results from a test PCB is a winner in my
book!

> Let me know if you find similar devices.

The Rohm high-frequency BJTs look promising: 2SC3838K is the fastest (fT
wise), but there are several others (see page C28 of Rohm's 2019 catalog). I
only checked the datasheet for the one, but it's got a very nice Figure 10,
showing that it'll probably switch fastest around 20mA collector current, and
should be about 3x faster there than the MMBTH10L at 4mA.

~~~
blattimwind
The thing is, if you want logic to go properly fast you don't use saturating
logic (DTL/RTL/TTL), but current-steering logic (ECL). That way your usable
clock gets much closer to the fT of the transistors involved, instead of
being limited to a tiny fraction of it.

That's how Cray built supercomputers after the CDC6600/7600.

~~~
TheOtherHobbes
At the cost of insane power/heat budgets - although one of the nice things
about ECL is the power draw is relatively constant because the transistors
don't saturate, and you don't get the spikes and PSU hash you get with TTL
etc.

ECL was amazing for its time, but I'm honestly more impressed by modern
PC/phone electronics. PCBs and chip designs are mass-produced commodity
products _clocked at microwave frequencies_ - sometimes with battery power.

This is incredibly impressive compared to the state of the art in the 60s and
70s. And it's taken for granted as an everyday thing.

~~~
fanf2
I recently found a 1980s Cray installation guide which has a lot more detail
on the power, cooling, and other physical requirements.
[https://news.ycombinator.com/item?id=22284518](https://news.ycombinator.com/item?id=22284518)

~~~
dfox
One notable thing about the constant power draw of ECL, and of the Cray-1 in
particular, is that the draw is constant enough that the logic supply is
unregulated: just the rectified and filtered output of 400 Hz transformers
(placed physically in the "bench" part of the chassis). What is regulated is
the 208 V/400 Hz supply for the whole machine (produced by the motor-generator
shown in the article), but that regulation IIRC involves manually turning
physical knobs during installation/maintenance and is more about compensating
for unstable mains on the input side.

------
neonate
"Optimize one basic thing very well, replicate it, and use it as a
hierarchical building block."

~~~
mark-r
I think Cray kept that basic philosophy with all his subsequent designs.

------
GnarfGnarf
I learned FORTRAN in 1965 on a CDC 3100. Real core memory, tape drives, vacuum
drum card reader, Calcomp plotter. Super slick.

I took a computer science course at Dalhousie University in 1970, used a CDC
6400. My professor was obsessed with pseudo-random numbers, the 60-bit word
size was a godsend.

------
larusso
Oh, I love these detailed trips into the past. Also the fact that physics and
chemistry were the main drivers. Software became the most prominent face of
computing, or at least that is how I perceived it growing up. I learned very
late what the advances in semiconductor development really meant and how
important they were.

~~~
Gibbon1
It was amusing to read that the transistor used in the CDC6600 was gold-doped.
It turns out that common switching diodes like the 1N4148 are gold-doped for
the same reason: it increases switching speed. You pay for it, though; they
leak like a sieve, especially at higher temperatures.

~~~
blattimwind
The 1N4148 also makes an excellent photodiode, even if you don't want it to.

------
todd8
What I remember about assembly language programming the CDC6600 was how
beautifully simple the machine's principles of operation were at the register
and instruction level.

In 1974 I learned CDC6600 assembly language in grad school. In comparison to
IBM360 assembly language programming I had previously done, the CDC 6600 was
so straightforward. It took perhaps one day to learn all of it. I still have
the small book that I learned from, _Assembly Language Programming for the
Control Data 6000 Series and the Cyber 70 Series_ by Ralph Grishman.

 _Addressing:_ The machine had an unusual data layout: it was word-addressable,
not byte-addressable, and each word was 60 bits long, not 16 or 32.

 _Data Format:_ Text was stored in six-bit fields, ten per word and so the
characters weren't directly addressable. Furthermore, integers were stored in
1's complement, not the more common 2's complement or sign-magnitude format.
There was a single 60-bit floating point format (1-bit sign, 11-bit exponent,
and 48-bit coefficient).

 _Speed:_ Floating point multiplication took 1000ns, but the 6600 could do two
floating point multiplies, a floating point add, and an integer add
simultaneously if coded carefully.

 _Memory:_ The memory of the 6600 was stored in a ferrite core memory and it
had a maximum size of 128K words. Later, there were slower, larger memory
tiers as options.

 _I/O:_ This was handled by peripheral processors that had access to the main
memory and could offload data transfers to devices so that the central
processing unit didn't have to handle expensive interrupts (expensive because
the out-of-order execution of instructions meant that saving and restoring the
state of the CPU was relatively time-consuming).

 _Registers:_ There were 8 X-registers. These are 60-bit registers that are
used as the operands in the assembly language instructions. There are also 8
18-bit A-registers that are used for addressing and an additional 8 18-bit
B-registers for use as loop indexes, etc.

 _Instructions:_ Opcodes are always 6 bits; there are only 71 instructions
(one of the 64 possibilities in the 6-bit opcode is further divided into 8
instructions). The instructions were simple:

    
    
        IX4 X5+X6   ; Integer sum of X5 plus X6 goes into X4
    

Such an instruction takes 15 bits, six for the opcode, three each for the
three registers.

This was all a lot less to understand than the intricacies of the IBM360
principles of operation. The IBM360 of the time had instructions like
TRANSLATE-AND-TEST or the SHIFT-AND-ROUND-DECIMAL. Here for example are the
first four paragraphs explaining the TRANSLATE-AND-TEST instruction from the
_IBM 360 Principles of Operation_ :

> The eight-bit bytes of the first operand are used as arguments to reference
> the list designated by the second operand address. Each eight-bit function
> byte thus selected from the list is used to determine the continuation of
> the operation. When the function byte is a zero, the operation proceeds by
> fetching and translating the next argument byte. When the function byte is
> nonzero, the operation is completed by inserting the related argument
> address in general register 1, and by inserting the function byte in general
> register 2.

> The bytes of the first operand are selected one by one for translation,
> proceeding from left to right. The first operand remains unchanged in
> storage. Fetching of the function byte from the list is performed as in
> TRANSLATE. The function byte retrieved from the list is inspected for the
> all-zero combination.

> When the function byte is zero, the operation proceeds with the next operand
> byte. When the first operand field is exhausted before a nonzero function
> byte is encountered, the operation is completed by setting the condition
> code to 0. The contents of general registers 1 and 2 remain unchanged.

> When the function byte is nonzero, the related argument address is inserted
> in the low-order 24 bits of register 1. This address points to the argument
> last translated. The high-order eight bits of register 1 remain unchanged.
> The function byte is inserted in the low-order eight bits of general
> register 2. Bits 0-23 of register 2 remain unchanged. The condition code is
> set to 1 when one or more argument bytes have not been translated. The
> condition code is set to 2 if the last function byte is nonzero.

> ... [There's a lot more]
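
For anyone who doesn't want to wade through that prose, the quoted semantics boil down to something like this Python sketch (addresses modeled as plain indexes, not storage addresses):

```python
def translate_and_test(operand, table):
    """Toy model of the TRANSLATE-AND-TEST semantics quoted above.
    Scans operand bytes left to right; each byte indexes the table.
    A zero function byte continues the scan; a nonzero one ends it,
    yielding the argument's index and the function byte, i.e. the
    would-be contents of R1 and R2.
    Returns (condition_code, r1, r2); r1/r2 are None if unchanged."""
    for addr, b in enumerate(operand):
        f = table[b]
        if f != 0:
            cc = 2 if addr == len(operand) - 1 else 1
            return cc, addr, f
    return 0, None, None  # operand exhausted, registers unchanged

# e.g. a function table that flags '!' as a delimiter
table = [0] * 256
table[ord('!')] = 9
```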

------
kgran
Interesting to find this on HN as I'm reading a book on Seymour Cray and his
supercomputer adventures (Charles J. Murray's "The Supermen: The Story of
Seymour Cray and the Technical Wizards behind the Supercomputer"). Not too
many technical intricacies there, but still an interesting read from a
general/historical perspective.

------
dieselerator
The article shows us transistor-level logic circuits to study, but I think the
quick answer is that the architect, Seymour Cray, gets the credit.

~~~
qubex
Definitely.

For an insight into that great man’s life, I highly recommend reading _The
Supermen: The Story of Seymour Cray and the Technical Wizards Behind the
Supercomputer_ by Charles J. Murray. I remember reading it back when I was in
high-school so it’s more than twenty years old by now, but it’s still an
amazing account of how those amazing people built those stunning machines.

~~~
jacobwilliamroy
I hope these people are all dead now so you never have to meet them and accept
that they're not as amazing as the book makes you think they are.

~~~
mark-r
I met Seymour Cray, and he was definitely as amazing as you can imagine. He's
also very much dead.

~~~
dboreham
I held a door open for him in 1989. Unfortunately I didn't realize who he was
until later.

~~~
mark-r
Reminds me of a story I heard on the radio.

A guy went out to eat in a New York restaurant. A couple came in, a pale white
dude accompanied by the most beautiful black woman this guy had ever seen. They
were seated close to him. It was immediately obvious to him that they were
getting much better service than him - for instance they got their food almost
immediately after ordering, while he was still waiting for his. So he started
to heckle them.

Fast forward to a year later. David Bowie has died, and his picture is on the
front cover of every magazine. This guy finally realizes who it was he'd been
heckling.

------
interrealmedium
>10 MHz

>1964

That's insane. What's even more insane is that a bit over 20 years later home
computers reached that frequency. And in the next decade they reached over
100 MHz. Pure lunacy.

Posted from my 5 GHz home computer.

~~~
blattimwind
ECL logic systems reached effective clock frequencies in excess of 500 MHz in
the late 60s or so. It was _extremely fast_ compared to contemporary RTL/TTL
logic.

~~~
ajross
Exactly. It was MOS that lagged, not "transistors" or "computers", really.
Transistorized microwave circuits (analog, not digital) in the GHz range were
operating as early as the 1960's too.

MOS is just hard. It's hard to fabricate, it's hard to design (i.e. you need
computers to make computers!), it's hard to scale. It required new chemistries
be invented and new circuit design techniques to manage the (really tiny!)
components. So for a long time you could have a big circuit in discrete
bipolar components which ran fast or a tiny one in MOS which was slow.

But MOS was tiny and MOS was cheap. So by the early 90's it had caught up in
speed with all but the fastest discrete technologies (GaAs was "right around
the corner" for like two decades as it kept getting lapped by routine process
shrinks -- now no one remembers it) and the market abandoned everything else.

~~~
kken
>MOS is just hard

I think the important point is that "MOS scales". None of the bipolar
technologies ever had anything like Dennard scaling, which was the backbone of
Moore's Law.

~~~
blattimwind
Case in point, you could get _bipolar ECL RAM_ in the 80s with access times of
around 1-2 ns (which is at least four times faster than the fastest DDR-SDRAM
in 2020). Except those things would have a few kilobits at most and burn a
Watt or two; an entire 16 GB stick of DDR4 doesn't require much more than
that. (This is SRAM of course; you can't build good DRAM on a bipolar process,
and MOS SRAM is much faster than DRAM as well. However, MOS SRAM in the 80s
would have access times of 20-150 ns; the access time in nanoseconds is
typically the part-number suffix, e.g. 62256-70.)

------
rbanffy
> The design of the machine is well documented in a book by James Thornton,
> the lead designer

The type on the twin CRTs of the console is very interesting. The cover of the
book shows a sample of it and the slight imprecisions of the beam deflection
give it a whimsical quality, as if the fastest computer of its time used Comic
Sans to communicate with its operator.

------
burlesona
Does anyone know why 60 bits and not some power of two? How did that work? Or
am I just being silly and it doesn’t matter? :)

~~~
retrac
At the time, many computers were decimal, or had word lengths like 18, 24, 36
or 48, or even 72 bits. Characters were usually 6 bits. The power-of-two
standard based around an 8-bit byte didn't exist yet.

Whoever picked 60 bits (Cray?) was almost certainly thinking in octal, not
hex. 60 bits is a multiple of 3, fits in 20 octal digits, and holds 10 six-bit
characters. Most importantly, a 60-bit floating point number is precise enough
for just about any calculation.

~~~
AnimalMuppet
IIRC, some early mainframes (IBM and maybe Sperry?) had 36-bit words. 36 bits
was enough for 12 (decimal) digits of accuracy with fixed-point arithmetic, or
10 digits for floating point. It was good enough for atomic calculations
(where the difference between an atom's mass and the masses of the two atoms
it breaks into is a very small fraction of the initial mass).

~~~
ScottBurson
The DEC PDP-10 was a 36-bit machine.

~~~
ThomasBHickey
With an interesting and functional set of 'byte' instructions where you could
specify the number of bits per chunk. IIRC 6 or 9 bits were typically used
for characters, but I think there was a 5-bit character set in use as well.
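
Those byte instructions were essentially parameterized shift-and-mask. A rough Python model (the real byte pointers encode position and size fields in a word; this just shows the idea):

```python
def load_byte(word, pos, size):
    """Extract a 'size'-bit byte from a 36-bit word; 'pos' is the
    number of bits to the right of the byte, roughly the P field
    of a PDP-10 byte pointer."""
    return (word >> pos) & ((1 << size) - 1)

# Six 6-bit characters fill a 36-bit word exactly; five 7-bit
# ASCII characters fit with one bit left over.
```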

------
baybal2
As transistors get smaller, it's said that when they approach <10 nm gate
sizes, RTL may reappear, because at those sizes semiconductors will leak so
much that FET-based logic will no longer have an advantage in current draw
over current-steering logic families.

~~~
dfox
I would not expect a return to RTL/DTL, but I would not be surprised by the
use of NMOS-style pull-up transistors/resistors in combination with
traditional CMOS logic. You can make an NMOS gate in a CMOS process quite
easily, and it comes out significantly smaller. Doing DTL in a CMOS process
seems somewhat pointless given that the simplest way to make a diode-like
thing in CMOS is a transistor. And then there is the issue of the small
fanout of RTL/DTL.

