Stanford had several Alto machines, but they didn't have Smalltalk, due to some licensing issue. They just ran standalone Mesa programs. When I was at Stanford, few people wanted to use the obsolete Altos, so time on them was available, and I did a small project on them.
Bravo was used as both the text editor and the word processor. The file format was plain text, then a control-Z, then the formatting info. The compiler stopped reading at the control-Z. So you could use bold and italic in your programs, and make the source code look good.
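That split is trivial to handle programmatically. A minimal sketch (in Python, with made-up sample content) of reading a Bravo-style file the way the compiler effectively did, stopping at the first control-Z (0x1A):

```python
# Split a Bravo-style file: source text, then a control-Z (0x1A),
# then the formatting information. The compiler read only the first part.
CONTROL_Z = b"\x1a"

def split_bravo(data: bytes):
    """Return (source_text, formatting_info); formatting may be empty."""
    text, _sep, formatting = data.partition(CONTROL_Z)
    return text, formatting

# Made-up sample content for illustration:
source, fmt = split_bravo(b"PROC Foo;\x1a<bold/italic runs here>")
print(source)  # b'PROC Foo;'
```

A file with no control-Z comes back with an empty formatting section, which is what you'd want for plain source files.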
As shown in the picture, the Stanford machines had the keyboard and display on top of the computer. This isn't required, and it's really annoying to type on. The keyboard itself is great; it's a massive casting around clicky keys.
Altos talk PUP, the PARC Universal Packet protocol, over 3 Mbit/s coaxial Ethernet. Stanford had gateways to connect this to the wider world.
I think I still have some of the Alto manuals.
The vision statement for the Dynabook is in "Personal Dynamic Media". This is worth re-reading every few years.
If the world had used Mesa instead of C, computing would have been far less buggy. Mesa was a hard-compiled language, but it had concurrency, monitors, coroutines ("ports", similar to Go channels), strong type safety, and a sane way to pass arrays around. In the 1970s.
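The comparison to Go channels can be made concrete. Here's a minimal Python sketch of the idea (an analogy only, not Mesa syntax): one thread plays the role of a coroutine, and a bounded queue stands in for a port.

```python
# Rough analogy to a Mesa "port" (similar to a Go channel):
# one coroutine/thread produces values, another consumes them.
import queue
import threading

port = queue.Queue(maxsize=1)   # small buffer, like a nearly unbuffered channel

def producer():
    for i in range(3):
        port.put(i)             # blocks until the consumer has taken the slot
    port.put(None)              # sentinel: no more values

t = threading.Thread(target=producer)
t.start()

received = []
while (item := port.get()) is not None:
    received.append(item)
t.join()
print(received)  # [0, 1, 2]
```

The point of the analogy is that communication and synchronization are one operation, which is what made Mesa's concurrency model (and later Go's) so much harder to misuse than shared mutable state in C.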
(I should donate this stuff to the Computer Museum. I just found the original DEC Small Computer Manual, many 1960s UNIVAC mainframe manuals, and a reel of UNIVAC I steel magnetic tape.)
The Alto docs are at http://bitsavers.org/pdf/xerox/alto/ - I recommend the hardware reference (AltoHWRef.part1.pdf) and the schematics (schematics/).
The Mesa manual mentioned earlier is at http://bitsavers.org/pdf/xerox/mesa/5.0_1979/documentation/C...
Nobody seems to be very interested in early UNIVAC mainframe stuff, which is most of what I have.
Do the "small set of sharp tools" provided by C and the "safe and somewhat onerous discipline" provided by Modula-3 represent two points on an evolution that's converging towards the ideal systems programming language? Or maybe the language level is the wrong level at which to consider this, as if we were analyzing prose at the level of phonemes?
There they created Modula-2+, with input from Niklaus Wirth, who had created Modula and Modula-2 after his sabbatical year at Xerox PARC.
Modula-2+ eventually became Modula-3, which was used to create the SPIN OS, distributed-object network protocols, and other interesting goodies.
Meanwhile at Xerox, Mesa evolved into Cedar. On his second visit, Niklaus Wirth learned about this new system and created Oberon when he returned to ETHZ.
Oberon then gave birth to Oberon-2, Active Oberon, and Component Pascal. The Gadgets framework in Oberon System 3 was great for a mid-'90s workstation OS.
Nowadays I would say that C# is the spiritual successor of Modula-3, with Singularity, Midori and now .NET Native.
With Go taking the role of spiritual successor of Oberon.
Everyone should delve into the books and papers from Burroughs, Xerox PARC, ETHZ, the UK Royal Navy, and DEC for an alternative view of doing systems programming the right way.
We have two goals. One is to have the restoration chronicled as it goes along, in a way the HN community can discuss and participate in. Obviously we hit the jackpot there, with one of the best technical bloggers in the world.
The other goal is to do something with the Alto that the community will find interesting once it's running. A couple ideas are to make it fetch and render the front page of HN (we'd happily write whatever code was needed to serve it in a suitable format, since HTML is probably a bridge too far), or if we could find a second Alto to communicate with, play Maze War on them (http://www.digibarn.com/collections/games/xerox-maze-war/#ma...). But we'll be eager to hear any suggestions the community comes up with!
It's like triple-pizza-box size, not a refrigerator nor even the size of the Alto.
There are two other things about the Alto that have really stuck in my mind. First, the whole thing uses only 300 SSI and MSI TTL chips! No higher-order chips (no LSI, much less VLSI).
The other is that the bus bandwidth was only 3/2 of what the screen refresh consumed. Updating the screen was really important: this was a user-centered, IO-focused machine, which was super radical for its time. If you wanted to do a lot of computation you could steal cycles from the screen update, causing it to go black (in just the bottom half or so, IIRC, which I probably don't).
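Some back-of-envelope arithmetic shows why refresh traffic mattered so much. This sketch assumes the commonly cited 606x808 1-bit display and a 60 Hz refresh (the refresh figure is an assumption for illustration):

```python
# Back-of-envelope display bandwidth for a 606x808 1-bit screen.
# The 60 Hz refresh rate here is an assumed round number.
width, height, bits_per_pixel = 606, 808, 1
refresh_hz = 60

bits_per_frame = width * height * bits_per_pixel
bandwidth_bps = bits_per_frame * refresh_hz

print(bits_per_frame)        # 489648 bits per frame (~60 KB)
print(bandwidth_bps / 1e6)   # ~29.4 Mbit/s just for refresh
```

Tens of megabits per second, sustained, out of a 1970s memory system; it's easy to see why refresh was the dominant consumer of bandwidth and why stealing its cycles blacked out the screen.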
Error in the article: I do believe the Alto was the origin of the BITBLT instruction, but it was based on the PDP-10 (PDP-6) block transfer instruction BLT, and the term "blitting" was current before the Alto was developed. In fact PARC had a PDP-10, the standard research computer at the time -- homemade as well (a clone), because at that time Xerox was in the computer business and wanted PARC to use an SDS. (Again, this is before my time, though MAXC was still running when I was there -- with an Alto as its front end!)
Also contrary to what the article says, the Alto display was not unusual in being portrait mode -- most glass TTYs (think ADM-3A, Hazeltine, VT-52, and I believe the 3270 as well) were taller than they were wide, like a piece of paper. The Alto display was unique, as mentioned, in being bitmapped and black-on-white. Because of the Alto, bitmapped portrait mode was standard for workstations such as the CADR lisp machines, Bell Labs' Blit terminal, the Three Rivers PERQ, and of course the later PARC computers we used: Dolphins, Dorados (all ECL logic!), and the Dandelion (sold as the Star). I remember vividly the first landscape machine I used, the Symbolics 3600 in 1985. I didn't, and still don't, appreciate the wasted space of landscape displays.
The three-button mouse with the buttons arrayed horizontally was also standard because of the Alto. The first time I saw the Macintosh mouse in '84 I was shocked: how could someone use only a one-button mouse? There was a lot of mouse (originally called the "bug") experimentation in the '70s on button count and layout.
The microcode of the Alto was compatible with the DG Nova, as those were the computers used at PARC before the Alto was developed (before my time!).
edit: forgot to mention the origin of blitting.
> most glass TTYs (think ADM-3A, Hazeltine, VT-52, and I believe
> the 3270 as well) were taller than they were wide
Edit: Yes, I should have replied to the other comment…
I'm confused about your portrait vs landscape comments, though. The ADM-3A, Hazeltine 2000, VT-52, 3270, as well as Datapoint, Four Phase, Viatron, etc. had horizontal (landscape) displays, not portrait displays.
It seems crazy today, but not in context. There wasn't video hardware in today's sense. There were either mainframes (with channel controllers), where the terminal did the "rendering", or minicomputers and microcomputers, in which the CPU did everything (which I guess is how things were before the mainframe era).
You can see this reflected in Unix, and therefore in Linux: Unix was developed for the PDP-7 (and later the -11) as a reimplementation of some of the ideas of Multics, which ran on a mainframe. So C's IO was "user mode" (I seem to remember a funny line in the original version, something like, "You mean I have to call a function to do IO?"), and the kernel had expensive, primitive IO capabilities that involved the CPU in everything. Well, there wasn't any alternative in the smaller PDP line (FWIW, PDP-10s were larger machines than the -7 or -11, though only the later models had channel IO).
Memory mapped IO was not uncommon on minicomputers.
> I'm confused about your portrait vs landscape comments, though. The ADM-3A, Hazeltine 2000, VT-52, 3270, as well as Datapoint, Four Phase, Viatron, etc had a horizontal display, not a portrait display.
I'm confused / unclear. Those terminals had more columns than rows, true, but the character positions were rectangular. So the ADM-3 and the Datapoint were rather square, actually, because TV tubes were squarish, not paper-like as I claimed. I think I biased my memory because I used a bunch of hacked terminals, like the AAA Ambassador, which could cram in 80x48 rather than 80x24 and, because of the rectangular characters, were portrait-ish. Unfortunately it's too late to go back and edit my comment.
Had a lot of those first at the Columbia computer center back in the DEC-20 days, later at the Fairchild AI Lab startup (DEC-20's and LISPMs), and even later at Imagen (a Stanford TeX project spinoff building the first commercial laser printers before Apple and Xerox released theirs).
If you wanted to do IO on the C64, for example, you manipulated RAM addresses between $D000 and $DFFF.
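In other words, hardware registers simply occupy ordinary addresses. A toy Python model of the idea (not real C64 code; the $D020 border-colour register is the one genuine C64 detail):

```python
# Toy model of memory-mapped IO: "hardware registers" live at ordinary
# memory addresses. On the C64, $D020 is the VIC-II border-colour register.
BORDER_COLOR = 0xD020

ram = bytearray(0x10000)        # flat 64 KB address space

def poke(addr: int, value: int) -> None:
    ram[addr] = value & 0xFF    # on real hardware, $D000-$DFFF writes hit chip registers

def peek(addr: int) -> int:
    return ram[addr]

poke(BORDER_COLOR, 0)           # 0 = black in C64 colour codes
print(peek(BORDER_COLOR))       # 0
```

On real hardware there's no software indirection at all: the address decoder routes those reads and writes to the VIC-II chip instead of RAM, which is exactly what made the scheme so cheap.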
I guess that makes it easier to reverse-engineer the hardware. I know it will be destructive, but does anybody know how many of these machines the visual6502.org team would need to do that?
Just look up the logic diagram in your copy of The TTL Data Book.
Also, I just saw https://news.ycombinator.com/item?id=11930327, which states: "All the Alto schematics are available at Al Kossow's Bitsavers . If you want to understand how it works, start with the hardware manual "
That must make this one of the easiest pieces of hardware to keep running, at least for the digital parts (the monitor and hard disk are more of a problem, I understand).
(I find the Nova instruction set strangely similar to the ARM in many ways, but maybe I've just spent too much time looking at the ARM1.)
At one point we had some questions about Smalltalk-76 and called up Xerox PARC. We managed to get hold of Adele Goldberg, who answered our questions but was not terribly amused. I think Alan Kay would have been friendlier to us kids :-)
At work, I have a 24-inch TFT in portrait mode, which I use mostly for coding (and other tasks where vertical space is valuable). It is very nice because, e.g., in text processing, a whole page fits the screen nicely.
In a way, we have that with tablets and phones, too.
Certainly screen rotation (with Windows tablets and phones) would involve a lot of inefficient re-rendering. But I think the official reason was that since the subpixel colouring effect depends on the background colour, it's hard to animate transitions efficiently.
On a high-DPI screen, you'd be hard-pressed to notice the difference compared to greyscale hinting. Colour subpixels were a great hack, but high-DPI is the proper solution to this issue.
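The trick being discussed can be sketched in a few lines: with a horizontal RGB stripe layout, each colour channel acts as its own addressable stripe, tripling horizontal resolution. A toy illustration (my own simplification, ignoring the colour-fringe filtering real renderers apply):

```python
# Toy subpixel rendering: map a 1-bit coverage mask sampled at 3x horizontal
# resolution onto the R, G, B stripes of each physical pixel.
def subpixel_row(coverage):
    """coverage: list of 0/1 values, length a multiple of 3."""
    pixels = []
    for i in range(0, len(coverage), 3):
        r, g, b = coverage[i:i + 3]
        pixels.append((r * 255, g * 255, b * 255))
    return pixels

# A glyph edge one subpixel wide lands in just one channel of one pixel:
print(subpixel_row([0, 0, 1, 1, 1, 1]))  # [(0, 0, 255), (255, 255, 255)]
```

This also shows why rotation breaks the scheme: rotate the panel 90 degrees and the stripes run vertically, so the horizontal coverage mapping no longer lines up with the physical subpixels.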
In the end I missed the horizontal space often enough that I've now settled on having at least one full-size (i.e., not a laptop display) landscape display in the setup. (Currently I've got my laptop, 1 x landscape in front of me, and 1 x portrait to the side.)
If you're going to do this, I guess you want to buy IPS-type monitors to minimize colour discrepancies due to the wide range of viewing angles you'll be using. eIPS is supposed to have a 170° viewing angle, but the colours on mine still aren't quite consistent from side to side (or top to bottom as it is once rotated). If you've got a TN-type display I imagine it will be even worse.
Also might be worth buying all your monitors at once. I bought mine over a couple of years; the colour temperature is very slightly different on each one, even when using the factory-calibrated settings, and one has a noticeably different anti-glare coating.
As it was mostly used for DTP, it made loads of sense.
The problem came when running applications that didn't respond correctly to the configuration changes. Since we had patched the OS (transparently, particularly when not actually driving the display), we got blamed for a lot of application crashes and ended up debugging and sending very explicit bug reports to other software vendors.
When the iPhone came out, I was deathly afraid of the screen rotation for about 6 months, convinced as I was that 50% of the apps wouldn't respond gracefully. (And I personally have never found one that failed here.)
Sooner or later the last functional Xerox Alto will (sadly) cease to work. When that happens, it could make sense to replace the dysfunctional parts with modern retro circuits. I wonder if a project to build a functional Alto clone (with a TFT as the screen) would take less effort than the famous MOnSter 6502, which was presented recently.
Making a FPGA-based Alto clone would be a possibility.
I guess someone could just write an emulator for the Raspberry Pi and Linux, or something based on the design of the Xerox Alto and how its chips function.
It seems there are emulators or simulators for Xerox Alto out there already:
Also helpful would be the source code for the Xerox Alto:
It has been released, and I'm sure the team could modify it to run on modern retro hardware.
Seeing one of these machines in action would be awesome. (Is there an emulator available?)
The more I research Xerox's papers and manuals for the Interlisp-D, Smalltalk, and Mesa/Cedar systems, the more convinced I become that the industry's adoption of inferior systems like UNIX was a big step backwards.
Thankfully, many traces of those ideas now live on in Windows, Mac OS X, Android and iOS, in language playgrounds, and in many IDE workflows.
The other major failing that I see is overlooking ideas from ZFS, that the filesystem can act as a virtual tree over any storage medium, so UNIX wastes a lot of time on things like dependency hell, permissions, and distinguishing between file and socket streams or local and remote processors. It could have jailed each process in its own sandbox where copies of libraries are reference counted by the filesystem, running in a virtual storage and thread space. We're just now seeing the power of that with Vagrant and Docker (technically it took so long to get here due to virtualization resistance by Microsoft and Intel).
My other main gripe is more about approach than technology. UNIX (and LINUX especially) stagnated decades ago due to the RTFM philosophy. The idea being that to be proficient in UNIX, one had to learn the entirety of the operating system. This goes against one of the main tenets of computer science, that we are standing on the shoulders of giants. So I really appreciate how passionately the Alto tried to make computing and programming approachable to the masses.
I keep hoping someone will release a portable lisp machine that can run other OSes under virtualization and release us from these antiquated methodologies.
That's pretty simple to explain: all those other options were just way too slow to get the kind of performance required out of the hardware available at the time. The difference was simply too large to be ignored.
It's all nice and good to theorize about how the past should have been, but without UNIX you probably wouldn't be writing any of this on the medium you're currently using.
It has its flaws and it is far from perfect but at the time it fit the bill nicely.
The real problem is that we are categorically unable to move on when better options are around. There is a large amount of silliness involved when it comes to making responsible choices in computing, lots of ego, lots of NIH. Those are the real problems, not that UNIX was written in C.
If you compare with the Xerox PARC hardware, not really; the major issue was the cost of producing the type of architecture they had.
As for safe systems programming, Burroughs was already doing it a decade earlier, on computer hardware much weaker than a PDP-11.
Every 10 years or so the industry just restarts the same loop it's been stuck in since the Amiga (actually the Amiga is probably the first repeat of a loop starting with the Alto), just with different syntax and faster hardware. Software is stagnant; ALL progress is in hardware. And with the end of Moore's Law, that is grinding to a halt too.
And in a way that's a real pity. It could be that if Moore's law had meant a doubling every 30 years rather than every 18 months, we'd have had a lot more appreciation for writing good software. As it was, the crap won out over the good stuff simply by being bailed out by Moore's law just in time for the next cycle.
But in some alternate universe hardware progress was so slow that any gains had to come from better software.
And stack turtles.
Whenever I see a headline about unikernels, I envision Doom running on DOS in a VM on top of Linux on top of some hardware somewhere. How many layers of (potentially leaky) abstractions are we looking at?
The minimal reliable human-computer interface is a keyboard and a screen. Going beyond that quickly adds either noise, delays, or both.
I'm not sure it cost anyone anything. I mean, a lot of OSes were/are written in it, so if you were going to go down that path you'd have to remember to add a rather large benefit on the credit side. It's hard to imagine, but programming wasn't always about compensating for not quite understanding how the 17 different frameworks you've downloaded from GitHub and dragged into an IDE work by just getting a faster machine. Once upon a time people had to carefully measure how much to unroll the loop, or how small a lookup table they could get away with before the errors became a problem.
Edit: I've changed the link. Thanks for letting me know about the bad redirect, golergka. The jwz link looked fine to me; it's pretty obnoxious if the site does a NSFW redirect.
Heh, and while writing this it dawns on me that it may well be that he accepts OS X because it's UNIX. Meaning that OS X is a BSD derivative, which in turn is derived from the historical UNIX, while on the other hand Linux is just a bastard clone.
Because it hit me that I have seen similarly fervent anti-Linux sentiment from other UNIX people, related to either the BSDs or Solaris.
Wikipedia has an objective and some structure, whereas HN is, for all its uniqueness and value, not an organized activity.
This makes me want to message him and ask him about his time there.
> The Alto was introduced in 1973.
2.5 megabytes of removable/swappable storage? In the '70s? I'm amazed that users found it constraining! That's more than even the Amiga managed to fit on 3.5" floppy disks. (Unlike the PC, which generally could only format them to 1.44 MB, the Amiga typically fit ~1.8 MB on the raw 2 MB HD 3.5" floppies.)
The user guide describes in great detail  why Alto users found 2.5MB of disk limiting. To summarize, the disk starts with 4800 512-byte pages. Basic OS stuff takes 900 pages. FTP and the editor take 900 pages. Fonts take more, as well as other commonly-needed software. So a non-programmer's disk typically has 1600 pages (800kB) available to work with. Programmers require more tools, so the free disk space is tighter. It seems people ended up needing to manage their disk space closely, even though 2.5MB sounds like a lot.
 See pdf page 20 of http://bitsavers.org/pdf/xerox/alto/stanford/StanfordAltoUse... for a discussion of why disk space on the Alto was rapidly used up.
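The arithmetic in that summary works out as follows (using the page counts quoted above):

```python
# Disk budget on a 2.5 MB Alto disk pack, using the page counts above.
PAGE_BYTES = 512
total_pages = 4800

os_pages = 900          # basic OS
tools_pages = 900       # FTP and the editor
typical_free = 1600     # what a non-programmer reportedly had left

print(total_pages * PAGE_BYTES)    # 2457600 bytes, i.e. ~2.4 MB raw
print(typical_free * PAGE_BYTES)   # 819200 bytes (~800 kB) free
print(total_pages - os_pages - tools_pages - typical_free)  # 1400 pages for fonts etc.
```

So roughly two thirds of the pack was spoken for before the user stored a single document, which matches the guide's picture of people managing disk space closely.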
Interesting to note (as @pjmlp mentions) that 1973 + 15 years leaves us at around the time the Amiga 2000 was introduced. I remember ours (a model b IIRC) initially had a 20MB hard-drive, later upgraded with another 40MB for a total of 60MB of storage. It could fit almost all of my floppies on there!
Looking at the Wikipedia page for the Amiga 500, I see the RAM latency listed as 150 ns. I'm not sure if that's the actual latency to read data from RAM; if so, it's within an order of magnitude of accessing RAM today! (I'm also a little sad that while my i7 is a lot faster than that ageing 68000 running at 7.5 MHz, it's still not easy to find a single core with 7.5+ GHz of performance :-( ...)
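Rough numbers, assuming the 150 ns figure really is read latency (the modern figures below are illustrative round numbers, not measurements):

```python
# Compare memory latency measured in CPU cycles, then vs. now.
# The modern clock (4 GHz) and DRAM latency (60 ns) are assumed round numbers.
amiga_clock_hz = 7.5e6
amiga_ram_ns = 150

modern_clock_hz = 4.0e9
modern_ram_ns = 60

amiga_cycles = amiga_ram_ns / (1e9 / amiga_clock_hz)     # ~1.1 cycles
modern_cycles = modern_ram_ns / (1e9 / modern_clock_hz)  # 240 cycles

print(amiga_cycles)   # roughly one CPU cycle away
print(modern_cycles)  # hundreds of cycles away
```

So while absolute latency only improved by a small factor, relative to the CPU clock memory has become hundreds of times "further away", which is why caches dominate modern performance in a way they didn't on the Amiga.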
The Alto was not meant to be a personal computer but a "personal supercomputer"—hardware so far ahead of the curve that it could run the software of the future. In other words, not a personal computer but a machine for inventing personal computing with, which of course they proceeded to do. Having such hardware is what makes it possible to figure out what the software of the future needs to be. If you can't run it, you can't create it.
For me this sheds light on a couple of Alan's famous lines: that the best way to predict the future is to invent it, and that people who are serious about software need also to be serious about hardware.
Such advanced hardware is necessarily costly but should still be within the reach of a properly funded research lab. It seems to have been an enormous frustration to Alan and his colleagues that they were unable to persuade hardware vendors to make such hardware at any price, especially the Alto's microcode-driven style of hardware. Among other things, microcoding was what enabled the PARC researchers to make the most of the limited memory they had: when a task was consuming too much memory, they could implement it in microcode, pushing it down to the hardware (at least from the software's point of view) and freeing up RAM for software.
It's important to remember that the Alto was called the "interim Dynabook", a system to try out the Dynabook ideas until technology caught up and made a Dynabook-like system possible (40 years later).
http://history-computer.com/Library/Kay72.pdf - jump ahead to page 6 for the hardware details.
No surprise there. Hardware production is frankly inherently conservative unless one is a startup (i.e., with nothing to lose but one's pride). This is because you have all the up-front cost of tooling and production runs before you even know how well something will sell.
Xerox gave the same license to Tektronix (which launched its 4404 and 4406 Smalltalk computers), HP (which created Distributed Smalltalk) and DEC. The DEC license was extended to UC Berkeley.
When Smalltalk-80 Version 2 was released all four companies got a free license for it which allowed them to do anything they wanted, including re-release it under a different license. Which is what Apple did (three times) with Squeak. The Version 2 license is from 1982 while the Squeak licenses are from 1996 and 2006.
But you are right - Xerox did not give Apple any special treatment compared to the other three, though it did compared to the world in general.
Note that Apple had several Smalltalk users involved in the development of the Lisa and the Macintosh, but they felt Smalltalk was too much for the machines they were creating. Steve Jobs felt that was still the case when developing the NeXT, which is why they used the Smalltalk/C hybrid Objective-C.
// sorry for the latency on my responses, I'm driving west on I94
This contradicts some accounts which say that the Lisa was a text based minicomputer before the Xerox visit.
In practice both the Lisa and the Macintosh projects started in 1979. This is pretty amazing when you consider that the floppy disk for the Apple II had just been made available less than a year before and Visicalc had not yet been released. That the Apple III was in development at this time seems pretty reasonable, but these more advanced machines were pretty ambitious.
Jef Raskin was very familiar with the Xerox PARC work since he had visited there when he was a professor at UCLA before he joined Apple to work on documentation. He didn't like the mouse or windows, but had always been interested in a fully graphical computer. His ideas for the Macintosh can be seen in his later Canon Cat machine. Steve Jobs didn't like his project and kept trying to kill it. Jef thought that if Steve could see the Alto he would "get it" and leave the Macintosh group alone. That is indeed what happened. But not long after that Steve was kicked out of the Lisa project by the board who wanted someone with more experience to be in charge of such an important project. Steve joined the Macintosh project and reshaped it as a "Lisa Jr", which caused Jef to leave Apple.
In the Lisa timeline you can see the effect of the second visit to PARC (both were in 1979) in the form of windows, though these didn't stay like the Smalltalk ones for very long (this style reappeared in the BeOS for some reason). And the effect of the launch of the Xerox Star can be seen in 1981 in the form of icons which look very much like the ones on the Star.
The Smalltalk project at Apple was started in October 1980 and lasted 18 months. The system first started running on the Lisa in April 1981.