VAXen, my children, just don't... (crash.com)
200 points by DiabloD3 on Feb 21, 2014 | 50 comments



The really funny thing for me is that I remember Monday the 19th of October very differently. Sun had called together the biggest press conference they had ever had; they were announcing that Sun and AT&T were going to work together to jointly build one true standard UNIX which everyone would run, with interoperability and reliability across the spectrum of computing. And that AT&T was putting $1B into Sun and had options to put another $1B in.

Eric Schmidt, who was the lead on that press conference, and Bill Joy who was his technical backup, were really confused why all during the conference reporters kept running out of the conference room to make phone calls. They didn't believe that their announcement was that big but everyone was clearly quite agitated.

The clipping service didn't find a single major daily that covered that press release that day. Magazines that had it and were embargoed went with it. But it got little to no coverage.


> Sun and AT&T were going to work together to jointly build one true standard UNIX

Which worked, but technically everything was SysV-flavored afterwards, which in many regards was a huge backward step IMNSHO. Sun OS was a joy to work with before that, not so much after.

Also it was sad to see NeWS go -- and speaking of pain: motif.


> Which worked, but technically everything was SysV-flavored afterwards, which in many regards was a huge backward step IMNSHO.

I'm familiar with the history of System V and BSD as the two competing "flavors" of UNIX, but I've never seen a clear description of what "SysV-flavored" really meant in terms of regular interaction with the system. Since you've clearly had some personal history with this change, could you elaborate on what "SysV-flavored" actually meant for SunOS?


Well, AT&T/SysV regarded themselves as the 600 pound gorilla in the room, and Sun as the young upstart punk kid, so in building a joint system, everything "compromised" by just doing it the SysV way.

/bin & /usr/bin was populated by SysV tools with SysV options, with the particularly incompatible command line options of "ps" that plague it and clones of it to this day.

There were still Sun/BSD bin tools hidden away somewhere for use by the desperate, but they were deprecated and weren't in the search path of one's coworkers etc. etc.

I remember there being a lot of pressure to adopt SysV "streams", which from 20,000 feet sounded very powerful but were actually a nightmare -- not according to me, since I managed to avoid them, but according to later accounts of people I knew at AT&T.

And of course there were similar changes to libraries and man pages and /etc and init and etc.

I cut my teeth on BSD, contributing in small ways to its earliest development, and I considered it an improvement over vanilla Bell Labs/AT&T Unix -- not just NIH; BSD introduced the fast file system and demand paging and the "more" command (later incarnation "less").

AT&T on the other hand responded with a very strong case of Not Invented Here; they really seriously did not want to retrofit improvements like that. And tried to do various clones, like "pg" to replace "more", which was so awful it couldn't even page backwards, just to give some idea.

System III had a few new features but mostly sucked, due to that attitude, and System V was more or less more of the same.

Sun OS took all of the BSD improvements (ok, 7 options to "cat" is arguable, but whatever) and added more nice stuff, like NFS and RPC (RPC got a lot of flak, but originally there really was no alternative) and "Yellow Pages" (later renamed to NIS, the Network Information Service, for trademark reasons).

NeWS was a breakthrough in windowing, based on postscript rendering to screens (although not the same as Display Postscript), and was quite brilliant.

In creating a common standard, the compromise was to throw away or hide the majority of the BSD and Sun OS improvements in order to, basically, just let AT&T do things their own inferior way.

NFS was already becoming an industry standard, so it didn't go away, but NeWS did, in favor of the incredibly bad "Common Desktop Environment" GUI standard, which was a step backwards relative to all X11 offerings, not just relative to the breakthrough NeWS.

The alternative, Motif, had a ridiculously complex and clumsy API that took vast amounts of boilerplate programming to get anything done. Dark times for GUIs.

http://en.wikipedia.org/wiki/Common_Desktop_Environment

[In my first draft I mixed that up with Open Look, a sign of fuzzy memory]

I haven't thought about it in ages, so there are likely a bunch of other things I'm not remembering, but the net result was a standard that was far less pleasant to program to or to do system administration on -- plus many instances of dual standards.

Chuck above was hit by similar nastiness in the kernel itself; I wasn't dealing much with kernel internals at the time, but from what I heard, it was perhaps even worse there.

If Sun OS had merged with Plan 9, that would have actually advanced the world, rather than retarding it.


> NeWS was a breakthrough in windowing, based on postscript rendering to screens (although not the same as Display Postscript), and was quite brilliant.

Although the idea of building apps in PostScript is the stuff nightmares are made of, I recognize it was indeed brilliant. It was partly the work of a young James Gosling.

I remember CDE was heavily, although indirectly, influenced by Microsoft through both HP and IBM. CDE windows and menus are just 3D-like versions of Windows 2 windows.


I wonder if it is possible to implement the tty part but not the networking part of STREAMS.


I'm totally with you there, I count that agreement as the day Sun died. The rest was, as they say, history. One of the things it taught me was how financial analysis can lead you astray technically.


Well, it didn't die, but it did contract an incurable cancer.


lesstif?


I was also at Sun then.

The 1989 Loma Prieta earthquake was a bigger deal.


A bit of explanation: a VAX 750 is about the size of a large washing machine. It does about 1 MIPS, and can handle a surprisingly large number of serial terminal users simultaneously due to the I/O design. An RA60 is a hard disk unit, but not the sealed "Winchester" technology that IBM was promoting (and certainly won). Each disk pack could store 200 or so MB on several 14 inch platters that were held in a clear pack. It was frightening, expensive, and remarkably rugged.

By the time I was part-time sysadmin of one in 1992 or so, they were totally obsolete, and my 486DX-33 running Linux was faster at everything... except effective serial I/O for multiple users.


By the time you were a sysadmin of a 750, in 1992, several generations of newer VAX chipsets had come and gone.

The later VAX CPUs were considerably faster than a 486.


Not that overwhelmingly faster. The early 90's were the swan song for the mainframe. ECL logic was still faster than CMOS, but not by that much, and transistor density was such that multi-chip CPUs still made sense for many applications, but not all. The VAX 9000 was contemporary to the 486/33, ran at about twice the clock speed, probably had 2-3x the net IPC for typical integer code, had a vector unit (not entirely dissimilar to modern SIMD stuff like SSE/AVX), and was available in configurations up to 4-way SMP.

That's a much beefier system, surely. But it's not in a different league. And the contemporary RISC boxes were beating the VAX pretty badly in single-processor benchmarks already.
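
As a rough sanity check on those ratios, here is a back-of-the-envelope sketch. It uses only the figures quoted in this comment (about twice the clock, 2-3x the IPC) and assumes, simplistically, that single-CPU speedup is just clock ratio times IPC ratio; the vector unit and SMP are ignored. The variable and function names are made up for illustration.

```python
# Back-of-the-envelope estimate using only the ratios quoted above.
# Assumption: speedup ~= clock ratio * IPC ratio (ignores vector unit, SMP, memory).
clock_ratio = 2.0               # VAX 9000 clock vs. 486/33 ("about twice")
ipc_ratio_range = (2.0, 3.0)    # "probably 2-3x the net IPC"


def speedup(clock: float, ipc: float) -> float:
    """Naive single-CPU speedup estimate."""
    return clock * ipc


low, high = (speedup(clock_ratio, ipc) for ipc in ipc_ratio_range)
print(f"estimated single-CPU speedup: {low:.0f}x to {high:.0f}x")
```

Under those assumptions the range brackets the "maybe 5x" figure mentioned further down the thread.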


I meant literal CPUs, in the sense of a microprocessor chip. The ECL VAXen were not killed off by RISC or 486es. They were killed off by CMOS chip-based VAX implementations.

The last generations of VAX microprocessors were about 90% as fast as the giant ECL suckers at a tiny fraction of the cost/heat/size. DEC continued to manufacture and develop them for several years after the end of ECL.

The VAX chips that killed the 9k sat in desktop/deskside chassis just like a contemporary RISC system. Literally: you could buy an identical box with either Alpha or VAX processors.

(Besides, the giant ECL VAXen were always a niche product. Nearly all the VAX systems sold were based on VLSI and microprocessors.)


Oh sure, but the 9000 was DEC's contemporary flagship against which that 486 would have been competing. And my point was that it was faster, but not immensely so -- maybe 5x, or about the same speed as the Pentium 133 that the PC owner would be buying within 3 years.

My point wasn't that VAX as an architecture was dead (it wasn't, though it would be so soon), just a reply to the contention that the line had an insurmountably large performance lead against commodity CPUs. It didn't really.


Interestingly, this performance drop when moving to CMOS was one of the drivers towards the development of clusters (at least for IBM - I am not very familiar with DEC technology).


That was the joke. The reason SPEC picked a Vax as the reference for the SPECmark was that Vax performance was a universal constant, like an electron volt.


But probably not as fast as the phone in your pocket today.


For multi-user machines, IO is more important than processor power.


Some VAXen, in their native habitat. http://www.youtube.com/watch?v=mYP-F7XxrDA


In case the date at the end fails to register, the reference (19 October 1987) is to http://en.wikipedia.org/wiki/Black_Monday_(1987) .


Hah! In all the times I've read this story, I've apparently always missed the punchline. Thanks.


Perhaps I'm severely impaired, is the suggestion that this somehow caused it? I didn't quite understand "we forgot about the VAX" and rebooting it part. Is this hinting at a conspiracy?


    > Perhaps I'm severely impaired, is the suggestion that
    > this somehow caused it?
Yes.

    > Is this hinting at a conspiracy?
Not conspiracy, incompetency (Hanlon's razor: "Never attribute to malice that which is adequately explained by stupidity.").

    > I didn't quite understand "we forgot about the VAX"
    > and rebooting it part.
I'll break it down:

Relevant characters:

    - "the gnome" (the company comptroller)
    - "the data center manager"
    - "the shift supervisor" (of the datacenter)
    - the author/narrator, who is the VAX expert
The gnome is using the VAX to transfer a trillion dollars from one bank to another. During this process, there is a period of time when the money has been withdrawn from one bank, but is not yet deposited in the new bank.

While the money is in this in-between state, the data center manager sees the narrator, which reminds him of the VAX, which (to his knowledge, since he didn't check his messages) hasn't been checked on during the whole ordeal. This causes the data center manager to scream "we forgot about the VAX" to the shift supervisor. So "he" (either the data center manager or the shift supervisor) reboots the VAX, since he is incompetent at working with the machine.

It is suggested that because the machine was rebooted before the money was deposited in the new bank, the money disappeared. The date suggests that this sudden disappearance of one trillion dollars caused the stock market crash.
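
To make the "in-between state" concrete, here is a toy sketch of a two-step transfer that is interrupted between the withdrawal and the deposit. It is purely illustrative: the account names, amounts, and the `Rebooted` exception are hypothetical, and nothing here reflects how the system in the story actually worked.

```python
class Rebooted(Exception):
    """Hypothetical stand-in for the machine being rebooted mid-transfer."""


def transfer(accounts, src, dst, amount, reboot_midway=False):
    """Non-atomic transfer: withdraw first, deposit second."""
    accounts[src] -= amount              # step 1: money leaves the source bank
    if reboot_midway:
        # The in-between state: withdrawn, but not yet deposited.
        raise Rebooted("rebooted before the deposit")
    accounts[dst] += amount              # step 2: money arrives at the destination


accounts = {"bank_a": 1_000, "bank_b": 0}
total_before = sum(accounts.values())

try:
    transfer(accounts, "bank_a", "bank_b", 1_000, reboot_midway=True)
except Rebooted:
    pass  # nobody rolls the withdrawal back

missing = total_before - sum(accounts.values())
print(accounts, "missing:", missing)
```

With no rollback or journal, the withdrawn amount belongs to neither account after the interruption, which is the failure the story's punchline relies on.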


I was sitting in the David Russell Hall tv room in St Andrews watching the stock market crash and various students get financially wiped out.


And is it true, was this the cause? Or is this fiction?


I don't know if the rest is fiction, but it didn't cause the crash. Some traders I know have told me that many knew they were wiped out before futures/markets opened.

These can help: http://en.wikipedia.org/wiki/Black_Monday_(1987)#Timeline

"In 1986, the United States economy began shifting from a rapidly growing recovery to a slower growing expansion, which resulted in a "soft landing" as the economy slowed and inflation dropped. The stock market advanced significantly, with the Dow peaking in August 1987 at 2722 points, or 44% over the previous year's closing of 1895 points. On October 14, the DJIA dropped 95.46 points (a then record) to 2412.70, and fell another 58 points the next day, down over 12% from the August 25 all-time high. On Thursday, October 15, 1987, Iran hit the American-owned supertanker, the Sungari, with a Silkworm missile off Kuwait's main Mina Al Ahmadi oil port. The next morning, Iran hit another ship, the U.S. flagged MV Sea Isle City, with another Silkworm missile. On Friday, October 16, when all the markets in London were unexpectedly closed due to the Great Storm of 1987, the DJIA closed down another 108.35 points to close at 2246.74 on record volume. American Treasury Secretary James Baker stated concerns about the falling prices. The crash began in Far Eastern markets the morning of October 19. Later that morning, two U.S. warships shelled an Iranian oil platform in the Persian Gulf in response to Iran's Silkworm missile attack on the Sea Isle City.[8] [9]"

http://en.wikipedia.org/wiki/Talk%3ABlack_Monday_(1987)

"The link at the end, which leads to a story about a VAX machine getting into trouble, might be erroneous. I suggest this because I've seen the story before without the line at the very end linking it to Black Monday (in fact, I always thought the event took place significantly prior to 1987). While it's possible that somebody might add this last line to add dramatic emphasis to the story, it's unlikely that it would be cut out by somebody - it's forms a punchline to the story. Therefore I think the link between this VAX story and Black Monday is probably incorrect. Note: the joke link in question (*Alleged computer mishap) was removed 2005-11-05 RaulMiller 00:20, 22 November 2005 (UTC)"


Seems like there were no emergency shutdown protocols yet for the programs trading stocks, so they amplified and hastened the process. Protocols were implemented after that (from Wikipedia).


I haven't laughed this hard in days:

The fire alarm klaxon went off and the siren warning of imminent halon gas release was screaming. We started to panic but the data center manager shouted over the din, "Don't worry, the halon system failed its acceptance test last week. It's disabled and nothing will happen."

He was half right, the primary halon system indeed failed to discharge. But the secondary halon system observed that the primary had conked and instantly did its duty, which was to deal with Dire Disasters. It had twice the capacity and six times the discharge rate.


I was nearly the victim of a situation like this in the late 90s. There was a power failure site-wide and the generator failed to start in time (water in the fuel). There were three of us in the machine room at the time of the power incident. Unfortunately due to some insane security requirements, the doors lock in a failsafe situation (this was a UK MoD facility) voiding all health and safety laws.

One of the idiots decided to light up a cigarette whilst we were sitting in there in the very dim emergency lighting. Turns out the power system for the halon was still active and the 60 second alarm went off.

Fortunately we hit the stop button on the wall to prevent it being dumped on us.

New pants all round that was.



Indeed, but the experience is not one you'd want.


I smiled and said, "No sweat, I'll train you. The first command you learn is HELP" and proceeded to type it in on the console terminal. So the data center manager, the shift supervisor and the eight day operators watched the LA100 buzz out the usual introductory text. When it finished they turned to me with expectant faces and I said in an avuncular manner, "This is your most important command!"

The shift supervisor stepped forward and studied the text for about a minute. He then turned with a very puzzled expression on his face and asked, "What do you use it for?" Sigh.

This feels like every time I've tried to explain some bit of technology to a non-techie.


I actually clapped and said "Oh my god!" when I got to the last line. That was a great story.

Also, here is what a Halon discharge looks like: http://youtu.be/2fyGGqgVzCY?t=1m36s


So, did the story have anything to do with the punchline or was it a coincidence? And what happened to the trillion dollars?


I enjoy stories like these. I read them throughout my high school years (late 90's, early 2000's).

Now that I've been in the industry for close to a decade, sometimes I feel like I missed the golden age of computing.


I felt the same. When I was a PFY, I hung out with the then middle aged mainframe technicians, and they had lots of fantastic stories like this from the early days of computing.

Stories about unbalanced hard disk units wandering around the room to be found blocking the door the next morning. HD crashes where you had to vacuum up metal shavings before replacing the platters (which were the size of a washing machine drum). Amusing tales of the little vacuum hose that was supposed to automatically suck the loose end of a tape spool into the drive going wrong.

Now I work almost exclusively with virtual machines, which makes such hardware failures much less exciting.


That was the first day of my honeymoon, in Aruba. The hotel had a local four page paper that came out at noon, so I saw it when we checked in shortly afterwards. Boy was I glad not to be in the office of the (investment company) where I worked then. And they said you could never get fired for buying IBM...


What was the barrel for?!


Hauling away the destroyed parts, I presume.


Cool story but I read the date was added afterwards and was removed from wikipedia in 2005 when this was realized: http://en.wikipedia.org/wiki/Talk%3ABlack_Monday_(1987)#Link...


I thought that halon systems were supposed to have interlocks so that they could physically not go off if anyone was in the room.

The dinosaur pen batteries seem to have been done on the cheap though - I don't understand how the power coming back took out the UPS, though.


The power came back at the same time as the generators came on. I guess this doubled the, uh, voltage, and threw the phase off... yeah.

Edit: clearly the problem was that too much amperage was pushed! :)


You don't just connect the two in series :-) In fact the best practice is not to switch from your backup supply back to mains automatically just in case the power goes out again.

How it should work is that mains or your gas turbine is used to charge the batteries; the power always comes from your battery room.


I am not sure this was best practice in 1987. We learned a lot from incidents not entirely unlike this one and I went through one catastrophic power failure (that took down more than 1000 machines in the late 1990s) due to a cascade problem between the multiple "redundant" power sources.

One of the machines refused to come back and we got a call from operations asking where (physically) was the box within the datacenter. It was inside a Cubix blade-like (we didn't call them blades at the time) chassis.


It's best practice according to my 1948 GPO telecommunications handbook: power always comes from the batteries.

There is an interesting write-up on The Register about how a London-based colo facility does its UPS here:

http://www.theregister.co.uk/2014/02/21/city_lifeline_london...


I agree it's the only sane approach, but I have seen a lot of insanity in computing. It's not impossible telecom best practices were ignored for data processing until it became obvious computers are every bit as critical as telephones.


If you abruptly change the speed of a generator (like when it breaks), a huge voltage peak comes from it.


Absolutely not. Purge systems like this WILL go off if they are not bypassed. Typically you have a short amount of time to hit the 'don't purge me, bro' button, before the system dumps.

They're designed to protect critical hardware, not people.


Is this fiction? If not, then why doesn't the list of causes of Black Monday include this? http://en.wikipedia.org/wiki/Black_Monday_(1987)



