Hacker News
Ask HN: Was the Y2K crisis real?
465 points by kkdaemas 16 days ago | 371 comments
There was very little fallout from the Y2K bug, which raises the question: was the Y2K crisis real and well handled, or not really a crisis at all?



Yes, the y2k crisis was real, or more accurately, would have been a serious crisis if people had not rushed and spent lots of money to deal with it ahead of time. In many systems it would have been no big deal if unfixed, but there were a huge number of really important systems that would have been a serious problem had they not been fixed. Part of the challenge was that this was an immovable deadline, often if things don't work out you just spend more time and money, but there was no additional time that anyone could give.

The only reason the Y2K bug did not become a crisis is that people literally spent tens of billions of dollars fixing it ahead of time. And in the end, everything kept working, so a lot of people concluded it wasn't a crisis at all. Complete nonsense.

Yes, it's true that all software occasionally has bugs. But when all the software fails at the same time, a lot of the backup systems simultaneously fail, and you lose the infrastructure to fix things.


I remember hearing a story on NPR, years after Y2K, with someone knowledgeable on the subject who addressed this question. He gave basically the same answer you did: a lot of effort went into avoiding disaster, and a lot of people treated it after the fact as if it were hype or hysteria. I recall the interview because it significantly changed my thinking on the subject. (I wasn't in the industry at the time of Y2K.)

He also noted that a lot of American firms worked with Indian firms to resolve it. Indian engineers were well suited to the problem, according to him, because they had been exposed to a lot of the legacy systems involved during their studies.

He said an interesting consequence of this was that many American companies concluded they could use lower wage Indian engineers on other software projects so it helped initiate a wave of offshoring in the early 2000s.

I couldn't find the story I heard in the NPR archives, but they do have a number of stories reported right around that time if you want to see how it was being discussed and treated by NPR at the time:

https://www.npr.org/search?query=y2k&page=1

I did find this NPR interview with an Indian official from 2009 that notes Y2K's impact in US-Indian economic relations:

> But I think the real turning point in many ways in the U.S. relations was Y2K, the 2000 - year 2000, Indian software engineers and computer engineers suddenly found themselves in the U.S. helping the U.S. in adjusting itself to the Y2K problem. And that opened a new chapter in our relations with a huge increase in services, trade and software and computer industry.

https://www.npr.org/templates/story/story.php?storyId=120738...


It was also really difficult to detect bugs since no good testing paradigms had been established yet.

This created an exploitable opportunity for the few who knew how to write automated tests.

During the Y2K panic, Sun Microsystems (IIRC) announced that they would pay a bounty of ~$1,000 per Y2K bug that anyone found in their software. (Due to the lack of automated tests, they didn't even know how to find their Y2K bugs.)

James Whittaker (a college professor at the time) worked with his students to create a program that would parse Sun's binaries and discover many types of Y2K bugs. They wrote the code, hit run, and waited.

And waited. And waited.

One week later the code printed out its findings: tens of thousands of bugs.

James Whittaker went to Sun Microsystems with his lawyer. They saw the results and then brought in their own lawyers. Eventually there was some settlement.

One of James' students bought a car with his share.


That's a fantastic story, I'm impressed the professor didn't try to take the whole settlement.


Some people have character.


It's sad to live in a world where people are impressed when someone does the right thing.


It’s sad to live in a world where people are naive enough to assume that doing the right thing without incentive is commonplace. You must be great to negotiate with.


On another note, having listened to James Whittaker talk while at Microsoft, he was a phenomenal storyteller and speaker.


+10000 his talks are amazing to listen to


This should be its own post on HN! Great story.


Thanks for sharing. This is why I love HN


I think a more accurate way of describing the Indian engagement is that a lot of recruiters and Indian firms discovered there was very little vetting of candidates, and that almost anyone could be put forward and start earning money for the recruiter or offshorer.

It was money for jam.


Yep, many firms were delaying spending on this. It's pretty frustrating because it seems normal: only do something about an issue when the time horizon is near, rather than take it on when it's further out.


From what I've heard, a number of IT departments used it as justification to dump legacy systems.

For example, at a former employer of mine, a big justification for getting rid of the IBM-compatible mainframe and a lot of legacy systems which ran on it (various in-house apps written in COBOL and FORTRAN) was Y2K.

In reality, they probably could have updated the mainframe systems to be Y2K-compliant. But, they didn't want to do that. They wanted to dump it all and replace it with an off-the-shelf solution running on Unix and/or Windows. And, for reasons which have absolutely nothing to do with Y2K itself (the expense and limitations of the mainframe platform), it probably was the right call. But, Y2K helped move it from the "wouldn't-it-be-nice-if-we" column into the "must-be-done" column.


COBOL programmers were charging a fortune to be hauled out of retirement to work on this. There was a huge shortage of experienced COBOL devs. And devs who actually understood the specific legacy system involved were even rarer. If the company had customised their system and not kept the documentation up to date or trained new programmers on the system, well, they didn't have a choice. They had to replace it.


It's adorable that these answers are all in the past tense. This is still going on. Many of these systems still exist.


To be fair, I've met a lot of young COBOL programmers since starting to work on mainframe systems last year. I think the "elderly COBOL programmer coaxed out of retirement by dumptrucks full of money" dynamic that supposedly dominated the Y2K response is less prevalent these days, and companies that still run mainframe systems just realize they have to hire regular programmers and teach them COBOL and z/OS.


well, I witnessed it, so it's not really a supposition is it?


Sorry, didn't mean any slight at you by that. I don't doubt your account, but having no idea how widespread the practice was, its dominance on a large scale is just a supposition on my part.

My boss at BT took voluntary redundancy and went COBOL contracting; he made bank!


Worked on a project to migrate a huge and critical .gov database from an AS400 nobody understood anymore to Microsoft SQL Server. We had to guess some of the values in some bitwise fields through trial and error. Made it with a few weeks to spare; nobody noticed.


I'm quite surprised no-one understood an AS/400 in 2000 - they were still pretty current and an important part of the IBM range, surely.


Oh, understanding the underlying system, that's comparatively easy. Understanding the bespoke business logic and how it's represented in the system...now that's harder, an order of magnitude at least.

Story time: years ago (2010-ish?), we were doing some light web CMS work for a client. Nothing too complex, except we had to regenerate one section off a data feed that we received in a very peculiar custom format.

Great, everything worked, we finished on time and on budget. And then: "oh, and could you also check this other site? We need some minor tweaks on what your predecessors made". That other site was also consuming that data feed, so we took a peek. It ran in PHP3, walled off from everything, processed its own intermediate languageS, and the output was a massive 200+ page PDF (which was then manually shipped to offset print in large quantities). For Reasons, this had to run daily and had to work on first try. There was no documentation, no comments, and no development environment: apparently this was made directly on prod and carefully left untouched.

Needless to say, the code was massively interdependent, so that the minor tweaks were actually somewhat harder. We did manage to set up a development VM though, and document it a bit - but last I checked, the Monstrous Book Of Monsters Generator still seems to be chugging along in its containment.


Understanding the OS of the AS/400 is a far cry from understanding all the custom software running on it.


> From what I've heard, a number of IT departments used it as justification to dump legacy systems.

> In reality, they probably could have updated the mainframe systems to be Y2K-compliant.

Legacy systems weren't always mainframe systems and, in any case a “legacy system” is precisely one you are no longer confident you can safely update for changing business requirements.

A change in a pervasive assumption touching many parts of a system (like “all dates are always in the 20th century”) is precisely the kind of thing that is high risk with a legacy systems.


I was working at British Telecom at the time and we found about 500 or so systems that were running but no longer had any value.

And don't get me started on Oracle refusing to provide updated y2k compliant backports!


> from "wouldn't-it-be-nice-if-we" column into the "must-be-done" column

Sounds like some of the benefits from the Coronavirus crisis, like the increase in working from home.


As they say in DC, never let a good crisis go to waste


Wasn’t this originally said by Winston Churchill?


How apropos to current events.


I'm having PTSD just thinking about updates of YY to CCYY in mainframe code. Very real and heavy investment across numerous industries from aviation to financial and more. At a macro level, I remember it being pretty well managed risk and mitigation.


When you start 3+ years before the known threshold date, you're giving yourself the best chances at success. I remember flashing BIOSes for weeks in the computer labs at my university.


Oooh. Do you remember any notable number of systems turning into doorstops in the process?

I wonder what the practical mixture of NAND flash vs EEPROMs was. I understand NAND wasn't especially stable back then.


EEPROM. And there has always been a reflash method for bricked firmware on all AMI and Phoenix BIOSes: insert a disk with the firmware, hold a key, power on.

Once upon a time we got somewhere in the neighborhood of 1200 Dell workstations, and not a single one failed the BIOS upgrade. 10/10 would do again


Dear god, reviving the ghosts of IT departments past!!!

Yes, updating JDEdwards AS/400 systems and many a PC was a big project, but I don't recall it being super difficult at the time... frack, it's so long ago I can't really recall many of the details other than reporting daily on the number of systems updated.

Also, fuck you, Google: this was when you were nascent and I converted the entire company to your “minimalist” front page vs Yahoo, and a few years later you wasted 3 months of my time interviewing, told me to expect an offer letter tomorrow, then called to rescind the offer (as I didn't have a degree), and then continued to contact me for FIVE FUCKING YEARS for the same job that you wouldn't hire me for.

/off-my-chest


It happened to all of us. But be glad you are not there now.

None of those people contacting you, or phone screening, were Google employees. They were, or worked for, contractors. It didn't matter to them or to Google whether anybody they contacted was hired.


Oh it mattered to the recruiters, right up until they realized he didn’t fit the template.


I had a teacher in college who'd been part of a project to fix the Y2K bug in the bank he worked at the time. He said to us when talking about project management, that the only project he saw actually finish on time was that one.

The pressure was immense. People don't realise what a huge success story the Y2K situation really was.


Yeah, for some companies the sheer volume of bad-data-entry situations with messed-up dates could have been absolutely enormous.

Whole businesses would have ground to a halt to stem the mess of bad data, and if not prepared they would not have had a quick fix.

In the age of apps and web apps we can roll out fixes fairly quickly in many cases, but in 2000 that was very much not the case.


I worked at a University that had a 28 story tower. Around 1998 they set the clock in the lift management system to 1 January 2000 and all the lifts lowered themselves to the bottom of the lift wells because they hadn't been serviced for 50 years (or something).

So yes indeed stuff was broken, but it got fixed before the big date.


This sounds apocryphal. If setting the date forward triggered a maintenance warning, it sounds like the date system was working correctly.

More importantly, lift systems are not sensitive to long duration dates. They do not need to be.

This story reflects one of the myths that emerged from the media. Around 1997 they started discovering the topic of embedded systems and that many devices, including lifts, contained "computers." Not understanding the restricted and specialized nature of embedded systems, they then claimed all these systems were vulnerable to Y2K, and that the whole world was about to crash.

Most embedded systems, in lifts and other devices, had been designed in the 1990s. If they used dates, they did not use mistakes from the 1960s.


The system was using a 2-digit year and the service interval went negative. It did indeed happen; I was working in the IT department at the time.

The system was probably designed in the 1970s.


> Yes, it's true that all software occasionally has bugs. But when all the software fails at the same time, a lot of the backup systems simultaneously fail, and you lose the infrastructure to fix things.

This is especially important if you think about what things were like in the late 90s. As a teenage geek in Northern Ireland I read about the first tech boom, but locally there was very little exploitation of tech, and you were an anomaly if you had dial-up internet at home. There was very limited expertise available to fix your computer systems at the best of times.

Big companies had complex systems, but they could also afford contracts with vendors in the UK mainland, or Eire, to fix systems. It was much more limited for other companies. While people were generally not relying on personal computers too much, small businesses had just reached that tipping point where Y2K would have been painful. As a result, they took action.

I only got to visit the US a few times in the early-nineties and it seemed so futuristic (I got a 286 PC in '93 to use alongside my Amiga). I imagined the Y2K problem as being much more painful in the US, and I wasn't accounting for the distribution of expertise.

One great experience was the summer training courses the government had organised with an IT training company in Belfast. These were free, and it was like a rag-tag army of teenagers and older geeks from both sides of what was quite a divided community. The systems we were dealing with were fairly simple. It was mostly patching Windows, Sage Accounting, upgrading productivity apps (there was more than Microsoft Office in use), etc., but the trainer was normally teaching more advanced stuff like networking and motherboard repair, so we spent a lot of time on that.


I was a computer operator in the late 1970s and I used to read ComputerWorld which was the biggest enterprise tech "journal" (actually a newspaper) of the day. In the late 1970s they had a number of articles commenting on the need to fix the Y2K problem. So smart people were planning for it well ahead of time. On the other hand memory and storage were so limited then it was probably hard to get permission to allocate two more bytes for a 4-digit year. But people were definitely thinking about it for a long time before the year 2000.


>treated it after the fact as if it was hype or hysteria

I agree with everything you and the parent said, but there was also hype and hysteria on top of it all - especially among the communities of preppers, back to the landers, and eco-extremist types. I met entire communities of technophobes who declared I was just ignorant or 'a sheep' and refused to believe me when I tried to explain that the industry was working on the problem and had a solid shot at preventing major catastrophe.

Frightening predictions of power plants going offline for days, erratic traffic lights causing mayhem, food and water supplies being interrupted, even reactors melting down were not uncommon.

As always, there is money to be made by exploiting people's fear. This hysteria was considered an irrelevant side show by most educated people, the way I look at the chem trail people or the way I used to view the anti-vaxxers [1], but the unjustified hysteria was real.

[1] Before I learned their numbers were growing and could threaten herd immunity in some areas.


True. The first image that pops into my head when I hear Y2K is a Simpsons joke where airplanes and satellites start falling from the sky and landing in their backyard:

https://www.youtube.com/watch?v=x0bBYK-7ZiU

Oh, and Dick Clark melts.


I was working for Hughes, now Boeing Satellite Systems, for a couple of years doing Y2K remediation. I was writing ground station code that ran on VAX/VMS, basically older hardware and software. So the thought of satellites falling from the sky did occur to me. From what I understand, everything worked out OK. There might be only one instance that we know of that was unanticipated when the year rolled over.


This. So many parallels with every "looming disaster" since. The media went into frenzy mode a few times, predicting the End Of The World As We Know It and scaring everyone witless.

And a few people did take it really seriously and spend the Millennium in some remote place where they would stand a chance of surviving the collapse of Western Civilisation.


Y2K has this tragicomic portrayal in the movie Office Space


ahhh i remember as a teenager making sure our board of ed systems were Y2K-compliant. numerous times the principal staff would evacuate the office to take care of some action going on in the halls of the school. leaving me with access to the entire student records db and the VP login and password written on a sticky under the keyboard (not even joking!). in addition this was a laptop so RAS details were conveniently saved in network neighborhood.

this left me at a crossroads. i thought about writing malware that would randomly raise people's semester grades by 4 points (e.g. C- would become a B+). i thought about mass changing grades. i thought about altering a target group of kids i didn't like. all but the last of the scenarios ended with me getting fingered because I was the smart computer kid. if i didn't touch my grades i would have plausible deniability. i wrote the malware. then i watched office space and decided to think about my actions (and promptly forgot as a horny teenager does). soooo glad i didn't release it because years later i went back into the code and found a leftover debug that would have targeted the only 2 kids in the school who had this letter in their last name.

tl;dr: office space is real


I wonder how many y2k deniers there are.


> people literally spent tens of billions of dollars in effort to fix it

The sceptic inside me is curious how much they actually accomplished in their effort to fix it. Yes, they spent billions to get millions of lines of code read, but how many fixes were made, and what would the cost have been (compared to those billions of dollars spent) if nothing had been fixed at all?


A good friend of mine was (and still is) a civil/construction engineer. Other than some introduction courses in college, he had never done any programming before. But after losing his job in 1998, he was hired by a major European consulting company to work on the Y2K problem.

With an auditorium full of others who had barely any programming experience, he was given a crash course in Cobol for a month, and then he had to go through thousands of lines of code of bank software that ran on some mainframe and identify all cases where a date was hard-coded as 2 characters.

Extremely tedious work, but there were tons of fixes to be made.

The problem was real, and the billions that were spent on fixing things were necessary.


I’m wondering what happened to the people in meeting rooms who suggested that this could be automated if it needed no deep understanding of the COBOL language or the banking domain..?

EDIT (for clarity): that’s what I would say, and I’m pretty sure many people smarter than I had said it - so I’m just curious what objections made business push forward with the “relatively unskilled” human labour intensive practices instead, because that’s the history we have now.


There were tools available, but it was increasing the risk (in a risk-averse atmosphere) to trust them to catch all cases. Try explaining to your superiors why you were given the task of fixing the issue and instead of completely fixing it you saved some money. Probably unacceptable.

Also, much of the remediation could not be automated anyway (for example you may not be able to assume the century is '19' for a given date field which can be in the future. You may need to do a comparison to a cutoff date, or you may only have to enlarge the field knowing that the value is supplied from elsewhere which will itself get fixed), so it fell back to people looking at the code, understanding it just enough, making the required changes, and planning/executing the testing.


I'm not a COBOL expert, but I suspect you misunderstand both the difficulty of teaching a machine to understand the problem well enough to automate fixes for it and the available computing resources at the time. This was still the era of Windows 95, the classic Mac, and expensive workstations.


I worked for a place that simulated what would have happened with an older device. It would have been pretty catastrophic... easily losses in the billions.

Remember that big old systems were the low hanging fruit in 1970. The business logic was COBOL and assembler, without things like functions. The system may have hundreds of duplicate routines for things like subtracting dates.


if you're asking "was the money well spent?" then I'm sure it was as well spent as usual, i.e. mistakes got made, a percentage was wasted and some contractors made very good rates. But in the end we got there.

If you're asking "would it be better to have done nothing?" then I would have to say No, of course not. What are you thinking?


>..and what would be the cost (when compared to those billions of dollars spent) if they weren't fixed at all.

I worked on fixing Y2K bugs at the time. It was all very methodical, planned, and tested on-site at a space agency. If we hadn't fixed those Y2K issues, weather forecasts would have suffered greatly. The cost of that happening? Hard to calculate, I think.


Lots of fixes were made. Everything would have broken if they hadn't been fixed.

Why are you sceptical about this?


Lots of people seem to bring it up now. It's been two decades, and it's raised by people who use it as an example for their other agendas:

"look at y2k: we spent so much money, and nothing happened!" (so far, so good) "...therefore, if we hadn't, nothing would have happened anyway!!" (uh, based on...?) "...and it was all a massive scam, just like any massive present issue!!!" (ah, there's the underlying reason)


People I've met definitely think it was an Apollo 13 type emerging problem rather than a largely preempted issue.


What you're saying here is that there was no crisis.

The fact that people worked on date issues for a decade does not mean the public scare campaign in the late 90s was justified.

You could equally claim airplanes would crash if they weren't refuelled, or people would die if garbage wasn't collected. In those cases, the media is educated enough to know the claims are pointless.

In the case of y2k, they had little understanding of how enterprise IT worked, and lots of businesses who were happy to encourage that ignorance.


And the dotcom bust came when it did in part because the Y2K spending was lifting all boats. A bunch of companies shifted from buying too much to buying not much.


I'm not sure why you're being downvoted. It wasn't really related to the dotcom bust per se. But the falloff in Y2K spend was part of the overall drop in IT spend on consultants, system refreshes, etc. We use dot-bomb as a shorthand but there were clearly a number of things that happened in the same general timeframe that made it something of a perfect storm.


To add a little nuance:

A global retailer you've definitely heard of that used to own stores in Germany spent a lot of time preparing for Y2K. This was a long and painful process, but it got done in time.

But problems still slipped through. These problems ended up not being that big and visible because a large percentage of the code base had just been recently vetted, and a useful minority of that had been recently updated. Every single piece had clear and current ownership.

Lastly, there was an enormous amount of vigilance on The Big Night. Most of the Information Systems Division was on eight hour shifts, with big overlap between shifts at key hours (midnight in the US), and everyone else was on call.

As midnight rolled through Germany, all of the point of sale systems stopped working.

This was immediately noticed (the stores weren't open, but managers were there, exercising their systems), the responsible team started looking into it within minutes, a fix was identified, implemented and tested within tens of minutes, and rolled out in less than an hour.

Pre-Y2K effort and planning was vital; during-Y2K focus, along with the indirect fruits of the prior work, was also critical.


I remember having to go into work at 4:00 AM to replace the shift that spent the night in the operations support center.

Fortunately, the guys that spent the previous year sweating the details got it right. We had one system that printed year 100 in its log which was only used by humans.

Unfortunately, I missed the party and the cash the night shift got.


> that printed year 100

Did it print the year "100" or "19100"?


At the Living Computer Museum in Seattle they have an IBM S/360 that prints out we’re in year 19120.


You know what, this made me realize a hilarious alternate reality/way things could have gone, where nothing got fixed in time, EVERYTHING printed dates like that, and after 2000 years we changed the dates again, with different aesthetic factions around whether to represent the date as "100" or "19100". (I would personally have been all over 5-digit years myself.)

Hmmmm. After thinking about that for a minute I realize that math and aesthetics do not mix and that one single format for years makes a lot of computers not get confused. I rest my case. :(


I’m sure I’ve seen websites in the last year or two from the 90s that still do this



Maybe it knows something we don't...


Is that the machine John Titor was after??


It wouldn’t matter because that is a point-of-use detail, e.g.

    printf("year: %s%d", rand()%2 ? "19" : "", year);
Where the real problem is the upstream year value.


It looks like it matters. If it prints "100" then it's representing the year 2000 internally as the year 100 when the year 1995 is represented internally as 1995. If it prints "19100" then it's representing 2000 internally as 100 when 1995 is represented as 95. That second scenario doesn't cause any problems -- 2000 is still 5 years later than 1995, etc.

(Or, if the log is expected to print 95 instead of 1995, then 100 is what we expect it to print for 2000, but the difference still matters -- 19100 in that case would show that something was seriously wrong.)


I don’t remember any more.

We fixed it a few days later, but it didn’t cause much trouble so it was largely forgotten.


I love these kinds of war stories of incredibly capable people. That was great planning for the unknowns.


I've worked in silicon valley for the past 11 years, in startups as well as two so-called FAANG companies. There are a lot of brilliant people out here.

So that is well known. What's not well known is this: there are pockets of brilliant people elsewhere, including Bentonville, Arkansas.

Especially in the years running up to 2000, I had the pleasure and honor of working with some of the most talented people in the world, even though we were all looked down on by many people on the coasts.


I knew a guy that worked at Walmart for years, right out of college. He said that the biggest issue he had with onboarding people was impressing upon them the scope of Walmart's enterprise. You don't just "move fast and break stuff". If your clever hack causes an outage in pharmacy, then you're directly affecting millions of people across the country that need medicine.


Yup.

It's funny, because for the first 2/3 of my tenure there, WalMart Information Systems Division did move extremely fast....and yet almost never broke anything.

It was an amazing thing to see.


Story I heard from someone who was at Walmart recently was that the surest way to get in trouble was to finish in a week what they had planned to take three months.


Things in Bentonville definitely started going downhill around 2004, and by the time I was laid off in 2009, most of the magic had been lost.

My contacts over the past 11 years have said that the division has continued its slide into bullshit.

Given how fantastic things were, it makes me very sad.


That's life in a cost center.

It is hard to contemplate just how destructive the "cost center" / "profit center" distinction has turned out to be. For us, though, it means we had better stay the hell out of cost centers.

A business performing the duties of another company's cost center, where costs may be amortized over multiple customers, can turn the very same tasks into a profit center.


Sounds like a very listenable collection of anecdotes and war stories. I would be very interested if you ever share :)

(You're almost certainly already aware of http://old.reddit.com/r/talesfromtechsupport - but just in case. And hint hint.)


You are very kind. I'll keep that in mind.

I was aware of /r/talesfromtechsupport but haven't frequented it much. I will take another look.


Yeah... prior to moving to the west coast (working not for a FAANG, but for a well-recognized name that competes for much of the same talent and offers comparable compensation), I was working in Atlanta.

I would argue the caliber of people I worked with there was generally better, whether they came from no-name universities or from known names that many wouldn't rank as highly (Georgia Tech vs Berkeley, say). It of course varied person to person, but the teams were generally stronger, senior devs were more senior, juniors tended to be hungrier to learn, etc. Maybe it was just the lack of the FAANGs sucking up the best people.


Indeed, there was a comment here a few years ago from someone who was rejected for an interview, the stated reason being "Sorry, we're only accepting candidates from top tier schools. We've never heard of Georgia Tech."

I was like "WTF? They've probably never heard of Waterloo either!"

(For anyone unfamiliar, these may not be household names, but they are definitely top tier CS schools.)


I worked for a hedge fund that was like that. You only got in as a fresh grad if you were in a top tier school, or you had a strong referral from an existing employee. I got in on a referral, despite going to a highly ranked engineering school, but not MIT or CMU.

Edit: we also hired a few idiots out of MIT. I remember one instance when an intern from MIT sent an email to a wide audience, something to the effect of "your API is breaking my code, fix your API". This was in C++. They were asking for something that didn't exist and then blindly dereferenced the nonexistent first element, causing a segfault. My response was to the effect of "you're dereferencing an empty vector. The API is fine. Fix your fucking code."

The intern's original email was sent to just my team, but CC'd nearly all of front office. In my reply I added all of back office so they were aware of the idiot. Never heard from them again...


The problem is there is no GT equivalent for most states.

The coasts tend to have clusters of many top caliber technical colleges.


I was in Minneapolis for Y2K... working release management/build engineering for a commercial ERP vendor, surrounded by Cray veterans (including my direct manager). The brainpower we put to Y2K was huge, but then again, we'd been selling product for a few years at that point as a "solution" to Y2K... just buy a new ERP system, and abandon your old in-house stuff rather than trying to fix it.

What a lot of Silicon Valley people don't get is that there are people just as smart as them everywhere.


While I don't have direct knowledge of Bentonville, the general impression that I have formed is that they are one or two cuts above any other enterprise of their size.


I once worked with an F500 retailer who had Bentonville 'graduates' among their IT staff. Sharp people among other sharp people. It takes a lot to keep the lights on pretty much everywhere. Walk into any big box store and look around at how many gadgets and systems there are.


Most everyone would be amazed at how much networked equipment is in a WalMart supercenter. By the time I left, there were four routers, four big core switches, a couple dozen 'edge' switches, and scores of access points. And that's just the networking gear. As of 2009, my team's network discovery system had identified tens of millions of separate IP addressable nodes on the global network.

At that scale, everything had to be almost perfectly automated.

And, to be clear, this was not a static, boring network. Stores, home offices, distribution centers, you name it, were being added, moved, expanded, deallocated 24 hours a day, all over the world.

And the device configs were far from cookie cutter. An eye-wateringly complex heuristic was required to dynamically create router, switch and access point config that would allow a given store to function.

And, one quarter in the early 2000s, we (Network Engineering) achieved six sigma (99.9999%) global uptime.

Things did start going downhill around 2004 or so, a few years after Kevin Turner became CIO.


Yeah, Walmart is a really beautifully ugly beast of a company. On all levels, not just tech...

I'll have to look into Kevin Turner though, I don't know who that is...

Anything you’d care to share/point at?


https://en.wikipedia.org/wiki/B._Kevin_Turner is the guy, but it doesn't give any mention as to why I (and many of my peers) believe that his tenure as Walmart CIO had a strong negative influence on ISD's trajectory.

I don't have much time right now, but I'll share a moment when I saw him interacting with some people during a crisis.

We got hit really hard with https://en.wikipedia.org/wiki/Code_Red_(computer_worm) even though only a minority of our systems were running Windows at the time. There were, at the time, a pair of Windows servers in each store, and each store was connected via a 56k frame relay circuit to the home office, along with a modem connected to a standard phone line for dial backup in case that was down.

Anyway, Code Red rapidly spread through all of these servers, and together they created enough traffic to basically bring down the frame relay network. We were, at the time, slightly over-subscribed at the T1 termination points, so if most of the stores on a termination leg maxed out their individual 56k links at the same time, the whole leg would be effectively down.

(Apologies for the unexpected dive into old networking stuff...it kind of just came out. :)

So most of our stores were down hard, meaning that most card transactions would not work. There was a facility to allow credit cards to go through without verification with a relatively low charge ceiling. But especially back then most transactions were NOT credit, and so weren't working.

This was a bad, bad situation, and it persisted for a number of days. It might have garnered some national news attention.

There were a number of war rooms going on. My team was in one, and we were working towards a solution by using a transient dial backup connection to reach into the store router and apply a draconian ACL.

ANYway, back to Kevin Turner. We saw him talking to a group of VPs and directors outside of our war room. He dismissed most of them, except for one, with whom he continued to have a very heated conversation.

Turner was definitely the aggressor; he got closer and closer to this person, and got louder and louder. He ended up backing him against a wall, jabbing his finger in his chest and shouting in his face for a few minutes before walking away.

Just one story. We didn't like KT very much.


Heh, I had the same network back in that time too! 3640s all the way down!

I had to recover the password on one when I inherited that network. I will never forget that password: Feet4Monkey.

I'll also never forget the ix.netcom.com password that was initially assigned to me...


> ix.netcom.com

Nice, that was a blast from the past. I used netcom.com for a few years in the early 1990s. Good times.


> I once worked with an F500 retailer who had Bentonville 'graduates' among their IT staff.

May I ask which one? Just idle curiosity.


Home Depot. Even ~15 years ago, they had an amazing amount of technology to support a single store, let alone the supply chain.

I once worked in Point of Sale (POS) at Darden and several POS vendors and consulting companies. Even 25 years ago, your average fast food or mid-range (e.g. Red Lobster) restaurant had a network of ~10 or more computers running various POS, inventory, ordering, scheduling, and other processes.


Walmartians?

Walmart has had a pretty savvy tech presence in SV for decades...


I generally hate being woken up at night or having to work at night. But there is something electrifying in the atmosphere when you get a bunch of smart people in a room at 2 in the morning to solve a major problem as quickly as possible. Some part of me really enjoys late night war rooms.


The software maintenance research literature from 1999 to 2002 or so is littered with this kind of stuff, as researchers got to embed in teams doing one-off massive maintenance projects to fix Y2K stuff.


I remember hitting a Y2K bug before 2008. We were working on a huge release, updating all of our external dependencies to the latest version and moving from 32 to 64 bit. A 3rd party commercial lib we were using at the time would parse/format %y as a 2-digit year. There was a macro definition that, when defined, would handle a 4-digit year, but it was off by default.

We caught it early in development, but it was still a pain in the ass rebuilding all of the projects that depended on it.


I went on holiday over actual Y2K, flew to Australia and spent the Big Night on the beach, completely incommunicado. It focused the whole team, we had to be OK for it, because if we weren't we were fucked ;)

We did multiple dry runs, setting server times to 31/12/99 23:55 and waiting 5 mins to see if anything died. Mostly it was OK, most of our systems were OK, but a few were dodgy and needed some changes.

The Big Night rolled around and we were OK. The planning had worked, and nothing fell over. I got the email on Jan 1st - all good.


I spent the night in the processing center in the basement of a small regional bank helping run tests and look out for anomalies. It was the culmination of years of work and a lot of money to make sure that it was a non-event.


When I worked at Intel in the mid-to-late 90s, I recall talking to an analyst one night in SC-5 on the balcony, and they were having problems rebuilding the financial system's fields to be long enough strings for the amounts of money Intel had...

Did banks go through some of the same growing pains?


Why the generic description? Are you unable to say which retailer?


The only global retailer that we've all heard of that owned stores in Germany in 1999 but doesn't now (note the "used to own stores" in the comment) is Walmart: https://www.theguardian.com/business/2006/jul/28/retail.mone...


That's a bingo. I wasn't particularly trying to obfuscate.

I'm curious how you so quickly figured it out though.


As a German I noticed that you said global retailer. Only Walmart is (was) truly global imo and left Germany.


Just being cute. It's WalMart Stores, Inc.


Thanks, just wanted to nudge you to make the example as complete as possible.


Couldn't the rollover be simulated ahead of time by simply setting the date forward? Seems crazy that you had to have people on the ground fixing things as the clock struck 0.


All systems in the gigantic clockwork of interconnected legacy would need to be rolled forward simultaneously, and it wasn't unreasonable to expect that there were systems in that mix that nobody knew could be affected. To this day I read about systems that come up as a surprise when they start to fail: nobody owns them, and the people responsible have long since moved on.


Two decades ago, I was roughly familiar with the details, after the fact, but I can't remember now.

As sibling comments have noted: these systems are an exceedingly complex pile of separate but interdependent pieces.

The rollover from 31-Dec-1999 to 1-Jan-2000 had happened thousands of times before the actual day came. Some piece/assumption/etc had been overlooked.


Sure. On systems under your control.

It's the network of interacting systems that prevents this from being so simple.


Yep, the super fun of getting a Y2K-compliant specification of how the data will come over the interface, only for it to fail with '1910' instead of '2000' in the 4-character date field that worked in test. People on site to correct problems were a needed item for some, even after testing.


You can do that with test systems, but not live ones. People object to having their bills dated 98 years ago and overdue on the day they were issued ;)


From someone who went through it and dealt with code, it was a real problem but I also think it was handled poorly publicly. The issues were known for a long time, but the media hyped it into a frenzy because a few higher profile companies and a lot of government systems had not been updated. In fact, there were still a number of government systems that were monkey patched with date workarounds and not properly fixed well into the 2000's (I don't know about now but it wouldn't shock me).

There was a decent influx of older devs using the media hype as a way to get nice consulting dollars, nothing wrong with that, but in the end the problem and associated fix was not really a major technical hurdle, except for a few cases. It is also important to understand that a lot of systems were not in SQL databases at the time; many were in ISAM, Pick, dBase (ouch), dbm's (essentially NoSQL before the NoSQL hype) or custom db formats (like flat files etc.) that required entire databases to be rewritten, or migrated to new solutions.

My 2 cents, it was a real situation that if ignored could have been a major economic crisis, most companies were addressing it in various ways in plenty of time but the media latched on to a set of high profile companies/government systems that were untouched and hyped it. If you knew any Cobol or could work a Vax or IBM mainframe you could bank some decent money. I was mainly doing new dev work but I did get involved in fixing a number of older code bases, mainly on systems in non-popular languages or on different hardware/OS because I have a knack for that and had experience on most server/mainframe architectures you could name at that time.


>dbase

At the time I was managing a dBase / FoxPro medical software package...we were a small staff who had to come up with Y2K mitigation on our own.

Our problem was that we only had source code for "our" part of the chain... other data was being fed into the system from external systems where we had no vendor support.

Thus our only conceivable plan was to do the old:

  If $year<10; 
    date="20$year"
  else 
    date="19$year"

It worked in 99.9% of the cases, which was enough for us to limp through and just fix the bad cases by hand as they happened. Eventually we migrated off the whole stack over the next few years, so it stopped being a problem. I'm sure many mitigation strategies did the same...


A much more insidious problem with the Y2K bug was the leap year calculation. As you point out, the 2-digit-year thing was relatively easy to fix.

https://en.wikipedia.org/wiki/Year_2000_problem#Leap_years


I still love the fact that if you only implemented the first rule, or knew enough to implement all 3 rules, it would totally work, but if you implemented 2 of the 3 rules you were wrong.

It taught me a great lesson about results not proving correctness as how you got there could bite you later.
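A quick sketch (Python, purely for illustration) of why stopping at two of the three rules bites you, while one rule or all three happen to get 2000 right:

```python
def leap_rule1(y):
    # Rule 1 only: divisible by 4 -> leap
    return y % 4 == 0

def leap_rules12(y):
    # Rules 1+2: add the century exception, but not the 400-year rule
    return y % 4 == 0 and y % 100 != 0

def leap_full(y):
    # All three rules: centuries are leap only when divisible by 400
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

# 2000: rule 1 alone is accidentally right; rules 1+2 get it wrong
assert leap_rule1(2000) and leap_full(2000) and not leap_rules12(2000)
# 1900: only the full rule set is right (1900 was not a leap year)
assert leap_rule1(1900) and not leap_full(1900)
```

Exactly the "results don't prove correctness" lesson: the two-rule version passes every test from 1901 through 1999 and then fails on the one year that mattered.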


I don't get why people didn't know all three rules. In elementary school, when the calendar was taught, the complete leap year rules were simply taught by the teacher. We even joked about people born on the 29th of February. Why wasn't this taught everywhere?


Heh, this sentence seems particularly relevant to this whole discussion...

> This method works fine for the year 2000 (because it is a leap year), and will not become a problem until 2100, when older legacy programs will likely have long since been replaced.


Wouldn't it be this?

    date="200$year"
Or does foxpro somehow know to zero pad a 1 digit number?


It's been a very long time since I've worked with dBASE/FoxPro, but from what I remember it stores data in a fixed-width format. So for a column to store an integer, it will zero pad the front of it to fit the width of the column.


This is what I remember a lot of too. While they were little hacks, and imperfect, they bought companies enough time to resolve the issue more thoroughly.


and it was pretty embarrassing when I forgot to update this same thing in 2010


Man, I've just remembered that I had to fix this kind of bug in 2010, and I am sure I just bumped the number to 20. I guess somebody has had to fix it lately; I just hope they didn't just replace the 2 with a 3.


I'll bet they did. Or, if they were proactive, they changed it to 4 or 5 and solved the problem for the next 20-30 years.


> I don't know about now but it wouldn't shock me

We'll be up for a few Y2K bugs at the start of every decade because 'year += year < n ? 2000 : 1900' was such an easy workaround, for n=20,30,40,...

https://www.pymnts.com/news/payment-methods/2020/new-years-b...
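A minimal sketch of that "pivot year" windowing hack (Python; the pivot of 20 is just an example value of n):

```python
def expand_year(yy, pivot=20):
    # Two-digit years below the pivot are treated as 20xx, the rest as 19xx.
    # Each choice of pivot n just postpones the failure to the year 20n.
    return (2000 if yy < pivot else 1900) + yy

assert expand_year(5) == 2005
assert expand_year(99) == 1999
assert expand_year(20) == 1920  # breaks again once the pivot year arrives
```

Cheap to deploy in 1999, and exactly why these bugs keep resurfacing at the start of each decade.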


There was a related story here on HN recently. A system was assuming a 101-year-old was actually only 1 year old, presumably because of 2 digit arithmetic.

https://news.ycombinator.com/item?id=22356373


They were a modulo 100 1-year-old. Modulo 100 ages: feature, not a bug.


There's less cause to use small data sizes for timestamps and date codes now. Storage has grown by orders of magnitude; the idea that a numeric data type would only be large enough to store a 2-digit year, or that you would want to save disk space by abbreviating an extra 2 letters, is foreign to a lot of new developers. And the 20-year-old systems are slowly disappearing...


Perhaps you could expand on this, as I was not involved in Y2K myself. Storing dates with 2-digit years doesn't seem to make much sense as a storage-saving mechanism to me.

If you want to store only the year, why not store the years since 1900? That way in a single byte you can go all the way up until 2155. If you want to store it bit-aligned rather than byte-aligned why limit yourself to 00-99? 7 bits can store the year from 1900-2027.

If you want to store day + month + year with a two-digit year you have 365.25 × 100 ≈ 36,525 options. A two-byte value can represent 65,536 options, which can hold all days until 2079.

If you are storing your dates as flat strings (DD-MM-YY or so) then you are already wasting such a large amount of bytes that "storage concerns" seem moot. You are already using 8 bytes to store a date that can be stored in 2 bytes, at that point why not use 10 bytes?

String formatting and display seems more obviously affected to me than storage.
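To make the arithmetic above concrete, here's an illustrative sketch (Python, my own encoding; not any particular historical system) of the 2-byte day-count scheme:

```python
from datetime import date, timedelta

EPOCH = date(1900, 1, 1)

def pack_date(d):
    """Days since 1900-01-01, stored in 2 bytes (good through mid-2079)."""
    n = (d - EPOCH).days
    assert 0 <= n <= 0xFFFF
    return n.to_bytes(2, "big")

def unpack_date(b):
    return EPOCH + timedelta(days=int.from_bytes(b, "big"))

# The maximum 2-byte value lands in 2079, matching the estimate above
assert unpack_date(b"\xff\xff").year == 2079
```

Two bytes versus eight for "DD-MM-YY", and no century ambiguity; the trade-off is the conversion code and CPU time the downthread replies describe.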


> If you want to store only the year, why not store the years since 1900? That way in a single byte you can go all the way up until 2155.

You're making an assumption in that statement that there's a byte and it's eight bits in length. This was not always true. Punch cards didn't really work that way, and a fair number of CPUs didn't either.

Early machines used various decimal systems and 6-bit characters were widespread also. This explains a bit of why the C language and descendants often have native support for Octal literals and Unix permissions are built around three bit groupings.

On a personal note, even in the very early 90's, I also remember using a CDC machine (a descendant of Seymour Cray's 6600) that supported 60-bit words with 6-bit characters. Pressure to move to 64-bit words with 8-bit bytes resulted in a dual-mode design that could be booted either way.


CDC...if it was like the CDC system I worked on back in the early 80s, the 6-bit chars (64 distinct chars, including control chars) did not include lower case Latin. Our system was munged so that lower case alphabetic chars could be represented by prefixing a backslash. I wrote my dissertation on that system, and fortunately the editor I used displayed a backslashed char as lower case.


This is a good question. I found a copy of Fujitsu COBOL 85 manual [1] (the next release was post y2k, in 2002) and was surprised to see that CURRENT-DATE, INTEGER-OF-DATE use 4-digit years.

However some functions from that time (eg ACCEPT, p. 544) do use 2-digit years. I think you have it exactly right, it was about facilitating string formatting/display: ACCEPT took input from the terminal screen. (I think it is, approximately, gets() + strptime(), though I don't see any talk of error handling). Programmers probably just stuffed the result into a record before writing to DASD (disk), optimizing for programmer time and program complexity over storage.

Perhaps 2-digit years for ACCEPT were an optimization for data entry workers?

A modern z/OS COBOL guide [2] has 4-digit years throughout as far as I can see.

Whilst I'm here, I'm also going to say: "VSAM", "PDS", "TSO", "LRECL" and "ABEND" (just because I haven't heard those words for a quarter century and now I'm feeling nostalgic).

[1] http://testa.roberta.free.fr/My%20Books/Mainframe/Fujitsu%20...

[2] https://www.ibm.com/support/knowledgecenter/SS6SG3_4.2.0/com...


What about "JCL" ? Those ABENDs don't just show up on their own.


When a 5MB disk storage system is the size of a small car and costs $50k, you fight for bytes. Or if your data is stored on magnetic tapes, and reading in a dataset can be costed by the kilobyte, takes minutes per megabyte, and requires a million dollars' worth of equipment and dedicated staff, you fight for every byte.

The phone I carry today is hundreds or thousands of times more powerful than the $5,000,000 mainframe I was working around Y2K and the apps on it - most delivered freely - make it far more capable than the raw processing power difference would indicate.


If you are fighting for bytes, then storing dates as "DD-MM-YY" strings is very inefficient, as you are wasting a factor of 4x on storage compared to just storing the days since 1900 as a 2-byte integer. I understand that storage used to be expensive, but I don't understand where the two-digit year comes into play, as I can't envision an efficient storage mechanism that is limited to exactly the years 1900-1999.


You also need to have the code to do the conversion. If you store, in a byte, the years since 1900 (or 1970), then every time you load and/or process a date to be displayed you need to add 1900 (or 1970) and then convert from int to str, which takes time (machines were slow) and code.

And the code takes space.

You may think the space is trivial, but I remember working for nearly two weeks on a commercial system to save around 15 bytes, and it was worth it.

Then the culture was embedded ... you worked to save bytes, and didn't do things that would cost more bytes. And the systems worked, and didn't need updated, so they persisted.


BCD (Binary-coded decimal) comes to mind; which can store values 0-99 in one byte and with native support in older CPUs.
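A tiny sketch of BCD packing (Python, just nibble arithmetic for illustration; real BCD hardware did this natively):

```python
def to_bcd(n):
    """Pack 0-99 into one byte: tens in the high nibble, units in the low."""
    assert 0 <= n <= 99
    return (n // 10) << 4 | (n % 10)

def from_bcd(b):
    return (b >> 4) * 10 + (b & 0x0F)

# A nice property: a BCD byte reads like the decimal value when shown in hex
assert to_bcd(71) == 0x71
assert from_bcd(0x99) == 99
```

Which is why a two-digit year was such a natural fit: it's exactly one BCD byte, and the print routines for money already handled the format.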


One of my tasks for Y2K prep was to modify an in-house-developed record-based "database" to accommodate the century change. Originally developed in the '60s, the year had been a single digit, because everyone thought commercial DBMSs would be viable in a few years, so there would be a replacement before the decade rolled over. :-| They'd grafted decade handling into the thing before I got there so it could survive the '70s, because they had a lot of expensive data in various file sets. Converting to a commercial database was expensive and represented a huge business risk, so...


If you have eight bit bytes, you would of course store these as three two digit BCD numbers requiring one byte each, and need no conversion logic. (The code for printing BCD numbers is already there, because you need it to print amounts of money anyway).


A whole byte for a year? What a waste in 1995 when you've got just 640KB for DOS, your code and your data.

I looked up my old code. Shave off a bit of the year and:

  struct Date { 
       unsigned day : 5; 
       unsigned month : 4; 
       unsigned year : 7; 
    };
a full date in 16 bits. (well, in borland C++ 5 at least)

and as it was an accounting application, working with the bitfields was so much nicer than ordinal days you would use nowadays.

month and years was all you really cared about anyway when summarising data. in finance all months have 30 days anyway.
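The same packing, sketched in Python for anyone who wants to play with it (the 1900 base for the 7-bit year is my assumption; the original struct doesn't show the offset):

```python
def pack16(day, month, year):
    # 5 bits day (1-31), 4 bits month (1-12), 7 bits year offset from 1900,
    # mirroring the bit-field layout of the C++ struct above
    assert 1 <= day <= 31 and 1 <= month <= 12 and 0 <= year - 1900 < 128
    return day | (month << 5) | ((year - 1900) << 9)

def unpack16(w):
    return (w & 0x1F, (w >> 5) & 0x0F, 1900 + (w >> 9))

assert unpack16(pack16(31, 12, 1999)) == (31, 12, 1999)
assert pack16(31, 12, 2027) < 0x10000  # a full date still fits in 16 bits
```

Note the 7-bit year runs out in 2027, so this layout trades the Y2K cliff for a Y2028 one.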


The problem came from the '60s, when memory was really expensive on mainframes and you had things like drum storage.


> why not store the years since 1900

Most input systems were punch cards, which were designed to be fixed-length textual input directly from a user at a card punch machine. Teaching users how to map from “years since 1900” to the correct keys on the card punch is a lot more confusing and error prone than having the computer automatically convert from "71" to 1971.


That makes a lot more sense, but if it is just input then why not do the conversion of the string "71" to the efficient byte representation of 1971 when it is moved into the storage. If the computer can do a mapping of "71" -> 1971 then why can't it also do a mapping of "10-10-71" -> 26214 (days since 1900-01-01).


Then you would need another column on the card, and there are even fewer columns on a card than there are bytes in RAM.


The punched card, for a long time, _was_ the storage.


More importantly we have figured out algorithms for time that account for timezone, daylight savings time, different calendar systems, and probably something else I'm not aware of. These all work by counting seconds since a fixed date.

Note that the algorithms were known since at least the late 1960s. Storage space even then shouldn't have been enough of a concern. However people still screw it up all the time.


This is the accurate point. We had epoch date tracking since the 1960s or so, and the coding of a two-digit year was out of laziness and convenience in most cases. Not to bash anyone, but most of the systems I saw using the lazy method were business systems where there often wasn't thoughtful engineering, just "get it done". Those two don't have to be mutually exclusive IMO, but in the 70s/80s it seems a lot of the time they were, simply because we had predominantly two classes of developers, scientific and business, where business devs took less of an engineering mindset and more of a get-it-done-in-a-specific-timeframe approach, so they "hacked" it out. Plus it was new, so stuff happens.

Sadly this stuff still goes on today, just in slightly different ways. Data type and data structure choices are critical, and it isn't as important to me that someone know how to write an algorithm for a RB-tree (or pick your DS), but it is critical an engineer knows which data structure/type to choose for specific use cases. This is one of my biggest complaints with the way a lot of engineering interviews are structured today, they want you to regurgitate known algorithms which is a waste of time and personal memory, find out if the person knows when they'd use a specific data structure and why the rest is easily researched. I mention this because to me the fundamental flaw of the y2k problem was one of data structure and data type choice.


Seconds since the epoch doesn’t work for times in the future. Consider a contract that will terminate at 2025-02-28 12:00 in New York: will that be -05:00 or -04:00? It could change.


And in that vein, no one (sane) is writing their own date libraries anymore, there's no reason to try to customize that for your application.


> And the 20-year-old systems are slowly dissapearing

There's still COBOL code out there running core business systems that's older than I am, and I'm closer to 60 than I am to 50.


One system I worked on showed the date as year 3900, because the function for getting the current year would return 2 digits for years < 2000, and the full 4 digits for years >= 2000. So the code did 1900 + GetYear(). Replaced it with just calling GetFullYear() or something like that.
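A hypothetical reconstruction of that bug (function names made up to follow the comment's description):

```python
def get_year(full_year):
    # Mimics the described API: 2-digit result before 2000, 4-digit after
    return full_year % 100 if full_year < 2000 else full_year

def buggy_display_year(full_year):
    # The unconditional 1900 offset that was correct last century
    return 1900 + get_year(full_year)

assert buggy_display_year(1999) == 1999  # fine through 1999
assert buggy_display_year(2000) == 3900  # then suddenly "year 3900"
```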


It's a shame that we as engineers can't just say "this maintenance is required to fix this known issue; it's not a huge deal but will cause trouble if it's not dealt with." Instead we have to be all doom and gloom and tell management that the company/world will end.


A colleague was just dealing with a client of theirs who was still using TLS 1.0. They're running classic ASP on Windows Server 2008, and can't (effectively) migrate. The colleague had been raising the alarm for months (since they started on the project) that "this is going to mean all your systems will stop working in early 2020", but no one seemed to care or understand.

They did, last week, put in... haproxy as an SSL terminator in front of the main server, and will test a switchover this week. This was 8 months of foot-dragging for about 3 hours of setup/config, and a couple more hours of testing. When all your clients are hitting a web server, and their browsers will all start rejecting your certs, things will get ugly fast, as in "your business will effectively stop functioning". It just sounded like "doom and gloom" but... how do you message this effectively? It requires the receiving parties to actually understand the impact of what you're saying, regardless of the terms you use.


Money. Money is the language that makes a business listen.

You say "If we don't solve this by X date, we are looking at losing the ability to take in revenue and possible law suits for failure to perform. It will take Y amount of time to accomplish this"

If you don't couch things in terms of time, money, resources, and client impact, they will not care. You can say "Hey, it is super bad that we are running Tomcat 7.0.0, there are a lot of security vulnerabilities" and what they will hear is "blah, blah, blah, we can delay this".


I hope your colleague had a paper trail of the alarms he raised when it came time to point fingers.

For executives with limited IT experience I honestly don't know if there is a good solution, other than having them deal with the disaster and having a clear paper trail that points the finger in their direction. They won't make the same mistake twice.


He did/does.


> How do you message this effectively?

Assuming the entire org is using a captive proxy with an installed CA, identify/isolate every upper-level management person's system(s) in some way, and reroute their access to all internal applications through a deliberately horribly misconfigured HTTPS reverse proxy running TLS 1.0. Their modern browser will explode.

"AAAA, sorry, that thing we've been telling you about for months hit us earlier than we thought it would but thankfully it's only hit all the internal stuff"

"Wait so this is what all of our customers will see?"

"Yes, all of them. Nobody will be to reach us, none of our APIs will work, and we will also break our SLAs on every single contract."

---

You probably want to use GPO to disable bypassing the security warning prompt, and/or set up HSTS for your domains beforehand.


We've been forced to deal with the TLS issue by our software product's customers. Some of our technically-minded folks have been raising the issue for a while but new features are sexy and sell, and fixing not-yet-broken code doesn't. Amazing/discouraging that only the threat of immediate loss of six figures of income gets any attention.


Yeah... "let's do Fitbit integration!" seems to win out over "let's make sure we can upgrade core systems to make sure they don't break in March".


Similar story: we had an older version of MySQL on a 2008r2 server until a few weeks ago.

I'd been advocating for 2 years to migrate off that box.


That client wouldn’t happen to be a small private college in WNY, would it?


No. :)


I'd argue that if it wasn't for the media "doom and gloom", a lot of companies (maybe even the vast majority) would not have had their C-level folks divert money towards fixing the issue.

The simple reason is that any one company may have been liable for a very small portion, which if it failed, would not have caused much trouble. But that failure combined with many other failures down the entire chain of connected and unconnected software would have added up to something much greater than the sum of parts.

We saw similar stuff happen with Flash and Silverlight (which wasn't as widely reported a concern, since Silverlight usage was so low, but I saw it within my company).

The media pressure was a significant reason why every company needed to have a plan to deal with it.


> but the media hyped it into a frenzy because

Because they make more money if people are scared and clicking refresh every few minutes.


Yes, it was a real crisis. There's revisionist history now, with some saying it was no big deal. It was a big deal, and many people spent many hours in the 90's assuring that the financial side of every business continued. I am starting to get a bit offended at the discounting of the effort put in by developers around the world. Just because the news didn't understand the actual nature of the crisis (Y2K = primarily financial problems) is no excuse to crap on the hard work of others. It is sad that the people who got the job done by working years on it get no credit, precisely because they actually got the job done.

I see this as a big problem because Y2038 is on the horizon and this "not a big deal" attitude is going to bite us hard. Y2K was pretty much a financial server issue[1], but Y2038 is in your walls. It's control systems for machinery that are going to be the pain point, and that is going to be much, much worse. The analysis is going to be painful and require digging through documentation that might not be familiar (building plans).

1) yes there were other important things, but the majority of work was because of financials.
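Since Y2038 keeps coming up in this thread: a quick Python sketch (just an illustration, not anyone's production code) of why the date is 2038-01-19. A signed 32-bit time_t counts seconds from the 1970 epoch and wraps once it passes 2**31 - 1:

```python
# Simulate the Y2038 rollover of a signed 32-bit time_t.
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # last representable second


def as_int32(seconds: int) -> int:
    """Wrap a second count the way a signed 32-bit counter would."""
    return (seconds + 2**31) % 2**32 - 2**31


last_ok = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)
after = datetime.fromtimestamp(as_int32(INT32_MAX + 1), tz=timezone.utc)

print(last_ok.isoformat())  # 2038-01-19T03:14:07+00:00
print(after.isoformat())    # wraps back to 1901-12-13T20:45:52+00:00
```

One second past the limit, the counter goes negative and the clock jumps back to December 1901, which is why embedded controllers with 32-bit clocks are the worry here.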


This is always such an annoying problem:

1. Shine light on problem and make sure people hear about it

2. People respond and fix it

3. Outcome not as bad as you said it could be /because it was fixed/

4. Some time later "That wasn't a big deal"

No, it wasn't that big of a deal because we worked hard to fix it!


Mark my words, if the containment measures work and Covid-19 goes away, this is exactly what people will be saying. Drives me crazy.


Yeah, I'm not sure if the original question was asked because of this, but I've had a couple conversations where we've compared the COVID-19 situation to Y2K.

Even in this case, we have evidence of what happens from delaying response (eg: China, Italy) yet if proactive steps like shutting down schools, events and gatherings are effective you'll get people complaining it was all an overreaction.


To be honest I'd choose that timeline in a heartbeat over what I think we are barrelling to... But yes, it will drive me crazy if that happens.


I have a vivid recollection of reading a Scientific American article in late '98 or early '99 that claimed that no matter how much time and $ was put into fixing Y2K before 1 Jan 2000, there was going to be a disaster; it was just a question of having a bigger or smaller disaster, even with months of best effort. I have recently looked for that article, without success. Does anyone recall it / have a PDF (image) of it?


Might be the October 1998 article "Y2K: The End of the World as We Know It", but it's behind a paywall and I don't subscribe.


This pattern extends well beyond Y2K in software. At a company I work with, if everything is going smoothly and things are working, devs get a ton of crap for not getting more done, no matter the pace and productivity. Of course they also get crap when things aren't working.

When people don't understand something, they can't tell the difference between a lot of effort to make something work smoothly and "it's clearly not that big of a deal"


It was also a big problem for Medical records as well right? I recall hearing that there was a lot of work around making sure that immunization records and things like that were accurately converted.


Yeah. I seem to remember some MUMPS people talking about stuff they were having to do. I programmed Perl at the time, and MUMPS made no sense to me (it looked like line noise), but they loved it.


Y2038 will be interesting because one wonders how many engineers will be around with the skills to address it. Even now it might be hard to scrape together a team.


I wonder how many systems still being sold today have 32-bit timestamps? I get the feeling you are dead-on right about the skill problems, but I also wonder how many folks are only consumers and not originating any of the programming.


>Y2038 is in your walls

Fuck me, this is true


I was enlisted in the USAF and was assigned as part of a two-person group to deal with Y2K issue for all unclassified systems on the base. We were outside the normal AF procedures because the base didn't have wartime mission. The two people assigned to the group were myself (E-3 or E-4 at the time) and a very junior (but smart) lieutenant.

We basically inventoried every unclassified computer system on the base. If it was commercial, off-the-shelf software that could be replaced, we recommended they replace it. If it could not be replaced with a newer version (because it ran software that could not or would not be replaced), we replicated and tested it by changing the computer hw clock. In all cases we recommended shutting down the computer so it wasn't on during the changeover.

Most home-grown systems were replaced with commercial software.

One interesting case was a really old system, I think it had something to do with air traffic control. It was written by a guy who was still employed there and he was still working on it. I got to interview him a bunch of times and found the whole situation fascinating and a little depressing. Yes, he was storing a 2-digit year. He didn't know what would happen when it flipped. He didn't feel like there was a way to run it somewhere else and see what would happen (it's very difficult to remember but I think it was running on a mainframe in the comm squadron building).

The people in charge decided to replace it with commercial software. Maybe the guy was forced to retire?

Overall the base didn't have any issues but only because they formed the "y2k program management group" far enough ahead of time that we were able to inventory and replace most everything before anything could happen.


Old Y2K project manager here. We had a multiyear project, found issues, fixed, and deployed fixes. Both hardware and software issues. I faced off against internal business, external clients, internal audit, internal compliance, and external regulators. We had IIRC nine potential failure dates including 2000-Feb-29. My ultimate project documentation required twenty 5 inch binders and a hand cart to deliver the package to our local CEO for sign off. Pretty sure he didn't read anything and simply just signed the cover page. We rocked the project, I passed all aggressive audits and earned myself a nice bonus that year for having a successful Y2k rollover.

Then came 2000-Feb-29 and it happened: I had a risk management system hosted out of the UK that just didn't work. Had to file the system failure through to internal global management and domestic regulators.

I was thrilled. First because that system owner had refused to conduct global integrated testing, so I could blame the SO. Had the request, negotiation, and finally the outright refusal in writing. The failed system was relatively trivial domestically. Risk wasn't calculated for one day on a global platform, and that risk didn't hit my local books. Ha ha, sucks to be you. Most importantly, I was thrilled because I could point to the failure and say "see, that is what would have happened x100 if we hadn't nailed the project." It was a great example for all the assholes who bitched about the amount of money we spent.


I worked at a regional bank. Like many banks, we offered mortgages, so starting in 1970, the 30 year mortgages had a maturity date in 2000, and our bank had begun the process of adapting its systems from 2-digit to 4-digit dates.

Basically all of our software was written in COBOL, and most COBOL data is processed using what we'd consider today to be string-like formats. And to save space (a valuable commodity when DASD (aka hard drives) cost hundreds of thousands of dollars, and stored a few megabytes of data) two-digit dates were everywhere.

I started in 1991. The analysis had been done years before, and we knew where most of the 2-digit problems were, so it was just a matter of slowly and steadily evolving the system to use 4-digit dates where possible, or to shift the epoch forward where that made sense.
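The "shift the epoch forward" approach above is usually called windowing: rather than widening a 2-digit field to 4 digits, you pick a pivot so the two digits map into a 100-year window. A rough Python sketch (the pivot value 50 is purely illustrative; real systems picked per-field pivots):

```python
# Windowing: interpret a 2-digit year via a fixed pivot.
# Years below the pivot are 20xx; the rest are 19xx.
PIVOT = 50  # illustrative choice: window covers 1950-2049


def expand_year(yy: int) -> int:
    """Map a two-digit year onto a four-digit year using the pivot."""
    return 2000 + yy if yy < PIVOT else 1900 + yy


print(expand_year(99))  # 1999
print(expand_year(1))   # 2001
```

The catch is that once real years leave the chosen window, the same bug comes back, which is why some windowed fixes resurfaced around 2020.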

Every few months we'd deploy a new version of some sub-system which had changed, migrate all the data over a weekend, and cross off another box in the huge poster showing all the tasks to be done.

External interfaces were the worst. Interbank transfers, ATM network connections, ATM hardware itself, etc, etc. We mostly tried to switch internal stuff first but leave the APIs as 2-digit until the external party was ready to cut over. Similarly between our internal systems: get both ready internally, migrate all the data, and then finally flick the switch on both systems to switch the interfaces to 4-digit.

Practically, it meant that our development group (maybe 30 people?) was effectively half its size for 5 or 6 years in the early 90's, as the other half of the group did nothing but Y2K preparation.

All of these upgrades had to be timed around external partners, quarterly reporting (which took up a whole weekend, and sometimes meant we couldn't open the branches until late on the Monday after end-of-quarter), operating system updates, etc, etc. The operations team had a pretty solid schedule booked out years in advance.

We actually had two mainframes, in two data centers: one IBM 3090 and the other the equivalent Amdahl model. We'd use the hot spare on a weekend to test things.

It was a very different world back then: no Internet, for a start. Professional communication was done by magazines and usergroup meetings. Everything moved a lot slower.

I left that job before Y2K but according to the people I knew there, it went pretty well.


I worked for Columbia/HCA, now HealthCare of America, at the time and we started gearing up for Y2K in January, 1997.

Every system, every piece of hardware - both in the data centers and in the hospitals - had to be certified Y2K compliant in enough time to correct the issue. As I recall, we were trying to target being Y2K ready on January 1, 1999 but that date slipped.

A "Mission Control" was created at the Data Center and it was going to be activated on December 15, 1999, running 24 hours a day until all issues were resolved. Every IT staff member was going to rotate through Mission Control and every staffer was going to have to serve some third shifts too.

I left Columbia/HCA in June, 1999 after they wanted to move me into COBOL. I had no desire to do so and I took a programming position with the Tennessee Department of Transportation.

I remember my first day on the job when I asked my boss what our Y2K policy was. He shrugged and said "If it breaks, we'll fix it when we get back from New Year's".

What a difference!!!


> I remember my first day on the job when I asked my boss what our Y2K policy was. He shrugged and said "If it breaks, we'll fix it when we get back from New Year's".

I'm a little surprised. TDT is in a critical business too (transportation).


I worked for Hyder, the Welsh Water and Gas authority, on their Y2K project, from March 1998 to November 1998.

Their billing and management system was written in COBOL, and contained numerous Y2K bugs. If we did nothing, then the entire billing system would have collapsed. That would mean Welsh people either receiving no bills, or bills for >100 years of gas/water supply, depending on the bug that got triggered. Very quickly (within days) the system would have collapsed, and water/gas would have stopped flowing to Welsh homes.

Each field that had a date in it had to be examined, and every single piece of logic that referenced that field had to be updated to deal with 4 digits instead of 2.

I wasn't dealing with the actual COBOL, I managed an Access-based change management system that catalogued each field and each reference that needed to be changed, and tracked whether it had been changed or not, and whether the change had been tested and deployed. This was vital, and used hourly by the 200+ devs who were actually changing the code.

We finished making all the changes by about December 1998, at which point it was just mopping up and I wasn't needed any more. I bought a house with the money I made from that contract (well, paid the deposit at least).

The cost was staggering. The lowest-paid COBOL devs were on GBP100+ per hour. The highest-paid person I met was on GBP500 per hour, enticed out of retirement. They were paid that much for 6-month contracts, at least. Hyder paid multiple millions of pounds in contract fees to fix Y2K, knowing that the entire business would fail if they didn't.

Still less than the cost to rewrite all that COBOL. The original project was justified by sacking hundreds of accounts clerks, replaced by the COBOL system and hardware. By 1998 the hardware was out of date, and the software was buggy, but the cost-benefit of a rewrite made no sense at all. As far as I'm aware Hyder is still running on that COBOL code.


Diolch!


I've also found myself musing on a similar question, but one where you may have a different temporal perspective at this particular moment: In six months, are we going to collectively believe that the Coronavirus was nothing and we massively overreacted to it? Because if we do react strongly, and it does largely contain the virus, that will also be "proof" (quote-unquote) that it wasn't anything we needed to be so proactive about in the first place.

Unsurprisingly, humans are not good at accounting for black swan events, and even less so for averted ones.


> Because if we do react strongly, and it does largely contain the virus, that will also be "proof" (quote-unquote) that it wasn't anything we needed to be so proactive about in the first place.

Even if every other country has Y2K levels of success containing the Coronavirus, we can still point skeptics at the example of places like Italy to prove it was a real threat.


Neither the current pandemic nor Y2K really fits the definition of a black swan event, since they were completely predictable (and predicted).


In my opinion (emphasis opinion), "black swan" includes the concept of timing... that a pandemic would occur is inevitable, but you have no idea on the timing. Market crashes are inevitable, but you have no idea on the timing. Volcanic eruptions are inevitable, but you have no idea on the timing. etc.

Things that are inevitable only when you encompass time spans longer than a human life (it has been approximately one and a half average human lifespans since the previous pandemic) may be predictable at that large aggregate scale, but on useful scales they are not. Or, to put it another way, if you've been shorting the market since 1918 for the next pandemic crash, you went bankrupt a long time ago.

Y2K is only a black swan for those not in the industry, since that one is obviously intrinsically timing based. The UNIX timestamp equivalent is also equally predictable to you and me, but to the rest of the world it will seem even more arbitrary if it's still a problem by then. (At least Y2K was visibly obviously special on the normal human calendar.) But I wouldn't claim the term for that; call it a bit of sloppiness in my writing.


I assume that we'll have a fairly decent idea as to what worked, and what didn't, as the differing responses mean we're effectively carrying out multiple experiments at once.


I was involved in several of the efforts at the time, including building the communications systems for the "studio NOC" at AT&T in NYC. I started hearing about vulnerable systems about 5 years before 2000, and we were doing serious work on those systems about 2 years before. I predicted (to friends and family who didn't always care to believe me) that it would be a non-event because disruptions would be localized in smaller systems (we were expecting local banks and credit unions). Even I was blown away by how few of those systems had problems. So now, when people say Y2K was no big deal, they fail to recognize the work that went into ensuring it was a non-event.

There's a very current equivalent - if we're good about social distancing, people may talk about COVID-19 the same way.


You're darn tootin' it was real. It was only through the dedicated, focused efforts of thousands of unsung IT heroes that we averted catastrophe.

Just because it didn't happen doesn't mean it couldn't have.

Even in the late 80's I had to argue with some colleagues that we really shouldn't be using two-digit dates anymore.

I worked with 80-column punched cards in the 70's, every column was precious, you had to use two-digit years. When we converted to disc, storage was still small and expensive, and we had to stay with two-digit years.

See: http://www.kyber.ca/rants/year2000.htm


Well, there were fallouts, but few disastrous ones.

First, enormous amounts of money were spent on repairs to the extent that they could be done. I know of some 50-year-old processes that no longer had the original source. Significant consultant time was used in what at times resembled archeology.

Second, there was a little downturn in new projects after the turn, as budgets had been totally busted.

There was one consultant who preached doom and gloom about the collapse of civilization when that midnight came. He went so far as to move his family from NYC to New Mexico. He published on his web page all sorts of survivalist techniques and necessities. When the time came, his kids, who apparently didn't share the end-of-the-world view, woke him up and said "Dad!! New Zealand is dark!!!" but of course it wasn't.

The lesson there was that there was tunnel vision about exactly how automated stuff actually was. While there were enormous systems with mainframes, Sun servers, and workstations doing all this work, the tunnel vision produced a perception that excluded the regular human interactions with the inputs, outputs, and operation of these systems. Not so fully automated after all.

There were a few disasters--I remember one small or medium grocery chain that had POS systems that couldn't handle credit cards with expiration dates beyond 12-31-1999 and would crash the whole outfit. The store was unable to process any transaction then until the whole thing was rebooted. They shortly went out of business.


Yes. I worked for a COBOL vendor at the time and we had customers and colleagues tell us how many things would not have functioned without the time spent remediating it — not planes falling from the sky but, for example, someone at a household-name credit card company saying they wouldn't have been able to process transactions.

This was victim of its own success: since the work was largely completed in time nobody had the huge counter-example of a disaster to justify the cost. I'm reminded of the ozone hole / CFC scare in the 1980s where a problem was identified, large-scale action happened, and there's been a persistent contingent of grumblers saying it wasn't necessary ever since because the problem didn't get worse.


Yes and no.

There were a lot of two-digit dates out there which would have led to a lot of bugs. Companies put a lot of effort into addressing them so the worst you heard about was a 101 year old man getting baby formula in the mail.

The media over-hyped it, though. There was a market for books and guest interviews on TV news, and plenty of people were willing to step up and preach doom & gloom for a couple bucks: planes were going to fall out of the sky, ATMs would stop working, all traffic lights were going to fail, that sort of thing. It's like there was a pressure to ratchet things up a notch every day so you looked like you were more aware of the tragic impact of this bug than everyone else.

That's the part of the crisis that wasn't real, and it never was.


I worked New Year's 1999 when I was at Red Hat.

Leading up to the changeover there was a lot of work to make sure all the systems would be OK, and that underlying software would also be OK, but keep in mind, auto-update over the Internet wasn't super common.

I ended up getting one call from a customer that night where they had a valid Y2K bug in their software, and since it wasn't in Red Hat's system, they moved along to their next support person to call :)

It was a thing, but much less of a thing because of the work put into getting ahead of it.


In 1998 we tested the new computers we sold and many failed or gave odd results when the date was changed to 2000. By mid 1999 almost none of the computers had any problems if you advanced the date.

Also, one of the major results of the Y2K bug was that IT departments finally got the budgets to upgrade their hardware. If they had not gotten newer hardware, I am sure there would have been more problems.

Finally, in my area the main reason companies failed from IT problems was problems with their database: it turned out their backups were not good or had not been done recently. Many companies tried to be cheap and never updated their backup software, so even if they did back up their data, the backup software could really mess things up if it used 2-digit dates to track which files to update.
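To make that backup failure mode concrete, here's a tiny Python sketch (the comparison logic is an assumed mechanism for illustration, not taken from any specific product): an incremental backup that compares 2-digit year strings decides a file written in "00" (2000) is older than its "99" (1999) backup, so the new version is silently skipped.

```python
# Naive 2-digit year comparison, as a buggy incremental-backup tool
# might have done it: back up a file only if its year string is
# "greater" than the year of the last backup.
def needs_backup(file_year: str, last_backup_year: str) -> bool:
    """Buggy: compares 2-digit year strings lexically."""
    return file_year > last_backup_year


print(needs_backup("99", "98"))  # True  -- a 1999 file after a 1998 backup
print(needs_backup("00", "99"))  # False -- a 2000 file is never backed up!
```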

Things go very bad if you lose Payroll, Accounts Payable, or Accounts Receivable.


I worked at a bank at the time and can say that we started working on it back in 1996, and all the systems were in place and tested by early 1999 so we had no issues. It was absolutely a crisis; back in 1995 one of the mainframe programmers did an analysis on her own and determined that, not only was it a problem, but that the system would be hopelessly corrupted if it wasn't fixed. They spent, if not seven figures, at least high-six figures to get everything ready. One thing that was drilled in from management was that no one was to talk about it because it might be perceived that "real" work wasn't getting done. :\


I was not working on fixing Y2K issues, but I did notice the impact it had on systems that hadn't been patched. It's the typical IT conundrum: when you do a good job, no one notices and you don't get rewarded for it; the only recognition comes when things fail.

Some historians seem to think that it was a real crisis in which the US pioneered solutions that were used across the world: https://www.washingtonpost.com/outlook/2019/12/30/lessons-yk...


- Why did you catch that?

+ because it was going to fall.

- Are you certain?

+ yes.

- but it didn't fall. You caught it. The fact that you prevented it from happening doesn't change the fact that it was going to happen.

Minority Report , 2002

https://www.youtube.com/watch?v=IVGQHw9jrsk

People worked for years in the late 1990s replacing systems that were not Y2K compliant with new ones that were.

It is becoming ever more common to question the veracity of disaster averted through effort. And it is very dangerous.


> It is only in the last few years that it has become common to question the veracity of disaster averted through effort.

No, it isn't. Questioning whether Y2K was overhyped started before Jan. 1, 2000 and accelerated on Jan 1, 2000 when there weren't major breakdowns. If you are too good at mitigating a problem before it manifests, there's a good chance lots of people will doubt there was a problem to mitigate. On the other hand, if it's a sui generis problem like Y2K, it's by definition too late for their doubts to impact mitigation efforts for the one potential occurrence, so it doesn't matter all that much. For a recurring problem, where those doubts can impact preparedness for the next potential occurrence, that's a bigger challenge.


My apologies. I have edited the comment after that, because I didn't feel that "It is only in the last few years" was entirely correct.

However, I feel that this tendency is getting worse. A denial of the role of expertise. The question "was Y2K real?" is a political issue now, and it's not because of Y2K specifically, it's as a comparison to more recent events.


Some folks think they're clever by repeating "Economists have predicted nine out of the last five recessions," but they don't seem to understand the Fed's purpose.


Not only was it real, some of the fixes were kludges put in place that would only last for 20 years. But that was fine, because surely we'd have done longer term fixes by then?

Except we didn't:

https://en.wikipedia.org/wiki/Year_2000_problem#On_1_January...


Well when most of the bad Y2K code was written they knew it was bad, they just didn't think civilization would last until the year 2000. It was the Reagan years, after all.


If you're driving down the road, see an overturned cart in your path, and safely avoid it, was the cart a danger to you? Nothing bad happened, so was the cart a hoax? The Y2K problem was, for a number of organizations, precisely such a cart, and it was successfully avoided to the extent that nothing seriously bad happened as a result of the bug (really, an engineering trade-off which lived too long in the wild). So we can either count it as a victory of foresight and disaster aversion, or we can say it was all a hoax and there was never anything to it. Guess which conclusion will best let us avoid the next potential disaster.


Like others have said, it was real and handled well. They knew about it for years before so there was time to fix the issue.

The panic was also very real despite not being proportional to the actual problem, but just like any other media-induced widespread panic, it served as a means to make lots of profit for those in a position to do so. Media companies squeezed every last drop of that panic for ratings... well into the year 2000 when they started spreading the story that Y2K was the tip of the iceberg, and the "real" Y2K won't actually start until January 1, 2001.

As an immigrant to the US, I got to see the weird side of American culture in how people tend to romanticize (for lack of a better word) post-apocalyptic America. Kind of like the doomsday hoarders of today are doing. It's like they think a starring role on the Walking Dead is waiting for them, except in real life.


Was employed by a large bank in their web division. We fixed many Y2K bugs that would have been triggered. Was on duty New Year's Eve, and there were a few Y2K bugs that surfaced but nothing show-stopping. My anecdotal opinion is that the panic to fix these bugs likely prevented some larger cascading effect/catastrophe. Remember, this was 1999, when testing/QA practices were not always de rigueur. For some shops, Y2K mitigation might have been the first time their code base was subjected to any sort of automated tests :-))


Fun story related to this:

During the Y2K panic Sun Microsystems (IIRC) announced that they would pay a bounty of ~$1,000 per Y2K bug that anyone found in their software. As you noted, there was very little automated testing at the time so these problems were really hard to discover.

James Whittaker (a college professor at the time) worked with his students to create a program that would parse Sun's binaries and discover many types of Y2K bugs. They wrote the code, hit run, and waited.

And waited. And waited.

One week later the code printed out its findings: it had found tens of thousands of bugs.

James Whittaker went to Sun Microsystems with his lawyer. They saw the results and then brought in their own lawyers. Eventually there was some settlement.

One of James' students bought a car with his share.


It wasn't a crisis, but it was a real problem that needed to be, and was, fixed in plenty of time. It didn't surprise anyone in the industry as it was well known throughout the 90's it was coming. The biggest problem was identifying what would break and either fix or replace it. Many companies I dealt with at the time humorously did both: they had big remediation projects and as soon as they finished, decided to dump most of the old stuff for shiny new stuff anyway.


Very anecdotal, but here is my take:

For the place I worked at (a large international company) it was a G*d-send opportunity. All the slack that had built up in the past under "cost-reducing" management suddenly had a billable cost position that nobody questioned.

Of course there were some actual Y2K issues solved in code and calculations, but by and large the significant part of the budget was spent on new shiny stuff, on getting changes approved, and on compensating workers for bonuses missed in the previous years.

We had a blast doing it, and the biggest letdown was following the year rollover from the dateline and seeing nothing like the expected and predicted rolling blackouts.


I know this is redundant, but I have to add to the signal:

YES. It was real.

I was finishing an engineering degree (CSE) in 1992 and several of my peers took consulting jobs to work on Y2K issues. For nearly a decade a huge amount of work was done to review and repair code.

Y2K is the butt of many jokes, but the truth is: it didn't happen because the work was done to fix it. Sort of ironic.


There were parts of our telecom infrastructure that weren't ready but got fixed before y2k. A certain mobile phone switching vendor (think cell towers, etc.) ran tests a year before to see what happened when it rolled over and the whole mobile network shut down (got in a wedged state where calls would fail, no new calls, signalling died). They fixed it and got customers upgraded in time.


I don't think I even had a mobile phone in 1999. Probably not a critical system back then.


I got my first cell phone in 1996. It worked really well (GSM network) and SMS's were available. So I can easily imagine that if you relied on it, it seemed like a critical system to you personally or as a business.


I was 10 and got my first phone around that time. As far as I know Americans were behind in mobile phones up until the iPhone.

There were a lot of 2G pager systems used for emergency alerts


In 1999 to 2000 I was working as a freelancer at a state agency. After the change from 99 to 00, I got paid 3 times a few days apart, always the same amount. It later turned out that the indicator that a person had been paid was not working, thanks to Y2K. So somebody clicked on payout a few times. They had fixed it for the employees beforehand, but not for the freelancers. I gave back the money, which was difficult in its own right, as there was no process for it.



I was the lead console operator during the midnight rollover on NYE for an aging fleet of state-owned mainframes and minicomputers which were affected by the bug. Some of the machines could not be updated with a fix, so we were almost completely uncertain as to how these machines would behave. Testing on many of those machines was a huge investment. A lot of effort went into checking and re-checking job code. It was a lot of work for everyone.

As mentioned here by others, it could have been (would have been) worse were it not for the heroic efforts of programmers to bug-proof their job code in the run-up to the rollover. As the lead console op, my responsibility that night and morning was to try and ride any trains that decided to jump the tracks. The skills I developed then still serve me well today, and I will forever be grateful to those grayest of graybeards for the trust I was extended when chosen for that role.

Everyone on the payroll was there for the rollover except for the next shift's operators. For my part, it was in the end a lot of preparation which thankfully was not needed. I must admit to having a drink before that shift started. But when the rollover came, all was quiet. And after a few nervous hours, we poured some champagne with the hero programmers who were there in the room watching their jobs run without any issues.


Can anyone provide an example of a country or a company that was not at all prepared for Y2K (there must be one somewhere?), and suffered disastrous consequences? That would seem to be the best way to answer the original question, but I haven't seen any such example provided.


I didn't run into any Y2K problems - for UNIX/Linux itself this was mostly a non-issue due to times being stored in at least 32-bit time_t at that point. Individual applications may have had their own Y2K related issues of course, but I didn't run into any.

However, one issue I did run into nearly two years later was when UNIX time_t rolled over to 1 billion seconds. The company I worked with at the time was running WU-IMAP for their email server, plus additional patches for qmail-style maildir support. We came into work on September 10th 2001 and all the email on our IMAP server was sorted in the wrong order.

Turns out there was a bug in the date sorting function in this particular maildir patch (see http://www.davideous.com/imap-maildir/ - "10-digit unix date rollover problem"). I think we were the first to report it to the maintainer due to the timezone we were in. First time for me in identifying and submitting a patch to fix a critical issue in a piece of open source software! My co-worker and I were chuffed.
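A minimal reproduction of that class of bug (assuming, as the patch notes suggest, that the maildir code compared timestamps as strings): lexicographic order and numeric order disagree once time_t grows from 9 to 10 digits.

```python
# Two timestamps one second apart, either side of the 1-billion mark
# (which time_t crossed on 2001-09-09 UTC).
stamps = [999999999, 1000000000]

lexicographic = sorted(str(s) for s in stamps)  # buggy string comparison
numeric = sorted(stamps)                        # correct integer comparison

print(lexicographic)  # ['1000000000', '999999999'] -- newer mail sorts first
print(numeric)        # [999999999, 1000000000]     -- correct order
```

Since '1' < '9' as characters, every 10-digit timestamp sorts before every 9-digit one, which matches the "all the email sorted in the wrong order" symptom.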

Of course, we swiftly forgot about it the next day when planes crashed into the NY World Trade Center.


There's a very good ~20min documentary on the Y2K bug which discusses exactly this question: https://www.youtube.com/watch?v=Xm5OiB3CPxg (by LGR)


Came here to say this. Great channel and explanation of the Y2K.


It was like most other big IT problems that are properly anticipated -- a ton of work went into making sure it wasn't a problem, so everyone assumes there was nothing to worry about and all the IT people were lazy and dramatic.

But that couldn't be more wrong.


Yes, it was real.

Keep in mind that it was also used as a significant contributing factor to replace a lot of major legacy IT systems (especially accounting systems) at big organisations (a lot of SAP rollouts in the late 90s had Y2K as part of the cost justifications).

The company I worked for ran a Y2K Remediation "Factory" for mainframe software - going through and changing years to 4 digits, checking for leap year issues, confirming various calculations still worked.
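The leap-year checks were a genuine trap: 2000 is divisible by 100 but also by 400, so it is a leap year, and code implementing only part of the Gregorian rule got it wrong. A minimal sketch (illustrative, not the actual remediation code):

```python
def is_leap_naive(year):
    # Common shortcut that stops at the century exception:
    # wrongly classifies 2000 as a non-leap year.
    return year % 4 == 0 and year % 100 != 0

def is_leap(year):
    # Full Gregorian rule: century years are leap years
    # only when divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

assert is_leap(2000)             # divisible by 400: leap year
assert not is_leap_naive(2000)   # the bug the factories checked for
assert not is_leap(1900)         # both functions agree here
```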

I worked on a full system replacement that was partially justified on the basis of (roughly) we can spend 0.3x and do y2k patches, or spend X and get a new system using more recent technologies and UIs.

There were still problems, but they were generally in less critical systems as likely major systems had been tested, and were remediated or replaced.

Keep in mind that there was often much more processing that occurred on desktop computers (traditional fat client) - so lots of effort was also expended on checking desktop date rollover behaviour. One place I worked at had to manually run test software on every computer they had (tens of thousands) because it needed reboots and remote management was more primitive (and less adopted) at the time.


I worked at IBM as an intern during the crossover, and there was a TON of internal activity. It might seem like there was nothing going on, but in fact, there were a lot of interns (like our entire intern class of about 200 that I know of, across entire IBM in all disciplines, research, database, microprocessors, AIX, QA, mainframes etc) who were basically doing the same thing - Y2K readiness for our respective departments.

I worked in QA in one of their bank teller application development branch offices, so all I did for weeks was enter dates spanning 99 and 00 into the software and test that the fixes were successful.

The unique thing about Y2K was that the problem was well understood and came with an actual deadline, so you could project manage around it.

Any normal bug couldn't be project managed this way, and you can't just throw interns at regular problems, whereas with Y2K, if you had the money, you can just assign people to look at every line of code to look for date handling code.


I worked hard on y2k issues in 1998-1999. It was a real thing for my company at the time. It was a crisis averted. In the 80’s and 90’s I worked on many systems where the equivalent of “max date” or “end of time” was expressed as 12/31/99. The way that these systems expressed, stored, entered, and validated dates all had to be reworked in a series of major overhauls.


I personally worked in Y2K support at the time on PC hardware. Most motherboards we tested worked, but some needed BIOS updates and one model needed a new BIOS fix which didn't exist. We swapped out the bad motherboards, updated software, and had no problems.

In the UK, there were some medical devices (my memory says dialysis machines) that malfunctioned over the issue.

There is an important lesson about the behavior of the media in this. They whipped people up into a survivalist, doomsday-prepper frenzy over an issue that could be solved simply by updating BIOS, software, and/or hardware.

With that said, the effort was very expensive because so much software and hardware needed to be audited at every company.


There are really a couple of issues in this topic. First, was there a potential problem due to dates? Second, was the massive scare campaign justified?

The answer to the first question is yes. There was a potential problem. However the companies and government departments that were affected had started planning in the early 90s, and they prepared during the decade. Many took the opportunity to embark on huge system upgrades. It was just one of many issues CIOs dealt with.

The answer to the second question is no. The huge disaster scares were not justified. Banks, airlines, insurance companies and government departments had already fixed their systems, just like they fix other problems.

What happened was that consulting companies, outsourcers and law firms suddenly realized there was a huge new market that they could scare into being. They started running campaigns aimed at getting work from mid size businesses.

The campaign took off because it was an easy issue for the media and politicians to understand. It also played into the popular meme that programmers were stupid. The kicker was the threat that directors who failed to prepare could be sued if anything went wrong. Directors fell into line and commissioned a lot of needless work.

In summary, there was the professional work carried out by big banks, airlines etc, generally between 1990 and 1997, and the panic-driven, sometimes pointless work by smaller firms in 1998 and 1999.


> There was a potential problem. However the companies and government departments that were affected had started planning in the early 90s, and they prepared during the decade.

I can point to several huge companies who did nothing until 1998 or even 1999. The media scare helped with that (priority, money) a lot.


It was very much real, lots of effort was spent to make sure everything worked correctly, especially around older billing systems which used only 2-digit years in fixed width records.
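For fixed-width records that couldn't easily be widened to 4-digit years, a common remediation was "windowing": interpreting the stored 2-digit year relative to a pivot. A sketch, with an illustrative pivot value (the pivot is an assumption for the example, not taken from this comment):

```python
PIVOT = 70  # illustrative choice: 70-99 -> 1900s, 00-69 -> 2000s

def expand_year(yy):
    """Expand a 2-digit year from a fixed-width record without
    changing the record layout itself."""
    return 1900 + yy if yy >= PIVOT else 2000 + yy

assert expand_year(99) == 1999  # a 1999 billing record
assert expand_year(1) == 2001   # a post-rollover record
```

Windowing deferred the problem rather than removing it: under this pivot, records dated before 1970 or after 2069 still misparse.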

Because of all the preparation and upgrades being done, I think the only incident we had was when the Y2K migration manager sent out an "all clear" email after the rollover: the Unix mail client he used formatted the date on the email as "01/01/19100". Though I suspect he knew of the issue and didn't upgrade on purpose, just to make a point.


Ah, bad Perl code, outputting a four digit year by doing "19" . $year instead of 1900 + $year. I fixed a lot of those in 1999.
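The mechanism, sketched in Python for illustration (Perl's localtime, like C's struct tm, returns the year as years since 1900):

```python
# Perl's localtime (like C's struct tm) gives the year as years
# since 1900, so January 2000 comes back as year 100.
tm_year = 2000 - 1900  # == 100

buggy = "19" + str(tm_year)  # string concatenation: "19100"
fixed = str(1900 + tm_year)  # arithmetic first: "2000"

assert buggy == "19100"
assert fixed == "2000"
```

The concatenation version happened to look correct for every year from 1900 through 1999, which is why it survived untested until the rollover.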


Across Fortune 500 companies, smaller companies, and all government departments in every country of the world, nobody stuffed it up and had a total horror story of death, destruction and bankruptcy.

Nobody. None.

Everybody got a good landing in the pilot's sense of a good landing being one you walk away from. Think of your boss, all the people you work with and ever have. And they all succeeded.

So crisis, no. No way everybody pulls it off if it were a real crisis. But damn it made sales easy by consultants to all of the above who spent big. "Planes will drop out of the sky!"

A very powerful way to sell is through fear. We got sold the Iraq war on weapons of mass destruction that could kill us in our beds here! And this was used and abused by consulting firms to make sales to managers and boards of directors who have no clue what a computer is and what it does, and think hackers can whistle the launch codes etc. That fear-based sales job happened en masse and was a vastly bigger phenomenon than Y2K. But having said that, there were so many people who bought the fear sales job that employed them that they still believe it. Many will post here about it and you can weigh it all up for yourself.

So yeah there were y2k issues, some got dealt with in advance, some didn't but nothing like the hype of 1999. Nothing like it.


1) Certainly, there were real Y2K issues that had to get fixed.

2) However, what IT workers in general don't realize is how many of their systems are broken all the time, in ways that people just learn to live with or work around; this is not all of them, but it does include more than IT folks realize.

3) I worked as an engineer in the semiconductor industry at the time, and "it's not Y2K compliant, and cannot be upgraded" was a way to get obsolete equipment replaced, in a way that bypassed normal budget controls. Engineers, salesmen, and managers all engaged in a sort of unspoken conspiracy to get the new equipment purchased. However, this doesn't mean it wasn't a good thing that it happened. Which makes one wonder whether a certain amount of brokenness in accounting and software controls is not necessary for the economy to function.

4) Countries like Japan, Russia, etc. spent a tiny fraction of the effort on Y2K preparation, and they sailed through. This was in part because of the U.S. overhyping it, but also because we were using interconnected computer systems more than other countries were at that time.

So, it's a mix. It was real, it was well-handled, but there was also some hype, and even some hype that served a real (covert) good purpose.


Yes, it was real. While in college, I was working as a programmer (Perl mostly) for a book publishing company and several database systems wouldn't boot up after the new year. It turned out they didn't bother upgrading their software. This was common, but by then, most software companies had patches available - probably because of the hype. In this case, I believe the fear provided by the media actually helped avoid a much bigger crisis.


I'll add my own little data point to the pile. I changed jobs in 1998, so I fixed problems at two companies, small to mid-size. At each place there would have been problems if the bugs hadn't been found and fixed. At the second company they asked for volunteers to get paid extra to stay the night, eat pizza, watch movies and be prepared to fix things if they went south. They had enough volunteers that no one had to be voluntold. They weren't needed.


I have always wondered the same thing. I came to the conclusion that it's pretty difficult to determine that.

I lived and worked as a software developer through the Y2K "crisis" (although I wasn't working on solving the crisis myself). Everyone was very worried about it. Nothing really went wrong in the end.

Was that because there was no problem? Or because everyone was worried about it and actually solved the problem? I don't think it's easy to tell the difference really.


I think so too, but there is some evidence of a lot of legacy systems needing critical updates: manufacturing, oil processing, military hardware and other stuff. The problem was that it was nearly impossible to understand how failures would manifest, just that there would be failures and some of them might be very critical. Luckily, all the important stuff got fixed in time, so everything went pretty smoothly, but there were still some problems in non-critical systems.

That said, if we need to jump-start the economy again, maybe we could come up with a fake problem this time. How about we say that 2025 will be a huge problem for many critical IT systems? Or does some timestamp produce a nice round number in the near future that would be a good sell?

edit: There is this: https://en.wikipedia.org/wiki/Year_2038_problem
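The 2038 problem linked above is the genuine successor, and unlike 2025 it falls out of simple arithmetic: a signed 32-bit time_t counts seconds from the 1970 epoch and tops out at 2**31 - 1. A quick sketch of when that happens:

```python
from datetime import datetime, timedelta, timezone

# A signed 32-bit time_t holds at most 2**31 - 1 seconds past the epoch.
T_MAX = 2**31 - 1
rollover = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=T_MAX)
print(rollover)  # 2038-01-19 03:14:07+00:00

# One second later the counter wraps negative, back to December 1901.
wrapped = T_MAX + 1 - 2**32
assert wrapped == -2**31
```

Most 64-bit systems have already widened time_t, but embedded devices and on-disk formats that froze the 32-bit representation face the same fix-it-by-a-deadline dynamic Y2K had.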


It's only hard to tell if you don't talk to the developers who were working in the late 90s.


>> It's only hard to tell if you don't talk to the developers who were working in the late 90s.

I think you mean "working ON it". Talking to developers as a broad group from that time wouldn't necessarily produce any useful information.

The person you replied to was himself a developer working in the late 90s. During the late 90s, I talked to a lot of developers, but only a small percentage of them were on Y2K jobs.


There were a lot of Y2K-related tasks in the course of many developers' general activity. Where I worked at the time, there were no developers solely dedicated to Y2K work. The Y2K issue pretty much caused an employment boom for software developers though.


I was fixing Y2K bugs at a company founded in 1997. Unless you never touched date stuff, I can't imagine being oblivious to Y2K issues working in IT in 1999.


The startups I worked in around that time - we were using recent hardware with 4 digit years and unconcerned about Y2K issues.

So while we were -aware- of the Y2K issue, it didn't impact any of us in a concrete fashion. We would talk about people we knew on Y2K projects, which were mostly mission critical legacy systems.

So it's not inconceivable for devs in the 90s to have only cursory awareness of the -real- issues that the people who worked on Y2K projects were facing and solved.


It especially caused a boom for COBOL programmers, but not so much Java developers.


It didn't turn out so well for some COBOL programmers...

https://medium.com/@donhopkins/cobol-forever-1a49f7d28a39


We were all on Y2K jobs, in the sense that for the decade of the 90s you'd get looked at very weirdly if you tried to produce code and systems using two digit years.


I guess I have a different perception. If you were trying to produce code in the 90s using two digit years, the weird looks had less to do with Y2K than "why would you save incomplete data to save a couple of bytes"?

For context, most of the devs I worked with at that time finished their CS degrees within the previous 10 years, and were not legacy systems programmers. Doing 2 digit years was inconceivable to them.

