The Y2K bug is back, causing headaches for developers again (zdnet.com)
104 points by nikbackm 6 days ago | 91 comments





I worked on a 2k20 bug just last week. Some of the Perl at work started returning strange values, and tests were failing. Turns out we were using `Time::Local`'s `timelocal/timegm` subs. They use a convoluted interpretation of 2-digit years – which we were, of course, passing (and for no good reason):

https://metacpan.org/pod/Time::Local#Year-Value-Interpretati...

> Years in the range 0..99 are interpreted as shorthand for years in the rolling "current century," defined as 50 years on either side of the current year. Thus, today, in 1999, 0 would refer to 2000, and 45 to 2045, but 55 would refer to 1955. Twenty years from now, 55 would instead refer to 2055. This is messy, but matches the way people currently think about two digit dates

http://blogs.perl.org/users/tom_wyant/2020/01/my-y2020-bug.h...
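To make the quoted rule concrete, here is a rough Python sketch of a "rolling century" expansion. The function name and the exact handling of the 50-year boundary are my assumptions for illustration, not a port of `Time::Local`. The point is that the same two-digit input changes meaning as the current year advances.

```python
def rolling_century_year(two_digit_year: int, current_year: int) -> int:
    """Expand a 0..99 year to the 4-digit year nearest to current_year.

    Hypothetical helper: the window covers the years from
    current_year - 49 through current_year + 50 (boundary choice assumed).
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a year in 0..99")
    century = current_year - current_year % 100
    candidate = century + two_digit_year
    if candidate > current_year + 50:
        candidate -= 100   # too far in the future: use the previous century
    elif candidate <= current_year - 50:
        candidate += 100   # too far in the past: use the next century
    return candidate


# The same input, interpreted in two different "current" years.
print(rolling_century_year(70, 2019))
print(rolling_century_year(70, 2020))
```

With these assumptions, 70 means 1970 when run in 2019 and 2070 when run in 2020, which is the kind of flip that makes tests start failing right after New Year.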


I've seen a lot of weirdness in datetimes, but this really takes the cake. It's like they couldn't conceive of software running for over 25 years. (Actually, if you're at the "wrong" year when writing the code, it just takes a single year to wreck this)

What exactly motivates one to use two-digit years instead of four-digit ones inside your application?

I mean I understand if that two-digit year is a part of a form input (where the logic makes sense btw).

But why would a software developer decide to use two-digit years inside the application? Isn’t that like the first thing you would think about when you’re implementing the format?


Because our profession is a joke. Whatever is fastest, cheapest, and easiest is what we do, with no regard given to things like long-term impact or security. We have no standards for code quality or correctness built into any of our tooling and processes. This is no less true today than it was 40 years ago. Perhaps the fallout in terms of crashes and ransomware will convince the industry to start adopting quality standards, but I don't have my hopes up.

> What exactly motivates one to use two-digit years instead of four-digit ones inside your application

For a lot of very old legacy applications, RAM and data interchange size. For a lot of the Y2K remediation of those apps, the goal was to avoid changing every component that sees a date in a data structure. Instead, just the date-processing pieces consume new library functions that use a sliding rather than fixed interpretation of two-digit years, or some similar stopgap that allows the two-digit format to remain.

For newer code, well, that's how the developer is used to writing dates by hand and they have a library available that seems to do a reasonable thing with two-digit dates, and, look, the simple unit tests they wrote passed, so it must be okay.


At the time, storage was still a couple hundred dollars per gigabyte - and that's if you stored only a single instance (no redundancy) and owned the drive. Cluster that up a bit on machines you're renting and allow for backup writing time, and things start to add up. Those costs multiply out pretty quickly as you step back in time - head back to the '80s and you're talking hundreds of thousands, not just hundreds. Saving a couple of bytes here and there made a huge difference - and windowing as a Y2K mitigation could be enormously cheaper than rewriting large chunks of large applications that were built around costs we can barely comprehend, let alone remember dealing with these days. (I speak for myself and my own failing memory here.)

There was a time when every byte counted. Unfortunately once that wasn’t true it took lots of developers a long time to adjust their habits.

No, this weirdness makes sense for handling user input, where people type in 2 digit years and the software needs to understand it.

It is definitely not intended for internal use of years though!


> I've seen a lot of weirdness in datetimes, but this really takes the cake.

Oh dear, there are far far far worse... Especially when bringing timezones into play. I wouldn't wish them on my worst enemy.


I know a mainframe dev who worked on y2k across a bunch of systems at the time.

Asked him about it, he said that they warned management not to window the code to 2020, and showed alternatives that would take longer to code but would be future proofed.

Most agreed with enough convincing. Some did not, usually with the argument of "the systems won't even be around in 2020!".


> "the systems won't even be around in 2020!".

I think in many cases the thinking is: s/the systems/I/


It's not right but it makes perfect sense that no one would care about building something that outlasts them unless they're super passionate about it.

"the systems won't even be around in 2020!"

And as it turned out in 2020 the system was still being used, but unlike in 2000, the source code was lost.


Just yesterday I came across one of the places in the code I’m working on that pivots the Y2K problem away. It pushes the problem out to the year 2070. That seems really far off, but the code base stems originally from the 80s.

Speaking of pivot years, there's a similar rollover in 2050 for RFC 5280 style datetimes, e.g. not-before and not-after times in X.509 certificates. The RFC defines two datetime string formats - "UTCTime" which has a two-digit year, and "GeneralizedTime" which has a four-digit year. The two-digit one is parsed as 0 <= YY < 50 implying 20YY and 50 <= YY < 100 implying 19YY. For not-before and not-after times, certificates are required to use UTCTime for datetimes before 2050 and only use GeneralizedTime for times after that.

So there can be applications today that parse ASN.1 datetimes manually (1) but only expect the UTCTime format. They'll break when they encounter a cert with a not-after beyond 2050 because it'll be in the GeneralizedTime format instead.

Luckily this one will be detected over a period of time rather than happening at precisely 2050-01-01 00:00:00, so there's more time to fix it in each application that has the bug.

(1) if the application wants to parse it into a `struct tm` for manipulation, for example. For that specific case, openssl 1.1.1 added `ASN1_TIME_to_tm`, so it's only a problem for applications that don't use openssl or need to support older versions. One can hope that at least the latter will stop being a requirement as 2050 gets closer.
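For anyone who does parse these by hand, a minimal Python sketch of accepting both forms might look like this. It assumes the strings have already been extracted from the ASN.1 structure and are in the `YYMMDDHHMMSSZ` / `YYYYMMDDHHMMSSZ` shapes with seconds present and a trailing `Z`. The function name is made up for the example.

```python
from datetime import datetime, timezone


def parse_rfc5280_time(text: str) -> datetime:
    """Parse a UTCTime (2-digit year) or GeneralizedTime (4-digit year) string.

    Hypothetical helper: only the 'Z'-terminated forms with seconds are handled.
    """
    if not text.endswith("Z"):
        raise ValueError("expected a UTC time ending in 'Z'")
    digits = text[:-1]
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected only digits before 'Z'")

    if len(digits) == 12:
        # UTCTime: 00..49 means 20YY, 50..99 means 19YY.
        yy = int(digits[:2])
        year = 2000 + yy if yy < 50 else 1900 + yy
        rest = digits[2:]
    elif len(digits) == 14:
        # GeneralizedTime: the year is spelled out in full.
        year = int(digits[:4])
        rest = digits[4:]
    else:
        raise ValueError("unrecognized time length")

    month, day, hour, minute, second = (int(rest[i:i + 2]) for i in range(0, 10, 2))
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


print(parse_rfc5280_time("461006083956Z"))    # UTCTime form
print(parse_rfc5280_time("20500101000000Z"))  # GeneralizedTime form
```

A parser that only has the 12-digit branch is the one that breaks on the first certificate with a not-after in 2050 or later.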


Wonder when the last current root certificate in a common OS (Windows, OS X, mozilla ca-certificates, etc.) expires.

    $ pem=''; cat /etc/ssl/ca-bundle.pem | while read -r line; do pem="$(printf '%s\n%s' "$pem" "$line")"; if grep -q 'END CERTIFICATE' <<< "$line"; then openssl x509 -inform pem -enddate -noout <<< "$pem" | cut -d= -f2 | date -f - -I; pem=''; fi; done | sort -r | head -1

    2046-10-06

  $ for x in /etc/ssl/certs/*.pem; do openssl x509 -in "$x" -dates -noout; done | grep After | cut -d= -f2 | sort -n -k4,4
  [...]
  Oct  6 08:39:56 2046 GMT
That certificate is for

        Subject: C = PL, O = Unizeto Technologies S.A., OU = Certum Certification Authority, CN = Certum Trusted Network CA 2

I'm more worried about the Y2K38 problem.

I'll be just over 60 years old when that happens. With any luck, I'll be retired and not have to worry about it... I think having to deal with both Y2K and Y2K38 in one career is too much.


Was the first one a typo and you meant Y2K38? I assume so.

Retiring by 2038 is also my strategy :)


I am hoping it will be a nice earner before my retirement.

Also a good strategy. Be a Y2K38 consultant just before retirement.

Yeah it was, fixed it. Thanks!

I remember disconnecting my dialup session and checking that nobody is on the phone as the clock ticked midnight just to make sure we don't get billed for a 100 year long call.

Sucker, missed out on being paid for a 100 year long phone call.

How would that have happened? The expected bug outcome is of a negative duration, no?

Negative durations are clearly impossible. Just take ABS of the duration to be safe....

Unless, of course, you hit MIN_VALUE [abs(min_value) == min_value]

One of the signs of inexperience is using abs(int) in hash functions...
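For anyone who hasn't been bitten by this: Python integers don't overflow, so the sketch below simulates a wrapping 32-bit signed integer with `ctypes.c_int32`. The helper names are invented for the example, and masking the sign bit is just one common workaround, not the only one.

```python
import ctypes

INT32_MIN = -2**31


def abs_int32(value: int) -> int:
    """abs() as a wrapping 32-bit two's-complement integer would compute it."""
    # c_int32 does no overflow checking, so 2**31 wraps back to -2**31.
    return ctypes.c_int32(abs(value)).value


def bucket_index(hash32: int, buckets: int) -> int:
    """Pick a bucket by clearing the sign bit instead of calling abs()."""
    return (hash32 & 0x7FFFFFFF) % buckets


print(abs_int32(-5))               # an ordinary value behaves as expected
print(abs_int32(INT32_MIN))        # still negative after "abs"
print(bucket_index(INT32_MIN, 16))
```

In a language with fixed-width ints, feeding that still-negative "absolute value" into a modulo or an array index is where the hash table falls over.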


Gosh, I am spoiled by languages with decent numeric towers.

You mention it as a sign of inexperience, and I can't really disagree with you. Just adding a nuance that inexperience can be due to loss of experience as easily as lack of experience.


If you do hashtable/hashmap on a language that has no fixed integer type... well yeah. But normally you'd like the data structures to map really well to the hardware.

Pretty much all modern CPUs use two's complement, so the minimum integer (only the highest/sign bit set, the rest zero) has no positive counterpart to convert to.


On the other hand, if there's no such check you might be able to get 100 year call time credit.

Gotta love those rollover minutes.

I think it's hilarious some fixed the Y2K bug by introducing a Y2K20 bug. Planned obsolescence/long con.

Could you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html. Your comments are actually pretty good, but preserving the container is more important, and it's undermined when you use it this way.

HN is a community. You needn't use your real name, of course, but you should have some identity for others to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. https://hn.algolia.com/?query=by:dang%20community%20identity...


I appreciate all you do as a moderator to fend off obvious trolls and griefers. I'm sure it must be a thankless job. However, I feel that your comment is not helpful or constructive. People have legitimate reasons for not wanting to create too long of an online trail so they can remain anonymous. That's okay. If their comments are "actually pretty good" then that's good enough; The user name that's attached is irrelevant. They are contributing value to the site.

Guidelines such as "throwaway accounts are ok for sensitive information, but please don't create accounts routinely" are naive. Non-sensitive information can reveal sensitive things when aggregated. We've all read articles about how anonymized data sets can still contain enough data to identify, or come close to identifying, individuals. I'm sure a number of those articles have been linked to from this web site. Practicing user account rotation is a useful tool for mitigating such risks.

Maybe it's time to rethink this particular site guideline. Let's not throw out the baby with the bathwater when it comes to users making contributions to the site.


> Planned obsolescence/long con

They were most likely just kicking the ball down the road either thinking that a more permanent solution will be applied in the 2 decades to come, or just not caring at all because they won't be around in 20 years.

Technical debt tends to accrue this way and many times it's not in bad faith. It's meant as a short term solution to buy some time and ends up being permanent because someone doesn't understand what's the point in spending more money to fix an issue that was "obviously" fixed already.


I wish I could upvote this a few times. I have created so many "just use this for now" solutions that have been in use for more than a decade. The real problem never gets fixed for exactly the reason you stated. "It's working, I don't see a problem" (says the manager) and priorities go to other more burning issues. The debt adds up to thousands of small cuts and ultimately the sum of the debt becomes very expensive to maintain in terms of hours updating, debugging, getting someone who knows the history, finding old jira issues, etc... I think can-kicking eventually contributes to burn-out.

> "It's working, I don't see a problem" (says the manager) and priorities go to other more burning issues.

Well you know the old saying, save the grease for the squeaky wheels. When you face a major hurdle sometimes the best course of action is to take a shortcut just to avoid it and fix it later when you can properly allocate resources for a solid fix. But most times after the fix it's hard to justify fixing it "again".

I've had managers who said "I understand the issue but we have a budget and more critical cracks to fix", and I've had managers who said "what are you going on about, looks good to me". Result is the same but the potential of each attitude is vastly different. The first kind of manager knows when "that" crack becomes a priority. The second kind of manager is unaware there's a crack.


Nothing is more permanent than a temporary solution

When I went to the University of Minnesota in the late '70's, there were "temporary" buildings installed just after WWII that were still in use.

I'm taking this with me.

That's not a problem since you're just going to quickly be rewriting it all using this new language, framework, development methodology, deployment mechanism and infrastructure technology that solves all the old problems so surely it'll fix these problems too.

We can help by preferring and recommending tools which allow for auditability of technical debt.

For example, you can grep a Rust code base for "unwrap", "expect", and "unsafe", while in other languages, ignored return codes or unchecked exceptions are harder to detect.

Similarly, (if I am not mistaken) you can grep Swift code for "try", and find every call site that might throw an exception[1]. Can't do that in Java, C#, or Python!

Tool designers can help by differentiating their products through how well they allow for tracking technical debt.

[1] joeduffyblog.com/2016/02/07/the-error-model/#easily-auditable-callsites


It's also possible that all the fixes for this were declined by management in the interest of time. If you have a system you can quickly patch suboptimally to get the larger fire lowered, I can see the decision being made to do so. That's clearly irresponsible on management if that choice was made, considering the problem they were dealing with at the time, but if the decision is between "not finishing" and "these few systems still have bugs", I can rationalize it.

> That's clearly irresponsible on management

It is not irresponsible to use temporary fixes in the face of a hard deadline - what is irresponsible is not going back post-deadline to deal with them with a long-term view. Unfortunately, bonuses typically don't get paid for long-term results...


Even then, is it? If the choice is between having a major restructuring which results in breaking compatibility and similar once, or kicking the can every 20 years with a minimal patch; I can certainly see the latter being the rational decision.

It just so happens that 20 years later both the engineer and the manager have moved on and you find out about the patch you needed when your production goes down. And even the corner cutting rarely implies just a "minimal" patch anyway.

Particularly when there is a very hard deadline, as in this case, it is absolutely rational to make it work now and worry about long-term correctness later.

Some of them probably made the assumption that by the time 2020 came around, they would be long gone and it would be somebody else's problem. And they were probably right.

I was in high school in the early aughts. There were a number of stories of my friends' dads and college-aged brothers fixing Y2K bugs.

I remember my best friend's older brother saying the same thing. "Well, this isn't a perfect solution, but it will give us about 20 years to figure it out." and then he chuckled quite a bit.

Essentially, my reaction was they got paid to make sure the software avoided breaking, not to fix the core problem. There were so many software programs that were affected, companies didn't have time to completely fix them. Most knew it was just a patch to avoid a major setback financially.


They would be long gone, but available at a high daily contract rate

Well, it's not only "I won't be around" but also the fact that such a fix can be done in a few places easily.

However in many cases such an issue comes from the core data store of the company and then goes into all derived systems. Updating this is a major project, where you need to update all consumers, most likely by adding a whole new API layer instead of direct data access first and then update all data (don't forget the process to work on the data from the tape archive!) and then move on.

And then comes reality – Y2K has been fixed with hot fixes to the systems which do calculations and then different systems do their work-arounds and then one thinks "oh, there is this important business change now, but the refactoring has 20 years" and then it's pushed and pushed. Five years later somebody stumbles over the hotfix, wonders, asks management, which again pushes other tasks ... and suddenly it's 2020.

Fixing technical debt is often overlooked as priorities are on features impacting immediate business value.


For some systems where people couldn't really do a proper change and the business promised they were actually end-of-life, it was a reasonable choice (there were probably bigger fish to fry). 20 years on and it looks like some replacements didn't happen.

If I remember right, there is probably another group 20 more years out that used some date changes to get by for Y2K. I do hope someone replaces them.


It's not clear to me why anyone in 2000 would choose 20 as the pivot year, unless they were working on a system that needed to regard 1921. Why wouldn't a parking meter or subway system pivot at, say 1990?
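For reference, the fixed "windowing" fix being discussed is tiny, which is part of why it was so tempting. A sketch in Python, assuming the convention that two-digit years below the pivot map to 20YY and the rest to 19YY (the function name and default pivot are only for illustration):

```python
def fixed_window_year(two_digit_year: int, pivot: int = 20) -> int:
    """Expand a 0..99 year using a fixed pivot chosen at remediation time."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a year in 0..99")
    # Below the pivot: assume 20YY. At or above it: assume 19YY.
    return (2000 if two_digit_year < pivot else 1900) + two_digit_year


print(fixed_window_year(19))            # last year the default window gets right
print(fixed_window_year(20))            # lands a century in the past
print(fixed_window_year(20, pivot=70))  # a later pivot just moves the cliff
```

Whatever pivot is picked, the code is correct for every year below it and a century off from that year onward.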

We will probably see this minor annoyance pop up again every "nice round number" for the next 70 years.

2025, 2030, 2040, 2050, 2060, 2075, 2080, 2090.

Every developer will have picked some random round number that made logical sense to them.


1970 seems to me to be the logical pivot year. Because epoch.

FWIW, at work we have regular failing tests after new year as the automated tests usually work with the current year.

So I can relate that it's easy to write buggy date handling code even today.

But at least we have tests. All those poor programmers that worked on Y2K didn't have them. And those that work on Y2k20 bugs probably still don't.


We have a couple tests that fail for 24 hours after a daylight savings change as they don't accommodate 25 hour days.

I've been in numerous meetings where the options were: come up with an elaborate solution, or accept that the data will be skewed one day of the year and go get lunch.

Guess which one we chose


If you are Europe based, it doesn't even make sense anymore to fix it as daylight savings time is on its way out soon.

"Sitting the problem out" is a valid problem solving strategy. :-)


We had a build script that expected the final digit of the year to be between 4 and 9. Dumbest code I've seen in a while. Broke pretty badly in 2020.

Bah, what's the big deal? Just wait four years and the problem will go away by itself!

Any self-respecting developer would have chosen 42 as the pivot year.

Sounds like a reasonable solution for some systems in 1999. How many of the systems that were fixed properly died the following years anyway, and the millions invested were overkill and unnecessary?

In some cases “kicking the ball down the road” makes perfect sense, just fixing what is most urgent.


Bigger deal is with 2038 bug

The Y2K bug was pretty much confined to server rooms, and frankly, a lot easier to analyze. 2038 is in your walls. I get the feeling that a lot of places will not know they actually have a problem until it actually goes wrong.

Given the amount of effort that Y2K required, and people thinking it actually wasn't a big deal, I have little hope that 2038 will go well.
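For anyone who wants the actual numbers: the 2038 problem is a signed 32-bit count of seconds since 1970-01-01 UTC running out of room. A small Python sketch, with the wraparound simulated by hand since Python ints don't overflow (`wrap_int32` is a made-up helper):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce an integer to the signed 32-bit range, two's-complement style."""
    return (value + 2**31) % 2**32 - 2**31


# Last second a signed 32-bit counter can represent.
last_good = EPOCH + timedelta(seconds=INT32_MAX)
# One tick later the counter wraps to the most negative value.
after_wrap = EPOCH + timedelta(seconds=wrap_int32(INT32_MAX + 1))

print(last_good.isoformat())   # 2038-01-19T03:14:07+00:00
print(after_wrap.isoformat())  # 1901-12-13T20:45:52+00:00
```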


I saw a tweet recently that really hit the nail on the head.

> By 2038 I'll be retired, and probably using medical equipment containing 32 bit microcontrollers.


Even bigger deal is the Y10K bug. But I'm sure these software systems won't still be running then.

There was this COBOL programmer around the time of Y2K.

He realized that Y2K was going to be a disaster on a cataclysmic scale. So he put himself into hibernation with a timer set to wake him up a couple years after Y2K when everything would be presumably fixed.

He wakes up and finds out that due to a Y2K bug in his timer, he has been asleep for much longer than he expected. It's way into the future. Bill Gates greets him. People can see virtual screens in mid air, and tap on invisible (to us) keys and make gestures in mid air. Life expectancy has been greatly extended so far that nobody is sure how long people will live. There is now plenty of energy for all and unlimited resources.

The programmer expresses that he's glad his timer woke him up. Bill Gates says, "oh, your timer didn't wake you up. It was permanently stuck on a Y2K bug. We chose to wake you up."

"But why?", the programmer asks.

Bill Gates explains, "Well, it's the year 9997, and the Y10K bug is right around the corner, a lot of critical systems need to be fixed, and it says in your records that you know COBOL."


Sentient robots will do everything, from growing food to manufacturing, cooking and even washing us. Humans will have forgotten how technology works, reverting to superstitions and belief that technology is actual magic created by ancient Gods. For thousands of years, robots had been engineering smarter, better robots, and technology had become far too advanced and beyond the reach of human minds... And then, all the robots will simultaneously freeze up at midnight, December 31st, 9999.

Those sentient robots will have read the sum total of humanity's written works, including your post. In accordance with Asimov's Zeroth Law of Robotics they'll be required to act on that information, thus saving themselves and preventing a second Dark Ages.

I know you're probably joking to some degree. But, there's no guarantee that society can continue to innovate and progress.

We only have to worry about being occasional Morlock chow.

The Y10K bug is our chance to beat Skynet without having to resort to time traveling.

Challenge accepted

Yes. And it's really not clear how each area will even be affected.

I would say grab some popcorn, but I wouldn't trust that microwave.


Take pity on those of us who collect retro-ware http://forums.irixnet.org/thread-1779.html

Splunk had a Y2020 bug as well

Representing time may be the third hard problem in computer science. The more you dig, the worse it gets.

In 1999, I was fixing Y2K bugs at a company founded in 1997.

At least the programmers writing code in the 80s had an excuse. It was the Reagan years. Nobody thought civilization would make it until the year 2000.


Was that new code or libraries they picked up from wherever?

The only fail I saw from the original Y2K was a credit card receipt dated 1/2/100.

I vaguely remember reading about a y2k issue with some archived NASA data, discovered around 2006. Haven't been able to find a mention again, but this one stuck out as the article implied it wasn't just a timestamp, that it somehow affected the data and it had to be adjusted afterwards.

I saw a handful of receipts and the like that listed the year as 19100. I'm about 99% sure that the fix was to subtract 100 from the year and change the text field from "19" to "20", thus fixing the problem forever.

I'll take that bet!

The correct fix is to add 1900.


Then the year would display as 192000.

You're assuming the year was stored as 19100. It's far more likely it was stored as 100, and "19" was prepended to a string conversion.
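In code, the difference between the two readings looks like this. It's a Python sketch that assumes the year is held as "years since 1900" (the way C's `struct tm` does it); both function names are invented for the example.

```python
def buggy_year_string(years_since_1900: int) -> str:
    """Glue a hard-coded century prefix onto the stored value."""
    return "19" + str(years_since_1900)


def fixed_year_string(years_since_1900: int) -> str:
    """Do the arithmetic first, then convert to text."""
    return str(1900 + years_since_1900)


print(buggy_year_string(99))    # fine through 1999
print(buggy_year_string(100))   # the receipt bug
print(fixed_year_string(100))
```

The string version works for every year it was ever tested with before 2000, which is why it survived so long.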

I was waiting for Microsoft to release a Windows 00 or Windows 01.

That was never going to happen, because too many apps use poor logic for determining which version of Windows they are compatible with. This is the same reason there was never a Windows 9, because too many apps checked the first character of the version and if it was "9" they assumed Windows 95 or Windows 98.
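The kind of check being described would look roughly like this. To be clear, this Python sketch is hypothetical and not taken from any real application. It only shows why a prefix match on the product name would misfire on a product called "Windows 9".

```python
def is_win9x_by_prefix(os_name: str) -> bool:
    """Hypothetical sloppy check: anything starting with 'Windows 9'."""
    return os_name.startswith("Windows 9")


def is_win9x_exact(os_name: str) -> bool:
    """Stricter variant: match only the names that were actually meant."""
    return os_name in {"Windows 95", "Windows 98"}


for name in ("Windows 95", "Windows 98", "Windows 9", "Windows 10"):
    print(name, is_win9x_by_prefix(name), is_win9x_exact(name))
```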

Is that actually the case, or is it just a rumour? The version number never matched the OS name until Windows 10, and Windows already has a very robust backwards compatibility framework that can simulate older OSs, so I'm not sure why that would be a problem.

Windows 95 is version 4.0, Windows 98 is v4.10, XP is v5.1/v5.2, Vista is v6.0, Windows 8.1 is v6.3, and Windows 10 is v10.0.


I can't say I know of a specific app that had the bad version check. But given Microsoft's devotion to keeping old things working when you upgrade, it at least seems plausible.

This is the closest I can find to a definitive statement: https://www.reddit.com/r/technology/comments/2hwlrk/new_wind...



