Several European central banks were unable to process transactions ;-). Can't provide more details, to protect the innocent. Since then, it has been a standard within the Testing group of that vendor to always have a running platform set up with a date 6 months ahead (clock sync from a different source). Something I added to my own Software Engineering and Testing standard practices.
During this one-hour "Fall Back" on the clock, as is our Basis team's best practice (SAP Notes #7417, #102088 and others), we intend to gracefully shut down all Production applications prior to 2:00 AM, at which point the "fall back" happens and reverts clock time by one hour. We would then wait for clock time to advance past the "new" 2:00 AM local time, and start bringing up applications from 2:05 AM EST.
Like, I’m not convinced “turn it off” isn’t an improvement.
They're that bad.
Empirically, it's associated with a higher rate of medical errors, which is a big part of the drive for electronic records.
What I like to do to avoid this situation is to use property-based testing to check that the code works for all dates in the desired time span. For Python I use hypothesis in combination with freezegun for that. Here is an example of what that could look like:
from datetime import datetime, timezone as tz
from freezegun import freeze_time
from hypothesis import given, strategies as st

# TODO: Refactor before 2038!
if datetime.now(tz=tz.utc) > datetime(2038, 1, 19, 3, 14, 7, tzinfo=tz.utc):
    raise Exception("This code is not working after UNIX time overflow")

@given(dt=st.datetimes(max_value=datetime(9999, 12, 31, 23, 59, 59), timezones=st.just(tz.utc)))
def test_code_works_on_any_date(dt):
    with freeze_time(dt):
        # replace this with a call to the code under test
        assert datetime.now(tz=tz.utc).year == dt.year
> don't test behavior with different dates
It basically tests with different dates every time it's run.
That seems different from what I understand by "continuous integration".
Long quiet pause, then a series of curses. So we deployed at 2am instead.
> Something I added to my own Software Engineering and Testing standard practices.
Making sure it works at the business domain level is another thing entirely.
Most tests are written with a fixed date, or now(). Using now()+delta instead can help discover future bugs and give you some time to remedy the issue. It's often annoying to write tests like that, though, without them breaking on unrelated timing issues, compared to using a fixed value.
It's a nice early-warning system for Y2K or Y2038-like problems.
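As a sketch of the now()+delta idea (the function under test and the five-year horizon are hypothetical, chosen just for illustration):

```python
from datetime import datetime, timedelta, timezone

def format_invoice_date(d):
    # hypothetical code under test
    return d.strftime("%Y-%m-%d")

def test_formatting_survives_the_future():
    # Instead of a fixed date, probe a date ~5 years ahead so that
    # rollover bugs surface in CI before they happen in production.
    future = datetime.now(timezone.utc) + timedelta(days=5 * 365)
    assert len(format_invoice_date(future)) == 10
```

The trade-off is exactly the one mentioned above: such tests are not perfectly reproducible, since the asserted value moves with the clock.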
This is fuzzing: https://en.wikipedia.org/wiki/Fuzzing
Fuzzing would be testing the date parser by passing it '25-44-2021' to see what breaks.
Invalid input is invalid input
Inputs don't have to be invalid for it to be fuzzing.
Here's the original link https://github.com/rails/rails/issues/5239
Unfortunately, they fixed the date and his bender account got renamed.
Without this, there's no value in running a duplicate environment like this. Automated tests won't capture human interaction perfectly...
If you entered a birth date, the software would calculate the pet's age, and we noticed we had the odd quirk, like a 400 year-old cat etc. I couldn't see an obvious pattern to the anomaly, and it didn't occur often, so I created a support ticket.
After a couple of days, I received a response that 'dates were hard' and we should correct the ages manually as needed.
So there you have it: Dates are hard.
(No, the bug wasn't ever fixed.)
Choices to represent dates as integers (seconds from an epoch), or as strings, or as arrays, or as other data types all come with a lot of non-obvious consequences and edge cases to deal with. I always use a provided "date" datatype from a library, and never try to roll my own with base data types if I have any choice in the matter. And they are still hard, depending on what range of dates you need to be able to handle.
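As a minimal illustration of why leaning on a library date type pays off: leap years and year rollover come for free, where hand-rolled (year, month, day) arithmetic would have to handle them explicitly.

```python
from datetime import date, timedelta

# Rolling your own "add N days" on (year, month, day) tuples means
# handling month lengths, leap years, and year rollover by hand.
# A library date type gets all of that right:
d = date(2020, 2, 28) + timedelta(days=2)
assert d == date(2020, 3, 1)   # 2020 is a leap year, so Feb 29 exists

d = date(2021, 12, 31) + timedelta(days=1)
assert d == date(2022, 1, 1)   # year rollover handled for free
```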
Well, it's 2022 and 2038 doesn't look that far away now. Hopefully I'll be retired by then and won't have to deal with it. But, something tells me that's not going to happen.
Something like this is unacceptable.
Date bugs are hard, but this one isn't.
In another universe The 400 Year Old Cat could have been our 500 Mile Email.
It's just another form of Y2K, or Y2038.
Being an amateur developer, I have never tried to touch a date in any way other than through a specialized module (arrow in Python, moment or luxon in JS, ...).
I know only a few such modules, so everything that remotely looks like a time/date computation is a nail for my universal hammer.
Dates are hard.
Oh yes. The nightmare of software that reinvents the wheel.
Instead of using ISO 8601, they feel the need to do something else.
I currently suffer from how time was botched in the otherwise great backup program Borg.
 https://xkcd.com/1179/ - yes, xkcd has an entry for everything
See for instance https://github.com/borgbackup/borg/issues/4832
A long integer is 32 bits. Since it's signed, one of those bits is used for the sign and 31 are used to represent the actual number. With 31 bits, you can express numbers from 0 to 2^31 − 1, or 2,147,483,647. 2,201,010,000 is larger than 2,147,483,647.
If they had used the format YYYYMMddhhmm, then their numbers for every year would have overflowed the field, because 201,801,010,000, for example, is too big to fit in the long. Every year past year 99 would be too big.
But if it's, say, 2008, and you realize that including the thousands and hundreds will always make it overflow, 801,010,000 would not overflow. So why not just cut off the thousands and hundreds? After all, it's not like we'll run into a 1999/2000-style problem before the software is replaced.
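To make the overflow concrete, here is a small sketch showing the two-digit-year YYMMddhhmm value for 2022-01-01 00:00 exceeding a signed 32-bit field, with the two's-complement truncation written out explicitly (Python's own ints don't overflow):

```python
INT32_MAX = 2**31 - 1          # 2,147,483,647

# YYMMddhhmm for 2022-01-01 00:00 exceeds a signed 32-bit int:
version = 2201010000
assert version > INT32_MAX

# Truncated to 32 bits, two's complement wraps it to a negative value:
wrapped = version & 0xFFFFFFFF
if wrapped >= 2**31:
    wrapped -= 2**32
assert wrapped == -2093957296
```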
It has affected 100% of the Exchange servers I'm overseeing. In every case, a complete halt to mail was corrected by a fix only described on Reddit and Twitter.
Seriously, if you can't add, subtract, multiply, and divide them, then they aren't numbers and you shouldn't use a numeric data type for them. This is CS 101 knowledge. How badly does MS run their engineering?
You’ve missed a ton of other scenarios here, the relevant one being a compare operation with a < or > output. These are version numbers. Version numbers the way they’re used here increment, they are not random strings, which is what you’re effectively suggesting.
If the version numbers here were represented as strings they’d need to be converted to a number at some point to compare them, and you’d probably run into the same problem you have here.
Of course I’m sure you’d have the perfect, completely applicable, and universally accepted solution then as well…
They still aren't compatible with number types. They are groupings of numbers at best.
2.9 is an earlier (not less than) version than 2.10.
3.1.6 isn't a (strict) numerical format.
5.4.11-beta1 has non-numerics.
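One common sketch of getting the ordering right (simplified; pre-release tags like "-beta1" would need extra handling) is to compare versions as tuples of integers rather than as strings or floats:

```python
def parse_version(s):
    # Split a simple dotted version into a tuple of ints.
    return tuple(int(part) for part in s.split("."))

assert parse_version("2.10") > parse_version("2.9")   # tuple compare gets it right
assert "2.10" < "2.9"                                 # string compare gets it wrong
assert 2.10 == 2.1                                    # float compare loses information
```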
The problem is they shoved the individual parts into a single 32-bit integer, which just isn't enough space so they had to compromise and make some of the numbers have extremely small ranges.
If you "just" increment you leave no room for showing patching in your versioning strategy.
Perhaps you're missing what I meant by 'increasing' - it's just a 1:1 conversion of the version number to a single integer, it's not like you're assigning every version a new number. 3.9.3 would become something like 3009003, and 3.10.1 would become 3010001. If you compare those representations then 3010001 always compares higher, and there's also nothing preventing you from releasing 3.9.3 after 3.10.1.
My point was that doing a format like the above (major, 3-digit minor, 3-digit patch) is not fundamentally different from storing each individual number as its own integer, which most version libraries I've seen do at some point. The problem is just that the range of each number is restricted significantly by requiring 3 digits, rather than if you used individual 32-bit or 16-bit integers for each one.
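A minimal sketch of that packing scheme (the digit widths here are illustrative, not taken from any particular product):

```python
def pack_version(major, minor, patch):
    # Pack into a single comparable integer; each of minor/patch
    # is limited to 3 decimal digits (0-999) by this scheme.
    assert 0 <= minor <= 999 and 0 <= patch <= 999
    return major * 1_000_000 + minor * 1_000 + patch

def unpack_version(n):
    return n // 1_000_000, n // 1_000 % 1_000, n % 1_000

assert pack_version(3, 10, 1) > pack_version(3, 9, 3)   # 3010001 > 3009003
assert unpack_version(3010001) == (3, 10, 1)
```

The asserts on the component ranges are exactly the bound check that was missing in the Exchange case.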
A "Version Number" is not just a "Number", as I said right at the start.
Just to clarify, having an integral incrementing number as the whole version number strategy explicitly prohibits the use of major/minor/patch.
Unless you intend on using 0s (instead of fullstops) as delimiters I suppose.
Edit: as you demonstrate in your post. I'll stop posting now :)
Does “1.2.0” + “2.3.0” make sense in your world?
And if you’re like what about ordering and comparisons, then sure, you have a partially ordered set.
And not everything can be mapped 1-1 with the integers; reals for example, and therefore arbitrary strings. (Of course, that's only theoretically; with a limited string length you can obviously manage)
You're right that the ordering is the key feature of a version representation; munging it into an integer gives you that for free but risks getting to overflow quite easily (as in this case). I guess the conclusion is, use a language like Rust or Ruby when you can define ordering easily on structured data.
That should fix a lot of version number problems (prolly not all) and I've been stealing the idea for my own versioning lately.
> Version numbers the way they’re used here
And none of that applies to them. Yes, there are versioning schemes that work differently, irrelevant to the example here.
Not necessarily. Some projects go from 2.9 to 3. I'm not sure what you are trying to prove.
I think version numbers are a partial ordering at best.
Every non-number, as you put it, is just a number to a computer. Comparing binary is faster than sorting strings, and it makes zero sense to waste memory simply so a human could potentially read it.
Otherwise, a timestamp, a date, a commit hash or an increment would do.
For comparison, I think some here would be shocked to learn their IPv4 address is stored as an unsigned 32-bit integer. It's not a number, and definitely not faster to use as a string.
10000 can be a valid year according to ISO8601, but sorted as a string it would come before year 2000, which would obviously be incorrect.
Here's from the ISO:
> 3.5 Expansion By mutual agreement of the partners in information interchange, it is permitted to expand the component identifying the calendar year, which is otherwise limited to four digits. This enables reference to dates and times in calendar years outside the range supported by complete representations, i.e. before the start of the year [0000] or after the end of the year [9999].
Did you mistake ISO-8601 for RFC-3339? Even there lexicographic sorting isn't guaranteed, since you can have a pure date or a pure time or a date-time combined, and there's a choice between "T" and " " to separate dates from times in a date-time but it's much more likely to work than with ISO-8601.
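A quick illustration of the year-10000 sorting problem with expanded year representations:

```python
dates = ["2000-01-01", "9999-12-31", "10000-01-01"]

# Lexicographic order compares character by character: "1" < "2",
# so the year 10000 sorts before the year 2000.
assert sorted(dates) == ["10000-01-01", "2000-01-01", "9999-12-31"]
# Chronological order requires actual date parsing (or fixed-width years).
```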
In Math we call that subtraction!
20.40 == (20.39 + 0.01)
#if BUILD_VERSION >= FIRST_VERSION_THAT_WORKS
The bug here is that they picked an encoding that was obviously going to overflow in a Y2.022K bug, not with the technique.
Speaking of LINUX_VERSION_CODE, it had a similar problem recently. See the article at https://lwn.net/Articles/845120/ which was summarized in this year's LWN.net retrospective at https://lwn.net/SubscriberLink/879053/aaea44782e8c760d/ as follows:
"Sometimes the most obvious things can be the hardest to anticipate; consider, for example, the case of minor revision numbers for stable kernel releases. A single byte was set aside for those numbers, since nobody ever thought there would be more than 255 releases of a single stable series. But we live in a different world, with fast-arriving stable updates and kernels that are supported for several years. It should have been possible for us, and for the developers involved, to not only predict that those numbers would overflow, but to make a good guess as to exactly when that would happen. We were all caught by surprise anyway."
All computers work in finite representations. All values can overflow. Good engineering is about working within that world and not trying to Quixotically design its absence.
I'd bet the conversation about this way back when it was being decided was "will 3 digits for daily version/build be enough" and the answer "go with 4, better safe than sorry".
Is it possible that at Microsoft's scale this would actually make a significant difference?
Edit: reading all the comments here, I get the feeling high level programmers don't fully understand how things are represented at the low level, there is no purpose in storing a version/serial as a string.
If the OP bug was really triggered by a string of numerical characters not fitting into a 32bit integer, all the trouble of saving a few bytes on the client system were not worth it.
Plenty of data is stored in varied types for processing that are efficient for a computer, for example DNS has been using date-based serial numbers successfully for decades, and stored internally as uint32.
It's a little silly that a test didn't catch this problem, but using a version number somewhere high up in the long makes total sense to me. After all, version numbers can be anything you define them to be, as long as they're unique. You'd like them to be sortable, but they don't even have to be.
I don't think it's that they don't care, I suspect many simply do not understand what is under the hood in the first place; and they don't need to when doing web development. The fact that some here suggest storing version as a string is some how faster and safer is a big give away.
That does mean rolling your own "greater-than-or-equal" operation though.
Numerically, 2.2 is greater than 2.12, but v2.2 is actually an early predecessor of v2.12.
2.02 is smaller than 2.12
"202201010001" > "202112122359"
evaluates to True in Python, SQL and many other languages.
Plus, if they weren't, then even numerical comparisons would fail, e.g. 1012021 (Oct/1/2021) < 5102020 (May/10/2020), so your argument is moot.
"%04d%02d%02d" % (999,1,1)
Under the hood these systems will convert to some numerical format to compare which version is newer. You store it as a (proper) numerical format or you serialize it into one later, but at the end of the day it’s a number because that’s how you compare it.
The actual fix is exactly what the original comment states: a version is not a number.
You expect Microsoft Exchange to stop being used some time in the next 20 years?
Even more likely, we'll have new protocols the current version doesn't support, so the bug will have been long fixed or noticed.
But in your alternate timeline the issue would only have been hit in 2043, and since in our timeline it was not noticed before it was hit, there is no reason to believe it would have been noticed in the alternate.
So the exact same issue would have occurred, a few years later.
I fail to see why that would be an improvement.
> Even more likely, we'll have new protocols the current version doesn't support, so the bug will have been long fixed or noticed.
That is a completely unsubstantiated assertion.
I never suggested it was any improvement except for the advantage of time; although I shouldn't have used the word 'correct'. Signed 32-bit dates have a well-known limitation that people will likely hunt down and fix over the next 20 years. But yes, it's all unsubstantiated, no doubt there.
The real main point of my post you first replied to is that it doesn't matter that the version was stored as an integer. The failure happened when they used their own format and didn't confirm it was properly bounded. Everyone here is having a hard time grasping how their data is stored and processed at the low level - your 'string' is still just an array of 8-bit bytes.
Storing an only-increasing number as any form of integer is a perfectly acceptable, and efficient way for a computer to compare and process. Version numbers are one of those. Phone numbers are obviously not.
Everything in software takes data in one form and processes it into another. There are no 'strings', there are no version numbers. There is only binary. They simply did not give enough space to store it, thats all.
DNS is a perfect example, which stores serial numbers internally as an unsigned 32 and has worked for decades and will continue to. But they chose the format of YYYYMMDDnn which will last far longer than a 2 digit year.
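A sketch of that DNS-style scheme (the YYYYMMDDnn convention is the widely used one; the helper function itself is just illustrative):

```python
UINT32_MAX = 2**32 - 1   # 4,294,967,295

def dns_serial(year, month, day, revision):
    # Conventional YYYYMMDDnn zone serial, stored as an unsigned 32-bit int.
    serial = year * 1_000_000 + month * 10_000 + day * 100 + revision
    assert serial <= UINT32_MAX, "serial no longer fits in uint32"
    return serial

assert dns_serial(2022, 1, 1, 1) == 2022010101
# The scheme only overflows uint32 once the year exceeds 4294:
assert dns_serial(4294, 1, 1, 1) <= UINT32_MAX
```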
The harbour bridge has lights on it.
For what financial benefit?
The fireworks display itself, I can't understand where the financial benefit is and it costs millions of dollars. ($7 million as far as I'm aware)
I think that you could make a financial motivation statement for making good software (it makes it less likely for people to switch, for example), but my broader point is: why does everything have to be financially motivated?
Because, a lot of stuff that we enjoy doesn't seem to be primarily financial in nature.
>For what financial benefit?
Same for the Opera House. Imagine all of those stock photos of Sydney. Now narrow those down to the nighttime shots of Sydney. Imagine those without lights on the bridge or opera house. For that matter, any of the lights on any of the buildings. What do you have left? A really boring photo. Nobody wants to visit a city with really boring photos. What's the financial benefit of that?
An aesthetically pleasing scenery would be one of those benefits.
Evoking desire is key.
We agree that the Free Market(tm) incentives are to maximise desire even at an up-front loss (Fireworks, Lighting the bridge) but the parent said that it's free market economics that prevent Teams and Windows from being desirable to use.
Is someone wrong or am I misunderstanding something?
Is there more profit in awful things? Why does the Harbour bridge have lights then?
That is so so so different from a group of engineers building a product and totally not grasping that while it technically works, it is not pleasant for the end users. It takes a certain level of asshattery to assume that the devs are going out of their way to make it this way.
There is the same financial incentive for MS to fix their Y2k22 bug.
The incentive structures are built to make financial incentives take precedence with most things
Bad enough to be a 2.53 Trillion Dollar company.
Some numbers* cannot be meaningfully multiplied or divided. For example, 3rd place, 30 °F, or 37.388° N, 122.067° W.
Some numbers can only be added to numbers of a different type. For example, x °C + y K, 3rd grader + year, absolute position + offset. Similarly, subtracting two of these numbers results in a different unit, e.g. position − position = offset.
it would be great if someone could comment the exact terminology for what I'm trying to describe.
- nominal scales (e.g. man/woman),
- ordinal scales (elementary school/middle school/high school),
- interval scales (degrees Celsius) and
- ratio scales (human height, degrees Kelvin).
Only ratio scales may be multiplied and divided. Interval scales may be subtracted and added. Ordinal scales may only be compared (less/greater/equal). Nominal scales may only be compared in terms of (in)equality.
You may also be looking for fields: https://en.m.wikipedia.org/wiki/Field_(mathematics). You could probably also describe the relation between different number types as linear operators over a vector space
There are libraries that implement this concept in other programming languages (C++, Java, C#, etc.)
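As a rough Python sketch of the interval-scale idea (the type names are made up for illustration): absolute temperatures can be subtracted to give a delta, and a delta can be added back, but adding two absolute temperatures is rejected.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TempDelta:
    value: float

@dataclass(frozen=True)
class Celsius:
    # Interval scale: differences are meaningful, sums are not.
    value: float

    def __sub__(self, other):
        return TempDelta(self.value - other.value)    # temp - temp = delta

    def __add__(self, other):
        if isinstance(other, TempDelta):
            return Celsius(self.value + other.value)  # temp + delta = temp
        raise TypeError("adding two absolute temperatures is meaningless")

assert Celsius(20.0) - Celsius(15.0) == TempDelta(5.0)
assert Celsius(20.0) + TempDelta(5.0) == Celsius(25.0)
```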
For uint, you actually have x + y = (x + y) mod (MAX_UINT + 1).
For float, sometimes x + 1 = x.
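Both quirks are easy to demonstrate (the modular arithmetic is written out explicitly here, since Python's own ints never overflow):

```python
# 32-bit unsigned addition wraps modulo 2**32:
UINT32_MOD = 2**32
x, y = 4_000_000_000, 1_000_000_000
assert (x + y) % UINT32_MOD == 705_032_704   # not 5_000_000_000

# Double-precision floats lose integer precision past 2**53,
# so adding 1 can leave the value unchanged:
assert 1e16 + 1 == 1e16
```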
The issue is not that date is not a number, but that people poorly reason about common cases of date arithmetic. Similar stuff comes up for floats and ints but they're just easier to reason about.
Would you spell this out for us?
You’re right that they shouldn’t be implemented a numeric data types.
That said, dates do have properties of modular arithmetic, so in a strange sense they are very much numbers…
Most bugs are stupid and should be obvious. They happen.
You can look at your old code and say WTF, so judging decisions in somebody's else complex system without having any idea about requirements and constraints seems pointless.
The problem here was basic overflow, they reached a version number exceeding the capacity of the numeric type.
My grandparents first phone number was 13. Not because they got in super early and had the 13th phone number in the country, but because that was the 13th number for their exchange. When a bunch of exchanges were put together everybody got another number in front of theirs. Their phone number is much longer today, but it still ends in 13.
I guess you could treat the phone numbers as strings, but I don't know if you can do that with the Danish zip numbers.
2400 Copenhagen NV (North West)
2412 Santa Claus, Greenland
2450 Copenhagen SV (South West)
3790 Hasle (close to Roskilde)
3900 Nuuk (Geenland)
3992 Dog sled patrol "Sirius"
The US used to prefix phone numbers with letters (like MElrose 1-2345) as an aide-mémoire during the transition to longer numbers with area codes. And you'll still see businesses advertising with letters in place of digits (like 800-GOT-JUNK, which is a junk removal firm), but those never get entered into phone number fields like that. The best reason to store phone numbers as text, though, is the digit groupings. In North America it's 3-3-4, but other countries group their numbers differently to help people remember them. So by allowing users to enter numbers in their own format, you're making it easy for them.
 The main reason to do this is if you're in the business of sending out mailings. The US postal service gives discounts if you sort and bundle by ZIP code, and mixed ZIP and ZIP+4 values would make that hard.
You can cluster them for distance without storing them as an integer. Just throw them in a graph data type and compute edges having actual integral weights. The ZIPs themselves are not what you want to be doing your math on.
The efficacy of any computational operator on a specific representation does not impute the same for all instantiations since the semantic link between identity and representation may only exist in the mind of the programmer and not the computational device.
> Because they sure are autoincrement integers in almost every DBMS.
The autoincrement occurs before it's a database ID. The generator for database ids is a sequence, that has no bearing on the semantics of the database ids.
As many people have already commented, it's reasonable to encode things as numbers as long as you keep in mind there's an encoding operation with its possible flaws.
Is it better this way?
(You can see the problems clashing with each other if you search for discussions about defaulting ids to 32 or 64 bits. So they are not completely solved; the practices are just good enough that things don't usually break in practice.)
At the end of the day, all you have on a computer is a bitfield to mess around. You will always have to deal with encoding at some point.
So sure, store a date in an “integer” but if you try to treat it like one instead of a bitfield you’ve made an error and if your storage isn’t wide enough to hold all dates that’s gonna be a PITA later.
I think the real problem is ... how many good engineers and engineering managers nowadays want to work at MS?
Sadly, sometimes you can, regardless of whether you should.
Anyone who stores ZIP Codes as an int should have his dev license revoked. You've just corrupted the data you store for a hundred million people in the northeast.
I'm currently dealing with a situation where a system developed by an offshore team stored Social Security numbers as integers. They had no idea that an SSN can start with a zero, and didn't even do a basic web search to see what the possible range of values is before designing the database and application.
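The leading-zero problem is easy to reproduce:

```python
# Storing an SSN or ZIP code as an int silently drops leading zeros:
zip_code = int("01234")
assert zip_code == 1234              # the leading zero is gone

# Recovering it requires knowing the fixed width out-of-band:
assert str(zip_code).zfill(5) == "01234"
```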
Only if you're 100% sure your data is completely clean. Only very rarely is this the case, especially with ZIP Codes, because the data almost always traces back to human input.
The initial query may be quicker, but then you have to compensate for the missing digits elsewhere, likely multiple times. You have to consider the expense to the whole system, not just to one query.