I talked to someone who worked on this team. A mere 30-minute conversation opened my eyes to the complexity, and he wasn't even scratching the surface. It's hard for most people to appreciate the true complexity here. I agree that accountability is needed, but even a trillion dollars won't be enough for bug-free code, especially with dates.
The nature of this specific bug isn't something related to hard date stuff, like leap years, leap seconds, DST, or time zones. It's just that they formatted version strings as YYMMDDHHMM (e.g. 2201010001) and tried converting that number to an int32, whose maximum value is 2,147,483,647, so the conversion overflowed once 2022 struck.
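To make the arithmetic concrete, here's a minimal sketch of the range check. The helper name `fits_int32` is made up for illustration, and this only shows that the value is out of range, not how the actual product handled the failed conversion.

```python
# Signed 32-bit integers top out at 2**31 - 1.
INT32_MAX = 2**31 - 1  # 2147483647


def fits_int32(version: str) -> bool:
    """Return True if a YYMMDDHHMM version string fits in a signed int32.

    Hypothetical helper, not taken from the real code base.
    """
    return int(version) <= INT32_MAX


last_of_2021 = "2112312359"   # 2021-12-31 23:59
first_of_2022 = "2201010001"  # 2022-01-01 00:01, the example above

print(fits_int32(last_of_2021))
print(fits_int32(first_of_2022))
```

Any version string starting with 22 or higher is larger than 2147483647, so every build stamped in 2022 or later is out of range.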
It's not even really important that it's a date bug from a testing perspective; it's just a bug and a signal of an insufficient testing pipeline. You make a product (including a definition file), you test it before publishing it. You have canaries. If testing it happens to involve moving the date forward, you do that.
How hard is it to write a test that checks every date in the next 5 years, so that we find out the function will break while there is still enough time to publish a fix through the regular version pipeline?
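Something like this would do as a rough sketch. It assumes the YYMMDDHHMM format described upthread; `version_number` and `first_failing_date` are hypothetical names, and a real test would call the actual parsing function instead of this stand-in.

```python
from datetime import date, timedelta

INT32_MAX = 2**31 - 1  # largest signed 32-bit value


def version_number(day: date, hour: int = 23, minute: int = 59) -> int:
    """Build the YYMMDDHHMM number for a given day (stand-in for the real code).

    Defaults to 23:59, the largest value that day can produce.
    """
    return int(f"{day:%y%m%d}{hour:02d}{minute:02d}")


def first_failing_date(start: date, years: int = 5):
    """Walk every day in the look-ahead window and return the first day
    whose version number no longer fits in an int32, or None if all fit."""
    end = start + timedelta(days=365 * years)
    day = start
    while day <= end:
        if version_number(day) > INT32_MAX:
            return day
        day += timedelta(days=1)
    return None


# Run from a fixed start date so the result is reproducible.
print(first_failing_date(date(2021, 1, 1)))
```

In CI you'd pass `date.today()` as the start and fail the build whenever the result is not `None`, which gives you a warning roughly as long as the look-ahead window.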
Good point. It feels like what you're describing is like time travel to erase the bug from the timeline completely, vs. getting a one-time advance warning. I guess for date-related bugs where you have no other way to avoid them, you would need to set the clocks forward at least as far as the time it takes you to get a fix in and regression tested, and to give your users enough time to do the same.