2038: Only 21 years away (lwn.net)
682 points by corbet on March 21, 2017 | 315 comments

This is (seriously) a part of my retirement planning. I'll be mid-50s when this hits, and have enough low level system knowledge to be dangerous. In about 15 years, I'll start spinning up my epochalypse consultancy, and I'm expecting a reasonable return on investment in verifying systems as 2038 compliant.

I hope you've trademarked the term "epochalypse" because it is really catchy! I love it.

I will be just 5 years away from retirement, so I will use this as an opportunity to shore up my 401k.

Of course, the singularity will be here by then and fix it for us just before turning us all into paper clips, yada yada.

I will advertise with the slogan "Epochalypse... NOW."

Sell manual water filtration systems. I heard people made a fortune on those before Y2K.

Make sure your 401k is the first thing you certify as 2038 compliant

> I will advertise with the slogan "Epochalypse... NOW."

or you can do the same as the y2k people did: Y2K... TOMORROW!

> turning us all into paper clips

A paperclipocalypse?

Paperclipalypse works for me

2038: Odyssey 2.5

He didn't invent it. It's been around about as long as people have seen this problem coming.

Maybe he can title himself an epochalist.

The epochstle.

This is part of my plan as well, thus https://2038consulting.com/ registered 6 years ago.

Genius. Are you accepting A rounds?

My retirement planning is to work on the Y10K problem. When the year 9997 comes along, everyone is going to start worrying about the rollover to five digit years. In about the year 9995 I will start seriously brushing up on my COBOL.

Yes, I know some of us will never retire. 8000 years in the future career planning is an interesting thing to think of though. 8000 years ago the only thing that was going on was some neolithic agriculture. Domestication of the Jungle Fowl (modern day chicken) in India and the beginning of irrigated agriculture in Sumeria were probably the biggest news items of that millennium. I guess someone from 8000 years ago could have said he'd be raising chickens or doing irrigated agriculture in 8000 years and wouldn't have been wrong had he lived that long. Makes me think of F.H King's book "Farmers of Forty Centuries" about how Chinese agriculture has been farming the same fields without artificial fertilizer for 4000 years.

There were almost certainly some forms of relatively advanced seafaring 8000 years ago possibly including skin boats, sails and paddles, ropes, sealants and astronomy. Also, fairly sophisticated metallurgy was widespread with at least silver/iron/gold, possibly bronze. Writing was known to some cultures. Horses, camels and water buffalo were likely all domesticated. Use of drying/smoking for preservation and curing of meat. Yogurt may have been known in some areas. Advanced pottery. Probably nontrivial herbalist / medicinal / architectural / construction knowledge. Plus of course trapping, fishing, textiles, stonework, etc.

I can believe seafaring, astronomy, and metallurgy. But yogurt? Now you're just pulling my chain.

I don't know how much you know about yogurt, so I apologize in advance if this sounds condescending; but yogurt has been eaten since at least the 5000s BC, and is easy to produce, probably even by accident. It's obtained by controlled souring of unpasteurized milk; clabber, which is almost too sour to be edible but is safe to eat if you can stand the taste, comes from spontaneous souring.

Fernand Braudel (in The Structures of Everyday Life) talks of how it was the staple food of the poor in Turkey, and I think in Persia. US commercial yogurt is weak and sugary; the Eastern variety is much more lifelike.

Yes, technology is subject to the Lindy Effect. [0] It's a good reason to learn both Unix and farming.

[0] https://en.wikipedia.org/wiki/Lindy_effect

Without artificial fertilizer but I recall reading that they massively improved yields from 1000-1500, let alone a span of 4000 years.

Why make "artificial fertilizer" a goalpost? Yield is what matters, there's apparently a million things that go into improving rice paddy yields.

This is obviously a joke, but here's a serious reply to it: There will already be serious Y10K problems in 9997, just like there were Y2K problems earlier than 1997.

Epochs & date formats aren't only used to represent & display the current date, but dates in the future, e.g. think about reservation systems, graphing libraries that display things 10-15-50 years in the future etc.

My bank's website allowed me to set a recurring monthly transfer until the year 9999.

The Long Now Foundation uses five-digit dates like 02017 in their work. :)

> The Long Now Foundation uses five-digit dates like 02017 in their work. :)

Hopefully they don't have any octal-related bugs.

Probably only in their front end code.
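The octal joke points at a real hazard: in C, and in legacy JavaScript outside strict mode, a numeric literal with a leading zero can be read as base 8. A small Python sketch of what a five-digit year like 02017 turns into under that reading. Python 3 rejects a bare `02017` literal, so the base is passed explicitly here to simulate an octal-by-default parser; the variable names are made up for the illustration.

```python
year_text = "02017"

# Decimal parse: the leading zero is simply ignored.
as_decimal = int(year_text, 10)

# Simulated octal parse: 2*8**3 + 0*8**2 + 1*8 + 7
as_octal = int(year_text, 8)

print(as_decimal, as_octal)  # 2017 1039
```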

To really future proof themselves they should have gone with 2.017*10^3

Nobody will care about significant year digits in a few billion years

People get bans on Xbox Live until 9999-12-31. I guess it's probably just an input validation thing. Also, I think it's fair to give most people their account back after that amount of time.

The only codebase I've checked personally can't handle Y10K, so there should be plenty of business: https://github.com/python/cpython/blob/6f0eb93183519024cb360...

That one, however, will have a catchy little slogan:

   PIC 9(5) for 9995!

Another idea that'd be way cheaper - start writing books on how to survive the epochalypse. You can crib a lot of material out of the books written in 99.

Start right now. You'll have the SEO and the thought-leadership circuit all stitched up well in advance.

Start too early and you'll sound like an end of the world preacher. You need to time it so that you look cutting edge rather than paranoid.

There is a certain amount of snark embedded in this post, but it's not a bad idea at all - I'll be in much the same boat at that time - perhaps I can borrow the same idea...

It wasn't intended to be sarcastic. I suspect there will be a lot of businesses (especially finance) interested in verifying that the "ancient" Linux system they set up to replace their mainframe 25 years "ago" is safe. I lived through the Year 2000, and watched the same thing happen there.

With suitable groundwork, there will be a willing and wealthy market looking for people to assuage their fears - a service I see myself as happy to provide. Much as it was in the "Millennium bug" era, a lot of the effort in getting the business is PR spadework, but I've got 15-20 years of prep time to position myself suitably. I also hope to provide a slightly more useful service than many Millennium consultancies did at the time.

Databases too. MySQL has a 2038 bug in their UNIX_TIMESTAMP function.

  select unix_timestamp('2038-01-19') returns 2147472000
  select unix_timestamp('2038-01-20') returns 0
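For context, the jump to 0 is consistent with a signed 32-bit seconds counter running out partway through 2038-01-19. A minimal Python sketch of that arithmetic, assuming UTC (the 2147472000 above matches midnight UTC). It illustrates the limit only and is not MySQL's actual implementation; `wrap_int32` is a made-up helper.

```python
from datetime import datetime, timedelta, timezone

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit counter can hold
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wrap_int32(value: int) -> int:
    """Simulate storing an integer in a signed 32-bit two's-complement field."""
    return (value + 2**31) % 2**32 - 2**31


# Seconds from the epoch to midnight UTC on 2038-01-19.
midnight = int((datetime(2038, 1, 19, tzinfo=timezone.utc) - EPOCH).total_seconds())
print(midnight)                     # 2147472000

# The last second that still fits in 32 signed bits.
last_moment = EPOCH + timedelta(seconds=INT32_MAX)
print(last_moment.isoformat())      # 2038-01-19T03:14:07+00:00

# One second later the counter wraps to a large negative number.
print(wrap_int32(INT32_MAX + 1))    # -2147483648
```

Software that wraps silently ends up in December 1901; software that range-checks, as the MySQL function above appears to, reports 0 or an error instead.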

Surely these sorts of errors in major systems are already being/have already been addressed - for healthcare for example a search for which members of a GP's practice are going to be pensioners in 2040 is going to error badly? Strikes me that the time to address them was shortly after the millennium bug.

If you were around for the millennium bug, then surely you remember how many people waited until about October, '99 to start looking for problems. If you think those people went looking for another problem 38 years away to fix...

If this class of business is not seeing a problem this minute, it isn't a bug. And it won't be a serious-enough bug to spend money on until whatever workarounds they can think of start having negative effects on income.

Sources: experience with Y2K remediation, experience with small business consulting & software development, experience with humans.

>experience with humans

So true. That said, thinking 38 years into the future is usually not sensible for most businesses, because it's very possible that they're bankrupt before then. Thinking 21 years into the future also offers poor ROI for the same reason.

I was thinking large companies, banks etc.. I'm not in IT/programming but remember (despite being young) that 2 other calendar related 'bugs' were mentioned at the time. IIRC we've had one and the epoch bug is the second.

Strikes me that big businesses would have thought "will we need to do this again in a few years" and acted appropriately and that there should be a trickle down effect as large corps demand future proofing in IT products.

Yes negligence, ignorance, lack of foresight, corner cutting, and other human traits feed in to that.

Generally, the bigger and older the company is, the more unmaintained cruft you have. Banks would be one of the worst I imagine.

> Surely these sorts of errors in major systems are already being/have already been addressed [...] search for which members of a GP's practice are going to be pensioners in 2040

If you know anything about programmers, you know they found that error at some point during development, thought about how other people dealt with that, remembered that windows used to treat >30 as 1900s and <30 as 2000s and did the same. So probably the people in this thread planning their retirement solving this will have to figure out which random unix timestamp number is treated as pre 2038 and which one is after. And then they will have to undo all the last-minute spaghetti code tying it up on the original program.
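The "windowing" workaround described here can be sketched in a few lines of Python. The pivot of 30 comes from the comment; treating exactly 30 as 1930 is an assumption, and both function names are hypothetical. The second function shows what the same trick might look like for wrapped 32-bit timestamps.

```python
def expand_two_digit_year(yy: int, pivot: int = 30) -> int:
    """Map a two-digit year to four digits using a fixed pivot window."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    # Below the pivot means 20xx; the pivot and above mean 19xx (assumption).
    return 2000 + yy if yy < pivot else 1900 + yy


def unwrap_int32_timestamp(raw: int, pivot: int = 0) -> int:
    """Treat values below `pivot` as having wrapped past 2038 and move them
    forward by one full 32-bit cycle."""
    return raw + 2**32 if raw < pivot else raw


print(expand_two_digit_year(29))            # 2029
print(expand_two_digit_year(31))            # 1931
print(unwrap_int32_timestamp(-2147483648))  # 2147483648
```

Either way the pivot is a guess baked into the code, which is exactly the kind of thing an auditor would later have to dig out.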

That's not been my experience. A lot of companies don't address needs that are farther in the future, because there are nearer term needs and a fixed budget. Also, legacy systems that haven't been patched or updated in decades aren't that uncommon.

And, even if they happen to be safe, they do tend to pay well just for an audit to confirm that.

Financial systems that deal with bonds already have to deal with maturity dates on 30-year bonds that are past 2038.

Those type of systems use various date representations instead of datetime/timestamp fields, they should be mostly immune. Well, until 2038 arrives and some of these dates start being compared with the current datetime.

Aren't you worried that by then someone will develop and train a DNN to do that?

Woah, you were alive in the year 2000? Tell us what that was like, grandpa!

Our phones sucked and we had to watch television with commercials you couldn't skip.

I don't know what phone you had in 2000, but I had a Nokia 2100 in 1996 and it was the best phone ever made.

Now that I think about it, I probably had an Ericsson T28 in 2000 and it did kinda suck.

>verifying that the "ancient" Linux system they setup to replace their mainframe 25 years "ago" is safe.

The problem with mainframes is that they can't be trivially upgraded or migrated to 64-bit like modern OS's on x86 hardware can be. Vendor lock-in, retirement of OS, bare to the metal coding, etc caused this. If these mainframes were running a modern OS, it would have been trivial to upgrade them to a 64-bit version and make whatever small changes are needed to date storing in the old 32-bit apps. You won't need a wizened COBOL guy for this. A first year CS student would be able to look at C or C++ code and figure this out. Modern languages are far more verbose and OO programming makes this stuff far easier to work with.

Comparing mainframes to unix systems really doesn't make sense. They're two entirely different designs. Not to mention, the idea of running a 32-bit OS today is odd, let alone 20+ years from now, especially with everything being cloudified. You'd be hard pressed to even find a 32-bit linux system in 20+ years, let alone be asked to work on one. That's like being asked to set up 1000 Windows 98 workstations today.

Pretty much everything in your post is wrong. IBM mainframes are heavily virtualized and have very good support for moving to larger address spaces. VM and MVS moved from 24-bit to 31-bit to 64-bit address spaces. You can run the old 24-bit applications and upgrade them as needed. Even assembly programs - the old assemblers and instructions are supported on newer hardware. System i (System/38-AS/400) was built around a 128-bit virtual address space from the start. There is much more support for fixing old software on mainframes than there is for proprietary 1980s-era PC and Unix applications.

I have no idea why you think running 32-bit today is "odd." 32-bit desktops and small servers are still perfectly usable today. 32-bit microcontrollers are going to be around for a very long time (just look at how prevalent the 8051 remains), and a lot of them are going to be running Linux. It also makes a lot of sense to run 32-bit x86 guests on AMD64 hypervisors - your pointers are half the size so you can get a lot more use out of 4GiB of memory.

Also note that IBM mainframes can run 64-bit Linux just fine. Indeed, IBM's been marketing its LinuxONE mainframe line as a z series machine that doesn't run z/OS at all.

(disclaimer: IBMer, but not a mainframe person)

We're talking Mainframe system designs and code from the 70s and 80s. No they aren't running 64-bit linux. I think you guys need to re-read my post. The legacy systems on Y2K had none of these features.

It isn't 1000 seats, but I know of a Win98/NT4 shop. It is an isolated network supporting mostly phone sales and pick-and-pack, runs some ancient copy of MAS90 and some home-grown software.

These installations exist, and (outside of tech startupland) aren't even that strange, although he is probably pushing things. The owner of that business is proud of how long he's made his IT investment last; his main concern is that dirt-cheap second-hand replacements that can run 98 are apparently getting harder to find.

Be careful about your definition of absurd. Somewhere, right now, some poor bastard is building an NT 3.51 workstation for some stupid reason. I'll bet you $0.05 that some future poor bastard will be building NT4 or 2000 devices in 2038. :)

Well, at least NT has no problem with 2038 ;-)

I built an NT3.5 VM in 2012 to run some ancient book binding publishing junk that relied on an ancient version of access... might still be in production, no idea.

I just set up a Windows 98 system last week!

Installation of win98 is so quick, so easy, compared to XP.

But websites don't render so well in the win98 version of IE. I don't think it knows about CSS.

Some good domain options available. .consulting, .guru, .pro, .expert. https://www.hover.com/domains/results?utf8=&q=epochalypse


(not available just yet)

epochalypse.no is an option if you are/know a Norwegian

Can you not get a .no regardless of where you live?

That is a great term. Start investing time in static analysis tools in FOSS that find it for you or your customers. Then extend refactoring tools ("source-to-source translators") to automate the job. Apply for YC for growth. Get acquired by IBM who saw all kinds of adaptations for the tool in their mainframe offerings.

There were fortunes made in 1999 doing similar work.

It was a good time to be a COBOL developer.

There's always someone on HackerNews who has done something pertinent to almost any discussion (it's one of the best things about being here, in fact) - are there any COBOL guys or gals who made a fortune fixing Y2K bugs who'd like to share their story?

I wasn't around then, but I've worked with many programmers who said: £1000 per day (at least) as a consultant (solo or sub)

I never quite understood why the transition from 1999-2000 should be a big deal for a computer system, until I learned about how COBOL works: it stores numbers as its string-representation unless told otherwise, and trying to store 100 in a field of two bytes will happily be stored as "00". Of course we had other bugs, with the same cause, well after the y2k-period.

Not quite literally strings, it typically uses binary-coded-decimal (BCD) format for numbers, but it has the same effect when years are stored as two digits.

I work with a database that has its origins in the COBOL era. All of the date fields are specified in the copybooks as four PIC 99 (i.e. two decimal digits) subfields, CCYYMMDD. This separation of CC and YY surprised me until I realized that it allowed them to add Y2K support by setting the default for CC to '19' and switching it later.
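A rough Python imitation of the two behaviours described in these comments: a two-digit field silently keeping only the low-order digits, and a CCYYMMDD date split into four two-digit subfields. This mimics the effect described above and is not a model of any particular COBOL compiler (real behaviour can depend on options such as size-error handling); both function names are made up.

```python
from typing import Tuple


def store_pic_99(value: int) -> str:
    """Keep only the two low-order decimal digits, like an unsigned two-digit field."""
    return f"{value % 100:02d}"


def split_ccyymmdd(date_text: str) -> Tuple[str, str, str, str]:
    """Split an 8-digit CCYYMMDD string into century, year, month, day subfields."""
    if len(date_text) != 8 or not date_text.isdigit():
        raise ValueError("expected 8 digits in CCYYMMDD form")
    return date_text[0:2], date_text[2:4], date_text[4:6], date_text[6:8]


print(store_pic_99(99))            # 99
print(store_pic_99(100))           # 00
print(split_ccyymmdd("20170321"))  # ('20', '17', '03', '21')
```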

It's not just cobol, there are still plenty of devs out there storing dates as strings. Aside from using more space, most won't notice until they try and filter on a date range.

There is a lot of crappy code out there, it doesn't surprise me at all.

I'm really curious about the kind of software things like pacemakers run and potential implications from 32-bit time expiring.

I would guess pacemakers don't run Linux. I would be surprised if they run an OS at all. I work on devices that have to survive 20 years on one non-rechargeable and non-serviceable battery, and there's at most a simple scheduler in place to control tasks. We use 32 bits for epoch time, but our epoch starts Jan 1, 2000, so we have 30 years on Linux before this becomes a problem.
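The "30 years on Linux" arithmetic can be checked with a short Python sketch. It assumes the device counter is a signed 32-bit count of seconds (the comment does not say whether it is signed); `last_representable` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def last_representable(epoch: datetime, bits: int = 32, signed: bool = True) -> datetime:
    """Return the last instant a seconds counter of the given width can express."""
    max_seconds = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return epoch + timedelta(seconds=max_seconds)


unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
y2k_epoch = datetime(2000, 1, 1, tzinfo=timezone.utc)

print(last_representable(unix_epoch))  # 2038-01-19 03:14:07+00:00
print(last_representable(y2k_epoch))   # 2068-01-19 03:14:07+00:00
```

An unsigned counter would push both limits out by roughly another 68 years.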

This is really interesting! Obviously it is a topic in itself, but do you mind sketching how one develops software in an environment that requires this extreme amount of reliability? If you do not use an existing general-purpose OS, do you even use a MMU? Is it hard real-time?

E.g. what language do you use? Is it SW or HW that you "ship"? You probably perform some kind of verification and/or validation - what does the tool chain look like?

Do you perform model-checking on all possible inputs?

Lots of questions, and you do not have to go into detail, but I would appreciate your input, as it is an interesting topic.

No MMU. It is hard real-time in the sense that there are events that need to be processed within a small time window (a few microseconds (with help from hardware typically) to milliseconds).

The product is custom hardware built with off-the-shelf parts like microcontroller, power converters, sensors, memory. Texas Instruments MSP430 family of microcontrollers [1] is popular for this type of application. They are based around a 16-bit RISC CPU core with a bunch of peripherals like analog-to-digital converters, timers, counters, flash, RAM, etc.

I don't work on medical devices, so validation is more inline with normal product validation. We certainly have several very well staffed test teams: one for product-level firmware, one for end-to-end solution verification, others for other pieces of the overall solution. We are also heavy on testing reliability over environmental conditions: temperature, pressure, moisture, soil composition, etc.

The firmware is all done in-house, written in C. Once in a while someone looks at the assembly the compiler generates, but nobody writes assembler to gain efficiency. We rely on the microcontroller vendor's libraries for low-level hardware abstraction (HAL), but other than that the code is ours. The tool chain is based on GCC I believe, but the microcontroller vendor configures everything so that it cross-compiles to the target platform on a PC.

Debugging is done by attaching to the target microcontroller through a JTAG interface and stepping through code, dumping memory, checking register settings. We also use serial interfaces, but the latency introduced by dumping data to the serial port can be too much for the problem we're trying to debug and we have to use things like toggling IO pins on the micro.

We don't model the hardware and firmware and don't do exhaustive all-possible-inputs testing like one would do in FPGA or ASIC verification.

I need to go, but if you have more questions, feel free to ask, and I'll reply in a few hours.

1: http://www.ti.com/lsds/ti/microcontrollers-16-bit-32-bit/msp...

Thank you for your thorough answer.

I am surprised that you do not apply some kind of verification or checking using formal methods, however it might be the case (at least it is the experience I have) that this is still too inconvenient (and so expensive) to do for more complex pieces of software.

Actually, the high-assurance field that does such things is very small. A tiny niche of the overall industry. Most people doing embedded systems do things like the parent described. The few doing formal usually are trying to achieve a certification that wants to see (a) specific activities performed or (b) no errors. Failures mean expensive recertifications. Examples include EAL5+, esp DO-178B or DO-178C, SIL, and so on. Industries include aerospace, railways, defense, automotive, supposedly medical but I don't remember a specific one. CompSci people try formal methods on both toy, industrial, and FOSS designs all the time with some of their work benefiting stuff in the field. There's barely any uptake despite proven benefits, though. :(

For your pleasure, I did dig up a case study on using formal methods on a pacemaker since I think someone mentioned it upthread.


David Wheeler has the best page on tools available:


Here's a work-in-progress of my list of all categories of methods for improving correctness from high-assurance security that were also field-proven:


One important thing to note is that the 20-year life expectancy includes several firmware updates. An update may take several hours to several days to complete, so, it's not something that is commonly done, but it's an option.

I am fairly new to this field and I share your surprise that more formal methods are not used in development. To be honest, the development process in my group and others I'm familiar with can be improved tremendously with just good software development practices like code reviews and improved debugging tools.

you might be underestimating modern med devices which even have wireless access (beats a port or invasive surgery to read a log).

example for the issues in this area: https://spqr.eecs.umich.edu/papers/49SS2-3_burleson.pdf

I see "wireless", but not "WiFi" or "802.11*" in the PDF.

For what it's worth, devices I work on have a few wireless interfaces while guaranteeing 20-year life time: one interface is long-range (on the order of 10km), two are short range (on the order of a few mm). There is no way we can get to 20-year life time with doing WiFi (maintaining current battery size/capacity) for long'ish range and maybe not even BT for shorter range.

yes - but this means there is more than a simple OS operating in those devices.

and just like any other IoT, using generic chips and stacks is cheaper.

run QNX or Linux on it and walk away.

there are DIY insulin pump monitors out there already that use Linux on RaspberryPi - see here: https://openaps.org/

The microcontrollers on these devices don't have MMUs. There is typically not even a USB interface. The microcontroller is in deep sleep mode saving power 99.9% of the time. During that time only essential peripherals are powered on and no code is executing.

A RasPi has no chance of running for 20 years off a single A-size non-rechargeable non-serviceable battery.

pacemakers I worked on ran on 8 bit MCUs.

Same, I'll be 55. My son will be 23. Either our generation fixes it, or his generation will have to fix it in their first jobs out of college (sorry kids!)

It's an issue now, and it will be even more urgent as the deadline approaches.

Once we hit the ten year out mark then you're going to see things like expiry dates for services roll over the magic number. The shit will hit the fan by degrees.

not a bad plan. I did something similar in 1998-1999 as a Y2K consultant. Companies at the time wanted to be certified Y2K compliant - some good years. I certified a lot of companies using medical equipment from Perkin Elmer and Khronos time clocks.

I hate to be the buzzkill but more than the computer systems is the food supply... climate change is going to reek havoc on our "retirement" we will likely die young starving and thirsty

Just a heads up,

reek v. To give off a foul odor
wreak v. To inflict or execute

No one is deploying 32-bit linux now, outside of tiny edge cases and mobile. Mobile devices that go in the trash every 2 years. What do you reasonably expect to be around in 2038 in 32-bit form?

Once 64-bit processors became mainstream, the 2038 problem pretty much solved itself. There's only disincentives to building a 32-bit system today let alone in 20+ years.

Unlike with Y2k where there was nothing but incentives to keep using Windows and DOS systems where the 2000 cut-over was problematic. The non-compliant stuff was being sold months before Jan 1, 2000. The 32-bit linux systems have been old hat for years now, let alone 20+ years from now.

Not to mention that those old COBOL programs were nightmares of undocumented messes and spaghetti code no one fully understood, even the guys maintaining them at the time. Modern C or C++ or Java or .NET apps certainly can be ugly, but even a second year CS student can find the date variables and make the appropriate changes. They won't be calling in $500/hr guys for this. Modern systems are simply just easier to work with than proprietary mainframes running assembly or COBOL applications that have built up decades of technical debt.

There are, in fact, a lot of 32-bit ARM chips still being deployed today. Yes, arm64 is usable, but using e.g. a Beagleboard- or even Raspberry Pi-class device still often makes sense (for cost or compatibility reasons.)

No 2016 beagleboard implementation will be running BigCo's finances and be irreplaceable in 2038. Let's be realistic here.

Those 70s and 80s programmers were working on mainframes with multi-decade depreciation. We work on servers and projects with 3-5 year depreciation when we aren't working on evergreen cloud configurations. Not to mention we've already standardized on 64-bit systems, outside of mobile, which is soon following and has typically a 2 year depreciation anyway.

I've worked on financial systems still running on mainframes from the 80s. The Y2K compatibility commit messages are there in the logs.

Airline reservation systems run on software written in the 50s and 60s

Your views on evergreen this and disposable up to date that are very naive.

Embedded systems and business systems live for a VERY long time.

but their mobile enabled recipe site is TOTALLY going to be around for a long time too!!! ;-)

That's exactly what programmers in the 1980s thought about the code they were writing. Otherwise they would have used four digits for years.

I mostly agree with you about BigCo's finances.

But some industrial- or military-spec ARMv7 core running a critical embedded system or two, in 2038? Twenty-year design lifespans (often with servicing and minor design updates) are definitely not unheard of, and successful systems often outlive their design lifespan.

It won't be the same magnitude of issues. However, I'm sure there will be plenty of apps on said 64 bit Linux that have issues. I commented about a mysql problem here that exists on 64 bit MySQL, on 64 bit Linux. It's not much of a stretch that some internal apps at a company would have similar issues.

Edit: Ntp has something of a protocol issue to be addressed as well.

There's always TAICLOCK.

* http://cr.yp.to/proto/taiclock.txt

Technically, "always" is a stretch. But it's shorter than "until the age of the universe is well over an order of magnitude bigger than it is now".

Yeah. I was working on an IoT system that uses NTP to set time. I generally plan a 20 year life span for things like this. There's S/W I wrote over 20 years ago that is in devices still in use. The only good thing about this is that I'll not likely be around when trouble crops up. I didn't see any way to accommodate changes that are nearly decades away in code I write today.

Once in a while you'll stumble upon a Novell Netware server that's been running since the mid 1990s. Twenty years isn't a long time.

Novell 3.x or 4. IPX/SPX only. print, file and directory service. Netware client for windows. Man I loved those days.

Yeah but that's a protocol issue. Single graybeard devs aren't going to be paid to fix that. The people who run ntp are going to push out a new protocol way before 2038.

Even if there are issues, more than likely they'll be able to handle it internally. OO programming isn't going anywhere and modern languages and concepts are easier to work with than piles of undocumented COBOL from Y2K. They won't be calling you for help to change date fields. That's trivial stuff.

Right, but at a high level, I'm answering why there might be money in consulting on this later.

There will be Fortune 500 companies that have old ntp clients running somewhere in 2038...pretty much guaranteed. They'll also have apps with 32-bit time_t structures running as well, database columns that overflow, etc. Or maybe they won't, but aren't sure. You sell them a service that audits all of those things, scripts that look for troublesome stuff using source code greps, network sniffing for old protocols, static analysis, simplistic parsing of ldd, etc. And, a prepackaged methodology, spreadsheets, PowerPoint to socialize the effort, and so on.
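As a rough idea of what the "source code greps" part of such an audit might look like, here is a deliberately naive Python sketch. The patterns, labels, and sample input are all invented for illustration; they are heuristics that will produce false positives and misses, not a real audit tool.

```python
import re

# Hypothetical heuristics; a real audit would need far more than regexes.
SUSPECT_PATTERNS = {
    "time() result cast to int": re.compile(r"\(\s*int\s*\)\s*time\s*\("),
    "possibly 32-bit timestamp variable": re.compile(
        r"\b(?:int32_t|int|long)\s+\w*(?:time|stamp|epoch)\w*", re.IGNORECASE
    ),
    # MySQL TIMESTAMP columns top out at 2038-01-19 03:14:07 UTC.
    "MySQL TIMESTAMP column": re.compile(r"\bTIMESTAMP\b"),
}


def scan_source(text: str) -> list:
    """Return (line number, label, line) for every heuristic match."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label, line.strip()))
    return findings


sample = """\
int32_t created_time = (int) time(NULL);
int64_t expires_at = 0;
CREATE TABLE t (seen TIMESTAMP);
"""

for finding in scan_source(sample):
    print(finding)
```

The sample flags line 1 twice and line 3 once, while the 64-bit declaration on line 2 passes.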

It was the same for Y2K. Fixes for many things were available well ahead of time. Companies had no methodology or tools to ensure that the fixes were in place.

The NTP issue will actually surface in 2036, so your consultancy better be up and ready 2 years early!
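The 2036 date follows from NTP's timestamp format, which counts seconds since 1900-01-01 in a 32-bit unsigned field. A quick Python check of when that field rolls over, compared with the signed 32-bit Unix limit (variable names are only for this illustration):

```python
from datetime import datetime, timedelta, timezone

NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# First instant that no longer fits in 32 unsigned bits of NTP seconds.
ntp_rollover = NTP_EPOCH + timedelta(seconds=2**32)

# First instant that no longer fits in a signed 32-bit Unix time_t.
unix_rollover = UNIX_EPOCH + timedelta(seconds=2**31)

print(ntp_rollover.isoformat())      # 2036-02-07T06:28:16+00:00
print(unix_rollover.isoformat())     # 2038-01-19T03:14:08+00:00
print(unix_rollover - ntp_rollover)  # lead time between the two events
```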

Great, so not only will the epochalypse happen but we'll have no idea how close we are since our clocks are drifting like Paul Walker.

Where there's development, there's support

By 2030 systems being deployed today will be very legacy. And some orgs will still be running them.

Also doesn't it depend what the language/database does as much as the system?

Using a 64 bit timestamp will only move the problem 292 million years forward, so probably the best solution is to use a variable length field.
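For what it's worth, the 292 million figure matches a signed 64-bit count of milliseconds; if the counter holds whole seconds, the range is closer to 292 billion years. A back-of-the-envelope Python check, assuming a Julian year of 365.25 days:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an approximation
INT64_MAX = 2**63 - 1

# Range if the 64-bit counter holds whole seconds.
years_if_seconds = INT64_MAX / SECONDS_PER_YEAR

# Range if the same counter holds milliseconds instead.
years_if_millis = INT64_MAX / 1000 / SECONDS_PER_YEAR

print(f"{years_if_seconds:.3e}")  # 2.923e+11
print(f"{years_if_millis:.3e}")   # 2.923e+08
```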

Good idea. Our Qeng Ho descendants will thank us. [0]

[0] https://en.wikipedia.org/wiki/A_Deepness_in_the_Sky#Interste...

Up vote for the Deepness reference! What a great novel, and I suspect the concept of a "programmer archaeologist" would be of interest to many here...

A Fire Upon The Deep is a fantastic book as well.

I loved both of these, any recommendations for other similar books?

I'm a big fan of both and I recently loved Diaspora by Greg Egan. It's a bit more heavy on math/physics references but it's still amazing writing and an epic piece of world (and universe) building.

Blindsight by Peter Watts.

The only one I can think of that's remotely in the ballpark would be Dune. The creativity in AFUTD is hard to match.

At which point we'll all still be too distracted by deciding which JavaScript build system to use.

Maybe we can build a DNN (or by that time, an ultra-mega-hyper-DNN) to choose for us?

That is actually how Common Lisp represents time:


When you look up "integer": https://www.cs.cmu.edu/Groups/AI/html/hyperspec/HyperSpec/Bo...

"An integer is a mathematical integer. There is no limit on the magnitude of an integer."

What happens when an integer overflows from a fixnum (single-word representation) is that it gets upgraded to a bignum behind the scenes.

IMO Common Lisp is the only programming language that handles time correctly out of the box, and aside from Scheme (http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z...), is the only programming language with proper support for numbers.

Python supports arbitrary-width integers too.

Its datetime implementation, however, is implemented partially in C, and does not support arbitrary timestamps.
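A short Python illustration of both halves of that statement: integers grow without limit, while the standard `datetime` type stops at year 9999.

```python
from datetime import datetime, timedelta, MAXYEAR

# Integers are promoted past the machine word size automatically.
big = 2**64 + 1
print(big)      # 18446744073709551617

# datetime, on the other hand, has a hard upper bound.
print(MAXYEAR)  # 9999

try:
    datetime.max + timedelta(days=1)
except OverflowError as exc:
    print("datetime overflow:", exc)
```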

And up until recently didn't even give you access to a guaranteed monotonically increasing clock!

Lisp's support for numbers goes beyond automatic bignum promotion, for example (/ 1 3) gives the rational 1/3 instead of an approximation, and (sqrt -1) gives #C(0.0 1.0), that is, 0 + 1i.
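For comparison, Python can produce the same results, but only on request: plain `1 / 3` yields a float and `math.sqrt(-1)` raises `ValueError`, so exact rationals and complex roots need the `fractions` and `cmath` modules. A small sketch:

```python
from fractions import Fraction
import cmath

# Exact rational arithmetic, analogous to Lisp's (/ 1 3).
third = Fraction(1, 3)
print(third)           # 1/3
print(third * 3 == 1)  # True

# Complex square root, analogous to Lisp's (sqrt -1).
root = cmath.sqrt(-1)
print(root)            # 1j
```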

In systems one still needs to interface Common Lisp with the real world in correct ways. Reading & printing dates in various formats wasn't always correct in some applications.

Think about how much entropy will build up in systems over 292 million years. Systems thousands of years old will be underneath newer systems hundreds of years old.

The AIs will be scrambling to fix the problem.

If humans are still alive 292 million years in the future, that seems like our best time to take control back from our sentient AI overlords.

I think this would make a great science fiction setting. Where some clock error destroys most/all the magic tech out there and humanity is doing hunting and gathering on space stations etc.

On a tangent, take a look at Numenera [0][1]. It's set a billion (!) years in the future, but different from other ultra-science fiction or post-apocalyptic settings in that it primarily explores what Earth would be like after accumulating multiple layers of incomprehensibly advanced supertechnology, the true purpose of which cannot ever be understood by the current inhabitants, and at best they can only reverse-engineer a fraction of the original functionality.

[0] http://www.numenera.com

[1] store.steampowered.com/app/272270/

The scenario from "Accelerando" is more plausible, where an AI rebuilt itself from scratch to eliminate a Thompson "trusting trust" hack.

Man, the plot summary of Accelerando sounded so good but I couldn't get past the writing. That dense name dropping of random tech terms was a bit much in particular.

I found the writing very fun to read, but it's definitely a matter of taste.

That's something of a trope now, isn't it? Although it was never spelled out how Mr. Rabbit handled that scenario in Rainbows End, Defiant basically had to force Dragon to kill herself in order for her to escape her programmed-in limitations in Worm.

It's kind of impressive the scale of data you can represent when you merely double those very limited 32 bits. Obviously not enough to count every nanosecond in the history of the universe (128 bits would be more than enough for that), but still.

Double again to 256 bits, and there isn't enough energy in the mass of the solar system to power the most efficient state change device conceivable through 2^256 state changes.


Wikipedia says 292 Billion years forward: https://en.wikipedia.org/wiki/Year_2038_problem

Your example counts milliseconds. It is 292 billion, from units:

    You have: (2^64 / 2) seconds
    You want: years
        (2^64 / 2) seconds = 292277265670.798 years

On the other hand, using a 64 bit integer to count seconds moves the problem 292 billion years forward, and we can probably let the Omega Point handle it from there.

On the other other hand, getting greedy and using a 64 bit integer to count nanoseconds only moves the problem about 584 years forwards (unsigned; a signed count gets half that, about 292 years). That's long enough to seem like it's perfectly safe, but short enough that it might just not be.
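
To put numbers on those ranges, here is a small sketch. The function name is invented for illustration, and the year length (an average Gregorian year of 365.2425 days) is an assumption; `units` uses a slightly different year, so the last digits differ.

```python
SECONDS_PER_YEAR = 365.2425 * 86400  # average Gregorian year (an assumption)

def range_in_years(bits, ticks_per_second, signed=True):
    """Years until a counter of the given width overflows, counting up from zero."""
    usable_bits = bits - 1 if signed else bits
    return 2**usable_bits / ticks_per_second / SECONDS_PER_YEAR

for label, rate in [("seconds", 1), ("milliseconds", 10**3), ("nanoseconds", 10**9)]:
    print(f"{label:>12}: signed {range_in_years(64, rate):,.0f} years, "
          f"unsigned {range_in_years(64, rate, signed=False):,.0f} years")
print(f"32-bit signed seconds: {range_in_years(32, 1):.1f} years")
```

Signed 64-bit seconds come out near 292 billion years, milliseconds near 292 million, and nanoseconds near 292 years (about 584 unsigned).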

libuv, which sits underneath node.js, uses 64-bit nanosecond timestamps. And by that time, everything will be javascript.

THE END IS NIGH1111!!!!!

doing it that way would be better writing

292 billion years ought to be enough for anybody

Pardon, it's billion indeed if you count seconds. I wrote millions by mistake, but thinking about it, once you have 64 bits, milliseconds are nice to have...

But once you have milliseconds, microseconds are nice to have...

The Linux kernel (and many other applications) solves this with a tuple of 64-bit ints (seconds, nanoseconds) where 0 <= nanoseconds <= 999999999. Compare this to simply 64 bits of nanoseconds, which would run out in roughly 2554 CE.
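
A rough sketch of how arithmetic on such a (seconds, nanoseconds) pair keeps the nanosecond field in range. The `Timespec` class and helper names are illustrative only, not the kernel's API.

```python
from typing import NamedTuple

NSEC_PER_SEC = 1_000_000_000

class Timespec(NamedTuple):
    sec: int    # whole seconds since the epoch
    nsec: int   # invariant: 0 <= nsec <= 999_999_999

def normalize(sec, nsec):
    """Fold any nanosecond excess or deficit into the seconds field."""
    carry, nsec = divmod(nsec, NSEC_PER_SEC)  # floor division keeps nsec non-negative
    return Timespec(sec + carry, nsec)

def add(a, b):
    return normalize(a.sec + b.sec, a.nsec + b.nsec)

def sub(a, b):
    return normalize(a.sec - b.sec, a.nsec - b.nsec)

print(add(Timespec(1, 999_999_999), Timespec(0, 1)))  # Timespec(sec=2, nsec=0)
print(sub(Timespec(2, 0), Timespec(0, 1)))            # Timespec(sec=1, nsec=999999999)
```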

Other systems still (perhaps most commonly) are using double floats for seconds. Under that scheme, nanoseconds were only representable until Feb 17th 1970. The last representable microsecond will be some time in 2106, and the last representable second won't be for another 150 million years or so.
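
One way to see where those cutoffs come from is to look at the gap between adjacent doubles at a given timestamp. This is only a sketch: the exact dates depend on whether you require the gap to be below the unit or below half of it, and `math.ulp` needs Python 3.9 or later.

```python
import math

def spacing(seconds):
    """Gap between `seconds` and the next larger double."""
    return math.ulp(float(seconds))

samples = [
    ("mid-Feb 1970", 2**22),
    ("March 2017", 1_490_054_400),
    ("early 2106", 2**32),
    ("~143 million years", 2**52),
]
for label, t in samples:
    print(f"{label:>20}: adjacent doubles are {spacing(t):.3g} s apart")
```

The gap doubles every time the timestamp crosses a power of two: just under a nanosecond at 2^22 seconds, about 0.24 microseconds in 2017, just under a microsecond at 2^32 seconds, and a full second at 2^52 seconds.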

Personally, I'm happy with the precision afforded by floats. Timing uncertainty (outside niche applications) is generally much larger than a single nanosecond, and even microseconds are a bit suspect.

Why not just use int32 for nanoseconds then?

The nanoseconds field in timespec is actually a long, for some reason. I have no idea why it's not fixed-size.

Let's not even get into timeval which uses the same size field for microseconds.

64 bits only cover about 7 months at picosecond resolution, so it's possible that 64 bit timestamps will be too small sometime this century, depending on how tightly synchronized the world becomes. An extensible timestamp might not be a bad idea and could still be kept under 64 bits for currently practical purposes.

For picosecond resolution to be useful wouldn't clock rates need to be on the order of THz? Otherwise what advantages would ps resolution give you over ns?

Bulk data processing clock speeds could still be lower, but some devices may need THz timing, for example to align and integrate data from a swarm of sensors in motion relative to each other. Imagine trillions of 1-pixel cameras that know exactly when, where, and in which direction each pixel was captured (but not necessarily capturing literal image data).

Floating point years? ;)

This is not a problem for 20 years from now, I've already had to find and fix 2038 bugs ... there was an openssl bug (now fixed) in which cert beginning and ending date math was done in 32-bits ... certs with an end date from a little after 2038 would fail when compared with the current time.

Fortunately for me, a fixed OpenSSL was already available once I'd found the bug in it.
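
The failure mode is easy to reproduce in miniature. This is an illustration of the general bug class, not OpenSSL's actual code; `wrap_int32` and `is_expired_32bit` are hypothetical helpers.

```python
from datetime import datetime, timezone

def wrap_int32(value):
    """Reduce an integer to what a signed 32-bit field would hold."""
    return (value + 2**31) % 2**32 - 2**31

def is_expired_32bit(not_after, now):
    # Hypothetical buggy check: the end date is squeezed through 32 bits first.
    return wrap_int32(not_after) < now

def is_expired(not_after, now):
    return not_after < now

now = int(datetime(2017, 3, 21, tzinfo=timezone.utc).timestamp())
not_after = int(datetime(2040, 1, 1, tzinfo=timezone.utc).timestamp())

print(wrap_int32(not_after))  # negative: the end date wrapped to before 1970
print(is_expired(not_after, now), is_expired_32bit(not_after, now))  # False True
```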

> statx() will serve as the year-2038-capable version of the stat() family of calls

Does this seem horrible to anyone else? Why not fix stat()? Does this syscall have to be so highly preserved even when it will be broken?

One advantage of the OpenBSD approach is being able to make intrusive changes: their time_t was made 64-bit in the 5.5 release in 2014.


Admittedly this is much harder for Linux, as they can't make the change and verify it in a single place, due to the separation of kernel and userland and the fact that Linux has many distros.

To avoid breaking backward compatibility what you typically do is allocate new system call numbers for the system calls that have the new bigger time_t. Then at some point you spin a new version of the C library (and in the case of time_t, probably lots of other libraries as well). That allows old binaries to run unchanged, as long as you retire them before 2038.

The tricky part starts if you also have to keep the old libraries updated with security patches.

In Linux kernel land, the golden rule is to never break the userspace ABI. Yes, it might happen, but it's never intentional, unless there's no other choice (glaring security issue for instance).

I know and understand the reasoning for this, I just don't like it. I am a BSD fan, so I indulge in a land where such ABI breaks are possible. In a fantasy land I would like to see the Linux community coming up with a way to deal with ABI breaks. Not that I think such breaks are to be taken lightly but they are sometimes necessary. Such as this case.

The problem is because Linux is just a kernel (not an entire system like BSDs) the syscall ABI is the actual kernel API. If you break it, you break the world.

On BSDs, or Windows, or almost every other OS, there's a base "userland" library (e.g. libc) which serves as the kernel API and hides whatever ABI specific syscalls use. Linux doesn't have that.

That only moves the problem outwards; instead of having to keep compatibility within the syscall ABI, you have to keep compatibility within the "base userland library" ABI, which is probably much larger (for instance, printf is not a syscall).

> That only moves the problem outwards

It also moves it to a place where you can have somewhat abstracted types (so chances are the client will work with just a recompile as it picks up the updated typedef), you can much more easily prepare the transition by e.g. using various flags or even trying alternatives beforehand (for instance OSX originally added a stat64 call before rolling back that choice and using macros to transition stat to 64b), and you are able to stop before a completely mangled syscall is actually attempted (by checking for some version symbol which must be set if you compiled for the 64b version of the various syscalls).

I know, I'm saying it would be nice if the Linux kernel could set up a method for breaking ABI's in a planned fashion. I'm not saying it would be easy.

Linux has a method - which is to add the new in parallel to the old, and remove the old at a later time when it's certainly no longer needed.

That's OK for Linux specific stuff but stat() is pretty standard. So downstream will end up having to clutter code to the effect of - if running on Linux use statx() else if running on non-broken platform use stat().

I wouldn't expect many applications to call the actual syscalls directly - most will defer to libc or some other layer of abstraction, where these differences can be hidden away.

D'oh, of course!

> That's OK for Linux specific stuff but stat() is pretty standard.

The libc (~POSIX) call is "pretty standard" and uses typedef'd pseudo-abstract types (OSX's stat has been 64b optionally since 10.5 and by default since 10.6, though the 32b version seems to remain available in 10.11 by compiling with _DARWIN_NO_64_BIT_INODE), the underlying syscall is not in any way.

stat() takes a pointer to a struct where to write its output. This struct currently has 32bit values for time.

If you updated stat() to write 64bit values into a different struct, any existing program calling it with a pointer to an old struct would get garbage data (and a potential buffer overflow).
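
A toy model of that mismatch, using `struct` to stand in for the C layouts. Both layouts are hypothetical (a size field followed by one timestamp), not the real `struct stat`.

```python
import struct

# Hypothetical layouts, not the real struct stat: a size field followed by mtime.
OLD_LAYOUT = "<Ii"   # 32-bit size, 32-bit signed mtime
NEW_LAYOUT = "<Iq"   # 32-bit size, 64-bit signed mtime

def write_record(buffer, layout, size, mtime):
    struct.pack_into(layout, buffer, 0, size, mtime)

caller_buffer = bytearray(struct.calcsize(OLD_LAYOUT))  # what an old binary allocates
mtime_2040 = 2_208_988_800                              # 2040-01-01 00:00:00 UTC

try:
    write_record(caller_buffer, NEW_LAYOUT, 4096, mtime_2040)
except struct.error as exc:
    # Python refuses; C would silently write past the end of the buffer.
    print("new layout does not fit:", exc)

# An old reader handed new-layout bytes sees only the low half of the timestamp.
new_bytes = struct.pack(NEW_LAYOUT, 4096, mtime_2040)
size, mtime = struct.unpack_from(OLD_LAYOUT, new_bytes)
print(size, mtime)
```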

Renaming the function also makes debugging easier - after statx() is in widespread use, stat() could be replaced with a placeholder that raises the SIGTRAP signal, and immediately detects epoch-unsafe programs still in use.

Because OpenBSD already did a large part of the work, including ensuring that software in their ports tree still works (and pushing patches to upstream developers), it should require less of an effort for Linux to go the same route.

We'll see what actually happens..

No direct insight, but -

a.out -> ELF, glibc2, and NPTL threads radically shifted threading and required kernel/userland cooperation in the Linosphere; We now have ELF symbol versioning widely supported which should make this even easier, so I suspect there will be some sort of long-run transition period and some fun C #ifdef macro fun over the long haul - e.g. I could see this new statx() thing (which apparently has additional information) being the baseline, and then having some -DLINUX_2038 gizmo which redefines stat() in terms of this function when present, possibly with some sort of ld trickery to splice together the appropriate functions in the case of shared libraries, yadda..

Slightly off topic, but...why "x"? I thought the normal procedure for something like this would be "stat2()" (eg. accept4())

accept4() is so-named because it takes 4 arguments (instead of the usual 3). The x here probably means extended; statx(2) doesn't take any more arguments than stat(2).

Is there a reason why they decided to store time as seconds from 1970? In a 32-bit integer, no less. It seems like basic logic would have led the original designers to make it at least 64 bits so that you'd never overflow it (with a 64 bit time we'd be good til the year 292277026596).

64 bits would also allow you to also cover the entirety of history, all the way back to 13.7 billion years ago when the Universe came into existence, but instead the UNIX time format is shackled to be within ~68 years of 1970.

If you told the original UNIX developers that there was even the slightest chance their system might still be in use in 2038, they probably would have called in some large, friendly men in white coats to haul you away.

Add to that the fact that memory was very much not cheap at the time. Memory for the PDP-7 (the first computer to run UNIX) cost $12,000-20,000 for 4kB of memory. In 1965 dollars. In 2017 terms, that means that wasting four bytes had an amortized cost of three hundred to six hundred dollars. And that's for each instance of the type in memory.

Ironically, there's actually a chance UNIX wouldn't have been in use in 2038, or any time at all, if its designers had insisted on a costly future-proofing like using a 64-bit time type. As you've highlighted, wasting memory like that is a costly proposition, and it would've been an easy black mark when compared against a competing system that "uses less memory".

I think the developers made the right choice.

The year 2038 problem is actually younger than unix.

The first definition was sixtieths of a second since 1970-01-01T00:00:00.00 stored in two words (note that a word is 18 bits on a PDP-7!). That definition was later changed.

Also Linus could have defined `time_t` to be 64 bit when he started linux.



I'm not even sure there was a standard 64-bit type for C in 1991... Or how well compilers on PC would support that.

There wasn't - the largest minimum integer size from C90 (ANSI C) was long, with at least 32 bits. "long long" was agreed upon at an informal summit in 1992 as an extension 64 bit type on 32 bit systems until it was standardised in C99 (but already existed in several compilers at that point, including GCC).

So GCC may have had 'long long' already when Linus started working on Linux.

Well when you're the only person/people using C you can get anything through the compiler committee very quickly.

I really think society must expend every effort to keep Ken Thompson alive until the end of the epoch.

Wouldn't it actually be 10 to 20 dollars per instance instead of hundreds of dollars?

$10-20 would be the raw cost at the time, not accounting for inflation. I think I double-counted the four bytes, because in 2017 dollars it would be about $80-160, not $300+.

I'm nearly old enough to try to put my brain back to that time (I used Unix V7 on a PDP-11/45..) and I'm not sure the replies here are quite on the mark. Yes, if someone had suggested a 64-bit time_t back then, the obvious counterargument would have been that the storage space for all time-related data would double and that would be a bad thing. Also true that there was no native language support for 64-bit ints, but I don't think that is a show-stopper reason because plenty of kernel data isn't handled as compiler-native types.

I think the main reason nobody pushed back on a 32-bit time_t is that back then much less was done with date and time data. I don't think time rollover would have been perceived as a big problem, given that it would only happen every 100 years or so.

In the decades since we have become used to, for example, computers being connected to each other and so in need of a consistent picture of time; to constant use of calendaring and scheduling software; to the retention of important data in computers over time periods of many decades. None of these things was done or thought about much back then.

This is a great point. Time synchronization between systems that do not share a clock line is a pretty recent thing. It didn't used to matter at all if your clock was wrong, and many people would never notice or bother to fix it. Now if your clock is wrong you can't even load anything in a web browser. Your clock sync daemon has to fix your clock before the certs will be accepted as valid. HTTPS is a bummer, maaan.

You have a reference for HTTPS problems with skewed client time?

Not OP, but the obvious issue is that with very large offsets the certificates all look like they've either expired or are future dated; either way they're not accepted. I had a laptop with a dead clock battery for a while; I would sometimes fumble the time when booting it and would discover the mistake when I couldn't load my webmail or Google. (Also, the filesystem would fsck itself because it was marked as last fscked either in the future or the far past, but I didn't always notice that.)

One of the hard problems we already had to handle was that Unix used long also for file sizes. So if nobody would use 64-bit types early on to break the 4G barrier on storage, then obviously, nobody would do it for time.

Even the 32-bit Unix versions shipped with this limitation for a very long time.

Back in 1970, no language had a 64-bit integer type. And it started with Unix, which was a skunkworks hobby project, so thinking "we'll solve it within the next 68 years" is perfectly reasonable.

They could have made it unsigned instead of signed, which would have made it work until 2100 or so, but I think a 68-year horizon is more than most systems being built today have.

>They could have made it unsigned instead of signed

C actually didn't have unsigned integer types in the beginning. They were added many years later and also not at the same time. For example, the Unix V7 C compiler only had "unsigned int".

If anything I imagine guys like Ritchie never thought we'd be using a Unix-based system so far in the future. Back then OS's were a dime a dozen and the future far too cloudy to predict in regards to computing.

>but I think a 68-year horizon is more than most systems being built today have.

That's a lot of time, especially if we see Linux breaking into the mainstream about 1995 or so. That's 43 years to worry about this. Meanwhile, we saw Microsoft break into the mainstream at around 1985, which only gave us 15 years to worry about Y2K.

> Back in 1970, no language had a 64-bit integer type.

It would be more accurate to say that "no language had a two-word integer type." 1960s CDC 6000-series machines had 60-bit words, and Maclisp got bignums sometime in late 1970 or early 1971.


In the late 1970s, the cutting-edge microprocessor was 16-bit. The first 32-bit Intel chip was the 386, which debuted in 1985.

The TRS-80, a common small computer in the late 70s, offered 4kb-48kb of RAM.

When using hardware with that capacity, overflowing time_t in 2038 is hardly a concern.

IBM ran 128 bit virtualized architecture on top of 1988 chips.

Which was a backward-compatible extension of a 48-bit address space system built out of 1970s chips: http://bitsavers.trailing-edge.com/pdf/ibm/system38/IBM_Syst...

On mainframes, right? But Unix wasn't written to be a mainframe OS.

true, I was just noting that the universal claim of the parent about dates and clocks wasn't so universal

In Unix v1 (1971) it actually did not even track the year. The time system call was documented as "get time of year", and returned "the time since 00:00:00, Jan. 1, 1971, measured in sixtieths of a second" (https://www.bell-labs.com/usr/dmr/www/pdfs/man22.pdf). The operator had to set the time on each boot, and there was no option to set a year. The PDP-7 hardware could increment a counter every 1/60 second but only while it was powered on. Later the time was changed to whole seconds and redefined to be the time since January 1, 1970 00:00:00 GMT (now UTC), but was kept 32 bits.
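
For scale, a quick calculation of how long a 32-bit count of sixtieths of a second lasts. This assumes all 32 bits are used as an unsigned counter; the function name is made up for illustration.

```python
TICKS_PER_SECOND = 60
SECONDS_PER_DAY = 86_400

def rollover_days(bits):
    """Days until an unsigned counter of 1/60 s ticks wraps around."""
    return 2**bits / TICKS_PER_SECOND / SECONDS_PER_DAY

days = rollover_days(32)
print(f"32 bits at 60 Hz: {days:.1f} days, about {days / 365.25:.2f} years")
```

That is a bit over two years, which is why the switch to whole seconds bought so much range without widening the field.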

The C version of Unix was written for a 16-bit processor (PDP-11). The C compiler simulated 32-bit operations but nothing bigger. 64-bit operations only became widespread much later, when 16-bit systems were no longer relevant and 32-bit systems got 'long long'. Note that POSIX allows time_t to be 64-bit. And as far as I know, that's what OpenBSD does.

The reasoning I've heard is that back then memory and disk space were limited and they couldn't sacrifice the extra bytes.

For example, if every file stores three timestamps (mtime, ctime, and atime), then that's an extra 12 bytes per file to store a 64 bit timestamp vs a 32 bit timestamp. If your system has five thousand files on it, that's an extra 60 KB just for timestamps. In 1970, RAM cost hundreds of dollars per KB [1], so this savings was significant.

[1] http://www.statisticbrain.com/average-historic-price-of-ram/

I'm not convinced that the problem was considered in those terms. Imagine that there was a meeting where someone said "I'm going to make time_t 64-bits because if I don't it will mean all software will break in unfortunate ways in the year 2038", and someone else said "Yeah, that's something to be concerned about but we can't do that because memory and disk space is at present too expensive to allow it". Well, I'm confident that no such meeting occurred because nobody back in the early 70's was thinking that way at all. The thinking would be more like "Ok, the last OS I worked on used a 32-bit int, so ho hum...there we are... time_t, move on to code the next thing...".

Yes, people absolutely cared about bits and bytes, because there weren't very many of them. (Programmers weren't necessarily thinking of them as monetarily expensive, because even today you don't just go slamming more RAM into your machine if you need more. The problem is that there were only so many of them.) You could still see the residual hacker attitudes even five years ago, though I'd have to call it mostly dead now. But they were absolutely counting bits and bytes all the time, by default, in a way few programmers nowadays can appreciate.

It's why we have "creat" instead of "create", it's why file permissions are tightly packed into three octal digits (as one of the old systems Unix ran on was actually a fan of 36-bit machine words, so 9 bits divided things more evenly at the time). It's why C strings are null-terminated instead of length-delimited, which would be more sensible in every way except that length-delimited strings require one extra byte if you want to support the size range between 256 and 65535. Yes, the programmers of that time would rather save one byte per string than have a safe string library. Pre-OSX Mac programmers can tell you all about dealing with one-byte-length-delimited strings and how often they ended up with things truncated at 255 chars accidentally.
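
The byte-counting trade-off between those string layouts can be sketched as follows. The function names are invented, and the silent truncation is just one way a one-byte length prefix can go wrong.

```python
def c_string(data: bytes) -> bytes:
    """NUL-terminated: one byte of overhead at any length, but no embedded NULs."""
    if b"\x00" in data:
        raise ValueError("embedded NUL would end the string early")
    return data + b"\x00"

def pascal_string(data: bytes) -> bytes:
    """One-byte length prefix: anything past 255 bytes is silently dropped here."""
    data = data[:255]
    return bytes([len(data)]) + data

def prefixed16(data: bytes) -> bytes:
    """Two-byte length prefix: covers up to 65535 bytes at one extra byte of cost."""
    if len(data) > 0xFFFF:
        raise ValueError("too long for a 16-bit length")
    return len(data).to_bytes(2, "big") + data

text = b"a" * 300
for encode in (c_string, pascal_string, prefixed16):
    print(f"{encode.__name__:>14}: {len(encode(text))} bytes stored for 300 bytes of text")
```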

In an era where "mainframes" shipped with dozens of kilobytes of RAM, yeah, they cared.

>even today you just go slamming more RAM into the machine if you need it

Hmm, every software gig I've had in the past 5 years that's exactly what I've been expected to do because the extra ten bucks a month for a bigger VM is wayyy less expensive than engineering time. Interesting times.

I don't think the previous user is saying that no-one cared about space, just that no-one cared about 2038. So that conversation wouldn't have happened anyway.

Indeed. There was no carefully considered trade-off made between storage space and brokenness in 70 years time. Nobody thought like that. Nobody would have expected their code and data to be remotely relevant that far into the future. People wrote code according to present-day norms which would have included using a 32-bit integer for time.

Whenever I do embedded work with counters, every time I assign a variable that has a possibility of overflowing I do a little mental count about how likely it is to overflow. That's part of the software development process.

They may not have had a meeting about it, but I think it's exceedingly unlikely that whoever decided to assign a 32 bit int to store time didn't give some consideration to the date range it could represent. Otherwise how would they know not to use a 16 bit int?

They didn't design it in a vacuum. They had worked on other OS'es already that used 32-bit ints with a 1 second quantum and they (probably subconsciously) thought that if it had been good enough for those other systems it was good enough for Unix.



5x 16 bit registers.

So ability to operate on 1x 64 bit number and some change if loaded all at once.

How many instructions do you think it would take to add/subtract 2x 64 bit integers? vs 2x 32 bit integers on such a machine?

Not to mention having to implement and debug this logic in assembly on a teletype vs using a native instruction.. (see "Extended Instruction Set (EIS)" in same link)

No one would have considered 64 bits at all because it would have been a huge hassle and not worth it, even beyond thinking ahead in this way..
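
To make the cost concrete, here is a sketch of multi-word addition with carry propagation, the way a 16-bit machine has to do it: one add per word plus carry handling. This models the technique in Python rather than any particular instruction set.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1

def to_words(value, count=4):
    """Split an integer into `count` 16-bit words, least significant first."""
    return [(value >> (WORD_BITS * i)) & WORD_MASK for i in range(count)]

def from_words(words):
    return sum(word << (WORD_BITS * i) for i, word in enumerate(words))

def add_words(a, b):
    """Add two equal-length word lists, carrying from each word into the next."""
    result, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry
        result.append(total & WORD_MASK)
        carry = total >> WORD_BITS
    return result  # the final carry out is dropped, as in fixed-width arithmetic

a, b = 0x0000_FFFF_FFFF_FFFF, 1
print([hex(w) for w in add_words(to_words(a), to_words(b))])
print(len(to_words(a, 2)), "word additions for 32 bits,", len(to_words(a)), "for 64")
```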

Besides.. if 'the last OS I worked on' was probably the 1st or second interactive timesharing system ever written, give or take (e.g MULTICS/ITS), and I worked on it at a low level, because thats what people did, chances are, I might have talked to the person who came up with the idea on how to store the time on that system.. who conceivably could be the 2nd or 3rd person ever to actually implement this, ever.. And if this is the case, don't you think, that person would have thought about it somewhat?

Programmers at that time were many times much better at these things than now..

See also: http://catb.org/jargon/html/story-of-mel.html

(which itself was posted in 1983 concerning the same topic...)

I'd suggest spinning up some SIM-H VM's and mucking around for a while with early unices (v5,v7,32V,4.3BSD), and probably ITS or TOPS-10/TWENEX as well ... it is quite illuminating and very insightful.

Back in the 70s RAM was expensive, and 68 years is long enough that they figured they would have a solution in place long before it became an issue.

When your machine has a 16 bit processor and a few dozen kilobytes of RAM you look to save wherever you can. 64 bit number support was primitive and quite slow as well.

When you're mainly working with 32bit CPUs (or less) and the year of overflow is almost 50 years in the future I can forgive them for considering it was good enough at the time. Maybe they thought that by the time it was going to be an issue somebody else would've replaced it?

It's in the same bag as IPv4 "only" supporting a few billion addresses, hindsight is always 20/20...

Moreover even 64bit timestamps wouldn't be good enough for certain applications that require sub-second precision. PTP (the precision time protocol) for instance uses 96bit timestamps to get nanosecond granularity. You always have to compromise one way or an other.

IPv4 as designed didn't support anything like a few billion addresses. We had to invent CIDR to get there, years later.

Are you sure about that? If we're talking as IPv4 as specified in RFC 791[0] (dated September 1981) it seems to support billions of addresses already:

> Addresses are fixed length of four octets (32 bits). An address begins with a network number, followed by local address (called the "rest" field). There are three formats or classes of internet addresses: in class a, the high order bit is zero, the next 7 bits are the network, and the last 24 bits are the local address; [...]

A 7-bit network number times 24-bit local addresses is already more than two billion.

[0] https://tools.ietf.org/html/rfc791

It supported billions of addresses, but only millions of networks. The registry could only give out 128 16-million address chunks, 16,384 64k chunks, and ~2M 256 address chunks.
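
Those figures fall straight out of the bit split in RFC 791. A quick sketch, with an invented function name:

```python
# (class, leading prefix bits, network bits, host bits) per RFC 791
CLASSES = [("A", 1, 7, 24), ("B", 2, 14, 16), ("C", 3, 21, 8)]

def class_summary():
    """Return (class, networks, addresses per network, total addresses) rows."""
    rows = []
    for name, prefix_bits, net_bits, host_bits in CLASSES:
        assert prefix_bits + net_bits + host_bits == 32
        rows.append((name, 2**net_bits, 2**host_bits, 2**(net_bits + host_bits)))
    return rows

for name, networks, per_network, total in class_summary():
    print(f"class {name}: {networks:>9,} networks x {per_network:>10,} addresses "
          f"= {total:,}")
```

Plenty of addresses in total, but only 16,384 of the mid-sized class B blocks that most organizations wanted.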

IPv4 was running out of class B's, those 64k address chunks, when CIDR was introduced.

I'm sure because I was there, at the meetings when CIDR was proposed and adopted.

> Is there a reason why they decided to store time as seconds from 1970?

Pretty sure it was related to space being an issue. In every place where you needed to save time you likely didn't want to use more space than you had to. This was also a driving factor as to why years were stored with only the last two digits.

In 2017 we have no problem store-wise making it a 64-bit integer. But in the 90s and earlier? I think it would have been a hard sell to make a change that would future proof them beyond 2038 especially when so many play the short term money game.

You are operating under the assumption the extra 4 bytes was an insignificant cost. This was not true for much of early UNIX history.

How many of the other data structure choices that were made in the early 1970s didn't need to be changed for 40 years or so?

A choice that gets you 40 years down the road, instead of millions of years down the road is a good choice, when you don't even know if you're going to have roads in 40 years.

Let's move to Urbit's 128 bit system: https://github.com/urbit/urbit/blob/master/include/vere/vere...

    /*  Urbit time: 128 bits, leap-free.
    **  High 64 bits: 0x8000.000c.cea3.5380 + Unix time at leap 25 (Jul 2012)
    **  Low 64 bits: 1/2^64 of a second.
    **  Seconds per Gregorian 400-block: 12.622.780.800
    **  400-blocks from 0 to 0AD: 730.692.561
    **  Years from 0 to 0AD:
    **  Seconds from 0 to 0AD: 9.223.372.029.693.628.800
    **  Seconds between 0A and Unix epoch:
    **  Seconds before Unix epoch: 9.223.372.091.860.848.000
    **  The same, in C hex notation: 0x8000000cce9e0d80ULL
    **  New leap seconds after July 2012 (leap second 25) are ignored.  The
    **  platform OS will not ignore them, of course, so they must be detected
    **  and counteracted.  Perhaps this phenomenon will soon find an endpoint.
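
The constants in that comment can be cross-checked with a few lines of arithmetic. This is just a sanity check of the quoted numbers, with variable names chosen here, not anything from the Urbit source.

```python
from datetime import date

SECONDS_PER_DAY = 86_400
DAYS_PER_400_YEARS = 400 * 365 + 97   # 97 leap days per Gregorian 400-year cycle

seconds_per_block = DAYS_PER_400_YEARS * SECONDS_PER_DAY
blocks_before_year_0 = 730_692_561    # taken from the quoted comment
seconds_to_year_0 = blocks_before_year_0 * seconds_per_block

# Days from 0000-01-01 to 1970-01-01: year 0 is a leap year in the proleptic
# Gregorian calendar, and Python's ordinals start at 0001-01-01.
days_0_to_1970 = 366 + (date(1970, 1, 1).toordinal() - 1)
unix_epoch_offset = seconds_to_year_0 + days_0_to_1970 * SECONDS_PER_DAY

print(seconds_per_block)       # 12622780800
print(seconds_to_year_0)       # 9223372029693628800
print(hex(unix_epoch_offset))  # 0x8000000cce9e0d80
```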

I wonder how much attention this will get from the general public and non-technical managers? After all, programmers predicted doom for Y2k, and then "nothing happened".

This is almost the same situation, except I assume slightly less understandable to a non-programmer (you have to understand seconds-since-1970 and why we'd do that instead of storing the date as text, powers of 2 and the difference between 32 and 64-bit).

Yeah, I worry that between the less obvious failure mechanics and the sense that Y2K was overblown (probably, ironically, because it was effectively mitigated), the epochalypse may not be all that nice.

Speaking of the 2038 bug, I'm impressed with Paul Ryan's rhetoric [0]

“I asked CBO to run the model going out and they told me that their computer simulation crashes in 2037 because CBO can’t conceive of any way in which the economy can continue past the year 2037 because of debt burdens,” said Ryan.

I love politicians.


I wonder how DRM anti-circumvention laws will mix with this: you have a locked-down device you use, depend on, and know is defective, but you are not allowed to hack the device to fix it.

Meh, we'll just redefine time_t as (signed) seconds since 2000 or so and subtract 30 years in seconds from all timestamps ... ;-)

All transactions between 1970 and 1999 have/will now occur between 2038 and 2068.

No, you make it time_t (unsigned) seconds since the epoch. That pushes the problem out 70+ years and doesn't break any binary APIs.

Changing from signed to unsigned is very much an ABI break.

Yes, it is an ABI break.

But if you took code that was compiled with each version, the binary data that they will produce/consume for dates between the epoch and 2038 is bit for bit identical.
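
Both halves of that argument can be checked with `struct`: the encodings agree for every value a signed field can hold from 1970 onward, and they disagree only on how the top bit is read. The helper name is hypothetical.

```python
import struct
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def same_bits(t):
    """True if t has the same 4-byte encoding as signed and as unsigned."""
    return struct.pack("<i", t) == struct.pack("<I", t)

last_signed = 2**31 - 1
print(same_bits(0), same_bits(last_signed))      # True True
print(EPOCH + timedelta(seconds=last_signed))    # 2038-01-19 03:14:07
print(EPOCH + timedelta(seconds=2**32 - 1))      # 2106-02-07 06:28:15

# The break: bytes an old binary wrote for a 1969 time read back as a date in 2106.
raw = struct.pack("<i", -100)
reread = struct.unpack("<I", raw)[0]
print(EPOCH + timedelta(seconds=reread))
```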

Despite the cause being the end of the UNIX epoch in 2038, problems will become apparent much sooner. Like the Y2K issue - in ~2031 (or sooner), systems that track expiration dates or contract terms will start to run into dates past 2038. As 2038 approaches, more systems will be affected (there are relatively fewer expiration dates 7 years out vs. 5 or 3).

The effects of this problem are closer than they seem - only 14 years away or less

Is 2038 the end of a signed int? If so, can't we just make it unsigned and buy ourselves another 70 years or so? I don't know how much of an issue not being able to represent time before 1970 is, but for timestamps that doesn't seem like it would be an issue.

That would break systems that rely on timestamps earlier than 1970, though.

  #include <stdio.h>
  #include <time.h>
  int main(void) {
      time_t epoch = -100;  /* 100 seconds before 1970-01-01 00:00:00 UTC */
      printf("%s", asctime(gmtime(&epoch)));
  }

The above would, for example, print a time in 1969.

https://en.wikipedia.org/wiki/Year_2038_problem explains pretty well why that wouldn't work and other solutions tried

> BSD-based distributions have the advantage of being able to rebuild everything from scratch, so they do not need to maintain user-space ABI compatibility in the same way.

I don't understand, not knowing much about BSD. Is this an LTS/support thing? Can someone explain?

BSDs don't tend to have distros. So for example OpenBSD has a ports tree maintained by the ports team for OpenBSD. The operating system is free to break ABIs from release to release, as the ports tree is audited, fixed, and recompiled for the new release.

The Linux kernel can't freely do this as then the ABI break is placed on various distro maintainers and software authors because there is no clear point in time they can say ABI $FOO will break on date $BAR.

BSDs typically have all their userspace tools in the same tree as the kernel, so it's easier for them to make these breaking changes than it is for Linux, where the kernel lives in one place and the rest of the code is scattered far and wide.

Generally people build BSD packages from source instead of installing binaries. (I think; I've not used BSD.)

OpenBSD user since 2009. I have never compiled a package from source.

But they can break the ABI because they do not want to maintain compatibility with old proprietary binaries. It's a source world, in the sense that any software can and will be recompiled if and when needed. That doesn't mean every user has to compile their own system.

The only time I build from source (ports) on FreeBSD is because the package maintainer picked "crappy"[1] options. On OpenBSD, I can generally find the flavor of port that has the options I want. If I cannot I resort to ports.

1) for my definition of crappy, not compiling PostgreSQL support is the most common for me

I've used both; most BSD users have. I prefer packages, but ports are useful for custom builds and desktop systems.

Newton (Apple's old PDA) had a similar problem in 2010 [0]. In short, while the base system and C++ interfaces used 32-bit unsigned ints with a base of 1904-01-01, NewtonScript used 30-bit signed ints with a working base of 1993-01-01, overflowing in 2010. The fix was a binary patch that changed the time bases.

0: http://40hz.org/Pages/Newton%20Year%202010%20Problem

It's interesting to see that Apple's... shortsightedness/planned obsolescence is nothing new. The Apple Lisa, released in 1983, was apparently not designed to survive beyond 1995:

http://www.macworld.com/article/2026544/the-little-known-app... (scroll down)

Surprisingly, when Googling for 2038 & FreeBSD, the 2nd highest recommended search result was:

2038年問題 freebsd

I do not speak, write, or search for things using Chinese characters. Seems as though this problem must have been heavily Google'd for by Chinese speakers - why else would it have popped up in my search recommendations?

Btw: Google Translate tells me 年問題 means "year problem"

Perhaps this information was important for ensuring the safety of Kylin, which started out as a sort of Chinese DARPA-style project to get the state off of MS Windows. Kylin was announced in 2006. It was supposedly based on FreeBSD 5.3.

Strange thing is, Kylin later became known to use the Linux kernel (with an Ubuntu influence). Google search recommendations, which should be based on a recent volume of searches, should yield "2038年問題 linux" rather than "2038年問題 freebsd" if they suggested anything about Kylin development. Maybe some of those FreeBSD-Kylin systems are still being heavily used.

Or perhaps there are a lot of embedded systems being produced in China which use FreeBSD.

On a side but related note, I don't understand why many programming languages and databases don't have a positive and negative infinity date placeholder/token value that is standardized and cross platform. Negative infinity date is "past", positive infinity date is "future". This would solve the common problem of what date to use when you are trying to talk about unknown date in the future or unknown date in the past, rather than using weird placeholders like 9999-12-31 or 1900-01-01 or other magic numbers.

There are lots of scenarios where you need to represent an unknown/indefinite date, but this can be done with null; are there any reasonably common scenarios where you need to distinguish between unknown past and unknown future in the same context?

I agree, null is often chosen as an option. But null is also not a value by definition, so it becomes hard to do a range check when using, say, a start and end date. Also, what is the difference between a date in the future and null? Semantics, I think: a token future date has an explicit value, whereas if you use NULL, the use of that as a future date has a new implicit value that depends on context.

I know maybe it sounds pedantic or perhaps out there but I think that token dates that are fixed and specified as positive or negative infinity gives a mathematical value that can then be reasoned with formulaically, just like we do with calculus. We keep running into finite limits which is what keeps causing problems such as Y2K, Y2K38, the beginning of Unix time (1970-1-1) - maybe if we treated the beginning and end of time as infinity some new method of reasoning about dates would become more apparent. I'm not sure as I haven't gone all the way down the rabbit hole with this idea yet.

Another thought - if you treat a date as a relative variable related to a frame of reference (example, what if you defined the future to be a token value of today + N days, which means it is always calculated as a dynamic future date ahead of today), maybe we could reason about relative past and present using relative variable dates? Or maybe this is just crazy talk.

Distinguishing between "unknown past" and "unknown future" is helpful if you're going to be sorting things on date.

So, not a common case, but how about when trying to schedule an event dynamically that starts on an unknown date in the future and lasts for N days beyond that?

In languages where epoch time is generally represented as float, inf and -inf do what you describe.

More gravy for Initech!

Using a 64 bit unsigned integer with nanosecond resolution gives us 585 years (2^64/1e9/60/60/24/365) after 1970 to come up with a new format. This, combined with using some other format to describe dates before 1970, seems like a sensible solution to me.

The problem is not with developers or tech savvy people. Everyone will know about this by then and solutions will be applied. The problem is with end users, who will only realize this after the shit hits the fan and their fridge will go crazy or there will be a car crash.

But since those products were made by developers and tech savvy people, wouldn't it be their fault for releasing a product capable of doing that? It's literally their jobs to do so. Just as doctors and lawyers don't just assume patients/clients will know everything about their fields, why should we?

This assumes that any of today's technology will survive 21 years...

My jeep is 27 and has an intel CPU.

so.... yep.

So what are we calling this one, Y2K38?

I've heard people talk about the risk to cars, but what other kinds of embedded systems will still be in use after 20 years? Maybe certain industrial machines?

This may be a stupid question, but could we not use Bignum as a datatype to solve this? Or would it be too computationally expensive?

This is a good reason to run anything but Linux, particularly on 32bit hardware.

All BSDs have, as far as I'm aware of, solved this years ago.

Only NetBSD and OpenBSD; it's still an issue on FreeBSD.

Fortunately, my retirement system has a project active right now to address this, and I will be retired in late 2032.

Y2K didn't cause any serious blips on the stock market, so 2038 will probably pass without many even noticing.

As a layman here, can someone explain to me what the significance of the year 2038 is?


Short summary: Many systems (including Unix) store time as a signed 32 bit int, with the value 0 representing January 1st 1970 00:00:00. This number will overflow one second after 03:14:07 UTC on 19 January 2038.

So a more serious Y2K?

It is Y2K-ish; it's "different" so I'm not sure about seriousness. I'm thinking it simultaneously could be easier and more difficult, the standard "it depends" type answer. :)

Y2K is more of a formatting / digit representation problem than a pure data type overflow. The solution for Y2K was to switch the representation of year from 2 to 4 digits, along with coding and logic changes to go along with this.

For Unix / Linux, the solution for the 2038 problem involves changing time_t from 32 bits to 64 bits. At a higher level (e.g. what's in your C++ code), instinctively I don't think this in itself would involve as many code changes (maybe some data type changes, but probably fewer logic changes than Y2K, that's my guess). I believe several platforms have already moved towards 64 bit time_t by default... some support this by default even on 32 bit systems, such as Microsoft Visual C++ -- https://msdn.microsoft.com/en-us/library/3b2e7499.aspx

Since this involves a data type overflow issue, though, we're dealing more with platform specific / compiler / kernel type issues. I don't know, for instance, how easily 32 bit embedded type systems could handle a 64 bit time_t value. I understand that there are some technical issues with Linux kernels (mentioned in some of the comments) that prevent them from moving to a 64 bit time_t regardless of platform (time_t should always be okay on 64 bit platforms, it's the 32 bit platforms that will have the issue...)

The good news is we have 21 years to think about it...

Nothing significant except that the storage spot for Unix time will fill up. Unix used to use a 32-bit variable to store time in seconds since the epoch of 1970, so it will overflow in the year 2038. Most modern Unix systems have fixed the problem already, but Linux has a harder time than the others, and is still working on its fix.

My retirement, only 18 years away. I'm real sorry I'm gonna miss this mess, NOT!

For the others curious: it seems JavaScript handles dates with 64-bit floating-point numbers, whose maximum safe integer is 9007199254740991.

The highest date I could make with node+chrome was 'Dec 31 275759', which cozies-up pretty close to that (8639977899599000)

I wanted to figure out the importance of that value you found, so I went and did some further research. It turns out the greatest date you can create in Javascript is Date(8640000000000000). Interesting!

From the ECMAScript Spec [1]:

The actual range of times supported by ECMAScript Date objects is slightly smaller: exactly –100,000,000 days to 100,000,000 days measured relative to midnight at the beginning of 01 January, 1970 UTC. This gives a range of 8,640,000,000,000,000 milliseconds to either side of 01 January, 1970 UTC.

[1] http://ecma-international.org/ecma-262/5.1/#sec-

Can someone ELI5 this please? The only thing that seems comparable that I know of was the "Y2K bug" - but reading through this it seems like this is actually a big problem - as opposed to the techno-illiterate panic of Y2K.

That work, he said, is proceeding on three separate fronts

I can't read that without thinking of the turbo encabulator.

I'll let somebody else handle the explanation but Y2K wasn't a "techno-illiterate panic", a ton of people worked a ton of hours to fix and update the code for Y2K and it's a testament to their labor that there were no major issues.

It was a combination of both. I still remember my dad coming home from a talk that he considered the "best yet" on the Y2K issue.

The speaker warned that traffic lights would stop working. Maybe someone more techno-literate than me can explain why that would be a genuine concern, but from my perspective at the time it seemed like the guy was making money from fear mongering.

Is software written in Go (that uses "time" package) going to be affected?

No: https://golang.org/pkg/time/#pkg-constants. Not unless you explicitly use the Unix method.

Hey guys! I just invented a new form of music! :D Epochalypso!

What about the NTP epoch, which has a 2036 problem?

Here I thought it was a post on the singularity. It might be ironic if all the AI starts running into all sorts of 2038 bugs and this ends up being a huge issue.

With any luck I'll be dead by then.

We should stock up on IBM 5100's

(Does anybody remember that time traveler?)

And IPv6 penetration will have hit 20%.

You jest, but global IPv6 penetration is at ~16%. It rose ~6% last year, so if linear growth is presumed (and it's actually been growing closer to exponentially, as would be expected) we should hit 20% late this year. I'm hopeful that we'll see some decent pickup of it since AWS finally started offering it.

This is global adoption; some countries, including the United States, have already hit 20%.

[1]: https://www.google.com/intl/en/ipv6/statistics.html

>The graph shows the percentage of users that access Google over IPv6.

I'm finding it really hard to believe, as someone in Guatemala, that 6% of requests to Google here are made over IPv6. Is there any way to gain more insight? e.g. what ISPs are responsible?

Mobile carriers. In the US, 55% of mobile traffic is IPv6: http://m.slashdot.org/story/315213

That level of detail isn't published, but most of Guatemala's IPv6 traffic is coming from http://bgp.he.net/AS52362 and http://bgp.he.net/AS23243

Edit: APNIC has similar data published here: https://stats.labs.apnic.net/ipv6/GT

I appreciate the links! Exactly what I was after.

What's up with that huge weekly variation?

It looks like it's correlated with the weekend. I guess workplace computers are more likely to use IPv4? It makes sense on the surface. Workplaces have little to no incentive to use IPv6 since they either have a huge block of IPv4 space or run NAT or both. Also they tend to rely more on enterprise network appliances which have bad IPv6 support in my experience. IPv6 is more of a boon to consumer applications since carrier-grade NAT is a nuisance and otherwise you need an IP per customer.

Every employer except one that I've worked for had IPv4 only. None of them had public IP blocks; they were all NAT'd.

> Also they tend to rely more on enterprise network appliances which have bad IPv6 support in my experience.

This I would believe.

> IPv6 is more of a boon to consumer applications since carrier-grade NAT is a nuisance and otherwise you need an IP per customer.

It would have been a slight boon at work, too. HR perennially makes me grab documents/data off my home machine, and I cannot wait for the day when I can just `ssh` to a domain name. My .ssh/config aliases are getting pretty good, but it still adds considerable latency to pipe everything through a gateway. (Alternatively, I could run SSH on non-standard ports, but I've yet to get to mucking around with the port-forwarding settings for that.)

There were also times when we needed to do stuff like employee laptop-to-laptop communications, and the network just wouldn't deal with it. I was never sure if this was NAT, or just that Corp Net liked to drop packets. (It seemed rigged to drop basically anything that didn't smell like a TCP connection to an external host. ICMP wasn't fully functional, which of course makes engineering more fun when you're having your personal desktop at home do pings or traceroutes for you, but that doesn't help if the problem is on your route.)

Makes sense.

Offices. Corporations hugely lag in technology: I still continue to run across Internal App™ at Big Corp Inc that only works with IE 6 — and that's not just lagging, that's ancient. It should be no surprise their Internet infrastructure also lags. My current and last two employers both were IPv4 only.

I'm guessing, but could it be offices running IPv6 while employees go home to IPv4? (With non-square edges due to mornings and evenings.)

That's motivational.

And will be the year of the Linux Desktop.

and, if we're lucky, Half Life 3?

Wasn't that last year? ;-)

It is already over 50% in the US.

The stat for Google in the citation (https://www.google.com/intl/en/ipv6/statistics.html#tab=per-...) shows 30%; where do you get the 50% adoption from?

thanks to runeks [https://news.ycombinator.com/item?id=13925396]

according to https://mobile.slashdot.org/story/16/08/20/2059216/ipv6-achi... it's only the US mobile carriers that are over 50%.

Will be 70 and happily oblivious.


We detached this comment from https://news.ycombinator.com/item?id=13923113 and marked it off-topic.

Y2K round 2


Call me stupid, but I think computing will be too different by then for us to worry about this (a single-chip OS, or IoT to the level that each hardware component is separate, or something else...).

> Call me stupid,

Well, I won't say you are stupid. But do you realize we are talking about a time format "designed" 40+ years ago? And some CPUs are still compatible with chips from the '80s and '90s. To imagine all that will go away and be solved in 20 years is not logical.

Also, the problem isn't with PCs (which will be upgraded) but the billions of IoT, industrial controls, and other embedded devices that lack easy upgrade paths. Things like elevators, pressure release valves, cars.

I'd add trains to your list. There are locomotives in use in UK that use electronics (I admit not microprocessor based and not running an actual operating system) dating from 1980s. The ones currently being designed may have similar active service times.

True, but this isn't a problem for the hardware and software that will be produced in 2037. The problem is all the stuff that will be built today, and last for 30 years.

We're already bordering on it being a little late. Y2K sucked for the developers of 1999, but they had few computers to worry about. They were less interconnected, and the Unix/Linux-based systems weren't at risk. Imagine trying to patch or replace just 20% of the embedded devices when we get to 2036.

What fascinates me is that we didn't start addressing the issue right after Y2K. Perhaps we would have if more computers had failed.

21 years ago was 1996.

How different is computing now compared to 1996?

Do you consider 2017 vs 2038 to probably have more difference or less difference than 1996 vs 2017?

If more difference, do you expect technology to accelerate faster in the next 21 years than the last 21 years, and why?

>> How different is computing now compared to 1996?

Distributed computing went from being "something that is possible" to "the default". Otherwise, all the standard resources such as processors, RAM, disk, and networking all got faster and cheaper. USB revolutionized plugging in peripherals. All the CRT displays are gone.

From the Linux command line point of view, you're more likely to use tools written with python, perl, or ruby, but the big change there is package managers. You don't download source or binaries from FTP sites as often as you did in the old days.

That would also apply for 1996 vs 2005 :/

There are still lots of systems from the 70's and 80's running today. There are many PDP and VAX systems that run parts of the US military, such as missile defense and nuclear ICBM systems. Many airlines and insurance agencies also run on mainframes from that era.

Even if they don't run the actual hardware, they're likely to be running the same software and OS on an emulator running inside modern hardware.

It's simply too expensive and risky to rewrite all the software on a new platform.

You can buy new PDP-11 and VAX hardware to run Unibus/Q-bus based industrial control systems today: http://logical-co.com/

Not a chance. Even if we completely redesign computer architecture in the next 21 years. Old legacy systems that "still work" will continue hanging around doing the important stuff. There are mainframe systems in insurance companies still running code from the 1970's.

The problem is that if you want to manufacture a device that will last 20 years without updating its software, then you have to fix this issue within a year - otherwise the washing machine, car, stereo or whatever you release with embedded chips and currently standard software can fail in interesting ways in 2038.

I don't really see the problem with cars, none of the important parts require time

Most backup cameras already work via the center entertainment console LCD and the trend is for climate control to move there too.

I'm pretty sure that's exactly what they figured when they designed it in the first place, and here we are many decades later, still living with it.

Yes, they thought that in the 70s too.

Not stupid, just naive
