
Is there a reason why they decided to store time as seconds from 1970? In a 32-bit integer, no less. It seems like basic logic would have led the original designers to make it at least 64 bits so that you'd never overflow it (with a 64-bit time we'd be good until the year 292277026596).

64 bits would also allow you to cover the entirety of history, all the way back to 13.7 billion years ago when the Universe came into existence, but instead the UNIX time format is shackled to within ~68 years of 1970.
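For concreteness, a quick sketch (plain standard C, nothing exotic) of where the signed 32-bit range runs out:

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    /* The last second representable as a signed 32-bit count of
       seconds since 1970-01-01 00:00:00 UTC. */
    int main(void) {
        time_t last = (time_t)INT32_MAX;   /* 2147483647 seconds after the epoch */
        char buf[64];
        strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", gmtime(&last));
        printf("32-bit time_t overflows after %s UTC\n", buf);  /* 2038-01-19 03:14:07 */
        return 0;
    }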




If you told the original UNIX developers that there was even the slightest chance their system might still be in use in 2038, they probably would have called in some large, friendly men in white coats to haul you away.

Add to that the fact that memory was very much not cheap at the time. Memory for the PDP-7 (the first computer to run UNIX) cost $12,000-20,000 for 4kB of memory. In 1965 dollars. In 2017 terms, that means that wasting four bytes had an amortized cost of three hundred to six hundred dollars. And that's for each instance of the type in memory.


Ironically, there's actually a chance UNIX wouldn't have been in use in 2038, or at any time at all, if its designers had insisted on costly future-proofing like a 64-bit time type. As you've highlighted, wasting memory like that was a costly proposition, and it would've been an easy black mark when compared against a competing system that "uses less memory".

I think the developers made the right choice.


The year 2038 problem is actually younger than unix.

The first definition was sixtieths of a second since 1970-01-01T00:00:00.00, stored in two words (note that a word is 18 bits on a PDP-7!). That definition was later changed.

Also, Linus could have defined `time_t` to be 64-bit when he started Linux.

https://en.wikipedia.org/wiki/Unix_time#History

http://aaroncrane.co.uk/2008/02/unix_time/


I'm not even sure there was a standard 64-bit type for C in 1991... Or how well compilers on PCs would have supported it.


There wasn't - the largest minimum integer size in C90 (ANSI C) was long, with at least 32 bits. "long long" was agreed upon at an informal summit in 1992 as an extension providing a 64-bit type on 32-bit systems, and wasn't standardised until C99 (though it already existed in several compilers at that point, including GCC).

So GCC may have had 'long long' already when Linus started working on Linux.


Well when you're the only person/people using C you can get anything through the compiler committee very quickly.


I really think society must expend every effort to keep Ken Thompson alive until the end of the epoch.


Wouldn't it actually be 10 to 20 dollars per instance instead of hundreds of dollars?


$10-20 would be the raw cost at the time, not accounting for inflation. I think I double-counted the four bytes, because in 2017 dollars it would be about $80-160, not $300+.


I'm nearly old enough to try to put my brain back to that time (I used Unix V7 on a PDP-11/45..) and I'm not sure the replies here are quite on the mark. Yes, if someone had suggested a 64-bit time_t back then, the obvious counterargument would have been that the storage space for all time-related data would double and that would be a bad thing. Also true that there was no native language support for 64-bit ints, but I don't think that is a show-stopper reason because plenty of kernel data isn't handled as compiler-native types.

I think the main reason nobody pushed back on a 32-bit time_t is that back then much less was done with date and time data. I don't think time rollover would have been perceived as a big problem, given that it would only happen every 100 years or so.

In the decades since we have become used to, for example, computers being connected to each other and so in need of a consistent picture of time; to constant use of calendaring and scheduling software; to the retention of important data in computers over time periods of many decades. None of these things was done or thought about much back then.


This is a great point. Time synchronization between systems that do not share a clock line is a pretty recent thing. It didn't used to matter at all if your clock was wrong, and many people would never notice or bother to fix it. Now if your clock is wrong you can't even load anything in a web browser. Your clock sync daemon has to fix your clock before the certs will be accepted as valid. HTTPS is a bummer, maaan.


You have a reference for HTTPS problems with skewed client time?


Not OP, but the obvious issue is that with very large offsets the certificates all look like they've either expired or are future dated; either way they're not accepted. I had a laptop with a dead clock battery for a while; I would sometimes fumble the time when booting it and would discover the mistake when I couldn't load my webmail or Google. (Also, the filesystem would fsck itself because it was marked as last fscked either in the future or the far past, but I didn't always notice that.)


One of the hard problems we already had to handle was that Unix also used long for file sizes. So if nobody would use 64-bit types early on even to break the 4G barrier on storage, then obviously nobody would do it for time.

Even the 32-bit Unix versions shipped with this limitation for a very long time.


Back in 1970, no language had a 64-bit integer type. And it started with Unix, which was a skunkworks hobby project, so thinking "we'll solve it within the next 68 years" is perfectly reasonable.

They could have made it unsigned instead of signed, which would have made it work until 2106, but I think a 68-year horizon is more than most systems being built today have.


>They could have made it unsigned instead of signed

C actually didn't have unsigned integer types in the beginning. They were added years later, and not all at the same time; the Unix V7 C compiler, for example, only had "unsigned int".


If anything, I imagine guys like Ritchie never thought we'd be using a Unix-based system so far in the future. Back then OSes were a dime a dozen and the future of computing far too cloudy to predict.

>but I think a 68-year horizon is more than most systems being built today have.

That's a lot of time, especially if we see Linux breaking into the mainstream about 1995 or so. That's 43 years to worry about this. Meanwhile, we saw Microsoft break into the mainstream at around 1985, which only gave us 15 years to worry about Y2K.


> Back in 1970, no language had a 64-bit integer type.

It would be more accurate to say that "no language had a two-word integer type." 1960s CDC 6000-series machines had 60-bit words, and Maclisp got bignums sometime in late 1970 or early 1971.


https://en.wikipedia.org/wiki/Intel_8086

In the late 1970s, the cutting-edge microprocessor was 16-bit. The first 32-bit Intel chip was the 386, which debuted in 1985.

The TRS-80, a common small computer in the late 70s, offered 4 KB to 48 KB of RAM.

When using hardware with that capacity, overflowing time_t in 2038 is hardly a concern.


IBM ran a 128-bit virtualized architecture on top of 1988 chips.


Which was a backward-compatible extension of a 48-bit address space system built out of 1970s chips: http://bitsavers.trailing-edge.com/pdf/ibm/system38/IBM_Syst...


On mainframes, right? But Unix wasn't written to be a mainframe OS.


True, I was just noting that the parent's universal claim about dates and clocks wasn't so universal.


In Unix v1 (1971) it actually did not even track the year. The time system call was documented as "get time of year", and returned "the time since 00:00:00, Jan. 1, 1971, measured in sixtieths of a second" (https://www.bell-labs.com/usr/dmr/www/pdfs/man22.pdf). The operator had to set the time on each boot, and there was no option to set a year. The PDP-7 hardware could increment a counter every 1/60 second but only while it was powered on. Later the time was changed to whole seconds and redefined to be the time since January 1, 1970 00:00:00 GMT (now UTC), but was kept 32 bits.
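For scale, a back-of-the-envelope calculation of how quickly a sixtieths-of-a-second counter wraps, which presumably helped motivate the switch to whole seconds:

    #include <stdio.h>

    /* Lifetime of a tick counter running at 60 Hz, for a 32-bit count
       (two 16-bit words) and a 36-bit count (two 18-bit PDP-7 words). */
    int main(void) {
        double year = 365.25 * 86400.0;
        printf("32-bit: %.1f years\n", 4294967296.0  / 60.0 / year);  /* ~2.3  */
        printf("36-bit: %.1f years\n", 68719476736.0 / 60.0 / year);  /* ~36.3 */
        return 0;
    }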


The C version of Unix was written for a 16-bit processor (the PDP-11). The C compiler simulated 32-bit operations, but nothing bigger. 64-bit operations only became widespread much later, when 16-bit systems were no longer relevant and 32-bit systems got 'long long'. Note that POSIX allows time_t to be 64-bit, and as far as I know that's what OpenBSD does.
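If you're curious what your own system uses, a trivial check (nothing OpenBSD-specific about it):

    #include <stdio.h>
    #include <time.h>

    /* Prints 8 on systems with a 64-bit time_t, 4 on older 32-bit ones. */
    int main(void) {
        printf("sizeof(time_t) = %zu bytes\n", sizeof(time_t));
        return 0;
    }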


The reasoning I've heard is that back then memory and disk space were limited and they couldn't sacrifice the extra bytes.

For example, if every file stores three timestamps (mtime, ctime, and atime), then that's an extra 12 bytes per file to store a 64 bit timestamp vs a 32 bit timestamp. If your system has five thousand files on it, that's an extra 60 KB just for timestamps. In 1970, RAM cost hundreds of dollars per KB [1], so this savings was significant.

[1] http://www.statisticbrain.com/average-historic-price-of-ram/
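That works out as follows (a minimal sketch; the struct names here are made up purely for illustration):

    #include <stdio.h>
    #include <stdint.h>

    /* Hypothetical per-file timestamp storage, 32-bit vs 64-bit fields. */
    struct stamps32 { int32_t atime, mtime, ctime; };   /* 12 bytes */
    struct stamps64 { int64_t atime, mtime, ctime; };   /* 24 bytes */

    int main(void) {
        size_t files = 5000;
        printf("32-bit: %zu bytes\n", files * sizeof(struct stamps32));  /*  60000 */
        printf("64-bit: %zu bytes\n", files * sizeof(struct stamps64));  /* 120000 */
        return 0;
    }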


I'm not convinced that the problem was considered in those terms. Imagine that there was a meeting where someone said "I'm going to make time_t 64-bits because if I don't it will mean all software will break in unfortunate ways in the year 2038", and someone else said "Yeah, that's something to be concerned about but we can't do that because memory and disk space is at present too expensive to allow it". Well, I'm confident that no such meeting occurred because nobody back in the early 70's was thinking that way at all. The thinking would be more like "Ok, the last OS I worked on used a 32-bit int, so ho hum...there we are... time_t, move on to code the next thing...".


Yes, people absolutely cared about bits and bytes, because there weren't very many of them. (Programmers weren't necessarily thinking of them as monetarily expensive, because even today you don't just go slamming more RAM into your machine if you need more. The problem is that there were only so many of them.) You could still see the residual hacker attitudes even five years ago, though I'd have to call them mostly dead now. But they were absolutely counting bits and bytes all the time, by default, in a way few programmers nowadays can appreciate.

It's why we have "creat" instead of "create", and why file permissions are tightly packed into three octal digits (one of the old systems Unix ran on was a fan of 36-bit machine words, so 9 bits divided things more evenly at the time). It's why C strings are null-terminated instead of the in-every-way-more-sensible length-delimited kind, except that length-delimited strings need one extra byte if you want to support lengths in the 256-65535 range. Yes, the programmers of that time would rather save one byte per string than have a safe string library. Pre-OSX Mac programmers can tell you all about dealing with one-byte-length-delimited strings and how often things ended up accidentally truncated at 255 chars.
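To make the length-prefix limitation concrete, here's a rough sketch (not any particular historical API) of a Pascal-style string: with a one-byte length field, anything longer than 255 characters has nowhere to go.

    #include <string.h>

    /* Rough sketch of a Pascal-style string: a one-byte length prefix
       means nothing longer than 255 characters can be represented. */
    struct pstring {
        unsigned char len;        /* 0..255 */
        char          data[255];
    };

    static void pstring_set(struct pstring *p, const char *s) {
        size_t n = strlen(s);
        if (n > 255)
            n = 255;              /* the infamous silent truncation */
        p->len = (unsigned char)n;
        memcpy(p->data, s, n);
    }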

In an era where "mainframes" shipped with dozens of kilobytes of RAM, yeah, they cared.


>even today you don't just go slamming more RAM into your machine if you need more

Hmm, at every software gig I've had in the past 5 years that's exactly what I've been expected to do, because the extra ten bucks a month for a bigger VM is wayyy less expensive than engineering time. Interesting times.


I don't think the previous user is saying that no-one cared about space, just that no-one cared about 2038. So that conversation wouldn't have happened anyway.


Indeed. There was no carefully considered trade-off made between storage space and brokenness in 70 years time. Nobody thought like that. Nobody would have expected their code and data to be remotely relevant that far into the future. People wrote code according to present-day norms which would have included using a 32-bit integer for time.


Whenever I do embedded work with counters, every time I assign a variable that could overflow I do a little mental count of how likely that is. That's part of the software development process.

They may not have had a meeting about it, but I think it's exceedingly unlikely that whoever decided to assign a 32 bit int to store time didn't give some consideration to the date range it could represent. Otherwise how would they know not to use a 16 bit int?


They didn't design it in a vacuum. They had already worked on other OSes that used 32-bit ints with a one-second quantum, and they (probably subconsciously) thought that if it had been good enough for those other systems it was good enough for Unix.


PDP-11:

https://en.wikipedia.org/wiki/PDP-11_architecture#CPU_regist...

Six general-purpose 16-bit registers (R0-R5), plus the stack pointer and program counter.

So, the ability to operate on one 64-bit number and some change, if it were all loaded at once.

How many instructions do you think it would take to add or subtract two 64-bit integers on such a machine, versus two 32-bit integers?

Not to mention having to implement and debug this logic in assembly on a teletype, versus using a native instruction (see "Extended Instruction Set (EIS)" in the same link).
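For a feel of the cost: a C sketch (not actual PDP-11 assembly) of what a 64-bit add reduces to when all you have are 16-bit words - four limb additions with explicit carry propagation, versus two for a 32-bit value.

    #include <stdint.h>

    /* Sketch: add two 64-bit values held as four 16-bit words each
       (word 0 = least significant), carrying by hand the way a 16-bit
       machine without native 64-bit support has to. */
    static void add64_as_16bit_words(uint16_t a[4], const uint16_t b[4]) {
        uint32_t carry = 0;
        for (int i = 0; i < 4; i++) {
            uint32_t sum = (uint32_t)a[i] + b[i] + carry;
            a[i]  = (uint16_t)sum;
            carry = sum >> 16;
        }
    }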

No one would have considered 64 bits at all, because it would have been a huge hassle and not worth it, even beyond thinking ahead in this way.

Besides, if 'the last OS I worked on' was, give or take, the first or second interactive timesharing system ever written (e.g. MULTICS or ITS), and I worked on it at a low level, because that's what people did, chances are I might have talked to the person who came up with how to store time on that system, who conceivably was the 2nd or 3rd person ever to implement this at all. And if that's the case, don't you think that person would have thought about it somewhat?

Programmers at that time were often much better at these things than programmers are now.

See also: http://catb.org/jargon/html/story-of-mel.html

(which itself was posted in 1983 concerning the same topic...)

I'd suggest spinning up some SIMH VMs and mucking around for a while with early Unices (V5, V7, 32V, 4.3BSD), and probably ITS or TOPS-10/TWENEX as well; it is quite illuminating.


Back in the 70s RAM was expensive, and 68 years is long enough that they figured they would have a solution in place long before it became an issue.

When your machine has a 16-bit processor and a few dozen kilobytes of RAM, you look to save wherever you can. 64-bit number support was primitive and quite slow as well.


When you're mainly working with 32-bit CPUs (or smaller) and the year of overflow is almost 70 years in the future, I can forgive them for considering it good enough at the time. Maybe they thought that by the time it became an issue somebody else would've replaced it?

It's in the same bag as IPv4 "only" supporting a few billion addresses, hindsight is always 20/20...

Moreover, even 64-bit timestamps wouldn't be good enough for certain applications that require sub-second precision. PTP (the Precision Time Protocol), for instance, uses 96-bit timestamps to get nanosecond granularity. You always have to compromise one way or another.
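To illustrate the shape of that compromise (this is modelled on struct timespec, not the exact PTP wire format): once you want sub-second precision you split the timestamp into a seconds field and a fractional field, and the widths of both are the knobs you trade off.

    #include <stdint.h>

    /* Sketch of a split timestamp, similar in spirit to struct timespec;
       not the actual PTP wire format. 96 bits of payload in total. */
    struct ts_ns {
        int64_t  seconds;       /* whole seconds since some epoch */
        uint32_t nanoseconds;   /* 0..999999999 */
    };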


IPv4 as designed didn't support anything like a few billion addresses. We had to invent CIDR to get there, years later.


Are you sure about that? If we're talking about IPv4 as specified in RFC 791 [0] (dated September 1981), it seems to support billions of addresses already:

> Addresses are fixed length of four octets (32 bits). An address begins with a network number, followed by local address (called the "rest" field). There are three formats or classes of internet addresses: in class a, the high order bit is zero, the next 7 bits are the network, and the last 24 bits are the local address; [...]

A 7-bit network field times 24-bit local addresses is already more than two billion.

[0] https://tools.ietf.org/html/rfc791


It supported billions of addresses, but only millions of networks. The registry could only give out 128 chunks of 16 million addresses, 16,384 chunks of 64k addresses, and ~2M chunks of 256 addresses.
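Putting rough numbers on that split (a quick tally assuming the class A/B/C carve-up described above):

    #include <stdio.h>

    /* Classful IPv4 tally: lots of addresses, not many networks. */
    int main(void) {
        unsigned long a_nets = 128UL,     a_hosts = 1UL << 24;  /* class A */
        unsigned long b_nets = 16384UL,   b_hosts = 1UL << 16;  /* class B */
        unsigned long c_nets = 2097152UL, c_hosts = 1UL << 8;   /* class C */

        unsigned long nets  = a_nets + b_nets + c_nets;
        unsigned long addrs = a_nets * a_hosts + b_nets * b_hosts
                            + c_nets * c_hosts;

        printf("networks:  %lu (~2.1 million)\n", nets);       /* 2113664    */
        printf("addresses: %lu (~3.8 billion)\n", addrs);      /* 3758096384 */
        return 0;
    }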

IPv4 was running out of class B's, those 64k address chunks, when CIDR was introduced.


I'm sure because I was there, at the meetings when CIDR was proposed and adopted.


> Is there a reason why they decided to store time as seconds from 1970?

Pretty sure it was related to space being an issue. In every place where you needed to store a time, you likely didn't want to use more space than you had to. That was also a driving factor in why years were stored with only the last two digits.

In 2017 we have no problem, storage-wise, making it a 64-bit integer. But in the 90s and earlier? I think it would have been a hard sell to make a change that would future-proof them beyond 2038, especially when so many play the short-term money game.


You are operating under the assumption that the extra 4 bytes were an insignificant cost. That was not true for much of early UNIX history.


How many of the other data structure choices that were made in the early 1970s didn't need to be changed for 40 years or so?

A choice that gets you 40 years down the road, instead of millions of years down the road is a good choice, when you don't even know if you're going to have roads in 40 years.



