History of Zero-Based Months? (jefftk.com)
52 points by expjpi 3 months ago | 28 comments

"Why is day of the month 1-indexed but the month is 0-indexed in C?"

This Twitter thread from November 2020[1] and its Hacker News discussion[2] seem relevant.

1: https://twitter.com/hillelogram/status/1329228419628998665

2: https://news.ycombinator.com/item?id=25195287

Why don't we enumerate days throughout the year?

Probably because most calendars started as lunar or lunar-based. Separating the year into months could also have helped farmers back in the day plan different activities throughout the year. I've always liked the names of the months in the French Republican Calendar[1] because of that.

[1]: https://en.wikipedia.org/wiki/French_Republican_calendar#Mon...

Excellent historical exploration of the topic => https://youtu.be/iBRCL090PxA

Because we don't.

Which is probably because it's too fine a granularity, especially historically: even an ordinal month-day had limited use in preindustrial contexts, where time boundaries were necessarily quite fuzzy owing to the vagaries of communications or transport.

Technically you don't need years either, but chunky boundaries are useful as both reference points and communication shortcuts.

We enumerate milliseconds since 1 Jan 1970 12am UTC instead. You can convert to all the other formats.
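For what it's worth, here is a minimal sketch of that round trip in JavaScript. The helper names `toTimestamp` and `toCalendar` are made up for illustration. One assumption to flag: JavaScript time values treat every day as exactly 86,400,000 ms, so leap seconds are not represented at all.

```javascript
// Milliseconds since 1970-01-01T00:00:00Z for a UTC calendar date.
// Date.UTC takes a zero-based month, hence `month - 1`.
function toTimestamp(year, month, day) {
  return Date.UTC(year, month - 1, day);
}

// Back from a millisecond count to one-based calendar fields (UTC).
// JavaScript time values ignore leap seconds entirely.
function toCalendar(ms) {
  const d = new Date(ms);
  return {
    year: d.getUTCFullYear(),
    month: d.getUTCMonth() + 1, // convert zero-based month back to 1..12
    day: d.getUTCDate(),
  };
}

console.log(new Date(0).toISOString()); // 1970-01-01T00:00:00.000Z
console.log(toCalendar(toTimestamp(2022, 8, 24))); // { year: 2022, month: 8, day: 24 }
```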

Well, given a database of leap seconds you can. And you can't convert times in the future between Unix time and calendar time.

Times in the future are always unreliable. Even with the "normal" system there is no guarantee that a specific time will even exist, and if it does you still need to ask whether it is an absolute event or a relative event, and whether its time needs to be updated as a result, just the other way around.

Sure. But the important difference is: I can express the time "2nd of March 2056, 15:00 UTC", we can talk about that future time, and we can tell whether we're before or after that time. But I can't express the same time in unix time, because I don't know how many leap seconds will be between now and March 2056.

Sure you can, just allow for sticking the things you want to track against at the end like you did after saying "2nd of March 2056, 15:00". If that's leap seconds then there is no reason you can't say "<future timestamp> with leap second changes" just the same as you did there with a datetime. What you can do with a timestamp is a superset of what you can do with a datetime since you start with fewer limitations but lose none of the ability to apply or mark needed transformations to that base:

Guarantees about <future date> UTC:

- Can calculate duration from past to <future date>: No (need future leap second table)

- Can calculate duration from now to <future date>: No (need future leap second table)

- Can calculate duration from future to <future date>: No (need future leap second table)

- Can compare <future date> to <future date>: No (UTC is not monotonic)

- Can know if <future date> is unique: No (UTC is not monotonic)

- Can determine <future date> has passed: After passing

- Can determine <future date> will occur: After passing (time skip adjustments and whole date skips allowed in calendar)

Guarantees about <future date> timestamp:

- Can calculate duration from past to <future date>: Yes

- Can calculate duration from now to <future date>: Yes

- Can calculate duration from future to <future date>: Yes

- Can compare <future date> to <future date>: Yes

- Can know if <future date> is unique: Yes

- Can determine <future date> has passed: Yes

- Can determine <future date> will occur: Yes

Once you add "accounting for leap seconds" to the timestamp, all but "Can determine <future date> will occur" will match up. UTC didn't gain anything over timestamps in that case; the difference between the two just shrank.

While you may be able to say "I know what the UTC encoding will look like ahead of time if I specify it in UTC encoding", that doesn't actually give you any more information about the time, as you can't do anything functional with that knowledge until the time passes. At that point, provided the same info you had to accurately track the UTC time, you have enough information to do the same with a timestamp.

You can if you want: new Date(2022, 0, 236) is the same as new Date(2022, 7, 24)
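A small sketch of that trick, relying on the `Date` constructor normalizing out-of-range day values. The helper names `fromDayOfYear` and `toDayOfYear` are hypothetical, and everything here is in local time.

```javascript
// Passing month 0 (January) with day N yields the Nth day of the year,
// because the Date constructor rolls overflowing days into later months.
function fromDayOfYear(year, dayOfYear) {
  return new Date(year, 0, dayOfYear);
}

// Inverse: 1-based ordinal day of the year for a local date.
function toDayOfYear(date) {
  // "January 0" normalizes to December 31 of the previous year.
  const startOfYear = new Date(date.getFullYear(), 0, 0);
  const msPerDay = 24 * 60 * 60 * 1000;
  // Math.round absorbs the one-hour offset a DST change can introduce.
  return Math.round((date - startOfYear) / msPerDay);
}

const d = fromDayOfYear(2022, 236);
console.log(d.getMonth(), d.getDate()); // 7 24 (zero-based month 7 is August)
console.log(toDayOfYear(new Date(2022, 7, 24))); // 236
```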

At least in Perl, the rationale for this (apart from copying C) is that it's very easy to reference a list (array) of month names if the month indices are zero-based.

How difficult is it to either subtract 1 before indexing your array, or have a 13 element array with a dummy value in element 0? Deprecate time.h and adopt ISO 8601 already.

Or put Dec a second time, at 0, and then % 12 works.
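One way to read that suggestion, as a sketch: a 12-element table with December at index 0, looked up with the one-based month number modulo 12. `MONTH_NAMES` and `monthName` are names invented for this example.

```javascript
// December sits at index 0, so a one-based month (1..12) reduced
// modulo 12 lands on the right name: 12 % 12 === 0.
const MONTH_NAMES = [
  "Dec", "Jan", "Feb", "Mar", "Apr", "May",
  "Jun", "Jul", "Aug", "Sep", "Oct", "Nov",
];

function monthName(oneBasedMonth) {
  if (!Number.isInteger(oneBasedMonth) || oneBasedMonth < 1 || oneBasedMonth > 12) {
    throw new RangeError(`month out of range: ${oneBasedMonth}`);
  }
  return MONTH_NAMES[oneBasedMonth % 12];
}

console.log(monthName(1));  // Jan
console.log(monthName(12)); // Dec
```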

About as difficult as it is to add 1 before printing the month as a number.

Zero-based months are awkward when printing as a number, but convenient when indexing an array:

    const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
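To make both halves of the trade-off concrete, a minimal sketch using the standard `Date` API, whose `getMonth()` is zero-based. The variable names are illustrative only.

```javascript
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

const date = new Date(2022, 7, 24); // month argument 7 means August

// Convenient: the zero-based month indexes the name table directly.
const name = MONTHS[date.getMonth()];

// Awkward: the same value needs +1 before it is shown as a number.
const numeric = String(date.getMonth() + 1).padStart(2, "0");

console.log(name, numeric); // Aug 08
```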

Is subtracting 1 really that much of an inconvenience, given how awkward they are?

Yes, that was always my theory for why months are 0-based. Back in the days of early Unix, this might have had a tiny performance benefit that was enough for them to choose that implementation.

I think that it should be unnecessary. The address of the array could be adjusted at compile-time so that one-based index numbers will be possible.

I think people are overestimating the features of compilers. Today's compilers might struggle with that. 70's compilers even more.

The same justification exists in JS (where it's copied from Java, which copied it from C). Interestingly, though, checking the FORTRAN IV specification documents for the PDP-10[^1], it's not implemented as a 3-letter ASCII abbreviation. That document dates to 1975; I don't know if I can find out whether the date exists in the first version of the document, which should date to 1967. I was unable to find a reference to a builtin in the base FORTRAN II, which only provides 20 builtin functions, date not making the list. I think newer versions of FORTRAN77 have idate, which is 1-indexed, but I couldn't find it in the older standard listed among wg5's specification documents.

[^1]: https://github.com/PDP-10/f40/tree/master/doc

I think it all comes down to whether you're counting elements as the primary goal. However, months are poorly defined relative to something like an astronomical year or a standard day, so they're more like objects. That is to say, adding different months doesn't give you the same number of days as a result.

Hence, dealing with months is perhaps a bit more like indexing variable-sized objects, and for that purpose traditionally, the array-of-pointers approach uses zero-based pointer arithmetic, which is what the designers might have been thinking.

Really however, the months should probably be treated more like structs, with the number of days in that month being a data member. This would allow sanity checks, i.e. entering Feb 30 should raise an error, but not May 30.
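A rough sketch of that idea, assuming the Gregorian leap-year rule and zero-based month indices. `MONTH_TABLE`, `daysInMonth`, and `checkedDate` are hypothetical names, not an existing API.

```javascript
// Each month is a record carrying its own length. February's length
// depends on the year, so the stored value is the non-leap length.
const MONTH_TABLE = [
  { name: "Jan", days: 31 }, { name: "Feb", days: 28 }, { name: "Mar", days: 31 },
  { name: "Apr", days: 30 }, { name: "May", days: 31 }, { name: "Jun", days: 30 },
  { name: "Jul", days: 31 }, { name: "Aug", days: 31 }, { name: "Sep", days: 30 },
  { name: "Oct", days: 31 }, { name: "Nov", days: 30 }, { name: "Dec", days: 31 },
];

// Gregorian rule: every 4th year, except centuries not divisible by 400.
function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

function daysInMonth(year, monthIndex) {
  const month = MONTH_TABLE[monthIndex];
  if (month === undefined) {
    throw new RangeError(`month index out of range: ${monthIndex}`);
  }
  return month.name === "Feb" && isLeapYear(year) ? 29 : month.days;
}

// Returns a plain record, or throws if the day does not exist.
function checkedDate(year, monthIndex, day) {
  const limit = daysInMonth(year, monthIndex);
  if (!Number.isInteger(day) || day < 1 || day > limit) {
    throw new RangeError(`${MONTH_TABLE[monthIndex].name} ${day} does not exist in ${year}`);
  }
  return { year, monthIndex, day };
}

console.log(checkedDate(2022, 4, 30)); // May 30 is accepted
try {
  checkedDate(2022, 1, 30);            // Feb 30 is rejected
} catch (e) {
  console.log(e.message);
}
```

Note that the built-in `Date` constructor does the opposite: it silently rolls Feb 30 over into March instead of raising an error.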

I think it's important to be able to work with mod (%) 12 for some operations on the year <--> month relationship. All years have 12 months. For day of the month you need additional logic to handle different lengths... and other issues.
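As an illustration of the mod 12 point, here is a sketch of month arithmetic on a (year, zero-based month) pair, with a one-based wrapper for contrast. Both function names are made up for this example.

```javascript
// Add (or subtract) months to a (year, zero-based month) pair.
function addMonths(year, monthIndex, delta) {
  const total = year * 12 + monthIndex + delta; // months counted from year 0, month 0
  const newMonth = ((total % 12) + 12) % 12;    // keep the remainder non-negative
  const newYear = (total - newMonth) / 12;
  return { year: newYear, monthIndex: newMonth };
}

// With one-based months the same logic needs a -1 going in and a +1 coming out.
function addMonthsOneBased(year, month, delta) {
  const r = addMonths(year, month - 1, delta);
  return { year: r.year, month: r.monthIndex + 1 };
}

console.log(addMonths(2022, 10, 3));        // { year: 2023, monthIndex: 1 }
console.log(addMonthsOneBased(2022, 1, -1)); // { year: 2021, month: 12 }
```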

This reason makes sense. Note that different uses will have different useful conventions, and converting between them may be necessary. (Although this is true of more than only month numbering.)

> One-based indexing for the year and day,

Year is 0 based too

Year 0 doesn’t exist [1], and I wouldn’t regard years as being indexed. How would you represent 1 BC, with a negative index?

[1] https://en.m.wikipedia.org/wiki/Year_zero

Year zero is possible with astronomical year numbering. In that scheme, 1 AD is +1, 1 BC is 0, and 2 BC is -1.
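That mapping is a one-line formula in each direction. A sketch, with `toAstronomicalYear` and `fromAstronomicalYear` as illustrative names:

```javascript
// Era label -> astronomical year number: n AD maps to n, n BC maps to 1 - n.
function toAstronomicalYear(year, era) {
  if (!Number.isInteger(year) || year < 1) {
    throw new RangeError("era years start at 1; there is no year 0 AD or BC");
  }
  if (era === "AD") return year;
  if (era === "BC") return 1 - year;
  throw new Error(`unknown era: ${era}`);
}

// Astronomical year number -> era label.
function fromAstronomicalYear(n) {
  return n >= 1 ? { year: n, era: "AD" } : { year: 1 - n, era: "BC" };
}

console.log(toAstronomicalYear(1, "AD")); // 1
console.log(toAstronomicalYear(1, "BC")); // 0
console.log(toAstronomicalYear(2, "BC")); // -1
console.log(fromAstronomicalYear(0));     // { year: 1, era: 'BC' }
```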
