This is a great and very interesting write-up. It has never occurred to me that there might be algorithms applicable to calendars.
It strikes me that the Gregorian calendar might encode some of the most calcified tech debt of humanity. It attempts to fit solar, lunar, and 7-day religious-oriented cycles (and who knows what else?) into one system. So many cultural and religious touchstones are oriented around the (month, day) tuple. For extra confusion add the asymmetry between cultures who place traditions on the lunar calendar. Let's not mention timezones. A big thanks to you keepers of time (libraries).
Deciding that time began with Unix and that it should simply count forward seems rather sane by comparison.
Yup. And directly related to that, the absurdity of having the second month be the one that has weird numbers of days and gets a regular extra day to keep the calendar in sync with Earth's revolutions. Adding or removing a day from the end of the year is the obvious choice, so why not do it there?
Turns out, people who came up with the leap year weren't stupid - the extra day used to be tacked to the end of a year. But then the thing you mentioned happen, the calendar was rotated right (in the positional arithmetic sense) by two, and what used to be the last month of the year is now called February.
Yeah, I never understood why they couldn't make it Dec-Feb-Jan with January at the end of the year so that Janus could still be the god of transitions, but I guess I'm no theologian so there is probably a reason.
The article leaves out optimizing is_leap_year. Here is an interesting version that is correct (in the Proleptic Gregorian calendar) for 0 <= year <= 102499 and takes only 3 instructions:
((year * 1073750999) & 3221352463) <= 126976
How this works is surprisingly complex, I am thinking of writing a whole blog post about it...
Historical minor quibble - formula correct for 1582 <= year <= 102499 (well, it's actually correct down to 1201, but the gregorian calendar didn't start until oct 1582... and then you get into the quagmire about when each country adopted the gregorian calendar, if they ever did)
The GP said the proleptic Gregorian calendar, so they are correct. (Assuming their formula is correct, which I haven't checked.) If they had said, "here's how humans in Western civilization could calculate the leap year of their current calendar back to year zero," then your quibble (and probably others) would be warranted. :-)
A less radical optimization that still might improve on the original:
const fn ordinal_date_to_calendar_date(year: i32, ordinal: u16) -> (i32, u8, u8) {
/// The number of days up to and including the given month. Common years
/// are first, followed by leap years.
const CUMULATIVE_DAYS_IN_MONTH_COMMON_LEAP: [[u16; 13]; 2] = [
[0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365],
[0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366],
];
let days = CUMULATIVE_DAYS_IN_MONTH_COMMON_LEAP[is_leap_year(year) as usize];
// Approximation which is either exactly right or one too small.
let mut month = (ordinal - 1) / 31;
// Fix the approximation if necessary.
if ordinal > days[(month + 1) as usize] {
month += 1;
}
(year, (month + 1) as u8, (ordinal - days[month as usize]) as u8)
}
The basic idea is to approximate simply by dividing by 31, then correct the error.
Because no month has more than 31 days, the approximation cannot be too large. And because only a little error accumulates, if it is too small, it will only be too small by 1. So only a single check is needed to correct the error.
Also, if you want to avoid a divide instruction, you can switch to:
let mut month (ordinal - 1) >> 5;
Then you're dividing by 32 but the error is still small enough for it to work the same.
(Note that there's no bounds checking, which might be a regression compared to the original. Also, I've made the arrays bigger to get rid of special cases.)
Nice writeup and algorithm. My first instinct would have been to take the lazy way out and just use a table with 365 entries (or 366, or both depending on the implementation), but this is a lot more elegant.
It is still bit odd considering tha in the actual benchmarks the preliminary version is significantly faster than the old one. So there seems to be fairly large discrepancy between the cycles estimation and actual runtime
It strikes me that the Gregorian calendar might encode some of the most calcified tech debt of humanity. It attempts to fit solar, lunar, and 7-day religious-oriented cycles (and who knows what else?) into one system. So many cultural and religious touchstones are oriented around the (month, day) tuple. For extra confusion add the asymmetry between cultures who place traditions on the lunar calendar. Let's not mention timezones. A big thanks to you keepers of time (libraries).
Deciding that time began with Unix and that it should simply count forward seems rather sane by comparison.