They needed to have a locale matching the language of the localised time string they wanted to parse, they needed to use strptime to parse the string, they needed to use timegm() to convert the result to seconds when seen as UTC. The man pages pretty much describe these things.
The interface or these things could certainly be nicer, but most of the things they bring up as issues aren't even relevant for the task they're trying to do. Why do they talk about daylight savings time being confusing when they're only trying to deal with UTC which doesn't have it?
int main(void) {
struct tm tm = {0};
const char *time_str = "Mon, 20 Jan 2025 06:07:07 GMT";
const char *fmt = "%a, %d %b %Y %H:%M:%S GMT";
// Parse the time string
if (strptime(time_str, fmt, &tm) == NULL) {
fprintf(stderr, "Error parsing time\n");
return 1;
}
// Convert to Unix timestamp (UTC)
time_t timestamp = timegm(&tm);
if (timestamp == -1) {
fprintf(stderr, "Error converting to timestamp\n");
return 1;
}
printf("Unix timestamp: %ld\n", timestamp);
return 0;
}
It is a C99 code snippet that parses the UTC time string and safely converts it to a Unix timestamp and it follows best practices from the SEI CERT C standard, avoiding locale and timezone issues by using UTC and timegm().
You can avoids pitfalls of mktime() by using timegm() which directly works with UTC time.
I can't find `timegm` neither in the C99 standard draft nor in POSIX.1-2024.
The first sentence of your link reads:
>The C/Unix time- and date-handling API is a confusing jungle full of the corpses of failed experiments and various other traps for the unwary, many of them resulting from design decisions that may have been defensible when the originals were written but appear at best puzzling today.
timegm was finally standardized by C23, and POSIX-2024 mentions it in the FUTURE DIRECTIONS section of mktime. I don't know precisely what happened with POSIX. I think timegm got lost in the shuffle and by the time Austin Group attention turned back to it, it made more sense to let C23 pick it up first so there were no accidental conflicts in specification.[1]
[1] POSIX-2024 incorporates C17, not C23, but in practice the typical POSIX environment going forward will likely be targeting POSIX-2024 + C23, or just POSIX-2024 + extensions; and hopefully neither POSIX nor C will wait as long between standard updates as previously.
Yeah, you're correct that `timegm` is neither part of the C99 standard nor officially specified in POSIX.1-2024 but it is widely supported in practice on many platforms, including glibc, musl, and BSD systems which makes it a pragmatic choice in environments where it is available. Additionally, it is easy to implement it in a portable way when unavailable.
So, while `timegm` is not standardized in C99 or POSIX, it is a practical solution in most real-world environments, and alternatives exist for portability, and thus: handling time in C is not inherently a struggle.
As for the link, it says "You may want to bite the bullet and use timegm(3), even though it’s nominally not portable.", but see what I wrote above.
Here is some of my code that works around not having timegm. It is detected in a configure script, so there's a #define symbol indicating whether it's available.
m4conf obviously has messy configure scripts; they are just written in m4. For instance, it somes with a 120 kilobyte file m4sugar which is fully of m4 cruft.
Oh look, m4sugar.m4 is taken from GNU Autoconf and is GPLv3. The. m4conf project's master license doesn't mention this; it's a "BSD1" type license (like BSD2, but with no requirement for the copyright notice to appear in documentation or accompanying materials). Oops!
m4sugar says that it requires GNU m4, not just any POSIX m4.
I wrote the configure script in shell because that's what I decided I can depend on being present in the target systems. I deliberately rejected any approach involving m4 to show that a GNU style configure script can be obtained in as straightforward way, without convoluted metaprogramming.
There is a lot of copy-paste programming in my configure script, but it doesn't have to be that way; a script of this type can be better organized than my example. Anyway, you're not significantly disadvantaged writing in shell compared to m4, especially if you're mostly interested in probing the environment, not complex text generation.
I don't think that it's enough to test for header files being present. Almost all my tests target library features: including a header, calling some functions and actually linking a program. The contents of headers vary from system to system and with compiler options.
While m4sugar is from GNU Autoconf (which is GPLv3), it's properly listed as a dependency and ISC is GPL-compatible, so there's no licensing issue that I know of. The code explicitly seems to avoid GNU m4-specific features for BSD compatibility, and it goes well beyond header-only testing. While shell scripting is a valid approach, m4conf provides structured macros for maintainable configuration and advanced features like runtime CPU capability detection. As far as I can tell, the code is well-documented and organized, though you raise fair points about the tradeoffs between shell and m4 approaches, but as others have mentioned it (and seen in example m4 file), it is extremely simple to use and works with third-party libraries as well.
m4conf is not using anything requiring GNU stuff though, and it has been tested with BSD m4 as well. It is noted in base.m4.
The license of m4conf itself is ISC.
m4sugar could be vetted so it is can become less than 120 KB.
I don't know if it is messy, look at the example configuration file. For me, it is more straightforward and less bloated than autotools, for example.
> I don't think that it's enough to test for header files being present. Almost all my tests target library features: including a header, calling some functions and actually linking a program. The contents of headers vary from system to system and with compiler options.
It’s twelve lines or more, if you include the imports and error handling.
Spreadsheets and SQL will coerce a string to a date without even being asked to. You might want something more structured than that, but you should be able to do it in far less than 12 lines.
C has many clunky elements like this, which makes working with it like pulling teeth.
My personal rule for time processing: use the language-provided libraries for ONLY 2 operations: converting back and forth between a formatted time string with a time zone, and a Unix epoch timestamp. Perform all other time processing in your own code based on those 2 operations, and whenever you start with a new language or framework, just learn those 2.
I've wasted so many dreary hours trying to figure out crappy time processing APIs and libraries. Never again!
Adding or subtracting "months" is inherently difficult because months don't have set lengths, varying from 28 through 31 days. Thus adding one month to May 31 is weird: should that be June 30 or July 1 or some other date?
Try not to have to do this sort of thing. You might have to though, and then you'll have to figure out what adding months means for your app.
Welcome to Business Logic. This is where I'd really like pushback to result in things that aren't edgecases.
However you also run into day to day business issues like:
* What if it's now a Holiday and things are closed?
* What if it's some commonly busy time like winter break? (Not quite a single holiday)
* What if a disaster of somekind (even just a burst waterpipe) halts operations in an unplanned way?
Usually flexability needs to be built in. It can be fine to 'target' +3 months, but specify it as something like +3m(-0d:+2w) (so, add '3 months' ignoring the day of month, clamp dom to a valid value, allow 0 days before or 14 days after),
I think the parent is describing a "bring your own library" approach where a set of known to the author algorithms will be used for those calculations and the only thing the host language will be used for is the parse/convert.
It does remove a lot of the ambiguity of "I wonder what this stdlib's quirks are in their date calculations" but it also seems like a non-trivial amount of effort to port every time.
The difficulty of this problem rests on the ambiguity of the phrase "exactly 6 months", which is going to depend totally on the precise business logic. But there's no reason to suppose that the requirements of the business logic will agree with the concepts implemented by the datetime library.
The concept of a process-wide locale was a mistake. All locale-dependent functons should be explicit. Yes that means some programs won't respect your locale because the author didn't care to add support but at least they won't break in unexpected ways because some functions magically work differently between the user's and developers system.
Totally agree. Python's gettext() API feels so ancient because it can only cope with one locale at a time, and it would love to get that locale from an environment variable. Not ideal for writing an HTTP service that sends text based on the Accept-Language header.
The headline doesn’t match the article. As it points out, C++20 has a very nice, and portable, time library. I quibble with the article here, though: in 2025, C++20 is widely available.
The first rule of thumb is to never use functions from glibc (gmtime, localtime, mktime, etc) because half of them are non-thread-safe, and another half use a global mutex, and they are unreasonably slow. The second rule of thumb is to never use functions from C++, because iostreams are slow, and a stringstream can lead to a silent data loss if an exception is thrown during memory allocation.
I think that time handling is the most hard thing in the world of programming.
Explanation: you can learn heap sort or FFT or whatever algorithm there is and implement it. But writing your own calendar from scratch, that will do for example chron job on 3 am in the day of DST transition, that works in every TZ, is a work for many people and many months if not years...
Time handling is exceptionally easy. Time zone handling is hard. It doesn't help that the timezone database isn't actually designed to make this any easier.
Meanwhile I edited my comment but we're still agreeing. And adding them for example to embedded systems is additional pain. Example: tram or train electronic boards / screens
I don’t know. I’ve written that seemed like obvious simple code that got tripped up with the 25 hour day on DST transition. That’s when I learned to stick to UTC.
Debian’s vixie-cron had a bug [0] where if the system TZ was changed without restarting crond, it would continue to run jobs based on the old TZ. It checked for DST transitions, but not TZ.
In fairness, it’s not something that should happen much at all, if ever.
Assuming the unstated requirement that you want your cron job to only run once per day, scheduling for 3 am is not a software problem. It's a lack of understanding by the person problem. By definition times around the time change can occur twice or not at all. Also, in the US 3am would never be a problem as the time changes at 2 am.
Also, naming things, cache coherency, and off by one errors are the two hardest problems in computer science.
For those skimmimg the problem is mktime() returns local time, and they want it in UTC. So you need to subtract the timezone used, but the timezone varies by date you feed mktime() and there is no easy way to determime it.
If you are happy for the time to perhaps be wrong around the hours timezone changes, this is an easy hack:
import time
def time_mktime_utc(_tuple):
result = time.mktime(_tuple[:-1] + (0,))
return result * 2 - time.mktime(time.gmtime(result))
If you are just using it for display this is usually fine as time zone changes are usually timed to happen when nobody is looking.
And the answer is to use `gmtime()`, which AIX doesn't have and which Windows calls something else, but, whatever, if you need to support AIX you can use an open source library.
No, mktime() doesn't parse a string. Parsing the string is done by strptime(). mktime() takes the output of strptime(), which is a C structure or the equivalent in Python - a named tuple with the same fields.
> the time string lacks any information on time zones
Not necessarily. Time strings often contain a time zone. The string you happen to be parsing doesn't contain a time zone you could always append one. If it did have a time zone you could always change it to UTC. So this isn't the problem either.
The root cause of the issue is the "struct tm" that strptime() outputs didn't have field for the time zone so if the string has one, it is lost. mktime() needs that missing piece of information. It solves that problem by assuming the missing time zone is local time.
> then the article uses timegm() to convert it to unixtime on the assumption that it was in UTC
It does, but timegm() is not a POSIX function so isn't available on most platforms. gmtime() is a POSIX function and is available everywhere. It doesn't convert a "struct tm", but it does allow you to solve the core problem the article labours over, which is finding out what time zone offset mktime() used. With that piece of information it's trivial to convert to UTC, as the above code demonstrates in 2 lines.
> also it's about C
The python "time" module is a very thin wrapper around the POSIX libc functions and structures. There is a one to one correspondence, mostly with the same names. Consequently any experienced C programmer will be able translate the above python to C. I chose Python because it expresses the same algorithm much more concisely.
Jeez, read the article. C++20 has such an elegant solution that it has him swooning.
Being truly luxurious, the tz library supports using not just your operating system’s time zone databases, which might lack crucial leap second detail, but can also source the IANA tzdb directly. This allows you to faithfully calculate the actual duration of a plane flight in 1978 that not only passed through a DST change, but also an actual leap second. I don’t swoon easily, but I’m swooning.
That makes no sense in this context. What's the situation you're imagining where they wanted to use C but something else prevented them so they made up an excuse to call C bad?
You know sometimes people just dislike things, right?
That is fine as long as the input / output is always in UTC... but at the end of the day you often want to communicate that timepoint to a human user (e.g. an appointment time, the time at which some event happened, etc.), which is when our stupid monkey brains expect the ascii string you are showing us to actually make sense in our specific locale (including all of the warts each of those particular timezones have, including leap second, DST, etc.)
It is not, if what the user expects to stay constant is their local calendar/wall clock time rather than the UTC instant. Which is usually the case. This is a transformation that needs late binding as DST and timezone rules can change, so it can't just be handled as a localisation transformation on input/output
Until you understand that the core of unix time is the "day", in the end, you only need to know the first leap year (If I recall properly it is 1972), then you have to handle the "rules" of leap years, and you will be ok (wikipedia I think, don't use google anymore since they now force javascript upon new web engines).
I did write such code in RISC-V assembly (for a custom command line on linux to output the statx syscall output). Then, don't be scared, with a bit of motivation, you'll figure it out.
The core of the UNIX time is seconds since epoch, nothing else. 'Day' has no special place at all. There are calendars for converting to and from dates, including Western-style, but the days in those calendars vary in length because of daylight saving switches and leap seconds for example.
UNIX time ignores leap seconds, so every day is exactly 86400 seconds, and every year is either 365*86400 or 366*86400 seconds. This makes converting from yyyy-mm-dd to UNIX time quite easy, as you can just do `365*86400*(yyyy-1970) + leap_years*86400` to get to yyyy-01-01.
They needed to have a locale matching the language of the localised time string they wanted to parse, they needed to use strptime to parse the string, they needed to use timegm() to convert the result to seconds when seen as UTC. The man pages pretty much describe these things.
The interface or these things could certainly be nicer, but most of the things they bring up as issues aren't even relevant for the task they're trying to do. Why do they talk about daylight savings time being confusing when they're only trying to deal with UTC which doesn't have it?