Going long long on time_t (openbsd.org)
130 points by hebz0rl 1122 days ago | 88 comments

If anyone has trouble reading this, there are also plain-text versions of individual slides:




Here's the whole thing in one page, minus images:


My god, I'm not a fan of reading slides in general but this... this is a new dimension of horror. I can't shake the feeling that there's some sick humour in this.

What's so horrible about it? Obviously, it isn't on the same level as any of the slide decks web designers put up that are in a sane format (HTML5+js / PDF) which don't require you to actually click through, but then

a) Theo de Raadt works much further down the stack

b) This is probably something put up by the EuroBSDCon folks after converting his slides (can someone who was at the Con weigh in on this? If he actually presented his talk like this, that's extra bad).

> What's so horrible about it?

It's Comic Sans with JPEG artifacts. The only consolation is that it doesn't blink.

It might have been difficult to combine blinking GIFs with JPEG artefacts – that said, it wasn't all that bad with the mouse on the ‘next’ link and tapping of the left mouse button to read the next slide, at least it works in Opera :)

The content was good though which after all is the main thing...

True, but the font detracts from the technical content. Presentation of information is important. There are cognitive and psychological implications that your reader will incur when reading a document written in Comic Sans. There is nothing to be gained from using Comic Sans and it's not difficult to simply not use the font.

It's 2013 and there are tons of programs that can generate a decent HTML based presentation from a text file. Heck, I've encountered one just today while looking at the slides for "fedmsg - The Fedora Infrastructure Message Bus" [1]; the tool is named hovercraft [2]. Another nice tool is landslide [3].

[1]: http://threebean.org/presentations/fedmsg-flock13/

[2]: http://hovercraft.readthedocs.org/

[3]: https://github.com/adamzap/landslide

I've been using dzslides (https://github.com/paulrouget/dzslides) and been pretty happy. I wrote my own tool to take markdown and emit dzslides and I've not been upset yet, but my slides tend to be very minimalistic.

Or just use HTML (rendered with Webkit):


and embed terminals and other programs along the way.

Not having any way to navigate without actually clicking the tiny "next" link is infuriating. I can't actually focus on the subject matter if I have to do that.

> the tiny "next" link

Once you've zeroed in on the link, it's not like you have to move the mouse away, and its Fitts's-law target size becomes essentially infinite past the first slide.

Only if your screen is large enough that you don't have to scroll down.

It might not be immediately obvious that you can zoom in on a slide by clicking on it.

No humor at all, it is real and frankly, I am worried for my children. http://y2038.com/

WTF. I'm not sure I want to live in a world where plain text should elicit such a reaction!

Comic saaaaaaaaans!

Good read, even if you, like me, don't care too much about the low-level stuff..

A few takeaways:

- Embedded 32bit is everywhere. Sure they'll fix the obvious ones, but I'm sure some things will be forgotten about. This problem might not be taken seriously after the Y2K debacle.

- The OpenBSD guys & gals like to implement new designs and ideas. Sometimes radical. (but I already knew that)

- A transitional solution can end up being the solution and stick around forever :(

- Theorem "In operating systems, increased popularity leads to greater resistance to change". Probably true in most "products".

> This problem might not be taken seriously after the Y2K debacle.

I wish people would stop saying things like that. I was one of many programmers who spent some overtime over the course of about a year on fixing code in 1998-9. I was just the junior programmer at the time; the head guy in the department had some nearly sleepless nights around then.

I can promise you that in one East Bay school district, nobody would've gotten their paychecks, report cards would not have worked, DNS would have stopped working for the entire school network, and finance & budget would have had some really crazy errors in the output -- those are the systems I still remember requiring the most attention.

Y2K was a "debacle" because a bunch of people busted their asses fixing old code.

It feels a lot like pulling two consecutive all-nighters as an engineer on a project, to bring the project up to its deadline on-time and under-budget, only to have your department manager the next morning stroll in, well-rested, coffee cup in hand, and say, "See? Told you it was no big deal."

Fair enough, it's not the most fitting term.

I remember a lot of hype and no actual problems. I suspect others do as well, and this could lead to people thinking about the whole problem as over-hyped and a non-issue.

Not saying that it's correct, but that's the catch-22 of preventing problems: if you do too good a job, nobody will notice =/

Yes, and that some embedded 32-bit systems deployed today will be in service come 2038. And quite probably some 16-bit, 8-bit ones too.
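To make the 2038 date concrete, here is a minimal sketch in C of where a signed 32-bit time_t runs out. The helper names are invented for illustration, and the wraparound is computed in unsigned arithmetic because signed overflow is undefined behaviour in C.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: format the last second a signed 32-bit time_t can hold. */
static int last_32bit_second(char *buf, size_t len)
{
    time_t t = (time_t)INT32_MAX;   /* 2^31 - 1 seconds after the epoch */
    struct tm *utc = gmtime(&t);
    if (utc == NULL)
        return -1;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S", utc) ? 0 : -1;
}

/* Hypothetical helper: what a 32-bit counter holds one second later.
 * The addition is done unsigned; converting the result back to int32_t
 * is implementation-defined, and gives INT32_MIN on two's-complement
 * systems. */
static int32_t one_second_later_32bit(void)
{
    uint32_t wrapped = (uint32_t)INT32_MAX + 1u;
    return (int32_t)wrapped;
}
```

On typical two's-complement systems, `last_32bit_second()` produces 2038-01-19 03:14:07 (UTC) and `one_second_later_32bit()` produces INT32_MIN, which a 32-bit time_t would read as a date in December 1901.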

For format strings, why not do the inttypes.h thing, and define a macro for the format specifier of (time_t)?

  "%" PRI_TIME_T

Openbsd does not believe in the use of the format string macros.

Any idea of the motivation behind that?

It's ugly.

It's superficially ugly.

It's deeply ugly to have to maintain "%llu", "%lu", and various other format strings for a single (uint64_t) or indeed (time_t). It's also ugly to up-cast everything to the largest potential size whenever you use format strings.

This seems like a case of choosing deep ugliness over superficial ugliness...

  "%" PRIu64 ": The time is: %" PRI_TIME_T, x, y
Seems nicer to me than:

  "%llu: The time is: %lld", (long long unsigned)x, (long long)y /* time_t is signed? Who knows? */
What type would you use if you wanted to print uint128_t? %llld ?

Finally, I think rejecting a standard C header file because it is "ugly" and coming up with your own solution is unnecessarily fragmenting things, especially when it isn't clearly better (IMO it is clearly worse).
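For readers who have not used the inttypes.h macros, here is a minimal sketch of the style being argued about. Only PRIu64 and PRId64 are standard; PRI_TIME_T is hypothetical, so int64_t stands in for time_t here, and format_event is an invented name.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a uint64_t counter and an int64_t timestamp
 * using only the standard <inttypes.h> macros, with no casts. */
static int format_event(char *buf, size_t len, uint64_t seq, int64_t when)
{
    /* Adjacent string literals are concatenated by the compiler, so each
     * "%" PRIxNN pair becomes one ordinary conversion specification with
     * the right length modifier for this platform. */
    return snprintf(buf, len, "%" PRIu64 ": The time is: %" PRId64, seq, when);
}
```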

> It's also ugly to up-cast everything to the largest potential size whenever you use format strings.

I'm not sure that is true. It accurately captures the reality that you don't know the size of the type, but that you have determined what the maximum size can be and hopefully made considerations for it.

I should think there isn't even necessarily a performance cost, as it wouldn't be hard to trick out a compiler to recognize what was going on and optimize accordingly.

> What type would you use if you wanted to print uint128_t? %llld ?

IIRC, there is no standard portable format string length modifier for 128-bits (I think some platforms used %q for it, but that's definitely not portable), so literally nothing. Format strings suck.

> Finally, I think rejecting a standard C header file because it is "ugly" and coming up with your own solution is unnecessarily fragmenting things, especially when it isn't clearly better (IMO it is clearly worse).

Note that as the presentation points out, the better thing to do is whatever is going to be easily adopted. In this case, where people are already using format strings, and already working with a time_t that might be only 32-bits wide, this might actually be that solution.

> rejecting a standard C header file because it is "ugly"

I just checked an OpenBSD machine's inttypes.h. Some of the format macros (I presume the ones present in the standard) are there. So I can't say they rejected the header.

It would probably be weird to introduce one there that isn't in the standard. OTOH %lld for long long is in the standard.

It doesn't solve the problem.

    time_t x = ...;
    printf("%lld\n", (long long) x);
The above code works on any platform, as long as time_t contains no values that do not fit in long long. Using a macro for the format specifier would require defining such a macro on other systems, because they have to be able to compile much of this code on Linux, OS X, or even Windows in some cases.
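A minimal sketch of that cast idiom, wrapped in a helper. The function name is made up, and the compile-time check (C11) assumes time_t is a signed integer type no wider than long long, which the thread takes for granted but the C standard itself does not promise.

```c
#include <stdio.h>
#include <time.h>

/* Fail the build on any platform where the cast below could lose range.
 * Assumption: time_t is a signed integer type. */
_Static_assert(sizeof(time_t) <= sizeof(long long),
               "time_t does not fit in long long");

/* Hypothetical helper: write the decimal value of t into buf.
 * Works unchanged whether time_t is 32 or 64 bits wide. */
static int format_time(char *buf, size_t len, time_t t)
{
    return snprintf(buf, len, "%lld", (long long)t);
}
```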

> Using a macro for the format specifier would require defining such a macro on other systems, because they have to be able to compile much of this code on Linux, OS X, or even Windows in some cases.

    #ifdef LONG_LONG_TIME_T
    #define PRI_TIME_T "lld"
    #else
    #define PRI_TIME_T "ld"
    #endif
There, I just made it so that you can do:

    time_t x = ...;
    printf("%" PRI_TIME_T "\n", x);
I don't much like how it looks, but it works and it'd be easy to make portable code with it. All you need is a preprocessor flag to indicate when you were using long long.

The real problem is that it'd be easier to get old embedded systems upgraded to 64-bit in the next 25 years than to get those old systems retrofitted with such an annoying syntax. Forcing everything to know and use a "wide enough" integer width is probably the best you can really do with format strings anyway.

And you repeat that macro in every file? Or you assume you can somehow persuade other OSes to add it to their headers?

printf("%lld\n", (long long) x); is ugly but it works on every system, including existing systems with 32-bit time_t that don't make any changes to their headers.

> And you repeat that macro in every file?

You are familiar with the concept of headers, right? ;-)

Just put that in the project's common header (which, if it is dealing with so many different OSes, invariably already has a bunch of platform abstractions). I deliberately structured the solution so the only "extra work" that is needed is for whatever platform has created a long long time_t (and if you really wanted to, you could probably get rid of even that work and base the entire thing off of sizeof(long long) even without using something like autoconf).
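One way to avoid both the per-platform flag and the cast, sketched under the assumption that a C11 compiler is available: let _Generic choose the conversion from the actual type behind time_t. This is a different technique from the #ifdef proposal, and TIME_FMT and format_time_generic are invented names.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical macro, not part of any standard header. _Generic (C11)
 * picks the branch matching the real type behind time_t, so no
 * per-platform flag is needed. A platform whose time_t is none of these
 * types fails to compile instead of printing garbage. */
#define TIME_FMT(suffix) _Generic((time_t)0, \
    int:       "%d"   suffix,                \
    long:      "%ld"  suffix,                \
    long long: "%lld" suffix)

/* Hypothetical helper: print t with whichever conversion matches time_t. */
static int format_time_generic(char *buf, size_t len, time_t t)
{
    return snprintf(buf, len, TIME_FMT(""), t);
}
```

Because _Generic produces an expression rather than a string literal, the rest of the format has to be passed in as the suffix argument instead of being concatenated afterwards.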

> printf("%lld\n", (long long) x); is ugly

Wow, we couldn't be of more different opinions. I'd argue the virtue of that approach is that it is less ugly and more likely to be easily accepted as a change for crufty old 32-bit code in some embedded system that everyone has forgotten about.

So, this slide confused/troubled me: http://www.openbsd.org/papers/eurobsdcon_2013_time_t/mgp0002...

I don't see why it would be a good idea to convert "time_t" to "long long". Having an alias specifically for time_t is part of what makes this kind of work doable. I could see maybe introducing another alias like "time64_t" or something, but once you convert it to "long long" the type is no longer tagged in a way that makes it easy to find and more importantly suggests to the programmer they ought to NOT make assumptions about its size. Heck, in a perfect world I'd either introduce a new % symbol specifically for time_t width or have a macro that expands to represent its width (not to mention make it mandatory to use compiler warnings about string formats not matching argument widths).

I also noticed the comment "would love better compiler tools -- none found so far". Certainly there are things like Sparse (https://sparse.wiki.kernel.org/index.php/Main_Page) which make correctness verification easier.

The typedef 'time_t' is definitely being used; for example, here is a patch from sys/kern/kern_clock.c: http://www.openbsd.org/cgi-bin/cvsweb/src/sys/kern/kern_cloc...

That was part of a larger changeset that enabled 64bit time_t on 2013-08-13. The 'long' type is changed to 'time_t', which is now 64bit everywhere on OpenBSD.

I didn't see the talk, but I think it's confusing because we are just seeing the slides. Here is what I got from it:

    remove time_t from network/on-disk/database formats
Right now if you have a userspace app that has some kind of binary disk format, say a database, and you use the time_t typedef your binary files will not be portable between systems which have differently sized time_t's. However, if you use 'long long' or 'int64_t' and cast time_t's to those, your files will be portable and 64bit everywhere.

If you're using time_t in network formats, systems with different time_t sizes will confuse each other!
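A minimal sketch of that advice for a binary format: always write timestamps as a fixed 64-bit field in a fixed byte order, whatever the local time_t happens to be. The helper names and the choice of big-endian are assumptions for illustration, not anything from the slides.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: store a timestamp as 8 big-endian bytes so the
 * encoding is identical on hosts with 32-bit and 64-bit time_t. */
static void put_time64(unsigned char out[8], time_t t)
{
    uint64_t v = (uint64_t)(int64_t)t;   /* widen first, then treat as bits */
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: read the field back as a 64-bit value. A reader
 * with a 32-bit time_t must range-check before narrowing. Converting an
 * out-of-range uint64_t to int64_t is implementation-defined, and wraps
 * as expected on two's-complement systems. */
static int64_t get_time64(const unsigned char in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return (int64_t)v;
}
```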

    remove as many (time_t) casts as possible
This is trouble because you don't know the size of time_t, if it's 32bit you might be truncating! It's better to cast to a 64bit size, that will always work, at least for the next 292 billion years.

Thanks. This makes much more sense now. I'd still be tempted to use some kind of a typedef for the field, but yeah, you'd not want to derive it from the system's time_t type.

Yes that's about right. The videos will be up at some point but it was a two phase process like that.

The slides are confusing to me too, another slide [1] states that they will change time_t to be "long long" in a future release so the tagged type will still be there.

I'm also not sure why the OpenBSD people think that there will be lots of problems with ports [2], I'm typing this on a NetBSD system with 64 bit time_t, things that broke have had patches pushed upstream.

[1] http://www.openbsd.org/papers/eurobsdcon_2013_time_t/mgp0003...

[2] http://www.openbsd.org/papers/eurobsdcon_2013_time_t/mgp0003...

Introducing a new format specifier for every typedef will quickly exhaust the alphabet.

In 292 billion years when we overflow a 'long long' time_t humanity is really going to regret that flippant attitude.
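The 292 billion figure is easy to check. A rough sketch, assuming a Julian year of 365.25 days and ignoring leap seconds; the function names are invented.

```c
#include <stdint.h>

/* Seconds in a Julian year (365.25 days), used only as a rough yardstick. */
#define SECONDS_PER_YEAR INT64_C(31557600)

/* Whole years of seconds that fit in a signed 64-bit counter. */
static int64_t years_in_int64(void)
{
    return INT64_MAX / SECONDS_PER_YEAR;
}

/* The same calculation for a signed 32-bit counter. */
static int64_t years_in_int32(void)
{
    return INT32_MAX / SECONDS_PER_YEAR;
}
```

The 64-bit result is a little over 292 billion years, and the 32-bit result is 68 years, which is where 1970 + 68 = 2038 comes from.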

I agree, and consider that one of the inherent problems with the string formatter approach to IO. That said, time is perhaps one of the data types that is universal enough and general enough to the C runtime that you could make a case for it.

I've always felt that the 'right' solution was to emulate glibc and provide facilities for registering new format specifiers[1]. Then libraries could provide specifiers for everything that made sense and users could pick and choose.

Of course, there would be performance implications - not to mention the added complexity for implementors.


What happens when your graphics library and your network library pick the same letter?

The simplest solution (from the implementor's perspective) is probably to leave it to the user to reassign one. Allowing multiple-letter specifiers helps here. Coming from a C perspective, that's what I would prefer.

You could work out all sorts of namespacing and automatic reassignment schemes, of course.

I can't think of anything without trade-offs off the top of my head (I doubt that anything exists), but in my (limited) experience it's very workable.

I notice the ideas that it is not the number of contributors that matters, but the number of sufficiently skilled ones, and that popularity impedes change. I can't help drawing a parallel with the advice that you should listen to your most valuable customers, and potential customers, and that the rest of your users will expect free stuff, and complain loudly when you pivot.

NetBSD did this a while ago, but was better about binary compatibility and things.


Any idea why they would use long long rather than int64_t?

Imagine you just called time(2) and got a time_t value. Something like:

    time_t asdf = time(NULL);
And now you want to use printf to print that value to the screen. You can cast to 'long long', which is guaranteed to be at least 64 bits wide, and ensure no loss of precision occurs:

    printf("%lld", (long long)asdf);
That will work whether time_t is 32 bit or 64 bit.

If they just added a macro to inttypes.h, you could use that for the format specifier. If you want to print a uint32_t, you just use PRIu32.

> you could use that for the format specifier.

And end up with even more horrible and less readable format strings.

So we should use unportable format strings or cast all the values to long long? Is that better?

I personally find inttypes format strings readable.

How do you think the int64_t type is constructed? Few problems are solved by endless stacks of typedefs.

int64_t is a much nicer type than "long long".

"int" makes some sense. Word-size of the machine.

"short", "long", "long long" are all non-sensical. You use them when you want to trade size and range. When you want to make that trade-off, you care what their sizes are.

Instead of lower/upper bounds on their sizes, which aren't very useful, they should just have specific sizes. At which point, you might as well use uint32_t, and uint64_t in place of long and long long.

Prefer the sized int types over the "long"/"long long" ones when you can, for saner coding.

Use uintptr_t and such when you need a ptr-sized int, rather than a specific size.

You need to realize that the sizes of most types in C have been perverted by history. When possible, it's best to use the few types that are unambiguous in all of C89, LLP64 and LP64.

> int64_t is a much nicer type than "long long".

Yes, except that "long long" has been part of the standard for longer. int64_t was part of C99 but was tricky to include in software up to the mid-2000s due to slow adoption of the standard.

You can use "long long" without headers in most C compilers from the last 20 years. int64_t when present is usually just a typedef to "long long". Keep it simple.

> "int" makes some sense. Word-size of the machine.

Except that it isn't. That was its original intent but for historical reasons, it is a 32 bit integer in almost all cases now, regardless of machine word size.

I fail to see why it's more reasonable to prefer "long long" to "int64_t" just because the former has existed for longer. It's not the year 2000 anymore, and C99 is no longer a hot new thing that few compilers support. Or does OpenBSD have some policy that their kernel must be able to build with any C89-compliant compiler?

It's not the kernel that's the issue, we're talking about basic, cross-platform userland utilities like "ping" here. Some of those do have a policy that they have to be able to build on Irix 5.8 or whatever.

Is there anything that forbids those utilities from using system-provided headers and time_t? I think they'll build fine on any POSIX.1-compliant system then.

Unless they're doing something really weird with time_t values, I don't think there's any reason they should know whether it's long long or int64_t or whatever under the hood.

printf("%lld", (int64_t) t); wouldn't build on a system that doesn't have int64_t, so one needs to use long long.

Why cast to `int64_t` here? Just because `time_t` could be `int64_t` under the hood doesn't mean it must be cast to this exact type for string formatting/presentation purposes.

And I thought `%lld` actually means `long long`... So, http://ideone.com/SJJFPs seems like a proper approach to me. That said, it assumes the compiler supports %lld (%I64d or the like might be required for older compilers), so a better cross-compiler approach would be along the lines of `printf("test: " TIME_FMT "\n", (TIME_FMT_CAST)t)`. Or, ahem, maybe, `print_time(t)`.

Assuming we don't want to maintain a set of per-platform macros, we need to use an existing, standard format specifier. There isn't one that takes a time_t. So we have to cast time_t to a type big enough to contain it, use the format specifier for that type, and we want this to be as cross-platform as possible, i.e. we use the oldest, most widely-supported type which can definitely hold at least 64 bits and has a standard printf specifier available. i.e. "long long", exactly as in your example code.

So that's why we prefer "long long" rather than "int64_t", which I thought was your original question.

Oh. I thought the discussion was not about what type to cast when using printf (I agree, only `%lld`/`long long` fits perfectly), but what type to use for `time_t` internally. Sorry if I misunderstood and missed the point.

Microsoft's C compiler doesn't have C99 support.

Microsoft don't make a C compiler.

once again, how do the headers conjure up the int64_t type? it's not a keyword.

Here is how stdint.h on Linux conjures up the int64_t type.

  #if __WORDSIZE == 64
  typedef long int int64_t;
  #else
  typedef long long int int64_t;
  #endif

Wait. You just typed long long. You're not allowed to do that.

You're whooshing.

You don't want to use "long long" because that's not necessarily 64-bits. You want to use int64_t which guarantees it is 64-bits.

And then, the correct format specifier for that is PRIi64, and not "%ld" or "%lld" which will break in different platforms.

Please don't be disingenuous. We both know that using #defines, you can get a type which is exactly 64 bits on any modern architecture. The fact that long long and long int are builtin types and int64_t is implemented in terms of them, rather than the other way around, is just an implementation detail.

int64_t is a userspace libc typedef. long long is ISO C (C99) and guaranteed to be 64 bits or more.
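The distinction being argued over can be stated as compile-time checks. A minimal sketch, assuming a C11 compiler and a platform that provides int64_t (the exact-width types are optional in the standard); long_long_bits is an invented helper.

```c
#include <limits.h>
#include <stdint.h>

/* The standard only promises a minimum range for long long... */
_Static_assert(LLONG_MAX >= INT64_C(9223372036854775807),
               "long long must cover at least 64 bits");

/* ...while int64_t, where the implementation provides it, is exactly 64 bits. */
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64,
               "int64_t is exactly 64 bits");

/* Hypothetical helper: storage width in bits of long long on this platform.
 * 64 on the common ABIs, but a larger value would still be conforming. */
static int long_long_bits(void)
{
    return (int)(sizeof(long long) * CHAR_BIT);
}
```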

Yes... as a typedef in <stdint.h>. In other words, "a userspace libc typedef" as the previous post said...

sirclueless implied that int64_t was not C99, which is wrong.

I think if you re-read what was written, what you think was implied was not.

<time.h> is userspace. It's at least theoretically possible for long long to be more than 64 bits. Why use long long (potentially, say, 128 bits) rather than int64_t for time_t?

And why wouldn't int64_t be available in kernel space?

long long can be printed in printf with the "%lld" flag.

So stdint.h is not available to kernel programmers. Thanks.

I always found this special-casing of headers for kernel vs user unnecessary and complicating. In the many embedded runtimes I maintain, I've moved towards using standard POSIX-y and C(++) standard headers everywhere. I know kernel folks love to believe that their world is special so all the usual C library stuff needs to be done differently, but it's not needed.

It's so much easier to port code between the two worlds when you don't have to litter the #include prelude of every C file with conditionals.

stdint.h may not be available, but int64_t is; it's defined in include/linux/types.h.

The OpenBSD kernel probably doesn't use Linux headers.

A few times I got flamed on this site for complaining that the average HNer can't have reasonable discussions about C. I think the comment that the OpenBSD team should just #include <linux/types.h> pretty well shows that I was right. :-)

Just a silly mistake on my part.

Really going out on a limb there aren't you? ;-)

Yeah, good point.

I'm not familiar with the OpenBSD kernel. Is there any good reason (for OpenBSD or any other kernel) why <stdint.h> shouldn't be available -- or at least why int64_t shouldn't be available in some header?

The C standard only states that time_t is an integer (or floating-point) type, and POSIX further states it represents seconds since the epoch, so a 64-bit time_t is a good solution.

In order to find and change occurrences of time_t in ports more easily, they could use the Coccinelle tool.[1] The following semantic patch would find and replace variable declarations of type time_t:

    @sys_types@
    @@
    #include <sys/types.h>
    @time_t depends on sys_types@
    identifier x;
    @@
    - time_t x
    + long long int x
Replacing printf format specifiers is more difficult, so the following semantic patch will find printf statements which use time_t variables, which can then be edited manually:

    @sys_types@
    @@
    #include <sys/types.h>
    @stdio@
    @@
    #include <stdio.h>
    @printf depends on sys_types && stdio@
    identifier x;
    @@
      time_t x;
    * printf(..., x, ...);
These can be used as follows:

    $ spatch --sp-file foo.cocci --dir /path/to/ports
where `foo.cocci` is the name of one of the semantic patches above.

[1] http://coccinelle.lip6.fr/

Funny they don't mention that starting with Visual Studio 2005, time_t is 64-bit on 32-bit builds.

How did they resolve compatibility issues with existing code?

I assume the same exact way that the presentation describes? Plus they added a macro you can define to go back to 32bit time_t (which is probably what most app devs did...)

It's going to be a long long time_t.


... and I think it's gonna be a long long time.
