

Ask hackers: Time as double? - chmike

I need to choose a data type to represent absolute time values for a
statically typed language. I need fine time resolution and wide time span
coverage.

A natural choice is to count nanosecond ticks since some epoch; a 64-bit
integer would then be fine. But we then get accuracy problems and rounding
errors when representing time in seconds.

A fixed-point value with seconds as the time unit and the binary point between
bits 29 and 30 might match most users' precision preference. But it is not
trivial to get fixed-point computation correct because of the tricky problems
of normalization and rounding.

Thus double-precision floats seem a good choice because all operations are
built into the hardware. The only drawback is that the precision is not
uniform.

Do the hackers following Y Combinator's Hacker News have an opinion to share
on this?
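
To put a number on "precision that is not uniform", here is a small Python sketch (needs Python 3.9+ for `math.ulp`). It prints the gap between adjacent doubles at a few magnitudes, taking the value as seconds since the epoch. The helper name `double_resolution_ns` and the sample values are made up for illustration.

```python
import math

def double_resolution_ns(seconds_since_epoch: float) -> float:
    """Gap to the next representable double at this magnitude, in nanoseconds."""
    return math.ulp(seconds_since_epoch) * 1e9

samples = [
    ("1 second after epoch", 1.0),
    ("1 day after epoch", 86_400.0),
    ("Unix time around 2024", 1.7e9),
]

for label, t in samples:
    print(f"{label:>22}: {double_resolution_ns(t):.6g} ns")
```

With the 1970 epoch, a double holding present-day Unix time in seconds steps in increments of roughly a quarter of a microsecond, so it cannot carry nanosecond time stamps.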
======
mark-t
I'm afraid I don't understand the downside of nanoseconds since epoch. What
accuracy problems and rounding errors? Are you suggesting that nanoseconds are
too coarse of a measurement? You could use picoseconds. No matter what, if
you're storing arbitrary real numbers in a fixed number of bits, you're going
to have some rounding error. Choose enough bits to make your representation
"good enough" or, say, "better than your instruments, anyway". Then use
integers. They're simpler and much more efficient than floats/doubles.

~~~
richardw
I'm guessing that he's saying binary doesn't represent all numbers properly so
if you need it to be absolutely accurate, you need to have a custom data type.
Making it picoseconds doesn't help.

Same problem with money - it's a very bad idea to use floats etc for money.
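
A minimal Python sketch of the binary-float issue, next to the same sum done in integer nanoseconds. The 0.1 s tick is just an illustrative value.

```python
# Add a 0.1 s tick ten times, first as a double, then as integer nanoseconds.
TICK_S = 0.1
TICK_NS = 100_000_000  # the same 0.1 s expressed in nanoseconds

total_s = 0.0
total_ns = 0
for _ in range(10):
    total_s += TICK_S
    total_ns += TICK_NS

print(total_s == 1.0)             # False: 0.1 has no exact binary representation
print(total_ns == 1_000_000_000)  # True: integer addition is exact
```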

~~~
mark-t
I covered that. There is no "absolutely accurate" in a finite number of bits,
no matter what data type you use. In N bits, you can't store more than 2^N
distinct numbers, and there happen to be a lot more than 2^N distinct numbers,
even between 0 and 1. You have to choose your margin for error and be prepared
to live with it (or use a variable width format).

~~~
chmike
Fully agree. The time resolution I am interested in is nanoseconds. Using
64-bit signed integer values with nanoseconds as units, I could cover +/- 292
years, which is more than enough. I could then afford to keep the Unix time_t
epoch.
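
The span figure is easy to check. A rough Python sketch, assuming a Julian year of 365.25 days for the estimate; `span_years` is a made-up helper name.

```python
INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.25 * 86_400  # Julian year, good enough for an estimate

def span_years(ticks_per_second: int) -> float:
    """Years reachable on either side of the epoch with a signed 64-bit tick count."""
    return INT64_MAX / ticks_per_second / SECONDS_PER_YEAR

print(f"nanoseconds:  +/- {span_years(10**9):.1f} years")
print(f"microseconds: +/- {span_years(10**6):,.0f} years")
print(f"milliseconds: +/- {span_years(10**3):,.0f} years")
```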

------
SwellJoe
I would suggest you have a look at usleep, nanosleep, ualarm, and
setitimer/getitimer. These are the high resolution timer functions found in
UNIX/Linux, and have been worked over by really smart people for a very long
time (at least since POSIX in the case of some of them, but probably years
before that, as well). Then again, it looks like the POSIX functions have a
range limitation of 0 to 999999, so maybe they're using much smaller storage
than you'd use. If history is any indicator, I would guess that the Linux
implementation is not subject to those limitations internally, but imposes
them to match compliance with the POSIX standard (and probably provides the
less limited functionality via an optional parameter or library call--just a
guess, here), so a look at the Linux high resolution timer code might turn up
very useful ideas.

The only high res time tool I've ever used was Time::HiRes from CPAN, so I
have no low-level understanding to impart to you. I just know what the
functions are called, and figured that might get you started in the right
direction.

~~~
pmjordan
Agreed. There are good time/date libraries out there, including the functions
in the Linux/GNU C runtime library. Trying to do this yourself is just asking
for unnecessary pain: the library functions handle all the quirks for you,
from weekday/d/m/y h/m/s.ms.µs.ns <-> internal representation conversions to
time zones, leap years, leap seconds, etc.

This is a solved problem. Use the existing solutions.

And if for some reason you need absolute control and the built-in functions
just can't fulfill your obscure needs, using doubles doesn't seem like a
particularly appealing solution at all, as you'll still end up with precision
issues on non-integers, except they're harder to control because they will
vary. If there's something the built-in functions can't do for you (I doubt
it!) I suggest reading up on the internal representation those functions use,
and see if you can build on it.

------
Hoff
Where and when you can, don't roll your own format. Use the existing libraries
and routines, and/or use language-specific storage.

Save for specific and rare cases, neither time nor money seems (to me)
appropriate in floating point. Whether or not the integer format (implicitly
fractional in most representations of time) is most appropriate, not all boxes
have floating point, and floating point is not appropriate in various contexts
on various boxes.

Either include the timezone or offset, or always operate in UTC. Avoid the
TZ processing where you can; otherwise plan for and implement it.

I've usually switched to 64-bit and longer for time values for most
requirements; non-standard or "packed" formats or optimizing for storage have
been more of a problem than a benefit, as compared with the savings from
standard formats and libraries and the slightly increased memory use.
Unless you're storing zillions of these things.

OpenVMS uses 100-nanosecond ticks since 17-Nov-1858 in a long long, i.e. a
64-bit value. There are conversions around in the Perl library, and I and
others have posted C code for this.

Unix uses a long or (increasingly commonly) a long long count of seconds since
1-Jan-1970 UTC. FWIW, a 32-bit long (treated as signed) overflows circa
19-Jan-2038 03:14:07 UTC.
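
The 2038 limit can be reproduced in a few lines. A Python sketch that emulates a signed 32-bit seconds counter; `wrap_int32` and `from_unix` are illustrative helpers, not library functions.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit counter

def from_unix(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return UNIX_EPOCH + timedelta(seconds=seconds)

def wrap_int32(value: int) -> int:
    """Emulate two's-complement wraparound of a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31

print(from_unix(INT32_MAX).isoformat())                  # 2038-01-19T03:14:07+00:00
print(from_unix(wrap_int32(INT32_MAX + 1)).isoformat())  # 1901-12-13T20:45:52+00:00
```

One tick past the maximum, the wrapped counter lands in December 1901.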

In various application environments, you will be stuck with your choice for a
very long time. Plan for DST and switch-over cases. Pick wisely.

------
chmike
Thanks for the comments. The aim is to avoid the timeval structure and merge
it into a single value of a basic type. Consider it a time stamp value. The
timeval struct is very uncomfortable when dealing with time computation. It is
even error prone once we have to deal with microseconds. While the precision
is well defined, the covered time span is limited. BTW, the limit is not that
far off anymore.
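
To illustrate the "error prone" part, here is a Python sketch that mimics a seconds-plus-microseconds pair and compares subtracting two of them with subtracting two plain integer nanosecond counts. `TimeVal`, `timeval_sub` and `to_ns` are illustrative names that only imitate the C struct.

```python
from typing import NamedTuple

class TimeVal(NamedTuple):
    """Mimics C's struct timeval: whole seconds plus microseconds in [0, 999999]."""
    sec: int
    usec: int

def timeval_sub(a: TimeVal, b: TimeVal) -> TimeVal:
    sec, usec = a.sec - b.sec, a.usec - b.usec
    if usec < 0:            # borrow one second; this is the step that gets forgotten
        sec -= 1
        usec += 1_000_000
    return TimeVal(sec, usec)

def to_ns(t: TimeVal) -> int:
    """Flatten the pair into a single integer nanosecond count."""
    return t.sec * 1_000_000_000 + t.usec * 1_000

a = TimeVal(10, 200_000)    # 10.2 s
b = TimeVal(9, 900_000)     # 9.9 s
print(timeval_sub(a, b))    # TimeVal(sec=0, usec=300000)
print(to_ns(a) - to_ns(b))  # 300000000
```

With a single integer the subtraction is one operator and there is no normalization step to get wrong.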

Converting to calendar time is not an issue as long as one can extract time in
second units from the time value. After adjusting to the right epoch one can
use the existing conversion functions.
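
A sketch of that split in Python, assuming the time stamp is an integer nanosecond count relative to the Unix epoch. `split_ns` and `to_calendar` are made-up names, and the calendar work is left to the standard `datetime` module.

```python
from datetime import datetime, timedelta, timezone

NS_PER_S = 1_000_000_000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def split_ns(timestamp_ns: int):
    """Return (whole seconds, nanoseconds in [0, 1e9)); floors for negative values."""
    return divmod(timestamp_ns, NS_PER_S)

def to_calendar(timestamp_ns: int, epoch: datetime = UNIX_EPOCH):
    """Return (calendar time of the whole second, leftover nanoseconds)."""
    seconds, nanos = split_ns(timestamp_ns)
    return epoch + timedelta(seconds=seconds), nanos

stamp = 1_000_000_000 * NS_PER_S + 123_456_789
when, nanos = to_calendar(stamp)
print(when.isoformat(), nanos)  # 2001-09-09T01:46:40+00:00 123456789
```

Using a different epoch only means passing a different `epoch` argument; the nanosecond remainder is carried separately because `datetime` only resolves microseconds.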

The idea to use doubles to encode time came from some API in Windows. Is it
patented? ;-) Its advantage is that one can represent nearly any time value.
The disadvantage is the varying precision.

My use case is for time stamps and thus integer or fixed point would be
preferable. I would prefer the fixed point using seconds as units. The problem
is to implement some arithmetic operations like multiplication and division by
a decimal scalar.
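
One possible approach, sketched in Python: keep the time as an integer nanosecond count, treat the decimal scalar as an exact rational, and round once at the end. `scale_ns` and `divide_ns` are made-up names. Python integers are unbounded, so a 64-bit implementation in a statically typed language would need a 128-bit intermediate (or an overflow check) for the product.

```python
from fractions import Fraction

def scale_ns(duration_ns: int, factor: str) -> int:
    """Multiply a nanosecond count by a decimal scalar, rounding half to even."""
    return round(duration_ns * Fraction(factor))

def divide_ns(duration_ns: int, divisor: str) -> int:
    """Divide a nanosecond count by a decimal scalar, rounding half to even."""
    return round(duration_ns / Fraction(divisor))

print(scale_ns(1_000_000_000, "0.1"))   # 100000000
print(divide_ns(1_000_000_000, "0.3"))  # 3333333333
print(scale_ns(3, "0.5"))               # 2 (1.5 rounds to the even neighbour)
```

Passing the scalar as a decimal string avoids converting it through a binary float first, so 0.1 really means one tenth.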

------
bumbledraven
Check out D.J. Bernstein's TAI library <http://cr.yp.to/libtai.html> and his
specification of the TAI64, TAI64N, and TAI64NA time formats
<http://cr.yp.to/libtai/tai64.html>.
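
For a feel of the external format, here is a Python sketch of TAI64N as I read the spec: an 8-byte big-endian label equal to 2^62 plus the seconds since the start of 1970 TAI, followed by a 4-byte big-endian nanosecond count. The function names are made up, and the sketch assumes the caller already has TAI seconds; it does no UTC/leap-second conversion.

```python
import struct

TAI64_OFFSET = 2**62  # label for the start of 1970 TAI, per the TAI64 spec

def pack_tai64n(tai_seconds_since_1970: int, nanoseconds: int) -> bytes:
    """Encode as 12 bytes: 8-byte big-endian label, 4-byte big-endian nanoseconds."""
    if not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("nanoseconds must be in [0, 999999999]")
    return struct.pack(">QI", TAI64_OFFSET + tai_seconds_since_1970, nanoseconds)

def unpack_tai64n(data: bytes):
    """Decode 12 bytes back into (TAI seconds since 1970, nanoseconds)."""
    label, nanoseconds = struct.unpack(">QI", data)
    return label - TAI64_OFFSET, nanoseconds

encoded = pack_tai64n(0, 0)
print(encoded.hex())  # 400000000000000000000000
```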

------
chmike
I've just learned that Java uses a 64-bit signed integer with ms units and the
1970 epoch. It covers a time span of +/- 290 million years. The interesting
thing with this format is that you don't need any special library to do time
operations. For time stamps, nanoseconds are preferable in my case.

Thanks for taking the time to comment on this issue. It has been very useful.

------
romuloab
Delphi has used Double for date/time since version 1 IIRC, and it works pretty
well. I myself created a mini lib on a C project, but don't know if I'd follow
the same route again nowadays (you know, DRY and DRTFW - Don't Reinvent The F*
Wheel).
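
A Python sketch of how such a double-based date/time can be decoded, assuming the layout I remember for Delphi's TDateTime: the integer part counts days since 1899-12-30 and the fractional part is the fraction of the day. Only non-negative values are handled, and the helper names are made up. `math.ulp` needs Python 3.9+.

```python
import math
from datetime import datetime, timedelta

DELPHI_EPOCH = datetime(1899, 12, 30)  # day 0 under the assumed TDateTime layout

def from_tdatetime(value: float) -> datetime:
    """Convert a non-negative days-since-epoch double to a naive datetime."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    return DELPHI_EPOCH + timedelta(days=value)

def resolution_us(value: float) -> float:
    """Gap between adjacent doubles at this magnitude, in microseconds."""
    return math.ulp(value) * 86_400 * 1e6

print(from_tdatetime(25569.5))          # 1970-01-01 12:00:00
print(f"{resolution_us(45000.0):.2f}")  # step size in microseconds for present-day dates
```

With days as the unit, present-day values step in increments of a bit more than half a microsecond, which is plenty for calendar use but not for nanosecond time stamps.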

