ISO 8601 defines a number of lesser-known features. For example, it allows not only fractional seconds (12:34:56.78) but also fractional hours (12.3) and fractional minutes (12:34.93). There is a special notation for the midnight at the end of a day (24:00:00 or 24:00), and another for a positive leap second (08:59:60, etc.). There are three ways to write a date: 2012-05-21, 2012-W21-1 and 2012-142. Intervals can be specified in terms of start and end, start and duration, duration and end, or just a duration (i.e. no context). There are also recurring intervals, which may or may not be bounded in the number of recurrences. And so on and on and on.
That said, ISO 8601 tries to cover most cases for date/time representation. Implementing every bit of ISO 8601 is not desirable of course, but it is certainly worth looking at.
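For instance, the three date forms mentioned above are just different renderings of the same day; a quick sketch in Python:

    from datetime import date

    d = date(2012, 5, 21)
    print(d.isoformat())                                # 2012-05-21  (calendar date)
    year, week, weekday = d.isocalendar()
    print("%d-W%02d-%d" % (year, week, weekday))        # 2012-W21-1  (week date)
    print("%d-%03d" % (d.year, d.timetuple().tm_yday))  # 2012-142    (ordinal date)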
I don't see the need for a placeholder for the n-thousandth day of the year. Unless we want to accommodate something like 2011-0367 also equating to 2012-0002?
Well, 3600 seconds can be expressed much more simply than that: 'PT1H', which is more human readable than '3600'. The point of these duration/interval forms is a compromise between human and machine readability: we can more easily grasp long periods of time expressed in unit values than in raw seconds or milli/microseconds.
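For the curious, here's a throwaway sketch (plain Python, hours/minutes/seconds only, so not a full ISO 8601 duration formatter) of going from raw seconds to that notation:

    def seconds_to_iso_duration(total_seconds):
        # Only handles the time part (H/M/S); days, months and years need more care.
        hours, rest = divmod(int(total_seconds), 3600)
        minutes, seconds = divmod(rest, 60)
        out = "PT"
        if hours:
            out += "%dH" % hours
        if minutes:
            out += "%dM" % minutes
        if seconds or out == "PT":
            out += "%dS" % seconds
        return out

    print(seconds_to_iso_duration(3600))  # PT1H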
> 'PT1H'. Which is more human readable than '3600'
Not to mention six years, four months, four days, three hours, forty-five minutes and fifteen seconds, which is nothing short of 200072715 seconds (which my brain definitely wants to parse as two billion and something).
Nonsense, it's very readable. You could tell someone who can't code to say "period" for "P", "hour" for "H", "time" for "T", etc., and they'd be able to read out the period exactly and accurately every time. They would also know how to write their own forms of this. Humans can also look at it and know, intuitively, without a calculator, how long it is.
Trying to get them to multiply it out into seconds (including all the fun with leap seconds!) would be hard.
What is all this "human readable" nonsense? Who the hell is reading all these date-times?
If two pieces of software are passing dates around, use a Unix timestamp in UTC. If a human wants to read it, they're probably a programmer and know how to parse a timestamp. If they're not a programmer, then you shouldn't be showing them unformatted date-times anyway.
(1) dates/times before 1970 cannot be represented using a Unix timestamp
(2) you'll have much fun with leap seconds
(3) a Unix timestamp is just a number represented as floating point... not much different from any other number. A standard date representation, on the other hand, has a unique format that's distinguishable from all other data types. This comes in handy not only for humans reading it, but also for parsers of weak formats such as CSV
Also, data lives longer than both the code and the documentation describing it; this is the main argument for text-based protocols.
(4) programmers are people with feelings too and would rather read human-readable dates (since they are so common) than look up the boilerplate for converting a Unix timestamp to a human-readable format. Here's the Python version, which I know by heart only because I've had to deal with a lot of timestamps:
    import time
    from datetime import datetime
    timestamp = time.time()  # now
    print(datetime(*time.gmtime(timestamp)[:6]).strftime("%c"))
Oh wait, that gives me the date in UTC. OK, that date is meant for me, the programmer who knows what he's doing, so here's how I can read it:
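Something along these lines, using time.localtime instead of time.gmtime (my guess at the snippet that belongs here):

    print(datetime(*time.localtime(timestamp)[:6]).strftime("%c"))  # same timestamp, local time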
Great, now I should write a blog post about it. Or you know, just use standard date formats that are human readable, because there's a reason why our forefathers used the date representations that we use today, instead of counting scratches on wood or something ;-)
True, but many libraries still do not know how to deal with negative timestamps. For instance, PHP on Windows prior to 5.1.0. MySQL's FROM_UNIXTIME also wasn't handling them last time I tried it.
And many applications and scripts can break, like those that store and manipulate timestamps assuming a certain format (e.g. positive integers).
The Unix timestamp was created to deal with timestamps, not with date representations. Therefore this requirement (dates prior to 1970) was not an issue that had to be dealt with explicitly.
This is good general advice, but it breaks for things that don't fit the Unix timestamp assumption of second-level accuracy: I need ISO8601 because I handle dates whose precision is a day at best, and often just a month or a year. If I'm formatting dates, there's an implied precision difference between displaying '1923-1927' and 'January 1st 1923 to January 1st 1927'.
If you need to handle variable precision, time zones, ranges, etc., you can either invent your own format or use ISO-8601, which at least has the virtue of being more human readable and more likely to have been encountered before.
Very good advice! ISO8601 is a perfect tradeoff between the machine-readable Unix timestamp and the human-readable mess used in HTTP.
The one thing libraries often get wrong is that the time zone designator should always be present and should default to 'Z'. It's pretty rare to want timestamps in local time except during debugging, but that's often the default. It catches a lot of people out.
Yes, this is one thing that I've had a lot of difficulty with when trying to implement a reliable ISO parser. I've found that Python is one of the worst offenders here: by default the datetime classes have no concept of time zones, and it is significantly more difficult to attach tz info to a Python date. (Loosely mentioned in the article.)
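To make the naive-vs-aware distinction concrete, here's a minimal sketch (Python 3, where datetime.timezone is in the standard library); only the aware value carries an offset in its ISO output:

    from datetime import datetime, timezone

    naive = datetime.utcnow()            # no tzinfo attached at all
    aware = datetime.now(timezone.utc)   # tz-aware; the offset travels with the value
    print(naive.isoformat())   # e.g. 2012-05-18T09:51:00.123456 -- no offset!
    print(aware.isoformat())   # e.g. 2012-05-18T09:51:00.123456+00:00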
I'm always a bit nervous when people talk about ISO8601, given that very few people have probably read the spec, and are likely guessing as to its content.
I agree. The main barrier to actually reading the ISO8601 spec is that it costs money, and is not cheap (if memory serves, it's ~$150), leaving people to read the draft specs (which are harder to get hold of in their full form) or the Wikipedia entry.
The original article is about generating date strings, and I would give it more credence if it mentioned the RFC 3339 profile. (edit: I just noticed you're probably the author; please excuse my abrupt tone)
A timestamp and a timestamp with time zone are two different things, and you need to use the appropriate one (which is usually the one with the time zone). However, the argument is really not about attaching timezone information to timestamps, which is an unfortunate but necessary thing, but about storing times in the archaic multiple-base mixed format humans traditionally use (because it's human-friendly) that computers simply do not need. An ISO timestamp mixes base 60, base 24, base 12, a weird mix of bases 28, 29, 30 and 31, and a weird mix of 365 and 366. This is craziness.
Technically UTC = GMT, always has been, and for this "epoch time" conversation they are completely identical.
However, there's a people problem. Some people think "GMT = the time in London now", which it isn't, since the UK switches to BST for daylight saving. Saying "UTC" avoids the "time in London" misinterpretation.
Which, if you think the problem through, means you'll have to conclude that you do need to store the time zone, unless you just don't care about having accurate times. Read through some of the other comments and you'll realize that there are certain kinds of calculations and comparisons you simply can't do without the time zone, a historical time zone database, and a leap seconds database. Do not make the mistake of assuming that dealing with times and dates can be easy.
Mad late reply, but I use epoch time when my timing needs don't matter much, i.e. I can get by without worrying about epoch time not being second-precise, and I don't need to worry about dates except to display them in non-UTC.
For my needs, 90% of the time ISO8601 is overkill and unnecessary.
But dates in what I do don't need to be complicated, which is rare. Also, I never said working with times and dates is easy; evaluate your needs for the situation. All I'm saying is that going full tilt with time zones and full date parsing for, say, some general server logs doesn't always make sense. Bye!
+01:00 is not a useful timezone. If you want human-friendly semantics for things like "+1 day", you need to use the symbolic name for the timezone (Europe/Lisbon or whatever). If you just want an absolute time, you're better off expressing it in UTC, possibly even as seconds since epoch.
Correct, these are time offsets, not time zones. Although absolute times are exactly what they are for, and that's the reason e-mail headers use this format. Whenever dealing with relative offsets ("+1 day") or recurring events such as in a calendar app ("every 4/1 at midnight"), a real time zone is needed. I've seen plenty of code introduce DST-related bugs by taking the current timezone offset and using it for date/time calculations throughout the life of the application.
Yes, exactly. My point is: what is the use case where you would ever want "+01:00"? If you want to express an absolute[1] time, you use UTC or seconds since epoch. If you want to express a time that's meaningful to humans, you need a symbolic timezone (otherwise the answers to questions like "what is the time 1 day after this" will be surprising).
[1] Yes, I know there's no such thing as absolute time; perhaps "machine time" expresses what I mean.
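A rough illustration of that surprise (Python 3.9+, where zoneinfo is in the standard library; Europe/Lisbon switched to +01:00 on 2012-03-25): adding "1 day" to a fixed-offset value and to a named-zone value gives two different instants across the DST change.

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo

    # Noon in Lisbon the day before the 2012 spring-forward, once as a bare
    # +00:00 offset and once as the named zone.
    fixed = datetime(2012, 3, 24, 12, 0, tzinfo=timezone.utc)
    named = datetime(2012, 3, 24, 12, 0, tzinfo=ZoneInfo("Europe/Lisbon"))

    print((fixed + timedelta(days=1)).isoformat())  # 2012-03-25T12:00:00+00:00 -> 13:00 local
    print((named + timedelta(days=1)).isoformat())  # 2012-03-25T12:00:00+01:00 -> still noon local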
Beware that older browsers, including IE8, won't automatically parse ISO8601 dates, so Date.parse('2012-05-04T12:20:34.000343+01:00') returns NaN and new Date('2012-05-04T12:20:34.000343+01:00') yields an 'Invalid Date'. ISO8601 parsing was only introduced with JavaScript 1.8.5. This means that if you're supporting older browsers, you'll need to write your own parser on the client side or use a third-party library (I generally use my own regexp-based parser).
If you look at the website this article is on (http://tempus-js.com/) it is a JavaScript library for replacing the Date object with something that offers more functionality and is browser compatible down to IE6 & co.
If you're sending or storing local dates and times, it may actually be better to use the local time (as ISO 8601 does) together with the (Olson tz) name of the time zone it's in, instead of the time zone offset that ISO 8601 uses. This is more resilient against future time zone rule changes; at least that's what this article argues: http://fanf.livejournal.com/104586.html
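A minimal sketch of that approach (Python 3.9+ with zoneinfo; the stored values are made up for illustration): keep the wall-clock time plus the zone name, and resolve the offset only when you need an actual instant, so later rule changes are picked up from the tz database.

    from datetime import datetime
    from zoneinfo import ZoneInfo

    stored_wall, stored_zone = "2013-07-04T09:00:00", "Europe/Lisbon"  # hypothetical future local event
    resolved = datetime.fromisoformat(stored_wall).replace(tzinfo=ZoneInfo(stored_zone))
    print(resolved.isoformat())  # offset filled in from whatever the tz database says today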
The article doesn't mention the fact that ISO8601 strings will also sort correctly in chronological order, or does this not matter since epoch-seconds do as well (as long as there are enough leading zeroes)?
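For what it's worth, as long as the strings share an offset and precision, plain string sorting is already chronological:

    stamps = ["2012-05-04T12:20:34Z", "2011-12-31T23:59:59Z", "2012-05-04T09:00:00Z"]
    print(sorted(stamps))  # lexicographic order == chronological order here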
The first option is generally allowed by protocols where spaces are appropriate. Using the other alternatives would get one into the business of defining standards, which no sane person with an appreciation for the subtlety of that task would do unless they had no choice. Having multiple separators or mashing the numbers together would undermine the distinctiveness of ISO8601, and the distinctiveness allows someone to know the precise semantics of a date even when it is taken out of context.
I don't get the argument that timestamps are not human readable -- they are; the only thing you need is good viewing software. And yes, it is worth it, since at least 99.999999% of accesses to this data come from machines.
I was trying to say that it is better to invest a bit of time in improving data viewers/editors to present timestamps in a nice form than to waste a huge amount of effort on parsing and serializing stuff no human will ever see. Besides, "human readable" is a relative term -- even ASCII text requires serious machine processing to become a human-readable pixel pattern.
They're certainly not human readable, but I want to know who these people are that are reading the date times being passed around by two pieces of software.
Also, why are people passing around date times for specific timezones? Converting a UTC timestamp to local time is trivial in every language I've used. Converting a local time to another local time isn't.
It sounds to me like a problem that doesn't exist, and a solution that causes more problems. I'll be sticking with unix timestamps thank you very much.
I don't understand why you would ever use anything other than seconds/micros/millis since epoch. Basically no parsing; Efficiently storable; Simpler; Easier; Better.
The only argument I've ever heard is human readability... If you're reading these by hand so often, then just write a script/tool/whatever to convert them to a human-readable form. That's still easier, since you don't have to find an acceptable ISO8601 implementation. And frankly, how often are you reading the dates manually, and not as part of some log output of your program, where it could convert them to human-readable form before logging?
One reason is that Unix time is not actually the number of seconds since the epoch: it doesn't include leap seconds (a leap second is added about every 1½ years). Currently Unix time is about 30 or so seconds behind the true number of seconds since the (Unix) epoch.
No, you can't reliably calculate it forward or backward. To figure out the offset for dates in the past, you'd have to store the list of when leap seconds were added. That system is hardly "just the number of seconds since the epoch" any more; now it has arbitrary data attached. The official group that decides leap seconds can't predict them more than 6 months in advance, so how are you going to figure out in advance what's going to happen two years from now?
So every mail header in the world should represent:
Date: Thu, 17 May 2012 08:36:04 -0700 (PDT)
as
Date: 1337268964
because no one needs to look at mail headers and it should be the job of the user's mail reader to display it in a human readable form? I do not think it is so black and white. In the case of mail headers, the benefit of keeping the timestamps human readable outweighs the programmer cost for parsing them correctly.
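For what it's worth, the parsing cost is tiny anyway; in Python, for example, the header above round-trips to the epoch value with two standard-library calls:

    from email.utils import parsedate_tz, mktime_tz

    header = "Thu, 17 May 2012 08:36:04 -0700"
    print(mktime_tz(parsedate_tz(header)))  # 1337268964 -- the same epoch value as above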
"no one needs to look at mail headers and it should be the job of the user's mail reader to display it in a human readable form".
Exactly...
Who reads email headers? Maybe some server admin who's going through some backups of emails looking for something. He might even appreciate having a date format that's lexicographically ordered for his searching purposes.
Having a human readable date is like changing HTTP Content-Length to say "one million four hundred and fifty thousand and sixty four" so that when I'm reading through my raw server logs I can easily see the magnitude of the length of the responses.
I prefer epoch time for all the reasons you mentioned, but there are situations where you need something else. For example, you can't represent a holiday in epoch time (July 4th has four different start times in the lower 48), and if you need to do precise calculations epoch time is ambiguous around leap seconds.
I wish there was something simpler than 3339/8601, it really is a daunting mess of incompatible implementations and optional elements, but time is hard.
Reliability and lack of information. The problem with a "format" such as raw timestamps is that there are no defined parameters, especially across languages. I cannot rely on a timestamp sent in seconds being read as seconds on the other side (for example, JS does not use seconds, it uses ms), and there is no way to tell one from the other, either: I cannot look at a ms timestamp and say "this is in ms" as opposed to a second-based timestamp. It also lacks information, such as the time zone, and it has a fixed resolution. What happens when I want higher-resolution time because of customer feedback? Go back and change the whole system to use ms instead of s.
ISO8601 defines its parameters within the body. I know I am parsing seconds from an ISO duration because the value is suffixed with an 'S'. It can support variable precision, so I don't have to be second-accurate if it isn't needed.
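To illustrate the ambiguity with a throwaway Python sketch: the same raw number is a 2012 date if you treat it as seconds and a January 1970 date if you treat it as milliseconds, and nothing in the value tells you which.

    from datetime import datetime, timezone

    raw = 1337268964  # seconds? milliseconds? the number itself can't say
    print(datetime.fromtimestamp(raw, timezone.utc))         # 2012-05-17 15:36:04+00:00 if seconds
    print(datetime.fromtimestamp(raw / 1000, timezone.utc))  # 1970-01-16 11:27:48.964000+00:00 if milliseconds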
Unix time (which is UTC, and expressed in seconds) does not embed a time zone, which is useful information in some contexts.
For example, with e-mail you want to know both the absolute time (so the mail client can indicate how long ago that was) and the sender's local time (important for human interaction).
Implementation-wise, making the timezone explicit forces the implementer to think about it at least once and not make the mistake of writing a local time, which works while testing on systems that are in the same timezone and breaks down in the wild.
That's cool for Linux, going forward. Now we just have to take care of all the old mainframes and legacy systems that will never get their OS updated...
Since nobody else has mentioned this: an interesting essay on time by Erik Naggum. It was written with a view to implementing a time library for Common Lisp, but should be readable for non-Lisp users: