

Date parsing performance on iOS (NSDateformatter vs sqlite) - MishraAnurag
http://vombat.tumblr.com/post/60530544401/date-parsing-performance-on-ios-nsdateformatter-vs

======
stevoski
A class like NSDateFormatter is designed to handle a wide range of date
formats. This usually results in sub-optimal performance. If you find it too
slow and you have a known, specific date format you should write a specific
fast parser.

I did the same with Java's Integer.parseInt(...) method. It is an interesting
task to go through.

Now I'll spend the rest of this rainy afternoon playing around with writing a
fast ISO date parser :)

Edit: It seems Java's Joda Time library already parses ISO dates really
quickly: 7 seconds for 4 million on my MBP.

Edit: A fast custom date parser for ISO dates I just wrote can parse 4 million
dates in 150 milliseconds.
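
For illustration, here is a minimal sketch of what a format-specific parser can look like. It is not stevoski's parser (which was not posted); it assumes the input is always exactly `YYYY-MM-DDTHH:MM:SSZ` in UTC, and the names `parse_iso8601_utc` and `days_from_civil` are made up for this example. The day count uses a well-known civil-date-to-days formula.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Days since 1970-01-01 for a proleptic Gregorian date (requires y >= 1). */
static int64_t days_from_civil(int y, int m, int d) {
    y -= (m <= 2);                        /* count Jan/Feb as part of the previous year */
    int64_t era = y / 400;                /* 400-year cycle index */
    int yoe = (int)(y - era * 400);       /* year within the cycle, 0..399 */
    int doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1; /* day of year, March-based */
    int doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;          /* day within the cycle */
    return era * 146097 + doe - 719468;   /* shift so that 1970-01-01 is day 0 */
}

/* Read exactly n decimal digits; returns 0 on success, -1 on a non-digit. */
static int digits(const char *s, int n, int *out) {
    int v = 0;
    for (int i = 0; i < n; i++) {
        if (s[i] < '0' || s[i] > '9') return -1;
        v = v * 10 + (s[i] - '0');
    }
    *out = v;
    return 0;
}

/* Parse "YYYY-MM-DDTHH:MM:SSZ" into Unix seconds. Returns 0 on success, -1 on error.
   Assumption: fixed layout, UTC only. The day is not checked against the
   month length, and leap seconds are rejected. */
int parse_iso8601_utc(const char *s, int64_t *out) {
    int y, mo, d, h, mi, sec;
    assert(s != NULL && out != NULL);
    if (strlen(s) != 20) return -1;
    if (s[4] != '-' || s[7] != '-' || s[10] != 'T' ||
        s[13] != ':' || s[16] != ':' || s[19] != 'Z') return -1;
    if (digits(s, 4, &y) || digits(s + 5, 2, &mo) || digits(s + 8, 2, &d) ||
        digits(s + 11, 2, &h) || digits(s + 14, 2, &mi) || digits(s + 17, 2, &sec))
        return -1;
    if (y < 1 || mo < 1 || mo > 12 || d < 1 || d > 31 ||
        h > 23 || mi > 59 || sec > 59) return -1;
    *out = days_from_civil(y, mo, d) * 86400 + h * 3600 + mi * 60 + sec;
    return 0;
}
```

Because the layout is fixed, the parser never scans a format string, allocates, or consults locale or time zone data, which is where a general-purpose formatter tends to spend its time.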

~~~
gilgoomesh
It does seem as though Apple should implement a separate optimized path for
the very common case of ISO-8601 date formats.

Although I'm not sure how many people on iOS need millions of dates parsed.

------
pilif
_> Many web services choose to return dates in something other than a unix
timestamp (unfortunately)_

Wrong. That's very fortunate. Unix time stamps have some serious deficiencies
as a data type for storing time information. For one, they lack precision: one
second just might not do it. They also lack any time zone information. You
will never know what time zone a specific time stamp is in. GMT? UTC? The time
zone the server is in?

Sure. Maybe you are lucky and it's documented (it probably isn't because
people who care about such things are not using unix time stamps to begin
with), but using a string time stamp formatted in ISO means that no
documentation is needed. The encoding is good enough to store any sub second
time stamp including time zone info.

That way, you can turn any of these into whatever your environment uses
internally which you will then use in conjunction with the library routines to
deal with all the difficulties related to doing math with dates (how many days
in a month? What about leap years? What about time zones? Not really hard
issues, but many to keep in mind and many possible causes for bugs)

~~~
rorrr2
Unix timestamps are always UTC (GMT)

Quote:

> _Unix time, or POSIX time, is a system for describing instants in time,
> defined as the number of seconds that have elapsed since 00:00:00
> Coordinated Universal Time (UTC), Thursday, 1 January 1970_

~~~
colanderman
More importantly (the GP missed this as well), Unix timestamps can't convey
_local time_. Local time has UI implications, e.g. the query "is this event on
a weekend" is not generally answerable without the time zone.

~~~
DannyBee
Entirely true, but not necessarily relevant.

Local time is a weird thing and changes all the time.

For giggles, look at the history of timezone rule changes in tzdata.

Most timezones, including those in the US, have at least one duplicate hour
per year (i.e. the same local time occurs twice).

Local times are not an appropriate way to store time.

Note: ISO8601 does not give you local time anyway, since you cannot infer the
timezone from the time offset.
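
To make the offset point concrete, here is a small sketch that parses the trailing offset of an ISO 8601 timestamp into seconds east of UTC. The function name `parse_utc_offset` is hypothetical, and only the `Z` and `±HH:MM` forms are handled. The result for `-05:00` is the same number whether the sender was on US Eastern Standard Time or US Central Daylight Time, so the offset pins down the instant but not the zone rules.

```c
#include <assert.h>
#include <stddef.h>

/* Parse "Z", "+HH:MM" or "-HH:MM" into seconds east of UTC.
   Returns 0 on success, -1 on malformed input. */
int parse_utc_offset(const char *s, int *seconds_east) {
    assert(s != NULL && seconds_east != NULL);
    if (s[0] == 'Z' && s[1] == '\0') {
        *seconds_east = 0;
        return 0;
    }
    if (s[0] != '+' && s[0] != '-') return -1;
    /* Check characters in order so a short string fails before we read past its end. */
    for (int i = 1; i <= 5; i++) {
        if (i == 3) {
            if (s[i] != ':') return -1;
        } else if (s[i] < '0' || s[i] > '9') {
            return -1;
        }
    }
    if (s[6] != '\0') return -1;
    int hh = (s[1] - '0') * 10 + (s[2] - '0');
    int mm = (s[4] - '0') * 10 + (s[5] - '0');
    if (hh > 23 || mm > 59) return -1;
    int total = hh * 3600 + mm * 60;
    *seconds_east = (s[0] == '-') ? -total : total;
    return 0;
}
```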

~~~
colanderman
No, that's exactly my point. Sometimes you want an event to occur at 9 AM EST,
_regardless of how that translates to UTC_.

That you cannot infer the timezone from the time offset in ISO 8601 is a good
point though.

------
0x0
I wonder if this could be improved by just using the standard C library
strptime(3) instead of going through sqlite?

~~~
jurre
I was wondering the same thing since that's also what Apple recommends in
situations like these. This is what I got on the same hardware:

strptime_l took 58.803 seconds

NSDateFormatter took 107.570 seconds

sqlite3 took 7.022 seconds

And with MishraAnurag's suggestion of using timegm instead of mktime:

strptime_l took 21.656 seconds

NSDateFormatter took 108.163 seconds

sqlite3 took 7.096 seconds
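
For reference, a minimal sketch of the strptime plus timegm combination. It uses plain `strptime` rather than the `strptime_l` variant timed above, and the format string is an assumption, since the benchmark's exact input format is not shown here. `mktime` interprets the broken-down time as local time and so has to consult the local time zone rules, while `timegm` (a common BSD/glibc extension, not standard C) treats it as UTC; that is presumably where the difference in the timings comes from.

```c
#define _GNU_SOURCE /* exposes strptime and timegm on glibc; must precede all includes */
#include <assert.h>
#include <string.h>
#include <time.h>

/* Parse "YYYY-MM-DDTHH:MM:SS" (assumed to be UTC) into a time_t.
   Returns 0 on success, -1 if the string does not match the format. */
int parse_with_strptime(const char *s, time_t *out) {
    struct tm tm;
    assert(s != NULL && out != NULL);
    memset(&tm, 0, sizeof tm);            /* strptime only sets the fields it parses */
    const char *end = strptime(s, "%Y-%m-%dT%H:%M:%S", &tm);
    if (end == NULL || *end != '\0') return -1;
    *out = timegm(&tm);                   /* fields are UTC, so no local zone lookup */
    return 0;
}
```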

~~~
coldcode
Why not see what sqlite is doing and do something in C yourself that solves
the actual problem? It's not surprising that a general purpose Obj-C (or any
language) class isn't terribly fast at one specific thing.

~~~
jurre
Yeah that would probably be the way to go ultimately if you're doing a lot of
date parsing, I agree!

------
becauseICan
These two methods start producing different dates after about year 3515.

~~~
MishraAnurag
That's interesting. I'm not sure of the significance of that year or how that
relates to the algorithm in Meeus' book. This web page talks about the date
algorithms in Meeus' book in some detail but the math is beyond me -
[http://mysite.verizon.net/aesir_research/date/jdimp.htm](http://mysite.verizon.net/aesir_research/date/jdimp.htm).
The Julian day conversion algorithm here is the same one used by SQLite.
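
As a rough sketch of the Julian day approach, the following follows the textbook form of Meeus' Gregorian-to-Julian-day formula and then shifts the result to the Unix epoch (Julian day 2440587.5). The function name `unix_from_julian` is invented for this example, and it is not a copy of SQLite's code.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a Gregorian UTC date and time to Unix seconds via the Julian day.
   The day is not checked against the month length. */
int64_t unix_from_julian(int y, int m, int d, int hh, int mm, int ss) {
    assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
    if (m <= 2) {            /* treat Jan/Feb as months 13/14 of the previous year */
        y--;
        m += 12;
    }
    int a = y / 100;
    int b = 2 - a + a / 4;   /* Gregorian calendar correction */
    int64_t x1 = (int64_t)36525 * (y + 4716) / 100;   /* floor(365.25 * (y + 4716)) */
    int64_t x2 = (int64_t)306001 * (m + 1) / 10000;   /* floor(30.6001 * (m + 1)) */
    int64_t jd_plus = x1 + x2 + d + b;  /* Julian day at midnight, plus 1524.5 */
    /* 2442112 = 2440587.5 (Julian day of the Unix epoch) + 1524.5 */
    return (jd_plus - 2442112) * 86400 + hh * 3600 + mm * 60 + ss;
}
```

Folding the two half-day constants together keeps the whole computation in integer arithmetic.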

------
stcredzero
NSDateFormatter is designed for the UI and "Swiss Army knife" use case. SQLite
is for a back end data context.

Two different design goals. Two different sets of design trade offs. (For
"design" in the Rich Hickey sense.)

------
dante_dev
Hmm... can I see the NSDateFormatter code? I feel like you're using it wrong.
You need to cache the NSDateFormatter somewhere (allocating it is really
expensive) and reuse the same instance to convert each string to
NSDate*.

~~~
MishraAnurag
A link to the source code is present in the article. You can find it here -
[https://gist.github.com/AnuragMishra/6474321](https://gist.github.com/AnuragMishra/6474321)

The NSDateFormatter is already being cached. That was my first suspicion on
finding this issue too. We are using one formatter per thread in the
production code, but that doesn't apply for the code I've posted since
everything is done on the main thread using a single formatter instance.

~~~
asveikau
sqlite happens _after_ the objc version. Are the results any different when
the sqlite code comes first? Actually the fairest comparison would be to store
the dataset in a file, and have a process for timing NSDateFormatter, and a
different one for testing sqlite. This would eliminate any advantage that a
warm cache might give you.

------
andymoe
> _To parse a million randomly generated dates on an iPhone 5 running iOS 7,
> NSDateFormatter took a whooping 106.27 seconds, while the SQLite version
> took just 7.02 seconds._

Yes, NSDateFormatter is slower than other methods, including some C libraries
out there and this novel approach for turning a string into an NSDate. However,
in most instances it's plenty fast enough and has a bunch of useful
functionality [1], the least interesting of which is easily turning a string
into an NSDate.

If you are optimizing this aspect of your code first, you are likely wasting
your time. I would suggest iOS/Mac developers get to know NSDateFormatter
intimately, especially if you are displaying date/time information to users
anywhere in your apps.

[1] [http://goo.gl/7flGRI](http://goo.gl/7flGRI)

~~~
MishraAnurag
That's a fair point. NSDateFormatter is fast enough and you should never
replace it with anything else unless you know for sure that it's a problem.
Well that goes for any optimization. In my particular case, it was really a
bottleneck, and the time difference of 100 seconds vs 7 seconds meant a user
will not have to wait for 93 seconds during the initial import step. I am not
suggesting we start getting rid of NSDateFormatter as it is a very valuable
tool which we use for any and all date formatting ourselves, just not during
massive imports anymore.

------
pothibo
That's bad. Very bad. iOS apps use timestamps everywhere, so I'm sure this
could be a low-hanging-fruit optimization for many apps out there.

You should wrap this piece of code in a nice API and let people benefit from
your findings. (Someone else could do it too, I'm just saying!)

------
asveikau
For those million timestamps I am sure that those extra allocations from the
statement APIs are not helpful. Since he references the C code sqlite is using
(at a quick glance it looks pretty contained) I don't know why he doesn't just
include it directly in his project and call it from objc, no statement API
needed.

[Edit: I see now that in the test there is only 1 statement object ever
created for a test of a million dates. Better than I thought initially. But my
guess is the statement object still creates some degree of inefficiency not
found in directly calling the C version.]

7 seconds is a long time in CPU terms, I am sure that he can do better.

~~~
MishraAnurag
Using the relevant SQLite implementation directly cuts that time in half,
to about 3.5-3.7 seconds over a few runs.

------
jergosh
terrible use of percentages.

~~~
winter_blue
Yea, _15 times faster_ would have been much clearer. On the other hand, 1400%
has "shock value" and makes for a very link-baity title.
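
Both figures can be reproduced from the timings quoted in the thread (106.27 seconds for NSDateFormatter, 7.02 seconds for SQLite). The first expression is the speedup ratio, the second is the same gap stated as a percentage difference:

```latex
\[
\frac{106.27}{7.02} \approx 15.1,
\qquad
\frac{106.27 - 7.02}{7.02} \times 100\% \approx 1414\%
\]
```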

------
willvarfar
Why convert each date with a select if they are putting it into the db anyway?
Why not just let sqlite do the conversion as part of the insert statement?

~~~
MishraAnurag
That was to make a fair comparison between the two approaches since
NSDateFormatter's dateFromString gives an NSDate, while SQLite was handing
back an integer.

But you are right. In production, it makes more sense to let SQLite handle the
conversion and insertion in the same statement.

