

How does Apple find dates and times in emails? - elchief
http://stackoverflow.com/questions/9294926/how-does-apple-find-dates-times-and-addresses-in-emails/9344555

======
bdunn
I created a (now defunct) product called Inbox Assistant which would parse
plain text emails and create events automatically in your calendar.

I had to figure out how to do essentially what Apple does, and I'm by no means
a computer scientist. Here's what I learned:
<http://blog.myinboxassistant.com/post/9613650483/experimenting-with-natural-language-processing>

------
zaroth
Link should point here: <http://www.miramontes.com/writing/add-cacm/>

~~~
Wilya
Note this is from around 1998. And it seems heavily grammar based.

Things might have changed, since the hardware has become more powerful, and
some statistical techniques have proven pretty good at Named-Entity
recognition.

~~~
huxley
It's possible that the current detectors are still based on the work of Apple's
Advanced Technology Group. Apple sued HTC over its patents, and this article
about the suit mentions the technology being added back in 10.5:

<http://forums.appleinsider.com/showthread.php?t=138975>

------
huxley
Apple's Advanced Technology Group had a few projects that dealt with
processing text to discover information structure. Jim Miller who was part of
ATG has written a few articles about the work they did:

Data Detectors: <http://www.miramontes.com/writing/add-cacm/>

LiveDoc: <http://www.miramontes.com/writing/livedoc/>

DropZones: <http://www.miramontes.com/writing/dropzones/>

Oops ... didn't notice zaroth's earlier link
<http://news.ycombinator.com/item?id=3633077> to the Miramontes articles,
sorry for the dupe.

------
papercruncher
The timex module in NLTK contrib does a fairly good job of that, though it
relies on regular expressions:

<https://github.com/nltk/nltk_contrib/blob/master/nltk_contrib/timex.py>
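
To give a rough idea of what regex-based temporal tagging looks like, here is a
minimal sketch. It is not the code from the timex module linked above; the
pattern list, the `find_timex` and `tag_timex` helper names, and the use of a
`<TIMEX2>` wrapper tag are all illustrative assumptions, and it only covers a
handful of English date and time forms.

```python
import re

# Hypothetical, deliberately small vocabulary; a real tagger covers far more.
MONTHS = (r"(?:jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?"
          r"|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?"
          r"|dec(?:ember)?)")
WEEKDAYS = r"(?:monday|tuesday|wednesday|thursday|friday|saturday|sunday)"

PATTERNS = [
    r"\b\d{4}-\d{2}-\d{2}\b",                                  # 2012-02-15
    rf"\b{MONTHS}\.?\s+\d{{1,2}}(?:st|nd|rd|th)?(?:,?\s+\d{{4}})?\b",  # Feb 15, 2012
    rf"\b(?:(?:next|last|this)\s+)?{WEEKDAYS}\b",              # next Friday
    r"\b(?:today|tomorrow|yesterday|tonight)\b",               # relative days
    r"\b\d{1,2}(?::\d{2})?\s*(?:am|pm)\b",                     # 3pm, 3:30 pm
    r"\b\d{1,2}:\d{2}\b",                                      # 15:00
]

# One combined, case-insensitive pattern; earlier alternatives win on ties.
TIMEX_RE = re.compile("|".join(f"(?:{p})" for p in PATTERNS), re.IGNORECASE)


def find_timex(text):
    """Return every substring that looks like a date or time expression."""
    return [m.group(0) for m in TIMEX_RE.finditer(text)]


def tag_timex(text):
    """Wrap each matched expression in an (assumed) <TIMEX2> tag."""
    return TIMEX_RE.sub(lambda m: f"<TIMEX2>{m.group(0)}</TIMEX2>", text)


if __name__ == "__main__":
    print(find_timex("Lunch next Friday at 3:30 pm, or on February 15, 2012?"))
    print(tag_timex("Call me tomorrow at 3pm"))
```

This only finds the spans; resolving "next Friday" to an actual calendar date
is a separate step, and pure regexes like these will miss or mis-tag anything
outside the patterns listed.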

------
foobarbazetc
This is called Timex tagging. See <http://www.timeml.org/> etc.

I believe DARPA runs some sort of competition every year or so where different
algorithms are submitted to extract temporal data from text.

------
ashwinl
From the Stack Overflow link, a helpful video on machine learning:
<http://videolectures.net/mlas06_nigam_tie/>

The 14-minute mark touches on the question above.

------
jonhendry
lsm(1) - Latent Semantic Mapping tool

