

Natty: a natural language date parser in Java - joestelmach
http://natty.joestelmach.com/

======
RiderOfGiraffes
OK, it can't parse:

    
    
      The day before Sunday week
      The day before Sunday
      Next Sunday
      Sunday
    

Now, you may have tried putting in "Sunday" and it worked, but it didn't when
I tried it.

I had a leading space.

Going to the "Let us know" link takes me to a GitHub issue-reporting page. I
stared at it for about 30 seconds, then decided life's too short and I'd
report my findings here.

So I've made it parse some of the above. It appears that there were odd
spaces, but I can't retrieve the exact cases now, so I can't really make a
sensible bug report.

But it still doesn't parse "The day before Sunday week," nor "The day before a
week on Sunday." It also doesn't parse "26-05-2010," the usual UK date format,
but you probably knew that.
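
For reference, a minimal sketch of reading the day-first form with the
standard java.time API. The class name UkDate and the fixed "dd-MM-uuuu"
pattern are assumptions for illustration; this is not how natty's grammar
handles dates, and it does not resolve the day/month ambiguity with US-style
input.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class UkDate {
    // Day-first pattern (hypothetical choice for this sketch, not natty's grammar).
    private static final DateTimeFormatter UK =
            DateTimeFormatter.ofPattern("dd-MM-uuuu");

    static LocalDate parse(String text) {
        // trim() drops leading and trailing whitespace before parsing.
        return LocalDate.parse(text.trim(), UK);
    }

    public static void main(String[] args) {
        System.out.println(parse("26-05-2010")); // prints 2010-05-26
    }
}
```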

Is your test data set available?

~~~
joestelmach
Thanks for the feedback.

I'll admit to the amateur mistake of not handling leading whitespace (that's
now fixed on the master branch).

It looks like the only example from your list that couldn't be parsed is "The
day before Sunday week". Forgive my ignorance, but I've never seen 'week' used
in that context here in the US. If you'd be willing to describe the proper
use, I'll look at implementing it.

As for the UK format, please see the issue I created here:
<http://github.com/joestelmach/natty/issues#issue/3>, and lend some advice if
you'd like.

The test set is available here:
[http://github.com/joestelmach/natty/blob/master/src/test/gun...](http://github.com/joestelmach/natty/blob/master/src/test/gunit/DateParser.testsuite)

~~~
RiderOfGiraffes
In the UK and Australia, "Sunday week" would be taken to mean one week after
the Sunday after today. Also phrased as "a week on Sunday."
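
As a rough sketch of that rule in java.time terms (the class and method names
are made up for illustration, and "the Sunday after today" is assumed to mean
strictly after today, so on a Sunday it skips to the following one):

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

public class SundayWeek {
    // "Sunday week": one week after the first Sunday strictly after today.
    static LocalDate sundayWeek(LocalDate today) {
        return today.with(TemporalAdjusters.next(DayOfWeek.SUNDAY)).plusWeeks(1);
    }

    // "The day before Sunday week": step back one day from the result above.
    static LocalDate dayBeforeSundayWeek(LocalDate today) {
        return sundayWeek(today).minusDays(1);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2010, 5, 26);    // a Wednesday
        System.out.println(sundayWeek(today));          // prints 2010-06-06
        System.out.println(dayBeforeSundayWeek(today)); // prints 2010-06-05
    }
}
```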

I have a use for the library, but cannot use Java. It's nice to know that such
a solution exists, and perhaps I can push towards an equivalent for my contexts.

Thanks for your response.

------
mmastrac
Nice, joestelmach! We're using JChronic (<https://jchronic.dev.java.net/>) at
DotSpots right now for this stuff internally. I'd love to switch to something
better maintained.

FWIW, Ruby's chronic contains some date formats that this doesn't seem to
support right now ('5 minutes ago'):

<http://chronic.rubyforge.org/>

Aside: I'd really like to see a publicly-available set of natural language
date test cases shared across these projects.

~~~
joestelmach
Thanks for the feedback.

The chronic project was actually the original motivation behind natty. I think
the chronic project is great, but I believe the grammar-based, AST approach is
the way to go for long-term maintainability.

Implementing relative times has been on my list of things to do (in addition
to recurrence.) I created a feature request here:
<http://github.com/joestelmach/natty/issues#issue/4>, so feel free to list any
time formats you'd like to see implemented.

I agree that a generic list of test cases would be nice. Any thoughts on how
such a list should be published?

~~~
phaylon
A list that can be used for interoperability tests would be great, even across
languages. For example, Perl has DateTime::Format::Natural that provides this
functionality. There's a list of supported inputs at
[http://search.cpan.org/dist/DateTime-Format-Natural/lib/Date...](http://search.cpan.org/dist/DateTime-Format-Natural/lib/DateTime/Format/Natural/Lang/EN.pm).

I guess such a list of test-cases could simply be a JSON file containing the
input strings plus an output specification. The options for output would only
be explicit datetimes, datetimes relative to now, or timespans.
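
To make that concrete, here is one way a Java harness might model the three
output kinds. Everything in it is an assumption for the sake of discussion:
the JSON field names in the comment, the ExpectedOutput class, and the choice
of ISO-8601 strings for datetimes and offsets. It is a sketch of a possible
format, not an existing one.

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class ExpectedOutput {
    // Hypothetical entries in the shared JSON file:
    //   {"input": "5 minutes ago",
    //    "expect": {"kind": "relative", "start": "-PT5M"}}
    //   {"input": "26 May 2010 at noon",
    //    "expect": {"kind": "explicit", "start": "2010-05-26T12:00:00"}}

    enum Kind { EXPLICIT, RELATIVE, TIMESPAN }

    final Kind kind;
    final String start; // ISO-8601 datetime, or an ISO-8601 duration for RELATIVE
    final String end;   // ISO-8601 datetime, used by TIMESPAN only (else null)

    ExpectedOutput(Kind kind, String start, String end) {
        this.kind = kind;
        this.start = start;
        this.end = end;
    }

    // Resolve the expected start against the "now" the test run was given.
    LocalDateTime resolveStart(LocalDateTime now) {
        if (kind == Kind.RELATIVE) {
            return now.plus(Duration.parse(start));
        }
        return LocalDateTime.parse(start);
    }

    // End of the expected range; only meaningful for TIMESPAN.
    LocalDateTime resolveEnd() {
        return end == null ? null : LocalDateTime.parse(end);
    }
}
```

Passing "now" in explicitly keeps the relative cases deterministic, so the
same file could be replayed by parsers in any language.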

