
Knwl.js – A JavaScript NLP - tilt
http://loadfive.com/os/knwl/
======
sagivo
This is not NLP. The code just finds words in a list of pre-defined words. It
ignores the context of the sentence and doesn't perform any "real" NLP analysis.

~~~
jmsduran
You are correct, it is not NLP. But I wouldn't downplay the potential of this
javascript library.

I have hundreds of digital documents that need to be sorted and archived. I am
able to extract a document's text via OCR, but now have to find a way to
sort/file them based off certain keywords present within the document (date,
time, place, etc).

I think Knwl.js will be a perfect fit for this. So far I have only spent a
couple of minutes looking through the demo and GitHub repository, but it looks
like with this library I will be able to write an extension that can help
detect restaurants and stores I frequently visit (which will further help me
sort all those pesky paper receipts).

I will definitely start playing around with Knwl.js once I get home. Good job!

------
spodkowinski
Nice parser collection. But there are more powerful NLP libraries out there for
JS:
[https://github.com/spencermountain/nlp_compromise](https://github.com/spencermountain/nlp_compromise)

~~~
thomasfoster96
That module looks pretty useful. I've been resorting to node-natural [0] and
then using a quick part of speech tagger so far, but this looks a lot better
and doesn't have a heap of baggage.

[0]
[https://github.com/NaturalNode/natural](https://github.com/NaturalNode/natural)

~~~
nyxtom
A while back, when I used to work a lot with this stuff, I wrote a library to
do a lot of sentiment analysis, tokenization, and part-of-speech tagging based
on several corpora I came across, after I realized natural didn't have what I
needed at the time.

[https://github.com/nyxtom/salient](https://github.com/nyxtom/salient)

------
dboshardy
knwl.english.adverbs = [ 'red', 'blue', 'green', 'orange', 'black', 'white',
'yellow', 'purple', 'really', 'sweet', 'sour', 'bitter', 'evil', ];

NLP. You keep using that word. I do not think it means what you think it
means.

~~~
ohitsdom
NLP or adverbs? Both seemed to be applied incorrectly...

~~~
Mahn
He means to say that Natural Language Processing isn't just scanning a text
for patterns, like this library does, hence it's not an NLP library in the
strict meaning of the term. It's an interesting project but it's really a
parser, not an NLP library.

~~~
ohitsdom
Yeah, I got that part. I was pointing out that this array of "adverbs" really
contains adjectives.

------
lelf
sed -e "s,NLP,some hardcoded parsers for English-only dates/emails/links/…,"

------
eterm
This is going to look like "oh typical HN shooting things down", but:

Parses 13/4/2012 as:

year: 2012

month: 13

day: 4

~~~
tonyblundell
Yeah seems obvious, but I don't think it really makes sense to guess that it's
a UK format date based on whether one of the numbers is bigger than 12.
4/5/2012 could be 4th May or 5th April and it would be impossible to tell
programmatically.

People really need to start using yyyy/mm/dd :-)
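
To make the ambiguity concrete, here is a minimal sketch of what a parser could do instead of guessing. The function name `classifySlashDate` and its return shape are hypothetical and not part of Knwl.js; the point is only that 13/4/2012 can be resolved while 4/5/2012 cannot.

```javascript
// Hypothetical helper, not part of Knwl.js: decide whether a d/m/yyyy-style
// string is day-first, month-first, ambiguous, or invalid.
// It only range-checks the numbers; it does not validate days per month.
function classifySlashDate(text) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text);
  if (!match) return { order: "invalid" };

  const first = Number(match[1]);
  const second = Number(match[2]);
  const year = Number(match[3]);
  const inRange = (n, max) => n >= 1 && n <= max;

  const dayFirstOk = inRange(first, 31) && inRange(second, 12);
  const monthFirstOk = inRange(first, 12) && inRange(second, 31);

  // Both readings are plausible: refuse to guess.
  if (dayFirstOk && monthFirstOk) return { order: "ambiguous", year };
  if (dayFirstOk) return { order: "day-first", year, month: second, day: first };
  if (monthFirstOk) return { order: "month-first", year, month: first, day: second };
  return { order: "invalid" };
}
```

With this, `classifySlashDate("13/4/2012")` is classified as day-first, while `classifySlashDate("4/5/2012")` is reported as ambiguous rather than silently picking one reading.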

~~~
gtf21
dd/mm/yyyy and yyyy/mm/dd are as valid as each other, since each proceeds in
either increasing or decreasing unit size. Having dates as mm/dd/yyyy makes
about as much logical sense as writing 10492 as 49210.

~~~
tonyblundell
It doesn't make sense mathematically because it isn't a mathematical notation,
it's linguistic in nature.

In Europe, '4th May' is shortened to 4/5.

In the US, 'May 4th' is shortened to 5/4.

Neither is right, wrong or illogical.

I'm from the UK, but have spent a lot of time working with US based
clients/colleagues. The only way I've found to avoid confusion and ambiguity
is to use YYYY/MM/DD.

~~~
gtf21
It still makes no sense to have MM/DD/YYYY.

------
dlsym
This could be a little more robust... Miss a dot and you get weird places like
"if". Relative times don't get extracted.

Something like: "Hi Frank,

I've looked up the OIDs in the Cisco manual. Let's meet for lunch at burger
king tomorrow at noon.

S" yields only "Cisco" - as a place. Well.

I know parsing natural language is hard. So keep up the good work and keep on
tweaking :-)

------
alistproducer2
This NLP is extremely naive. It's just a hard-coded dictionary lookup with
rudimentary English language syntax and structure recognition.

------
arihant
It is faster than some of the others I've tried. Good thing is, since it is
built to be basic and domain specific, the core should be kept as simple as it
is, and allow extensions to be built on top.

There are mess-ups around sanity checks, but looking at their source they're
actually trivial to fix. I'm forking it, and I hope they accept community pull
requests.

------
elcct
At first I was scared to click, because I thought it was going to program me
neurolinguistically to use Knwl.js, whatever that is.

------
n8m
I like it. I wonder if it would work with mixed in Markdown annotation.

------
esamek
These two sentences were not parseable: "What if we head to Rockville and buy
Daniel a beer or 2. May you fetch me a bottle of water?"

------
talleyrand
Can someone explain to me how that demo matches "Chicago"? From the source
provided, I don't see how that is possible.

~~~
yAnonymous
[https://github.com/loadfive/Knwl.js/blob/master/experimental...](https://github.com/loadfive/Knwl.js/blob/master/experimental_plugins/english.js)

knwl.english.prepositionalPhrases = [ ["about"], ["below"], ["in", "spite",
"of"], ["regarding"], ["above"], ["beneath"], ["instead", "of"], ["since"],
["according", "to"], ["beside"], ["into"], ["through"], ["across"],
["between"], ["like"], ["throughout"], ["after"], ["beyond"], ["near"],
["to"], ["against"], ["but"], ["of"], ["toward"], ["along"], ["by"], ["off"],
["under"], ["amid"], ["concerning"], ["on"], ["underneath"], ["among"],
["down"], ["on", "account", "of"], ["until"], ["around"], ["during"],
["onto"], ["up"], ["at"], ["except"], ["out"], ["upon"], ["atop"], ["for"],
["out", "of"], ["with"], ["because", "of"], ["from"], ["outside"], ["within"],
["before"], ["in"], ["over"], ["without"], ["behind"], ["inside"], ["past"],
["away", "from"] ];
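
One reading of this answer is that a capitalized word directly after a preposition gets treated as a place. That is a guess, not something the linked source is confirmed to do, and the sketch below is not Knwl.js code: `guessPlaces` and the shortened `prepositions` list are made up for illustration.

```javascript
// Hypothetical sketch, NOT the actual Knwl.js implementation: flag a
// capitalized word that directly follows a preposition as a possible place.
// Only a small single-word subset of the list above is used here.
const prepositions = new Set(["in", "at", "from", "to", "near"]);

function guessPlaces(text) {
  const words = text.split(/\s+/).filter(Boolean);
  const places = [];
  for (let i = 1; i < words.length; i++) {
    const previous = words[i - 1].toLowerCase();
    // Strip punctuation such as a trailing comma or period.
    const candidate = words[i].replace(/[^A-Za-z]/g, "");
    if (prepositions.has(previous) && /^[A-Z][a-z]+$/.test(candidate)) {
      places.push(candidate);
    }
  }
  return places;
}
```

Under that assumption, `guessPlaces("I flew to Chicago last week.")` picks up "Chicago", but `guessPlaces("Let's talk to Frank about it")` picks up "Frank" as well, which is the kind of context-free false positive other comments here describe.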

------
jlink
It does not find "London" when I add this to the demo text: "Regards from
London,"

