
Rasa NLU: Open-source bot tool for natural language understanding - geospeck
https://rasa.ai/
======
espadrine
For NLP, they use either MITIE[0] or spaCy[1].

That said, from my experience, you can get surprisingly far with simple
systems; for instance, queread[2] relies on graph learning and statistics.

[0]: [https://github.com/mit-nlp/MITIE](https://github.com/mit-nlp/MITIE)

[1]: [https://spacy.io/](https://spacy.io/)

[2]:
[https://github.com/espadrine/queread#workings](https://github.com/espadrine/queread#workings)

------
IshKebab
A while ago I looked for information on how Alexa, Wit.ai, Nuance Mix etc. do
this intent classification and didn't find anything.

These guys have posted a nice blog post about their approach:

[https://conversations.golastmile.com/do-it-yourself-nlp-
for-...](https://conversations.golastmile.com/do-it-yourself-nlp-for-bot-
developers-2e2da2817f3d#.l62loslgd)

They say they sum the word vectors in the sentence. But it seems to me that
that would make the result independent of word order (i.e. "when does Tesco
open?" and "Open Tesco when does" come out the same). I thought I had tested
that and it didn't work, but I just tried saying "Tesco open does when?" to
Alexa and it said "Sorry, I don't have the business hours for Tesco".
Inconclusive I'd say, but interesting anyway!
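
A minimal sketch of why summing word vectors loses word order. The vectors here are made-up toy integers derived from each word, not real embeddings from MITIE or spaCy, and `toy_embedding` and `sentence_vector` are hypothetical names used only for illustration:

```python
import random
import re


def toy_embedding(word, dim=8):
    """Return a deterministic toy integer vector for a word (not a real embedding)."""
    rng = random.Random(word)  # seeding with the word gives the same vector every run
    return [rng.randint(-100, 100) for _ in range(dim)]


def sentence_vector(sentence, dim=8):
    """Sum the toy vectors of all words in the sentence, ignoring punctuation and case."""
    total = [0] * dim
    for word in re.findall(r"[a-z']+", sentence.lower()):
        for i, value in enumerate(toy_embedding(word, dim)):
            total[i] += value
    return total


original = sentence_vector("when does Tesco open?")
shuffled = sentence_vector("Open Tesco when does")

# Addition is commutative, so any reordering of the same words gives the same sum.
print(original == shuffled)
```

Any classifier fed only this summed vector cannot tell the two word orders apart, whatever embeddings are used.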

~~~
bendyBus
yeah you're quite right, intents are built with a bag-of-words model, which
doesn't take order into account. Entity extraction does though. If you find a
case where word order is really important for getting intents right I'd love
to know about it! We could find a way to make that work.

~~~
espadrine
> _If you find a case where word order is really important for getting intents
> right_

This may be facetious of me, since it's still fairly uncommon, but here it is.

Set up the go game, go up game the set.

State the ban law, ban the state law.

Drive the car by the park, park the car by the drive.
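
A quick check of the three pairs above, as a sketch: each pair has exactly the same bag of words, so a pure bag-of-words model sees them as identical, while something order-aware such as word bigrams tells them apart. Bigrams are just one possible order-sensitive feature, chosen here for illustration; nothing in this thread says Rasa NLU uses them.

```python
import re
from collections import Counter

PAIRS = [
    ("Set up the go game", "go up game the set"),
    ("State the ban law", "ban the state law"),
    ("Drive the car by the park", "park the car by the drive"),
]


def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def bag(text):
    """Bag of words: word counts with all order information discarded."""
    return Counter(tokens(text))


def bigrams(text):
    """Set of adjacent word pairs, which does depend on word order."""
    words = tokens(text)
    return set(zip(words, words[1:]))


for first, second in PAIRS:
    same_bag = bag(first) == bag(second)
    same_bigrams = bigrams(first) == bigrams(second)
    print(f"{first!r} vs {second!r}: same bag={same_bag}, same bigrams={same_bigrams}")
```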

------
Rabidgremlin
It's the "conversation" part that is really tricky... I have been working on a
bot for a large corporation for the last few months and we have been using
Inkle's Ink narration/dialog engine for this. Works very well. They let me
open source the framework:
[https://github.com/rabidgremlin/Mutters](https://github.com/rabidgremlin/Mutters)
It uses OpenNLP for intent identification and NER, and Ink for conversation
state and "scripting".

------
Maarten88
This is interesting, I've been using LUIS for some time now and an open source
alternative - especially one that is drop-in API compatible - is very welcome.

However I can't find any information in the docs on how comparable the results
are (i.e. does it have built-in date and time entity recognition like LUIS?).
Most importantly: what languages does this support? All the examples are in
English only. Is it even language aware, or can you train a model in any
language? I'd be very interested if this were to support languages that LUIS
does not have (like my language: Dutch).

~~~
tmbo
Currently it supports English and German. In general we need a word embedding
for each language. If that has been created by someone else, it's rather easy
to integrate new languages.

------
ragebol
Nice, I've been looking for an offline solution to do this sort of thing to
run on a robot for RoboCup@Home.

Perhaps
[http://sag.art.uniroma2.it/demo-software/huric/](http://sag.art.uniroma2.it/demo-software/huric/)
might also provide some training data. It's annoying, though, that I can't
just download that corpus but have to email some guy first.

------
niklasber
Seems like it doesn't say anywhere which language(s) it supports? Guessing
it's English only.

~~~
tyingq
[http://rasa-nlu.readthedocs.io/en/latest/config.html](http://rasa-nlu.readthedocs.io/en/latest/config.html)

 _" language : language of your app, can be en (English) or de (German)."_

------
mark_l_watson
Looks like an interesting project, based on scikit-learn and spaCy. The project
provides some simple training files for the domain of asking about
restaurants.

It would be useful to also have very large training data sets available.

------
nrp12
Cool stuff - was looking for open-source NLU alternatives to luis.ai. Thanks
to the emulators, this fits right in.

Does anyone know why rasa chose mitie/spacy and not stanfordnlp?

~~~
bendyBus
we could integrate with other backends, including NLTK & CoreNLP. The Stanford
stuff is under GPL though, and we prefer to promote startup-friendly licenses.

------
qhoc
How well does this scale? Let's say I have 500MB of JSON files from restaurant
info and user reviews.

~~~
samcodes
I think you would have to do some processing on those; I'm pretty sure the
input format has sentences classified by intent.
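
A rough sketch of the kind of processing meant here: turning hand-labelled sentences into a list of text/intent records and dumping them as JSON. The `text` and `intent` key names, the intent labels, and the `to_training_examples` helper are assumptions for illustration only; check the Rasa NLU docs for the actual training file schema.

```python
import json


def to_training_examples(labelled_sentences):
    """Convert (sentence, intent) pairs into a list of dicts, skipping blank sentences.

    The "text" and "intent" keys are assumed names, not a confirmed schema.
    """
    examples = []
    for sentence, intent in labelled_sentences:
        cleaned = sentence.strip()
        if cleaned:
            examples.append({"text": cleaned, "intent": intent})
    return examples


# Hypothetical hand-labelled sentences; the labelling step is the real work.
labelled = [
    ("show me a mexican place in the centre", "restaurant_search"),
    ("  is it open on sundays?  ", "opening_hours"),
    ("   ", "opening_hours"),
]

examples = to_training_examples(labelled)
print(json.dumps(examples, indent=2))
```

Raw reviews and restaurant info carry no intent labels, so the 500MB would still need to be cut down to short, labelled utterances before a step like this is useful.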

------
very_goord
Great stuff. Is there Docker support already?

~~~
bendyBus
yes! Docker Cloud isn't quite working yet but the Dockerfile should work :)

~~~
very_goord
Awesome!! Many thanks guys

------
Coldewey
like the idea :-) good job!

