I can recommend https://spacy.io as a low-fuss way to get up and running quickly.
Oversimplifying quite a bit, if spaCy focuses on syntax, then Gensim focuses on semantics (https://en.wikipedia.org/wiki/Distributional_semantics). Gensim has an active community and is well documented.
If you have the data, can spend a few days experimenting, and want something that can train orders of magnitude faster than deep learning, there's Vowpal Wabbit. Prediction speed is blazing. Results can be close to the state of the art at a much lower cost. It's written in C++, with bindings for many languages. It's very poorly documented.
I've never taken a gander at Facebook's fastText.
spaCy 2 is reconfigured for deep learning. It's still in alpha, but there's already a lot there:
I wrote a whole neural network library to get this done, because Tensorflow and Theano are terrible for the type of models NLP needs. Joke was sort of on me, because PyTorch came out just as I was finishing :). But it's actually very good to own the dependencies of the library anyway, since it's such an important part of things. Having our own NN library has made it easy to make lots of small innovations along the way. The best one is hash-kernel powered embeddings, which have just been published as "bloom embeddings". I've been using these for the last six months, with great results.
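To make the "hash-kernel powered embeddings" idea concrete, here is a minimal sketch of the general hashing trick for embeddings. It is not the code from spaCy or its neural network library: the class name `HashedEmbeddings`, the table sizes, and the use of MD5 as the hash function are all assumptions chosen for illustration. Each key is hashed several times into a small fixed-size table, and its vector is the sum of the rows it lands on, so unseen words still get a vector and the table never grows with the vocabulary.

```python
import hashlib
import random


class HashedEmbeddings:
    """Toy hash-kernel embedding table (illustrative, not spaCy's implementation).

    Each key is mapped to `n_hashes` rows of a fixed-size table by differently
    salted hashes; its vector is the sum of those rows.
    """

    def __init__(self, n_rows=1000, width=8, n_hashes=4, seed=0):
        rng = random.Random(seed)
        self.n_rows = n_rows
        self.width = width
        self.n_hashes = n_hashes
        # Small random initial weights; a real model would train these.
        self.table = [
            [rng.uniform(-0.1, 0.1) for _ in range(width)]
            for _ in range(n_rows)
        ]

    def rows_for(self, key):
        """Return the table rows this key hashes to (one per salted hash)."""
        rows = []
        for i in range(self.n_hashes):
            digest = hashlib.md5(f"{i}:{key}".encode("utf-8")).digest()
            rows.append(int.from_bytes(digest[:8], "little") % self.n_rows)
        return rows

    def __call__(self, key):
        """Sum the hashed rows to produce the vector for `key`."""
        vec = [0.0] * self.width
        for r in self.rows_for(key):
            row = self.table[r]
            for j in range(self.width):
                vec[j] += row[j]
        return vec


emb = HashedEmbeddings()
vector = emb("unseen-word")  # any string gets a vector, no vocabulary needed
```

Two keys can collide on one row, but they are unlikely to collide on all of them, which is why several hashes are used instead of one.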
You can read my thoughts about NLP best-practices here:
Like Sebastian (and pretty much anyone else), I think the two improvements to emphasise in NLP are sequence models like LSTM, and transfer learning. That's what's better now with deep learning: what we used to call semi-supervised approaches and domain adaptation now work much better than they did before. Incidentally Ruder et al. (2017)'s sluice networks are an important recent paper on this: https://arxiv.org/abs/1705.08142
Going forward I think it's important that we get past just using word2vec to pre-train vectors, and start making it easier to use pre-trained LSTM and CNN models. Side-objectives in multi-task learning are also very important.
I don't think the APIs in spaCy around these things are quite right yet. There are also lots of trade-offs in sharing weights that make things complicated for people. Sometimes weight-sharing gets in the way, because you want to just train this one part, and it's really weird that your updates are affecting other models you don't think you're touching.
For more idea of where Ines and I are going with all of this, you can read this: https://explosion.ai/blog/supervised-learning-data-collectio...
Basically I think the main problem people are having with NLP is that they don't want to commit to a problem and create training and evaluation data for it. Teams that don't bite the bullet and commit to their problem thrash around and don't get anything done. Even if you're using unsupervised techniques, you need repeatable evaluations.
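As a rough illustration of what a "repeatable evaluation" can mean in practice, here is a minimal sketch: a seeded train/dev split, so the held-out set is the same on every run, plus a precision/recall/F1 computation for one label. The function names and the 20% dev fraction are assumptions for the example, not part of any tool mentioned above.

```python
import random


def split_examples(examples, dev_fraction=0.2, seed=0):
    """Deterministically split examples into (train, dev) using a fixed seed."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    return shuffled[n_dev:], shuffled[:n_dev]


def precision_recall_f1(gold, predicted, positive_label):
    """Score predictions for one label against gold annotations."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == p == positive_label)
    fp = sum(1 for g, p in zip(gold, predicted)
             if p == positive_label and g != positive_label)
    fn = sum(1 for g, p in zip(gold, predicted)
             if g == positive_label and p != positive_label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# The same seed always yields the same dev set, so scores are comparable
# across experiments.
train, dev = split_examples(range(10), dev_fraction=0.2, seed=0)
```

Keeping the dev set fixed matters more than the particular metric: if the split changes between runs, you can't tell whether a score moved because of the model or because of the data.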
We're preparing to launch an evaluation tool to address this problem. You can subscribe to our mailing list, RSS or Twitter to get the announcement when the beta is ready: https://twitter.com/explosion_ai
I think it is widely accepted that CoreNLP outperforms spaCy in terms of accuracy for POS tagging and NER.
spaCy is very convenient and works well enough for most tasks. It can also load Gensim vectors.
StanfordNLP is pretty close to the state of the art for things like POS tagging (in English, anyway; I'm not familiar with benchmarks for other languages).
As mentioned in the other reply, spaCy, Gensim, fastText and VW are great for specific things.
> One way to decrease the risk of vanishing gradients is to clip their maximum value
But probably helpful for a general picture if you're new to this stuff.
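For readers new to the clipping idea in the quoted sentence, here is a minimal sketch of clipping gradients by their global norm. The function name and the flat list-of-floats representation are assumptions for illustration. Note that clipping only caps gradients that are too large, so it is usually described as a remedy for exploding gradients and does nothing to enlarge gradients that are already tiny.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale `grads` so their L2 norm is at most `max_norm`.

    Gradients already within the limit are returned unchanged.
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]


clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)  # norm 5 scaled to 1
small = clip_by_global_norm([0.3, 0.4], max_norm=1.0)    # left untouched
```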
> Disclaimer: Treating something as best practice is notoriously difficult: Best according to what? What if there are better alternatives? This post is based on my (necessarily incomplete) understanding and experience. In the following, I will only discuss practices that have been reported to be beneficial independently by at least two different groups. I will try to give at least two references for each best practice.