Ask HN: Is Python's NLTK library still used for word tokenization? - pandeykartikey
======
mlthoughts2018
Yes, I work on a team that uses NLTK for lots of word canonicalization tasks
in an NLP-heavy search engine. There are other options that work well too, but
we have found NLTK to work very well, even at large scale.

Our pipeline uses NLTK to take in a string of text, do word tokenization,
lemmatization and stemming, and construct bigrams and trigrams, as part of a
large map-reduce job for building text search indices.
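
For anyone wondering what that kind of per-document step might look like, here is a rough sketch, not our actual code. The function names (`canonicalize`, `index_terms`), the choice of `TreebankWordTokenizer` and `PorterStemmer`, the lemmatize-then-stem order, and the punctuation filter are all assumptions made for illustration. The lemmatizer needs the WordNet corpus (`nltk.download("wordnet")`); the sketch falls back to the raw token if it is not installed.

```python
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import TreebankWordTokenizer
from nltk.util import ngrams

# Assumed component choices; NLTK offers other tokenizers and stemmers.
_tokenizer = TreebankWordTokenizer()   # rule-based, needs no downloaded data
_stemmer = PorterStemmer()
_lemmatizer = WordNetLemmatizer()      # needs the "wordnet" corpus at call time


def _lemmatize(token):
    """Lemmatize a token, or return it unchanged if WordNet is unavailable."""
    try:
        return _lemmatizer.lemmatize(token)
    except LookupError:
        return token


def canonicalize(text):
    """Tokenize, lowercase, drop non-alphanumeric tokens, lemmatize, then stem."""
    tokens = [t.lower() for t in _tokenizer.tokenize(text) if t.isalnum()]
    return [_stemmer.stem(_lemmatize(t)) for t in tokens]


def index_terms(text):
    """Hypothetical map step: emit unigrams, bigrams and trigrams for one string."""
    terms = canonicalize(text)
    return {
        "unigrams": terms,
        "bigrams": list(ngrams(terms, 2)),
        "trigrams": list(ngrams(terms, 3)),
    }


if __name__ == "__main__":
    print(index_terms("Search engines index documents"))
```

In a map-reduce setting, something like `index_terms` would run in the map phase on each document, with the reduce phase grouping the emitted terms into index entries.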

