
PyText: A natural language modeling framework based on PyTorch - myinnerbanjo
https://github.com/facebookresearch/PyText
======
syntaxing
Wow, I just used the AllenNLP mentioned here and it is quite amazing! I took a
random article from Google News, which happened to be about Flynn's FBI
criticism. I asked a couple of questions like "who is going to jail" or "who is
leading the investigation" and it worked flawlessly. The article is only
around 15 sentences too!

Edit: Super wow, the documentation is amazing as well
([https://allennlp.org/tutorials](https://allennlp.org/tutorials)).

~~~
thanatropism
A few years ago I read the novel "Galatea 2.2" by Richard Powers which is all
about training a neural network to do just that and thought "now this is some
bullshit".

------
mendeza
I love textacy, it has so much out of the box: topic modeling, topic
extraction, summarization. And it's built on top of spaCy.
[https://github.com/chartbeat-labs/textacy](https://github.com/chartbeat-labs/textacy)

------
laughingman2
If anyone from the dev team is here, can you look into integrating the "make
the research into production" part into AllenNLP? Facebook currently has
fairseq, this, and other NLP repos. AllenNLP makes it easier to model most
classes of NLP problems, with a clean dependency-injectable interface and the
most common tasks abstracted out cleanly.

~~~
smhx
PyTorch dev here. We'll talk with the AllenNLP folks to see if we can make
this happen.

We just released PyTorch 1.0 stable last Friday that adds stable production
capabilities, so the stuff is just out of the oven.
[https://github.com/pytorch/pytorch/releases/tag/v1.0.0](https://github.com/pytorch/pytorch/releases/tag/v1.0.0)

~~~
joelgrus
AllenNLP dev here. We're going to do a "PyTorch 1.0" release of AllenNLP next
week, and then after that we're planning to investigate how to incorporate the
new "production" aspects.

~~~
sh33mp
Could you guys elaborate on the relationship between PyText, torchtext, and
AllenNLP? I've briefly used the latter two, but with how quickly things are
moving it'd be nice to have a quick answer from the devs themselves.

~~~
ahhegazy77
PyText dev here. Torchtext provides a set of data abstractions that help with
reading and processing raw text data into PyTorch tensors. At the moment we
use Torchtext in PyText for training-time data reading and preprocessing.

AllenNLP is a great NLP modeling library that is aimed at providing reference
implementations and prebuilt state-of-the-art models, and at making it easy to
iterate on and research with models for different NLP tasks.

We've built PyText to be a rich NLP modeling library (along the lines of
AllenNLP) but with production capabilities baked into the design from day 1.

Examples are:

- We provide interfaces to make sure data preprocessing can be consistent
  between training and runtime.
- The model interfaces are compatible with ONNX and torch.jit.
- A core goal for us in the next few months is to be able to run models
  trained in PyText on mobile.

Other differences include support for distributed training and multi-task
learning.

That being said, so far our library of models has been mostly influenced by
our current production use cases. We are actively working on enriching this
library with more models and tasks while keeping production capabilities and
inference speed in mind.
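
To illustrate the first example above (keeping data preprocessing consistent
between training and runtime), here is a minimal standard-library sketch. It
is not PyText's API: the `TextPreprocessor` class and all of its methods are
hypothetical names made up for this illustration. The idea shown is only that
the tokenizer and vocabulary built at training time are serialized and reloaded
at serving time, so both sides map the same text to the same ids.

```python
import json
import re


class TextPreprocessor:
    """Hypothetical helper: one tokenizer and vocabulary shared by training and serving."""

    UNK = "<unk>"

    def __init__(self, vocab=None):
        # Index 0 is reserved for tokens never seen during training.
        self.vocab = dict(vocab) if vocab else {self.UNK: 0}

    @staticmethod
    def tokenize(text):
        # Lowercase, then keep runs of word characters.
        return re.findall(r"\w+", text.lower())

    def fit(self, texts):
        # Assign the next free id to every previously unseen token.
        for text in texts:
            for tok in self.tokenize(text):
                self.vocab.setdefault(tok, len(self.vocab))
        return self

    def encode(self, text):
        unk = self.vocab[self.UNK]
        return [self.vocab.get(tok, unk) for tok in self.tokenize(text)]

    def to_json(self):
        # Ship this string alongside the exported model.
        return json.dumps(self.vocab)

    @classmethod
    def from_json(cls, payload):
        return cls(json.loads(payload))


# Training side: build the vocabulary from the training text.
train_prep = TextPreprocessor().fit(["Who is leading the investigation?"])

# Serving side: rebuild the exact same preprocessor from the saved artifact.
serve_prep = TextPreprocessor.from_json(train_prep.to_json())
```

Because the serving side never re-derives the vocabulary on its own, a query
encoded at runtime gets the same ids the model saw during training, and unseen
words fall back to the shared `<unk>` id.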

------
wodenokoto
What is a good data structure for holding your parsed corpus? Ideally I'd like
to be able to count the number of sentences and paragraphs, get average word
counts for these, and easily do queries such as "nouns that fit this regex" or
"POS that precedes a named entity".

I've been looking at spaCy, but as far as I can tell it is hard-coded to use
universal parts of speech.
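
One possible shape for such a structure, as a rough standard-library sketch
rather than a recommendation of any particular library: nested dataclasses for
paragraph, sentence, and token, with the tag set left as free-form strings. All
class and method names here (`Token`, `Sentence`, `Paragraph`, `Corpus`,
`tokens_matching`, `pos_before_entities`) are hypothetical, and the tags in the
example are typed in by hand instead of coming from a real parser.

```python
import re
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Token:
    text: str
    pos: str       # any tag set you like; not tied to universal POS
    ent: str = ""  # named-entity label, empty string if none


@dataclass
class Sentence:
    tokens: list = field(default_factory=list)


@dataclass
class Paragraph:
    sentences: list = field(default_factory=list)


@dataclass
class Corpus:
    paragraphs: list = field(default_factory=list)

    def sentences(self):
        return [s for p in self.paragraphs for s in p.sentences]

    def avg_words_per_sentence(self):
        return mean(len(s.tokens) for s in self.sentences())

    def tokens_matching(self, pos, pattern):
        # e.g. "nouns that fit this regex"
        rx = re.compile(pattern)
        return [t.text for s in self.sentences() for t in s.tokens
                if t.pos == pos and rx.search(t.text)]

    def pos_before_entities(self):
        # POS of each token that directly precedes the start of an entity.
        found = []
        for s in self.sentences():
            for prev, cur in zip(s.tokens, s.tokens[1:]):
                if cur.ent and not prev.ent:
                    found.append(prev.pos)
        return found


# Hand-tagged toy data standing in for real parser output.
corpus = Corpus([Paragraph([
    Sentence([Token("Flynn", "NNP", "PERSON"), Token("criticized", "VBD"),
              Token("the", "DT"), Token("FBI", "NNP", "ORG")]),
    Sentence([Token("The", "DT"), Token("investigation", "NN"),
              Token("continues", "VBZ")]),
])])

proper_nouns = corpus.tokens_matching("NNP", r"^F")
tags_before_entities = corpus.pos_before_entities()
```

For a large corpus, a flat table with one row per token (plus sentence and
paragraph ids) would probably scale better than nested objects, but the queries
would look much the same.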

------
xfitm3
What's NLP in this context?

~~~
jimsmart
Natural language processing

~~~
xfitm3
Thanks. I had assumed Neuro Linguistic Programming.

~~~
jimsmart
Agreed, it is potentially ambiguous if one doesn’t follow the field.

Clicking the link takes you to the GitHub repo, which states ‘natural language
processing’ in its title (though perhaps it didn’t earlier).

The title of this HN post has been edited now anyhow.

------
wiradikusuma
Is this like bare Dialogflow?

