
Flair: A simple framework for natural language processing - kumaranvpl
https://github.com/zalandoresearch/flair
======
elyase
The closest alternatives in this space would be allennlp [1], the recently
released pytext [2], and spacy [3]. pytext's authors wrote a comparison in
the accompanying paper [4] and in this GitHub issue [5].

[1] [https://github.com/allenai/allennlp](https://github.com/allenai/allennlp)

[2]
[https://github.com/facebookresearch/pytext](https://github.com/facebookresearch/pytext)

[3] [https://spacy.io](https://spacy.io)

[4]
[https://arxiv.org/pdf/1812.08729.pdf](https://arxiv.org/pdf/1812.08729.pdf)

[5]
[https://github.com/facebookresearch/pytext/issues/110](https://github.com/facebookresearch/pytext/issues/110)

~~~
mkl
Do you know if any of these can be used for text prediction? (I.e. guessing
what the next word/token will be.)

~~~
yorwba
Text prediction is usually called "language modeling" in NLP. Because it's
useful as a weak supervision signal to improve performance on other tasks,
most of the mentioned libraries support it. However, they might not always
provide complete examples, instead assuming that you know how to express the
model and train it using the primitives provided by the library.
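To make the term concrete (this is not any of the mentioned libraries' APIs, just a toy illustration in plain Python): a bigram language model predicts the next token from counts of which word followed which in a training corpus.

```python
from collections import defaultdict, Counter

def train_bigram_model(corpus):
    """Count, for each word, which words follow it."""
    model = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for current, following in zip(tokens, tokens[1:]):
            model[current][following] += 1
    return model

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = [
    "i love berlin",
    "i love paris",
    "i love berlin in summer",
]
model = train_bigram_model(corpus)
print(predict_next(model, "love"))  # -> berlin (seen twice vs. once for paris)
```

The models linked below replace the counting with a recurrent network over characters or words, but the interface idea is the same: condition on the context so far and rank candidate next tokens.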

Flair:
[https://github.com/zalandoresearch/flair/blob/master/flair/m...](https://github.com/zalandoresearch/flair/blob/master/flair/models/language_model.py)

Allen NLP:
[https://github.com/allenai/allennlp/blob/master/allennlp/dat...](https://github.com/allenai/allennlp/blob/master/allennlp/data/dataset_readers/language_modeling.py)

PyText:
[https://github.com/facebookresearch/pytext/blob/master/pytex...](https://github.com/facebookresearch/pytext/blob/master/pytext/models/language_models/lmlstm.py)

spaCy seems to focus on language analysis and I couldn't find an API that'd be
directly usable for text generation.

~~~
plagtag
Flair looks really promising to me!

------
sweezyjeezy
I'd be wary of being dazzled by the performance metrics here. I've been
disappointed in the past using out-of-the-box language models on data they
weren't trained on, e.g. spaCy's. I feel like people put too much emphasis on
chasing the high score on benchmark datasets, and end up overfitting to those
particular domains.

For example, if you try named entity models trained on CoNLL (newspaper
articles) on free text (e.g. tweets, or text from application forms), you
generally get pretty bad results. When the domain is different, I've even seen
them screw up basic things like times and dates, where regexes will suffice.
If you're using it for newspaper articles you're sorted; if you're not, the
performance metrics here are probably not all that meaningful.
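To illustrate the regex point (a minimal sketch, not production code; the format list is an assumption): for narrow, well-defined entities like dates and times, a few patterns can be more robust than a domain-mismatched NER model.

```python
import re

# A deliberately small pattern set: ISO dates (2019-01-07),
# slash dates (07/01/2019), and 24-hour times (14:30).
DATE_TIME_RE = re.compile(
    r"\b(\d{4}-\d{2}-\d{2}"      # ISO date
    r"|\d{1,2}/\d{1,2}/\d{4}"    # slash date
    r"|\d{1,2}:\d{2})\b"         # HH:MM time
)

def extract_dates_times(text):
    return DATE_TIME_RE.findall(text)

tweet = "mtg moved to 14:30 on 2019-01-07!! dont be late"
print(extract_dates_times(tweet))  # -> ['14:30', '2019-01-07']
```

Note this works on the noisy tweet-style text above exactly because it ignores all the prose context a CoNLL-trained model depends on.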

~~~
sqrt17
Here's the thing: the authors, and most people in the NLP community (as
opposed to people who can use off-the-shelf tools and nothing more) know how
to get better performance on other domains. It's just that "it requires some
adjustments and manual labour" doesn't add anything to the "deep learning
solves the problem" narrative that is predominant in research articles and
most blog posts on the topic.

On the other hand, you can bet that actual practice at Zalando (the authors
are all from Zalando's research lab) involves more regexes and retraining
models on proprietary datasets, and less using off-the-shelf models and hoping
they stick.

No one claims, either, that you can solve every vision problem with a model
trained on ImageNet: you'd do transfer learning, or, for non-understanding
problems (estimating colors and contrast or anything else that's unrelated to
objects in the image), you'd use something else that doesn't involve deep
learning models at all.

~~~
sweezyjeezy
Right, but my point is - once you start needing to add ad-hoc retraining, or
regex hacks, it's not clear to me that shaving a point off baseline f1 scores
is really all that relevant anymore.

~~~
yorwba
You'd have to do the same modifications to the baseline models to adapt them
to a different domain. If they managed to shave off percentage points on a
large number of benchmarks, then it's likely that using their models will also
help you with the task you care about.

~~~
sweezyjeezy
Not convinced. Pretty much all baseline NER datasets are news corpora, which
are written in well-formatted prose and tend not to have spelling mistakes,
abbreviations, bad punctuation, etc. Why do you think better performance on
these kinds of datasets will translate to better performance in other domains?
I wouldn't even be surprised if it's the opposite: maybe the model relies more
heavily on these assumptions. The truth is there is no way to know a priori -
you need a different kind of benchmark to test this.

~~~
pdyck
I had the same experience when trying to do NER on customer support requests.
My model performed great on research datasets but was mediocre at best on my
own data. Do you have any suggestions on how to achieve better results in
domains where mistakes, bad punctuation, etc. are common?

~~~
edraferi
Label more training data.

Do more clustering.

Label more training data.

Strip out more garbage.

Label more training data.

PS you can get an idea of how much value additional training data will give
you by training models on various subsets of your dataset (e.g. 10%, 20%...),
evaluating them against the same test dataset, and plotting the results.
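The PS above can be sketched in plain Python with a stand-in "model" (the word-vote classifier below is a hypothetical toy, not a real NER model), just to show the mechanics: train on growing subsets, evaluate each on the same held-out test set, and look at the trend.

```python
def train(examples):
    """Stand-in for real training: remember which label each word voted for."""
    votes = {}
    for text, label in examples:
        for word in text.split():
            votes.setdefault(word, []).append(label)
    return votes

def predict(model, text):
    """Majority vote over the labels of known words; default to 'unk'."""
    labels = [l for w in text.split() for l in model.get(w, [])]
    return max(set(labels), key=labels.count) if labels else "unk"

def learning_curve(train_data, test_data, fractions):
    """Accuracy on a fixed test set as the training subset grows."""
    points = []
    for frac in fractions:
        subset = train_data[: max(1, int(len(train_data) * frac))]
        model = train(subset)
        correct = sum(predict(model, text) == label for text, label in test_data)
        points.append((frac, correct / len(test_data)))
    return points

train_data = [
    ("berlin is great", "pos"),
    ("paris is awful", "neg"),
    ("i love berlin", "pos"),
    ("i hate paris", "neg"),
]
test_data = [("berlin is lovely", "pos"), ("paris is bad", "neg")]

points = learning_curve(train_data, test_data, [0.25, 0.5, 1.0])
print(points)  # -> [(0.25, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

If the curve is still climbing at 100% of your data, more labeling will likely keep paying off; if it has flattened, the bottleneck is probably the model or the features, not the data volume.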

------
roman030
Is research in commercial companies becoming a driving factor in today's
scientific achievements?

~~~
cleansy
That's the free market at work, I would guess. Zalando seems to see NLP
research as a valuable asset, so they pay for projects like this, plus they
open source it, presumably because of employer branding. I don't see anything
wrong with that.

~~~
pploug
Zalando Research is an internal team of researchers; their work is primarily
shared with the research community through publications:
[https://research.zalando.com/welcome/mission/publications/](https://research.zalando.com/welcome/mission/publications/)

In the case of Flair, research led to a reference implementation, which was
then matured through internal use and open sourced to further mature it and
get external feedback. While employer branding is a nice benefit, it is a
positive side-effect, not the motivation in itself :)

------
duncanawoods
Can someone explain what the expected behaviour is with punctuation and Named
Entity Recognition? Is there an assumption that punctuation is preprocessed in
some form?

I'm a noob but it's not what I expect - periods change what is extracted in
inconsistent ways.

e.g.

"I love Berlin." -> "Berlin."

"I love Berlin ." -> "Berlin"

"George Washington loves Berlin." -> "George Washington"

"George Washington loves Berlin ." -> ["George Washington", "Berlin"]

~~~
slx26
If you go to their first tutorial, "Tutorial 1: Basics", you will see this
comment in the code: "# Make a sentence object by passing a whitespace
tokenized string"

In the simple examples you posted, the tokenization was already done manually,
as it's pretty trivial. But yes, in many cases, you have preprocessors that do
the tokenization. In some libraries, you actually have a class/object type for
tokens, but it's pretty common to just preprocess and treat every space as a
token separator.

In some contexts and cases, it's possible to see tokens like "social_network",
where multiple words are considered a single token.

In that first tutorial, they also mention they have a tokenizer if you need
it: "In some use cases, you might not have your text already tokenized. For
this case, we added a simple tokenizer using the lightweight segtok library."

So for your example you would simply run the tokenizer first, then the named
entity recognition.

EDIT: apparently you can do this directly: "sentence = Sentence('The grass is
green.', use_tokenizer=True)"
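A toy contrast between whitespace tokenization and a punctuation-aware tokenizer (a stand-in sketch, not Flair's or segtok's actual code) shows why "Berlin." and "Berlin ." behave differently:

```python
import re

def whitespace_tokenize(text):
    # What Sentence(...) assumes by default: tokens are separated by spaces.
    return text.split()

def simple_tokenize(text):
    # Split off punctuation as separate tokens, like a real tokenizer would.
    return re.findall(r"\w+|[^\w\s]", text)

print(whitespace_tokenize("I love Berlin."))  # -> ['I', 'love', 'Berlin.']
print(simple_tokenize("I love Berlin."))      # -> ['I', 'love', 'Berlin', '.']
```

Under whitespace tokenization the tagger sees "Berlin." as a single token it has rarely (or never) observed, which is why the extracted entity sometimes includes the period.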

~~~
duncanawoods
Thanks. I assumed it was French-style punctuation with a space before the
period, rather than space tokenization!

~~~
yorwba
French doesn't put spaces before single-part punctuation like periods or
commas.

------
dajonker
I like the way Zalando is doing tech. They release (and maintain!) tons of
open source stuff in several domains.

------
EmilStenstrom
Flair 0.4 was released just 14 days ago, and contains LOTS of improvements
for a point release:
[https://github.com/zalandoresearch/flair/releases](https://github.com/zalandoresearch/flair/releases)

------
tgrzinic
What are the main advantages of Flair over Spacy?

Is it easy to add a new language in Flair? In Spacy adding a language looks
pretty straightforward.

------
Nimitz14
Nice! I was using the Stanford POS tagger, which is both bad in quality and
sooo slow in execution. Looking forward to trying this out.

------
vstik
How does it compare to managed NLP services like Google's Cloud Natural
Language API and AWS's Comprehend?

~~~
sqrt17
Google and Amazon have proprietary datasets for important sub-tasks (e.g.
recognizing "consumer good" entities, or more accurate sentiment recognition,
or supporting other languages better) that are not available to the public.

In other words, if your problem looks like one of the benchmarking tasks in
NLP research (e.g. recognizing persons and locations in fluent text) you can
expect good performance out of open source tools. If you go beyond that, you
have to concoct your own dataset and/or use proprietary cloud services.

~~~
dhairya
Would you mind linking to the ones you are aware of? It would be super helpful.

------
daolf
Very elegant API, looks promising. Has anyone tried it? How does it stand
against NLTK, spaCy, or gensim?

~~~
sqrt17
It solves different problems. Flair doesn't include parsing, spaCy doesn't
support embeddings, and gensim doesn't do tagging or parsing at all but
contains the most practical word2vec implementation. NLTK is nice for
learning, but don't use it in production unless you're ready to reimplement
things in a more efficient way when parts start falling off.

The message is - again - learn the mental framework, not individual tools, to
understand where each tool's strengths are and what the gaps are in between
them. Or choose a problem, find the best tool for that problem, and get
progressively better at the tool(s) that help you with most of your problems.

