
A computational linguistic farce in three acts - lx
http://www.earningmyturns.org/2017/06/a-computational-linguistic-farce-in.html
======
YeGoblynQueenne
So, I understand this blog post is about something else completely (the
internet argument started by Yoav Goldberg on Medium, reportedly) but for me
the really interesting part is the historical information in it. I wish
Fernando Pereira could find the time to expound a bit on all those
parenthetical notes in his blog post, perhaps even write a short book on the
history of AI.

AI is kind of a strange beast like that: it's gone through a few very
different phases and it's difficult for one person to understand all of them
equally well. Which of course makes it even harder to avoid reinventing wheels
and repeating mistakes. A bit of history would do us all a world of good.

Btw, I'm getting the feeling most people here will probably be hearing of
Fernando Pereira for the first time, but he has a very long career in AI and NLP. He was
a prominent symbolicist, with some important contributions to logic
programming (he was one of the co-founders of Quintus, the company that sold
the first commercial Prolog, along with Warren, Byrd and others). Then he
turned to statistical AI and now he's a VP at Google (a.k.a. the den of the
connectionists, if I may be so bold). He's probably one of the few computer
scientists around who understands both symbolic and statistical AI in equal
measures. If anyone is qualified to talk about their relative merits, that's
him.

(and if I sound like a bit of a fangirl- that is because I basically am.
Pereira is one of my logic programming heroes and a great teacher to me,
albeit unbeknownst to him :)

------
paulsutter
This relates to the big Twitter uproar over this blog post:

[https://medium.com/@yoav.goldberg/an-adversarial-review-of-a...](https://medium.com/@yoav.goldberg/an-adversarial-review-of-adversarial-generation-of-natural-language-409ac3378bd7)

And here's the meat of his response:

> Idea! Let's go back to toy problems where we can create the test conditions
> easily, like the rationalists did back then (even if we don't realize we are
> imitating them). After all, Atari is not real life, but it still
> demonstrates remarkable RL progress. Let's make the Ataris of natural
> language!

> But now the rationalists converted to empiricism (with the extra enthusiasm
> of the convert) complain bitterly. Not fair, Atari is not real life!

> Of course it is not. But neither is PTB, nor any of the standard empiricist
> tasks, which try strenuously to imitate wild language

~~~
YeGoblynQueenne
My reading is that Pereira doesn't think that deep learning has quite
conquered language, and in this he's in complete disagreement with both
Goldberg and LeCun (who both champion deep learning for NLP and claim that it
has led to great advances in the field).

For me the problem with NLP and deep learning, or indeed any empirical
method, is that the evaluation metrics we have are imperfect. Take BLEU
scores, from Goldberg's post, for instance. Those basically compare generated
text to some arbitrary target. Originally, they were proposed as metrics of
machine translation quality, so the target was some existing translation and
the machine-generated translation was examined for coverage of this human-made
translation. But of course, there is no principled way that we know of to
choose one translation over another- or even say whether a translation is a
good or bad translation, on its own. And that's true for translations by
humans also. You give the same text to 10 professional translators, they'll
give you 10 different translations. Then you give each of their translations
to 10 readers and ask them for their opinion, and you get back 100 different
opinions.

The translation task itself is not even particularly well defined, exactly
because there may be any number of valid translations (possibly, infinitely
many) of a piece of text in another language. So, with translation, we have an
ill-defined task with an arbitrary metric. And that metric of course is lifted
from its original task and used to evaluate language generation and so on.
Then someone comes along who knows how to train a deep net but has no idea
what the purpose of their chosen metric is, or what it does and has no
understanding of the task itself- and claims to have solved it because they
got good results on that metric.
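
To make the arbitrariness concrete, here is a toy sketch of the clipped n-gram precision at the heart of BLEU (my own minimal illustration, not a faithful reimplementation: real BLEU combines several n-gram orders and adds a brevity penalty). A perfectly reasonable paraphrase of the reference scores poorly simply because it shares few bigrams with that one arbitrary target:

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision: fraction of candidate n-grams that also
    appear in the reference (counts clipped to the reference counts)."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(c, ref[g]) for g, c in cand.items())
    return matched / total

reference = "the cat sat on the mat".split()
close = "the cat sat on a mat".split()                # near-copy of the reference
paraphrase = "a cat was sitting on the mat".split()   # also a fine "translation"

print(ngram_precision(close, reference, 2))       # 0.6  -- rewarded
print(ngram_precision(paraphrase, reference, 2))  # ~0.33 -- penalised
```

The paraphrase is no worse as a translation; it just wasn't the reference the metric happened to be handed.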

It's a bit of a methodological mess that's not going to lead to much progress.
People can keep piling on these "results" for as long as they like and pretend
that they're "solving" this or that problem- but in real-world terms, nothing
is really being solved at all.

~~~
paulsutter
Google translate is now based on a neural network and you can be sure they
have solid metrics. By analogy Google search has a large panel of humans whose
subjective feedback is used to test the quality of search algorithm
variations.

~~~
YeGoblynQueenne
This is something that needs to be repeated until everyone internalises it:
for language pairs other than the "easy" ones Google translate sucks.

I am Greek and translations from and to my language are utterly ridiculous, on
the level of Bozo the clown doing the translation with his underpants on his
head back to front.

Typical example: I put in the Greek word for "swallow", the bird, and ask for
the French translation. I get back the word "avaler", the French word for "to
swallow", the verb.

That's my little benchmark there, useful because Google translate has been
doing this consistently, for a good few years, before it used neural networks,
before it started claiming its setup essentially constitutes an "interlingua"
etc etc.

Note that the bird and the verb sound nothing like each other in Greek, or
French. They sound the same only in English, so GT goes from Greek to French
through English. Because it doesn't have enough parallel texts to go directly
to French. And so it sucks, because it doesn't have enough data. You can ask
native users of other languages that are not English, or that have fewish
speakers, perhaps Turkish or Hungarian etc. I'm pretty sure you'll find out
they have similar experiences.
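
The failure mode is easy to sketch with a toy pivot dictionary (the word lists below are mine, purely for illustration; real systems learn from parallel corpora, not word lists). Greek distinguishes the bird (χελιδόνι) from the verb (καταπίνω), and French distinguishes hirondelle from avaler, but the English pivot collapses both onto the single word "swallow", so the sense distinction is gone before the French side ever sees it:

```python
# Hypothetical toy dictionaries, just to illustrate the pivot problem.
greek_to_english = {
    "χελιδόνι": "swallow",  # the bird
    "καταπίνω": "swallow",  # the verb -- the same English word
}
english_to_french = {
    "swallow": "avaler",    # the English->French entry can keep only one sense
}

def pivot_translate(greek_word):
    """Greek -> English -> French; the sense distinction dies in the middle."""
    return english_to_french[greek_to_english[greek_word]]

print(pivot_translate("καταπίνω"))  # "avaler" -- correct, by luck
print(pivot_translate("χελιδόνι"))  # "avaler" -- wrong, should be "hirondelle"
```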

So I don't know what metric they use to evaluate their results, it doesn't
seem to be a particularly good metric of translation quality. Maybe they just
care more about how many people use their system and try to optimise for that,
rather than going for the much harder to know quality.

~~~
ajuc
I'm Polish. I use Google Translate even from Slavic languages that are very
close to Polish (Ukrainian, Slovak: it's like 50% understandable without
translation) into English, not into Polish, because X -> Polish Google
translation sucks.

------
JohnStrange
I kind of disagree with some of the premises in the article. I've seen an HPSG
for German in the late 90s that was able to parse almost any sentence I could
throw at it correctly from a syntactic perspective.

The main problem for natural language _understanding_ is not parsing and not
even the semantic and pragmatic representations per se, it has always been the
understanding. This requires an adequate knowledge representation and the
drawing of inferences from it, and I don't believe that any substantial
advances have been made in that field. Computational ontologies have grown
larger and there are more "frameworks" than you can count, but none of them
offers much that is new, and promising approaches like geometric meaning theories are
in their infancy. Knowledge representation and, generally speaking, the
problem of how to integrate different information sources in useful ways are
essentially unsolved problems.

Just my 2 cents. Note that I'm talking about the principal problems, not about
specific practical applications for which you can use the statistical
sledgehammer to some extent.

~~~
jesuslop
Coecke has recently commented on Gärdenfors's geometric meaning in the context
of his categorical semantics, which I'm finding interesting: arXiv:1608.01402.
What I would welcome is a computational link relating that semantics to the
oldie semantic-network-based ideas. For instance, in arXiv:1706.00526,
description-logic-based knowledge representation is cast in string-diagrammatic,
categorical terms, and that at least puts the meaning realm on the same mathy
footing.

------
throwawuyar3231
I have to wonder if English is really the best language for NLP research.
Things like the Winograd schemas, which have attracted a lot of attention,
simply aren't possibilities in other languages.

Why not start working with more structured agglutinative* languages like
Japanese/Korean and the Indic family (Sanskrit esp.)?

How about other European languages? Are they better structured, empirically? I
hear German is very grammatical, and that Hungarian is... erm, odd?

(* Note: I know the occidental tradition likes to split the Indic tongues, and
Indo in Indo-European is not considered agglutinative. I don't subscribe to
this view. I use "agglutinative" in the sense of Panini: "particles" sticking
to stems/roots/words; phonetic modifications are irrelevant for grammar.)

~~~
lgessler
> I hear German is very grammatical, and that Hungarian is... erm, odd?

Just want to point out that "grammatical" probably isn't the word you want
here. Every language is grammatical by definition in the sense that there are
rules that govern its sound system, word formation system, syntax, etc.

The concept you're getting at, though--that some languages are easier for
computer programs and/or speakers of Indo-European languages to understand--is
sound.

~~~
mark_edward
Do you think "analytic" would be a good term here? I heard Mandarin is a very
analytic language; maybe that could be a good choice.

~~~
WorldMaker
"Regular" would be the classic linguistics term, would it not? Although
computer science limits the term to the use of regular languages in the
Chomsky hierarchy sense (that is, more specifically to regular expressions and
the languages they describe), I am under the impression linguistics as a whole
treats regularity as a multivariate spectrum. Some languages have more
regularity in terms of grammar productions or morphology than English.

~~~
mark_edward
I meant analytic in this sense of the word:
[https://en.wikipedia.org/wiki/Analytic_language](https://en.wikipedia.org/wiki/Analytic_language)

I don't know too much about computational linguistics, but it seems highly
analytic languages could be easier to work with; I'm not sure, though.

~~~
WorldMaker
That points to Isolating [1] and I think highly isolating may be the more
useful distinction to this specific example. (Modern English is rather
analytic, having dropped most, but not all, inflections in the Middle English
era. Mandarin Chinese is much more isolating than Modern English.)

[1]
[https://en.wikipedia.org/wiki/Isolating_language](https://en.wikipedia.org/wiki/Isolating_language)

------
throwaway-1209
The whole field of NLP and computational linguistics reminds me of that joke
where a drunk is looking for his keys under a street lamp instead of where he
actually lost them.

This is true in particular of anything that pertains to reasoning and
knowledge representation. People still are trying to "infer rules" and do
logical, rather than probabilistic reasoning. I get why that is. To me though,
the kind of real life reasoning that humans do seems heavily probabilistic and
contextual, Bayesian almost. And there's next to no notable work going on in
that direction.

~~~
johnbender
I don't think these two things are mutually exclusive.

As far as I'm aware there is work underway to take logical constructions and
integrate them with probabilistic machine learning to do things like forcing
zero probabilities in impossible input cases. That is, encoding domain
knowledge into the model directly in the form of symbolic reasoning.
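
A minimal sketch of that idea (my own toy illustration, not tied to any specific framework or library): take an unnormalised distribution over possible worlds, zero out every world that violates a propositional constraint, and renormalise over the survivors. The constraint "rain implies wet" is a made-up example:

```python
from itertools import product

def constrained_distribution(weights, constraint):
    """Drop worlds violating the constraint, then renormalise the rest."""
    kept = {w: p for w, p in weights.items() if constraint(w)}
    total = sum(kept.values())
    return {w: p / total for w, p in kept.items()}

# Worlds are (rain, wet) pairs; start with uniform positive weights.
weights = {world: 1.0 for world in product([0, 1], repeat=2)}

# Constraint: it cannot rain without the ground being wet (rain -> wet).
dist = constrained_distribution(weights, lambda w: not (w[0] and not w[1]))

print(dist)  # the impossible world (1, 0) is gone; the other three sum to 1
```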

I mean, even Bayesian nets require some encoding of causality, right? Maybe
I'm reading too much "blah symbolic reasoning is worthless" into your comment?

~~~
throwaway-1209
It's not worthless, per se, it's just not a precursor to AGI in any shape or
form, no matter how much the researchers pretend otherwise.

~~~
johnbender
Worth reading maybe?

[http://reasoning.cs.ucla.edu/fetch.php?id=136&type=pdf](http://reasoning.cs.ucla.edu/fetch.php?id=136&type=pdf)

Abstract:

> We propose the Probabilistic Sentential Decision Diagram (PSDD): A complete
> and canonical representation of probability distributions defined over the
> models of a given propositional theory. Each parameter of a PSDD can be
> viewed as the (conditional) probability of making a decision in a
> corresponding Sentential Decision Diagram (SDD). The SDD itself is a
> recently proposed complete and canonical representation of propositional
> theories. We explore a number of interesting properties of PSDDs, including
> the independencies that underlie them. We show that the PSDD is a tractable
> representation. We further show how the parameters of a PSDD can be
> efficiently estimated, in closed form, from complete data. We empirically
> evaluate the quality of PSDDs learned from data, when we have knowledge, a
> priori, of the domain logical constraints.

Still working on my understanding but Professor Darwiche gave a lecture on the
material in one of my classes. Salient bit:

> The problem we tackle here is that of developing a representation of
> probability distributions in the presence of massive, logical constraints.
> That is, given a propositional logic theory which represents domain
> constraints, our goal is to develop a representation that induces a unique
> probability distribution over the models of the given theory.

------
danidiaz
When he talks about the "computational models of language" that ruled in the
80s, is he referring perhaps to stuff like Montague semantics?
[https://plato.stanford.edu/entries/montague-semantics/](https://plato.stanford.edu/entries/montague-semantics/)
Or is Montague semantics merely a descriptive framework without practical
applications?

What were the main "practical" approaches for natural language understanding
back then?

------
wodenokoto
What is the Atari referred to here?

Does it have something to do with the game-playing AI from OpenAI? And if so,
how is that even related to NLP?

