
Norvig vs. Chomsky and the Fight for the Future of AI - fogus
http://www.tor.com/blogs/2011/06/norvig-vs-chomsky-and-the-fight-for-the-future-of-ai
======
knowtheory
It's a little bit frustrating to read a rehash of an argument that was cutting
edge _maybe_ back in the late 90s, especially one that is so poorly written,
and framed as a battle between two intellectuals.

Chomsky's past his heyday. He has been seminal in his field, but he's no
longer doing research which pushes at the boundaries of our understanding of
language, how to model it, or what the fundamental nature of language
understanding systems is. (As one might infer, I come from a non-Chomskyan
school of linguistics.)

Given that we have actual data and research about large-scale systems that do
interesting things (including the massive artificial neural network that
Google built last month, see:
<http://www.wired.com/wiredscience/2012/06/google-x-neural-network/> ),
reporting as substance-free and obfuscating as this is a real frustration,
when we could be talking about more interesting things, such as what a solid
operational definition of meaning is, how exactly heuristic/rule-based
systems actually differ from statistical mechanisms, and whether or not all
heuristic systems can (or should) be modeled with statistical systems.

The framing of this article is particularly galling because there are so many
non-Chomskyan linguists out in the world who operate fruitfully in the
statistical domain. Propping Chomsky up as somehow representative of all
linguists is pretty specious and a bit irritating.

~~~
mcguire
Non-Chomskian linguists? I was under the impression that, post-Chomsky,
linguistics was defined as Chomskian; everyone else had left the field for
whatever related discipline most closely matched what they wanted to do.

~~~
knowtheory
I mean... there are core things that everyone in linguistics agrees on
(language is structured, learned, and there is a defined syllabary of sounds
that humans make and use in language, etc.), but Chomsky's grand unified
theory of language, and his ideas about how those mechanics function, are not
universally agreed upon.

Even within syntax, which is really the sort of core of what Chomsky has been
interested in, there are other formalisms to represent syntax which differ
from Chomsky's theoretical framework. You can read more about Chomskian
transformational grammar on Wikipedia, with links off to other sorts of
formalisms as well: <http://en.wikipedia.org/wiki/Transformational_grammar>

~~~
mcguire
" _a defined syllabary of sounds_ "

Actually, that was one of the examples I had in mind. I was under the
impression that the people interested in languages and sound had left
linguistics in favor of phonetics, and as a result there was little interest
in the interaction between the two.

My difficulty with the Chomskian method, as opposed to my ignorance about
linguistics, is actually based on the "formalisms to represent syntax", since
the application of formalisms to natural language seems to me to be
problematic. To quote the Wikipedia page you mentioned,

"Chomsky noted the obvious fact that people, when speaking in the real world,
often make linguistic errors (e.g., starting a sentence and then abandoning it
midway through). He argued that these errors in linguistic performance were
irrelevant to the study of linguistic competence (the knowledge that allows
people to construct and understand grammatical sentences). Consequently, the
linguist can study an idealised version of language, greatly simplifying
linguistic analysis...."

At the time, my impression from more neurological reading was that the errors
were rather more interesting (<http://en.wikipedia.org/wiki/Aphasia>).

~~~
knowtheory
Phonetics is most assuredly part of linguistics, unless you redefine
linguistics just to mean "syntax" (no linguistics department I'm aware of
makes such a distinction).

And yep, there's a whole sub-discipline of psycholinguistics which definitely
learns from things like speech pathologies.

------
phaedrus
I spent about ten years working on Markov-based chat programs. I gave up on
them when I realized that no matter how sophisticated your statistical model
is, it will never be more than a statistical analysis of text, unless it
includes some rich rule-based model of mental processes and mental objects. It
may be that such a model of mental processes must itself be fuzzy and
probabilistic, but it must exist. Therefore I come down firmly on the side of
Chomsky in this debate: we should pursue theories of intelligence, and
statistical models without any theory do not advance our scientific
understanding of AI, however practical their application may be at the present
time. This is not to say statistical methods do not work; of course they work.
What I am saying is that it is not a path that leads to true understanding of
intelligence, any more than spectral analysis of the EMF emissions of a
running computer would lead to a theory of computation.
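
For concreteness, the kind of model I mean looks roughly like this - a
minimal, made-up sketch of a word-level Markov generator, not my actual chat
code:

    # A word-level bigram Markov model: count which words follow which,
    # then generate by sampling from those counts. That is all it "knows".
    import random
    from collections import defaultdict

    def train_bigrams(text):
        counts = defaultdict(lambda: defaultdict(int))
        words = text.split()
        for current, nxt in zip(words, words[1:]):
            counts[current][nxt] += 1
        return counts

    def generate(counts, start, length=10):
        word, out = start, [start]
        for _ in range(length):
            followers = counts.get(word)
            if not followers:
                break
            choices, weights = zip(*followers.items())
            word = random.choices(choices, weights=weights)[0]
            out.append(word)
        return " ".join(out)

    model = train_bigrams("the cat sat on the mat and the dog sat on the cat")
    print(generate(model, "the"))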

~~~
knowtheory
Just because you chose Markov chains as your modeling mechanism doesn't mean
that there is _no_ statistical modeling method that is capable of developing
something passing for what we'd call "meaning".

This is the same argument that was used against artificial neural networks.
Neural network of type A can't do X, therefore neural networks will never do
Y.

Language is immensely complex, and real human language involves things which
are not encoded in text (and I'd remind you that you were trying to infer
meaning from _text_ specifically, not the full multi-channel robustness of
humans communicating). We don't even have a full handle on what all of the
cognitive processes and factors _are_ that go into the production and
understanding of language (although we've developed a lot of interesting work
to those ends).

So hearing folks give up and claim that Chomsky is correct because our current
tools aren't up to the job is a bit puzzling, because we don't even have a
complete understanding of what sort of thing language _is_ or what sorts of
things _we_ are as systems which can use language.

Chomsky has _opinions_ (and some facts) about what language is and what _we_
are, but he does not have solid proof to confirm his specific conjectures. Is
human language context-free? Context-sensitive? Something else? (Chomsky's
minimalist program uses movement along a tree to preserve referentiality and a
bunch of junk; alternative syntactic frameworks such as HPSG use directed
graphs as the basis of their language modeling. Still others do weirder things
like higher-order combinatoric logics. And unfortunately none of the
theoretical frameworks appear to be without their drawbacks.)

~~~
neilk
I am not a specialist, but as far as I know, Chomsky's argument here was that
the existence of recursion showed that a Markov approach had to be wrong.
Surely a similar argument can be made for statistical approaches? There is no
way to represent a reference to some other part of the statement in a purely
statistical method. If they work they happen to work basically by accident.

Just blue-skying here, but it seems to me that if I knew enough about how a
statistical program worked, I could craft a sentence that would utterly
confuse it, even though it was perfectly intelligible to a normal English
speaker. A putative strong-AI program could not be fooled in this way.

~~~
knowtheory
Except that his argument is somewhat moot as a practical matter, because there
are no infinitely recursive sentences (given that all sentences are finite).

Long-distance dependencies are an issue in language modeling that do need to
be accounted for, but all that tells me is that Markov chains aren't the right
structure to model language (unless, maybe, you had a MASSIVE amount of data
and a Markov chain of an order high enough to account for the majority of
sentences. Maybe).
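
To make the long-distance dependency point concrete, here is a toy
illustration (the sentence and the order-2 window are made up for the
example):

    # With an order-2 (trigram) Markov model, the choice between "is"/"are"
    # can only depend on the two preceding words, so the word that actually
    # controls agreement ("keys") is invisible once material intervenes.
    sentence = ("the keys to the old wooden cabinet in the hallway "
                "are missing").split()
    order = 2  # a trigram model conditions on the previous two words only

    target = sentence.index("are")
    visible = sentence[max(0, target - order):target]
    print("context the model sees:", visible)          # ['the', 'hallway']
    print("word controlling agreement:", sentence[1])  # 'keys'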

------
robg
This is one of those rare moments in intellectual life where, having been in
the room and now seeing the debate develop, it becomes clear that the
resulting hype isn't (wasn't) loud enough.

This distinction marks the real turning point in AI, from abstract, grand
claims with highly restrictive evidence toward engineering that simply works.
Who cares about the ontology when we can recreate the capability? It's like
saying airplanes don't properly explain flight because they don't replicate
how birds do it. Who cares? We can fly (and translate, and soon reason)
artificially.

It's clear that Chomsky and Universal Syntax have held back the entire field
of AI (and especially at MIT). There isn't one algorithm in the human mind
that decodes all of our mental capabilities. That's mistaking subjectivity for
objective lessons. Trying to recreate that phantom has led to rule tables in
AI, constraints on how the mind must operate. Instead, by allowing those fuzzy
boundaries to accumulate with evidence, statistical approaches win in the long
term of our lives and in this debate.

Kuhn knew what happens to dinosaurs.

~~~
_delirium
I don't think I would take that strident battle-against-dinosaurs view, in
part because I think intellectually understanding things is useful in and of
itself, not some kind of "just build it and shut up" anti-intellectual view;
and also because I don't think it's an accurate summary of the history of AI.
There's no particular reason we can't both build and study things in various
ways, and the history of AI has been full of people doing many takes on both.

In particular, statistical approaches have been used for a long time, but were
not practical until fairly recently; it was the lack of "big data" computing
power holding them back more than anything. Statistical machine translation
and parsing experiments have been tried on and off for decades, but with
1950s-era data they produced total garbage as output, even worse than the
(also bad) symbolic approaches. Hence why Shannon's work on text processing
didn't produce practical NLP or NLG systems. It took Google-sized data to
produce statistical translation that was actually usable.

The numerical approaches that _were_ possible on computers of the time were
fairly extensively investigated when they became possible (e.g. the 1980s
focus on "sub-symbolic AI", with perceptrons, neural networks, numerical
regression methods, etc.). Some were shelved for years because they just
didn't work as well, e.g. symbolic game-tree search massively outperformed
machine learning in board games in the early experiments, which is why
Samuel's 1950s ML-based checkers player was theoretically intriguing but not
considered very practical.

~~~
slurgfest
Intellectually understanding things is useful. There's no anti-intellectualism
here.

It's become increasingly clear that Chomsky's approach to language is not
going to generalize to AI successes or to explanations of many domains of
human performance, as was promised. That approach has not borne fruit anywhere
except the narrow realm of syntax (in particular, typologies of syntax), while
machine learning has kicked butt left and right.

There is no reason people can't go on studying syntax at the same time machine
learning expands. But it would be dishonest to ignore the extreme and
exclusive claims laid down by Chomsky's school at the outset of the cognitive
revolution. These claims were given a very liberal benefit of the doubt for
decades. A lot of good work has been done in syntax. But now those claims
about AI and psychology-in-general are clearly threadbare, since they have not
yielded the promised fruits either in AI or in explaining human functions.
They have not even yielded plausible IOUs. Remember, this was supposed to
explain pretty much everything. Not even language acquisition has been
explained.

That is empirical inadequacy. Who cares if Chomsky thinks it's beautiful?

And it's equally clear that learning of various kinds has to occur and also
gives us the more parsimonious and elegant solutions to problems (no-solution
and no-explanation is not elegant even if it is simpler).

Machine learning has a great deal of conceptual beauty - if you study it
rather than pooh-poohing it because of some facile abductive argument to UG.

Computing power isn't a problem. The scale of the human brain is enormous and
we are still nowhere near reaching it with our computational resources.

~~~
ddd571
What about research showing that things like intermediate traces exist, that
we parse sentences in accordance with binding principles, and that we don't
postulate gaps for fillers in island contexts unless there'll later be a
potential filler gap? All of these are facts we know from abstract work in
Chomskyan syntax, and yet they've been replicated in laboratories. So your
claim that there are no fruits is simply empirically false.

------
rm999
I've been in machine learning/AI for ten years now - from undergraduate
research, to graduate school, to industry - and I find debate like this
fascinating. My take on it is that our understanding of what we will be able
to do in the future is very unclear, and what we will want to do is very open-
ended. So the debate is worth having, but it won't really resolve anything.

Statistical models may (in my opinion probably will) end up being an "AI"
dead-end, eventually falling into other fields such as algorithms, like game
trees and logic-based agents did. That's not to say the current statistical
approach is a bad idea; on the contrary, I think these techniques are useful
and simple enough that they will become fairly ubiquitous in CS.

On the Chomsky side of the argument, AI researchers have consistently been
frustrated in the past 50 years, to the point that studying AI today makes you
sound like a joke. But their goal is a noble one. Anyone can understand how
great it would be to have a human-level intelligence on a chip - this would
fundamentally change the world. The fact that we haven't dented this problem
doesn't mean the problem isn't worth solving, it just means our understanding
of what it takes to build this kind of AI is in its infancy.

I almost feel like Norvig and Chomsky are arguing in parallel. They are both
right, but their arguments are valid on different time scales. Today, the
Norvig approach will easily win out; Chomsky has nothing and is largely
irrelevant. But Chomsky is, IMO, correctly predicting what will need to happen
to move beyond an eventual roadblock in a much grander AI.

------
debacle
They have two different definitions of "artificial intelligence," which is
where the schism seems to be arising from.

Chomsky takes the academic approach - artificial intelligence is the
simulation of humanlike (or even possibly mammalian) intelligence.

Norvig is taking the engineering approach - artificial intelligence needs only
to pass the Turing test.

They're both right, both approaches have value, and they both are bound by our
limited technology at the moment.

In the end, though, Norvig will lose out. Sure, he'll make the finish line
first - an AI capable of 'passing' the Turing test, but in order to have real
intelligence you need an analytical engine (or brain, if you will) that can
prioritize data without fiddling with bits. In the Norvig solution, someone
will always have to be fiddling with the bits.

Chomsky's approach, on the other hand, will result in a 'true' artificial
intelligence, the way neurologists understand it. It's just going to take a
lot longer to get there.

~~~
jan_g
But what exactly is _true_ artificial intelligence? For example, I consider
Google search and Wolfram alpha very intelligent. They can do math, answer
questions, rank information, follow current events, ...

~~~
debacle
They're still computers in the traditional sense - they only do what someone
told them to do.

~~~
zumda
You are right, but to solve most problems, do machines really HAVE to think
like humans?

~~~
zumda
(can't answer you directly, so I'm doing it here)

My point was that, for example, language recognition doesn't need human
intelligence; a statistical model is enough.

Or driving around a confined space only requires a particle filter, not human
intelligence.

------
azakai
First thing: please read the actual article by Norvig; it is excellent:

<http://norvig.com/chomsky.html>

Second: I found it astounding that the article never mentions Skinner. Surely
this article is trying to do to Chomsky what Chomsky did to Skinner in 1959
("A Review of B. F. Skinner's Verbal Behavior",
<http://www.chomsky.info/articles/1967----.htm> ).

Chomsky basically marked the beginning of the modern era of cognitive
psychology with that essay, displacing the previous paradigm of behaviorism.
Norvig's article has a similar form in some ways to that article, and similar
goals (to argue for a new paradigm over an older one). As I was reading it, I
was sure Norvig had that context in mind. So I was surprised to read

> So how could Chomsky say that observations of language cannot be the
> subject-matter of linguistics? It seems to come from his viewpoint as a
> Platonist and a Rationalist and perhaps a bit of a Mystic

Well, no, Chomsky explained very well why he opposed observations being the
subject matter of linguistics in his 1959 essay. Skinner's behaviorism looked
only at observations and experience, and did away entirely with internal
mental states. That might seem bizarre to us today, and the reason is in large
part the shift heralded by Chomsky's article from behavioral psychology to
cognitive psychology. In the latter, the goal is to understand the internal
processes that are involved in psychology (or specifically language).

Statistical language models are not behaviorism. But they do share a lot with
it: they are based primarily on raw empirical observations as opposed to deep
models, so it is natural for Chomsky to oppose them on similar grounds (and
not due to Platonism or Rationalism, although I suppose you can speculate that
those motivated his 1959 essay too).

Side note, we can speculate that if Skinner had today's computers and
statistical modelling methods, the shift from behaviorism to cognitivism might
never have happened, seeing as the statistical approach is so successful.

------
orbitingpluto
I know a card counter. I showed him how to condition probabilities to
determine how to best play. He went for the full Monte Carlo method and he
lets his simulation run for a week before he starts using it "just to make
sure". It's frustrating because he doesn't get that his results are
statistically significant after about 30 seconds of runtime. He still makes
money doing it. The results are tangible, but he's still just mucking about.
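
Roughly what I mean about the 30 seconds: the standard error of a Monte Carlo
estimate shrinks like 1/sqrt(N), so the extra week buys essentially nothing. A
made-up sketch (the payoff distribution is a stand-in, not a real blackjack
model):

    # Standard error of a Monte Carlo estimate falls off as 1/sqrt(N).
    import math
    import random
    import statistics

    def simulate_bets(n, edge=0.01):
        # win +1 with probability (1 + edge) / 2, lose -1 otherwise
        return [1 if random.random() < (1 + edge) / 2 else -1
                for _ in range(n)]

    for n in (1_000, 100_000, 1_000_000):
        results = simulate_bets(n)
        mean = statistics.fmean(results)
        stderr = statistics.stdev(results) / math.sqrt(n)
        print(f"n={n:>9,}  edge~{mean:+.4f} +/- {1.96 * stderr:.4f} (95% CI)")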

'Quantum mechanics is certainly imposing. But an inner voice tells me that it
is not yet the real thing. The theory says a lot, but does not really bring us
any closer to the secret of the "old one." I, at any rate, am convinced that
He does not throw dice.' --Einstein

Statistical methods can work but they are unsatisfying to the scientifically
curious. You're not really a scientist if you create something that works and
you don't really know why. (Not to say that the method doesn't have value.
Sometimes you have to play with your Lego before you grow up.)

~~~
bluekeybox
> You're not really a scientist if you create something that works and you
> don't really know why.

According to your logic, the only true "science" is mathematics. If you test
the workings of your "creation" using the scientific method, you're still a
scientist. The scientific method is also about testing your claims
empirically, and it has been successfully applied for more than a century to
the study of biological organisms, climate, and other complex systems that we
do not "really" understand. Not to belittle understanding of the underlying
mechanism, which is always preferable, just to point out that there is more
than one way to skin a cat.

~~~
orbitingpluto
I notch you up a point good sir for discovering my bias. (mathematics)

~~~
bluekeybox
I guess I know your bias because I'm biased towards mathematics as well :).

------
VikingCoder
I picture Chomsky as Kepler, trying to build orbits out of Platonic solids.

Until Kepler had access to Brahe's data, he was not going to be able to come
up with his theories of planetary motions.

Worse than that, the laws of planetary motion present a simplistic view of the
universe: what happens when a bunch of small objects orbit a very massive
object. I think they wouldn't help you out at all, in trying to understand
planets moving in a binary star system.

There is no analytic solution to the N-body problem. We can only simulate the
motions of a group of massive bodies by iteratively applying the laws of
gravitation that we have deduced. Knowing the mathematical properties of how
objects behave in a gravitational field and actually understanding HOW GRAVITY
WORKS are two enormously different things. Newton was frustrated with the
theory of gravity because it was, like Norvig's models, just a model - with
no explanation of why. But the model allows you to make falsifiable
predictions and understand how the universe will behave. Looking for the
Higgs boson is awesome - but there is potentially no equivalent in the
linguistic world.

Chomsky asks us to ignore F = G * m1 * m2 / r^2, because there's no WHY
attached to it.
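
What "iteratively applying the laws of gravitation" amounts to, sketched
crudely (a toy Euler integration with arbitrary units, not a serious
simulator):

    # Two bodies, crude Euler steps: acceleration from G*m/r^2, applied
    # over and over. Arbitrary units throughout.
    import math

    G = 1.0
    bodies = [  # mass, position [x, y], velocity [vx, vy]
        {"m": 1000.0, "pos": [0.0, 0.0],  "vel": [0.0, 0.0]},
        {"m": 1.0,    "pos": [10.0, 0.0], "vel": [0.0, 10.0]},
    ]

    def step(bodies, dt=0.001):
        acc = [[0.0, 0.0] for _ in bodies]
        for i, a in enumerate(bodies):
            for j, b in enumerate(bodies):
                if i == j:
                    continue
                dx = b["pos"][0] - a["pos"][0]
                dy = b["pos"][1] - a["pos"][1]
                r = math.hypot(dx, dy)
                pull = G * b["m"] / r ** 2  # acceleration on body a
                acc[i][0] += pull * dx / r
                acc[i][1] += pull * dy / r
        for a, (ax, ay) in zip(bodies, acc):
            a["vel"][0] += ax * dt
            a["vel"][1] += ay * dt
            a["pos"][0] += a["vel"][0] * dt
            a["pos"][1] += a["vel"][1] * dt

    for _ in range(10000):
        step(bodies)
    print(bodies[1]["pos"])  # a prediction, with no "why" attached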

PS - this understanding of the history of science is brought to you by Carl
Sagan's Cosmos TV series. I have no deeper insight than that.

~~~
OmegaHN
I think it is the other way around. Chomsky is trying to find the underlying
structure of intelligence (just like gravity underlies planetary motion), and
is saying that others are simply trying to generate a model of intelligence
(through statistical methods) with no understanding of why the intelligence
behaves that way. Gravity is the why, planetary motion is the model produced
by data (acquired by Brahe).

~~~
VikingCoder
What will prevent Chomsky from having an Earth-centric model of the solar
system, with epicycles to explain all of those weird little tics like dropped
pronouns?

The only thing that could possibly break you out of that way of thinking is
massive amounts of observational analysis to show you that your foundation is
flawed.

Seeing the moons of Jupiter revolutionized physics. Chomsky says that
observing the heavens is unnecessary, and a distraction from his studying of
the motion of billiard balls.

He's got trigonometry down cold, but he'll never come up with calculus that
way. And quantum mechanics would never fall into Chomsky's way of thinking, in
my analogy.

------
mootothemax
Isn't this basically an argument over John Searle's Chinese Room thought
experiment?

 _It supposes that there is a program that gives a computer the ability to
carry on an intelligent conversation in written Chinese. If the program is
given to someone who speaks only English to execute the instructions of the
program by hand, then in theory, the English speaker would also be able to
carry on a conversation in written Chinese. However, the English speaker would
not be able to understand the conversation. Similarly, Searle concludes, a
computer executing the program would not understand the conversation either._

<http://en.wikipedia.org/wiki/Chinese_room>

~~~
sp332
The argument is whether a computer can learn language (well) from scratch, or
whether some capacity for language must be built into the computer manually.
<https://en.wikipedia.org/wiki/Poverty_of_the_stimulus>

~~~
mootothemax
Ooh, I see, interesting, thanks! :)

------
brudgers
Intellectually, there seems to be something as wrong with avoiding
anthropomorphism when discussing human endeavors (such as language) as there
is with anthropomorphic explanations of erosion or chemical reactions.
Skinnerian approaches to language may leave people unsatisfied because there
is no story, just clinical observation.

Norvig's approach (as characterized in the article) takes the "Artificial"
in "Artificial Intelligence" to include the mechanism by which an intelligence
makes decisions. Chomsky's aesthetic of linguistics applied to AI would treat
"Artificial" as a description of the platform in which an intelligence is
embodied (i.e. non-biological) while requiring the platform to operate
linguistically on the same principles as a "natural intelligence."

Norvig's approach (as characterized in the article) is essentially a better
Eliza (or Ford's faster horse).

If one takes the Turing Test as scientifically meaningful rather than as an
engineering standard, then one falls in one camp or the other and the Norvig-
Chomsky debate is over a pseudo-problem. "Artificial Intelligence" is in that
sense metaphysical jargon.

~~~
slurgfest
Skinner's book Verbal Behavior was mostly unsatisfying because it didn't have
a lot of data; it really just laid out a research program which had not been
carried out in any significant way (and now, never will be). Of course it is
also unsatisfying that Skinner does not appeal to our sense that we already
understand everything important about psychology and language "from the
inside" and don't really need any stinking data.

The reason most people are unsatisfied with Skinner's approach to language is
that they did not read Verbal Behavior, but rather Chomsky's review; and
because Chomsky chose it (as among Skinner's weakest work) and reviewed it in
the most uncharitable way possible, without understanding any of the basic
concepts or motivations of Skinner's approach.

So, for example, he successfully associates Skinner directly with Watson, and
makes it out that "radical" behaviorism is radical not for its rejection of
premises of classical behaviorism but for being even more crazy.

That review is a masterpiece of propaganda and it effectively prevented
Skinner's basic ideas from even being seriously evaluated ever again.

~~~
brudgers
Just to clarify, my reference to Skinner was to behaviorism in general or,
even more generally, to radical empiricism in regard to human activities.

------
Jun8
OK, let me start with two facts, one objective, one personal: (i) Noam Chomsky
is a genius with many contributions to linguistics and computer science; (ii) I
think his overall influence has been damaging to linguistics.

Here's a summary of Chomsky's career in layman's terms: As everyone knows,
Chomsky first came to prominence with his critique of Skinner (who, as
everyone also knows, was a total psycho). He pretty much created linguistics
as we know it (at least in the US; there were some numbskulls in Europe who
still doubted the new order), starting from the main thesis of linguistic
universals, which can be summarized as the claim that all humans possess _the
same_ language faculty, i.e. the wide range of linguistic differences between,
say, English and Mandarin are just on the surface. This was a welcome relief
against the Sapir-Whorf mumbo-jumbo which held that Eskimos had hundreds of
words for snow and that language constrained how we think. Chomsky has also
been very active in politics (he's actually much better known to the general
world for his political books), pointing out the evils especially of the
American brand of capitalism (is there any other kind?) and its corrosive
influence on the world, e.g. Iraq, Afghanistan, etc. He also points out errors
in certain approaches in economics, e.g. see
<http://en.wikiquote.org/wiki/Noam_Chomsky#Capitalism>, without holding a
degree in the field, but everybody does that.

Chomsky's greatly damaging influence on linguistics is due to the fact that
his speculative and simplistic (at least originally) views on how the brain
processes and learns language have stifled research in promising fields for
decades. The main problem I have with him is that the cause of the
shortcomings of his theory seems to be not lack of knowledge (very little was
known about cognition in the 60s), which, of course, handicaps all pioneers of
science, but politics (I detest politically motivated scientific theories).
AFAIK, his universalist views were motivated by his political beliefs.

Luckily, starting in the 90s, Chomsky's chokehold on linguistics has slipped
somewhat. Researchers, such as Leda Cosmides, have ventured into research on
linguistic relativity (<http://en.wikipedia.org/wiki/Linguistic_relativity>).
Skinner's theories are making a comeback in academic circles
(<http://www.theatlantic.com/magazine/archive/2012/06/the-perfected-self/8970/>).

So, what does all this mean for the current debate? I think it's time to
retire the "old guard"! Let us acknowledge their breakthroughs and their
contributions, but also their limitations, and move on.

~~~
zzzeek
I'm completely ignorant here - is it widely established that there's no link
between how we think about things and the languages we speak? It seems
intuitive that our ability to conceive of concepts would be dependent on
having/creating language that can describe them, even in our own minds, but I
have no idea how anyone could really know one way or the other.

~~~
Jun8
As you point out, this is a hard thing to test. Add to this the fact that the
question may be a sensitive one, similar to differences between men and women.

How can one go about testing the effect of language on thinking? Consider this
example: English has an explicit grammatical structure for counterfactuals
(CFs), e.g. "If I were a rich man ...", whereas some languages, e.g. Chinese
(Mandarin), do not (they have some other means, but nothing as overt). One can
then think of presenting stories containing complex CF situations to native
English and Chinese speakers, and somehow test how quickly they grasp them.
This exact experiment was performed by Alfred Bloom in 1981 and indeed showed
some differences. Later researchers noted some points that might have affected
the results. You can see why this research may be sensitive: it may be
mistakenly used to argue that Chinese speakers are somehow linguistically
deficient.

~~~
pyoung
I am no expert in this area, but I believe this is a fairly good example of
the issue you are describing. According to the study, it appears as if the
gender system used in some languages biases individuals' perception of the
object.

[http://public.wsu.edu/~fournier/Teaching/psych592/Readings/G...](http://public.wsu.edu/~fournier/Teaching/psych592/Readings/Gender_Grammar.pdf)

------
PaulHoule
Well, in the big picture, Chomsky created an activity which keeps linguists
very busy. His approach, however, has contributed very little to language
engineering.

~~~
_delirium
Do you mean specifically of human languages? Because Chomsky's approach has
contributed pretty extensively to _programming_ language engineering, as the
foundation of parsing theory and the whole formal language hierarchy (context-
free, context-sensitive, regular, etc.).

I do agree it's been less successful for its original intended purpose, but
things often find new life, which seems okay.
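
As a tiny illustration of what that hierarchy captures: the context-free
language a^n b^n (think nested brackets in a programming language) can't be
recognized by any finite-state / regular model, but the two-rule grammar
S -> 'a' S 'b' | empty handles it. A toy recognizer, purely for illustration:

    # Recursive-descent recognizer for S -> 'a' S 'b' | (empty).
    def parse_S(s, i=0):
        """Return the index just past the material matched by S."""
        if i < len(s) and s[i] == "a":
            j = parse_S(s, i + 1)
            if j < len(s) and s[j] == "b":
                return j + 1
            raise SyntaxError("expected 'b' at position %d" % j)
        return i  # epsilon production

    def accepts(s):
        try:
            return parse_S(s) == len(s)
        except SyntaxError:
            return False

    print(accepts("aaabbb"))  # True
    print(accepts("aaabb"))   # False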

~~~
slurgfest
Yes: the Chomsky hierarchy is fundamental to computer science, one of the
great intellectual achievements of humanity. And in that respect it is also
important to AI.

Chomsky is incredibly strong on anything that does not require empirical data.

But UG has no legs, Chomsky's analysis of syntax has very limited
applicability, and after many, many IOUs, pretty much no empirical claims have
panned out in any significant way.

If you take away the application of those basic computer science concepts to
language, you unfortunately take away most of what Chomsky has written
regarding linguistics, psychology and AI. Because of the sheer volume of
output, that leaves a number of contributions. My point is that it is
necessary to be discriminating rather than making Chomsky into the Pope, as
certain fields have done for some time.

------
mcguire
Historically, AI has been divided into two related but different approaches.
"Strong" AI is interested in understanding and creating Minds; figuring out
what intelligence is, how it works, how we do it, and how it could be done _in
general_. "Weak" AI is interested in doing things that couldn't be done
before; things that we do not have good algorithms for, or don't have any
algorithms at all.

Those two are not _opposed_. Any advance on either side helps the other. In
this argument, Norvig is representing an extreme version of weak AI since he
seems to be arguing that it's possible that statistical methods are _all there
is_. (I suspect that he isn't actually making that argument, though, but that
strong AI's models are currently too simplistic to capture what statistical
approaches can do.) Chomsky, on the other hand, seems to be caricaturing
strong AI by saying that anything that doesn't directly shed light on the
Grand Theory is worthless.

------
aidenn0
It's a question of engineering vs. science. Before Kepler, people actually
could predict the motion of the stars and planets through the sky; perhaps not
as elegantly or accurately as after Kepler, but to a certain degree - so what?

The AI case is clearly a point where the theories from linguistics are
insufficient for engineering purposes. Watson could not have been built, even
today, based on Chomskian linguistics. Maybe the statistical models will
advance the theory of linguistics, maybe not. Either way they will give us
useful tools _now_, which is better than elegant tools later.

------
frobbin
AI research fields, including speech recognition and machine vision, are
currently ENGINEERING disciplines trying to make artifacts that do interesting
things. Success is an artifact that works.

Several basic science disciplines are trying to understand how brains work.
There are mostly tremendous amounts of experimental facts, difficult to put
together, and some theory and modelling to go with them.

Norvig would be confused if he thinks that engineering AI systems
automatically produces models useful for understanding the brain. If there is
application to understanding brains, it is a welcome accident. It happens that
there are signals in the basal ganglia that look like the temporal difference
error signal from reinforcement learning. So maybe RL research can help
understand some brain circuitry in that case.

But in general the engineers are trying to get stuff to work, and they are
deluded if they think they are simultaneously making progress in understanding
how brains work.

EDIT:

For example: why does speech recognition use hidden Markov models and N-gram
language models? Because they're the best model of how brains understand
speech? No! Not at all. HMMs and N-gram models are above all computationally
tractable: easy to implement, not too slow to run.

We have algorithms (such as Baum-Welch and N-gram smoothing techniques) to get
them to work well in engineering applications. Nothing more. Might they help
us understand brains? Maybe, but not at all necessarily so.
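
To make that concrete, here is a toy sketch of the language-model half of such
a pipeline: a bigram model with add-one (Laplace) smoothing scoring competing
transcriptions. The corpus and candidate sentences are made up for the
example:

    # Bigram language model with add-one smoothing, scoring candidates.
    import math
    from collections import Counter

    corpus = "recognize speech recognize speech with a language model".split()
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    V = len(unigrams)  # vocabulary size, for smoothing

    def log_prob(sentence):
        words = sentence.split()
        return sum(
            math.log((bigrams[(prev, w)] + 1) / (unigrams[prev] + V))
            for prev, w in zip(words, words[1:]))

    # The acoustically confusable pair from the classic example:
    for candidate in ("recognize speech", "wreck a nice beach"):
        print(candidate, round(log_prob(candidate), 2))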

------
aangjie
Just for the record, I consider this a simple model. And it's from Norvig:
<http://norvig.com/spell-correct.html>

------
fat_clown
It is an interesting debate, though I think it's being cast in the wrong
light.

According to the article, it almost sounds like Chomsky believes a statistical
approach to AI is a disservice to the field. The point he's missing is that
research in statistics-based AI is just that - statistics research.

Chomsky and Norvig deal in two different fields, which happen to have similar
applications. Norvig does research in statistical and machine learning.
Success in this field comes from a new model that can make more accurate
predictions, or from a proof that it is impossible to make valid predictions
about X with only Y as input. Applications of this field include technologies
which rival AI systems as envisioned by Chomsky, but the essential point is
that this field focuses on statistics research, not AI research.

Chomsky is wrong in dismissing this as a disservice. I do agree with his main
point, that AI research and knowledge is not necessarily furthered by
statistics research, but that is simply because they are different beasts
entirely.

Maybe one day, when the biology has caught up with us and we have a solid
understanding of the brain, we will be able to create a highly intelligent
computer. Until then, statistics research is most likely to yield fruitful
results.

~~~
slurgfest
Chomsky sees a threat to a politics-based academic hegemony, so he responds
politically. If he hadn't made so many broadly quantified claims (i.e. had
stuck to syntax as Norvig sticks to AI and machine learning) then those claims
would not be in such jeopardy from other fields. Chomsky has always been a
top-class warrior and this is more of the same.

------
no_more_death
One myth I want to debunk:

Copernicus's theory did NOT do away with epicycles. Search on Google for
"copernicus epicycle" and the first article demonstrates my point. The one who
did away with epicycles was Kepler. Copernicus believed orbits had to be
perfectly circular; Kepler recognized that the data fit better into an
elliptical model.

It's not 100% clear whether the author believed the "myth," but hopefully I
can set some people straight in this forum.

------
mbq
The main problem with Chomsky's approach is that it is quite likely that the
mechanics of human intelligence are simply incomprehensible to a human
intelligence, and not because of some crazy construction tricks but simply
because of the plain old brute size and complexity involved. Judging from much
simpler (and thus more deeply investigated) biological systems, like some
bacterial metabolisms, we can see that there is no grand design there, only a
trivial primitive core and numerous layers of more or less subtle modifiers of
modifiers. IMO there is no reason why the same can't be true for the brain,
and thus the "transition to sentience" is far more continuous than we would
like to expect.

------
stcredzero
_> If the solar system’s structure were open for debate today, AI algorithms
could successfully predict the planets’ motion without ever discovering
Kepler’s laws, and Google could just store all the recorded positions of the
stars and planets in a giant database_

I'm sorry, but this bit is half wrong and simply numerically illiterate. We
can store all of the recorded positions of the planets and other bodies in the
solar system, but we need models to predict their future positions. This is an
important distinction, since we might use such models to save the human race
one day.

------
sireat
There must be analogies to be made with the much smaller field of computer
chess programs.

From the 1950s to about 1980 or so, it was thought that the best computer
chess program would approximate the way a human would think about the game.
Botvinnik in particular was adamant that such an approach would be the right
one.

However, most of the progress was made through brute force. Modern chess
programs select moves in a way that is far removed from the way a good
chess player selects moves, yet they can now produce games that seem very
"uncomputer-like" and "human".

------
6ren
It's true that Engineering at times leads Science. But, from a scientific
view, what's the point of a model if you can't understand it? After all, we
already know how to create intelligence without understanding it.

While it's conceivable that intelligence is too complex for a human to ever
understand (e.g. if not amenable to hierarchical decomposition), that would be
very sad news for science.

------
yters
Norvig is only trivially right. Sure, with enough stats you can infer a lot of
the structure of all the information we humans have created, and thus
replicate that structure, as Google is doing with its suggest service. However,
this does not explain how humans created the structure in the first place.
Such a form of AI will forever be playing catch-up to humans.
------
ecolak
When Einstein heard about Quantum Mechanics and the idea that everything is a
probability, he said: "God doesn't roll dice". He meant that even though
Quantum Mechanics does give us many answers about the world of the tiny, it
doesn't truly explain it. I believe that a similar analogy can be made to this
case.

------
ilaksh
I know that everyone has been careful not to mention Chomsky's political
beliefs, but I am suspicious that this is actually partly about Chomsky's
political beliefs, which I think are more in line with reality or at least
more egalitarian than Norvig's must be, since Norvig has been running one of
the hegemony's greatest tools recently. I see a parallel between the general
derisive dismissal of Chomsky's academic views as simplistic and the type of
dismissal commonly given to a Chomskyish geopolitical viewpoint. I see this
disagreement as a surrogate for the very different geopolitical worldviews.

I doubt that Chomsky is really so hard-line about his old approaches to AI as
we are led to believe, although he is probably farther behind the times than
Norvig.

I actually think that even Norvig is just applying recent contemporary AI to
AI problems, but still is part of an old or establishment guard himself as far
as AI goes. I think that the real cutting edge AI research is called AGI
(artificial general intelligence) research.

The generation/category of AI research or machine learning that Norvig is tied
into is much newer and steps beyond the earlier traditional AI that Chomsky
might have been involved with, but the AGI researchers are a step beyond
Norvig's clique. And the AGI researchers are, by the way, very optimistic
about the Singularity or at least the likelihood of human-like and probably
super-human artificial general intelligence in the short or medium term.

I mean, the Norvigish machine-learning stuff isn't completely disconnected
from the AGI stuff or completely behind it, and I assume it will result in
extremely capable AIs relatively soon, but the AGI approaches will probably
prove to be more powerful and more humanlike since they are closer to human
models.

Take a look at what Brain Corporation is doing, or Numenta, or the OpenCog
project. That stuff is beyond Norvig and friends' approaches.

------
psb
Where is eyudkowski when we need him?

------
SlipperySlope
I am an entrepreneur/researcher working to create artificial intelligence. My
approach follows Turing's suggestion that one should create a child mind and
proceed to educate it. I employ Construction Grammar in my English dialog
system - not a statistical parser/generator. Operating on a smartphone, I use
available statistical speech recognition engines to transform speech to text,
but from that point onwards the server-side processing in Construction Grammar
is symbolic, thus engineered from first principles. Likewise, for English
generation, my discourse planner emits structured RDF that the bi-directional
Construction Grammar generator transforms into a text utterance. That symbolic
text is then input to a statistical text-to-speech engine available on the
smartphone, to speak to the user.

As an example of the power of symbolic approaches, my parser has a complete
symbolic analysis of English auxiliary verb constructions, producing unique,
meaning-rich, RDF-compatible semantics for:

I am learning about computers.

We are learning about computers.

We will be learning about computers.

I could be learning about computers.

I have been learning about computers.

I better learn about computers.

I had better learn about computers.

I dare learn about computers.

I did learn about computers.

I do learn about computers.

He does learn about computers.

I had learned about computers.

He has learned about computers.

I have learned about computers.

He is learning about computers.

I need learning about computers.

I ought to learn about computers.

I ought to be learning about computers.

I used to learn about computers.

I was learning about computers.

We were learning about computers.
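
(To give readers a feel for what a symbolic analysis might produce, here is a
hypothetical, much-simplified sketch - not my actual Construction Grammar
system - that maps an auxiliary chain to explicit tense/aspect/modality
features instead of probabilities. The feature names are invented for the
example.)

    # Map an auxiliary chain to symbolic tense/aspect/modality features.
    def analyze_auxiliaries(sentence):
        words = sentence.lower().rstrip(".").split()
        features = {"tense": "present", "aspect": set(), "modality": None}
        for word in words:
            if word in ("was", "were", "had", "did"):
                features["tense"] = "past"
            if word in ("will", "shall"):
                features["tense"] = "future"
            if word in ("could", "would", "might", "ought", "dare", "need"):
                features["modality"] = word
            if word in ("have", "has", "had", "been"):
                features["aspect"].add("perfect")
            if word.endswith("ing"):
                features["aspect"].add("progressive")
        return features

    print(analyze_auxiliaries("I have been learning about computers."))
    # {'tense': 'present', 'aspect': {'perfect', 'progressive'}, ...}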

Because of the so-far limited success of my work, I am inclined to agree with
Chomsky's AI argument despite using a modern grammar opposed to his linguistic
principles.

An artificial intelligence will use both statistical and symbolic (e.g.
procedural) techniques, I think, with the most useful intelligent behavior
being symbolic - e.g. an AI designing, writing, and testing software.

