
Go, Marvin Minsky, and the Chasm That AI Hasn’t yet Crossed - wslh
https://medium.com/backchannel/has-deepmind-really-passed-go-adc85e256bec
======
joseraul
> AlphaGo isn’t a pure neural net at all — it’s a hybrid, melding deep
> reinforcement learning with one of the foundational techniques of classical
> AI — tree-search

Most board game computer players use some sort of tree search followed by
evaluation at the leaves of the tree. What we discovered in the 70s is that
you don't need to have human-level evaluation to win at chess; it is enough to
count material and piece activity, plus some heuristics (pawn structure, king
safety...); computers more than compensate for this weakness with their
superhuman tree exploration.
tree exploration.
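
For concreteness, here is a minimal sketch of such an evaluation (material
counting only; the piece values are the conventional ones, and the board
encoding is invented for illustration):

    # Count material: the crudest leaf evaluation a chess tree search
    # can call. Uppercase = White pieces, lowercase = Black pieces.
    PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

    def evaluate(board):
        score = 0
        for piece in board:
            value = PIECE_VALUES.get(piece.upper(), 0)
            score += value if piece.isupper() else -value
        return score

    # A minimax/alpha-beta search calls evaluate() at its leaf nodes;
    # superhuman search depth makes up for the crude evaluation.
    print(evaluate("RNBQKBNRPPPPPPPPpppppppprnbqkbnr"))  # 0: material is equal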

This approach never worked so well for Go because evaluation was a mystery:
which group is weak or strong? how much territory will their power yield?
These are questions that professionals answer intuitively according to their
experience. With so many parts of the board that depend on each other, we
don't know how to solve the equation.

It looks like AlphaGo is the first to get this evaluation right. At the
end of the game, its groups are still alive and they control more territory.
So Go evaluation is yet another task that used to be reserved for human experts
and that computers now master. The fact that this is mixed with classical tree
search does not make it less impressive.

~~~
theunixbeard
The author of this post (Gary Marcus) is a huge proponent of hybrid systems,
in fact he is using that technique for his current stealth startup:
[https://www.technologyreview.com/s/544606/can-this-man-make-ai-more-human/](https://www.technologyreview.com/s/544606/can-this-man-make-ai-more-human/)

~~~
nl
Yep.

Not to be harsh, but Marcus has been critical of Neural Nets for a while now.
His claims that there are issues around their provability are well made.

But.. there is a way to make people listen to you. It's called results. Deep
Learning is getting them, in an increasing number of diverse fields.

~~~
colllectorof
_But.. there is a way to make people listen to you._

Hype, right?

Choosing problems for their theatrical effect, rather than utility. Writing
articles and research papers as if they're marketing pamphlets. Claiming that
incremental improvements are paradigm shifts. Treating arbitrary achievements
as if they were commonly agreed upon milestones all along.

All of this is happening right now. Being skeptical in such an environment is
the only right thing to do.

 _Deep Learning is getting them, in an increasing number of diverse fields._

If you look for practical applications that give tangible benefits to people
outside of academia, the achievements of applied deep learning so far aren't
nearly as impressive as you make them out to be. This is despite insane levels
of hype, huge investments in research, and the amounts of computing power
available.

Heck, if anything, the fact that AlphaGo needs to use a tree search to prop up
its ANN components could be seen as a sign that ANNs have some serious
practical limitations when it comes to "results". Which is kind of the point
of the article.

~~~
nl
No doubt there is plenty of hype. From where I sit though, a lot of it is
justified (Not the general intelligence stuff of course).

 _Choosing problems for their theatrical effect, rather than utility. Writing
articles and research papers as if they're marketing pamphlets. Claiming that
incremental improvements are paradigm shifts. Treating arbitrary achievements
as if they were commonly agreed upon milestones all along._

I'm not sure what to say to this.

There are no "commonly agreed upon milestones". The closest things are the
academic benchmarks/shared tasks that you seem to be critical of.

I guess the closest thing you'll find to a "commonly agreed upon milestone" is
something like the Winograd schema[1]? Based on progress like "Teaching
Machines to Read and Comprehend"[2] I wouldn't be betting against deep
learning on that.

 _If you look for practical applications that give tangible benefits to people
outside of academia, the achievements of applied deep learning so far aren't
nearly as impressive as you make them out to be._

Could you explain what you were expecting? Deep learning techniques aren't
exactly widespread yet, and outside Google and a few other companies it takes
time for things to migrate into products and have tangible benefits.

Nevertheless, Google Search, Pinterest, Facebook image tagging, Android Voice
Search, etc., etc... these are all used by billions of people daily. I think
it's hard to argue there aren't at least some practical applications.

[1]
[https://en.wikipedia.org/wiki/Winograd_Schema_Challenge](https://en.wikipedia.org/wiki/Winograd_Schema_Challenge)

[2] [http://arxiv.org/abs/1506.03340](http://arxiv.org/abs/1506.03340)

------
shas3
Interesting article! However, Deep Blue, Watson, and AlphaGo are very
different from one another. I don't think anyone deemed beating humans at
chess or Jeopardy impossible at the time Deep Blue and Watson were built. On
AlphaGo the point about generalizing AI is valid, yet I think the author
doesn't fully appreciate the novelty of the approach described in the AlphaGo
paper. Their work advances the field and has more general utility than Deep
Blue's chess or Watson's Jeopardy programs. The AlphaGo paper specifically
represents an advance in machine learning algorithms for games in general. As
I understand it, Watson's new NLP algorithm is PRISMATIC [1]. PRISMATIC is a
rule-based NLP system, while AlphaGo is built on statistical inference and
neural networks. Even if AlphaGo's 'policy-network/value-network' framework is
not too generalizable, the philosophical implication is that we can build AIs
that can mimic 'human intuition'. Jeopardy and chess have smaller components
of 'human intuition' than Go. They are apples and oranges. So, I wonder if the
author errs in bringing Deep Blue and Watson-Jeopardy into the picture.
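
For concreteness, the paper blends the value network's estimate with fast
rollout outcomes at the leaves of its tree search. A sketch of that blending
(function and argument names here are illustrative, not the paper's code; the
paper used a mixing constant of 0.5):

    # AlphaGo's leaf evaluation (Silver et al., 2016):
    #   V(s_L) = (1 - lam) * v_theta(s_L) + lam * z_L
    # v_theta: value network's win estimate for the leaf position
    # z_L:     outcome of a fast rollout played out from the leaf
    def leaf_value(value_net_estimate, rollout_outcome, lam=0.5):
        return (1 - lam) * value_net_estimate + lam * rollout_outcome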

In my opinion, while Watson-Jeopardy and Deep Blue were 'over-fitting' for
Jeopardy and chess respectively, AlphaGo's algorithms are more general and
'over-fit' for the larger category of 'games'.

[1]
[http://brenocon.com/watson_special_issue/05%20automatic%20kn...](http://brenocon.com/watson_special_issue/05%20automatic%20knowledge%20extration.pdf)

~~~
Gibbon1
I'll get hammered for this statement but.

Traditional AI suffers from a love of solving parlor tricks. Solving tic-tac-
toe, checkers, chess, poker, Jeopardy(tm) are parlor tricks. It seems
important because frankly humans just suck at parlor-trick-type problems, and
other forms of intelligence, like say a cat brain, don't even get parlor
tricks. So then we found that computers were really good at parlor tricks and
it seemed like we were really onto something here. But nope.

On the other hand, playing Go is not really a parlor trick; it's actually
_hard_: simple symbolic logic totally fails to grasp the problem at the first
level.

~~~
grmarcil
Can you provide a firm definition of parlor trick?

Not saying I agree or disagree with you, but without a real definition of
parlor trick you have a wide open no-true-Scotsman defense.

~~~
resu_nimda
Well, that's the crux of the problem - the corollary to your question is "can
you provide a firm definition of intelligence?" Nobody can yet, so it's all
speculation and subjective opinion.

The reason that I personally consider all these things parlor tricks
(including, hypothetically, complete mastery of Go) is that I see no path from
these particular types of systems to general intelligence. A human can take in
arbitrary sensory data and make all sorts of conclusions and associations with
it. Does this particular system have the capability to get to a point where it
can see an apple falling and posit a theory of gravity? Will it ever be able
to read subtle cues in facial/oral/bodily expression and combine them with all
sorts of other data, instantaneously, to achieve compelling real-time social
interaction? Will this system ever _invent_ the game of Go, or anything else,
because it felt like it? No, it has absolutely no framework to do any of those
things, or countless other things humans can do. It's a machine built with a
single purpose in mind, and it can only serve that purpose. It's a glorified
function call. I don't think this type of machine will just wake up one day
after digging deeper and deeper into these "hard" tasks. We need breadth, not
depth.

~~~
such_a_casual
You may find this paragraph on wikipedia interesting:
[https://en.wikipedia.org/wiki/Great_ape_language#Limitations...](https://en.wikipedia.org/wiki/Great_ape_language#Limitations_of_ape_language)

------
vonnik
I'm a bit disappointed in this piece. It doesn't say anything surprising to
anyone who read a few words into the DeepMind paper, and it serves to settle
some of Marcus's academic scores:

> two people ought to be really pleased by this result: Steven Pinker, and
> myself. Pinker and I spent the 1990’s lobbying — against enormous hostility
> from the field — for hybrid systems

Told-you-so's are almost always boring, especially when they are part of a
larger campaign. Here, Marcus's campaign is that neural nets are not enough,
which isn't really news to DeepMind or most other people working with NNs, and
doesn't matter much to anyone not working with them.

His chief critique is their interpretability.

> In 2016 networks have gotten deeper and deeper, but there are still very few
> provable guarantees about how they work with real-world data.

But for most of the world, including the people using whatever Marcus's
startup eventually makes, predictive accuracy trumps interpretability. Let's
get to 99% accuracy and worry about why later. And that's what researchers
have done for many problems with NNs. Of course it's nice to know why, but
it's not a fatal flaw if you don't, most of the time.

IBM's struggles to market Watson are a bit of a straw man. If you had judged
the PC market by IBM's moves a few decades ago, you might have reached the
same conclusions, and you would have been dead wrong.

~~~
eli_gottlieb
> But for most of the world, including the people using whatever Marcus's
> startup eventually makes, predictive accuracy trumps interpretability.

Does it? What if the network has 99% accuracy, but is equally confident about
its correct _and_ incorrect predictions? "Deep neural networks are easily
fooled", after all.

~~~
vonnik
Sure, that's a flaw. It's just not a fatal flaw. For the chief reason that
there's probably nothing better. So you take your lumps and remember that it's
wrong sometimes. It's something we can work on while still benefitting from
these tools.

~~~
eli_gottlieb
> For the chief reason that there's probably nothing better.

The paper "Deep Neural Networks are Easily Fooled" noted that generative
models don't suffer from this flaw.

------
splatcollision
Interesting read & puts the Google Go bot in some needed perspective.

From the article:

> In the real world, the answer to any given question could be just about
> anything, and nobody has yet figured out how to scale AI to open-ended
> worlds at human levels of sophistication and flexibility.

One doesn't have to shoot for the moon in order to find useful applications
for AI or cognitive technology. If you can restrict the domain of knowledge of
an expert system, it doesn't need to create 'open-ended worlds' in order to
provide value. It just has to beat human effort, or augment human cognition to
enable scale, to be useful or provide business value.

~~~
frozenport
But then why call it AI? Perhaps the work that it does is not actually
intelligent?

~~~
empath75
Is what people do 'actually intelligent'? If you break down the processes of
the brain to a low enough level, all of the 'intelligence' will disappear,
just as it does in a computer neural network.

Intelligence is not some kind of aristotelian substance that permeates brain
matter. At some level, anything which is intelligent has to be built from
parts which are not intelligent.

~~~
colllectorof
_> Is what people do 'actually intelligent'?_

Yes, although there are plenty of people trying to convince everyone otherwise
by example.

 _> If you break down the processes of the brain to a low enough level, all of
the 'intelligence' will disappear, just as it does in a computer neural
network._

If you break matter down to a low enough level, everything is just elementary
particles. Now, would you please trade me some gold for an equal mass of
aluminum?

------
sixQuarks
It's fascinating that nature has created human-level intelligence using blind
randomness, albeit over a period of 1+ billion years.

My theory is that with renewed global focus on AI, we're going to have a lot
of minds looking at this problem from various outside perspectives. I believe
a breakthrough in AI will come about not from the computer science sphere, but
a very unlikely area that will surprise many.

~~~
nairboon
What would your guess be for that area?

~~~
sixQuarks
Biology would be the "obvious" answer, but more specifically, I could see a
breakthrough coming from psychedelic research, which is making a huge comeback
right now after decades of ridicule. It's amazing how little research has been
conducted in this area; a lot of scientists are rediscovering and relearning
things that were first explored back in the 60s, and there's already been a
lot of progress related to human psychology.

~~~
tim333
DeepMind are working on the biology thing, and more specifically the study of
the human brain. Demis Hassabis, the main guy at DeepMind, did a PhD in
cognitive neuroscience and is focused on that stuff. Not so sure about
psychedelic research - I don't know how you'd use that to build computer
systems, even if the Google dream pictures are pretty trippy looking.

([https://www.google.com/search?q=google+dreams&num=20&tbm=isc...](https://www.google.com/search?q=google+dreams&num=20&tbm=isch&tbo=u&source=univ&sa=X&ved=0ahUKEwj41bDriNvKAhXDuo4KHZxNC40QsAQIHw&biw=1093&bih=478))

~~~
sixQuarks
Actually, you just brought up a great point. The Google dream pictures are
very similar to what you might see during a psychedelic experience.

I admit I don't know how psychedelic research could help in building AI, I'm
just saying that my hunch is that a breakthrough in AI will come about from
left field somewhere.

------
theunixbeard
An interesting article about the author, Gary Marcus, and his stealth
startup[1]: [https://www.technologyreview.com/s/544606/can-this-man-make-ai-more-human/](https://www.technologyreview.com/s/544606/can-this-man-make-ai-more-human/)

[1] [http://geometric.ai/](http://geometric.ai/)

------
Animats
I went through Stanford in the 1980s, just as it was becoming clear that
logic-based AI had hit a wall. That was the beginning of the "AI Winter",
which lasted about 15 years. Then came machine learning.

AI used to be a tiny field. In the heyday of McCarthy and Minsky, almost
everything was at MIT, Stanford, and CMU, and the groups weren't that big.
There were probably less than 100 people doing anything interesting. Also, the
total compute power available to the Stanford AI lab in 1982 was about 5 MIPS.

Part of what makes machine learning go is sheer compute power. Training a
neural net is an incredibly inefficient process. Many of the basic algorithms
date from the 1980s or earlier, but nobody could hammer on them hard enough
until recently. Back in the 1980s, John Koza's group at Stanford was trying to
build a cluster out of a big pile of desktop PCs. Stanford got a used NCube
Hypercube with 64 processors (1 MIPS, 128KB each). The NCube turned out to be
useless. There was a suspicion that with a few more orders of magnitude in
crunch power, something might work, but with the failure of AI, nobody was
going to throw money at the problem.

At last, there are profitable AI applications, and thus the field is huge.
Progress is much faster now, just because there's more effort going in. But
understanding of why neural nets work is still poor. Things are getting
better; the trick of using an image recognition neural net to generate
canonical images of what it recognizes finally provided a tool to get some
insight into what was going on. At last there was a debug tool for neural
nets. Early attempts in that direction determined that the recognizer for
"school bus" was recognizing "yellow with black stripe", and that some totally
bogus images of noise would be mis-recognized. Now there are somewhat ad-hoc
techniques for dealing with that class of problems.
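
The debug tool being described is usually called activation maximization:
gradient ascent on the input pixels to maximize a chosen class score. A hedged
PyTorch sketch (the model, step size, and class index are arbitrary choices
for illustration, not canonical settings):

    import torch
    from torchvision import models

    # Activation maximization: optimize the *image*, not the weights,
    # by gradient ascent to maximize one class's score.
    model = models.resnet18(pretrained=True).eval()
    image = torch.zeros(1, 3, 224, 224, requires_grad=True)
    target = 779  # assumed ImageNet index for "school bus"; verify locally

    for _ in range(200):
        score = model(image)[0, target]
        score.backward()
        with torch.no_grad():
            image += 0.1 * image.grad  # ascend the pixel gradient
            image.grad.zero_()

    # Unregularized, this produces exactly the high-scoring "bogus images
    # of noise" mentioned above; adding blur/jitter priors recovers the
    # more canonical class images.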

The next big issue is to develop something that has more of an overview than a
neural net, but isn't as structured as classic predicate-calculus AI. One
school tries to do this by working with natural language; Gary Marcus, the
author of the parent article, is from that group. There's a long tradition in
this area, and it has ties to semantics and classical philosophy.

The Google self-driving car people are working on higher-level understanding
out of necessity. They need to not just recognize other road users, but infer
their intent and predict their behavior. They need "common sense" at an animal
level. This may be more important than language. Most of the mammals have that
level of common sense, enough to deal with their environment, and they do it
without much language. It makes sense to get that problem solved before
dealing with language. At last, there's a "killer app" for this technology and
big money is being spent solving it.

~~~
harryjo
Some of us went through school in the 90s-2000s and were trained by folks who
never let go of the dead-end pure-logic-based systems.

~~~
Animats
Ouch. That was a decade late to be doing that.

I finished a MSCS in 1985. Ed Feigenbaum was still influential then, but it
was getting embarrassing. He'd been claiming that expert systems would yield
strong AI Real Soon Now. He wrote a book, "The Fifth Generation", [1] which is
a call to battle to win in AI. Against Japan, which at the time had a heavily
funded effort to develop "fifth generation computers" that would run Prolog.
(Anybody remember Prolog? Turbo Prolog?) He'd testified before Congress that
the "US would become an agrarian nation" if Congress didn't fund a big AI lab
headed by him.

I'd already been doing programming proof of correctness work (I went to
Stanford grad school from industry, not right out of college), and so I was
already using theorem provers and aware of what you could and couldn't do with
inference engines. Some of the Stanford courses were just bad philosophy. (One
exam question: "Does a rock have intentions?")

"Expert systems" turned out to just be another way of programming, and not a
widely useful one. Today, we'd call it a domain-specific programming language.
It's useful for some problems like troubleshooting and how-to guides, but
you're mostly just encoding a flowchart. You get out what some human put in,
no more.
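
To make "encoding a flowchart" concrete, here's a toy forward-chaining rule
engine of the sort the expert-system shells provided (the rules and facts are
invented for illustration):

    # You get out exactly the troubleshooting flowchart the human
    # expert put in -- no more.
    RULES = [
        ({"no_power"}, "check the power cable"),
        ({"has_power", "no_display"}, "check the monitor cable"),
        ({"has_power", "beeping"}, "reseat the RAM"),
    ]

    def diagnose(facts):
        return [advice for conditions, advice in RULES
                if conditions <= facts]  # rule fires if all conditions hold

    print(diagnose({"has_power", "no_display"}))  # ['check the monitor cable']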

One idea at the time was that if enough effort went into describing the real
world in rules, AI would somehow emerge. The Cyc project[3] started to do this
in 1984, struggling to encode common sense in predicate calculus. They're
still at it, at some low level of effort.[4] They tried to make it relevant to
the "semantic web", but that didn't seem to result in much.

Stanford at one time offered a 5-year "Knowledge Engineering" degree. This was
to train people for the coming boom in expert systems, which would need people
with both CS and psychology training. They would watch and learn how experts
did things, as psychology researchers do, then manually codify that
information into rules.[2] I wonder what happened to those people.

[1] [http://www.amazon.com/The-Fifth-Generation-Artificial-Intelligence/dp/0201115190](http://www.amazon.com/The-Fifth-Generation-Artificial-Intelligence/dp/0201115190)

[2] [https://saltworks.stanford.edu/assets/gx753nb0607.pdf](https://saltworks.stanford.edu/assets/gx753nb0607.pdf)

[3] [https://en.wikipedia.org/wiki/Cyc](https://en.wikipedia.org/wiki/Cyc)

[4] [http://www.businessinsider.com/cycorp-ai-2014-7](http://www.businessinsider.com/cycorp-ai-2014-7)

------
mcv
Interesting. I studied AI in the 1990s, and although I don't know this author,
I've always felt that the real progress in AI would come from combining
various techniques. I don't understand why there would be any hostility
towards that idea. (Except that in research people can be very protective of
their own pet projects.)

------
MrQuincle
The challenge is to have grammar-like and pointer-like structures in a
connectionist network. Grounding such symbolic notions is the entire quest!

MC tree search converges to minimax.
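
For readers following along: the standard MCTS selection rule is UCB1 applied
at each tree node (i.e. UCT), and as simulations accumulate, the value
estimates converge to the minimax values. A sketch of the selection step, with
an invented node representation:

    import math

    # UCB1 applied at a tree node (the "UCT" selection rule).
    # wins/visits are running rollout statistics; c = sqrt(2) is the
    # classic exploration constant.
    def select_child(children, parent_visits, c=math.sqrt(2)):
        def ucb1(child):
            if child["visits"] == 0:
                return float("inf")  # always try unvisited moves first
            exploit = child["wins"] / child["visits"]
            explore = c * math.sqrt(math.log(parent_visits) / child["visits"])
            return exploit + explore
        return max(children, key=ucb1)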

I'm really fond of Bayesian methods, but I think arguments about optimality
should not be overstated. The brain is just an approximation at best.

------
seanwilson
So now that computers are getting good at Go, what's the next logical stepping
stone for AI?

~~~
ktRolster
AI chess that actually thinks like a human:
[http://www.popularmechanics.com/technology/robots/a17339/chess-engine-plays-against-itself/](http://www.popularmechanics.com/technology/robots/a17339/chess-engine-plays-against-itself/)

If you could get a chess engine that can prune the tree as efficiently as a
human, while calculating as fast as a computer, it would be phenomenal.

~~~
te
After reading the DeepMind paper, it does seem like their techniques could
also be applied to chess and could possibly improve the state of the art there
as well. It is unclear, however, how much improvement remains in modern chess
engines. They are already phenomenal.

------
justicezyx
Beating the world's 663rd-ranked Go player is not mastering the game...

I guess AI has to purge humans from the earth to justify the statement "AI
masters xxxx"...

~~~
sago
The superhuman fallacy is the nemesis of all AI research: if it isn't better
than the best human who has ever attempted a problem, it is worthless and
derisory.

I've done a lot of work on artificial creativity, and am constantly thrilled
when code generates something at the level of a creative 3rd grader. But show
it to most people and you get "It's hardly a Michelangelo, is it?"

Frustrating.

~~~
Mark222
No, it's rather the idea that you cannot say "AI beats Go master" or "AI
masters Go" if it didn't actually attain some sort of high ranking by itself.
Beating a low-ranked player, while still interesting, is not necessarily proof
of proficiency; it could be due to blind luck, for example.

~~~
thangalin
Winning 5 games to 0 probably isn't blind luck.
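
A back-of-the-envelope check, assuming independent games: even if the two
players were evenly matched, a 5-0 sweep would be about a 3% event.

    # Probability of a 5-0 sweep between evenly matched players,
    # treating each game as an independent coin flip:
    print(0.5 ** 5)  # 0.03125, about 3%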

There's no clear definition of "mastery." Honinbō Shusaku was a master.
Honinbō Shūsai, Go Seigen, and Minoru Kitani would also be considered masters.
As time marches on, average player skill increases. While Honinbō Shusaku was
among the strongest players of his day, he would be hard pressed to hold his
own against professionals in today's era.

I think it's fair to say that anyone who reaches shodan (1 dan professional)
plays at the master level. The difference between 1 dan professional (1P) and
9 dan professional (9P) is three stones, or roughly 30 points. In amateur
play, for comparison, the difference between a 1 dan and 3 dan is about 2
stones (roughly 20 points).

AlphaGo won 5 straight games against a 2 dan professional player. That puts
AlphaGo around 3P, well into the master range.

In Go, the ranks are:

    30 kyu amateur (never played)
    1 kyu amateur (understands the game)
    1 dan amateur (mastered the basics)
    7 dan amateur (nearly professional strength)
    shodan (1 dan professional)
    9 dan (top ranking professional)

[https://en.wikipedia.org/wiki/Go_ranks_and_ratings#Professio...](https://en.wikipedia.org/wiki/Go_ranks_and_ratings#Professional_ranks)

[https://en.wikipedia.org/wiki/Go_professional#Discrepancies_...](https://en.wikipedia.org/wiki/Go_professional#Discrepancies_among_professionals)

"Traditionally it has been uncommon for a low professional dan to beat some of
the highest pro dans. But since the late-90s it has slowly become more
common."

------
arthur_pryor
it seems to me that while humans are built out of neural nets, we are capable
of logic, and the sort of reasoning that seems to fall under the category of
"tree search" in this conversation. in that sense, it would seem that we can
do deductive logical reasoning, and we run that software on neural net
hardware (slash firmware/software).

so, why the slavishness among some to neural nets and building everything on
top of that? why emulate logic processing on top of associative processing and
then mesh the two, when you could just do associative processing and mesh it
with more "native" logic processing, since man-made computing hardware already
does that easily?

also: this has been brought up here and elsewhere before, but as the article
mentions: "Just yesterday, a few hours before the Go paper was made public, I
went to a talk where a graduate student of a deep learning expert acknowledged
that (a) people in that field still don’t really understand why their models
work as well as they do and (b) they still can’t really guarantee much of
anything if you test them in circumstances that differ significantly from the
circumstances on which they were trained. To many neural network people,
Minsky represents the evil empire. But almost half a century later they still
haven’t fully faced up to his challenges."

let's keep in mind that some of the biggest mistakes humans make come from
unexamined associative reasoning. let's also keep in mind that this crowd
tends to be particularly suspicious of the sort of not-strictly-logical
associative heuristics that we associate (haha) with more mainstream society
(unexamined articles of faith, habits and social customs that have unintended
deleterious effects, etc).

one of my favorite failures of loose associative reasoning is racism, so
here's a link many of you have probably seen, but it's an important thing to
keep in mind, so here it is anyway:
[http://www.nytimes.com/2015/07/10/upshot/when-algorithms-discriminate.html](http://www.nytimes.com/2015/07/10/upshot/when-algorithms-discriminate.html)

let's not reproduce some of our species' biggest flaws just because it let us
expediently reach some narrowly defined (and likely immediately financially
desirable) goal.

i realized i've sort of conflated my points here: it is entirely possible to
implement all sorts of terrible things (like racism) using logic. there's
certainly no shortage of spurious logic to justify all sorts of bad behavior.
axioms need to be sound and nuance needs to be considered. but i think my
general point is that humans aren't some sort of model of perfection, and
should not be copied as such.

