
Is There a Smarter Path to Artificial Intelligence? Some Experts Hope So - allenleein
https://www.nytimes.com/2018/06/20/technology/deep-learning-artificial-intelligence.html
======
jonathanstrange
Maybe good AI requires a combination of symbolic approaches for knowledge
representation, deep parsing for NLP and semantics/discourse models, and
machine learning for pattern recognition tasks of any kind.

The currently trendy "AI" looks more like massive data mining with powerful ML
to me; it's very good for certain tasks but brings us nowhere near real AI.
The knowledge representation problem has not yet been solved. In order to get
even just a convincing simulation of AI, let alone real AI, we need a large
common-sense knowledge base / computational ontology.

One perhaps promising approach is geometric meaning theories/concept
representations that allow for logical combinations. For example, Diederik
Aerts works in this area. However, to be honest, I don't have the math skills
to evaluate his approach. Generally speaking, logical modelling is too limited
for good concept representations - especially classical '90s AI like default
reasoning and other nonmonotonic logics - whereas traditionally geometric
representations suffer from problems with representing logical inference and
quantification. IMHO, that's a problem worth looking at. (Admittedly, I'm a
bit biased towards symbolic AI, like the people in this article.)

~~~
YeGoblynQueenne
>> Generally speaking, logical modelling is too limited for good concept
representations

In principle, First Order Logic and equivalent languages can represent
anything that can be represented in natural language. The problem is that in
practice it is very hard to transfer into a logic language all the knowledge
you might need for a useful system.
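
To make the "in principle" part a bit more concrete with a toy example of my
own (not from the article): a sentence like "every man loves a woman" can be
written out explicitly in FOL, e.g.

    \forall x\, \big( \mathrm{man}(x) \rightarrow \exists y\, ( \mathrm{woman}(y) \wedge \mathrm{loves}(x, y) ) \big)

but even here you must choose between this reading and the one with the
quantifiers swapped (one particular woman loved by every man), and you have to
make choices like that for every sentence you encode. That is exactly the
practical bottleneck I mean.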

In fact, this was one motivation for at least one branch of machine learning:
the work of Ryszard Michalski, Claude Sammut, Ranan Banerji, Steven Vere and
Brian Cohen, later rule-learners like Quinlan's decision tree learners, and
of course all the later work in Inductive Logic Programming by Plotkin,
Shapiro, Muggleton etc. These are all rule-learning systems that can avoid
the knowledge acquisition bottleneck that probably did in purely rule-based
expert systems in the '90s (i.e. getting experts to transfer their knowledge
into rules).

I rather agree that further progress in AI will require a, let's say,
syncretistic approach - like I say in another comment, the obvious thing for
me is to use deep learning for perception, logic for inference and
probabilistic modelling to deal with the noisy world. There are some people
working in that sort of direction, with different amounts of emphasis on each
of the three approaches. For instance, Josh Tenenbaum at MIT, Evans and
Grefenstette at DeepMind, Luc De Raedt at KU Leuven, Kristian Kersting at
Dortmund, Lise Getoor at UC Santa Cruz, Pedro Domingos at Washington, and many
others.

Apologies for all the name-dropping without links. Let me know and I can
provide them if required.
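
In the meantime, here is a deliberately toy sketch of how the three pieces
might compose: a stubbed "deep learning" perception stage emitting symbols
with confidences, a tiny forward-chaining rule layer for inference, and a
probability threshold for the final decision. This is my own illustration,
not anyone's actual system; every name and number below is made up.

    # Toy hybrid pipeline: stubbed neural perception -> rules -> probabilistic decision.
    from typing import Dict, List, Tuple

    def perceive(frame_id: str) -> Dict[str, float]:
        """Stand-in for a neural network: raw input -> symbols with confidences."""
        fake_detections = {
            "frame_001": {"cat(x1)": 0.92, "on(x1, mat1)": 0.71},
        }
        return fake_detections.get(frame_id, {})

    # Definite rules over ground atoms: (body, head). A derived atom gets the
    # minimum confidence of its premises (a crude choice, just for the sketch).
    RULES: List[Tuple[List[str], str]] = [
        (["cat(x1)"], "animal(x1)"),
        (["animal(x1)", "on(x1, mat1)"], "indoors(x1)"),
    ]

    def infer(facts: Dict[str, float]) -> Dict[str, float]:
        """Forward-chain the rules to a fixpoint over the perceived facts."""
        derived = dict(facts)
        changed = True
        while changed:
            changed = False
            for body, head in RULES:
                if head not in derived and all(a in derived for a in body):
                    derived[head] = min(derived[a] for a in body)
                    changed = True
        return derived

    def decide(derived: Dict[str, float], threshold: float = 0.6) -> List[str]:
        """Keep only conclusions that clear a probability threshold."""
        return [atom for atom, p in derived.items() if p >= threshold]

    print(decide(infer(perceive("frame_001"))))
    # -> ['cat(x1)', 'on(x1, mat1)', 'animal(x1)', 'indoors(x1)']

A real system would of course replace each stub with the genuine article (a
trained network, a proper theorem prover, a probabilistic model), but the
division of labour is the point.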

~~~
paganel
> In principle, First Order Logic and equivalent languages can represent
> anything that can be represented in natural language.

Is there any proof that there exists a non-leaky abstraction between natural
language and First Order Logic (or logic in general)? Afaik there is none, but
I might be wrong. For example (one amongst thousands): “I don’t like you” can
actually mean (and it often does) the exact opposite, and the same can be said
of a simple statement like “I am ok”, which can also mean the exact opposite.
Natural language is a very hard problem.

~~~
YeGoblynQueenne
Well, going by formal language theory, FOL is equivalent to a universal Turing
machine, which is probably also the full expressive power of human language.
Therefore, everything that can be said in human language should _in principle_
be possible to say in FOL. I don't think this is controversial.

Notice again that I don't think this is feasible in practice. At least not
with hand-crafted FOL expressions and certainly not with existing technology.

I'm not sure what "non-leaky abstraction" means. Would formal language theory
fit the bill?

~~~
sgt101
Yeah, but humans don't have a single consistent logical model for all reasoning
- I believe that all my models can be expressed in FOL, but I don't believe
that they can be unified in FOL.

~~~
polkapolka
Dennett aims to solve this using heterophenomenology:

[https://en.m.wikipedia.org/wiki/Heterophenomenology](https://en.m.wikipedia.org/wiki/Heterophenomenology)

In this framework, utterances can be studied without taking their truth value
at face value.

~~~
sgt101
Very interesting, I'll have to think _a lot_ about that.

------
FrozenVoid
Let's rename "Deep learning" to "Statistical Compression". It's essentially
distilled data from a dataset, limited by the size of the neural network. There
isn't anything "deep" there; training a NN always tends towards overfitting the
data.

~~~
gaius
_Let's rename "Deep learning" to "Statistical Compression"_

The previous name for what is now called “machine learning” was predictive
statistics. But a better name for deep learning is “machine intuition”, since
it seems to work pretty well but can’t be readily explained.

~~~
FrozenVoid
> can’t be readily explained

It works "pretty well" in well-defined complete-information domains like Go.

The training in that case is actually about improving the next dataset from
which higher-quality patterns are extracted.

1. We have a NN that doesn't know how to play and makes random moves.

2. A blank-slate dataset is filled from random games.

3. The NN learns to produce better games by weighting wins vs. losses, making
losing moves less likely and winning moves more likely.

4. The NN produces a slightly better dataset.

5. Repeat from step 3 until you have covered a large chunk of Go openings and
have a pretty good estimate of move quality.

6. The NN ends up as condensed statistical pattern data extracted from billions
of games, able to estimate the value of a move more accurately than humans can
(a rough sketch of this loop follows below).

Still, in some rare cases this "condensed pattern database" will make mistakes
due to gaps in the data or wrong correlations that the NN hallucinated from its
connections.
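
Here is a rough, runnable toy version of that loop. It is my own sketch; the
function names, the fixed-length "games" and the random scoring are stubs, not
anything taken from AlphaGo or a real engine.

    import random
    from typing import Callable, List, Tuple

    Move = int
    Game = List[Move]

    def random_policy(history: Game) -> Move:
        # Step 1: a "network" that knows nothing and plays random moves.
        return random.randrange(361)              # 19x19 board points

    def play_game(policy: Callable[[Game], Move]) -> Tuple[Game, int]:
        """Play one self-play game; return (moves, winner). Scoring is a stub."""
        moves: Game = []
        for _ in range(50):                        # toy fixed-length game
            moves.append(policy(moves))
        winner = random.choice([+1, -1])           # a real engine would score the game
        return moves, winner

    def update_policy(dataset: List[Tuple[Game, int]]) -> Callable[[Game], Move]:
        # Step 3: prefer moves from won games; a real system would do gradient
        # updates on the network's weights here instead of this re-weighting stub.
        good_moves = [m for game, winner in dataset if winner > 0 for m in game]
        def policy(history: Game) -> Move:
            return random.choice(good_moves) if good_moves else random_policy(history)
        return policy

    policy: Callable[[Game], Move] = random_policy
    for generation in range(5):                    # steps 2, 4, 5: dataset -> policy -> better dataset
        dataset = [play_game(policy) for _ in range(100)]
        policy = update_policy(dataset)
    # Step 6: "policy" is now a condensed statistical summary of the games played.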

~~~
gaius
_works "pretty well" in well-defined complete-information domains like Go_

But no one can definitively say “my AI credit scoring system didn’t decline
his mortgage because he is black” and that’s the explainability bar that needs
to be passed.

Right now the state of it is “gut feel” with an unknown amount of unconscious
bias.

~~~
tim333
Same as humans really.

------
sixdimensional
This discussion reminds me of NASA's CLIPS [1], particularly FuzzyCLIPS. I
know that CLIPS is basically a rule engine/expert system shell at its core, but
FuzzyCLIPS was an interesting variant that tried to do fuzzy inference over
CLIPS rules and facts.

I always wondered if rules and facts can be seen as higher-order abstractions
that represent the low-level data that is encoded into neural nets. Would that
mean that a system of symbolic logic, or of rules and facts, can in fact
accomplish the same thing, just in a different, higher-order way?

Then, when you get into combining fuzzy reasoning with known facts, it's like
different organs in the brain combining to serve different functions. When I know a rule and
recall it, I believe it to be true and act immediately on it. When I'm not
sure, I get fuzzy and try to deduce it. The more data and inputs I have about
a given environmental condition, problem, subject, input, fact, etc., the
easier it becomes to "reason" using fuzzy links to try to work out the answer.
Unless I get information overload or the complexity exceeds one human brain's
capability.

I've always felt that these techniques are complementary, and that we probably
employ multiple subsystems in our brains together to act like humans.

[1]
[https://en.m.wikipedia.org/wiki/CLIPS](https://en.m.wikipedia.org/wiki/CLIPS)

------
gremlinsinc
One problem with AI (or general AI) is that it lives in a box: it experiences
what we tell it to experience, unlike a child, who learns organically from
whatever it's randomly exposed to while trying to survive and thrive.

Perhaps the answer is to figure out all the things that make people tick via
deep learning and other techniques, build one huge open API/DB for
vision/audio/movement in 3D
planes/ethics/morality/understanding/discernment/language, and then a more
general AI could simply send out calls to APIs of data already trained on
specific foundations and create a 'whole' brain from multiple 'sub' brains
(pre-learned open knowledge).

Unless we literally learn to clone the human brain technologically, in its
entirety.

I think another issue is that not everyone working in AI is a neuroscientist,
and it may take understanding every mapped-out facet of the human brain to come
close to re-creating it in machines. It can't all be deep learning,
unfortunately, because that's not how we learn, and we're the best-guess
solution to 'intelligence', as the only intelligent species (we know of) in the
universe.

~~~
choxi
> it can't all be deep learning, unfortunately, because that's not how we learn

If you could humor me, why can't it all be deep learning? It seems like a
robust model for how we learn: we start with some priors (random weights),
experience something that either confirms or negates those priors (sampling),
and then adjust our priors based on that experience.

Of course there are a wide variety of architectures and the human brain is
more sophisticated than one conv net. Our brains have on the order of 100
billion neurons and I believe the largest neural networks are on the order of
tens of millions. I would argue that a combination of better architectures and
scale could simulate human intelligence.
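
To make that loop concrete, here is a minimal toy sketch of it as plain
stochastic gradient descent on a linear model. It is my own illustration, not
meant as a model of the brain; the "world" here is just a hidden weight vector.

    import random

    # Step 1: start from random priors (random weights).
    weights = [random.gauss(0.0, 1.0) for _ in range(3)]

    def predict(x, w):
        # A linear model standing in for "what we currently believe".
        return sum(wi * xi for wi, xi in zip(w, x))

    def experience():
        # Step 2: sample an observation that confirms or contradicts our belief.
        true_w = [2.0, -1.0, 0.5]                  # the hidden regularity in the "world"
        x = [random.uniform(-1.0, 1.0) for _ in range(3)]
        y = sum(ti * xi for ti, xi in zip(true_w, x))
        return x, y

    learning_rate = 0.1
    for step in range(2000):
        x, y = experience()
        error = predict(x, weights) - y
        # Step 3: adjust the priors in proportion to how wrong they were.
        weights = [wi - learning_rate * error * xi for wi, xi in zip(weights, x)]

    print(weights)    # ends up close to [2.0, -1.0, 0.5]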

~~~
chii
> we start with some priors (random weights)

I don't think we start with random priors. We start with a set of priors that
are geared for certain things - facial recognition, language, etc. The
environment then fine-tunes them.

~~~
garmaine
There is no way that the genome encodes synaptic layouts in such detail as to
specify an algorithmic prior. At best it specifies, loosely, the generic
architecture of how many neurons, how many layers, how tightly folded, and
where the inputs (senses) connect. We develop the same algorithms because we
all start with the same priors in largely the same gestational and early
infant environments.

~~~
username90
Right, watch this genius horse learn to walk and navigate a 3D environment
just hours after birth:
[https://www.youtube.com/watch?v=RXKdYThau7c](https://www.youtube.com/watch?v=RXKdYThau7c)

No, we don't "learn" how to balance, how to identify objects or how to navigate
3D environments. All of that is heavily encoded in our DNA by hundreds of
millions of years of evolution. Human babies, like kangaroos, are just born too
early for those systems to have developed.

~~~
garmaine
Is this sarcasm? You do know that babies (human or animal) exercise their
muscles and gain body awareness during gestation, right?

------
sgt101
Why Prolog? Why not PRISM or ProbLog?

~~~
YeGoblynQueenne
This is a valid question; I don't know why you (originally) got downvoted.

The answer, I think, at least for applications in industry, is that PRISM and
other probabilistic programming languages are not as developed as Prolog, a
language that has been around for a good four decades now. SWI-Prolog in
particular is a free and open-source Prolog interpreter with an IDE, a
graphics package and a veritable smorgasbord of libraries, including an HTTP
server, JSON and XML processing, encryption, interfaces to SQL and other
databases, etc. SWI is in active development, compatible with YAP-Prolog
(currently the fastest implementation), very well supported and documented,
and has an active community.

It's a lot easier to set up a production environment in Prolog, especially
using SWI, than in pretty much any other logic programming language, including
the probabilistic logic programming ones.

Btw, I noticed that Kyndi lists experience with Logtalk as a "plus" for one of
their career opportunities (of course I looked :), which means that they're
probably using SWI-Prolog.

~~~
randcraw
But it's not clear to me that encoding productions in source code adds any
value over encoding them in the various other forms used by expert systems of
old. Prolog also compels binary constraint satisfaction, recursive descent
parsing, backtracking, and depth-first resolution — all undesirable
constraints that are easily avoided using other fact representations or
resolution engines.

Until the startup in question can make a compelling case for using Prolog,
much less avoiding the brittleness inherent in expert systems' resolution
model, count me a disbeliever that their 'better way' really is.

~~~
YeGoblynQueenne
I wouldn't say that backtracking and depth-first search (not resolution) are
undesirable. As far as I'm concerned, they're pragmatic choices that minimise
the amount of resources necessary to perform resolution theorem-proving.

~~~
randcraw
I would hate for my logic engine to be _required_ to backtrack every time a
candidate interpretation was evaluated. AFAIK, Prolog allows no alternative.
Nor can it support the many powerful probabilistic extensions for rule-based
reasoning that arose 25(?) years ago, nor the many improvements and variations
on impasse resolution.

IMHO, the choice of Prolog unnecessarily straitjackets a modern production
system, making the engineering approach of this startup that much less
powerful, flexible, or viable than it should be.

~~~
eazar001
The SWI-Prolog ecosystem supports some probabilistic extensions such as
[http://www.swi-prolog.org/pack/list?p=ccprism](http://www.swi-
prolog.org/pack/list?p=ccprism), and [http://www.swi-
prolog.org/pack/list?p=cplint](http://www.swi-prolog.org/pack/list?p=cplint).

Also, the (!/0) operator allows you to prune choice points and control
backtracking so that it is not unconditional. Many Prolog implementations
support the (->/2) operator, as well as ((*->)/2) -- the soft cut -- for more
fine-grained control. SWI has a nice library for even further control over
backtracking as well: [http://www.swi-
prolog.org/pldoc/man?section=solutionsequence...](http://www.swi-
prolog.org/pldoc/man?section=solutionsequences).

Also, good Prolog programmers are usually not overly concerned with
backtracking, as they have strong knowledge of modes and determinism.
See: [http://www.swi-prolog.org/pldoc/man?section=modes](http://www.swi-
prolog.org/pldoc/man?section=modes). This is mostly second-nature to a skilled
Prolog developer.

------
alexgmcm
I think it's obvious to anyone with experience in the field that deep learning
etc. is neither a path to AGI nor an emulation of biological intelligence.

But it is one thing to know that the current path isn't leading us there and
another thing entirely to know which path will.

After all it isn't even clear that emulating biological intelligence would be
a wise approach - we didn't build the jumbo jet by copying the birds.

------
whaaswijk
Somehow the main article link is behind a paywall for me, but the mobile one
is not: [https://mobile.nytimes.com/2018/06/20/technology/deep-
learni...](https://mobile.nytimes.com/2018/06/20/technology/deep-learning-
artificial-intelligence.html)

(YMMV)

~~~
baking
The New York Times and Washington Post both use a "fingerprint" method to
identify unique users using things like your browser version and IP address. I
like using Firefox Nightly because a quick update to Nightly will make it
appear that I'm a new user and will reset my free "monthly" article limit.
That's also why switching browsers will work for some people, but not others.

------
YeGoblynQueenne
>> If the reach of deep learning is limited, too much money and too many fine
minds may now be devoted to it, said Oren Etzioni, chief executive of the
Allen Institute for Artificial Intelligence. “We run the risk of missing other
important concepts and paths to advancing A.I.,” he said.

I was very pleasantly surprised this year to see the Symbolic Machine Learning
course at Imperial College London well-attended by about 30 students each
session. The course basically teaches the principles and practice of Inductive
Logic Programming, a set of algorithms and techniques that learn logic
programs from relational data [1]. It is an optional course at Master's level
so I was originally expecting to see maybe half a dozen people attending (I
assisted with a couple of the sessions and followed the rest to brush up on my
background).

I discussed my surprise with my supervisor who is one of the tutors on the
course. His opinion was that logic-based AI is much closer to the material
that most Computer Science students are familiar with, than statistical and
probabilistic AI. It's true that, although many CS courses have started to
include statistics and probability in their curriculum (the one at Imperial
certainly does), the mainstays of computer science are logic, discrete maths,
complexity and computation theory, and language theory. The jump from there to
GOFAI is just a short hop; in many ways, the two fields are really one
subject.

My intuition is that the boom in deep learning is partially being fed by
disciples of fields well outside CS - maths, physics, bio-sciences etc. -
jumping on to the, well, bandwagon, motivated by the deep learning frenzy in
industry and not so much by any interest in AI or even computers. At the same
time, there is plenty of traditional CS talent waiting in the wings, so a
revisiting of the good old ways of knowledge representation and inference may
not be completely out of the question [2].

Of course, there's always the chance that the students only came to the course
because it had "machine learning" in its name :)

__________________

[1] More to the point, both the inputs and outputs of ILP are logic programs,
i.e. binary relations. Most of the work uses Prolog, but there also exist
techniques to learn Answer Set Programming, Constraint Logic Programming and
of course Probabilistic Logic Programming programs. In any case, the
homoiconicity of logic programming languages allows a newly learned hypothesis
to be immediately reused as data to learn a new concept, in a way that is
probably impossible with statistical techniques. And of course, learning is
extremely sample-efficient and the resulting programs generalise very well.

I bet you had never heard of any of this before now :)

[2] To be honest, my guess is that the way forward is a combination of the two
approaches- perhaps, some monster combo of deep learning for sensory
perception, logic for inference and probabilities for final decision making.

One way or another, I expect the AI researcher of the future will have to be a
jack of all trades - and master of all of them at once. Learn to juggle.

------
skohan
Deep learning has produced a lot of interesting results over the past few
years, but we're still miles away from anything that even approximates human
intelligence. When I hear someone like Elon Musk say something like
"Autonomous vehicles can drive without Lidar because humans can drive without
Lidar" it makes me cringe.

The human brain is unimaginably complex. Even if we created a neural network
on the order of the 100 trillion or so connections in the central nervous
system, we would still not even be close. Each one of those synapses is a
complex, constantly adapting chemical computer which is sensitive to the
interaction of dozens of neurotransmitters. Gradient descent is to neural
adaptation as Minecraft is to photorealism.

The idea that artificial intelligence will be achieved by adding a few layers
to a "deep" neural network, or by throwing more data at it is laughable.

~~~
lawlessone
>"Autonomous vehicles can drive without Lidar because humans can drive without
Lidar"

I have another reason to hate this. If we could give humans lidar to make
their driving even better, why shouldn't we?

Why should we deny an AI extra senses that can help it do its job better than
we can?

~~~
ianai
There is such a thing as being crippled by information.

------
vinchuco
Hardware enables software

~~~
alexgmcm
Yeah - I remember reading about memristor-based technology that aimed to
better emulate neurons in hardware back when I was at uni.

That was almost 10 years ago though and I haven't heard much of it since.

------
trumped
We need a field of AI that makes sure all other fields of AI behave nicely.

