
Programming with a Differentiable Forth Interpreter - sapphireblue
http://arxiv.org/abs/1605.06640
======
hacker42
I've forgotten where, but I recently read a blog post arguing that we might
be at the dawn of an applied mathematics winter: instead of meticulously
crafting algorithms for particular problems, we just apply stochastic
gradient descent to a computational graph and let it evolve whatever program
solves the problem defined by some training data.

Exciting times.
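
To make that concrete, here's a minimal sketch of the idea (my own toy
example in PyTorch, not the paper's actual setup): define a computational
graph with free parameters, then let SGD find whatever "program" (here just
a cubic polynomial) fits the training data.

    import torch

    # Training data defines the problem: recover y = x**3 from examples.
    x = torch.linspace(-1, 1, 100)
    y = x ** 3

    # The "program" is a parameterized graph: c0 + c1*x + c2*x^2 + c3*x^3.
    coeffs = torch.zeros(4, requires_grad=True)
    opt = torch.optim.SGD([coeffs], lr=0.1)

    for _ in range(2000):
        pred = sum(coeffs[i] * x ** i for i in range(4))
        loss = ((pred - y) ** 2).mean()   # distance from the training data
        opt.zero_grad()
        loss.backward()                   # backprop through the graph
        opt.step()                        # one SGD step on the parameters

    print(coeffs)  # approaches [0, 0, 0, 1]: SGD "discovered" y = x**3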

~~~
Animats
I don't fully understand the article, but they're applying it to bubble sort
and addition. This is about where things were with Lenat's Automated
Mathematician of 40 years ago.[1] Lenat was doing something similar, but on
LISP programs.

After a few years, it turned out that this approach only worked on problems
for which hill climbing worked really well. You need a well-chosen metric for
"sorted", and that metric guides the hill climber to converge on a bubble
sort. (But not on Quicksort.)
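
To illustrate (a toy sketch of my own, not anything from the paper): with
inversion count as the "sortedness" metric, greedy hill climbing over
adjacent swaps is exactly bubble sort, and nothing in that landscape points
toward Quicksort's divide-and-conquer structure.

    def inversions(xs):
        # Number of out-of-order pairs; 0 means fully sorted.
        return sum(a > b for i, a in enumerate(xs) for b in xs[i + 1:])

    def hill_climb_sort(xs):
        xs = list(xs)
        improved = True
        while improved:
            improved = False
            for i in range(len(xs) - 1):
                # Candidate move: swap an adjacent pair.
                candidate = xs[:i] + [xs[i + 1], xs[i]] + xs[i + 2:]
                if inversions(candidate) < inversions(xs):  # greedy step
                    xs, improved = candidate, True
        return xs

    print(hill_climb_sort([3, 1, 4, 1, 5, 9, 2, 6]))
    # [1, 1, 2, 3, 4, 5, 6, 9] -- the accepted moves are bubble sort's swaps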

This needs to be demonstrated on a harder problem.

[1]
[https://en.wikipedia.org/wiki/Automated_Mathematician](https://en.wikipedia.org/wiki/Automated_Mathematician)

~~~
aab0
I would be more optimistic here. AM/Eurisko never showed any replicated
results outside of one or two toy domains of simple math and the Traveller
game, and there have always been a lot of questions about how much of that
was even AM/Eurisko and how much was Lenat, since he refuses to share the
source code and his follow-up project Cyc is notorious for not delivering
anything. Deep networks, on the other hand, have delivered astounding results
across a huge variety of domains, implemented in different frameworks by
different people around the world, often taking quite different approaches. A
result from AM/Eurisko doesn't mean much. A result from a deep network may be
a crack in the dike that is about to burst and solve longstanding challenges
like ImageNet.

~~~
gambler
You're comparing a single system developed years ago to everything ever
produced with neural networks (a term that covers a whole family of different
architectures).

 _> A result from AM/Eurisko doesn't mean much. A result from a deep network
may be a crack in the dike that is about to burst and solve longstanding
challenges like ImageNet._

That sounds like extreme and unsubstantiated bias. Statements like this are
why I am highly skeptical of the current neural network hype.

~~~
aab0
> You're comparing a single system developed years ago to everything ever
> produced with neural networks

I am comparing a single system and all its variants and followups to another
family. Oh wait, there _aren't_ any variants and followups to AM/Eurisko
except Cyc. Huh. How about that.

> Sounds like extreme and unsubstantiated bias.

ImageNet? AlphaGo? SOTA on language parsing, classification, and prediction
tasks? Human-level performance on scores of Atari games? High-quality image
synthesis, unsupervised and from textual descriptions? Predictions of visual
cortex activations? Program synthesis? If you aren't impressed, you aren't
paying attention.

~~~
gambler
You've watched far too many Hinton videos.

Cyc belongs to the family of rule-based expert systems. Expert systems were
successfully used in medical diagnostics, chemistry, biology and various
branches of engineering. Not to mention countless "trivial" applications in
planning and logistics for businesses. I could also make a case that Deep
Blue was an expert system, and thus add "superhuman performance in chess" to
the list.

Saying that results from an expert system don't matter (simply because it's an
expert system), while believing equivalent results from an ANN will "burst
and solve longstanding challenges" (simply because it's "neural"), makes no
sense. ANNs are not magic.

