
Machine learning has become alchemy (2017) [video] - henning
https://www.youtube.com/watch?v=x7psGHgatGM
======
Barrin92
The problem with ML in my opinion is not that we're missing some sort of
fundamental theory, but that there simply is none. ML is essentially fancy
pattern matching roughly resembling the human visual system, which is why it
happens to be good at tasks related to perception.

It's not some master algorithm, it's not going to produce sci-fi AI, and it
probably isn't even suited to solve most problems in the realm of
intelligence.

In fact, it basically hits the worst possible spot on the problem-solving
scale. It barely learns anything given the amount of computational effort
and data that goes into it, but it just happens to be good enough to be
practically preferable to old symbolic systems.

It is completely mysterious to me how networks that approximate some utility
function are a huge step toward giving insight into cognition, reasoning,
modelling, the creation of counterfactuals, and the sorts of mechanisms we
actually need to produce human-like performance.

~~~
soared
> It barely learns anything given the amount of computational effort and data
> that goes into it, but it just happens to be good enough to be practically
> preferable to old symbolic systems.

I don’t follow this. Are you implying there haven’t been absolutely massive
gains in computer vision, NLG, NLP, etc.?

~~~
turingspiritfly
Those massive gains have yet to be considered reliable enough to be
trustworthy. Would you consider them trustworthy in court, where lives are
at stake? Gains are nice, but we are still so far from the essence of AI
systems, and considering how many resources we are pouring into learning, at
this point all of them appear as nothing more than massive, fat, expensive
toys.

~~~
another-one-off
> Would you consider them trustworthy in court, where lives are at stake?

Probably. Human intelligence is extremely fallible - based on the statistics,
the only reason we trust humans to do half the stuff they do is that there is
literally no choice.

If we held humans to a high objective engineering standard, we wouldn't:

* Let them drive

* Let them present their memories as evidence in a court case

* Entrust them with monitoring jobs

* Allow them to perform surgical operations

Humans are the best we have at those things, but from a "did we secure the
best result with the information we had" perspective they are not very
reliable. A testable, consistently performing AI with known failure modes
might even be preferable to a human despite a higher failure rate (e.g., we
can reconfigure our road systems if there is just one scenario an AI driver
can't handle).

Basically, you might be dead on the money that they are not 'trustworthy
enough', but let's not lose sight of the fact that even being an order of
magnitude from human performance might be enough after costs and engineering
benefits get factored in. The weakest link is the stupidest human, and that is
quite a low bar.

~~~
denzil_correa
> Basically, you might be dead on the money that they are not 'trustworthy
> enough', but let's not lose sight of the fact that even being an order of
> magnitude from human performance might be enough after costs and engineering
> benefits get factored in.

Ironically, the thing that is lost in this comment is "accountability".
In the case of a human, you can go back, trace the decision-making criteria,
and hold someone accountable. In the case of an algorithm, everyone washes
their hands of it. Performance is not the only criterion for deciding whether
algorithms are "trustworthy" over humans.

~~~
RA_Fisher
Linear models are highly interpretable, and an operator can be held
accountable.
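
A minimal sketch of that point, with hypothetical feature names and made-up
data - an ordinary least squares fit is readable coefficient by coefficient:

    import numpy as np

    # Hypothetical data: predict a risk score from two named features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))  # columns: [age_scaled, income_scaled]
    y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

    # Ordinary least squares: add an intercept column and solve.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    # Each coefficient reads directly: holding income fixed, one unit of
    # age_scaled moves the prediction by coef[1]. That is the audit trail.
    for name, c in zip(["intercept", "age_scaled", "income_scaled"], coef):
        print(f"{name}: {c:+.3f}")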

------
joe_the_user
It is easy to say deep learning is based on "alchemy" or "engineering" or
whatever it is that isn't strong theory. And it's reasonable to say deep
learning has a lot of mathematical and statistical intuitions but doesn't have
a strong theory - maybe it just doesn't yet have a strong theory, or maybe it
can never get one.

So this is by now a standard argument. The standard answers I think have been:

1) Well, we are discovering that you don't need a strong theory for a truth-
discovery machine. We've discovered how to construct thinking machines
experimentally.

2) It's true deep learning doesn't have a strong theory now but sooner or
later it will get one.

3) This shows that instead we need theory X, usually by people who've been
pursuing theory X all along. But I think Hinton at least has thrown a bunch of
alternatives against the wall over the years.

(I could swear this has appeared here before but I can't find it).

~~~
bartread
Worth also bearing in mind that we've been here before in other fields.
Alchemy ultimately became chemistry.

Even in the Victorian era - when, for fairly large swathes of the periodic
table and different types of compound, we already had quite a good
experimental understanding of chemical reactions in terms of their constituent
components and products, along with the conditions under which those reactions
occur - we still didn't know the why. We didn't understand much about atoms or
how they bond together, for example.

The point is this: science can often take a long time to advance, and AI is
still a very young field, with the first practical endeavours only dating back
to the post-WWII period.

Should we therefore be terribly surprised that ML seems a bit like alchemy?

As an aside, another normal facet of scientific advancement is the vast
quantity of naysayers encountered along the way. Haters gonna hate, I suppose.
(But don't misunderstand me: whilst I'm not an ML fanboi, I recognise that
advances come in fits and starts, dead-ends _will_ be encountered, and overall
it's going to take quite a long time and require a lot of hard work to get
anywhere.)

Final aside: this video has definitely been posted here before but I've also
been unable to find it.

~~~
joe_the_user
_Worth also bearing in mind that we've been here before in other fields.
Alchemy ultimately became chemistry._

Indeed, but that might be where the analogy breaks down.

I think one could say that we just don't know how far we can take
"experimental computer science" - where the experimental part is making
"random" or actually "seat-of-the-pants" programs and seeing what they do.
This is simply new, and whether one could create a physics on top of this
particular kind of experimentation is yet to be seen.

------
ineedasername
Even granting the claim that ML has become alchemy, consider what alchemy was:
misguided in its quasi-magical underpinnings and goals, but nonetheless an
extremely important step in the founding of modern chemistry. What started out
with poor understanding and numerous misconceptions evolved over time, through
trial and error and hard-won knowledge, into a massively important branch of
science. So if ML is indeed comparable to alchemy, the only salient question
is how long, and what it will take, to turn into chemistry.

------
rwilson4
Science is a combination of theory and experiment. Sometimes theory advances
faster than experiment, sometimes vice versa. Right now in ML, experiment aka
practice is advancing faster than theory. Theory will eventually catch up.

~~~
xamuel
ML is a souped-up version of "draw a line through these points". We can get
really efficient at drawing lines through points, but it's not like we'll
suddenly realize some deeper fundamental theory about it!

~~~
whatshisface
The reason you don't expect to see a deep fundamental theory of drawing a line
through a few points is because you can always do it. ML doesn't always work,
and sometimes it is harder to get working than other times. What's going on?

~~~
xamuel
You can always draw a line through the points, but it isn't always a good
approximation. If the points are inherently bunched around a line, then a line
through them will approximate them well. If they're a big random cloud, then
the line won't. It's the exact same way in ML, except the points are in
n-dimensional space and "line" is replaced by "higher-dimensional curve or
manifold". Sometimes (e.g. in image processing), the n-dimensional points are
inherently bunched around a curve or manifold of the form you're using, and
then ML works great. There's nothing deeper going on!
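
A toy illustration of that point (made-up data, not from the talk): the same
fitting procedure succeeds or fails depending only on whether the points are
actually bunched around a line:

    import numpy as np

    def r_squared(x, y):
        # Fit y = a*x + b by least squares; report the fraction of
        # variance the line explains.
        a, b = np.polyfit(x, y, deg=1)
        resid = y - (a * x + b)
        return 1 - resid.var() / y.var()

    rng = np.random.default_rng(1)
    x = rng.uniform(0, 10, size=200)

    y_line = 3 * x + 2 + rng.normal(scale=0.5, size=200)  # bunched near a line
    y_cloud = rng.normal(scale=5.0, size=200)             # structureless cloud

    print(r_squared(x, y_line))   # near 1: the line approximates well
    print(r_squared(x, y_cloud))  # near 0: same procedure, useless fit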

------
TrackerFF
The ML scene was more rigorous 10-15-20 years ago, because it was mostly
confined to the world of academia and industrial R&D, and we had far fewer
wide-reaching problems to work towards.

As tech evolves, more data is being generated, which in turn creates more
problems and increases the demand for solutions.

In short: solving "real world" problems is rewarded better than figuring out
the underlying technology, so it's no wonder that we see a larger portion of
practitioners who may lack the academic background - and that is
understandable.

You could be an ML Ph.D. (in academia) for 5-10 years, earning a government-
worker salary while trying to reach the next step - or hack together some
ML-based product until it hits the desired accuracy, and cash in at 50-100
times the pay.

Now, with that said, I understand that Ali targeted his speech at the NIPS
folk - but the industry/academia crossover in ML is massive, and it still
stands that many industry problems are of commercial interest - so the
motivation for many is still the same.

------
sonnyblarney
Making a serious, industrial scale web app in 2000 felt like alchemy. It was
all arcane, there were no established patterns, nobody knew how to do it for
sure, there were a lot of hustlers, most of them thankfully sincere.

When something is new, it feels like a mystery - eventually we'll have a
language for wrapping our heads around neural networks, even if it's not as
clear cut as we'd like.

~~~
brain5ide
We had the neural networks, and the language. The problem is the rebranding
and the amount of marketing bullshit coming with it. The null hypothesis is
that most of it is a pile of crap: overengineered, overoptimized solutions
that are probably applied at an abstraction layer different from the one they
are marketed on. A solution may come out of it, but that's more wishful
thinking than not.

------
whoisnnamdi
Sure - perhaps alchemy in the sense that many practitioners are simply
throwing things against the wall and seeing what sticks, but not in the sense
that there isn't anything real behind all the math or engineering.

Many advancements in machine learning have significant backing in theoretical
proofs that a given algorithm will result in unbiased estimates, or will
converge to such-and-such value, etc.

On some level, the high amount of experimentation necessary in machine
learning is not so much a sign that the practice is faulty in any particular
way, but rather, that the world is a complex place. This is _especially_ true
when attempting to predict anything that involves human behavior.

Long story short - I'd cut ML some slack!

~~~
Retric
Alchemists could do a lot; making gunpowder, for example, is non-trivial. They
simply worked from a bad model, with little understanding of what was going
on.

Consider: lead and gold are very similar substances, and chemistry lets you
transform many things into other things, so transmutation must have seemed
very possible. Unfortunately, I suspect the current AI movement is in a very
similar state: even if it can do a lot of things that seem magical, it's built
on a poor foundation, resulting in people mostly just trying stuff and seeing
what happens to work, without the ability to rigorously predict what will work
well on a novel problem.

------
puzzledobserver
Can someone give a concrete example of the kind of theoretical properties they
desire of "new-style" machine learning? The kinds of properties that
"old-style" learning methods guaranteed?

People often complain about interpretability: in what sense is an SVM
interpretable in a way that a deep neural network is not?

Or is the worry about gradient descent not finding global optima? But why is
the global optimum a satisfactory place to be, if the theory does not also
provide a satisfactory connection between the space of models and underlying
reality?

The arbiter of good theory is ultimately its ability to guide and explain
practical phenomena. Which machine learning phenomena are currently most in
need of theoretical elucidation?

~~~
digitalzombie
Most machine learning algorithms that aren't statistically based don't give a
confidence interval (CI). From a statistical standpoint, they don't give you a
sense of how good your prediction is; you can only get a general sense with
cross-validation (CV).

Also, the parameters are not inferable like in a statistical algorithm. This
is where I see people saying deep learning isn't interpretable, and there is
research into this area. If you compare a time-series statistical forecasting
algorithm with deep learning, you at least get a CI from the statistical
algorithm.

Randomly dropping nodes is pretty magic in my mind.

While I don't know much about SVMs, I know they're mathematically proven, so
there should be a way to interpret a fitted SVM model.

I sure as hell wouldn't use ML in a clinical trial for drugs. That's why
biostatistics is a thing.
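
A tiny sketch of the CI point, assuming a simple linear model on made-up
data - the classical fit reports its own uncertainty, while a bare point
prediction does not:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    x = rng.uniform(0, 1, size=n)
    y = 1.5 * x + rng.normal(scale=0.3, size=n)

    # OLS slope with a large-sample 95% confidence interval.
    xc, yc = x - x.mean(), y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (n - 2) / (xc @ xc))
    print(f"slope = {slope:.3f}, "
          f"95% CI = [{slope - 1.96 * se:.3f}, {slope + 1.96 * se:.3f}]")

    # A black-box model would hand back only the point estimate; any
    # sense of its error has to come from cross-validation instead.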

------
tabtab
We are still at the experimental stage, so trial and error without clear
theories is not unexpected. Chemistry formed out of alchemy: people fiddled
and noticed the patterns. The patterns were written down and shared, and
different people began floating theories/models to explain the patterns, and
the theories were further vetted by more experiments.

Another thing is that the industry should look at other AI techniques besides
neural nets (NNs), or find complements to NNs. Genetic algorithms and Factor
Tables should be explored as well. Just because recent advances have been
in NNs does not necessarily mean that's where the future should lead. Factor
tables may allow more "dissection" & analysis by those without advanced
degrees, for example. Experts may set up the framework and outline, but others
can study and tune specifics.
([https://github.com/RowColz/AI](https://github.com/RowColz/AI))

------
mistrial9
This video is great! For me, not for the math though.. two things:

* multi-layer, automated "jiggering" with so many components that only a machine can contextualize them might be great for finding patterns in some sets, but the industry HYPE, the DIRECTION+VELOCITY, and the human manipulation (including lies and pathologies) are a gut-level PROBLEM.. and this guy says that! +100

* Alchemy itself rambles and spreads [1]. Some variations of Alchemy included a ritualized, internal psychological and psychic experience by the human practitioner.. hard to stabilize, yet not always a bad thing, since you are reading this and are actually one of those ..

[1]
[https://en.wikipedia.org/wiki/Psychology_and_Alchemy](https://en.wikipedia.org/wiki/Psychology_and_Alchemy)

lastly, the selected slides encourage a student-minded viewer to look up some
math and think a bit. Not a bad thing. Thanks for this video and thanks for
the talk.

------
imperio59
The way I understand neural networks to work, they are actually a series of
connected, infinitely-valued logic gates: given a numerical input from
negative infinity to positive infinity, each spits out another number from
negative infinity to positive infinity that feeds into the next set of gates,
and at the end you get a confidence score from 0 to 1 of whether there was a
pattern match or not.

To me it's very similar to the Boolean logic circuits I was taught in college,
except that there are too many gates to configure manually, so you use
supervised learning to find reasonable values and an arrangement of gates that
works (aka your trained neural network).
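
Taken literally, that description of a forward pass might look like the
sketch below, with random placeholder weights standing in for training:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Placeholder weights; supervised learning, not manual wiring,
    # would normally set these.
    rng = np.random.default_rng(3)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # first bank of "gates"
    W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # second bank of "gates"

    x = np.array([0.2, -1.3, 0.7])        # any real-valued input
    h = np.maximum(0, W1 @ x + b1)        # ReLU nonlinearity
    score = sigmoid(W2 @ h + b2)          # squashed into (0, 1)
    print(score)                          # a match score between 0 and 1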

I've never heard anyone else describe it this way, but this is how I like to
think about it. It really has nothing to do with how the human brain, much
less the human mind, works. That's just marketing speak.

~~~
digitalzombie
> That's just marketing speak.

Pretty sure that was the initial thinking when neural networks were created,
and the field has since moved beyond it. I think people with only
surface-level knowledge keep repeating this tidbit, which is outdated.

It's even in the Wikipedia article
([https://en.wikipedia.org/wiki/Artificial_neural_network](https://en.wikipedia.org/wiki/Artificial_neural_network)).

Everything you've stated is basically a personal opinion that could have been
verified via Google...

------
m0zg
We do understand how it works most of the time, but we can't predict whether
certain changes will be beneficial or detrimental. After the fact it's pretty
clear what e.g. CNNs do (find manifold transforms which, coupled with
nonlinearities, minimize error in the layer-wise output distributions when
backpropagating the loss over minibatches of the training set). But you can't
reliably say "if I add e.g. a bottleneck branch over here, my accuracy will go
up/down X%".

Fundamentally we have to contend with the fact that human ability to
understand complex systems is fairly limited, and at some point it will become
an impediment to further progress. Arguably in a number of fields we're past
that point already.

~~~
oldgradstudent
That's all very cool.

But for safety critical systems you have to understand how these systems work
to understand their limitations. You have to know when these techniques
succeed, when they fail, and how badly they fail.

~~~
m0zg
Do you feel like you understand human limitations in these critical systems?
Are humans suitable? Would AI be suitable if it performs statistically better
than humans?

~~~
oldgradstudent
> Would AI be suitable if it performs statistically better than humans?

In general, yes, but it might depend on the pattern of failure - if your self-
driving car hunts me or my family personally, I might have a problem with
that.

But how can you determine that without releasing it into the wild and waiting
for bodies? Worse, say you have a safe system, but you need to modify the
network (to fix some bug). How can you determine that the new system is safe
enough to put on the road?

~~~
m0zg
But any technology can be deadly if you deploy it widely enough. _WhatsApp_
has resulted in "bodies" and it doesn't have any AI in it at all. The first
airplanes were basically flying coffins. Until the early 90s, cars gave you
very little chance of survival in a collision above 40 mph. Many drugs have
serious, sometimes deadly side effects. A quarter of a million people die in
hospitals in the US alone every year due to medical errors. 100% of those
errors are currently made by humans.

It's remarkable that AI seems to be held to an arbitrarily high standard,
often exceeding that of other technologies.

~~~
bumby
My guess is that most people feel AI should be held to a higher standard
because we feel the need to be able to audit the system in the case of
mishaps. When ML becomes a high-level black box, we may not have confidence in
how to right the ship if it goes astray. With human errors, if we're
(hopefully) empathetic creatures, we at least have the hope of understanding
the root of the error.

~~~
RA_Fisher
The good news is that these tools exist - they're called statistics.

~~~
bumby
Statistics applied to black-box component mishaps have a couple of things
going against them: 1) you need a relatively large number of failures to build
a good sample of data, and 2) even if you have the probabilities in place to
quantify risk, you may never understand the root cause of a failure well
enough to fix or mitigate it.

For a large, expensive system, either of the above may be unacceptable. Take
something like the Space Shuttle program: if it were heavily reliant on
black-box AI, you might be able to build probabilities through tools like
Monte Carlo simulation, but you would be hard pressed to get the government to
put billions of dollars at risk without understanding the root cause of
simulation failures.
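
A sketch of that limitation, with a toy failure model and a made-up
threshold: Monte Carlo can estimate how often an opaque component fails, but
nothing in the estimate says why it fails:

    import numpy as np

    rng = np.random.default_rng(4)
    trials = 1_000_000
    states = rng.normal(size=trials)    # sampled operating conditions

    # Stand-in for a black-box component that fails on rare extreme
    # states; we can only run it, not inspect it.
    failures = np.abs(states) > 2.5

    p = failures.mean()
    se = np.sqrt(p * (1 - p) / trials)  # binomial standard error
    print(f"estimated failure rate: {p:.5f} +/- {1.96 * se:.5f}")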

------
bob1029
In my mind, machine learning is simply one class of algorithm, primarily
oriented towards multidimensional fuzzy pattern matching. This technique can
easily be classed as a subset of digital signal processing concerns.
Everything tacked onto the sides of this (LSTMs, etc.) in some attempt to
increase the cleverness or scope of the ML network (aka self-driving cars and
other assorted high magickery) seeks to band-aid something that is
fundamentally flawed for these purposes.

A ML network does not have intrinsic, real-time mutability in how it is
defined, outside the scope of memory-based node weights, inputs, outputs or
graphs over time. These nodes are added, removed or modified based on a
predefined set of input, output and internal mappings. How would intermediate
layers be defined in a dynamic way between input and output in such a network
in an attempt to achieve these higher powers? Driving a car, for instance, is
a task that requires learning entire new subsets of skills, many times ad-hoc,
that require intermediate models to be dynamically developed which are
potentially outside the capabilities of our understanding. The biggest
challenge I see today is that we don't necessarily have a good way to
dynamically construct models of the intermediate layers (such that we can map
them to other layers), especially if these layers are being added and removed
dynamically by algorithms at the edge of our capability to understand.

I've always felt that there needs to be some internal processing occurring at
rates far higher than the input sample rates such that higher-order
intelligence may emerge by way of adjusting the entities noted above multiple
times per input sample (and potentially even in the absence of input). The
problem is also going to come down to how a person would define outcomes vs
how an AI/ML network would. For the future to really begin we will need an AI
that can understand and reason with what success and failure _feel_ like in
our abstract terms. This will require it to have the capacity to dynamically
construct abstractions which we would have no hope of modelling ourselves, as
we do not have very deep insight into the abstractions upon which the
biological human brain implements virtually any behavior today. There is no
amount of discrete math in the universe which can accurately model and assess
the higher-order outcomes of decisions made in our reality. You can run ML
disguised as AI in simulations and environments with fixed complexity all day,
but once you throw one of these "trained" networks out into the real world
without any constraints, you are probably going to see very unsatisfactory
outcomes.

------
dplarson
Additional context from Ali Rahimi and Ben Recht:
[http://www.argmin.net/2017/12/11/alchemy-
addendum/](http://www.argmin.net/2017/12/11/alchemy-addendum/)

~~~
mistrial9
.. and a Jupyter Notebook [https://github.com/benjamin-recht/shallow-linear-
net/blob/ma...](https://github.com/benjamin-recht/shallow-linear-
net/blob/master/TwoLayerLinearNets.ipynb)

------
Mikhail_K
If someone turns in math homework consisting of answers only, the teacher will
probably not give any credit and will ask them to show their work. Yet the
AI/ML community insists on a cargo-cult test for intelligence: behaving like
someone with a mind.

That can be traced to the "Turing test". It was flawed then, and it is flawed
now. Reproducing the behaviour of a thinking agent does not prove that a
putative AI will not fail in a more detailed test, as demonstrated in numerous
papers about "adversarial images".

------
nopinsight
Related: Debate at NIPS 2017 on “Interpretability is necessary for machine
learning”, by senior researchers in Machine Learning including Yann LeCun.

[https://youtu.be/93Xv8vJ2acI](https://youtu.be/93Xv8vJ2acI)

