
The Unreasonable Ineffectiveness of Machine Learning in Computer Systems Research - donnemartin
https://www.sigarch.org/the-unreasonable-ineffectiveness-of-machine-learning-in-computer-systems-research/
======
> Driverless long-haul trucks are apparently just a few years away, and the
> main worry now is not so much the safety of these trucks but the specter of
> unemployment facing millions of people currently employed as truck drivers.

No, no they're not. We have some lane tracking in good weather etc., but we
are still _decades_ (or more) away from full level-5 autonomy that would make
drivers behind the wheel unnecessary.

But it only goes to show that not even computer science experts are immune to
marketing hype and well-funded PR campaigns. :)

As for the unreasonable ineffectiveness - it's not just in systems research.
ML can be very effective in some areas (especially when there is a ton of
training data), but many areas of human endeavor are hard to model via
function approximation techniques like those used in most of ML.

~~~
snackai
> but we are still decades (or more) away from full level-5 autonomy

Decades? Are you really sure? That assumption seems quite uninformed to me.
The Grand Challenge was 12 years ago, and since then we have evolved from
Level 1 to Level 3. Back then there was no data and computing power wasn't
available the way it is today; chips (& sensors) can now be designed and built
within a few months or even less, for far less money. Machine learning and
computer vision have made many advances in these 12 years (talent in these
fields is now easier to acquire as more people can learn about them). We are
at a completely different point today. Level 5 is not a matter of _decades_:
5 years, 10 max. But not _decades_.

~~~
codinghorror
I disagree. I think the higher levels are many orders of magnitude more
difficult.

~~~
tim333
Volvo is supposed to be rolling out a few Level 4 cars to the public this
year. (Level 4 meaning no human attention is required on certain roads that
the car knows; if there are problems, it pulls over and parks in a safe
place.)
[http://www.theverge.com/2016/4/27/11518826/volvo-tesla-autopilot-autonomous-self-driving-car](http://www.theverge.com/2016/4/27/11518826/volvo-tesla-autopilot-autonomous-self-driving-car)

------
dkarapetyan
I think this is because the kinds of problems that arise in system design are
logical and symbolic in nature and the current crop of "AI" has no symbolic
reasoning capabilities. All the current hype is about pattern matching. Very
good pattern matching, but pattern matching nonetheless. Constructing a
compiler or a JIT, on the other hand, is more like what mathematicians do:
setting down some axioms and exploring the resulting theoretical landscape.
None of the current hype is about theorem proving or the kinds of inductive
constructions that crop up in the process of proving theorems or designing
compilers and JITs.

For an example of the kind of logical problem optimizers solve, you can take a
look at [https://github.com/google/souper](https://github.com/google/souper).

So I don't see how you can take the current neural nets and get them to design
a more efficient CPU architecture or a better JIT.
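
To make the contrast concrete, here is a toy sketch of the kind of exhaustive,
correctness-driven search a superoptimizer does (not souper itself, just an
illustration; souper proves equivalence with an SMT solver rather than testing
on a handful of sample inputs):

    # Toy superoptimizer: enumerate short straight-line programs over a tiny
    # instruction set and keep one equivalent to the target expression.
    # Equivalence is only spot-checked on test inputs here.
    from itertools import product

    MASK = 0xFFFFFFFF
    OPS = {
        "not": lambda x: ~x & MASK,
        "inc": lambda x: (x + 1) & MASK,
        "dec": lambda x: (x - 1) & MASK,
        "shl1": lambda x: (x << 1) & MASK,
    }

    def target(x):
        return -x & MASK  # the expression to re-derive: two's-complement negate

    def run(seq, x):
        for op in seq:
            x = OPS[op](x)
        return x

    def search(max_len=3, tests=(0, 1, 5, 1234, MASK)):
        for length in range(1, max_len + 1):
            for seq in product(OPS, repeat=length):
                if all(run(seq, t) == target(t) for t in tests):
                    return seq
        return None

    print(search())  # finds ('not', 'inc'), i.e. -x == ~x + 1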

~~~
bmh100
It would actually be very straightforward to do so if the cost of testing
solutions weren't so high. CPU architectures and JIT code can both be
represented as unstructured (non-tabular) data. I even recall an experiment a
while back in which a circuit was optimized by a genetic algorithm. I also
recall an LSTM being used to generate valid code from, IIRC, examples in
Linux. Superoptimization is also a relevant topic.

We just need better simulation tools or more resources.
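
The genetic-algorithm part is simple enough to sketch (a generic toy, not the
circuit experiment I'm remembering; the fitness function here is a stand-in
for the expensive simulation step):

    # Minimal genetic algorithm: evolve bitstrings against a (simulated) cost
    # function via selection, crossover and mutation. In systems work the
    # fitness call would be "run the candidate design through a simulator",
    # which is exactly the expensive part.
    import random

    GENOME_LEN, POP_SIZE, GENERATIONS = 32, 50, 200

    def fitness(genome):
        return sum(genome)  # placeholder for a simulated performance score

    def mutate(genome, rate=0.02):
        return [bit ^ (random.random() < rate) for bit in genome]

    def crossover(a, b):
        cut = random.randrange(1, GENOME_LEN)
        return a[:cut] + b[cut:]

    pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:POP_SIZE // 2]  # truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(POP_SIZE - len(parents))]
        pop = parents + children

    print(max(map(fitness, pop)))  # best simulated score found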

~~~
jacquesm
Genetic algorithms are not neural nets though.

~~~
xapata
And they're generally worse than simulated annealing.

------
andreasvc
Cute title, but the post didn't really address either reasonableness or
effectiveness; it mostly claimed that the potential has not yet been realized.
It's a pet peeve of mine to see these hackneyed joke titles referencing famous
papers; "considered harmful" is another case in point. Let's just stick to
descriptive titles.

~~~
visarga
It imitates an old paper title: "The Unreasonable Effectiveness of Mathematics
in the Natural Sciences" by Eugene Wigner (1960).

~~~
backpropaganda
I thought Wigner was actually paying homage to Karpathy.

~~~
visarga
Few know that Karpathy was using a previous meme.

------
Animats
The perceptron scheme for branch prediction (full paper) [1] probably works
because it uses far more memory for branch history than the usual approaches.
It's not doing a better job with comparable resources.

[1]
[https://www.cs.utexas.edu/~lin/papers/hpca01.pdf](https://www.cs.utexas.edu/~lin/papers/hpca01.pdf)
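
The core of the scheme is small enough to sketch (simplified; the table size
and history length below are illustrative rather than the paper's tuned
values):

    # Simplified perceptron branch predictor (after Jimenez & Lin, HPCA 2001).
    HIST_LEN = 16                        # bits of global history per perceptron
    TABLE_SIZE = 1024                    # perceptrons, indexed by hashed PC
    THETA = int(1.93 * HIST_LEN + 14)    # training threshold from the paper

    weights = [[0] * (HIST_LEN + 1) for _ in range(TABLE_SIZE)]  # +1 for bias
    history = [1] * HIST_LEN             # global history: +1 taken, -1 not taken

    def predict(pc):
        w = weights[pc % TABLE_SIZE]
        y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], history))
        return y, y >= 0                 # predicted direction = sign of dot product

    def update(pc, y, taken):
        w = weights[pc % TABLE_SIZE]
        t = 1 if taken else -1
        if (y >= 0) != taken or abs(y) <= THETA:   # mispredict or low confidence
            w[0] += t
            for i, hi in enumerate(history):
                w[i + 1] += t * hi
        history.pop(0)
        history.append(t)                # shift the outcome into the history

The point about memory stands: each prediction reads a whole weight vector
covering a long history, which is far more state per branch than a table of
2-bit counters.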

------
saosebastiao
Indirectly they have helped quite a bit. Some of the most advanced
mathematical and symbolic solvers (MIP, IP, LP, CP, SAT, SMT) have slowly been
incorporating machine learning to advance their capabilities. Their use cases
in these solvers include: branch prediction, branch selection, constraint
evaluation order, solver type selection, search strategy selection, cost
estimations for column generation strategies, problem classification, solve
time estimation, etc.

And since advances in our ability to solve symbolic and mathematical problems
have directly enabled the current research in PLT and formal systems, I see no
reason to discount the impact ML has had in pushing that frontier.

------
colorincorrect
Perhaps this paper provides an explanation?
[https://arxiv.org/pdf/1608.08225.pdf](https://arxiv.org/pdf/1608.08225.pdf)

"The exceptional simplicity of physics-based functions hinges on properties
such as symmetry, locality, compositionality and polynomial log-probability,
and we explore how these properties translate into exceptionally simple neural
networks approximating both natural phenomena such as images and abstract
representations thereof such as drawings."

~~~
xfs
This is really good. Deep learning right now is giving off a kind of illusion
of domain-independent general intelligence that can solve any problem, so it
would be really helpful to have some theoretical characterization of the
specific problem domains it's good at and ones it's not good at.

------
mcguire
Branch prediction is an interesting use, although I'm wondering how expensive
a misprediction really is.

But this:

" _Another example is the use of regression techniques from machine learning
to build models of program behavior. If I replace 64-bit arithmetic with
32-bit arithmetic in a program, how much does it change the output of the
program and how much does it reduce energy consumption? For many programs, it
is not feasible to build analytical models to answer these kinds of questions
(among other things, the answers are usually dependent on the input values),
but if you have a lot of training data, Xin Sui has shown that you can often
use regression to build (non-linear) proxy functions to answer these kinds of
questions fairly accurately._ "

I'm not sure whether I am fascinated or horrified.
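
Mechanically, such a proxy is less exotic than it sounds. A minimal sketch of
the shape of it (the features and data below are synthetic placeholders, not
Xin Sui's actual setup; in the real work each training row would come from
actually running the program under both precisions):

    # Sketch of a regression proxy for program behavior.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)

    # Hypothetical features describing one (program, input) pair,
    # e.g. input size, value magnitudes, loop trip counts.
    X = rng.uniform(size=(500, 3))
    # "Measured" relative output error from dropping 64-bit to 32-bit math
    # (synthetic here, generated by an arbitrary non-linear relationship).
    y = X[:, 0] ** 2 * np.log1p(X[:, 1]) + 0.01 * rng.normal(size=500)

    proxy = RandomForestRegressor(n_estimators=200).fit(X, y)

    # "How much error would this new input see?" becomes a cheap model query
    # instead of an expensive instrumented run.
    print(proxy.predict([[0.4, 0.7, 0.1]]))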

~~~
pm215
Wikipedia reckons 10-20 cycle penalty for a branch mispredict (which sounds
plausible given it fills the pipeline with useless junk). Given that branches
are quite common, that's painful enough to want to avoid, but definitely not
so painful that you'd want to devote as much silicon to solving it as you do
to, say, L1 cache.

I do recall a bit of research (published by a Nokia R&D team, I think) that
reckoned you could get a mostly-OK performance estimate by tracking about half
a dozen indicators, including instructions executed, cache misses, TLB misses
and branch mispredicts, and weighting them appropriately. The trouble there is
that nobody wants a performance model that's right 90% of the time but
significantly wrong 10% of the time, with no way to tell whether the workload
you want to test is in the 10%. But it's still an indication of the importance
of branch prediction, I think.
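
The shape of such a model is just a weighted sum of counters fit against
measured cycles; roughly (not the Nokia work, and the numbers are
placeholders):

    # Fit per-event cost weights from hardware counters to measured cycles.
    import numpy as np

    # Columns: instructions, cache misses, TLB misses, branch mispredicts.
    counters = np.array([
        [1.0e9, 2.0e6, 1.0e5, 5.0e6],
        [2.0e9, 8.0e6, 4.0e5, 9.0e6],
        [1.5e9, 1.0e6, 2.0e5, 2.0e7],
        [3.0e9, 5.0e6, 8.0e5, 1.0e7],
        [2.5e9, 9.0e6, 3.0e5, 4.0e6],
    ])
    measured_cycles = np.array([1.3e9, 2.9e9, 2.2e9, 3.8e9, 3.1e9])

    weights, *_ = np.linalg.lstsq(counters, measured_cycles, rcond=None)
    print(weights)              # estimated cycles contributed per event
    print(counters @ weights)   # the model's cycle estimates for the runs

And the failure mode is exactly as described: the fit happily reports a single
set of weights with no warning about which workloads it will be badly wrong
on.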

------
mfreed
Keith Winstein & collaborators have done good work on using ML to train TCP
congestion control for different scenarios:
[http://web.mit.edu/remy/](http://web.mit.edu/remy/)

------
ganfortran
I guess computers are much more deterministic than the kinds of problems ML
needs to be useful.

ML, in a very inaccurate way, can be seen as:

1. We have observations and conclusions.

2. We don't know exactly how those observations lead to the conclusions.

3. The assumed procedure that leads from observations to conclusions is called
the model.

4. With enough (observation, conclusion) pairs, we can train a model that is
good enough to make good decisions on future observations.

The problem for traditional computer science is that the system is so
deterministic that we know EXACTLY how it works at the instruction level,
while ML is good at dealing with problems that are inherently probabilistic.
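
In code, that framing is literally the supervised-learning interface (a
generic sketch with synthetic data, nothing specific to the article):

    # "(observation, conclusion) pairs -> model", as code.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    observations, conclusions = make_classification(n_samples=1000, n_features=10,
                                                    random_state=0)

    # The model is the assumed procedure from step 3: a probabilistic mapping
    # from observation to conclusion, with no guarantee of being exactly right.
    model = LogisticRegression(max_iter=1000).fit(observations, conclusions)

    print(model.predict(observations[:5]))       # decisions on new observations
    print(model.predict_proba(observations[:1])) # and how (un)certain it is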

------
leecarraher
The latter part of the article seems to focus on deep methods not influencing
lower-level system architecture and design. Perhaps the reason is that those
systems and problems are fundamentally different from the dynamical systems in
which NNs are finding so much success. In short, compilers, programming
languages and architecture are very formal, exact systems, while driving, TTS,
STT, image association, etc. are nowhere near as controlled environments.

------
elvinyung
To contrast with this opinion, here is a promising research project from CMU
that uses an RNN to manage a database based on "forecasted" workloads (I know
it's not _quite_ architecture, but still): [http://pelotondb.io/](http://pelotondb.io/)

------
austincheney
Perhaps the biggest hurdle in this regard is the approach to machine learning.
Nearly everything I have seen on machine learning is a primer on big data
followed by a series of algorithms for making the best and smartest decision
from that mountain of data.

This is completely the wrong approach. Machine learning can be done on a dime,
provided the proper nurturing and environment, but you have to be willing to
make some concessions.

First and foremost, you have to be able to write a program that can make a
decision. A simple "if" condition is sufficient.

Secondly, that decision is open to modification by asserting the evaluation
(the "if" condition) against its result. In this regard the logic is fluid, as
opposed to a series of static conditions written by humans hoping to devise
organic decisions.

Finally, the decision is allowed to be completely wrong. Wrong decisions are
better than either no decision or the same decision without deviation. This is
how humans learn and it should be no surprise that computers would benefit
from the same approach.

The key to getting this right is bounds checking and simplicity. A decision
must find a terminal point at which to stop improving upon its outcome, and a
narrow set of boundaries must be enforced to prevent unnecessary deviation. It
is perfectly acceptable if some grand master must occasionally prod the
infantile program in the right direction. This is also something that people
do for other people who are learning.

If you can do that you have machine learning. You don't need big data to get
this. You certainly don't need complex transportation machines or voice
activated software to validate it. AI on a dime. If you can do it on a dime
you can certainly do it with a multi-billion dollar budget and thousands of
developers.
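
Concretely, a toy sketch of what I mean (just an illustration; the bounds,
step size and threshold are arbitrary):

    # A single "if" whose threshold is adjusted by comparing its decision
    # against the observed outcome, clamped to fixed bounds. No big data.
    LOWER, UPPER = 0.0, 1.0   # bounds checking: the threshold never leaves these
    STEP = 0.05               # how far one piece of feedback can move it

    threshold = 0.5           # the fluid "if" condition

    def decide(signal):
        return signal > threshold          # the decision: a single comparison

    def learn(signal, was_correct):
        """Nudge the threshold only when the decision turned out wrong."""
        global threshold
        if was_correct:
            return
        if decide(signal):                 # said yes and was wrong: be stricter
            threshold = min(UPPER, threshold + STEP)
        else:                              # said no and was wrong: be looser
            threshold = max(LOWER, threshold - STEP)

    # A "grand master" can still prod it directly, e.g. threshold = 0.7.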

~~~
rattray
You're advocating an evolutionary approach, correct? Doesn't such an approach
need lots of examples to trial before it generalizes broadly?

"Big data" is often shorthand for "lots of examples", no?

~~~
bmh100
The poster is referring to control theory (often seen in ML as reinforcement
learning), while also touching on the explore-exploit tradeoff in optimization
more generally.

------
csours
OT: How and why is this page overriding my control key, and how can I stop it
from doing that?

I use ctrl+scroll wheel to zoom and it is very annoying when that behavior is
overridden.

~~~
emurray
I think rather than overriding your control key, it's overriding scrolling. It
loads up some smoothscroll javascript, and there's likely a bug someplace
that's interfering with the ability to zoom. Holding control and pressing +/-
to zoom in/out should still work.

~~~
csours
Thanks

------
justicezyx
Since when did the belief that "a silver bullet exists for computer science
research" become a thing?

