
Ways to think about machine learning - anastalaz
https://www.ben-evans.com/benedictevans/2018/06/22/ways-to-think-about-machine-learning-8nefy
======
almostarockstar
A couple weeks ago I was asked to meet with two actuaries. Their company (a
big one, I'm working for them on contract) is trying to promote the use of
Machine Learning and when they heard I had just completed post-grad in ML,
they were eager to pick my brains.

In a 60 minute meeting, we spent 5 minutes discussing how ML worked and 55
minutes going in circles. They had lots of data, but no problems to solve. And
there are only so many ways to explain that a large dataset is not equivalent
to a business problem.

I'll be sending this on to them.

~~~
ggg9990
If actuaries don’t have business problems that could use ML what the heck do
they do all day? Insurance is a great target for ML.

~~~
lordnacho
As with many lines of work, the cool part may be a very small part of their
day.

With insurance, there's probably an already entrenched way to think about the
data that's been there for decades, as well as regulations that help to
entrench it.

It might also be the case that the business problem isn't risk analysis at
all, as one might expect. It may be that finding investments for the float
that provide a sensible return within the regulatory remit is harder than
figuring out how much needs to be charged for dinging someone's car.

~~~
nerdponx
Correct on all points. There is also the issue that the pricing models
themselves need to be able to stand up to some form of regulatory scrutiny
(depending on the state and the nature of the insurance product).

------
kenjackson
> They address a class of questions that were previously ‘hard for computers
> and easy for people’, or, perhaps more usefully, ‘hard for people to
> describe to computers’.

Those aren't the only problems. ML can also solve problems that were
previously 'hard for people and with no good algorithm for computers'. These
are problems where there is a good labeled dataset, but no good algorithm to
map from data to label. For example, the work determining sexual orientation
from images ([https://osf.io/fk3xr/](https://osf.io/fk3xr/)).

The problem with this approach is you get predictive ability, but no insight.
But still can be of great value, and potentially great danger too.
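A minimal sketch of that "good labeled dataset, no good algorithm" pattern, on synthetic data (the clusters and the 1-nearest-neighbour rule are just illustrative; they stand in for any learned data-to-label mapping):

```python
import numpy as np

# No hand-written rule maps points to labels here; the mapping is
# learned implicitly from the labeled examples themselves.
rng = np.random.default_rng(0)

# Two synthetic clusters standing in for a labeled dataset.
X_train = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y_train = np.array([0] * 50 + [1] * 50)

def predict(x):
    """1-nearest-neighbour: label a point by its closest training example."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

print(predict(np.array([0.1, -0.2])))  # nearest neighbour is in cluster 0
print(predict(np.array([4.2, 3.8])))   # nearest neighbour is in cluster 1
```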

~~~
fnbr
To be clear, it's not obvious that the paper you linked is actually accurate.
A lot of researchers consider that paper to be deeply flawed, and to show
something other than what it claims to.

~~~
lhnz
What are others saying it shows?

~~~
claytonjy
I enjoyed this take on it:
[http://callingbullshit.org/case_studies/case_study_ml_sexual...](http://callingbullshit.org/case_studies/case_study_ml_sexual_orientation.html)

------
qznc
My mental model: Neural networks are for gut decisions. This fits their
application domain quite well imho.

Gut decisions are quick. You look at a picture and decide it's a dog quicker
than you can say it. Executing a neural net is feasible even on mobile devices
on battery.

Good gut decisions require experience/training. You must train extensively to
be good and your training data must be good. An example from "Thinking, Fast
and Slow": Experienced stock traders often claim to have a gut feeling, but it
is bogus because their training data is bogus (good decisions lead to bad
outcomes and vice versa). In contrast, experienced firefighters have a gut
feeling about whether it is safe to enter a burning house. This works in
practice. They observe things about the environment reliably without being
able to point them out consciously.

Gut decisions are not about planning or knowledge. This requires different AI
techniques than neural nets and intuitively shows their shortcomings.

------
graycat
It's tough to know just what _machine learning_ covers, includes, consists of.
From what I've been able to see, currently in practice, apparently 90+% of
machine learning is _curve fitting_ and nearly all of that is some form of
classic linear regression.

Linear regression and curve fitting more generally have been around,
available, and used going way back in electronic digital computing and well
before. E.g., for software we've long had the IBM Scientific Subroutine
Package (SSP), SPSS (Statistical Package for the Social Sciences), SAS
(Statistical Analysis System), ..., R. There are stacks of polished textbooks
in statistics and specialized to some fields, e.g., econometrics, time series
analysis, etc. There has been some usage, but headlines have been rare for
decades.
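For concreteness, the "curve fitting" in question looks like this: ordinary least squares, fitting a line to noisy synthetic data via `numpy.linalg.lstsq` (the data and coefficients here are made up for illustration):

```python
import numpy as np

# Fit y = a*x + b to noisy data by least squares.
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 100)  # true slope 2, intercept 1

A = np.column_stack([x, np.ones_like(x)])    # design matrix [x, 1]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a, b = coef
print(a, b)  # close to the true values 2.0 and 1.0
```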

But, there's a lot more to applied math than anything much like that curve
fitting. E.g., maybe take the _learning_ as the ability to do statistical
estimates -- well, there's a lot more to statistical estimation than curve
fitting.

There's also the field of optimization -- linear programming, integer linear
programming, network linear programming, dynamic programming (discrete time,
continuous time, deterministic, under uncertainty), quadratic programming,
other cases of non-linear programming, and more.

And lot more can be done in applied probability and stochastic processes.

IMHO it would be a better description of the current state, and better
progress, to discuss both the applications and the solution techniques in
more detail.

E.g., the crucial core of my startup is some applied math I derived (with
theorems and proofs) based mostly on some advanced pure math prerequisites. I
wrote the corresponding software. Then from 50,000 feet up or from the point
of view of a user of the work, what I did can look like a _machine_ that
_learns_ a lot very quickly, is very _smart_, and puts out really
_intelligent_ stuff. Still, my math is not covered by any of the machine
learning I've heard of.

Point: There's a LOT of applied math that can be done and, really, has been
done far from current descriptions of machine learning.

~~~
synthc
Which startup would that be?

~~~
graycat
See below, to be precise, in

[https://news.ycombinator.com/item?id=17396176](https://news.ycombinator.com/item?id=17396176)

That's my first description of the work on HN. Now I want to rush to an alpha
test, to be announced on HN.

------
ggregoire
> 1) Machine learning may well deliver better results for questions you're
> already asking about data you already have, simply as an analytic or
> optimization technique. For example, our portfolio company Instacart built a
> system to optimize the routing of its personal shoppers through grocery
> stores that delivered a 50% improvement (this was built by just three
> engineers, using Google's open-source tools Keras and Tensorflow).

> 2) Machine learning lets you ask new questions of the data you already have.
> For example, a lawyer doing discovery might search for 'angry’ emails, or
> 'anxious’ or anomalous threads or clusters of documents, as well as doing
> keyword searches.

> 3) Third, machine learning opens up new data types to analysis - computers
> could not really read audio, images or video before and now, increasingly,
> that will be possible.

Are there any online resources to learn and implement step-by-step some ML to
solve one of these use cases (or other ones)? Like a GitHub repo with data
samples, a programming environment and then a step-by-step guide to solve
trivial but real business problems?

~~~
chasil
I am working through Google's machine learning crash course.

[https://developers.google.com/machine-learning/crash-course/ml-intro](https://developers.google.com/machine-learning/crash-course/ml-intro)

Tensorflow sprays a lot of calculus. The idea is that all your known
quantities ("features") become terms in an n-dimensional polynomial. The act
of "training the model" is finding minima by traversing the negative gradient
(the terms of the partial derivatives of any point when projected as a vector
at that point).

I'm glad I had calc 3, even if that was only 3-dimensional.
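That "traversing the negative gradient" step can be sketched in a few lines. Here the "loss" is a toy quadratic bowl rather than a real network's loss surface, so the minimum is known in advance:

```python
import numpy as np

# "Training" as gradient descent: start somewhere and walk down the
# negative gradient until we reach a minimum. The toy loss is
# f(w) = (w0 - 3)^2 + (w1 + 1)^2, with its minimum at (3, -1).
def grad(w):
    return np.array([2 * (w[0] - 3), 2 * (w[1] + 1)])

w = np.array([0.0, 0.0])   # initial guess
lr = 0.1                   # learning rate (step size)
for _ in range(100):
    w = w - lr * grad(w)   # step along the negative gradient

print(np.round(w, 3))      # converges to [ 3. -1.]
```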

~~~
fnbr
Technically, it's not a polynomial as you can have nonlinear activation
functions, such as the ReLU function. There's no polynomial that's equal to a
network with ReLU activations (although, of course, a sufficiently large
polynomial could come arbitrarily close).

I would state that a neural network is a large, complicated, differentiable
function, and the beauty of deep learning is that it turns out that by doing
optimization that's derived from basic calculus, you can optimize this
complicated function to do surprisingly useful things.
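A tiny two-layer ReLU network makes the point concrete: the whole thing is just a composed, (almost-everywhere) differentiable function, and the ReLU kink makes it piecewise linear rather than polynomial. Weights here are random, purely for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)  # the nonlinearity: zero below 0, identity above

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # hidden layer weights
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer weights

def net(x):
    """Two-layer network: linear map, ReLU, linear map."""
    return (W2 @ relu(W1 @ x + b1) + b2)[0]

# The resulting map is piecewise linear with kinks where hidden units
# switch on/off, so no polynomial equals it exactly.
xs = [np.array([t, t]) for t in (-1.0, 0.0, 1.0)]
print([round(float(net(x)), 3) for x in xs])
```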

~~~
chasil
I'm not at the point where I can debate technicalities yet, alas.

~~~
lgas
When you get there you probably still shouldn't do it.

------
brootstrap
Wow, good write-up. I just made it a few paragraphs in but had to comment,
especially after just reading through the IBM Watson 'AI' bullshizz. This is
refreshing and reminds me that there are folks who are trying to think about
this stuff in new ways rather than just hyping/selling the ___ out of 'AI' so
your new BigCo can burn through 10s/100s of millions on some sh__y project =)

~~~
AndrewKemendo
In fact the majority of people who actually do ML/DL or Applied AI in other
forms think this way by default.

As usual, Ben Evans backs into the standard understanding from the experts in
the field as though it's revelation.

~~~
benedictevans
That's the point - to explain how people in the field are thinking about this
stuff. Most people outside SV don't know what the experts in the field are
saying. Indeed, many people _in_ SV don't know this ;) If you already know all
of this, you're not the target audience.

~~~
AndrewKemendo
That's fair enough. It's a running joke with applied ML people that half of
your job, if you have any sense of professionalism, is convincing possible
clients that no, they probably don't need to use a 41 layer DNN to send a
follow up email.

~~~
benedictevans
A database consultant once told me that half his gigs were telling people with
a database to use Excel

~~~
boxy310
Funny - half of my experiences talking with nonprofits were telling people
with an Access database to migrate to a proper SQL database.

------
JoeSmithson
There's an XKCD [0] that hasn't aged well at all about how non-programmers
fail to understand that some things that seem easy are actually really hard
for a computer.

Modern ML research moved a big chunk of these tasks into the "easy for a
computer" category, which is very exciting for programmers. However, as the
comic points out, most of these things are stuff that normal people sort of
felt a computer could do already.

I think this is the reason for some of the backlash/disappointment.

[0] [https://xkcd.com/1425/](https://xkcd.com/1425/)

~~~
glial
Given that this was created 4-5 years ago, I'd say it aged perfectly.

------
sebleon
This article is more realistic than most ML posts, but it’s clear the author
is not a practitioner.

> More handwriting data will make a hand-writing recognizer better, and more
> gas turbine data will also make a system that predicts failures in gas
> turbines better, but the one doesn't help with the other. Data isn’t
> fungible.

Transfer learning is one of the most interesting areas of machine learning.
The focus is on taking learnings from one task, and applying them directly to
another. More concretely, Jeff Dean from Google had a fascinating talk about
using these techniques to create a single super-model that combines learnings
from thousands of tasks to accomplish new things quickly. [1]

[1] [https://youtu.be/HcStlHGpjN8](https://youtu.be/HcStlHGpjN8)

~~~
outworlder
> This article is more realistic than most ML posts, but it’s clear the author
> is not a practitioner.

Not so fast.

Genuine question: how close is the research on 'transfer learning' to
something that can be readily used to solve business problems today?

If you can't fire up tensorflow and the like and use it to solve a real
problem, or if only the likes of Google are able to successfully apply it,
then the author would be correct.

~~~
claytonjy
You _can_ fire up TF to solve real problems without being Google.

Transfer learning is _the_ way to do image classification for most kinds of
images in 2018, and is covered heavily in most classes. In the fast.ai class,
you use transfer learning in the very first lesson to build a dog/cat
classifier. Takes less than an hour to get to 97+% accuracy with no prior
knowledge of deep learning.
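The recipe behind that lesson is: freeze a pretrained feature extractor, train only a small head on top. A sketch of that idea, with stand-ins for the real parts (in practice the extractor would be e.g. an ImageNet-pretrained CNN; here a fixed random ReLU projection plays that role on synthetic data):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" feature extractor: frozen, never updated during training.
W_frozen = rng.normal(size=(16, 64))
def features(x):
    return np.maximum(0.0, W_frozen @ x)  # frozen ReLU features

# Synthetic two-class "images" (64-dim vectors with shifted means).
X = np.vstack([rng.normal(0, 1, (100, 64)), rng.normal(0.5, 1, (100, 64))])
y = np.array([0] * 100 + [1] * 100)
F = np.array([features(x) for x in X])    # extract features once

# Train only the linear head, with logistic-regression gradient steps.
w, b = np.zeros(16), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(F @ w + b)))    # predicted probabilities
    w -= 0.01 * F.T @ (p - y) / len(y)    # gradient step on head weights
    b -= 0.01 * np.mean(p - y)

acc = np.mean((1 / (1 + np.exp(-(F @ w + b))) > 0.5) == y)
print(f"head-only training accuracy: {acc:.2f}")
```

The point of the design is speed: the expensive representation is reused as-is, and only the tiny head is fit to the new task.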

~~~
mercutio2
It sounds like you’re saying that transfer learning is helpful for image
classification, which seems like an uncontentious position.

Are you really arguing that you think transfer learning would be useful from
handwriting models to turbine failure models?

Using techniques that are successful with image classification as an example
and generalizing to other domains that don’t look much like imaging seems like
a stretch to me.

But perhaps I’ve missed some more convincing examples of the state of the art
in transfer learning.

~~~
claytonjy
That's a good point; as far as I know there are no examples of cross-domain
learning. There's new work in NLP for cross-task transfer learning, but that's
as close as it gets at the moment.

It's hard to imagine there's anything to learn from handwriting images that
could apply to turbine failure; that would be a much broader kind of
multi-task model than anything we'll see for a while.

~~~
Dzugaru
The argument is still false. You can very well get an advantage from vast
amounts of data in similar domains. And more importantly you can have ML
insights not possible without it. What if ImageNet was not open to the public?
Would we get an AlexNet breakthrough?

------
CGamesPlay
This article is pretty great and I agree with the framework it sets up around
automation and future jobs.

One thing I felt was dismissed too easily was the "Google has all the data"
line. Sure, nobody said "Oracle has all the database", but people ARE saying
the former. Why should we ignore the piles of collected data when thinking
about how the landscape will look in the future?

~~~
zerostar07
There are probably plenty of problems that do not require that much data to
solve (e.g. in reinforcement learning). And Google doesn't have all the kinds
of data - they didn't own medical imaging data, they had to ask for it. There
are some problems where they have an advantage (speech / translation) and
they've probably already solved those problems, but their data advantage is
not as important as they make it out to be.

What's more worrying IMO is that they're hoarding all the researchers.

------
mcrad
Sophisticated machine vision has been around a heck of a lot longer than the
author suggests. It's only because the technology has become so accessible
that we now argue about whose role it is to apply it (and ML in general) in a
responsible or sensible way.

