
How to recognize AI snake oil [pdf] - longdefeat
https://www.cs.princeton.edu/~arvindn/talks/MIT-STS-AI-snakeoil.pdf
======
reggieband
I don't have time to read the entire paper but I would like to share an
anecdote. I worked at a company with a well staffed/funded machine learning
team. They were in charge of recommendation systems - think along the lines of
youtube up next videos. My team wanted better recommendations (really, less
editorial intensive) so the ML team spent weeks crafting 12 or more variants
of their recommendation system for our content. We then ran that system for
months in an A/B testing system that judged user behaviour by several KPIs. The
result was that all variants performed equally well; the differences between
them were statistically insignificant. The best-performing variant happened to
be random.

Talking to other groups that had gone through the exact same process, I found
our results were pretty typical. These guys were all very intelligent, and the
code and systems they had implemented were pretty impressive. I'm guessing the
system they built would have cost a few million dollars if built from scratch.
We did use this "AI/ML" in our marketing, so maybe it paid for itself through
increased sales via buzzwords. But my experience was that in most limited use
cases the technology was ineffective.
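
For anyone curious what "statistically insignificant" cashes out to here, a
minimal sketch (Python, all numbers invented) of the per-variant comparison
such an A/B system boils down to:

    import math

    def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
        # z-score for the difference between two variants' click-through rates
        p_a, p_b = clicks_a / n_a, clicks_b / n_b
        p_pool = (clicks_a + clicks_b) / (n_a + n_b)
        se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
        return (p_a - p_b) / se

    # "best" variant vs. the random baseline, with invented counts:
    z = two_proportion_z(clicks_a=1040, n_a=50_000, clicks_b=1000, n_b=50_000)
    print(z)  # ~0.9; |z| < 1.96 means not significant at the 5% level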

~~~
stefan_
Since we are sharing anecdotes, I can report it's been 20 years of buying
stuff on the internet and the combined billions of ad tracking research
dollars spent by Amazon and Google have not yet come up with a better
algorithm than to bombard me with ads for _the exact same thing I just
bought_.

~~~
andybak
I just spent 15m on Amazon trying to prod the recommendation algorithm into
finding something I actually wanted to buy so I could get above the "free
delivery threshold".

Think about that. I wanted to spend money. I wasn't too fussy what it was.
Amazon has a decade of my purchasing and browsing history.

And they still failed.

~~~
Jestar342
Amazon infuriates me. I regularly buy gadget-y bits like electronic components
and peripherals. Probably at least once a month. Never see adverts for
similar.

That ONE TIME I buy a unicorn dress for my 2 year old daughter? That's a
lifetime of unicorn related merchandise adverts and recommendations for you!

~~~
radarsat1
Actually that's really interesting because it exposes a bias in their
recommendation system. They must be heavily biased towards things with _mass
appeal_ instead of specifically targeting user preferences, which is funny
because it goes against the grain of the whole "targeted advertising" promise
of ML. You'd think if anyone could get that right, it would be Amazon, yet..

------
formalsystem
Over the years my heuristic has turned into: "Did the team formulate their
problem as a supervised learning problem?" \- If not it's probably BS.

In longform if anyone is interested [https://medium.com/@marksaroufim/can-
deep-learning-solve-my-...](https://medium.com/@marksaroufim/can-deep-
learning-solve-my-problem-a-type-theoretic-heuristic-e57f4d1658f)

EDIT: I would consider autoencoders, word2vec, Reinforcement Learning examples
of turning a different problem into a supervised learning problem

EDIT 2: Social functions like happiness, emotion and fairness are difficult to
state - you can't have a supervised learning problem without a loss function
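
A minimal sketch of what "formulated as a supervised learning problem" means in
practice - toy data, plain gradient descent - showing the three things you must
be able to write down (inputs, labels, loss):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                    # inputs
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)  # labels

    w = np.zeros(3)
    for _ in range(500):
        grad = 2 * X.T @ (X @ w - y) / len(y)        # gradient of the MSE loss
        w -= 0.1 * grad

    print(w)                          # close to [1.0, -2.0, 0.5]
    print(np.mean((X @ w - y) ** 2))  # the loss we declared up front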

~~~
chimi
It's hard to verbalize this; most of it is "intuition", but I think it boils
down to "supervised learning is BS."

Humans are smarter than computers. How can a human teach a computer how to do
something when the human itself can't teach another human that something?

We haven't solved _that_ problem. The snake is eating its tail.

You can't teach a human how to do something when the methodology to do that is
the student trying something and the teacher saying "Yes" or "No".

Well.... _why_? Why is it yes or why is it no? What is the difference between
what the human or the computer, or in general, the student, _did_ and what is
_good_ or _correct_? And then you still have to define "good" and many times
that means waiting, in the case of the PDF linked to above, perhaps many years
to determine if the employee the AI picked, turned out to be a good employee
or not.

And how do you determine _that_? How do you know if an employee is good or
not? We haven't even figured that out yet.

How can we create an AI to pick good employees if _human beings_ don't know
how to do that?

Supervised learning isn't going to solve any problem, if that problem isn't
solved or perhaps even solvable at all.

In other words, over the years, my heuristic has turned into, "Has a human
being solved this problem?" If not, then AI software that claims to is BS.

~~~
tensor
Supervised learning in machine learning is nothing remotely like a human
teaching anyone anything. It's a very clear mathematical formulation of what
the objective is and how the algorithm can improve itself against that
objective.

The closest analogy for humans would be to define a metric and ask a human to
figure out how to maximize that metric. That's something we're often pretty
good at doing, often in ways that the person defining the metric didn't
actually want us to use.

~~~
chimi
> Supervised learning in machine learning is nothing remotely like a human
> teaching anyone anything.

I disagree, I think it's exactly the same. As an example, a human teaching a
human how to use an orbital sander to smooth out the rough grain of a piece of
wood.

The teacher sees the student bearing down really hard with the sander and
hears the sander's RPMs declining, as measured by the frequency of the sound.

The teacher would help the student improve by saying, "Decrease pressure so
that you maximize the RPMs of the sander. Let the velocity of the sander do
the work, not the pressure from your hand."

That's a good application of supervised learning. Hiring the right candidate
for your company is not.

~~~
Thrymr
But that's not at all how "supervised learning" works. You would do something
like have a thousand sanded pieces of wood and columns of attributes of the
sanding parameters that were used, and have a human label the wood pieces that
meet the spec. Then you solve for the parameters that were likely to generate
those acceptable results. ML is brute force compared with the heuristics that
human learning can apply. And ML never* gives you results that can be
generalized with simple rules.

* excepting some classes of expert systems
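
Concretely, the setup described above might look something like this (a
sketch; columns, labels, and numbers all invented):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    pressure = rng.uniform(1, 10, 1000)                       # sanding parameter
    rpm = 12000 - 800 * pressure + rng.normal(0, 300, 1000)   # sander slows under load
    X = np.column_stack([pressure, rpm])
    # noisy human-applied "meets spec" label, as a stand-in:
    y = (rpm + rng.normal(0, 500, 1000) > 7000).astype(int)

    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print(clf.coef_)  # the solved-for parameters: low pressure / high RPM -> good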

~~~
chimi
One of the columns of sanding parameters is the sound of the sander.

------
kspacewalk2
The author says "AI is already at or beyond human accuracy in all the tasks on
this slide and is continuing to get better rapidly" and one of his examples is
"Medical diagnosis from scans". That is an example of precisely the sort of
snake oil hype he's berating in the social prediction category.

In an extremely narrow sense of pattern recognition of some "image features",
i.e. 5% of what a radiologist actually does, he's probably right. But context
is the other 95%, and AI is nowhere close to being able to approach expert
accuracy in that. It's a goal as far away from reality as AGI.

"AI" tools will probably improve the productivity of radiologists, and there
are statistical learning tools that already kind of do that (usually not
actually widely used in medical practice, you can say yet, I can say who knows
but nice prototype). But actual diagnosis, like the part where an MD makes a
judgement call and the part which malpractice insurance is for? Not in any of
our lifetimes.

A radiologist friend complains that they've been using speech recognition
instead of a human transcriptionist for 10+ years, and all the systems out
there are still really bad. Recognizing medical lingo is something you can
probably achieve with more training data, but software that sometimes drops
"not" from a scan report is a cost-cutting measure, not a productivity tool.
It makes the radiologist worse off, because he's got to waste his time
proofreading the hell out of it, but the hospital saves money.

~~~
randomwalker
Author here. I appreciate your criticism. What I had in mind was more along
the lines of Google's claims around diabetic retinopathy. I received feedback
very similar to yours, i.e. that those claims are based on an extremely narrow
problem formulation:
[https://twitter.com/MaxALittle/status/1196957870853627904](https://twitter.com/MaxALittle/status/1196957870853627904)

I will correct this in future versions of the talk and paper.

~~~
jl2718
Then I shall write to you directly. I don’t know how you can make the claim
that automated essay grading is anything but a shockingly mendacious academic
abuse of students' time and brainpower. To me, this seems far worse than job
applicant filtering, firstly because hiring is fundamentally predictive, and
secondly because many jobs have a component of legitimately rigid
qualifications. An essay is a tool to affect the thoughts of a human. It is
not predictive of some hidden factor; it stands alone. It must be original to
have value; a learned pattern of ideas is the anti-pattern for novelty. If the
grading of an essay can be, in any way, assisted by an algorithm, it is
probably not worth human effort to produce. If you personally use essay
grading software, or know of anybody at Princeton that does, you have an
absolute obligation to disclose this to all of your students and prospective
applicants. They are paying for humans to help them become better humans.

------
OscarTheGrinch
My brush with AI snake oil:

I interviewed at a startup that seemed fishy. They offer a fully AI-powered
customer service chat to banks as an off-the-shelf black box. I highly suspect
that they were a pseudo-AI setup. LinkedIn shows that they are light on
developers but very heavy on "trainers" - probably the people who actually
handle the customers, mostly young graduates in unrelated fields, who may
believe that their interactions will be the data needed to build a real AI.

I doubt that AI will ever be built; it's just a glorified Mechanical Turk
help-desk. I guess the banks will keep it going as long as they see
near-human-level outputs.

~~~
claytonjy
I have a feeling I know _exactly_ which company you're talking about...so it's
either just that obvious, or there's more than one of these, or both!

~~~
TeMPOraL
It seems to be the latter. In fact, the trick (should I say, fraud?) is so
common that there were even several articles about it in the press over the
past two or three years. Even the famous x.ai had (and I guess still has)
humans doing the work.

[https://www.bloomberg.com/news/articles/2016-04-18/the-
human...](https://www.bloomberg.com/news/articles/2016-04-18/the-humans-
hiding-behind-the-chatbots)

------
avip
My company is sourcing AI from MTurk. It's actually cheaper than running fat
GPU model training instances. The network learns fast and adapts well to
changes in inputs.

I envision the sticker "human inside" strapped on our algorithms.

~~~
buboard
You should emphasize that this is Organic AI. It's low carbon and overall
greener.

~~~
mumblemumble
Or keep calling it AI, and concede that AI stands for "actual intelligence" if
someone asks you directly.

~~~
dghughes
AI now is like "cyber" was in the 1990s: it seems to be nothing but a buzzword
for many organizations to throw around.

The term AI is used as if humanity has now figured out general AI, or
artificial general intelligence (AGI). It's quite obvious that organizations
and people use the term AI to fool the less tech-inclined into thinking it's
AGI - a real thinking machine.

~~~
mumblemumble
Remember 5-ish years ago when IBM's marketing department was hawking their
machine learning and information retrieval products as AI, and everyone in the
world rolled their eyes so hard we had to add another leap second to the
calendar to account for the resulting change in the earth's rotation?

I suppose their only real sin was business suits. Everything seems more
credible if you say it while wearing a hoodie.

------
Tistel
I worked at a place that was selling ML-powered analysis of science instrument
output. It did not work at all ("fake it till you make it" is normal, I was
told). So there was a person in the loop (machine output -> internet -> person
doing it manually, pretending to be the machine -> internet -> report app).
The joke was "organic neural net." Theranos of the North! ML is a great and
powerful pattern matcher right now (talking about NNs, not first-order logic
systems), but I fear we are going into another AI winter with all the
over-promising.

~~~
rm_-rf_slash
We won't ever have an AI winter like in the 70s again. A lot of ML is already
very useful across many domains (computer vision, NLP, advertising, etc.).
Back then, there was almost no personal computing, almost no internet, smol
data, and so on - stuff you need for ML to be useful and used.

So what if some corporate hack calls linear regression “AI”? The results speak
for themselves. The ML genie is too profitable to go back in the bottle.

~~~
BlueTemplar
Didn't linear regression use to be called "AI" as recently as a decade ago?

~~~
TeMPOraL
It's still better in many cases than modern ML (especially if you incorporate
explainability and efficiency as metrics of "better" next to the predictive
power), so I wouldn't object much if a company called it "AI". In fact, if I
learned that an "AI" behind some product was just linear regression, I'd trust
them more.

------
SkyBelow
What I dislike far more than the idea of using such systems to predict social
outcomes is that they are used behind closed doors. I would be much more
willing to accept such systems if the law required any such system to be fully
accessible online, including the current neural network, how it was trained,
and the training data used to train it (if the training data cannot be shared
online, then the neural network trained from it cannot be used by the
government).

Independent companies using AI are far less of a concern for me. If they are
snake oil, people will learn how to overcome them. Government (especially the
parts related to enforcement) is what I find scary.

~~~
closeparen
Using an association between features to make a prediction about something,
rather than measuring the thing itself, is exactly what’s meant by
“prejudice.” Even when the associations are real and the model is built with
perfect mathematical rigor. ML is categorically unsuitable for government
decisions affecting lives.

~~~
SkyBelow
You seem to think this level of prejudice for prediction is wrong. Why?

If someone has killed 12 people, being prejudiced about their chance of
killing another and using that to determine the length of a sentence seems
reasonable.

Even with something like a health inspection: measuring how a restaurant
stores and cooks raw chicken is about predicting the health risks to the
public eating there, not about measuring the actual number of salmonella
outbreaks. And even if inspectors were to measure previous salmonella
outbreaks and use them to predict future outbreaks, those would still be two
different things.

------
sgt101
I read "Why are HR departments apparently so gullible?" and as someone who has
worked in a corporate for 20 years I spotted my underwear.

The identification of facial recognition as problematic because of accuracy
doesn't match my thinking. I believe that the key issue is that given a set of
targets facial recognition systems will find near misses from the wider
population of all faces offered as candidates, that they then flag as
potential matches. This leads to real world problems (like innocent folks
being arrested).

Automated essay grading and content recommendation are both very problematic
because they do not account for originality and novelty. A lecturer grading an
essay that is written by a strong mind from a different culture might be able
to recognise and credit a new voice, a learned classifier never will.
Similarly content recommenders have us trapped in the same old same old
bubble, nothing strikingly new can get through.

~~~
erikig
...I spotted my underwear?

What does this mean?

~~~
wrinklytidbits
Interpretation 1: they pooped their pants

Interpretation 2: they introduced a non sequitur in their excitement of
finding their underwear (perhaps they lost it)

Interpretation 3: when they had their flashback to having worked in corporate
20 years ago they recalled where they misplaced their underwear

------
dekhn
Top textual feature predicting snake oil: calling the product AI rather than
ML.

~~~
Hitton
That doesn't work. Everything is called AI these days, and in the mountains of
bullshit there are also some actually useful results; those few are not snake
oil.

~~~
MadWombat
Somewhere out there, a biotech R&D company has developed an effective penis
enlargement treatment. Unfortunately they have been having some trouble
reaching potential customers.

~~~
scabarott
This is a tragedy for half of humanity. Actually, all of humanity, come to
think of it.

------
segfaultbuserr
This presentation categorizes AI-related tech that predicts social outcomes as
_fundamentally dubious_ , such as predicting criminal recidivism, predicting
terrorist risk, or predictive policing. The rationale is that:

(1) Technically, the technology is far from perfect.

(2) Social outcomes like fairness are fundamentally difficult to state.

(3) These inherent problems are amplified by the ethical/moral issues.

Of course, many find it unacceptable due to the ethical problems, in a typical
Western liberal democracy. But what if I'm an authoritarian who wants tools of
suppression, and I don't care about the false positives or the ethical
problems? Is it going to help my regime, or is it going to have the opposite
effect?

I highly suspect that the answer is the former. Fortunately, the technical
limitations of "AI" mean it's still more or less ineffective today, but it can
only get better.

Therefore, I don't think AI for predicting social outcomes is _fundamentally
dubious_ , but rather _fundamentally dangerous_.

------
jacquesm
Lots of AI is actually large numbers of humans working on small bits of
problems that are beyond our ability to automate. Not infrequently these are
passed off to the outside world as 'AI' startups. There are also good examples
of companies that use machine learning properly and to good effect.
Interestingly, they don't blab about it, because it is their edge over the
competition, and often just knowing that something can be done is enough to
inspire someone else to copy it.

So here is my own theory on how to recognize AI snake oil: if it requires
advertising it is probably fake, if it is very quiet and successful it is
likely genuine.

~~~
cosmojg
> if it requires advertising it is probably fake, if it is very quiet and
> successful it is likely genuine.

This is true of almost any product being offered for sale. Good advice.

~~~
TeMPOraL
Agreed, and I personally use the following heuristic when evaluating a
product: the heavier the advertising, the worse the product. After correcting
for rough company size (bigger company = bigger baseline advertising budget),
I have found it to be quite accurate.

------
tlb
The economic point of AI decision systems is that they can make an automated
decision for $0.0001 of computer time, instead of $10 for a few minutes of an
expert's time. For something like spam filtering, you obviously need the cheap
solution. But you don't need that for the social interventions where you're
making a decision about parole or something. You can spend $10 (or $1000) of
people's time to make those decisions, because both their volume and impact
are at human scale.

~~~
streetcat1
Moreover, the big advantage of AI is consistency, i.e. it does not get tired,
bored, lose motivation, etc.

Hence it might be better to get a 70% correct answer all of the time than a
90% correct answer (from a human) only some of the time.

~~~
KnightOfWords
On hard problems these AIs are doing no better than chance. Nor is a badly
trained AI in any way consistent.

------
codeulike
Perhaps ML could be applied here to help filter out the barrage of AI snake
oil. Funding, anyone?

~~~
jonplackett
You should fund it with an Initial Coin Offering for maximum credibility

~~~
ukd1
Where can I buy in?

------
lekanwang
I've worked on data related to healthcare and security (and sometimes both)
for quite a while now, and I think there are a couple of general contextual
themes which, if present, mean that you have to be extremely careful about
applying "AI" (some kind of ML in most cases):

(a) where there's a high cost for incorrect predictions (e.g. criminal
recidivism, educational attainment, terrorist attacks)

(b) where causation is important (e.g. drug efficacy and safety, educational
attainment, almost all of healthcare)

(c) where you're in an adversarial domain (e.g. fraud, cybersecurity, security
in general)

(d) where high technical performance (precision/recall/F1/etc.) isn't
correlated with predictiveness of what you're actually looking for (much of
healthcare)

In healthcare and security, there's starting to be an awareness of the snake
oil that's out there, but I still run into people regularly who ask for a
magic algorithm that predicts patient outcomes or a security breach.

------
gok
I really wish we could stop using "AI" or "ML" for things in the "predicting
social outcomes" category. Names like "computational astrology" or "machine
alchemy" would be a better fit.

~~~
BlueTemplar
"Astrology" (where mathematicians used to hang out before the scientific
revolution, so no "computational"/"mathematical" qualifier needed) has been
going on for decades in the financial and economic fields, so this one seems
to be promised a bright future too!

------
black_puppydog
Looking for a job as a "deep learning" PhD soonish, the amount of BS in this
field at the moment has all my meters maxing out. Going through job listings
is pretty exhausting when I'm constantly torn between laughing and crying...

------
Isamu
Excellent takeaways:

>AI excels at some tasks, but can’t predict social outcomes.

>We must resist the enormous commercial interests that aim to obfuscate this
fact.

>In most cases, manual scoring rules are just as accurate, far more
transparent, and worth considering.

------
dhairya
From the slides:

Harms of AI for predicting social outcomes

• Hunger for personal data

• Massive transfer of power from domain experts & workers to unaccountable
tech companies

• Lack of explainability

• Distracts from interventions

• Veneer of accuracy

Human behavior is not IID (independent and identically distributed), and these
models will struggle and fail due to the fundamental statistical assumptions
of modern AI techniques. I also agree that, as a result, we will normalize the
collection of personal data in the name of social progress.

------
bodeadly
Isn't "AI" pretty much snake oil. IIRC Artificial Intelligence used to mean a
computer that could think like a person. But that just is not the case. Even
with IBM ads with a computer talking to people saying it's going to fix the
network and stop cyber attacks that is just complete nonsense. And it will
probably always be nonsense because of course there's no way a computer can
think like a person because a computer is not a person. It did not grow up and
fall of it's bike and skin it's knee and take a road trip to the rock concert
and meet someone and so on. AI is being used as a marketing term to compensate
for the fact that sophisticated pattern recognition algorithms and the like
are not particularly marketable even if they are useful.

------
wsn_101
Once upon a time, I did research in the field of Wireless Sensor Networks,
abbreviated WSN. You can still find lots of research papers from circa 10-12
years ago using the abbreviation in their titles.

It went through a hype cycle, and then people sort of moved on - into IoT. And
IoT, naturally, just sounds better and is more accessible, plus the tech did
catch up, so it became a lot more popular than WSN (the term).

I told a friend of mine that no one paid attention when the field was called
WSN, and now it is the exact same thing, but IoT is taking off like crazy. (At
that point, I had left research altogether). He said "Yes, and that's
perfectly reasonable. If you don't invent new terminology every few years and
manufacture some kind of new hype, the funding agencies stop sending you
money."

This is a community which is based on the mantra of "making something users
want". Once you realize that the upstream user here is the funding agency, and
what they really want is to make bets on "cool stuff for tomorrow" rather than
boring old 3-5 year old tech (such as WSN), the hype actually makes perfect
sense.

Sure, there is still a need to separate the snake oil from the reality. But
that is true of tech in general. I am not sure if AI/ML is particularly bad in
some way.

------
ouid
Here's the compressed decision tree:

Does it claim to have high performance on a task? Can humans quickly and
cheaply verify that claim? If the answers are yes and no respectively, then it
is snake oil.

------
jmmcd
Several references to simple linear models and even _improper linear models_ ,
so I will recommend this paper: Dawes, _The Robust Beauty of Improper Linear
Models_.

[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.188...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.188.5825&rep=rep1&type=pdf)

------
cellular
It seems like at least half the tech people I talk to work with AI now... no
matter their field.

~~~
nabla9
Any type of heuristic search = AI. Hand-made decision tree or lookup table =
AI. Naive Bayes = AI. KNN = AI. Uses numpy = AI. One employee in a mobile app
startup has a graduate thesis in ML = AI startup. Approximate string matching
in an SQL query = AI. Not clear what the product is going to be = AI.

"Information is not knowledge. Knowledge is not wisdom. Wisdom is not truth.
Truth is not beauty. Beauty is not blockchain. Blockchain is not AI. AI is THE
BEST.”

~~~
jnwatson
In my space (cybersecurity) I’ve heard “more than 2 joins” (in an SQL context)
is ML.

~~~
goatlover
So it's turning into a meaningless buzzword.

------
DonHopkins
The classic Bone Tumor Differential Diagnosis program that Apple published for
the Apple ][ in the original Apple Software Bank Volume 1 from 1978 sounds
legit. But beware: it requires 32K of RAM, bigger than any other program in
that volume! (Available on both cassette and 5-1/4" floppy disk for free from
your local Apple dealer.)

[https://archive.org/stream/Apple_Software_Bank_Vol_1-2/Apple...](https://archive.org/stream/Apple_Software_Bank_Vol_1-2/Apple_Software_Bank_Vol_1-2_djvu.txt)

[https://archive.org/details/a2_Biology_19xx_](https://archive.org/details/a2_Biology_19xx_)

[https://mirrors.apple2.org.za/ftp.apple.asimov.net/documenta...](https://mirrors.apple2.org.za/ftp.apple.asimov.net/documentation/applications/misc/Apple%20Software%20Bank%20Vol%201-2.pdf)

Program Name: BONE TUMOR DIFFERENTIAL DIAGNOSIS

Software Bank Number: 001 1 4

Submitted By: Jeffrey Dach, M.D.

Program Language: APPLESOFT II BASIC

Minimum Memory Size: 32K Bytes

This program is intended for use by qualified medical practitioners. While the
specific data are of interest only to those familiar with bone pathologies,
the programming techniques may well interest a wide range of computer users.

INSTRUCTIONS

LOAD the program into APPLESOFT II BASIC, and type RUN. Follow the
instructions displayed on the screen. The program asks a series of questions
concerning radiographic and clinical details of the bone tumor in question.
For each question, type the number of the appropriate answer and press the
RETURN key. Finally, the program uses Bayes' rule and a predetermined
probability matrix from Lodwick (1963) to calculate the relative probabilities
of 9 different diagnoses.

Some knowledge of descriptive terms for bone tumors is needed to answer the
questions. Only a qualified physician should attempt to use this program as a
diagnostic tool.
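
For the curious: the method it describes is straightforward Bayes'-rule
updating over a likelihood table. A toy sketch of the idea - the diagnoses and
numbers below are invented, not Lodwick's actual nine-diagnosis matrix:

    import numpy as np

    diagnoses = ["osteosarcoma", "chondrosarcoma", "giant cell tumor"]
    priors = np.array([0.40, 0.35, 0.25])

    # P(observed answer | diagnosis), one row per question asked:
    likelihoods = np.array([
        [0.7, 0.2, 0.4],   # e.g. "margin is ill-defined"
        [0.3, 0.6, 0.5],   # e.g. "patient is over 40"
    ])

    posterior = priors * likelihoods.prod(axis=0)  # Bayes' rule, numerator
    posterior /= posterior.sum()                   # normalize to relative probabilities
    print(dict(zip(diagnoses, posterior.round(3))))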

------
jimbob45
Can we do blockchain next?

~~~
smabie
We don’t even need AI for that. Here’s some flawless pseudo-code:

    def is_snake_oil(coin):
        return not (coin == "BTC" or coin in stable_coin_list)

~~~
dmd
Sadly, that gives a false negative when coin == "BTC" or coin is in
stable_coin_list.

~~~
deusofnull
I think that was the joke

------
0xdeadbeefbabe
> Actually, Major Major had been promoted by an I.B.M. machine with a sense of
> humor almost as keen as his father's.

Catch 22

------
jusonchan81
I once figured I could use ML to predict stock market movements, so I went
about buying 5 years of data from CBOE and started working on modeling. At one
point I managed to get fairly accurate predictions of movements, and I thought
I had struck the lottery.

Turns out my training data had tomorrow's movement indicator in it, which I
had accidentally added. Even with that dumb mistake, it was only accurate up
to 80% or so. That's when I realized I'm better off with a random number
generator.
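
The leak would have shown up immediately with a cheap check: any feature that
correlates almost perfectly with the label is suspect. A sketch with
hypothetical column names:

    import pandas as pd

    df = pd.DataFrame({
        "rsi":          [30, 70, 55, 45, 60, 35],
        "volume_z":     [0.1, -1.2, 0.4, 0.9, -0.3, 1.1],
        "movement_ind": [1, 0, 1, 1, 0, 1],   # the leaked feature...
        "next_day_up":  [1, 0, 1, 1, 0, 1],   # ...which IS the label
    })
    corr = df.corr()["next_day_up"].drop("next_day_up")
    print(corr.abs().sort_values(ascending=False))  # movement_ind: 1.0 -> leak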

~~~
radarsat1
I went down that path briefly too. After having trouble getting results, I
realized I needed to check the fundamentals. What did it in for me was finally
just doing a simple correlation analysis of earlier prices to later prices,
and realizing there is NO CORRELATION. I realized, if there's nothing to
predict, no amount of ML will magically figure it out for me, no matter how
complex the indicator.

(I was just looking at single products... I suppose this may be completely
different when considering multiple products and markets and integrating
external information like news sources... not at all saying that ML can't be
applied to the stock market, just that it's not nearly as simple as looking at
previous prices to predict future prices, as everyone says... the point being:
always check basic correlations!)
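
And the check itself is tiny. With a synthetic random-walk price series
standing in for the real data (a random walk reproduces the "no correlation"
finding):

    import numpy as np

    rng = np.random.default_rng(0)
    prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 5000)))  # random walk
    returns = np.diff(np.log(prices))

    # correlation between today's return and tomorrow's:
    print(np.corrcoef(returns[:-1], returns[1:])[0, 1])  # ~0: nothing to predict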

------
jl2718
There is a particular form of statistical inference that is not well
understood to be problematic. I will use my "smart" scale as an example. It
takes weight and several impedance measurements; these are collected from a
large population along with body-composition ground truth from DEXA, and the
coefficients of the algorithm are determined by regression. It should be no
surprise that the weight measurement overpowers all the other measurements in
the regression, and that the impedance covariance is ill-conditioned, so
really it's just a height-and-weight formula. My information gain from the
impedance measurement is zero. The correct way to do the regression is in
reverse, from more (relevant) information to less, so that the parameters are
well-conditioned. This assumes that DEXA contains all the information useful
for predicting impedance; if not, forget about the whole thing. If that model
works, then the reverse can be found through Bayesian optimization. You still
have bias set by the covariance prior, but it is at least known, and you can
state how it is affecting the result.
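
To make that concrete, here is a toy version of the scale's forward regression
(all data invented): two nearly collinear impedance channels make the normal
equations ill-conditioned, while weight carries all the signal:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    weight = rng.normal(75, 12, n)
    z1 = rng.normal(500, 50, n)                     # impedance channel 1
    z2 = z1 + rng.normal(0, 1, n)                   # channel 2, almost a copy of 1
    body_fat = 0.4 * weight + rng.normal(0, 3, n)   # "DEXA truth", by construction

    X = np.column_stack([weight, z1, z2])
    coef, *_ = np.linalg.lstsq(X, body_fat, rcond=None)
    print(np.linalg.cond(X.T @ X))  # huge: the fit is ill-conditioned
    print(coef)                     # weight dominates; z1/z2 coefficients are noise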

------
nsainsbury
It's a bit of a harsh heuristic, but after working on several projects
involving ML/AI, and after reading and watching the experiences of others in
the industry, I've come to treat most claims of ML as snake oil.

In industry today, I believe very few businesses are reaping much benefit from
ML compared to trivial statistical/analytical tools (linear regression, the
most popular recommenders, common-sense improvements/optimizations, etc.). The
only real benefit I would argue ML has brought businesses has been in
marketing to the general lay audience and in misleading investors.

The main reason for this in my opinion is you can't really just come in and
make recommendations/improvements to a given problem domain without deeply
understanding that domain back to front - and that's an understanding that
academic types who get hired to build ML systems almost never have. You can't
stand at arm's length from real business problems and just throw maths at
them and expect to make good (or even sensible) recommendations.

------
kache_
>The fast way forward involves being critical, being pragmatic, not
overselling, and not drinking the Kool-aid.

------
jacobsenscott
Does the company mention AI in their name, marketing material, or investor
pitch? Then it is snake oil.

------
coleifer
I feel like if AI were so easy, someone would be making billions beating the
stock market. That problem has way more historical data and features than any
of the problematic examples in this slide deck, but it's essentially unsolved.

~~~
alexanderchr
Some are making billions beating the stock market

[https://en.wikipedia.org/wiki/Renaissance_Technologies](https://en.wikipedia.org/wiki/Renaissance_Technologies)

------
yellow_postit
I tend to find the following few questions a quick way to evaluate an ML/AI
pitch in an elevator:

Ask yourself: could a human given the inputs reasonably produce the outputs
you are looking for? — this helps avoid/identify the pure magic pitches.

Then ask the person pitching: 1\. What’s your training data and how is it
collected? 2\. What’s your validation data and how is that constructed, and
how does your system perform on that set? 3\. What are the blind spots and
biases in your model and how are you mitigating them?

If they don't have succinct, competent answers, or if there are major red
flags like no validation set or a claim of no biases, then run away.

------
dgritsko
This was a really interesting read. In relation to the discussion of the
predictive accuracy of a dataset with 13,000 features, I thought it might be
worthwhile to bring up the idea of the "Curse of Dimensionality" for anyone
unfamiliar:
[https://en.wikipedia.org/wiki/Curse_of_dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality)

The "tl;dr" is basically that more features is not necessarily a panacea and
can actually cause more problems.

~~~
abhgh
Thanks for adding that to the discussion. I'd like to point out a couple of
things:

(1) That adding features can create problems is well known among good ML
practitioners (I daresay, especially those who have had a fair amount of
exposure to non-deep-learning techniques). With deep learning you can afford
to worry less, since with enough data and compute cycles the network can
figure out what to ignore. Which is convenient. Throwing out uninformative
features may still have a practical benefit, however: fewer features ->
smaller dataset -> faster training.

(2) This is probably a minor nitpicky point: adding more features can lead to
no improvement not only because of the curse of dimensionality, but sometimes
simply because the feature has absolutely no bearing on the label; that is to
say, you might not be adding noise, but you might not be adding information
either.
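
Both points are easy to demonstrate on synthetic data - a hedged sketch, not
any particular real dataset: two informative features, then piles of
pure-noise ones, fed to a distance-based learner:

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    signal = rng.normal(size=(400, 2))           # two informative features
    y = (signal.sum(axis=1) > 0).astype(int)

    for n_noise in [0, 10, 100]:
        X = np.hstack([signal, rng.normal(size=(400, n_noise))])
        acc = cross_val_score(KNeighborsClassifier(), X, y, cv=5).mean()
        print(n_noise, round(acc, 3))            # accuracy falls as noise dims pile up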

------
plaidfuji
Easiest way to cut through AI BS: ask “what’s their dataset?” If it’s not
obvious, there’s a problem. AI is only as powerful as the data it learns from.

There is one exception to this: if an exhaustive simulation of the problem
exists. This is why AI is so successful at sandboxed games like chess and Go.
It can generate its own data with zero ambiguity.

So: what’s your dataset? What simulation are you inverting? If neither, you’re
just writing an expert system based on heuristics.

------
Uhhrrr
It would be interesting to try to train an algorithm to detect snake oil in
companies claiming AI, but I don't know how it would work with no negatives.

------
harry8
How to recognize AI snake oil. The salesman calls it AI and claims it is not
simply statistical inference.

------
classified
That's ridiculously easy: If it contains the letters "AI", it's snake oil.
There is _nothing_ in man-made software that would merit the term
"intelligence". Real intelligence would be creative and unpredictable. We
would run for cover.

------
Beltiras
"Lack of explainability" might be seen for some to be a feature, not a bug.
LEOs don't care how they have to justify an action, just that the
justification exists for them to operate the way they want to.

------
BlueTemplar
This got me thinking - are there any cases of _literal_ snake oil (or any
snake-derived products) having better than placebo effects ? (I assume that
it's different from snake venom?)

------
netwanderer3
Company founders have learned that simply by being present in a big, growing
industry, they are virtually guaranteed to receive investment. A rising tide
lifts all boats.

------
1_over_n
Love this; many problems can be solved with regression analysis.

------
splatcollision
How to use AI to recognize AI snake oil?

------
jakeogh
1\. If it purports to have a "self".

------
SkyMarshal
This is a good framework and categorization for evaluating the effectiveness
of AI.

TLDR:

AI is good at: Perception (how things are)

AI is ok at: Judgement (what call a human would make)

AI is poor at: Predicting (how things are going to be)

I.e., AI is effective to varying degrees at observing and characterizing how
things are, but not at predicting how they're going to be.

For predicting, AI with hundreds of variables fares no better than simple
linear regression over a handful of variables, simply because _the future is
not reliably predictable_ , by any means.

------
nkoren
This is good. I co-founded Futurescaper[1], a company that does work in
strategic foresight systems -- collective mapping of complex systems, and
analytical tools to help people and organisations to understand them. This put
us in a prime position to be vendors of AI snake oil. Due to a stubborn
overabundance of ethics, we've refused to do so, which has undoubtedly cost us
a lot of business. People _want_ to buy snake oil. It goes down much smoother
than hard truths.

To amplify what this presentation says, here are the hard truths about
predicting social outcomes: either they

1\. Are simple and obvious, and can be easily understood and predicted via
regressions and trendlines.

2\. Are complex, non-obvious, and neither can nor should be predicted. In
fact, attempting to predict them is often dangerously wrong. But this doesn't
mean that they can't be understood.

In the first instance, you don't need any fancy software, so there's no snake
oil to be sold. The second instance, however, gets caught in a cognitive bias:
we think that the future can be predicted. This is because, in simple systems,
it can: drop a glass above a hard floor, and you can accurately predict that
it will fall and shatter. That is a simple system. We think that complex
systems must be similar... just more complex.

But complex systems are fundamentally different. Consider a double pendulum.
Its movement can't be predicted for more than the next few swings. Even if you
know the _exact_ starting configuration of the pendulum -- and by exact, I
mean not just every single sub-atomic particle in the pendulum arms, but the
gravitational influence of literally every object in the universe -- you still
wouldn't be able to greatly extend your foreknowledge of its movements. This
is because the feedback loops are driven by chaos, and chaos is baked directly
into the mathematical fabric of the universe itself.

For the mathematically inclined, consider the Mandelbrot set: it is,
essentially, the equivalent of a double pendulum. It asks a question: "by
starting with this number and iteratively exponentiating it, will it trend
towards zero or infinity?" When you ask this question on a simple number line,
the answer is obvious: below 1, it trends towards zero; above 1, it trends
towards infinity. However, when you ask this question on the complex number
plane -- with two numbers feeding back into each other as they exponentiate --
then the only way to answer the question is to keep iterating the numbers and
find out. There's no shortcut. In some places, the answer is found quite
quickly. In others, it takes thousands of iterations. In still others, it
takes infinite iterations: you could build a computer the size of the galaxy
and you still wouldn't be able to answer whether such-and-such coordinates
trend towards zero or infinity. That's the complexity you get from just two
number lines and a very simple feedback loop.
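
In code, the question is literally just this loop - the standard escape-time
iteration z -> z*z + c - and the only way to answer it at a given coordinate c
is to run it:

    def escape_time(c, max_iter=100_000):
        z = 0
        for i in range(max_iter):
            z = z * z + c
            if abs(z) > 2:        # once |z| > 2 it provably diverges
                return i          # answered
        return None               # still undecided; we gave up

    print(escape_time(1.0))       # escapes almost immediately
    print(escape_time(0.26))      # just outside the set: takes ~30 iterations
    print(escape_time(-1.0))      # inside the set: never escapes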

The real world of psychological and social cause-and-effect is _far_ more
complex than a double pendulum or a Mandelbrot set, and attempting to predict
it is even more futile. In fact, it's dangerous.

What makes this dangerous is that you _can_ throw statistics at complex
futures, and make statements like "Future A has a 40% probability; futures B,
C, and D all have a 20% probability". But the human mind is terrible at making
good judgements based on this kind of information. Hearing it, people tend to
think: "right, that's settled then: Future A is twice as likely as any other
scenario, so that's what we'll plan for." We fixate on what looks like the
most likely future, and disregard the rest.

The problem is that if the preparations for Future A are contrary to what
you'd want to do in Futures B, C, and D, then betting everything on Future A
means that 60% of the time, you'll lose.

What's interesting is that increasing the accuracy of those percentages
doesn't necessarily help, and in many cases can hurt, since it only reinforces
our tendency towards target fixation. In a 40/20/20/20 scenario, we might
still make some concessions towards planning for the "20s". But in a
70/10/10/10 scenario, those "10s" will simply be discarded. Which means that
30% of the time you won't just lose: you'll be blindsided. Utterly fucked.

Unfortunately, most of the predictive AI that I've seen is focused on either
increasing the "certainty" via extremely dubious means, or simply hiding the
non-dominant answers altogether. So the AI does the target fixation for you.
That's not a good thing.

The entire discipline of Scenario Planning[2] evolved to help people and
companies "un-predict" the future. Rather than putting percentages on
_probabilities_ of outcomes, people need to understand the _possibilities_ of
outcomes. Even if complex systems can't be predicted, they can be better
understood, and that can be very valuable for helping to navigate them in
real-time. Rather than giving people the easy (and usually wrong) answer of
"here's what is going to happen", you can give them a range of futures that
_could_ happen, and understanding those possibilities can aid navigation and
lead to better outcomes.

It may even be that AI has a legitimate role to play in this process -- but it
won't be in falsely predicting the outcome of complex systems, nor will it be
in absolving people of the responsibility of thinking for themselves. This,
unfortunately for my company's ability to raise capital, is a much less sexy
sales pitch than the AI snake oil. (Although it does get us smarter and less
annoying clients, so there's that!)

1: [https://www.futurescaper.com/](https://www.futurescaper.com/)

2:
[https://en.wikipedia.org/wiki/Scenario_planning](https://en.wikipedia.org/wiki/Scenario_planning)

------
ipsa
This seems contradictory:

> AI can't predict social outcomes

> In most cases, manual scoring rules are just as accurate

So manual scoring rules don't work either for predicting social outcomes? Is
there some magic sauce that humans use for prediction that we haven't cracked
yet? Can nothing predict social outcomes?

AI is perfectly capable of predicting social outcomes, and only in very few
cases are manual scoring rules as accurate as black box AI. The ethical
concern is not about accuracy, but about our sensibilities when it comes to
protected classes. The author cherry-picked examples where simpler approaches
also worked, but says nothing of practical feasibility or the increase in
variance. Try actually doing face recognition or spam detection with manual
rules.

Face recognition being way more accurate is just as much an ethical concern as
a gun that is way more accurate. It all depends on who you point it at.
Accurate face recognition at the border helps save lives as much as equipping
the police with more accurate handguns.

The talk of AGI is misguided. Everybody can see that the economy will be
increasingly automated with narrow AI. Just because "big data" was a hype word
does not mean companies haven't been monetizing their big data (and were thus
right to collect it).

We _can_ predict probabilities about the future. The author is attacking these
systems for not being 100% sure. Predictive policing is automated resource
management. Militaries have been doing this for decades. It has its drawbacks,
but also benefits (wiser usage of tax money, protecting low-income
neighborhoods from falling into the hands of gangs).

The author also claims that algorithms automatically turn away people at the
border for posting or liking or being connected to terrorist propaganda. But
these systems just give a score and a human border guard makes the (more
informed) decision.

A system not being 100% accurate is not an ethical concern, as long as we do
not treat those systems as 100% accurate and provide proper recourse.

Just a spell check can and does weed out poor candidates. Why does HR want to
automate? Because they get 1000+ resumes for a single position. The manual
glance they give each one pales in comparison to what an automated system can
do.

What is more likely? That these HR systems show promise? Or that the VC market
has completely lost it (despite having worked with software and automation for
decades, and having AI experts on staff) and is pumping billions into tea-leaf
reading because it's now called "AI"?

If you cheat the system by adding "Cambridge" or "Oxford" in white letters to
your CV, is that ethical? Why not add it to your education section in black
letters? Would you hire a good potential candidate, if you knew they acted
like 90s search engine spammers? Maybe a candidate from Oxford or Cambridge
really deserves to be on the top of the pile, or is it now unethical to look
at education when hiring?

This presentation likes to mix ethics with technical success. Just say that an
HR system is unethical, without calling it bogus with zero proof other than
"some AI experts agree that this is impossible".

Yes, there is a lot of snake oil AI, and this will only increase. But these
systems can and do work. I am sure there are AI experts building these systems
right now.

------
ibalioalex
Uhu seems good

