
Attacks against machine learning – an overview - ebursztein
https://elie.net/blog/ai/attacks-against-machine-learning-an-overview
======
bhnmmhmd
It'd be great if there were a service you could sign up for that would
"deceive" Facebook, Twitter, and other social media websites by producing
false information about you. For example, if I don't want FB to know what
movies I'm interested in, how about liking "random" movie pages on FB? If I
don't want FB to know my political orientation, how about running with the
hare and hunting with the hounds?

~~~
itronitron
In order to wash out the signal, all the service would need to do is 'like
everything'. Besides masking your interests, it would also grind their
algorithms to a halt if enough people did it: many of these algorithms gain
efficiency from the sparsity of the data, so if everything became connected to
everything, performance would suffer.
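A toy sketch of the sparsity point (everything here is hypothetical: 1000 pages split into two "topics"): a user's likes form a binary vector, and cosine similarity against topic profiles reveals their interests; once everything is liked, the profile becomes equidistant from every topic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pages = 1000

# Hypothetical topic profiles: pages 0-499 are "topic A", 500-999 "topic B".
topic_a = np.zeros(n_pages); topic_a[:500] = 1
topic_b = np.zeros(n_pages); topic_b[500:] = 1

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# A real user who likes 50 topic-A pages: a sparse, informative signal.
real = np.zeros(n_pages)
real[rng.choice(500, size=50, replace=False)] = 1

# The same account after a bot likes everything on its behalf.
washed = np.ones(n_pages)

print(cosine(real, topic_a), cosine(real, topic_b))      # ~0.32 vs 0.0
print(cosine(washed, topic_a), cosine(washed, topic_b))  # ~0.71 vs ~0.71
```

The sparse profile clearly leans topic A; the washed-out one carries no usable signal, and it is also dense, which is exactly what sparsity-exploiting algorithms are not built for.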

Anyone know how to get, or compile, a list of everything likable on Facebook?

~~~
pferde
Until you get into problems (legal or personal, doesn't matter) for "liking"
stuff related to child porn, terrorist propaganda or, I don't know,
scientology, without even knowing about it, because it was done on your behalf
by this "like automaton".

~~~
sacado2
Exactly. In France, people have been convicted because they "liked" illegal
opinions. As if the very existence of something called an illegal opinion were
not enough of a problem, the courts have decided that the meaning of a "like"
is "I make this opinion mine".

~~~
pferde
Can you please share some links on this? All I could find was a similar case
in Thailand: [https://www.theguardian.com/world/2015/dec/10/thai-man-
arres...](https://www.theguardian.com/world/2015/dec/10/thai-man-arrested-
facebook-like-photo-king)

~~~
sacado2
I found that, but this is in French, obviously:
[http://www.leparisien.fr/rozay-en-brie-77540/rozay-en-
brie-c...](http://www.leparisien.fr/rozay-en-brie-77540/rozay-en-brie-
condamne-pour-avoir-like-une-photo-de-daesh-22-08-2017-7206846.php)

My own (approximate) translation of parts of the text:

"Sur Facebook, le trentenaire avait apposé un «J’aime» sur une image d’un
combattant de Daesh brandissant la tête décapitée d’une femme. Il a été
condamné à trois mois de prison avec sursis." \--> "On Facebook, the man in
his thirties had clicked "like" on a picture of an ISIS fighter holding the
head of a beheaded woman. He was given a 3-month suspended prison sentence".

"«Quand on met J’aime, c’est que l’on considère que ce n’est pas choquant ou
que l’on adhère», considère pour sa part Jean-Baptiste Bougerol, le substitut
du procureur de la République." \--> ""When you click "like" on something, you
consider it's not shocking or you agree with it"", said the prosecutor".

------
Noumenon72
I just watched a presentation about using deep learning to detect cheaters in
Counter-Strike: GO
([https://youtu.be/ObhK8lUfIlc](https://youtu.be/ObhK8lUfIlc)), and the
question he didn't seem to have an answer for was data poisoning -- what if
the cheaters all volunteer to be on the anti-cheater jury? Of course they are
cross-checking juror reliability ratings and such, but it's definitely a
treadmill.

~~~
simsla
Whenever you're crowdsourcing, bad actors are a possibility. You'd usually
track agreement to root out both the bad and incompetent actors, but what
you're saying would essentially amount to a 51%-attack. That is, with enough
bad actors working together, consensus stops being trustworthy.

I see two ways to address this (there are probably more, this is just me
thinking out loud):

1. Increase the total pool of reviewers so that a 51%-attack becomes
infeasible. Incentives can be offered to the rest of the community to get them
to participate. (This is similar to what Bitcoin tries to do, with the added
obstacle of actor anonymity. In an anonymous system, 1 bad actor can trivially
simulate an arbitrary number of actors. Bitcoin tries to solve this by
increasing the _operating cost_ for each perceived actor. Counter-Strike can
be seen as having a fixed lump operating cost: the purchase price of the game
+ the time investment to accrue enough XP to qualify for the cheater jury.)

2. Create an additional set of people you trust unconditionally. (These can
be people you train and pay a wage.) This means you can spot-check anyone, and
a consensus among bad actors becomes an investigative clue (to find more bad
actors) rather than a hindrance.
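The 51%-attack dynamic is easy to simulate (all numbers hypothetical): say a verdict needs a simple majority of 51 jurors, honest jurors vote correctly 90% of the time, and colluders always vote the wrong way.

```python
import numpy as np

rng = np.random.default_rng(1)

def majority_accuracy(n_jurors, n_colluders, p_honest=0.9, trials=2000):
    """Fraction of trials where the majority verdict is correct, when
    colluders always vote incorrectly."""
    honest = n_jurors - n_colluders
    # Correct votes per trial come only from honest jurors.
    correct = rng.binomial(honest, p_honest, size=trials)
    # The verdict is correct when correct votes exceed half the jury.
    return np.mean(correct > n_jurors / 2)

for colluders in (0, 10, 26, 40):
    print(colluders, majority_accuracy(51, colluders))
# With 0 or 10 colluders accuracy stays ~1.0; once the colluders hold a
# majority (26+ of 51), the honest side can never win: accuracy drops to 0.0.
```

Accuracy degrades as the colluders approach half the jury and collapses to zero the moment they pass it, which is why making a majority expensive to assemble (option 1) matters.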

------
mholt
Very related: about a year ago, I wrote about weaknesses of neural networks
specifically:
[https://matt.life/papers/security_privacy_neural_networks.pd...](https://matt.life/papers/security_privacy_neural_networks.pdf)

With powerful machine learning systems, we need to think about security a
little differently. See especially section 4.8, on function approximation:

> _Given a task for which no discrete algorithm is known, there is a good
chance a neural network can at least approximate it. The extreme value of
neural networks is their ability, in many cases, to act as an unknown
function that can map inputs to outputs with good enough generalization,
almost as if the actual function were known. This makes any system that
relies on the difficulty of implementing an unknown function vulnerable to
the malignant use of neural networks._
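A minimal sketch of that point (the "secret" function, architecture, and sample counts are all hypothetical): treat some function as a black box that can only be queried, and fit a one-hidden-layer network to the query/response pairs. Here the hidden weights are random and fixed, so the only "training" is a least-squares fit of the output layer, yet the unknown function is recovered closely.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical "secret" function, visible only through queries.
def secret_score(x):
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] ** 2

# Collect input/output pairs by querying the black box.
X = rng.uniform(-1, 1, size=(5000, 2))
y = secret_score(X)

# One-hidden-layer network: random fixed hidden weights, trained output layer.
W = rng.normal(size=(2, 200))
b = rng.normal(size=200)
H = np.tanh(X @ W + b)                         # hidden activations
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)  # fit the output layer

# The approximation generalizes to inputs that were never queried.
X_test = rng.uniform(-1, 1, size=(500, 2))
err = np.mean(np.abs(np.tanh(X_test @ W + b) @ w_out - secret_score(X_test)))
print(err)  # small: the "unknown" function is now effectively known
```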

~~~
joewee
Very well said; I never thought about it this way. It also nullifies the
secrecy of a lot of proprietary risk-scoring models, like credit scores. I
wonder what research has been done on this for automated trading systems? I
can see an “attacker” that creates models whose only purpose is to force
another financial institution to make unprofitable trades, based on
reverse-engineering the other traders' trading models. Eventually, if it
isn't already happening, trading becomes machines attacking other machines.

~~~
tzahola
>Eventually, if not already happening, trading becomes machines attacking
other machines.

Welcome to high frequency trading. You’re a bit late to the party, though (by
about 15 years).

~~~
robertk
I believe you are wrong, if I am to believe my trader friend, to whom I showed
this thread and who answered:

“Trivially true in the 'of course the behaviour of others matters' sense and
in the 'could my actions influence others' sense. Not necessarily operational
though.

Fine line from there to 'spoofing' (== placing trades solely with the intent
of engaging others to trade at a price level) -- which is VERY EXPLICITLY not
allowed, and for which you can get fined and go to jail. Recall the case of
that poor SOB out of London who was made a poster boy for the flash crash?”

It seems there is regulation against this.

------
Bromskloss
> Model stealing techniques, which are used to “steal” (i.e., duplicate)
> models or recover training data membership via blackbox probing. This can be
> used, for example, to steal stock market prediction models

I would like to hear stories about such attacks on stock market models.

~~~
taurine
There aren't any, since there are no public stock market models worth
copying, and no stock market model takes external input. But if one did (say,
you could submit a time series to a model in the cloud and get predictions
back), then it would be possible.

Copying models is a problem for cloud-hosted, pay-per-prediction image
classification, not for constantly retrained stock market models that don't
take external input.

~~~
Bromskloss
I thought it would be about observing the behaviour of a system that is
trading on the market. The input to the system would consist, for example, of
other people's trades.

~~~
taurine
It is referring to
[https://arxiv.org/abs/1609.02943](https://arxiv.org/abs/1609.02943)

What you are referring to is possible, but it is not "copying" per se, just
trying to infer what the system is doing (inverse RL) and then exploit that /
induce mistakes. Outside of HFT it is very difficult to distinguish bots from
humans, so you'd have a hard time even finding a target.
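The idea in that paper (Tramèr et al., "Stealing Machine Learning Models via Prediction APIs") can be sketched in miniature. The "victim" below is a hypothetical linear classifier, far simpler than the paper's targets: the attacker queries the black box, then fits a surrogate to its answers.

```python
import numpy as np

rng = np.random.default_rng(0)

# The victim: a linear classifier with secret weights, exposed only
# as a label-returning black box (a stand-in for a prediction API).
w_secret = rng.normal(size=5)

def blackbox(X):
    return np.sign(X @ w_secret)

# Attacker: send random queries and record the labels that come back...
X = rng.normal(size=(2000, 5))
y = blackbox(X)

# ...then fit a surrogate to imitate them (least squares on +/-1 labels).
w_stolen, *_ = np.linalg.lstsq(X, y, rcond=None)

# The surrogate agrees with the victim almost everywhere on fresh inputs.
X_new = rng.normal(size=(2000, 5))
agreement = np.mean(np.sign(X_new @ w_stolen) == blackbox(X_new))
print(agreement)  # close to 1.0
```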

------
iamaaditya
One simple way to minimize the impact of these attacks is our work, Pixel
Deflection (CVPR 2018 Spotlight). Here is a short (4-minute) video
introduction to the idea: [https://youtu.be/VgjOXJ9QKWo](https://youtu.be/VgjOXJ9QKWo)
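For anyone who prefers text to video, the core transform can be sketched roughly as follows (parameters hypothetical; the paper pairs this with a wavelet-denoising step, omitted here): a small number of randomly chosen pixels are replaced by randomly chosen nearby pixels, which barely affects a human's or a clean classifier's reading of the image but disrupts carefully placed adversarial perturbations.

```python
import numpy as np

def pixel_deflection(img, n_deflections=200, window=10, seed=0):
    """Replace n_deflections randomly chosen pixels with a pixel
    sampled from a small local window around each of them."""
    rng = np.random.default_rng(seed)
    out = img.copy()
    h, w = img.shape[:2]
    for _ in range(n_deflections):
        r, c = rng.integers(h), rng.integers(w)
        # Pick a neighbor inside the window, clipped to image bounds.
        r2 = int(np.clip(r + rng.integers(-window, window + 1), 0, h - 1))
        c2 = int(np.clip(c + rng.integers(-window, window + 1), 0, w - 1))
        out[r, c] = out[r2, c2]
    return out

img = np.random.default_rng(1).random((32, 32, 3))
defended = pixel_deflection(img)
print(np.mean(img != defended))  # only a small fraction of values change
```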

------
frag
Interesting post. To complement it, here is a piece about adversarial
examples in medicine:
[https://medium.com/fitchain/attacking-deep-learning-models-3...](https://medium.com/fitchain/attacking-deep-learning-models-380b71a14747)

~~~
Noumenon72
Is the thesis there that Big Pharma would pollute the data to sell more cancer
cures to people with moles?

------
itronitron
A couple of requests, in case any of you find yourselves writing something
similar...

Please don't title your article one thing (ML) and then in the first sentence
set the context to something else (AI).

Please lead with a short paragraph stating what you did, in what context, and
for what purpose, instead of trying to grab the whole pie and implying that
your experience and worldview are commonly shared by everyone else.

