Adversarial.io – Fighting mass image recognition (adversarial.io)
249 points by petecooper on Feb 20, 2021 | 63 comments

It's interesting that they only tackle a single model architecture (a pretty common one). It makes me think that it is likely an attack technique which uses knowledge of the model weights to mess up image recognition (if you know the weights, there are some really nice techniques that can find the minimum change necessary to mess up the classifier).

Pretty cool stuff, but also if my assumption is correct it means that if you _didn't_ use the widely available ImageNet weights for inception v3 then this attack would be less effective (or not even work). Given that most actors who you don't want recognizing your images don't open source their weights this may not scale/or be very helpful...
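
As a rough illustration of the "minimum change" idea above: for a plain linear classifier, the smallest L2 change that reaches the decision boundary has a closed form, and some iterative white-box attacks (DeepFool, for instance) linearize the network and apply this kind of step repeatedly. The sketch below assumes a made-up four-feature model (`weights`, `bias`) and input `x`. None of these values come from adversarial.io, and a real attack on Inception v3 would need the network's gradients rather than a fixed weight vector.

```python
import math

def score(weights, bias, x):
    """Linear decision function: positive means class A, negative means class B."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def minimal_l2_perturbation(weights, bias, x, overshoot=0.02):
    """Smallest L2 change that moves x just across the boundary score(x) = 0.

    The closest boundary point lies along the weight vector, at distance
    |score(x)| / ||w||. The small overshoot pushes x slightly past the boundary.
    """
    f = score(weights, bias, x)
    squared_norm = sum(w * w for w in weights)
    scale = -(1.0 + overshoot) * f / squared_norm
    return [scale * w for w in weights]

# Hypothetical toy model and input (illustrative values only).
weights = [2.0, -3.0, 1.0, 0.5]
bias = 0.1
x = [0.6, 0.2, 0.5, 0.4]

r = minimal_l2_perturbation(weights, bias, x)
x_adv = [xi + ri for xi, ri in zip(x, r)]
perturbation_size = math.sqrt(sum(ri * ri for ri in r))

print("clean score:", score(weights, bias, x))
print("adversarial score:", score(weights, bias, x_adv))
print("L2 size of change:", perturbation_size)
```

The sign of the score flips while the change stays as small as the geometry allows, and the whole computation depends on knowing `weights` and `bias`, which is exactly the white-box assumption questioned above.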

This was my first thought as well. The question is: How robust are adversarial perturbations? In other words: Given such a perturbation that was generated using one model, how well can we expect it to work on a similar model (in the sense that both models are fooled)? I would be curious to know if any research has been done on this question.

Transfer of adversarial perturbations between models is one of the main avenues of "black box" (where you don't know the target model's weights) attacks. Perturbations don't translate between models 100% of the time or anything, but many attacks are surprisingly transferable. There are also methods to make perturbations more transferable: for example, by finding an attack that is effective against an ensemble of models, you increase the chances that it will transfer to an unseen model.
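
A minimal sketch of the ensemble idea, under heavy assumptions: three hypothetical logistic-regression models stand in for the networks the attacker can inspect, and a fourth, `unseen_model`, stands in for the black-box target. All weights and the input are invented for illustration, and the models are deliberately similar, so transfer is close to guaranteed here. Real networks are far less predictable.

```python
import math

def predict(model, x):
    """Probability of class 1 under a logistic-regression model (weights, bias)."""
    weights, bias = model
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(model, x, label):
    """Gradient of the cross-entropy loss with respect to the input: (p - y) * w."""
    weights, _ = model
    p = predict(model, x)
    return [(p - label) * w for w in weights]

def sign(v):
    return (v > 0) - (v < 0)

def ensemble_attack(models, x, label, epsilon):
    """Signed-gradient step on the loss summed over every model in the ensemble."""
    total = [0.0] * len(x)
    for model in models:
        grad = input_gradient(model, x, label)
        total = [t + g for t, g in zip(total, grad)]
    return [xi + epsilon * sign(t) for xi, t in zip(x, total)]

# Hypothetical models the attacker can inspect (weights, bias).
known_models = [
    ([2.0, -3.0, 1.0, 0.5], 0.1),
    ([1.5, -2.5, 1.2, -0.2], 0.0),
    ([2.2, -2.0, 0.8, 0.3], -0.1),
]
# Hypothetical target whose weights are never used to build the attack.
unseen_model = ([1.8, -2.8, 0.9, 0.1], 0.05)

x = [0.6, 0.2, 0.5, 0.4]
label = 1
x_adv = ensemble_attack(known_models, x, label, epsilon=0.25)

p_unseen_clean = predict(unseen_model, x)
p_unseen_adv = predict(unseen_model, x_adv)
print("unseen model, clean input:", p_unseen_clean)
print("unseen model, attacked input:", p_unseen_adv)
```

The perturbation is computed only from `known_models`, yet it also pushes `unseen_model` below 0.5, which is the transfer effect in miniature.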


A lot of adversarial attacks tend to correspond to very narrow peaks though and are not robust against some very simple image transformations. By slightly disturbing the input image (e.g. with blurring or brightness/contrast changes) and seeing how the output layer changes, you can often eliminate many adversarial attacks.

A simple example would be if an image identifies as a basketball but you blur it slightly and it identifies as a cat, you might be looking at an adversarial attack.
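
The blur check described above could be sketched roughly like this. Everything here is a toy assumption: the "image" is a row of eight pixels, and `classify` is a contrived model that is deliberately oversensitive to an alternating pixel pattern, standing in for the narrow peaks mentioned above. A real detector would run an actual network and an actual image filter.

```python
def classify(pixels):
    """Contrived toy classifier: overall brightness decides the label, but the
    model is also (deliberately) oversensitive to an alternating pixel pattern."""
    n = len(pixels)
    brightness = sum(pixels) / n - 0.5
    alternating = sum(p if i % 2 == 0 else -p for i, p in enumerate(pixels)) / n
    score = 4.0 * brightness + 10.0 * alternating
    return "cat" if score > 0 else "basketball"

def blur(pixels):
    """3-pixel moving average with the edge pixels repeated."""
    n = len(pixels)
    return [
        (pixels[max(i - 1, 0)] + pixels[i] + pixels[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]

def looks_adversarial(pixels):
    """Flag the input if a slight blur changes the predicted label."""
    return classify(pixels) != classify(blur(pixels))

clean = [0.7] * 8
# High-frequency perturbation of +/-0.1 aimed at the classifier's weak spot.
attacked = [p - 0.1 if i % 2 == 0 else p + 0.1 for i, p in enumerate(clean)]

print(classify(clean), looks_adversarial(clean))
print(classify(attacked), looks_adversarial(attacked))
```

This only catches perturbations that a blur actually washes out. Attacks built to survive such transformations would pass the check.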

Not sure on published research, but anecdotally I've played with this and found it depends on the technique used. The simplest attacks tend to be model specific. This often means that an attack won't work at all on another model, or will work but less effectively (the confidence of the adversarial target will increase and the true class will decrease, but not necessarily to a degree that will change the classification). It can also depend on the dataset in addition to the model architecture.

The simpler ones don't even really work after basic transformations (like rotating, scaling, etc) on the target model, so those attacks are often brittle. But there are lots of techniques and some of them are more robust across more models and transformations. This sometimes has a tradeoff of causing the manipulation to the image to be more noticeable to the human eye.

Adversarial attacks are a bit of a cat-and-mouse game between new attacks and new attempts to find where they fail.

My feeling is that even inception trained on imagenet with a different random initialization would probably not fail (or at least be much less likely to fail) when exposed to these adversarial examples.

If these are hacking the specific, underspecified realization of the classifier (i.e. the set of public weights trained from a specific seed and visit order through the data), the adversarial examples are probably just as fragile as the classifier.

What makes me think is that, in principle, there should be similar adversarial examples for the human optical system.

Practically speaking it wouldn't fool anyone for more than a split second, not least since our input is video instead of snapshots, but it's an interesting thing to wonder about. Maybe we could build an AI which would be in most senses as smart as us, but which would be more vulnerable to such things?

Prominent examples would be optical illusions and the kind of thing you find on https://reddit.com/r/confusing_perspective/. Both of which tend to fool for longer than a split-second.

It's not entirely the same method as these "adversarial noise" inputs, but some optical illusions are pretty close in how they mess with the localized parts of our optical processing (e.g. https://upload.wikimedia.org/wikipedia/commons/d/d2/Caf%C3%A...).

We can't backprop the human vision system to find "nearby" misclassifications as easily, and presumably our own "classifiers" are more robust to such pixel-scale perturbations, but especially lower-resolution images can trip us up quite easily too (see e.g. https://reddit.com/r/misleadingthumbnails/).

See this (by now archaic from 2018) video about adversarial attacks, he explains cross-architecture attacks here https://youtu.be/4rFOkpI0Lcg?t=462

There's a theme in this discussion that ML operators will just train new models on adversarially perturbed data. I don't think this is necessarily true at all!

The proliferation of tools like this and the "LowKey" paper/tool linked below (an awesome paper!) will fundamentally change the distribution of image data that exists. I think that widespread usage of this kind of tool should trend towards increasing the irreducible error of various computer vision tasks (in the same way that long term adoption of mask wearing might change the maximum accuracy of facial recognition).

Critically, while right now the people who do something like manipulate their images will probably be very privacy conscious or tech-interested people, tools like this seriously lower the barrier to entry. It's not hard to imagine a browser extension that helps you perturb all images you upload to a particular domain, or something similar.

The adversarial ML arms race seems similar to the rest of the security/privacy arms race, where these endeavors will make recognition stronger, not weaker, similar to how any other manual red team attacks ultimately get (a) automated and (b) incorporated into blue team's automatic defenses.

Hard to see why that wouldn't be the case, esp. for techniques that are general, vs. exploiting bugs in individual models. As long as a person can quickly tell the difference, it's in the grasp of deep learning for ~perception problems, and the economics of the arms race determines the rest of what happens when..

Definitely agree that there will be cases in which a computer vision operator ships features specifically intended to create a new automatic defense.

One thing that seems unique to technologies that are mostly just statistical learning is that each new manipulation approach can basically widen the distribution of possible inputs. In particular, I'm thinking that as more obfuscation and protest technologies are made public like this, the distribution of "images of faces available for computer vision training" becomes more complex. That is to say, whenever an adversarial tool creates a combination of pixels that's never been seen before, if that "new image" can't be reduced back to a familiar image via de-noising or pre-processing, the overall difficulty of computer vision tasks increases.

All a long winded way of saying, I think for ML systems, there's a unique opportunity to "stretch the distribution of inputs" that may not exist for other security arms races.

Totally agree that the economics of the arms race(s) will be a huge factor in determining how much of an impact obfuscation and protest can have.

yeah maybe sql injection is a good analogy: bug bounties for it, then automated fuzzing, and now built-in to frameworks. there are companies and oss here, so building robustness into training sets, tf, is normal. I'm not sure if I'd bet on a new Coverity wrt VC $, but definitely r&d and smaller groups

> It's not hard to imagine a browser extension that helps you perturb all images you upload to a particular domain, or something similar.

Ideally I’d like to see something like this be part of the camera filter itself.

Why can’t Apple, if they choose to do so, just add something like this as part of their camera app itself?

If it's opensource, there's no need to wait on the business interests of a trillion dollar company to align with your wishes. Camera app developers can be made aware of it and add it to their apps. If there are app developers on HN, they can create pull requests to their favorite apps and add the feature.

That's the power of opensource.

Yes, for you and me, but not for ordinary folks..

Folks interested in this kind of work should check out an upcoming ICLR paper, "LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition", from Tom Goldstein's group at Maryland.

Similar pitch -- use a small adversarial perturbation to trick a classifier -- but LowKey is targeted at industry-grade black-box facial recognition systems, and also takes into account the "human perceptibility" of the perturbation used. Manages to fool both Amazon Rekognition and the Azure face recognition systems almost always.

Paper: https://arxiv.org/abs/2101.07922

Can't wait to read about Inception V4 being trained on adversarial.io for better noise resistance :)

Yes, the irony: this app makes image recognition even more effective.

I don't really think this is true for two reasons. First, the app doesn't add anything that someone training a classifier couldn't just do themselves. If I want to train my network against adversarial inputs, I can just generate them myself at training time on my own training data - it's not particularly helpful for me to have a bunch of adversarially perturbed but unlabeled images. Second, and more importantly, it's not sufficient to just train on adversarial examples taken from another network. You might learn to be robust against the specific weaknesses of that other network, but your new network will have its own idiosyncratic weaknesses. To be effective, adversarial training (https://arxiv.org/pdf/1706.06083.pdf) needs an adversary that adapts to the network as it trains. In other words, you need your adversary to approximate your current weaknesses at each step of training.
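
To make the "adversary that adapts as the network trains" point concrete, here is a minimal sketch on a toy logistic-regression model. It assumes invented two-feature data, and it swaps in a single signed-gradient step as the attack rather than the stronger multi-step attack such papers typically use. The part that matters is that `attack` is called with the current weights inside the training loop, instead of reusing a fixed set of perturbed images.

```python
import math

def predict(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def attack(weights, bias, x, label, epsilon):
    """Signed-gradient step against the CURRENT weights (input gradient is (p - y) * w)."""
    p = predict(weights, bias, x)
    return [xi + epsilon * sign((p - label) * w) for xi, w in zip(x, weights)]

def adversarial_training(data, epsilon, steps=200, learning_rate=0.5):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, label in data:
            # Regenerate the adversarial example every step, so the adversary
            # tracks whatever weaknesses the model has right now.
            x_adv = attack(weights, bias, x, label, epsilon)
            error = predict(weights, bias, x_adv) - label
            grad_w = [g + error * xi for g, xi in zip(grad_w, x_adv)]
            grad_b += error
        weights = [w - learning_rate * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias

def robust_accuracy(weights, bias, data, epsilon):
    """Fraction of points still classified correctly after a fresh attack."""
    correct = 0
    for x, label in data:
        x_adv = attack(weights, bias, x, label, epsilon)
        correct += int((predict(weights, bias, x_adv) > 0.5) == (label == 1))
    return correct / len(data)

# Hypothetical, linearly separable toy data: (features, label).
data = [([1.0, 1.0], 1), ([1.2, 0.8], 1), ([-1.0, -1.0], 0), ([-1.2, -0.8], 0)]
epsilon = 0.3
weights, bias = adversarial_training(data, epsilon)
accuracy = robust_accuracy(weights, bias, data, epsilon)
print("robust accuracy on the toy data:", accuracy)
```

On data this easy the model ends up robust to the attack it was trained against. The sketch says nothing about how well that carries over to deep networks or to stronger attacks.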

It’s really surprising to me how easily AI can be fooled. Maybe there is a fundamental difference between our visual system and what is represented in a visual recognition CNN. Could it be the complexity of billions of cells vs. the simplification of an AI, or something about the biology we haven’t yet accounted for?

Image recognition is currently done with so-called weak AI. It was not a popular term, but it became one some years ago so that marketing could sell machine learning as artificial intelligence.

However, machine learning is nowhere near what we would consider an AI, the equivalent of our own intelligence.

You can compare machine learning with training a hamster to jump on command. If you repeat the learning process many times, the hamster will jump. But change anything in the environment and it won't.

Machine learning is just a hamster that is trained thousands of times.

It can do one thing, sometimes quite well, but it is still only as intelligent as a hamster.

Machine learning does not aim to become an intelligence. It is just a well trained hamster. Nothing more.

It is just a fuzzy algorithm.

That is why it is so easy to fool the algorithm. I hesitate to call machine learning any kind of AI just for the reason it generates such confusion.

When it comes to developing real AI, we are currently nowhere near. However, we enjoy machine-based models that are easier to brute-force train with the processing power we have today.

There are optical illusions ...


Because there is no I in AI.

At a mile high conceptual level, AI is nothing but a program created by a computer based on the data it is provided

Which is why it is extremely easy to fool using techniques that it is not trained to handle, today, but might be able to handle tomorrow. It is a race...

Right. I’m naive in assigning more capability to these models than they possess.

I found a quote from Geoff Hinton where he talked about this last year.

From [1]: “I can take an image and a tiny bit of noise and CNNs will recognize it as something completely different and I can hardly see that it’s changed. That seems really bizarre and I take that as evidence that CNNs are actually using very different information from us to recognize images,” Hinton said in his keynote speech at the AAAI Conference.

“It’s not that it’s wrong, they’re just doing it in a very different way, and their very different way has some differences in how it generalizes,” Hinton says.

[1] https://bdtechtalks.com/2020/03/02/geoffrey-hinton-convnets-...

> Because there is no I in AI.

This. In a nutshell, every sort of algorithm we call "AI" today is a reductive pattern matcher. This limitation isn't due to computational capacity or even, IMO, algorithm design, but due to our collective lack of understanding of how intelligence itself works. We'll get there eventually, but not for a long while.

> At a mile high conceptual level, AI is nothing but a program created by a computer based on the data it is provided

Your brain is but a preprogrammed, biological computer that reacts to data obtained from its interfaces and attached peripherals.

> Because there is no I in AI.

This is a common refrain but fairly obviously untrue. It assumes there's some secret sauce in human brains that makes us "intelligent" whereas AI is "just a machine".

It's pretty clear that human brains are just programs. Extraordinarily complicated highly optimised programs, sure. But nobody has even found a shred of evidence that there's anything fundamentally different to programs in them.

Thinking otherwise is along the same lines as thinking that animals don't have feelings.

Every time there's an advance in AI the "it's not really intelligent" goalpost shifts. Clearly intelligence is a continuum.

I don't think AI is unintelligent, but human brains aren't just programs. We don't even have a definitive answer to how neuron signals are 'stored' in the brain. There is no equivalent to a memory address in there. It's an open problem. How could one say human brains are basically just programs if we don't even operate on the same hardware as programs, and the hardware we do operate on is too complex to be understood by us?

I think you might have misunderstood. When I say they're programs I mean that the answer a brain outputs could also be computed by a (sufficiently powerful) computer program. I don't mean that they're literally dealing in bits and bytes.

Artificial neural networks don't have "memory addresses" to store data in the same way that a conventional program does either. But they can still store data. GPT-3 knows the first page of Harry Potter, but if you feel through its weights you won't find any of the text. The knowledge is distributed somehow (in a way that we don't fully understand).

Despite that GPT-3 is clearly a program.

I think you are right, it would be surprising if we discovered something outside our understanding of physics (god) to give us what we perceive as consciousness. It's been hypothesized that some unknown quantum effect might provide us this ability, but there is no indication of that yet.

My general thought is, that since our brain is made of matter, it can be dissected and understood and eventually copied. Except, we are reverse engineering millions of years of evolution, which is exceedingly hard! We have had access to the information about all the proteins that make our brain cells for almost two decades, and still, their function has to be teased out in year long experiments. Not to speak of understanding the workings of the Homo Sapiens brain as a whole.

It is also an analog 'computer', the performance of which we have no perceivable chance to match, at least in the near and most likely also distant future.

> It's pretty clear that human brains are just programs. Extraordinarily complicated highly optimised programs, sure. But nobody has even found a shred of evidence that there's anything fundamentally different to programs in them.

That's not the case. Or rather, we don't know, we only have models and some are useful. In your statement there's a whiff of you having only a hammer, and everything looking like a nail.

I'm not saying I entirely disagree, only that the computer analogy is only an analogy, and has problems and detractors.

For example, this from 2005: "In cognitive science, the interdisciplinary research field that studies the human mind, modularity is a very contentious issue. There exist two kinds of cognitive science, computational cognitive science and neural cognitive science. Computational cognitive science is the more ancient theoretical paradigm. It is based on an analogy between the mind and computer software, and it views mind as symbol manipulation taking place in a computational system (Newell and Simon, 1976). More recently a different kind of cognitive science, connectionism, has arisen, which rejects the mind/computer analogy and interprets behavior and cognitive capacities using theoretical models which are directly inspired by the physical structure and way of functioning of the nervous system. These models are called neural networks—large sets of neuronlike units interacting locally through connections resembling synapses between neurons. For connectionism, mind is not symbol manipulation. Mind is not a computational system, but the global result of the many interactions taking place in a network of neurons modeled with an artificial neural network."

Google is making it nearly impossible for me to get a URL for this, here's a link I hope works:


We compare what we see against our internal model of the world; a CNN pattern matches and doesn't think critically about the result that comes out.

It's very interesting, but once you understand how these attacks are performed, not too surprising (at least the attacks that I'm familiar with, there may be others that are different). Basically (and I'm drastically oversimplifying here), there is a class of successful techniques that can run an optimization to maximize classification loss for a certain model architecture/parameter set by modulating the content of the image (for example by adding noise). This loss critically depends on having the parameters of the model in hand, and is very hard to avoid.

So it's a little more sophisticated than just adding random noise, it's adding very specific quantities of noise to very specific locations, which are based on perfect knowledge of how the predictive system (the deep model) works.
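
One well-known instance of this kind of loss-maximizing attack is the fast gradient sign method. Whether adversarial.io uses it is not stated anywhere in this thread, so treat the following purely as an illustration. It assumes a hypothetical four-feature logistic-regression "model" in place of a deep network, and the weights, input, and `epsilon` are made up.

```python
import math

def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient_wrt_input(weights, bias, x, label):
    """Gradient of the cross-entropy loss with respect to the input.

    For logistic regression this is (p - y) * w, so it cannot be computed
    without the model's parameters.
    """
    p = predict(weights, bias, x)
    return [(p - label) * w for w in weights]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(weights, bias, x, label, epsilon):
    """One signed-gradient step that increases the loss.

    Every feature moves by at most epsilon, in the direction that hurts
    the classifier most: specific amounts in specific places, not random noise.
    """
    grad = loss_gradient_wrt_input(weights, bias, x, label)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (illustrative values only).
weights = [2.0, -3.0, 1.0, 0.5]
bias = 0.1
x = [0.6, 0.2, 0.5, 0.4]
label = 1

x_adv = fgsm(weights, bias, x, label, epsilon=0.25)
p_clean = predict(weights, bias, x)
p_adv = predict(weights, bias, x_adv)
print("confidence in true class, clean:", p_clean)
print("confidence in true class, attacked:", p_adv)
```

The attacked input drops below 0.5 confidence for the true class even though no feature moved by more than `epsilon`. With a deep network the gradient comes from backpropagation instead of a closed form, but the dependence on the model's parameters is the same.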

Is this stuff interesting? Absolutely. Is it worth studying? Yes, again. Does it mean that CNNs as we know them are poor computer vision systems and fundamentally flawed? No. It's a limitation of existing deep models, and one which may be overcome eventually.

This talk from Grayhat conference last year gave me a lot of insights on how powerful but brittle ML is: https://m.youtube.com/watch?v=-SV80sIBhqY

Our CNN models are not meant to replicate the human visual system. They are a convenient mathematical tool that is quite unlike our brain. It’s not even necessary to use convolutions, though they remain the most popular method.

Switching the result from "tabby" to "catamount" is not nearly as "adversarial" as I expected. Is that really worth it?

Is the idea that it's useful if you're trying to stop targeted facial recognition of individual people?

Perhaps the algorithm they're using doesn't know that those are similar things and just knows they're different, so it optimises towards them.

What happens when the perturbed images are processed by some noise removal method? On the crude end, even something like aggressive JPEG compression will tend to remove high frequency noise. There's also more sophisticated work like Deep Image Prior [1], which can reconstruct images while discarding noise in a more "natural" way. Finally, on the most extreme end, what happens when someone hires an artist or builds a sufficiently good robot artist to create a photorealistic "painting" of the perturbed image?

There's a lot of work on compressing/denoising images so that only the human-salient parts are preserved, and without seeing this working past that I think it's better to interpret "adversarial" in the machine learning sense only. Where "adversarial" means useful for understanding how models work, but not with any strong security implications.

[1] https://arxiv.org/abs/1711.10925

Couldn't you easily infer the attacking noise by comparing the original and the changed images? Once you have the attacking noise it would be pretty trivial to beat this, no?

I also don't see how this would do much against object recognition or face recognition. More insight to the types of recognition this actually fights against would be helpful.

I think the idea is that the AI doesn't have access to the original. That being said, I'm not sure what would stop such AIs from being _trained_ on images that have been attacked.

We'll end up in a similar cat&mouse game to the one online "pirates" have been at for a long time. Developers create something to break the AI, the AI adapts because its operators figure out the noise profile, developers change the noise profile and the AI has to adapt again.

> I also don't see how this would do much against object recognition or face recognition.

That’s precisely the point, you are creating noise that humans are insensitive to, but that severely affects AI.

The idea as I understand is that if you need to upload an image (of yourself for instance), you can use this to complicate matters to AIs by uploading the modified picture.

This 2017 article "Google’s AI thinks this turtle looks like a gun[0]" made me realise that AI in the near future might need to take lethal action based on flawed data. But then I just comfort myself with the following quote:

"The ai does not love you, the ai does not hate you. But you are made out of atoms, it can use for something else."

[0]: https://www.theverge.com/2017/11/2/16597276/google-ai-image-...

I'd pay for API access for this, are there any plans for this?

As a thought experiment this is cool but from a practical perspective it’s too focused on a specific architecture and if anything adding perturbations might (slightly) help the training process.

From the thought experiment side. I think the moral implications cut both ways. Mass image recognition is not always bad - think about content moderation or the transfer of images of abuse. As a society we want AI to flag these things.

We're still in the phase where different models can play cat and mouse, but I wouldn't count on this lasting very long. Given that we know it's possible to correctly recognize these perturbed images (proof: humans can), it's only a matter of time until AI catches up and there's nothing you can do to prevent an image of your face from being identified immediately.

Just tested on some movie snapshots, doesn't seem to do the trick to me on Google Images (and the noise is very noticeable).

Shame, I thought I would be able to trick Google Images and stop giving away answers for my movie quiz game that easily.

The only anti-cheat measure that works, and only some of the time, is to flip the image horizontally. It fools Google Images a lot of the time.

I love how you can almost see a lynx in the attacking noise. I'd be interested to know if that's my brain spotting a pattern that isn't there, or if that's genuinely just the mechanism for the disruption.

I think it's unlikely that the noise will generally have any relation to the target class. I can't find anywhere they say exactly which attack method they use, so it's hard to say for certain, but all of the attacks I'm aware of don't generate noise that has any human-interpretable structure. See for example Figure 1 in the seminal paper on adversarial attacks: https://arxiv.org/pdf/1412.6572.pdf

Makes me think of those old ‘magic eye’ stereogram images that used to be so popular.

I'm skeptical - what happens when NNs are no longer susceptible to simple adversarial examples, or they take proportionally more power to compute?

I'd sooner spend the effort on legal challenges.

Absolutely the coolest project I read about this year. It will be an arms race between hiding and finding. I went through this with web and email spam.

Given that the most likely input is privacy sensitive, I would prefer a small CLI tool over uploading files to some server.

Great idea. Please consider distributing this as an open source downloadable app to avoid privacy concerns.

Begun, the AI wars have.

AI already controls the world through social media. It knows everything you do and decides what you see and when you see it.

sadly there will be a lot of time and machine power spent on this war.

Why does this page request access to my VR devices?

If my human eyes can identify a picture then, eventually, so too will algorithms. This is fundamentally a dead end concept.

> it works best with 299 x 299px images that depict one specific object.

Wow. How incredibly useful.

How do we know that mammalian visual systems aren’t fundamentally different from AI? What you predict is nothing but an assumption.

The poster is not wrong. This attack only works against a specific model trained against a specific dataset. It does not work against models that understand that an animal face is made up of sub features like a nose, two eyes and a mouth.

In the end, this is a completely useless exercise and will not have any impact on mass image recognition. For this to even work the attack needs to be tailored to the exact weights in the neural network that is being attacked.
