
Is It a Duck or a Rabbit? For Google Cloud Vision, Depends on Image Rotation - _0nac
https://www.reddit.com/r/dataisbeautiful/comments/aydqig/is_it_a_duck_or_a_rabbit_for_google_cloud_vision/
======
ratel
I'm quite surprised at the comments on HN so far, as nobody seems to see the
significance of this. Yes, the image is ambiguous. The point is that Google
Cloud Vision gives an unambiguous answer about that image based on its
rotation. Transformations of an image are regularly used to improve the
results of image recognition. That process fails quite dramatically if, in
the course of a transformation, the answer is given with higher confidence
than it should be.

~~~
gambler
I'm glad that at least someone here sees the problem, but I am not surprised
by the typical reaction of AI apologists in this thread. You always get at
least one of these two responses:

"OMG, this is amazing, it's just like humans. We're probably close to AGI."

"Ha-ha, humans are stupid, so the algorithm giving unexpected result is just a
proof that it's better and less biased."

Here, we have both in response to the same demo.

Still, I honestly don't know why some people are so biased in favor of neural
nets and have zero interest in edge cases and flaws (the most interesting
parts if you want to gain deeper understanding of how the algorithm actually
operates). Wishful thinking, I guess.

~~~
hn_throwaway_99
Can someone explain why this is a problem? I'm not an "AI apologist", but I
would consider it a good thing that the model pegs it as a rabbit when it is
in more of a "rabbit orientation" and a duck when it is in more of a "duck
orientation".

~~~
gambler
In the real world, you don't want AI to instantly flip from 90% confidence in
one direction to 90% confidence in the other direction, because it would cause
erratic behavior. What would be preferable is a large zone where it gives both
labels a score of about 0.45. Then you can apply higher-level reasoning based on the
possibility that the object could be _either_ of those two labels (i.e. act on
the possibility of the most dangerous or most beneficial scenario of the two).
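The "ambiguous zone" idea can be sketched in a few lines. This is a hypothetical post-processing step, not anything the Cloud Vision API provides; the `decide` function and the `margin` threshold are made up for illustration:

```python
# Hypothetical post-processing: treat near-tied scores as "ambiguous"
# and keep both hypotheses instead of committing to the argmax.

def decide(scores, margin=0.2):
    """scores: label -> confidence, e.g. {"duck": 0.48, "rabbit": 0.45}."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top_label, top), (runner_label, runner) = ranked[0], ranked[1]
    if top - runner < margin:
        # Ambiguous zone: hand both labels to higher-level reasoning,
        # which can act on the more consequential of the two.
        return {"ambiguous": True, "candidates": [top_label, runner_label]}
    return {"ambiguous": False, "label": top_label}

print(decide({"duck": 0.90, "rabbit": 0.10}))  # commits to duck
print(decide({"duck": 0.48, "rabbit": 0.45}))  # keeps both hypotheses
```

The point is the large flat region in the middle: nothing flips from 90% one way to 90% the other just because the input nudged across a boundary.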

~~~
karakrakow
This. Maybe the AI needs to know somehow that it's the same image it saw a
few seconds ago, but now presented at a different angle. Or the AI itself
should look at an image from different angles just to be sure; that is how
humans sometimes look at pictures when they are confused. Something to
contextualize every image with respect to what it recently saw would make the
current decision a little less overconfident.

~~~
angry_octet
CV usually does look at images with different rotations, scalings, and
stretches. The point is that GCV doesn't get stuck on "duck" when the duck is
upside down; it behaves the same way as a human classifier.

I'm looking forward to ReCAPTCHA asking "Click the images of things that are
upside down."

------
minimaxir
Creator of the animation here. Most of the relevant information/context behind
the animation (including a link to the repo) is in this Reddit comment:
[https://reddit.com/r/dataisbeautiful/comments/aydqig/_/ehzyo...](https://reddit.com/r/dataisbeautiful/comments/aydqig/_/ehzyozr/?context=1)

~~~
minimaxir
To answer the question _why_ I made the animation: there isn't an ulterior "I
found an AI gotcha!" motive, I saw a tweet where the API returned different
things depending on orientation and expanded on it. It was also an opportunity
to test a few animation hypotheses via gganimate.

------
abhisuri97
When the output switches to rabbit the picture actually resembles a rabbit. I
am unsure if this experiment was supposed to be a “haha look how stupid AI is”
type thing or not, but it seems like the cloud vision api is performing as
intended.

~~~
mikejb
The interesting part (IMO) isn't that the AI classifies it as both a rabbit
and a duck, but that the classification is dependent on the rotation of the
picture.

I can somewhat understand how that happens, but I find it an interesting
observation (rather than a criticism of the system, though the title is
somewhat unclear and had me expecting something else).

~~~
tanilama
But rotation does contain information. 6 and 9 can be considered a case where
rotation SHOULD change the classifier's output.

AI does make dumb predictions from time to time, but in my opinion this isn't
that strong a case. When it's rotated upside down, it does look like a rabbit
even to me.

The more interesting 'failure' here, to me, is that while the rotation is
smooth, the prediction is not; instead it flickers, which does raise some
interesting questions about what the model's internal distribution surface
looks like.

~~~
salty_biscuits
But only with context right? It is completely ambiguous if the character is 6
or 9 without some other clue, like which way up the paper is, or from the 4
next to it (if you assume that an arbitrary rotation may have occurred). It is
just a sign to me that doing some rotations as data augmentation is not good
enough. Rotation invariance needs to be built into the architecture of the
network (like translation invariance sort of gets in via max pooling). I think
it should be giving a 50/50 classification of duck and rabbit at all rotations
if it were working as expected.
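A cheap way to approximate rotation invariance without touching the architecture is to average predictions over rotated copies of the input at inference time (test-time augmentation). A minimal sketch; `classify` here is a made-up stand-in for a rotation-sensitive model, with the "image" reduced to just its rotation angle:

```python
def classify(angle):
    # Stand-in for a rotation-sensitive model: confident "duck" in one
    # half of the circle, confident "rabbit" in the other.
    duck = 0.9 if angle % 360 < 180 else 0.1
    return {"duck": duck, "rabbit": 1.0 - duck}

def rotation_averaged(angle, steps=(0, 90, 180, 270)):
    """Average label scores over rotated copies of the input."""
    scores = [classify(angle + s) for s in steps]
    return {label: sum(s[label] for s in scores) / len(scores)
            for label in scores[0]}

print(rotation_averaged(0))  # close to 50/50 duck vs. rabbit
```

With 90-degree steps, two of the four copies always land in each "half" of this toy model, so the averaged output sits near 50/50 regardless of the starting angle, which is the behavior argued for above.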

~~~
devin
How often do you read upside down?

~~~
salty_biscuits
All the time, especially when I read to my kids.

~~~
swixmix
Your kids might enjoy this cartoon.

[https://youtu.be/9-k5J4RxQdE](https://youtu.be/9-k5J4RxQdE)

------
WhuzzupDomal
It's a seagull, not a duck. Don't confuse the dumb AI even more when it
doesn't know what a duck looks like in the first place. Jeez.

~~~
randomsearch
No such thing as a seagull, as my ornithologist friend likes to remind me.

------
Illniyar
That image is a visual illusion. I myself find it hard to detect that it's a
rabbit when its ears are horizontal like a mouth.

Not sure what the purpose of it is. Is it to show that even computer vision
algorithms can get confused by visual illusions?

~~~
2muchcoffeeman
I find your response more interesting than the experiment.

You don’t just rotate the image in your mind, or focus on specific features
to bring out the duckiness or rabbit-ness? I can make it more duck or more
rabbit at will.

------
Felz
Is it concerning that there are short, sudden drops in prediction in the
middle of a block otherwise solidly classified as rabbit/duck? I don't know
much ML; does anyone know why it'd be so discontinuous?

~~~
minimaxir
Specifically, those drops are where the top/bottoms of the image are _very
slightly_ cropped out.

When making the animation I didn't _intend_ for the occlusion, but the fact
that the occlusion causes the prediction to _drop to zero_ is itself an
interesting data point. Many objects in real life are occluded.
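That sensitivity could be probed deliberately, in the spirit of occlusion-sensitivity analysis: re-classify the image with growing strips of the border removed and record how the score reacts. The `classify` function below is a fabricated stand-in (a real test would call the actual API); only the probing loop is the point:

```python
# Fabricated stand-in: scores "rabbit" high only when the full 8x8 frame
# is present, mimicking a model that collapses under slight cropping.
def classify(image):
    return {"rabbit": 1.0 if len(image) == 8 else 0.05}

def crop_top(image, rows):
    """Remove `rows` pixel rows from the top edge."""
    return image[rows:]

image = [[0] * 8 for _ in range(8)]  # dummy 8x8 image
scores = {rows: classify(crop_top(image, rows))["rabbit"] for rows in range(3)}
for rows, score in scores.items():
    print(f"{rows} rows cropped -> rabbit score {score}")
```

Sweeping the crop like this over a real model would show whether the drop-to-zero behavior is a cliff at one pixel of occlusion or degrades gradually.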

------
stared
While the title is clickbaity (as in adversarial examples for fooling neural
networks, e.g. by adding a baseball to a whale to make it a shark), I think
it shows a nice phenomenon: a given illusion works similarly for humans and
AI alike.

Vide "dirty mind" pictures like posting
[https://images.baklol.com/13_jpegbd9cb76b39e925881bdb2956fd3...](https://images.baklol.com/13_jpegbd9cb76b39e925881bdb2956fd32ac91.jpeg)
to Clarifai [https://clarifai.com/models/nsfw-image-recognition-
model-e95...](https://clarifai.com/models/nsfw-image-recognition-
model-e9576d86d2004ed1a38ba0cf39ecb4b1) gives 88% for NSFW.

~~~
wodenokoto
Clickbaity? The title is downright misleading.

~~~
IanCal
I'm a little confused, it seems quite accurate to me. It's the famous
duck/rabbit rotation illusion and the google cloud vision API returns
different results depending on rotation.

What do you find misleading about it?

~~~
stared
See e.g. [https://towardsdatascience.com/breaking-neural-networks-
with...](https://towardsdatascience.com/breaking-neural-networks-with-
adversarial-attacks-f4290a9a45aa)

~~~
jasode
I wasn't the one who downvoted you but I agree with IanCal and this blog title
didn't trigger my "clickbait" sensor.

I disagree that this is an example of "adversarial attack". The famous
duck/rabbit illusion has been around since ~1892[1] and therefore was not
deliberately constructed to be an "adversary" to image classification neural
networks.

To me, it's an interesting example of feeding a well-known optical illusion to
an AI algorithm and observing its behavior.

[1]
[https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion](https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion)

~~~
stared
Yes, I agree with you. (No idea who downvoted me or why, either.)

This (well known) illusion is NOT an adversarial example. Though I can
explain why, for people working with AI (e.g. me), the title sounded like it
was referring to an adversarial example: there are plenty of examples of
"just rotate and a vulture becomes an orangutan" where it does not look like
an orangutan to humans.

Vide: "A Rotation and a Translation Suffice: Fooling CNNs with Simple
Transformations"
[https://arxiv.org/pdf/1712.02779.pdf](https://arxiv.org/pdf/1712.02779.pdf)

------
Criper1Tookus
It would be cool to visualize this as a kind of pie chart, based on where the
ears/beak is pointing. Blue for directions where it sees duck, red for rabbit,
and empty for neither.

------
dusted
Looks like proof to me that the classification works correctly.

------
foota
I wonder whether it would stay consistent if you gave it a solid background
line.

------
miguelmota
This seems like a serious concern. What's a possible solution to this
problem? Should all orientations be considered valid types, so that in this
case the response would be both duck and rabbit?

------
EugeneOZ
On the still (not animated, not rotated) preview I saw the rabbit first;
then a second later I found it can also be a duck, and now it takes effort
to see the rabbit again (but I can do it).

------
Gunstig2Snath
I was ONLY seeing clockwise in all images until the counter-clockwise one went
about 8 rotations and all of a sudden I saw it counter-clockwise. Now I can’t
unsee it.

------
ChlorophZek
When I look at the anticlockwise one I can see it as going either direction.
When I look at the clockwise one I can only see it going clockwise.

------
iscrewyou
Does Google Cloud like Duck or Rabbit? That’s where the answer lies.

In addition, if Cloud could taste one, it would really help itself with the
answer.

------
AlphaWeaver
I wonder if this was hardcoded/specifically trained to do this for this image?

~~~
usrusr
The image was created specifically to fool humans, it's one of the classics in
the optical illusion genre. It would be a shame for ML if it required specific
trickery on both ends to reach that outcome.

------
mrashes
There is a children's book about this pairing: [https://www.amazon.com/Duck-
Rabbit-Amy-Krouse-Rosenthal/dp/0...](https://www.amazon.com/Duck-Rabbit-Amy-
Krouse-Rosenthal/dp/0811868656)

------
Seldaek
But does it see a blue or a white dress?

------
MoD411
It cannot be a rabbit because there is no nose or mouth.

~~~
SmellyGeekBoy
But how does it smell?

------
zavi
WAI

------
randomsearch
It’s a drawing of a creature that looks a bit like a rabbit or a duck from
different angles but is very clearly neither, at best a bad drawing. That’s
the failure here - it’s classifying into one of its categories when it
shouldn’t be classifying at all.

~~~
mikejb
It's one of those optical illusions, where we (humans) do recognize animals. I
don't think it's reasonable to limit classifications to the specific case
where a drawing perfectly resembles a certain animal.

The interesting part (IMO) isn't that the AI classifies it as both a rabbit
and a duck, but that the classification is dependent on the rotation of the
picture.

~~~
Sharlin
I don't find it super interesting. The orientation distinction is clearly in
the training data, and making the algorithm completely rotation invariant
would likely a) be more difficult and b) result in worse classification
compared to us humans, who very much use orientation as a cue, having evolved
in an environment with distinct "up" and "down" directions.

~~~
krehl
As the AI has likely not seen anything remotely similar during training, it
is quite interesting that it is detecting the animals. As the picture was set
up to confuse humans, does it somewhat show that the representation the AI
learned is similar to the one humans have?

~~~
Sharlin
The AI has seen ducks and rabbits. It has built a good enough internal model
of "duckness" and "rabbitness" that even more abstract renderings of ducks
and rabbits activate the relevant parts of the net. I mean, that's exactly
what we want them to do! Figure out the aspects of the input data salient to
the classification task and ignore the irrelevant parts.

That said, the response here is likely filtered and normalized to only
include "duck" and "rabbit" classifications; after all, the bird looks much
more like a seagull than a duck.

------
laichzeit0
This is the infamous Duck-Rabbit illusion, right? The classifier seems to be
doing a good job.

[https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion](https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion)

