Quick, Draw (withgoogle.com)
1083 points by madmax108 on Nov 16, 2016 | 328 comments

This is the best "visualization" / "explanation" of the possibilities and limits of AI that I've seen.

I can show this to someone and say:

1. The software can recognize a feather, as long as it looks similar to what it thinks a feather looks like.

2. The software can't recognize a feather if it's never seen a feather like that. It's not a sentient being.

This is good, because most examples focus on point #1 and -- if enough marketing is involved -- don't go enough into point #2.

People read news articles like "X can recognize cats in a picture with Y certainty!" and are quick to assume that this "AI" can make sense of a picture and understand it, when all it does is apply certain methods for a certain use case.

This does a much better job by letting people write (or draw) their own test cases and figure out the limits intuitively.

> 1. The software can recognize a feather, as long as it looks similar to what it thinks a feather looks like.

I was prompted to draw a hurricane. I drew something that looked like the typical hurricane doodle used on news reports.

The software didn't recognize it.

When the game was over and I was able to look at all of the doodles that were used to train the software to recognize a hurricane ... the majority of them instead looked like tornadoes!

So maybe we should more precisely say:

1. The software can recognize a feather, as long as it looks similar to what the humans who contributed its training set think a feather looks like.

My hurricane was just terrible. I ended up with a scribbled mess because it came up in the first set or two, I didn't really have a plan, and I drew components of a hurricane as I remembered them.

I'm also ashamed to admit I drew some less than ideal stuff due to forgetting details on things and then panicking because of the timer. Like the spots on a panda's face for some odd reason.

Hopefully my drawings were treated as outliers.

Apparently most players of this game didn't see the "carrier" part in "aircraft carrier" and just drew airplanes. Probably because of the time constraint.

Or maybe they mistook it for "carrier aircraft", as in "cargo plane".

Which is actually a pretty big win. After all you could also say this:

1. The person can recognize a feather as long as it looks similar to what the other people who contributed to its learning think a feather looks like.

I was asked to draw "brush". I drew a hairbrush. It had been trained to see "brush" as a bunch of circles or trees.

Are you sure that wasn't "bush"?

...or something for sweeping the floor

> When the game was over and I was able to look at all of the doodles that were used to train the software to recognize a hurricane ... the majority of them instead looked like tornadoes!

Idiocracy was prophetic -- except it missed the aspect that "Idiocracy" would first manifest on the Internet.

The premise of Idiocracy is actually false; IQs have been rising over time.

Citation needed. IQs are defined such that the average IQ is always 100. https://en.wikipedia.org/wiki/Intelligence_quotient

So the internet is really just revealing more of the idiocy that's always been there.

You are really stretching to find a way to feel superior to people.

Alas, if I had to stretch. Basically, there's rampant prejudice and anti-intellectualism from all points on the political spectrum. The response to the enabling of trolling by anonymity is an upsurge of authoritarianism by (of all people) many on the Left. The Right? Not much better.

If I had a dollar for every time someone pattern-matched me or a phrase I wrote, then jumped to conclusions about my ideas or internal emotional state, then insisted I was lying when I tried to disabuse them of the notion -- I'd have a whole lot of dollars. (Hint: if you start sniffing around trying to justify that they're right, I haven't left you sufficient evidence and you're probably also doing that.)

Apolitical people? Mostly just as bad.

It exhibits gender disparities very nicely too.


Yes, an absolutely classic example of implicit biases in training sets.

On the one hand, the network should eventually learn to classify high heels as shoes.

On the other, when these classification systems actually get used, they're always at some arbitrary point in their training, so you can't just wait for "all the biases to go away."

Erm... high heels are not the only kind of shoes women wear. They're not even the most common kind of shoes women wear. Pointing to this as a 'gender disparity experience' is showing your own bias. Yes, high heels are shoes and it should learn to recognise them, but most women don't actually wear them most of the time.

Makes sense to me. The training data set focuses on generic, gender-neutral shoe examples instead of highly gender-specific ones.

There is another take on this issue: it's not that the shoes are gender-neutral, it's that "male" is neutral. This essay explores that take: http://www.tarshi.net/inplainspeak/marked-women-unmarked-men...

Sit in a shopping centre, movie theater lobby, or even just out on the street. Watch the shoes of the women as they stroll by[1], and you'll find very few of the ridiculously high heels pictured in that tweet. Claiming that tweet's shoe is the typical women's shoe is a laughably erroneous stereotype.

[1] Not just the young fashionistas that specifically dress up, but every woman.

I don't think that's true for shoes. The male equivalent to high heels would be dress shoes, and women wearing male dress shoes would be weird and unusual. The examples shown appear to be casual or athletic shoes, which are indeed neutral.

You might be able to explain it, but it still shows that it's wrong. (Though I disagree that these shoes are gender neutral; only ~5% of the shoes in my household look like "gender-neutral" shoes, and they're all mine)

There's their problem: they had Al Bundy train the AI. How else do you get from shoe to whale in only three pictures, with one involving food?

Can we please keep gender-identity discussions off Hacker News?

The comment was pointing out a specific example about how an AI miscategorized something because of a small sample-size in data, something that has been shown to be often the result of unintended biases in the training set, and you say that we're just talking about "gender identities"??

This is the kind of thing AI researchers write papers on (source: AI MSc), not some SJW topic, yet you saw the word "gender" and assumed it didn't belong?

Humans usually can't do your 2. either. In some cases, people may be able to recognize things based on descriptions alone, but those are typically simple combinations of known entities.

For recognizing relatively simple entities, are there advantages humans still have over neural nets (assuming the same scope of knowledge)?

My 3-year-old can definitely recognize a cat in an abstract drawing of a cat that is unlike any cat he has seen before.

Humans are great at learning abstraction from concrete examples. That's also what deep learning does, and it's the big reason for its success as well. I'd guess that some neural net architectures can do the same with your cat example (perhaps with adaptation). Can any expert weigh in?

An idea: We can also run several cat photos through image processing algorithms to filter out details. The output would be outlines similar to the drawings in the Google Quickdraw app. We put those through the app to generalize (perhaps the app needs some training with a few categories of objects, not necessarily animals). Voila! Software can now recognize drawings based on photo examples.
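A toy version of that pipeline fits in a few lines. This is only a sketch of the idea, not anything Google's app actually does: `extract_outline` is a made-up helper that applies a crude gradient-magnitude edge detector to a synthetic "photo".

```python
import numpy as np

def extract_outline(photo, threshold=0.25):
    """Reduce a grayscale photo to a doodle-like outline by keeping only
    pixels where intensity changes sharply (a crude edge detector)."""
    gy, gx = np.gradient(photo.astype(float))
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.uint8)

# A synthetic "photo": a bright filled square on a dark background.
photo = np.zeros((10, 10))
photo[3:7, 3:7] = 1.0

outline = extract_outline(photo)
# The flat interior disappears; only the border survives, like a doodle.
assert outline[5, 5] == 0  # interior: no edge
assert outline[3, 4] == 1  # border: edge detected
```

A real pipeline would use a proper edge detector (Canny, for instance) on photographs, but the principle is the same: throw away shading and texture until only the outline is left, which is roughly the domain the doodle classifier was trained on.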

> Humans are great at learning abstraction

Of course, there's severe bias here, in the sense that what we consider abstraction is by definition "human shaped" abstraction

If multiple humans try to "abstract" a cat, the overlap in underlying processes will be pretty big, making it more likely that we can recognise each other's abstractions.

> Of course, there's severe bias here, in the sense that what we consider abstraction is by definition "human shaped" abstraction

I can read the words here, but I don't understand the meaning.

We abstract to find a common set of features in things that are supposed to be the same but that are not present in things that are not supposed to be the same. Grouping these features then produces higher level abstractions, and so on.

Where would the bias be?

Even if the features differ, the process is the same.

And even the features are often the same. If you reverse a DCNN to see what it uses to classify things as "cats", expect to see whiskers and fur.

Think of Bugs Bunny. He looks nothing like a real rabbit, yet humans recognise him as a rabbit (presumably) because we look at the characteristics that separate him from a normal human, then compare those characteristics with our list of things with those characteristics (long ears, big feet, eats carrots) and get a rabbit. If he'd been made to look like a rabbit-octopus hybrid instead of rabbit-human, we might have struggled more.

Computers don't look at things from a human perspective; they're still good at abstraction, just different to human abstraction. i.e. there's a human bias in there.

That's OK though; the objective is to make a computer that sees things the way people do; so it's a bias we want.

However the issue isn't that the computer's not a sentient being and therefore can't abstract things it's never seen before; only that the algorithm hasn't been written to sufficiently take account of human bias.

I think the word you're looking for is "familiarity", insofar as it describes a particularly efficient means of recognition. E.g. humans have become pretty good at identifying cats.

I don't see a fundamental difference between biological and electronic neural nets, so please take the following with a physicalist grain of salt. Imho, precisely because NNs will be fed with nothing other than the reality (physical or virtual) we live in, they should gradually develop the same familiarity as humans have; i.e. nothing more and nothing less than elements of our lives/civs. Visually, lots of cats, lots of cars, mountains and coasts; functionally, all the tasks we accomplish daily, like driving or cooking or cleaning.

I don't really think you can hard-code "human bias" as it's an emergent property of our biology: too complex (we don't really understand much of it; imho you're bound to miss the mark and induce subjective biases), and somewhat contradictory to how NNs are supposed to evolve (thinking long term here). Basically, I don't think it would be practical or cost-efficient to induce too many perturbations in deep learning; better to work on refining the process itself. Think of plants: you can tweak the growing all you want, but the root deciding factors lie in genetics (their potential, and in understanding how to maximize it).

I realize another wording is that we should apply sound evolutionary (Darwin etc.) principles in "growing" AI at large. Because AI and humans share the same environment, we should see converging "intelligence" (skills, familiarity, etc). It's a quite fascinating time from an ontological perspective.

It's interesting to think about what the limits of an AI that doesn't have a full human experience are. I think you're probably right that machine vision will be competitive with human vision. It's already much better in specialized areas.

General purpose machine translation is harder, for instance. Brute force algorithms have gotten decent, but aren't in the same ballpark as humans (though professional translation services now often work by correcting a machine translation). However, MT systems trained on a specific domain do much better (medical or legal docs, etc).

What would be the hardest task for machines that's trivial for humans? Maybe deciding if a joke is funny or not?

Perhaps not the hardest, but one where there's tons of room for improvement: the Story Cloze Test [1] is a test involving very simple, five-sentence stories, where you pick the ending that makes sense out of two endings.

A literate human scores 100% on this test. No computer system so far scores better than 60%. (And remember that random guessing gets 50%.)

[1] http://cs.rochester.edu/nlp/rocstories/

Interesting study; whilst it's possible to guess which ending is expected as correct, the alternate could be easily argued. For example, in the case of Jim's getting a new credit card, I recall during my uni days many students took that exact approach to debt...

Good point; I'd not considered whether the human imprint would be down to familiarity (individual's) or in-built through evolution (inherited familiarity); likely a combination of both. In fact, I recently read that chimpanzees raised by humans are believed to identify as human rather than chimp; so individual familiarity does seem stronger.

The book, "We are all Completely Beside Ourselves" is fiction, but refers to findings from real studies.

Hmm, I always assumed Bugs Bunny was a hare.

You implicitly (and I think without realising) presume objectivity + complete knowledge in the observer.

Human perception is heavily biased towards features that had evolutionary advantages, and limited by whatever technical flaws our eyes/brains/etc have. That's a selection bias in our perception of information, in our processing of said information, and therefore in the abstractions that result from it.

I agree with what you say, but it doesn't support your earlier statements.

I presume it's possible that the limitations of our visual system means we may miss powerful features and hence the ability to build some more powerful abstractions. (I didn't even argue this, just pointed out the process is the same even if features differ)

But I don't see how this supports your original claim of bias, which was: "If multiple humans try to "abstract" a cat, the overlap in underlying processes will be pretty big, making it more likely that we can recognize each other's abstractions."

If humans are good at recognizing each other's abstractions, that's a validation that low-pass (for lack of a better term) filtering the features due to humans' physical design still creates very good abstractions and classifiers. That is to say, if anything you're confirming that humans are designed in a way that makes the abstractions they can make maximally useful.

"you're confirming that humans are designed in a way that makes the abstractions they can make maximally useful."

... to other humans.

What's the meaning of this?

Are you arguing that the classifications themselves are biased?

That's exactly what I and others have been arguing. Now to be clear: it's not that these classifications are wrong, just that out of all possible classifications we could have found, we will most likely find the ones that fit the human perspective of the world.

Think of the Turing test and its criticisms; it's kind of has the same issues.

PS: I've upvoted every comment of yours; asking questions like this should be encouraged :)

Thanks for taking the time to explain your argument!

Confirmed; thanks vanderZwan.

Classifications are also dependent on the capabilities of the language they are expressed in: https://en.wikipedia.org/wiki/Linguistic_relativity

> still creates very good abstractions and classifiers.

My point is that "good" and "bad" are not objective here, but depend on human use-cases.

Now to be clear: I'm not disagreeing with you! These are good abstractions, for humans. It lets us communicate concepts easily, which is great! But it might not be the best abstraction in every circumstance.

For example, I recall reading an article that said that AI is better at spotting breast cancer from photos (which is essentially interpreting abstract blobs as cancer or not). The main reason seems to be that it is not held back by the human biases in perception.

Cats are probably a particularly unfortunate example to use in comparing abstraction-forming capabilities, as given our history it's highly likely that we come supplied with some dedicated cat-recognition circuitry.

Humans have a bit of an advantage on two levels here. First, we know what a cat looks like. Not a video or a picture or a drawing, but an actual cat. That gives us a solid frame of reference. "That is definitely a cat. That drawing looks kind of like what I know a cat to look like, so it's a drawing of a cat." The closest a computer can get is "This drawing has quite a bit in common with these other drawings, and apparently these other drawings are cats. So this is probably a cat too."

Second, when we look at a picture of a cat, we're looking at a human's interpretation of what a cat looks like. If we asked a computer to draw a cat, it might look nothing like a cat to us, but another computer could look at it and go "Oh sure, that's a cat." I seem to recall Google did a thing with this a while ago, where they effectively created a feedback loop in a neural net - feeding its own drawing back into itself. As I recall, the result looked like the computer had done way too much LSD.

Basically: you are right.

Can you sketch an example of such a drawing? I'm having a hard time imagining something that looks enough like a cat to be recognized as such but unlike any cat a three-year-old has ever seen before.

I'd say that misses both my criteria: it looks just like lots of cat drawings any three-year-old has been exposed to, and it also seems like an image Google would have no trouble recognizing as a cat.

Again, I think first-world children over the age of 3 have been exposed to plenty of drawings like that, and also, Google can recognize it as a cat anyway -- in fact, it even knows which cat; do an Image Search and you'll see, "Best guess for this image: garfield meme"

Your criteria were "looks enough like a cat to be recognized as one, but unlike any cat that a 3-year-old would have seen before".

Google doesn't recognize it as a feline, it recognizes it as Garfield.

I doubt that webpage is as smart as a 3-year-old.

Do you really think any reasonable person is going to mistake this couch for a cat?


The software does:


Sure, you can fool a human. But there are things AI is missing that would be embarrassing if a human made the same mistake. It's hard to say, based on anecdotes like this, how big that gap is, but it's there.

>Humans usually can't do your 2.

I think we do. We see a building we've never seen before and we know it's a building because it has certain features that we use to classify it as a building. The examples aren't scarce.

I also think a good indicator of us doing it is our use of "-y", "-ish", and "sort of".

As for sthlm's point 2:

>2. The software can't recognize a feather if it's never seen a feather like that. It's not a sentient being.

This is Asimo in 2009:


When it comes to abstraction from a simple rendering – no shading, no sense of depth, no discernible dimensions – it's hard to extrapolate features.

I feel there is an immense difference between recognizing simple sketches and deriving what an object is based on extended characteristics.

The video you linked furthers that by showing that ASIMO was using three-dimensional observation to calculate certain features and ascertain what that object was.

The abstract drawings benefit a lot from the limited selection and the huge implicit context.

If you gave these doodles to people who are not Western males, it would do a lot worse. Someone already pointed out it doesn't recognize women's shoes.

Humans frequently misrecognise sketches too.

If you've ever played pictionary you'll see the level of abstraction we can manage is remarkable too.

Familiarity with teammates may factor into that as well, partially from having unspoken frames of reference to infer from.

It is unmistakable how much the difficulty level ramps up when you're paired with those of an unlike-nature to you. Sometimes that level of abstraction is taken way outside of generic context clues.

It is, but we've also had decades of practice. What scares me the most about AI isn't how advanced computers can become, but how slow we are to learn in comparison.

Actually, what I had in mind when mentioning that was playing a game I coined "foot pictionary" (we've also played "blind pictionary") with kids aged ~6 to 10.

We use very generic "words" (eg egg, tree, bike, cloud, plate).

When you're using your foot to draw, you really have to distill down to the essence of the item. Yes, there is a good deal of guessing, but in some way the image (however unlike the object) has to have some element of the Platonic nature, if you will, of the object being drawn.


You're just wrong on this one. Humans can recognise a lot of things that aren't in the form they're used to. This has seen a lot of research in psychology.

As for advantages over neural nets, one of the primary ones is that humans can recognise things from unusual angles much more easily. When I tried QuickDraw and doodled things from non-stereotyped angles (like a three-quarter view of a car rather than the usual 2D side view), it had no idea.

The Dalmatian optical illusion[1] is another example of the human ability to pick out patterns and assign them to certain objects. Neural nets have different abilities, and are sometimes better at picking out different sorts of patterns than humans.

[1] http://cdn.theatlantic.com/assets/media/img/posts/2014/05/Pe...

> 2. The software can't recognize a feather if it's never seen a feather like that. It's not a sentient being.

Why did this word "sentient" sneak in to your comment? I don't see what "sentience" has to do with what you just described; it's just a more sophisticated form of pattern matching.

"See, it can't do this! It's not self-aware!" is almost never the correct answer, because whatever thing it is you want to do will probably be solved in the future with more of the same techniques. Just about the only thing "sentience" or self-awareness is good for is an entity's private experience, which you wouldn't ever be able to see anyway.

I think sthlm's thinking that people as in:

>People read news articles like "X can recognize cats...

may assume sentience when it's not there

I don't think "sentience" is a sufficiently precise term to enable us to judge whether it's there or not.

"Doesn't look like anything to me"

Highly relevant to your two-point example:


>The software can't recognize a feather if it's never seen a feather like that. It's not a sentient being

Like humans brains?

>are quick to assume that this "AI" can make sense of a picture and understand it, when all it does is apply certain methods for a certain use case.

Like human brains?

No, not at all. If you only showed it a bunch of pickup trucks in various colors, it would be really good at identifying pickup trucks. But if you then showed it a Prius, or a motorcycle, it would have no idea that it was looking at a vehicle. A human brain wouldn't have much trouble with that, though, because it associates more information with the vehicle idea than just statistical similarity to previously seen shapes, and can extrapolate without having direct previous experience with the object being seen.

If you showed a small child 10 pictures of pickup trucks and told them "These are cars" then showed them a motorcycle and said "What is this?" what do you expect to happen?

Remember, this child has never been on the road, never driven a car, never had the mechanics of locomotion taught to them. All they know is that objects that are longer than they are tall with a flat bed on one side and wheels on the bottom are classified as cars.

Once the child (or machine) has more information to associate with the 'vehicle idea' it can call on this information when it sees shapes that are also associated with the 'vehicle idea' in order to extrapolate without having direct previous experience with that object being seen.

Trucks are generally not classified as cars, nor are motorcycles. These are all types of vehicles, per my original terminology. I actually did a similar experiment with my friend's daughter (3 years old) and she was able to figure it out just fine. Humans are generally able to extrapolate that things with wheels move, and if they have a seat, it's meant for someone to sit on, while it's moving. Hence a vehicle. It's this level of conceptual understanding and "how would this thing work" thinking that ML lacks in comparison to human brains. People use more than just sight recognition to identify new objects, while current ML models do not.

Maybe some current implementations lack the ability to make these connections, but it is in no way even a small stretch to conceive of a machine that understands "wheels are for moving", "seats are for passengers", "things that have both wheels and seats are probably vehicles".

So when that machine learning algorithm recognizes wheels in a picture and recognizes seats in the same picture, it searches for results that include both wheels and seats.

The human brain does not inject any magic in to this process.
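The composition rule this comment describes can be sketched in a few lines. Everything here is hypothetical: `PART_AFFORDANCES` and `looks_like_vehicle` are invented names, and part detection is stubbed out with hand-written labels, since the point is only how detected parts compose into a concept, not the vision itself.

```python
# Hypothetical sketch: compose detected parts into a "vehicle" guess.
# In a real system the part labels would come from an image classifier;
# here they are written by hand, since only the composition rule matters.

PART_AFFORDANCES = {
    "wheel": "moves",
    "rotor": "moves",
    "hull": "moves",
    "seat": "carries a passenger",
    "saddle": "carries a passenger",
}

def looks_like_vehicle(parts):
    """Something that can move and carry a passenger is probably a vehicle."""
    affordances = {PART_AFFORDANCES.get(p) for p in parts}
    return "moves" in affordances and "carries a passenger" in affordances

assert looks_like_vehicle(["wheel", "wheel", "seat"])   # motorcycle-ish
assert looks_like_vehicle(["rotor", "seat"])            # helicopter-ish
assert not looks_like_vehicle(["wheel", "wheel"])       # a lone pair of wheels
```

Note how the hovercraft/helicopter objection from the sibling comment only bites if the affordance table is missing entries like "rotor" or "hull"; the rule itself generalizes to any part that affords motion.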

It sort of does, though. Let's say we train an ML implementation so that it can recognize things with wheels and seats as vehicles. Now we show it a hovercraft. What will it do? How about a helicopter? All the human brain needs is a single example of people getting in or on something, and it transporting them from point A to point B in order to infer that the thing is a vehicle of some sort. This is because we are able to infer purpose of an object even if we have never seen it before. ML is just statistics - it implies no meaning or comprehension whatsoever beyond "thing A is statistically most like thing B I have seen before". There's an important difference between recognition and understanding, and current ML techniques are solidly in the former camp.

The often-forgotten difference between ML and humans is that we learn from stereoscopic video streams, not from a bunch of static pictures. There's a lot more information in a few seconds of watching cars on the road than in a thousand pictures of different cars. We get to see the 3D picture (we have dedicated circuits for that), hear 3D audio, and perceive temporal data. We correlate all that and many more data sources to form categories.

ML trained on bunch of static pictures is like humans dealing with those abstract geometrical riddles that are used on IQ tests. They're difficult for us, because they're not related to our normal, everyday experience.

Neural networks can learn new categories of things like that with about 5 examples. They are already outperforming humans on some tests. https://news.ycombinator.com/item?id=11737640

Not exactly: if you've never seen a particular kind of feather before, you may not recognize it at first sight, but most certainly you'll sit, examine it and eventually acknowledge it's a feather -- the neural networks we're using aren't prepared to do this kind of analysis yet.

Drawing your triangle upside down is enough to hit the limits!

#2 applies to humans as well. For example if I show a human something that looks and has all the properties of a car, the human will think I am showing him a car even if the thing I am showing him is actually called a feather.

Any neural net, artificial or not, can only recognize things as long as it looks similar to what it thinks the thing should look like.

Just reminding Google that snakes can eat elephants so please stop confusing them with hats:


Adult neural networks cannot understand that.

I don't know how it sounds exactly for a native English speaker, but saying "adult neural networks" seems to give the intelligence trait to some piece of software.

Times are changing...

In case you don't know the cultural reference, the joke is that kids can easily see that the hat is obviously a snake that has eaten an elephant, while adults only see a hat.

It's told in the first chapter of the famous children's book, "The Little Prince".

Argh "the little prince".. Here's what happens when you live your passion for computers, programs, etc. 24/24 : you miss the good books :-)

I'm not sure where you're getting those extra 17 days in your week but my god please share the secret with us.

I'm pretty sure you should read 24/24 as a fraction instead of a tuple. But I guess your neural network is more trained on the other case.

You're missing a cultural reference. http://www.thelittleprince.com/work/the-story/

Chapter 1 of The Little Prince from Wikipedia:

> The narrator explains that, as a young boy, he once drew a picture of a boa constrictor digesting an elephant in its stomach; however, every adult who saw the picture would mistakenly interpret it as a drawing of a hat. Whenever the narrator would try to correct this confusion, he was ultimately advised to set aside drawing and take up a more practical or mature hobby. The narrator laments the crass materialism of contemporary society and the lack of creative understanding displayed by adults. As noted by the narrator, he could have had a great career as a painter, but this opportunity was crushed by the misunderstanding of the adults.

I was going to link the book because I thought it was in the public domain, but apparently it's not in every country [1]. Also, the site you reference seems to be purposely misleading visitors by omission about the book's licensing status [2].

[1] http://www.communia-association.org/2015/01/23/the-little-pr...

[2] http://www.thelittleprince.com/licensing/

>saying "adult neural networks" seems to give the intelligence trait to some piece of software

Ignoring the cultural reference everyone here is talking about:

How do you get that perception? We apply the adjective adult to animals to whom we (rightfully or wrongfully) don't attribute intelligence, and even to other lifeforms which we usually categorize as non-sentient (e.g. adult trees).

I think it is perfectly legitimate to assign the adjective adult to a neural network that has either left the training phase, or that is only undergoing marginal changes in further training. This seems to be mostly in line with how the word adult is used in other contexts.

Not that I'm opposed to calling software intelligent, in fact I think it would be weird if we couldn't call something intelligent just because it's silicon-based instead of being based on organic neurons. I just find it odd that you associate "adult" with intelligence at all.

Well, you're right. But somehow, my mind had wired "adult" to this kind of definition (Wikipedia): "Biologically, an adult is a human being or other organism that has reached sexual maturity". And therefore, I associated that with "organism" (I didn't think about animals, but I'd venture to say that when compared to computers, many familiar animals like dogs are more intelligent; but that's a debate, I understand that).

So in this sense, I found it surprising to associate a word that I use for "living"/biological things to a digital thing; specially in the context of A.I.

Maybe the term adult is, for me at least, very loaded with lots of meanings that go far beyond the simple notion of "maturity".

Or, it's a site that lets you train neural nets using pornography, then see what pictures/videos result from it when you start killing off neurons.

Setting the "Little Prince" reference aside, "adult neural network" sounds like porn to me.

Spoiler: They're talking about the first page of The Little Prince.

To be honest, there was a little kid who drew a snake eating an elephant too. The teacher thought it was a hat as well.

Love Le Petit Prince! I've read it both in French and English.

I will try it in French, I've read it in Spanish, English, Italian and Japanese so far (the latter barely) so I think French would be a great addition and I probably will understand it being a Spaniard and all. Thanks for the recommendation

You didn't draw the tusks, that's why.

It's indeed difficult to draw when trapped inside an adult perspective.

Dessine-moi un mouton...

What if the snake swallowed a really big hat?

It seems the answers are pre-programmed because I drew just a car and it said "I know! It's a police car" even though I'd drawn no siren.

Also with the sweater drawing I'd drawn just a T-shirt accidentally then when drawing a long sleeve it said it knew it was a sweater when it could have been similar things.

So this demonstration isn't about it identifying the drawing correctly, it's just about saying when it's found something considered close enough in an overly-broad range of ambiguous things based on the answer it's been given already.

It did identify some partially drawn things, like a line or a circle or more complex things the partial drawing could have been, but the pre-loaded rigging made me stop the test.

It's because it's gathering training data from the things other people are drawing. So if it's only asking for police cars, anything car-like will just fall into the police car category.

It also stops you once it correctly identifies the drawing, even if you're far from done. If it incorporates that into its training, then it'll presumably learn to figure out what you meant earlier and earlier.

It's interesting to consider whether it would end up making guesses that it sometimes gets correct, or whether it would actually learn to discriminate based on very early stages of drawing. It's conceivable to me that people might start out drawing "police car" differently from "car" in some subtle but detectable way, even if it doesn't involve what we humans would consider to be the distinguishing features of a police car.

> It seems the answers are pre-programmed because I drew just a car and it said "I know! It's a police car" even though I'd drawn no siren.

Exactly the same thing happened to me (I did a CTRL+F on 'police' to find this post), which immediately turned me from 'this is cool' to almost quitting, and coming here to see if anyone else experienced the same thing (perhaps with different images).

It also didn't guess my drawing of a bulldozer, which I was surprised at since my drawing wasn't bad, and easily recognizable to any human of any age as a bulldozer.

I went back to complete the test, and it tells you what it guessed, which helped to explain both things.

Here's what it thinks a bulldozer looks like:


Here's the difference between "car" and "police car":


You, like me, obviously don't draw cars from the Soviet Union.

Edit: I just noticed (thanks to another post here) it also shows several examples of what it thinks an object looks like, and now I'm not sure why it didn't guess bulldozer, when couple of the images look like mine, certainly far more than the things it guessed.

I thought the same thing. It was a bit more subtle for me, but it prompted me to draw a sun, and when I first started drawing, it said "I see circle" but kept guessing, and with a few more lines it concluded, "I know, it's sun". Then a few drawings later I got circle, and in one second it said, "I know, it's circle".

At first I was thinking, how could it know this is a circle, but for the sun, it knew circle wasn't it when I had only drawn a circle. But it just occurred to me that the program probably tells it whether or not it's right before it actually verbalizes the guess/answer.

Then a few drawings after that, I got bus, and it guessed school bus about 15 seconds in; it wasn't until 10 seconds after that that it finally guessed just bus. So I'm guessing this is the same for yours: it was simply guessing police car before car (probably because the NN takes probability into account and has gotten that as a right answer more often, or something), and in your case it just happened to be right, so it stopped guessing before it tried just "car."

It's just picking the closest thing from its limited training set.

There is no way it should have recognized my potato.

I tested this. It asked for a shovel and I drew an aeroplane and it guessed aeroplane. I did the same thing a number of times and I got around a 60% correct response rate.

I agree and believe that it's pre-programmed in some ways. I took the test twice, and in the first set I had 'zigzag' and in the second set I had 'stairs'.

I drew the exact same doodle for each and for each it guessed correctly on its first try.

No, I believe it makes a series of guesses ranked for 'closeness'.

If what you're supposed to be drawing is ranked 'sufficiently high' the game/training side of it moves on to the next.

What? My point is that I drew the same doodle for different words, and for each word it put forth a different _first_ guess.

The first guess for each doodle was the word that was provided -- if this software was completely legitimate, I would think it should have guessed the same word for each (and then moved on to other guesses).

They weren't really the same though, right? Only similar.

A good test for that would be to script the game and give it pixel-identical answers for each questions and see if the result is different.

Correct that they were not identical, but they were quite close; five lines that looked like a 'w'.

FWIW it guessed my lightning correctly (a simple `z` like shape). Its 2nd guess was "zig-zag" and the 3rd one "star" for some reason. So it really looks like for this network a zig-zag and a star are more similar than I would've guessed.

I guess a star is a zig zag looping on itself, kinda.

I wonder if the zig-zag drawing and the stairs were actually as similar as you remember.

For instance, our sub-conscious tendency when drawing stairs would be to have straight lines at 0 and 90 degrees but a "zig-zag" wouldn't necessarily have that feature. Thus, even though you thought the drawings were the same, there were sufficient detectable differences for the AI to recognize.

It isn't pre-programmed, to my understanding. All it does is guess what your drawing looks like based on other people's drawings, and it continues to guess until it is correct. So when you drew a car with no siren, its guesses could have been truck, car, police car. Since it guessed police car, it got the answer correct. The program wasn't told you had to draw a police car. That would be silly. Why would they create this and lie about how it works? What would be gained?

I had to draw a car, and I lost because it didn't recognize my drawing and said "police car" instead.

It's interesting to see what it doesn't recognise, especially to compare my failed drawings with the other successful versions.

I couldn't tell the difference between my basket and most of the successful baskets, but the robot didn't like my drawing.

I think the QUESTIONS are definitely pre-programmed, so the answer set is probably limited too.

It couldn't guess my aircraft carrier. The reason: other people draw aircraft instead.


For reference, here's my carrier


People really can't draw hospitals either. Not one of them looks like a real hospital (because what does a hospital look like? You know one when you see one, but it's the people, not the buildings).

The drawings at best look like churches (a building with a cross on top), but most just have giant crosses on the side.

Disagree - I recognize hospitals by the brutalist architecture and the huge ventilation pipes.


I guess if I was trying to doodle one, I'd do something like this: http://med-fom-emerg.sites.olt.ubc.ca/files/2012/06/surrey.j...

A carport with ambulances parked in front.

Yeah, I got the same problem. I drew an aircraft carrier similar to yours. Google guessed "cruise ship" and "submarine" because it was comparing my drawing to drawings of airplanes.


I'm also amused that one of the aircraft carriers has wheels :)

>I'm also amused that one of the aircraft carriers has wheels :)

It sounds reasonable to call the Crawler-Transporter [1] a spacecraft carrier. From there it's not a big jump to wheeled aircraft carriers

But I'm probably giving people too much credit here.

1: https://en.wikipedia.org/wiki/Crawler-transporter

That's a sad looking crocodile

Yeah, for me it was "streetlight"; everyone else drew traffic lights.

Most people probably ran out of time when drawing the carrier

Those planes start to devolve into banana skins as you go down the page

For the first few minutes, I was testing this neural net. After 10 minutes, it started influencing what I drew.

After some time I started scribbling randomly. Then it said "I see motorbike". And, wow! That was a motorbike. I didn't see a motorbike in the drawing until it told me what it resembled. I didn't freak out, but I closed the window immediately and told myself, "It's a machine, I'm a programmer, I know how NNs work and there's nothing to worry about." It's the kind of feeling non-techies have when they experience predictive text for the first time and say "it knows what I am thinking".

I got suspicious when it asked me to draw camouflage. Nope. Not going to help train the neural net that will be used to find me when the singularity comes.

And by drawing nothing, you've provided it with the best possible drawing of camouflage.

well actually I drew a smiley face and the word "nope" beside it

Beside what?

Beside the smiley face?

"draw NUCLEAR LAUNCH CODES in under 20 seconds"

Nice try, Skynet.

I suggest you just go planking and it'll think you are the sea.

or hide inside a snake and it'll think I'm a harmless baseball cap

It does a nice job with invariance to scaling and translation (probably normalized in preprocessing), but it's definitely not invariant to rotation :)
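A minimal sketch of what that normalization might look like (purely an assumption about the pipeline, not anything Google documents): translate the drawing's points so the bounding box starts at the origin, then scale the box's larger side to 1. That makes the representation invariant to translation and uniform scaling, but a rotated drawing still produces different coordinates.

```python
# Hypothetical normalization step (not QuickDraw's actual code): shift a
# drawing's points so its bounding box starts at the origin, then scale the
# larger side of the box to 1. Two translated or uniformly scaled copies of
# a drawing normalize to the same points; a rotated copy generally does not.
def normalize(points):
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Guard against a degenerate single-point drawing (span of zero).
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / span, (y - min_y) / span) for x, y in points]
```

For example, `normalize([(2, 2), (4, 6)])` and `normalize([(12, 12), (14, 16)])` give identical output, which is why scale and translation come "for free" here while rotation does not.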


Which is understandable. Certain image categories aren't rotation invariant. Turn the letter M upside down and you get W.

I agree that some classes (like M) should exhibit only limited invariance, but the examples here are clearly rotation invariant to us since we can readily recognize them.

The network guessed some of the "correct" answers far too quickly. For example, with just an L shape (|_) it guessed suitcase. Feels like the model is suffering from overfitting.

Fun though.

For me it was a bit less extreme, but it guessed 'police car' when all I drew was a poor car doodle. There was nothing about that car that would make it a police car.

I was asked to draw a car and it kept guessing police car, so I guess there is some strange overfitting going on.

Similar for me.

I was asked to draw 'eyeglasses'; it kept guessing 'glasses'.

The premise seems to be a little odd, i.e. I tell you what to draw and then I guess what it is.

I guess it would have been more of an A.I. challenge if the premise was; draw anything and I'll try to guess what it is.

It's weird but the computer can choose not to cheat and forget the answer when guessing. Not possible for humans.

It's possible, but I'm not entirely convinced that this is honest, or at least not really accurate. It's far too eager to match the drawing to something, rather than trying to figure out what it actually has.

It could figure out most of my drawings but it would get them well in advance of me completing anything substantial (like others, I would be asked to draw a leg and draw just a curved line and it would guess leg before I finished).

Trying to draw what it asked for, but with some unusual features (like lines or dot patterns around what it asked for before drawing it), gets it extremely confused; it doesn't really seem to be good at filtering out any noise: http://imgur.com/a/oE1j2 (gallery of results and what it thought it saw).

Drawing things it didn't ask for, just to see what it was guessing, resulted in some really strange responses and fits. The answer set it has is extremely limited, so something like a hand giving the horns (\m/) got, as its last guess, a duck. A moose was a scorpion, then a duck, then a hand. Godzilla (or a bipedal dinosaur if you prefer) was a vase, then a scorpion, then a boat. My loaf of bread was a washing machine, an anvil, then a postcard. The Death Star was a bandage, a helicopter, and a lighthouse. And a chainsaw was considered an aircraft.

Between the disruptive patterns and drawing things outside of its vocabulary, the system seems really confused. Looking at the comparison results, I can see how it got some things really fast (tennis rackets were mostly defined by a crosshatch pattern, harps by a series of parallel vertical lines). This makes sense. For other things, not as much.

It might be a more convincing presentation to give the user a list of items the machine knows (the full list) and tell the user to try to draw some, and then the computer could check it off as it gets them. That seems like a better way of presenting this than "Draw a box. hey! you drew a box! Isn't that cool?"

I thought your "frying pan" was Roobarb out of Roobarb and Custard (http://comicvine.gamespot.com/roobarb/4005-86519/). How impressed would I have been if the 'net had guessed that ?!

Huh, I had no clue what you meant until I looked at the picture then back at my scribbles and saw the accidental drawing of the face I made.

To me this is another interesting distinction on the NN recognition versus a human recognition - QuickDraw having a limited "vocabulary" to refer to really highlights this, as does my own lack of knowledge of Roobarb. Some of these things can really blindside us, and I suspect that it's going to require a lot of human hand-holding for awhile for the machines to get a strong vocabulary.

For some time I've been pondering how far you could take a machine's tabula rasa learning, for something like language, and how closely it would mimic a child's learning. (Language, color, math, etc).

Yeah, but it's cheating itself. If it's telling me what it recognizes before I'm done, it severely reduces the usefulness of my input for further training. Anything submitted will just reinforce existing patterns, and do comparatively little to improve recognition.

They probably want data on how people draw certain shapes to improve the model.

Also, I realized how incredibly hard I suck at drawing.

> Also, I realized how incredibly hard I suck at drawing.

This. I drew an Ant but the system couldn't guess it. Later it showed me how normal people draw ants.

Man, i suck at drawing ants

They just want you to provide annotated training samples. Their guess is visible to you to gamify this and make it more interesting; it might also be used to force you to draw some distinctive features once it shows you some other category.

Yea I tried drawing previous words on other screens and it didn't seem to guess them.

It should give you 8 to 10 words to choose from.

I wonder if it has a very small pool of potential drawings. Eg if it only has "police car" and no other cars it might be able to jump straight to police car after seeing a car.

It also seems to constantly have a "best guess" to some degree and if that happens to be correct it confirms pretty quickly.

On the flip side of this, I was asked to draw a "telephone". It guessed first "cellphone" and then "phone" before giving up and saying it didn't know.

I was asked to draw a "phone". Drew a telephone (the old ones with a rotary dialler, because that was quickest to draw), and it guessed "telephone", but it was incorrect :-(

EDIT: Drew a "cake", it guessed "birthday cake". Wrong answer apparently.

It definitely doesn't have any knowledge of similar drawings.

It's because it gives the most plausible answer, even if that answer has only a small chance of being correct. If there are not that many words, there are only a few objects that could be drawn with an L shape like that. I mean, even if you draw something else starting with an L shape, it will probably be a slightly different L shape (with a curve, for example). I think that's it; it doesn't really mean it's overfitting. It would only mean that if the probability were really high.

The AI keeps throwing out guesses within the 20 seconds until it gets the right one.

So if you draw a simple shape, it starts going through the list of things it recognizes and ends the game there.

This works great for the game because it can answer very fast, but it only works in cases where there's actually feedback that eliminates all the wrong guesses.

Basically, if it simply went through the whole English dictionary fast enough, it could get the same result without even looking at the pictures.

Yup, the minimalist Picassoesque renditions I was able to do on my laptop trackpad were still recognised as long as I gave a vague impression of the major shapes I figured other people would draw. The result was the kind of image where a human would frown at it until you said “kangaroo”.

For my L shape, it recognized a leg immediately (which was what I was supposed to draw)

Yeah it seemed to make the correct guess a liiiitle too fast for me.

What's the reason for stopping you drawing as soon as it guesses it? I mean, it's great you guessed "swing set" from me making a basic a-frame, but wouldn't it give you more data if you let me actually draw the swings?

Yeah, it does feel abrupt. My guess is that Google only cares about getting the minimal representation needed to guess the object, even if it's incomplete or missing details.

This is really fun.

As a designer/amateur artist, the most interesting part about this to me was seeing other people's drawings/most-basic-concepts-of-a-thing drawn out: which lines they draw, which ones they don't, how they express an idea in the most essential way, basically. (the 20s really helps)

It's a great window into how people think about things.

As a reference: I couldn't understand why it didn't recognize my keyboard (http://imgur.com/fXk6Gg9) but then it hit me hard when I saw other people's drawings (http://imgur.com/J7Lqodw).

Caveat: I noticed I quickly started adjusting my drawings to exaggerate the most stereotypical features, and resort to stereotypes overall, because they seemed more likely to get recognised.

So it is not just how people think about things, but at least in part also how people think other people will have thought about things guided by seeing the examples of other drawings.

Well, true, but I'm not sure anyone outside the HN crowd would try to hack it that way.

Interesting! This is how I play Pictionary, by the way: not how I would draw something (I'm a decent artist) but in the most stereotypical way, exaggerating every feature.

Along those lines, I found it interesting that of the request for "guitar", I drew one with arm from top-right to bottom-left. So did 17 others on the results page. 3 were vertical-no-slant, and only one went from top left to bottom right.

Did we all see the same picture book growing up? Are most righties drawing it one way and lefties another (to which I would be a counter-example)? the guitar from top-left to bottom-right didn't look wrong at all, but why were most drawing it the other way?

Also, no one drew an electric guitar. Shame on us humans. This is fascinating to me more for the human-view than the ML piece...

I drew this... not sure if I should feel disappointed that nobody's done it before, or smug that nobody's done it before.


I am sorry, my Vs are not very consummate.

I can see this soon to be overrun by people trying to mislead it, like what happened to Microsoft's "Tay" on Twitter. Suddenly, ships will be penises and bananas will be Hitler.

I guess this highlights the flaws inherent in all intelligent systems based on teaching. Misinformation fed to them at an early stage can screw them up pretty much permanently.

We've been seeing that since forever with human children raised on doctrines etc. that have no basis in reality.

Could they not easily avoid this by using their Google Search images database? They already have a filter for "drawings". Combining that with the title of picture/alt attribute, you could pretty much avoid that scenario.

Isn't training neural nets for image recognition using Google's image databases (right now Google Street View) the main purpose of reCAPTCHA?

Just like kids...

One approach might be to train it on what naughty words look like first, then ignore any future images that match them too closely. That is what it's supposed to be best at, after all.

It does not seem to do this at the moment.

When I briefly permitted my utter goober of an inner child to have his way, its top two guesses at the result were "syringe" and "cannon", which...really aren't quite as far off the mark as I'd have expected, I suppose.

(No, I won't try a swastika and report back. Immature as I occasionally permit myself to be, I do have some principle and taste.)

glad i wasn't the only one.

Very cool!

Although I am a bit disappointed by the fact that it did not recognise my amazing whale :( https://i.imgur.com/K5unmKS.jpg

Especially if you compare it to what it thought a whale should look like: https://i.imgur.com/Dxh1F3c.jpg

I guess this is like everything in AI: bad data in, bad data out.

Yeah, I can't get enough of this. Also, having a touchscreen laptop makes it extra fun.

That said, I'm a bit disappointed that it has a lot of trouble recognizing anything outside of "how people normally draw things" (especially if you go out of your way to do more than "the bare minimum", draw things at a different angle, in 3D, etc).

[1] Umbrella (wheel, binoculars, pizza): http://i.imgur.com/jr9mexS.png

[2] Fork (sword, paintbrush, baseball bat): http://i.imgur.com/3uLIPkH.png

[3] Envelope (phone, eraser, leaf): http://i.imgur.com/pemgKpD.png

I like how all the whales have a smile.

This creeped me out a little when I first started playing it, not because of its accuracy or anything but because of what it asked me to draw.

There were lots of innocuous things like [house, teapot, dog, zebra, moustache] but there was also a substantial amount of things like [rifle, aircraft carrier, submarine] that are military-related. I sort of got the feeling that I was helping to train an AI that would later be deployed in combat. I played another ten rounds or so to see what else might come up, and aside from one or two repeats nothing did and I realized I was being silly.

[marshmallow, hotel, ectoplasm]

omg, I just trained this thing to bust some ghosts

As someone who is woefully uninformed about ML and CV: in a case like this, is the computer simply given the bitmap/vector graphic and left to learn how to interpret the data as lines and shapes? Or is there some preprocessing done to transform the data into something more "ML-friendly"?

In deep learning, you can feed the raw pixels into the network and it will interpret the data as lines and shapes.

The only preprocessing done is likely size reduction: the QuickDraw canvas is quite high resolution, so it's probably scaled down (with just linear interpolation).
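To illustrate that kind of size reduction (the 28x28 target and block averaging are my assumptions for the sketch, not anything QuickDraw documents), here's a toy downscale that average-pools a high-resolution grayscale canvas to a small fixed-size input:

```python
# Hypothetical preprocessing sketch: shrink a square grayscale bitmap
# (a list of rows of floats) to out_size x out_size by averaging pixel
# blocks -- a crude stand-in for the interpolation the comment mentions.
# Assumes the canvas side length is divisible by out_size.
def downscale(canvas, out_size=28):
    n = len(canvas)
    block = n // out_size
    small = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for y in range(i * block, (i + 1) * block):
                for x in range(j * block, (j + 1) * block):
                    total += canvas[y][x]
            row.append(total / (block * block))
        small.append(row)
    return small
```

The resulting small grid is the sort of thing you'd flatten and feed to the network as raw input.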

Thanks. Come to think of it, I remember seeing a series of deep learning examples that were trained on raw data and were able to create well-formed TeX, XML, and other formats, so it makes sense that the same principle would apply to image data.

It seems like you can draw garbage (e.g. faces/text/swastikas) on the picture in the moments after it correctly identifies the object and before it starts the next object, and it seems to "save" it (though I am not sure if it is persisting it in their training data).

Well, if there's one thing this has taught me, it's that the trackpad on my ThinkPad is absolutely terrible.

It taught me that everyone is a better artist than me. As a consolation, I'm using a MacBook Pro touchpad, touted as the best, and even it can't compensate for my lack of artistic skill. Not sure if it would be easier with a mouse or not. It'd be really fun to see a post-mortem where maybe we could see if any artists used their drawing tablets, as well as error rates across different devices like tablets.

The tricky bit with this task is context. For example, my 'spider' sucked big time, and I started trying to draw a web right at the end (before running out of time). This is exactly how I would play 'Pictionary', with plenty of big arrows pointing to the part of the drawing representing the word. However, that really isn't appropriate when trying to train a neural net to recognise abstract images.

Similar complexities arise in relation to perspective (should 'bridge' be side-on or top-down?) and even something as obvious as homonyms (I drew 'nail' as the small metal thing you use to hang a picture, but I'm sure others would have drawn the thing at the end of a finger).

I guess my point is that this comes across very much as a game (maybe that's all it really is), which will probably give poor 'neural net training' results, as opposed to if it weren't a game, e.g. by increasing the ridiculously short time limit. I'm sure the results will still be interesting, but I think they'll end up very abstract rather than very accurate; maybe we'll just 'invent' a new set of logograms.

>However, that really isn't appropriate when trying to train a neural net to recognise abstract images.

Or is it? Beyond the most simple shapes, humans heavily rely on context for image recognition. Maybe a neural net should do the same to be successful.

Well this is surprisingly fun. I just worry I'm making a mess of the machine learning with my terrible doodles.

Disable privacy badger. Without it the app "works" but says it can't guess anything.

When it couldn't even guess my envelope I was suspicious and disabled pb. Then found out it "live" guesses as you go!

Thank you for this, it got 0/6 the first time for me, and I'm literally DaVinci on the doodles...

This was embarrassing. I'm a terrible artist, I mean, well and truly incapable of drawing, and I think the questions become "easier" to draw the more she gets wrong. So I struggled to convey cello, then church, and finally "circle." Which she almost didn't get.

OTOH, you have a great career in Q&A ahead of you!

It's a neat demo, but it would be a lot neater if it recognized something that I chose to draw rather than something I am told to draw.

Agree. And the 20sec timer is annoying. I guess it has been introduced to keep the load on their server down.

I'm pretty sure the 20 second timer is meant to keep you focused on the essentials of what you're trying to draw. Makes sense to me.

Is google using the drawings from the game to train the AI further? I assume so.

Yes, they state it on the website as well: "See how well it does with your drawings and help teach it, just by playing."

This is interesting, because once it recognizes my drawing, it cuts me off before I finish, which would seem like it's adding incomplete drawings to the training set. Maybe it just wants examples of versions that it can't already recognize.

Yea, apparently two disconnected spoked-wheels are a bicycle now, based on the drawing it accepted from me.

It's a training set before it's a game, I'm sure. The fact that they were able to make it fun just makes it a win/win.

this is nothing like Apple QuickDraw

Kids these days, probably don't even realize the reference to Quick Draw McGraw.

Fun hack to play along with humans:

* In Chrome, open the developer tools, and find the style for "#challengeword-text" (you can hover over the challenge word, hit right-click and then "inspect" to get there).

* Set that style to opacity: 0 or something similar. (Be sure it's the #challengeword-text style and not just the element.style)

* Then, in your console, type "Array.prototype.slice.call(document.getElementById('challengetext-word').getElementsByTagName('span')).map((v)=> v.textContent).join('')". You should see the clue word appear after that.

* On two screens, show the browser window with the game to other humans, and keep the console on a screen only visible to you.

* Keep repeating that JS command in the console to get the word every time the challenge word screen appears (you can just hit up and enter, if you've already typed it.)

* And... bam! Now you can do a human vs computer round.

I, for one, am really excited that Google has approached these "experiments" in the manner it has. I'm a designer with a keen eye on new media; so by no means a programmer/developer. For me, it has made machine learning a lot more accessible. Up to this point, I hadn't figured out how to "play" with this technology and now I can.

For Quick, Draw specifically, I'm interested in a "search" functionality where I can input a term such as "computer" and see not only the average representation, but more importantly, the fringe.

The possibilities for creative brainstorming are tremendous.

One it gave me was "camouflage". I have no idea how to represent that with a simple line drawing.

I guess the best bet would have been drawing some piece of clothing with the most commonly used camo pattern, something like: https://www.google.ch/search?q=camouflage+pattern

Same for me. I was given "steak" but the only thing I could think of was that I needed to add a hole like in the cartoons, you know, like this:


After I did add the hole, it looked nothing like a steak to me. The only thing I could think of was to put a pan around it. As soon as I started drawing the pan it said, "Oh I know, steak". Makes zero sense to me.

Picture of my drawing:


Does that look like a steak to you? In no way.

Here are the rest of my badly moused submissions if you're interested. I literally had no idea how to draw asparagus.


I was pretty shocked it didn't figure out my pencil. In the last few seconds (I was already done) I started coloring in the eraser but it still didn't get it.

---------------------- EDIT:

Just did another one. Shocked it didn't get this!!:


Completely obvious what it is. First I drew the first 3 circles. It didn't get it. I added flashing. Didn't get it. I added a pole. Didn't get it.

Added a car and arrow. Still didn't get it.

Everyone else just had in the center of their image, the three circles. Some people included "shining" lines.


(third from bottom and second from bottom rows.) Obviously mine looks a LOT more like a traffic light than these other people's, because I contextualized it!

For me it failed hard... Chrome (the browser I use) kept freezing with 100% CPU use :/

The single time it worked without freezing it nailed correctly what I drew though :)

EDIT: tried again

Performance was still bad... but I saw one interesting thing: it asked me to draw a knee, and... it failed to recognize it.

Except after it ended and I went to see other people drawings, mine was identical, but flipped! I guess because I am left-handed or something like that...

Still, it is funny that it failed hard to recognize a flipped drawing.

As for the explanation: everyone drew a leg, seen from the side, slightly bent (me included). But I drew like this ">" while everyone else drew like this "<"

EDIT 2: someone asked me why it is not in Portuguese :(

Also, I saw the reply below; I am on Chrome on OSX right now.

I noticed the same thing with regard to orientation. I drew a hammer in a way no human could possibly fail to recognize, but the AI didn't see it. The examples of other people's drawings had hammers all aligned vertically, whereas mine was aligned horizontally. I guess orientation matters to the AI.

I drew bread as a loaf from an isometric perspective. It didn't get it. All but one of the other pictures were of a single slice of bread. There was another picture of a loaf of bread, but it was mirrored in relation to mine. Seems like some bit of knowledge should be devoted to whether orientation is important, but the computer doesn't understand that. It seems like this is because most people draw pictures one way initially. It might be helpful to have each person draw a second picture of the same thing.

Worked fine for me in Firefox on Linux.

Looking at a closest line match doesn't seem to work well, simply because of the positioning of the drawings. I suspect they need to rotate the recognition objects around their axis, but even then, you're drawing 3D objects and using 2D recognition to detect them (an airplane looks different from the side than from the front). I drew a very clear pencil horizontally and it matched it to two horizontally drawn airplanes, one of which was supposed to be an aircraft carrier according to the description. The existing pencils in its memory were drawn vertically, so there was no chance it would match them.

"Draw Animal Migration in 20 seconds"

And I'm done.

It couldn't recognize anything I drew with keyboard-and-mouse control. Then again, neither can I.


That diving board is pretty good for twenty seconds. Also with my limited experience using keyboard control, acceleration can mess you up easily.

Initially it couldn't get any of mine, but then I realized that Privacy Badger was blocking something important, causing it to silently fail. After allowing it through it got most of them.

That must be what happened to me too. I thought it was just stupid at first, because all of my drawings looked like the examples given at the end.

Of course it just takes one juvenile prankster to say in a crowded room "You won't believe what it recognises phalluses, poo emoji, and bottoms as" and it's going to be Microsoft Tay all over again.

So nobody do that, right.

Seriously, don't even think about it.

(More seriously, I think I might have been de-training it by trying to draw the correct drawing using a touchpad... rather too many of my phones seem to look like tornadoes by the time the dang thing gets stuck in draw mode while I'm trying to click-and-drag to draw.)

Apparently a phone and a telephone are two different things http://i.imgur.com/HujDN5W.jpg

Do you know what year it is?

Either Google's AI has very high drawing standards or I am a terrible artist. It didn't guess ANY of my drawings, and they all looked very similar to the other images :(

I had the same thing happen, turns out Privacy Badger blocks some important component in the recognition, so switching to a private window fixes it.

> Draw the Mona Lisa in under 20 seconds


This game seems to essentially be Win, Lose or Draw, the game show; the computer throws out some random variations like contestants guessing. That makes it easy for the computer: it tries 'car', and on getting a no, it tries 'police car'. I wonder how many things it understands...

We are so far from general AI. It's so strange how easily our brains can refine existence down... how many CPUs is the human brain worth, as far as we know?

It guessed "phone" when I was asked to draw a phone, but it obviously wasn't a strong enough match, because it moved on without getting it right before coming back to it twice. It finally settled on "blackberry" after trying "cellphone". Does it require some weighting to get there?

This is a fascinating look at how people create abstractions of things. It's also funny how sometimes people just give up and write words, since that's the best way to label whatever they were given to draw. Lastly, I'm surprised at how hard it was for me personally to imagine some of the things given, even though I would recognize them in an instant, even from the drawings here.

It got all of them for me except "Hexagon" which kind of baffled me. It's a basic shape. (yes, I drew it with the right number of sides)

Many of the images it showed for what it thought a hexagon was had the wrong number of sides when I tried it.

It also didn't get my hexagon.

Really fun. The algorithm generates new nouns based on my drawing on the fly and adds them to the original pool. It stops and says "Oh, I know, it's ___" when one of its guesses matches the answer. That might seem really clever, or it could simply have something like 1,000 guesses for every stroke, one of which happens to be the correct answer.
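One way the "it stops and says 'Oh, I know'" behaviour could work, purely as a toy illustration of the commenter's speculation (the threshold, margin, and scores below are made up, not the real model): re-score every candidate class after each stroke, keep listing weak guesses, and only commit when the top class clears a confidence margin over the runner-up.

```python
def announce_guess(scores, threshold=0.5, margin=0.2):
    """Return the class to commit to, or None to keep listing weak guesses.

    scores: dict of class name -> probability for the drawing so far.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (_, p2) = ranked[0], ranked[1]
    if p1 >= threshold and p1 - p2 >= margin:
        return best  # "Oh, I know, it's a ___!"
    return None      # still too ambiguous; keep guessing aloud

# After one stroke the scores are diffuse; after more strokes one
# class dominates and the game "stops" on it.
after_one_stroke = {"car": 0.30, "police car": 0.28, "bus": 0.22}
after_three_strokes = {"car": 0.15, "police car": 0.75, "bus": 0.05}

print(announce_guess(after_one_stroke))     # None: too close to call
print(announce_guess(after_three_strokes))  # police car
```

Under this scheme the model really does hold a score for every class on every stroke; saying "I know" is just a thresholding decision, which matches the "1,000 guesses per stroke" reading.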

My bathtub looks too ridiculous for it to be able to guess it with confidence.

It tends to never guess what I draw. :/

I'm not sure if it's because I'm left-handed, but I tend to draw everything backwards compared to all the examples the bot has seen.

I also think the other examples it's seen are horribly drawn in many cases, which makes me think a 1-minute time frame would yield better results. I can't draw anything meaningful in 20 seconds, haha.

The researchers are trying to collect training data. I wonder how many examples they receive and how many are bogus.

The Microsoft chat experiment from a few months ago shows that people can hack these systems and train them into racist slur-spewers.

Similarly, I wonder how this project aims to stay accurate, given that many people are going to draw porno objects no matter what the prompt was, just to mess with it.

I managed to guess Mona Lisa.



This must be using a relatively small corpus of terms to guess from.

I have no idea. Here's another weird one:


It asked me to draw a helicopter, then a shovel, and guessed right in both cases. But I noticed that the training dataset was of poor quality: the "shovel" drawings were full of helicopters.

I also noticed it was very laggy on my phone, making it hard to draw anything before the end of the countdown.

My doodles are terrible data points but this was a lot of fun! Guessed a few of them right, although a bit too fast :)

Doesn't even recognize a cup.

Amusingly, the one I had drawn looked pretty similar to one of the examples drawn by other people.

Cup didn't work for me either, and I had the feeling mine was pretty good, too.
