Why can’t a bot tick the 'I'm not a robot' box?

renlo · on Feb 14, 2019

I’ve had a number of reCAPTCHA incidents where I could not pass the test for tens of images; it was a very frustrating experience. Please do not use reCAPTCHA.

The items one is supposed to select often overlap the grids, so it becomes a kind of Keynesian Beauty Contest[1] at that point; I assume they validate based on how closely you align with previous answers, so it becomes a problem of “What nearby grids would a person reasonably select when there’s overlap”, or, you’re tasked with selecting <some_item> and you see <some_item> in the distant background of the image you’re supposed to classify and you need to determine “How visible would <some_item> need to be for a reasonable person to classify <some_item> as being in this image”.

On top of this, when you’re clicking through image after image after image, the additional frustrating thing is that you’re helping train their algorithms; you’re doing work because their service isn’t smart enough to know you’re human, or, you’re seen as a marginal customer they can piss off by forcing you to work for them for free.

It’s a frustrating experience when it fails, and my current strategy is to leave the website when the ‘I’m not a robot’ checkbox fails.

[1]https://en.m.wikipedia.org/wiki/Keynesian_beauty_contest

retro64 · on Feb 14, 2019

You’re overthinking it. I have had a similar experience. The worst were the “stop light” questions. Does it mean the pole, the light, the tiny corner overlapping another square? I used to try to include everything as it was technically true. Very frustrating – until I finally started to not care. Click on the most obvious pictures. Click click click. Done. Get it wrong? Click click click on the next one. Way faster with a much better success rate doing so. It usually only takes a couple of tries now.

Legogris · on Feb 14, 2019

"Store front" always makes me squint at garages, driveways, sidewalks and front doors.

JoshTko · on Feb 14, 2019

Just a suggestion: don't say "You're overthinking it"; it can be considered pretty dismissive.

retro64 · on Feb 14, 2019

Maybe a cultural thing? For me, "you're over thinking it" is a familiar/friendly way of expressing an idea, similar to how you would approach a friend. It was not intended to be dismissive.

vectorEQ · on Feb 15, 2019

you are right. it is a cultural thing. that being said, the advice is sound. the internet is a mishmash of cultures :) if i brought my culture to the internet, everyone would think i'm a troll or a horrible person, just because ppl from my culture are a bit direct and cynical :D if you'd like people to respect and take into account your culture it's good to do vice versa.

now that's over thinking :D

xnevs · on Feb 14, 2019

You're overthinking that.

thatoneuser · on Feb 15, 2019

Hell I try to figure out what wrong answers to give to fuck up their training data.

_lhlo · on Feb 15, 2019

That would only work if everyone who got the same image gave the same wrong answer. Otherwise, the system just keeps sending the same image to more random people until it's satisfied with the consensus. Also, it keeps sending YOU more random images if you obviously don't comply with previously observed consensus.
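A minimal sketch of the consensus idea described above, assuming a plain majority vote. The function name and both thresholds are hypothetical, and nothing here is confirmed to be how reCAPTCHA actually works:

```python
from collections import Counter

def consensus_label(votes, min_votes=5, min_agreement=0.8):
    """Return the majority answer once enough users agree, else None.

    min_votes and min_agreement are made-up thresholds for illustration.
    """
    if len(votes) < min_votes:
        return None  # not enough answers yet: show the image to more people
    answer, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return answer
    return None  # no clear consensus: keep collecting answers

# One deliberately wrong answer among honest ones does not change the result.
votes = ["bus", "bus", "bus", "not_bus", "bus"]
print(consensus_label(votes))  # bus
```

Under these assumptions, a lone saboteur only delays the label; the wrong answers would have to be coordinated to win the vote.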

xvector · on Feb 14, 2019

reCAPTCHA is horrible. I am almost certain that some tests are intentionally marked as failed even if they are correct, simply so they can get more training in. And if you take any anonymizing measures, reCAPTCHA makes much of the web nigh impossible to use.

It's easily the worst UX I encounter online. I can't describe the relief I feel when I come across a "normal" CAPTCHA. It's coming to the point that if I could pay a few cents to outsource every reCAPTCHA, I gladly would simply to avoid this atrocity.

ConceptJunkie · on Feb 14, 2019

The tests are often poorly defined. I've been caught in reCAPTCHA hell a few times (but not recently), and it's often the case that I'm guessing... is this a bus? I'm not sure.

It's interesting to find out that this might have nothing to do with my ability or inability to recognize objects in blurry photos.

sl1ck731 · on Feb 14, 2019

Are reCAPTCHA images used to train or classify data like Mechanical Turk? I thought at one point the word ones were used that way and was a major incentive for their usage.

JonathonW · on Feb 14, 2019

Original reCAPTCHA images were used to transcribe text for Google Books, and later to transcribe house numbers for Google Street View.

I don't think Google's publicly said what they do with any data derived from the new "I'm not a robot" reCAPTCHA, but, given the content it usually uses, it seems likely that they're still using it for image classification in Street View, or for their self-driving car projects.

xvector · on Feb 14, 2019

They are almost certainly being used to train Google's self-driving/Maps/etc recognition.

darkhorn · on Feb 15, 2019

You probably share your IP with others like Tor or a single IP for a big company.

lilyball · on Feb 13, 2019

Lately I've been getting reCAPTCHA prompts all the time even though I'm not browsing in incognito mode and haven't cleared cookies. All I'm doing is running a very basic ad blocker, using Safari (which blocks third-party tracking), and very rarely loading a Google site. The most interaction I have with Google is when I end up having to use my corporate Google account as SSO for some other site.

Given that I'm not doing anything unusual, it really feels to me like reCAPTCHA, for all its complexity, boils down to "what's your history using Google software? Oh you rarely use it? I'm gonna give you a captcha". It didn't used to be this aggressive, but it's really ramped up in the past few weeks.

iheartpotatoes · on Feb 14, 2019

I recently replaced a bunch of securimage captchas with reCAPTCHA v2. During testing I had to shut it off because it became increasingly more complicated every page reload. First it was just one page of traffic lights, but 20 minutes later I was having to click through 5-6 pages of images. This worries me that users might get pissed off. I'd really like to know if I've made life harder for my users in an attempt to stop the spam from the horribly broken securimage captcha.

Anarch157a · on Feb 14, 2019

I cancelled and deleted my Spotify account because of that. Now any site that asks me to fill a Google recaptcha is met with a swift click on the Back button.

tylerl · on Feb 15, 2019

Well. You showed them, then. I'm sure someone will notice.

iheartpotatoes · on Feb 15, 2019

Really? You gave up one of the best streaming music services because you had to use a captcha every now and then? Huh. Good data point. But you're probably not our target audience if you give up that easily; we sell engineering tools for solving difficult problems.

lilyball · on Feb 16, 2019

You said

> During testing I had to shut it off because it became increasingly more complicated every page reload. First it was just one page of traffic lights, but 20 minutes later I was having to click through 5-6 pages of images. This worries me that users might get pissed off.

Why are you now badmouthing someone else for deleting Spotify over this exact same issue?

RussianCow · on Feb 13, 2019

Anecdotally, I've gotten way more reCAPTCHA prompts since disabling third-party cookies and installing Cookie AutoDelete, so I suspect you are correct.

tomaskafka · on Feb 14, 2019

Same here.

kevin_thibedeau · on Feb 13, 2019

Randomized user agent also seems to trigger reCAPTCHA without real cause. Should be an ADA violation for doing that.

rtkwe · on Feb 13, 2019

ADA? Americans with Disabilities Act? Why would that have any bearing on user agent randomization?

Bonooru · on Feb 13, 2019

I think the argument boils down to "how do you fill out the captcha if you use a screen reader"?

rtkwe · on Feb 14, 2019

They already have an accommodation for the visually impaired with the audio captchas.

https://support.google.com/recaptcha/answer/6175971?hl=en

darkpuma · on Feb 14, 2019

If you score low enough on their automatic checks, they refuse to serve you the audio challenge.

PurpleBoxDragon · on Feb 14, 2019

What about the deafblind?

rtkwe · on Feb 14, 2019

What's their solution for any other website? It seems like they'd have a very difficult time accessing ANY site.

In a quick search it seems like NoCaptcha is the accessible answer for the issues with regular Captchas. For the most part it seems to work; most of the complaints here seem to stem from people trying to actively block some of the evaluation metrics used by the checkbox (cookies, JavaScript, user agent strings, fingerprinting, etc.), which makes them look very different from normal traffic which kind of by necessity makes them look a lot more like bots.

https://simplyaccessible.com/article/googles-no-captcha/

PurpleBoxDragon · on Feb 15, 2019

>which makes them look very different from normal traffic which kind of by necessity makes them look a lot more like bots.

But if they are doing so because they are disabled, and the difference means they receive a worse experience, it may result in an ADA complaint (especially if a government service falling under Section 508 is involved).

amanaplanacanal · on Feb 14, 2019

Braille interfaces are a thing.

Mirioron · on Feb 13, 2019

I think he means that having people fill out recaptcha all the time because they don't allow Google to track them should be an ADA violation.

rtkwe · on Feb 13, 2019

Again I still don't see how not wanting Google to track you should qualify as a disability under the ADA.

Mirioron · on Feb 14, 2019

Nonono, disabled people might be barred from using a service because of excessive recaptcha. That's what he means. I also think he really meant CVAA rather than ADA though.

rtkwe · on Feb 14, 2019

Looks like they provide an appropriate alternative for people with vision disabilities: https://support.google.com/recaptcha/answer/6175971?hl=en

That only really leaves blind deaf people out, at which point we might be reaching the limits of any technology to provide access to everyone without a tooooon of work.

TheAceOfHearts · on Feb 14, 2019

As someone already pointed out in another reply, you don't get this option if you score low enough.

rtkwe · on Feb 14, 2019

Yeah but most people won't trigger that. Seems like most of the complaints here about triggering it often are from people who are blocking js/cookies/randomizing user strings. The NoCaptcha check box itself is better than the old system where everyone had to do the Captcha at least.

justtopost · on Feb 13, 2019

It happens to me occasionally and I am basically blacklisted from the internet. I have to solve 5 in a row and if I screw up its idea of what a streetsign is, I have to start over. It has made me cut out usage of most sites that use this broken and abusive tech. Welcome to the digital ghetto.

_8huj · on Feb 14, 2019

I have noticed this too. I've switched to DuckDuckGo for everything and I haven't changed my habits. Started getting more captchas a couple weeks ago and I know I answered several of them correctly (I'd get tested 3 times in a row).

Possible fingerprinting?

wstuartcl · on Feb 14, 2019

It is also plausible that, because Google Analytics runs on so many sites, they could do something shady like put you in a pester segment if they see you coming from DuckDuckGo to other sites frequently. It is not hard to imagine using reCAPTCHA as a nuisance against other search traffic providers.

pergadad · on Feb 14, 2019

I doubt that many people would make a connection between their search engine and seeing captchas on other sites. So limited gain for, if anything, many unnecessary complaints.

beatgammit · on Feb 15, 2019

This is why I love container tabs in Firefox. I like putting all of the recaptcha stuff in one container so it can't snoop on my other stuff (I'm too lazy to look into what it's doing with cookies and whatnot).

But honestly, I wish it would just die.

PaulHoule · on Feb 14, 2019

Technically it's a good strategy. It would be much more complex for bots to leave a trail consistent with that of a real user.

Some financial and government benefit web sites query web trackers as an extra factor in the enrollment process.

Liquix · on Feb 14, 2019

It'd be a good strategy if we were aiming for a totalitarian Google-sponsored police state..

Making (online or offline) life more difficult for people who don't want to use company X products could escalate to the point where you either accept the yoke and are admitted to the walled garden of "society" which company X has firmly cemented themselves under -- or you say no and find yourself unable to drive/fly/get a job/go to college/buy groceries in your town. It sounds like a big leap to make right now, but is a real possibility if Amazon/Google/FB don't get broken up soon.

lilyball · on Feb 14, 2019

Using a verified human Google history to allow people who would otherwise be flagged as potentially a bot to skip the CAPTCHA is justifiable. Setting up your reCAPTCHA such that the lack of a verified Google history is used as a "probably bot" signal is really quite awful.

elsurudo · on Feb 14, 2019

I use Safari on a Mac with a simple ad blocker, but I use a lot of Google products, and I also get a captcha most of the time.

So perhaps it's Safari and/or the ad blocker that are to blame? Hard to say, though.

xvector · on Feb 14, 2019

This is likely due to the new canvas fingerprinting protections introduced in iOS 12 and Safari for Mojave. Google's NHT analyzers probably don't take well to these measures that attempt to defeat canvas fingerprinting.

kccqzy · on Feb 14, 2019

Same here. I have an alternative browser with no ad blocker, no tracking blocker, and sometimes I just copy the website from Safari to that other browser to avoid CAPTCHA.

AlfeG · on Feb 15, 2019

Some time ago we had a provider that gave us a different IP every day. Some days we were just not able to do anything without a captcha. It seems that some of the IP addresses were in some sort of spam database.

bsamuels · on Feb 13, 2019

Slightly related, but I have a fun conspiracy to share:

I'm convinced that part of the reason Google released headless Chrome is as a honeypot for bot authors to use. The idea is that instead of going through the effort of fingerprinting and identifying new bot software, release something that bot authors will use instead that you have a capability to detect.

Somewhere inside of headless Chrome, there's one or more subtle changes that make it so Google can detect whether you're using headless Chrome or normal Chrome. There's no limit to how subtle the indicator could be - maybe headless Chrome renders certain CSS elements slightly slower than normal Chrome, etc.

It sounds pretty crazy/complicated but I could definitely see it being worth it if it means detecting $X,000,000 worth of ad fraud every year

nhf · on Feb 14, 2019

It's actually not that complicated. Most headless browser drivers have some global JavaScript functions in the `window` namespace that immediately identify themselves.

I once ran into a piece of code from the scammy advertising world that tried to redirect users to a phishing site. They cleverly tried to hide themselves from the automated quality checks some ad networks do, by checking for these functions and appearing benign if they saw them. One of the checks even created an exception and then inspected the stack trace for certain flags that apparently are only there on some type of headless browser. Clever!

mpol · on Feb 13, 2019

Interesting idea :)

I don't think spambots are currently using Chromium or even running JavaScript. Using simple spamfilters in JavaScript still works fine on my setups.

bsamuels · on Feb 13, 2019

Most modern credential stuffers use headless browsers with all the bells and whistles, html5, javascript, etc.

Login attempts are usually spread over a massive botnet of residential IPs as well, where they'll only use each IP for one or two login attempts before moving on to the next.

It's a very fascinating problem space

Damogran6 · on Feb 14, 2019

In my experience, the botnet didn't upgrade their JVM...it was 18-24 months out of date. THAT was what we filtered on at the F5 to blunt the attack.

golergka · on Feb 14, 2019

Does it mean that you're breaking the experience for users who deliberately disable js by default? Can I ask you not to do that? Modern web is unusable if you let js on any webpage

partiallypro · on Feb 13, 2019

Every time I fill one of these out I get the picture test, and I answer them correctly...but am asked 3-5 times to identify which blocks contain a school bus or stop light. It's very annoying.

ivanbakel · on Feb 13, 2019

I think it's speculated that you're recorded as being a useful classifier if you answer correctly on initial test captchas, so you get given Google's datasets for machine learning. It would explain why you get picture tests even after you should definitely pass the check.

kzzzznot · on Feb 13, 2019

If that is true that is a huge breach of trust. Are these practices ever audited?

Scene_Cast2 · on Feb 13, 2019

By whom? For what? (Meaning - probably not, unless you count a few Googlers sanity checking launches)

manmal · on Feb 14, 2019

You mean, they still haven't managed to classify those storefronts correctly, after 3 years or so?

alexpetralia · on Feb 13, 2019

Yes - I feel at this point that our labor is simply being used to help provide training data for Google's algorithms.

teej · on Feb 13, 2019

I mean, this has been exactly the situation since recaptcha was invented.

Theodores · on Feb 13, 2019

This is not how it is supposed to be.

In a parallel universe of fluffy niceness we willingly provide our help and in that way we get all those old books converted to ASCII and available for us to read online. Our efforts are for the good of mankind. Similarly with the newer challenges, we help the maps be up to date and again this is for the good of mankind and those needing help getting around.

Clearly this doesn't work in an era where the 'don't be evil' mantra is long forgotten and people only see Google as some advertiser friendly capitalist monopolist beast.

Google need to work on their relationship with their customers, to be a benevolent dictatorship of sorts. They are lousy at customer service and there are other pain points that they are ignorant to. I don't see how this helps.

JohnFen · on Feb 13, 2019

> Google need to work on their relationship with their customers

Google's customers are those who buy advertising. The rest of us are just cannon fodder.

RidingPegasus · on Feb 13, 2019

The "hills" category is the worst.

Have come to the conclusion it really means any patch of grass.

XCSme · on Feb 13, 2019

How to label your ML data for free.

hombre_fatal · on Feb 14, 2019

By offering a free service to website operators that mitigates the ever-growing challenge of abuse on the internet that they and their users have to deal with.

Seems pretty bilateral.

pfortuny · on Feb 13, 2019

Using highly trusted users thinking that they are just proving their honesty.

hartator · on Feb 14, 2019

At SerpApi.com, we built a bot to check these boxes and an AI to solve the actual CAPTCHA.

Checking the box is actually not that hard. There are no advanced measurements of your mouse and touch speed. This is an Internet myth. It's more a game of cookies, making them age well, and having an organic set of headers.

mlb_hn · on Feb 14, 2019

The misconceptions about nonhuman/invalid traffic (NHT) seem like a problem brewing. There's an arms race between the NHT guys (e.g. ad fraud networks) and the guys trying to detect it (e.g. ad providers). Meanwhile, a lot of people are using analytics to inform decision making assuming most traffic is legitimate (e.g. news organizations, anyone doing A/B testing). The guys naively using analytics with weak feature detection may be totally unprepared to deal with nonhuman traffic from repurposed networks which have been optimized to defeat the more advanced countermeasures =/

SheinhardtWigCo · on Feb 14, 2019

Aren’t you afraid of being sued by Google for selling their search results?

wstuartcl · on Feb 14, 2019

My first thought as well, looks like the business is 100% based on breaking TOS.

From their site: Is scraping legal? In the United States, scraping public resources falls under the Fair Use doctrine, and is protected by the First Amendment. See the LinkedIn Vs. hiQ scraper ruling for more information. This does not constitute legal advice, and you should seek the counsel of an attorney on your specific matter to comply with the laws in your jurisdiction.

ROFL, I guess if you are able to ignore the layers of other issues (TOS, breaking of technology meant to specifically exclude your use case, etc.) and are only willing to apply some very tangential case law to your reasoning, it is "legal".

mrccc · on Feb 14, 2019

The captcha always reminds me of The Stanley Parable:

> Employee #427's job was simple: he sat at his desk in room 427 and he pushed buttons on a keyboard.

> Orders came to him through a monitor on his desk, telling him what buttons to push, how long to push them, and in what order.

fabioborellini · on Feb 14, 2019

Isn't this a typical Quora answer? Full of filler and shitty hard-to-verify details that provide no value to the answer ("the language is encrypted twice", what the hell), and very little effort on answering the actual question (what is the purpose of CAPTCHA).

And the community rules try to block people from writing firm "you're full of shit"-like answers, even though every other answer of Quora is full of lies like "Linux is fast, because it was designed for 16-bit computers".

jaabe · on Feb 14, 2019

I had my “wow, this place might not be that good” experience with Quora yesterday when I was googling to evaluate AWS workmail.

Quite a lot of the “extremely good looking” answers on Quora straight up said that you couldn’t do e-mail in AWS. These were answers from after workmail was a thing by the way.

So I started looking at other Quora answers on stuff I wouldn’t normally need an answer for, and it’s frightening how often completely wrong answers look correct.

Don’t get me wrong, there are a lot of truly amazing answers as well, and it’s entirely possible that I just suck at it, but I don’t think I can always tell the amazing answer from the completely wrong one.

adventured · on Feb 14, 2019

My experience with Quora has been that more often than not the older the answer, the better it is. I find that answers in history are often better than in tech. It always seems like the community that initially built Quora stopped building it further several years ago, and now it's floating out in space Wile E. Coyote style.

distant_hat · on Feb 14, 2019

Quora went significantly downhill a few years back. It was a combination of hordes of new users, bad moderation, and bad incentives (order in which answers get shown etc).

welly · on Feb 14, 2019

I was going to say the same. I don't use Quora at the moment but when I did, I'd be more interested in subjective questions (history, geographical - ie. travel etc., and more philosophical questions) rather than objective and technical questions (and answers) as I've found many of them to be simply untrue.

pure-awesome · on Feb 14, 2019

Sounds like a form of the Gell-Mann Amnesia Effect:

https://en.wikipedia.org/wiki/Gell-Mann_amnesia_effect

KajMagnus · on Feb 14, 2019

The answer is not just fluff. It for example links to https://github.com/neuroradiology/InsideReCaptcha where you can read more.

hmottestad · on Feb 14, 2019

I looked at that and it’s pretty nifty. It actually looks like google did encrypt the client side code and implement their own JavaScript VM and that the decryption key for the source is based on variables inside the VM during execution (some kind of state) in some way and other properties from the webpage (css is mentioned). It all falls into the realm of obfuscation.

ebg13 · on Feb 14, 2019

> the actual question (what is the purpose of CAPTCHA)

This assessment isn't quite right. The actual question is about how the captcha differentiates between a human and a bot at the box checking stage.

You're right, though, that it is both full of filler and also doesn't address the question as posed at all.

> Why can’t a bot tick the 'I'm not a robot' box?

It can, by taking over the mouse...except...

LUCKILY, the top answer (on my screen at least it's https://www.quora.com/Why-can-t-a-bot-tick-the-Im-not-a-robo...) does actually try to answer the question.

I feel like the OP submission might have just been some sort of submarine self-promotion for the "CEO of <redacted>".

thatoneuser · on Feb 15, 2019

That answer talks about mouse movements. Most captchas I do these days are mobile. So I don't trust this as a very solid answer.

ebg13 · on Feb 15, 2019

Mobile device movement variability (IMU, compass, soft vs hard press, and so on) and mouse movement variability are relatively analogous, and both can be measured and analyzed in very similar ways.

It also links to a patent describing a novel mobile captcha invented by the author, so they might have some knowledge about the domain.

golergka · on Feb 14, 2019

[flagged]

cotelletta · on Feb 14, 2019

It absolutely is. By appealing to nicety, you can screw people over subtly enough that it doesn't register to observers as offense, and then act offended yourself when you get pushback.

You can also add in some fundamental attribution error, as in "I am just looking out for everyone. You are being difficult. They are engaging in bad faith."

djflutt3rshy · on Feb 13, 2019

The box has made browsing using TOR insufferable! It fusses and makes me click storefronts and traffic lights until I run out of patience and close out of whatever webpage I was trying to visit. I assume it has to do with a lack of Google cookies on the browser, essentially punishing me for trying to protect my privacy.

Kalium · on Feb 13, 2019

This might surprise you, but it actually has to do with what traffic coming out of TOR looks like. Well in excess of 90% of traffic coming out of TOR is spam, bots, malicious, or some combination!

Google isn't going out of their way to punish you for trying to protect your privacy. They're trying to stop unwanted traffic. By unfortunate happenstance, you appear to be disguising yourself in the exact same way a shocking amount of bad traffic is.

mattlondon · on Feb 13, 2019

Not just for Tor.

I use Firefox with a few basic extensions (Privacy badger, uBlock, Google Container) yet every time I am presented with having to pick out traffic lights over and over and over again. I usually have about 5 or 6 "challenges" before I give up and use another site.

My timezone has not changed, my IP address and rough location has not changed, my screensize has not changed, my broadband speed has not changed, and my general computer dexterity has not changed, yet I am relentlessly targeted. On chrome I never saw these challenges, but on firefox with the privacy plug-ins I am always always always challenged.

At this stage I think the only signal it is using is "is there a google cookie in this browser? and if so has the google cookie got some 'normal' looking activity logged against it?" I.e. they are checking their server-side logs for a given cookie ID and seeing if that looks normal or not (i.e. seen on google search, seen on youtube, seen ads from a variety of third parties on various different sites, mixed up with time of day and speed of viewing etc etc).

Since I have got Google in a container in Firefox, I am guessing that my google cookie is not present when the captcha loads (due to the containers and privacy badger et al) so there is no identity back in the mothership to compare me against.
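The guess above can be written down as a toy heuristic. This is purely a sketch of the speculation in this comment: the function, the activity log structure, and the thresholds are all hypothetical, and none of it is known to reflect what Google actually does.

```python
def needs_challenge(cookie_id, activity_log, min_events=20, min_distinct_sites=3):
    """Toy rule: challenge anyone without a cookie that has 'normal' history.

    activity_log maps a cookie ID to a list of (site, hour_of_day) events.
    All names and thresholds here are invented for illustration.
    """
    if cookie_id is None:
        return True  # no cookie sent (e.g. container tab): nothing to compare against
    events = activity_log.get(cookie_id, [])
    distinct_sites = {site for site, _hour in events}
    # Too little history, or history on too few sites, does not look "normal".
    return len(events) < min_events or len(distinct_sites) < min_distinct_sites

log = {"abc": [("search", 9)] * 10 + [("youtube", 20)] * 10 + [("news", 13)] * 5}
print(needs_challenge("abc", log))  # False
print(needs_challenge(None, log))   # True
```

If something like this were the only signal, a browser that isolates or deletes the cookie would be challenged every time, which matches what I'm seeing.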

gcb0 · on Feb 13, 2019

for google, you are the enemy. not even bots.

captcha is google master blow against ad blockers.

a regular user, on whom they have all the info, gives them dollars per ad impression. You, with your doNotTrack (ha! that was a joke) and privacy addons, make them only cents per ad impression.

you are google's enemy. remember this when you get stuck in captcha hell (and consequently censored from most sites until changing device/ip)

nine_k · on Feb 13, 2019

IDK. I run Firefox on many OSes, everywhere with uMatrix that blocks known trackers, ad networks and such. I don't see most ads (if any).

I rarely see the "I am not a robot" box, and haven't seen image recognition tasks for a long, long time.

raws · on Feb 14, 2019

That also heavily depends on what kind of/which sites you visit.

gcb0 · on Feb 14, 2019

"that also depends if you have something to hide" was said of every police state and censorship scheme.

gcb0 · on Feb 14, 2019

if you were really blocking all trackers, Captcha wouldn't even work. Firefox's help page for their new tracker blocking feature even says so.

jplayer01 · on Feb 14, 2019

They're on a lot of sites that I frequent.

jplayer01 · on Feb 14, 2019

Yup. It's insufferable. Even on sites where I'm a paying customer, I have to go through captcha garbage.

mcv · on Feb 14, 2019

If you're a paying customer, complain to the company. Let them know their site is annoying and frustrating to use because of this.

If they lose enough customers over this, they will probably remove the captcha.

raws · on Feb 13, 2019

I think Quora overstates what Google looks at by a wide margin. Just try to access a captcha in incognito: they won't have access to as much info as they usually do on you, and yet you're still presented with the same level of captcha (if not more of them, which is to be expected).

mcv · on Feb 14, 2019

Sometimes just checking the checkbox is enough. Sometimes you need to identify cars and store fronts. I think the better Google knows who you are, the more likely just the checkbox is going to be enough. If you go incognito, you have to train their neural nets, if you give up your privacy, you get in for free.

The clever part from Google's perspective is that you have to trade one of these things to Google in order to get access to sites that do not belong to Google at all. Google convinced site owners to have their users pay a tax to Google.

speedplane · on Feb 14, 2019

There are many services out there that can solve Google's recaptcha for fractions of a penny. When someone puts one up, they can make things more expensive, and perhaps sometimes uneconomical, but in general, the cost is low (~$2.00 for 1,000 recaptchas).
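To put the rough figure above in per-request terms, here is a small back-of-the-envelope calculation. The $2.00 per 1,000 price is the approximate number quoted above; the function name is just for illustration.

```python
def solving_cost(num_captchas, price_per_thousand=2.00):
    """Estimated cost in dollars of outsourcing num_captchas solves."""
    return num_captchas * price_per_thousand / 1000

print(solving_cost(1))          # 0.002
print(solving_cost(1_000_000))  # 2000.0
```

So at that price a single solve costs about a fifth of a cent, and even a million requests costs only a couple of thousand dollars.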

When someone uses a recaptcha, they should think about why they are doing so. It's one thing to use it to save a business model, but it's another to use it to protect information that should be free anyway. The elephant in the room is government data. Many government agencies think that selling their data can be a nice source of side revenue, and a recaptcha is a good way of enforcing it. In reality, they just increase the costs for everyone, and those with means can obtain the information while those without means cannot.

Governments need to release their data, freely, without captchas or fees for single users and bulk users, no exceptions.

orzig · on Feb 14, 2019

I've actually been pleasantly surprised at how much data /is/ available, and how much of it is available through common formats like Socrata Open Data API (for use with tools like https://github.com/xmunoz/sodapy)

The counterargument is that they do a great job with trivial stuff like registered dogs' names, and less well with sensitive/important issues like policing.

What's the right way to leverage the platform developed for the first into the second?

mcv · on Feb 14, 2019

> Governments need to release their data, freely

Totally agree. Fortunately the Dutch government is trying to make as much data open as they reasonably can, and regularly organise events to encourage developers to use their open APIs.

crankylinuxuser · on Feb 13, 2019

> My timezone has not changed, my IP address and rough location has not changed, my screensize has not changed, my broadband speed has not changed, and my general computer dexterity has not changed, yet I am relentlessly targeted. On chrome I never saw these challenges, but on firefox with the privacy plug-ins I am always always always challenged.

That's because Google isn't just profiling "Tor users". They're going after anyone who values privacy in any way or technology.

Simply put, you're being punished for ensuring privacy. And anybody who uses Google's captcha services is an accessory to that.

sneakernets · on Feb 13, 2019

There is no Google "punishment algorithm". It's just computers being dumb.

SahAssar · on Feb 13, 2019

Somebody made those computers dumb in that exact way. That's the complaint.

JohnFen · on Feb 13, 2019

I think that Google is more than happy to punish people for protecting their privacy. That may or may not be the main goal, but it doesn't appear to be something Google considers a downside.

YUMad · on Feb 13, 2019

Sometimes people intentionally make computers dumb.

JCSato · on Feb 14, 2019

Same thing happens to me, same extensions involved, mostly browse incognito. I bet your suspicion is spot on.

LukaD · on Feb 14, 2019

I use chrome with Privacy Badger + uBlock Origin and I have to solve the CAPTCHAs every single fucking time. I even have to solve them multiple times. At this point I just leave a page if they have one of those captchas.

imtringued · on Feb 13, 2019

>This might surprise you, but it actually has to do with what traffic coming out of TOR looks like.

That's a massive load of bullshit. Google has a captcha challenge that only humans can solve. That alone is already sufficient to prevent unwanted traffic. That is how every captcha system works. However google is an exception. If you're logged in to a google account or are using chrome then google can use that information to track your captcha history. Privacy minded people avoid google like the plague and therefore they cannot be tracked.

>Google isn't going out of their way to punish you for trying to protect your privacy.

Except this is exactly what happens. It's not "unfortunate". It works like this by design.

If google cannot track you then the captcha will force you to do something that no other captcha system does: give you even more challenges even if you have solved them correctly. You will spend the next 5 minutes solving captchas correctly and then at the end it will tell you you've failed. This again is unique to google: correct answers lead to failure. The problem immediately goes away if you let google track you, it doesn't matter how bot infested the network is. No other captcha system does it this way.

Google is clearly doing this to get free labour to label their datasets, force people to have a google account and encourage them to use chrome.

cortesoft · on Feb 13, 2019

If you are using TOR, and not accepting cookies, they are going to have no way of knowing that you are the same user who just solved the CAPTCHA. Every request is going to appear to be from a new user.

If you do everything you can to prevent google from knowing who you are, don't be surprised when they behave like they don't know who you are.

Dylan16807 · on Feb 14, 2019

Tor Browser accepts session cookies. It won't have an established google identity, but it fully supports a temporary "solved the captcha" identity.

cortesoft · on Feb 14, 2019

> The problem immediately goes away if you let google track you

I took that to mean they were blocking cookies

puzzle · on Feb 14, 2019

What prevents a botnet from sharing that same session cookie?

Dylan16807 · on Feb 14, 2019

A botnet doesn't need Tor in the first place. And you can limit the use of a single captcha solution. It's not much different from the problem of a legitimate google account being borrowed by a bot.

darkpuma · on Feb 13, 2019

The tile fade-in is also egregious. The only reason that exists is to punish humans.

Wowfunhappy · on Feb 13, 2019

Well, it punishes bots in an equal amount, in that the bots have to wait longer before they can retry.

darkpuma · on Feb 14, 2019

It punishes humans more than computers because computers are more efficient multitaskers. A computer can find a productive way to use the second between each tile fade in, but a human has no realistic way to productively use that second. The human sits there staring at the screen waiting, while the captcha-solving computer does other things (perhaps solve other captchas given to it through other connections.)

bduerst · on Feb 13, 2019

Slight nitpick but past captcha successes are a characteristic of cyborg accounts, which still act as a bot most of the time.

A lot of the behavior that captcha exhibits is in part a function of feature analysis from ML models - features that may seem ridiculous to layman humans but make sense to a neural net plugged into the data.

sjwright · on Feb 14, 2019

> That's a massive load of bullshit. Google has a captcha challenge that only humans can solve. That alone is already sufficient to prevent unwanted traffic.

It's not bullshit, it just depends whether your website is being targeted directly or not. We're targeted directly and the robots hitting us are getting the CAPTCHAs solved, presumably with human help.

Mirioron · on Feb 13, 2019

This sounds like it should be illegal.

bogdan · on Feb 13, 2019

I think you are partially wrong. Maybe Google is not doing this intentionally but it also doesn't happen just because traffic is coming out of a tor node. I am using ff with some of the recommended extensions from https://www.privacytools.io/ and I get to fill in traffic signs all the time. And yes I am logged into Google.

morpheuskafka · on Feb 13, 2019

I think what OP is talking about is Cloudflare's decision, not Google's. Google provides the CAPTCHA API but Cloudflare decides to flag nearly all Tor traffic and make it go through the CAPTCHA.

eldridgea · on Feb 13, 2019

In the case of Cloudflare specifically, they support Privacy Pass[0], an extension that allows solving one captcha to allow you through to multiple sites without de-anonymizing or reducing the security properties that tor provides.

[0] https://blog.cloudflare.com/cloudflare-supports-privacy-pass...

jorvi · on Feb 13, 2019

Cloudflare is a good actor, they offer the PrivacyPass extension that basically generates 30 auth tokens from one CAPTCHA challenge and then uses those until it needs new tokens. Sadly the overwhelming majority of sites don't use CAPTCHA through CloudFlare but directly through Google, rendering PrivacyPass moot.
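The "solve once, spend many times" idea described above can be sketched roughly as follows. This is only a toy model of the bookkeeping: the real extension relies on cryptographic blinding so that tokens cannot be linked back to the challenge that issued them, and none of that is shown here. The class `TokenWallet` and its methods are hypothetical names, and the batch size of 30 is taken from the comment.

```python
import secrets
from collections import deque

TOKENS_PER_CHALLENGE = 30  # batch size mentioned in the comment above


class TokenWallet:
    """Hypothetical client-side store of single-use pass tokens."""

    def __init__(self) -> None:
        self._tokens = deque()      # unused tokens, oldest first
        self.challenges_solved = 0  # how many CAPTCHAs the human had to solve

    def solve_challenge(self) -> None:
        """Stand-in for a human solving one CAPTCHA and receiving a batch."""
        self.challenges_solved += 1
        for _ in range(TOKENS_PER_CHALLENGE):
            self._tokens.append(secrets.token_hex(16))

    def spend(self) -> str:
        """Return one unused token, solving a new challenge only when empty."""
        if not self._tokens:
            self.solve_challenge()
        return self._tokens.popleft()

    def remaining(self) -> int:
        """Number of tokens left before the next challenge is needed."""
        return len(self._tokens)


if __name__ == "__main__":
    wallet = TokenWallet()
    for _ in range(31):  # 31 gated page loads
        wallet.spend()
    print(wallet.challenges_solved, "challenges,", wallet.remaining(), "tokens left")
```

In this model the human is interrupted once per batch rather than once per page, which is the convenience the comment describes.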

SahAssar · on Feb 13, 2019

Cloudflare is not a good actor in this, they have shown that they do not care about encryption (allowing non-https backends while showing https to the end user) and embedding trackers in verification pages (the CAPTCHAs on random pages).

jplayer01 · on Feb 14, 2019

Cloudflare is the scum of the internet. They've put a crazy amount of effort towards making wide swathes of the internet unusable for people trying to protect their identity and privacy. I wouldn't trust their implementation of Privacy Pass.

jgrahamc · on Feb 14, 2019

Sigh. We changed this so long ago yet people repeat this over and over again. Do you use the Tor Browser? Please show me a site on Cloudflare which uses CAPTCHA on Tor.

Legogris · on Feb 14, 2019

I don't know about TOR, but a couple of years ago we had a site on Cloudflare that had the CAPTCHA come up for visitors from mainland China - where the great firewall blocked the requests to Google. Chinese users were effectively locked out. We contacted Cloudflare about this and got dismissive replies.

RidingPegasus · on Feb 13, 2019

I ended up removing a chrome extension that randomises user-agents because of this. It dramatically cut down google captchas.

Another thing that sets it off is virtual machine usage, I can be logged into chrome and gmail on the same residential IP for hours but the moment I try to search google for a problem inside a VM it's a minute of slow loading captchas.

Have moved to bing instead, that sort of wasted productivity is a burden.

packetslave · on Feb 13, 2019

This. The reality is, Google (and Cloudflare, and everyone else trying to block scrapers and malicious traffic) use heuristics that boil down to "99% of our traffic behaves like this". If you go out of your way to fall into the 1%, e.g. using Tor, disabling Javascript, randomizing your user-agent, etc., you're going to get CAPTCHAed.

Laforet · on Feb 13, 2019

Yeah, blending in seems to work better in many cases. Remember the guy who sent a bomb threat over TOR? The only reason he was caught so quickly was because he's the only person on the organisation's network to have accessed TOR before the incident.

enriquto · on Feb 13, 2019

> Google isn't going out of their way to punish you for trying to protect your privacy. They're trying to stop unwanted traffic. By unfortunate happenstance, (...)

This does not agree with my experience. I browse without cookies and severely limited javascript (using umatrix), and I also encounter the myriad of ridiculous inconveniences that the OP was referring to. On the good side, however, the web is much faster and generally less annoying.

asdfasgasdgasdg · on Feb 13, 2019

> I browse without cookies and severely limited javascript (using umatrix), and I also encounter the myriad of ridiculous inconveniences that the OP was referring to.

Isn't this also something that many bots do (don't run javascript and don't have realistic cookies)? It seems like just another instance of reducing your distance from the "bot" cluster in agent-space.

Kalium · on Feb 13, 2019

These are exactly the kinds of behaviors that bots sometimes engage in.

everdrive · on Feb 13, 2019

It's often the website provider redirecting users to a captcha based on certain conditions.

sjwright · on Feb 14, 2019

As a webmaster I can confirm that I hard block all TOR traffic for this exact reason. 90% of this traffic is malicious robotic junk of some form.

Also, I’m just not interested in the remaining 10% "legit" traffic from people who are aggressively paranoid about their privacy. Almost all of them ended up being dickheads who were using TOR to abuse other members of our community.

To the people who think every website should treat TOR users with respect, please understand that you are intentionally making yourself indistinguishable from the mountain of robotic junk, abuse and human dickheads. It's not my fault that you have chosen to do this, and it's not my job to provide you with tools to prove you're not a dickhead.

sjwright · on Feb 14, 2019

To the people voting me down, please understand that I am relaying factual information about my specific experience as webmaster of various large-ish regional websites. If you don't like the facts, voting them down won't change them.

...or maybe voting me down will change the facts.

Yeah, that's totally going to work.

skykooler · on Feb 13, 2019

Google seems to do the same even if you check the box while in an incognito window; I doubt the issue is TOR itself, but rather the lack of tracking data that Google has on that particular session.

muzani · on Feb 13, 2019

Have you used Captcha on TOR? It really does feel like they're trying to punish you. They give you about 4 pages of "identify the traffic light", all of which are difficult for humans, then reject and give you another 4 pages. Or that thing where it fades out for about 7 seconds before you click the next image, and then wait another 7 seconds.

duxup · on Feb 13, 2019

Google used to HATE my VPN, couldn't do anything on Google through it without a dozen damn pics to choose from.

My VPN must have gotten white listed (or cracked down on some of their traffic patterns) because that stopped.

jammygit · on Feb 13, 2019

Or did they just get better at fingerprinting us?

duxup · on Feb 13, 2019

Nothing would surprise me.... I mostly experienced it on an android phone....

briandear · on Feb 13, 2019

If Google wants to do that, that’s their prerogative. What pisses me off is when a bank or similar “secure” type of service forces me to train Google’s ML models in order to access my stuff. I didn’t agree to provide unpaid labor to Google.

wl · on Feb 13, 2019

Running uBlock origin seems to trigger the same thing, even on a static IP. It feels an awful lot like punishment to me.

gnulinux · on Feb 14, 2019

This is a horrible argument. What gave Google the right to be the moral authority of the internet (well, we did)? Even if 99% of exits from tor nodes are malicious, Google should have absolutely no capability to throttle this traffic. Unless you claim most of the traffic in tor is from bots, your argument doesn't make any sense. Captchas serve 2 purposes: slowing down bots, annoying humans. By putting captchas on tor exits, Google not only slows down a minuscule number of bots, but also annoys human traffic (good or bad). It is by no means a "good" thing that Google is capable of this.

fock · on Feb 13, 2019

I have exactly the same experience without using Tor, living in Germany...

I personally don't care too much about the hassle, but I really don't like the idea that I'm basically playing Artificial "Intelligence"/doing clickworking for the not so community oriented efforts of Google.

malvosenior · on Feb 13, 2019

> Well in excess of 90% of traffic coming out of TOR is spam, bots, malicious, or some combination!

Do you have any data on this?

Kalium · on Feb 13, 2019

An excellent, wise, and cogent question! In fact I do have data. You can find it here: https://blog.cloudflare.com/the-trouble-with-tor/

> On the other hand, anonymity is also something that provides value to online attackers. Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious. That doesn’t mean they are visiting controversial content, but instead that they are automated requests designed to harm our customers. A large percentage of the comment spam, vulnerability scanning, ad click fraud, content scraping, and login scanning comes via the Tor network.

The obvious caveats apply, of course. It's completely possible what Cloudflare saw at the time is no longer true and TOR is no longer mostly spam. It's equally fully possible that the traffic Cloudflare sees is wildly unrepresentative of what TOR traffic actually looks like, and it's mostly people worried about their privacy. This is just the data we have at the moment.

jcoffland · on Feb 13, 2019

A small percentage of bad actors using automation can produce a lot of traffic. So although it may be true that a large portion of the requests coming from TOR exit nodes is malicious, it would be unwise to conclude that most users of TOR have bad intentions.
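A small worked example of that distinction between requests and users. All numbers here are invented purely to show the arithmetic and are not measurements of Tor; the function name `malicious_share` is hypothetical.

```python
def malicious_share(humans: int, requests_per_human: int,
                    bots: int, requests_per_bot: int) -> tuple[float, float]:
    """Return (share of requests that are automated, share of users that are bots)."""
    human_requests = humans * requests_per_human
    bot_requests = bots * requests_per_bot
    request_share = bot_requests / (human_requests + bot_requests)
    user_share = bots / (humans + bots)
    return request_share, user_share


if __name__ == "__main__":
    # Assumed scenario: 990 people making 50 requests a day each,
    # and 10 automated clients making 80,000 requests a day each.
    request_share, user_share = malicious_share(990, 50, 10, 80_000)
    print(f"automated share of requests: {request_share:.1%}")
    print(f"automated share of users:    {user_share:.1%}")
```

With these made-up inputs, about 94% of requests are automated even though only 1% of the users are, so a per-request statistic says little about the intentions of the typical user.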

kelnos · on Feb 13, 2019

True, but from the perspective of an org like CloudFlare, that doesn't matter. They don't know (or care) about the user breakdown coming from Tor; they just know that the vast majority of traffic coming from it is malicious. And since part of the point of Tor is to make it hard to determine who's who, the good traffic gets binned with the bad.

thfuran · on Feb 13, 2019

I don't think anyone is concluding that.

jcoffland · on Feb 14, 2019

I think a lot of people come to exactly that conclusion.

SideQuark · on Feb 14, 2019

Why would they? Most people cognizant of these terms know a bot generates more traffic than a human; that’s the point of most bots.

sjwright · on Feb 14, 2019

Cloudflare's documented experience aligns closely with mine; I've been limiting or blocking TOR ever since 2008 because over 90% of the traffic was malicious bots, and the majority of the remainder was malicious humans.

And when you have malicious traffic swimming in an anonymous pool, there's no practical alternative but to block all of it.

crankylinuxuser · on Feb 13, 2019

Isn't cloudflare the org that "Doesn't censor under any circumstances", and then turned around and censored white supremacists? Not that I agree with them (I DON'T!), but it was a full 180.

And also, isn't cloudflare also the one to allow booters and stressers to be online behind CF - and they used stolen CC's to boot?

The Tor decision to screw users over is just the cherry on top. Especially egregious is when a captcha is demanded on even a simple static page. Seems pretty obvious what's going on here.

Frondo · on Feb 13, 2019

Everyone should censor and shun white supremacists. They have no place in modern society. When they shed their noxious views, we can all welcome them back with open arms.

Frondo · on Feb 14, 2019

[flagged]

dang · on Feb 15, 2019

The obvious explanation is that people were downvoting and flagging your comment because it was unsubstantive and ideological flamewar, not because they are white supremacists.

You continued to post flamewar comments. We ban accounts that do that repeatedly, so could you please stop? We've already had to ask you more than once before.

https://news.ycombinator.com/newsguidelines.html

Legogris · on Feb 14, 2019

That comment is not very constructive for discussion and against the guidelines, just like the replies. Downvoting is not censorship. https://news.ycombinator.com/newsguidelines.html

teddyh · on Feb 14, 2019

Hacker news people are usually against censorship, yes.

Frondo · on Feb 14, 2019

[flagged]

malvosenior · on Feb 14, 2019

Just like the ACLU. Free speech is very important. If someone has something objectionable to say, let them expose themselves. Censorship solves nothing.

Frondo · on Feb 14, 2019

[flagged]

ls612 · on Feb 14, 2019

Says the brave man yelling at people on the internet from his chair.

malvosenior · on Feb 14, 2019

I think you're doing a great job demonstrating why allowing people to expose their horrible ideas does more to dissuade other people than censorship. I'm glad your replies are on display even if I strongly disagree with them.

mamon · on Feb 13, 2019

Ok, so you've decided that being white supremacist is bad. I can agree with you on that, but still the question remains: who gets to decide what has a place in modern society? Who decides what "modern society" even is? Today Google might decide to censor white supremacists, tomorrow it can be human rights advocates. I think that allowing any type of censorship, even for such a noble cause as fighting racism, is a slippery slope. Especially when done by a private company that is outside of our control (and governments are only marginally better).

Frondo · on Feb 13, 2019

You're trying to generalize a useful rule ("shun white supremacists") but it doesn't work in this case. I don't think we need to, either.

We're not robots. We can shun white supremacists and leave everyone else alone. This isn't a slippery slope, it's just good sense (no more white supremacists, hey!). Humankind will get along just fine if we tack on that one extra rule and all follow it.

sincerely · on Feb 14, 2019

Good thing the definition of white supremacist is commonly agreed upon and noncontroversial and absolutely isnt subject to definition creep :)

krapp · on Feb 14, 2019

> Good thing the definition of white supremacist is commonly agreed upon and noncontroversial and absolutely isnt subject to definition creep :)

The definition is commonly agreed upon, and what "white supremacist" means is not at all controversial to most people. It certainly isn't so arbitrary as to be meaningless.

Now, the term may be misapplied at times, as may any term, but for it to be misapplied, it has to have an accepted application to begin with. A term without a definition can't be subject to definition creep, and the possible creep of a term like "white supremacist" isn't that wide to begin with.

Frondo · on Feb 14, 2019

What isn't subject to definition creep?

Murder now, for some, includes abortion.

Censorship now includes, for some, private companies removing bad actors from their private systems.

Come on, that's lazy to dismiss it that way when society literally changes all the time.

mamon · on Feb 14, 2019

Offtopic, but regarding abortion:

Whether or not abortion is a murder is not about definition of "murder" it is about definition of "human being".

There's no doubt that abortion involves killing a living creature, the whole pro-choice vs pro-life debate is basically about one simple question: "is a fetus a human being?". If you answer that with "yes", then every abortion becomes a murder, plain and simple.

This also explains why there will never be a compromise between two crowds: it is logically impossible to compromise on yes/no questions.

tomatocracy · on Feb 14, 2019

Isn’t the compromise position essentially ‘after X weeks’, where the value of X is highly contested? (And on the binary yes/no question there’s nuances too which get debated eg if continuing the pregnancy would be a significant threat to the mother’s life)

nprateem · on Feb 13, 2019

I think they do exactly that. For example disabling browser fingerprinting in firefox and not being logged into Google causes the majority of sites to display the captcha, especially when using a VPN.

xiphias2 · on Feb 13, 2019

They could use a memory-hard hashing function, like ARGON2 for proof of work, it would make spamming much harder.

hombre_fatal · on Feb 13, 2019

Not really, because spam isn't done on the spammer's hardware. Not to mention, an expensive hashing function is precisely something bots can do but humans cannot.

If you're putting constraints on Tor traffic, it's not because of raw throughput. It's because it's extremely poor quality traffic.

xiphias2 · on Feb 13, 2019

I see. The goal of ARGON2 is not to be expensive, but to be hard to parallelize. Anyway, the other points that you wrote make sense.

Kalium · on Feb 13, 2019

You're absolutely right! It could even be integrated meaningfully into browsers to make it easier to work with. Something like Cloudflare's Privacy Pass (https://support.cloudflare.com/hc/en-us/articles/11500199265...) could work.

xiphias2 · on Feb 13, 2019

It looks really nice.

It should be the default for the TOR browser for sure; if just a few people use it, it decreases the anonymity set.

Kalium · on Feb 13, 2019

Nah, it was released back in 2017. I've seen it discussed periodically ever since.

The issue with just doing memory-consuming work client-side is that it only marginally slows down spamming. Spammers tend to use compromised machines they don't own. Unless you can make it prohibitively expensive to calculate something using machines you don't pay for - perhaps not a trivial ask - you wind up needing a different set of tools. This is why Google tends to look at things that will exhibit human variation rather than pure computation.

It's not that your ideas aren't good. I'm sure ARGON2 has a use here! It's that this might not be a problem easily solved by consuming more resources.

xiphias2 · on Feb 13, 2019

Cool, I'll try it out the next time I have a problem with using TOR. You're right that ARGON2 doesn't help if CPUs/RAM are free, it just makes parallelization hard.

Kalium · on Feb 13, 2019

Parallelization is easy if you have a botnet of millions of machines owned by others to draw on.

teilo · on Feb 13, 2019

Those images are infuriating!

Click all boxes with traffic lights. Ok, well, this one box just barely contains the bottom right corner of the traffic light. Click. Nope, that little corner didn't count. Try again. Ok, well on this one, the right side of the traffic light is only barely over the line, so I won't click it. Nope, that sliver of the light mattered this time. MF!

SilasX · on Feb 13, 2019

Heh, maybe one day they can show a bunch of pictures of sand, where each subsequent pic has a grain removed, with the instructions "click on all the heaps".

Spambots will solve the Sorites paradox!

roywiggins · on Feb 13, 2019

"Click all the ships of Theseus"

Legogris · on Feb 14, 2019

"Is there no ship? Close the browser window."

rolph · on Feb 13, 2019

click on each star that is currently visible out your window :-/

nomel · on Feb 13, 2019

I actually have a few screenshots where the task was impossible since the data was mislabeled. The latest example was "click all of the buses". It wouldn't let me continue because I wouldn't select the fire truck.

My naive assumption is that you should click the "refresh" button in these cases.

Freak_NL · on Feb 13, 2019

Just click whatever you suspect is needed to pass. Don't go above and beyond trying to give the actual right answer; you're just feeding some proprietary database owned by Google. QA for it is their problem.

SahAssar · on Feb 13, 2019

There is some alternate (or future) reality where a google self driving car accident is blamed on bad training data from CAPTCHAs.

darkpuma · on Feb 13, 2019

> " It wouldn't let me continue because I wouldn't select the fire truck."

Another one is "click the mountains". It typically won't let you through unless you click anything with trees on the horizon, even if the terrain is clearly flat. Google's robot thinks mountains are made out of wood, and any human who disagrees is labeled a robot. It's insanity.

cesarb · on Feb 14, 2019

I've recently gotten caught in one of these, where it was "click all of the bicycles" and after a few clicks (it was one of those which fade out to present a new picture) the only "bicycle" left was a bicycle-shaped street decoration. It wouldn't let me proceed unless I clicked on something, so I had to refresh to get a new task.

menacingly · on Feb 13, 2019

I assumed the infuriating ambiguity is intentional: in order to train some algorithm, they need to know what the prevailing human correct judgement is in dicey situations.

jonas21 · on Feb 13, 2019

I don't think it's intentional -- it probably just emerges from the training process.

I'm guessing they do something like load up a batch of images and once N people agree on one, record the answer and remove it from the rotation. You end up left with the ambiguous images where people couldn't agree.
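The guess above can be sketched as a small filter. Nothing here reflects how Google actually does it; it only illustrates the hypothesised process. The interpretation of "N people agree" as N identical answers in a row, the threshold of 5, the image names, and the function `update_rotation` are all assumptions.

```python
def update_rotation(answers_by_image: dict[str, list[bool]],
                    threshold: int = 5) -> tuple[dict[str, bool], list[str]]:
    """Retire an image once `threshold` consecutive answers agree.

    Returns (resolved labels, images still in rotation).
    """
    resolved = {}
    still_in_rotation = []
    for image, answers in answers_by_image.items():
        streak_value, streak_len = None, 0
        for answer in answers:
            if answer == streak_value:
                streak_len += 1
            else:
                streak_value, streak_len = answer, 1
            if streak_len >= threshold:
                resolved[image] = streak_value  # consensus reached, retire it
                break
        else:
            # No streak was long enough, so the image keeps being shown.
            still_in_rotation.append(image)
    return resolved, still_in_rotation


if __name__ == "__main__":
    answers = {
        "clear_hydrant.jpg": [True] * 5,
        "empty_road.jpg": [False] * 5,
        "pole_in_corner.jpg": [True, False, True, True, False, False, True, False],
    }
    resolved, remaining = update_rotation(answers)
    print("resolved:", resolved)
    print("still shown to users:", remaining)
```

Under this model the clear-cut images drop out quickly, and what users keep seeing is increasingly the set of images people disagree about.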

iheartpotatoes · on Feb 14, 2019

Then why do I keep seeing the same g-damn FIRE HYDRANT! :)

unclebucknasty · on Feb 13, 2019

>Those images are infuriating!

And, does the pole count?

The whole thing is way more stressful than it needs to be for what it is.

dataflow · on Feb 13, 2019

I'm convinced the ambiguity is intentional. What I don't get is what answer they expect in those scenarios.

drusepth · on Feb 13, 2019

I always figure they're looking for a population consensus. They're doing image recognition at scale and these are clearly ambiguous, hard images to classify. They could easily have a few people at Google say, "I determine this is a storefront" and make that the "correct" answer, but I think they're more interested in a consensus of what most "normal" people would classify as a storefront, especially in potentially-volatile classifications where real humans might argue over the answer. They can skip the argument and just know which side will win it.

darkpuma · on Feb 13, 2019

What they're actually getting though is the population consensus of what normal people believe Google's image classifier believes. The system incentivizes users to reinforce misconceptions their classifier has.

Does this look like a mountain to you? https://0x0.st/zzvr.jpg

Google's image classifier would think that's a mountain. If you disagree, google will classify you as a robot. After failing these sorts of challenges a few times, the user decides to play along and tell google what they think google wants to hear, rather than the truth.