When reCAPTCHA was created, the alternative was CAPTCHA, which tried to impede bots but did not generate any social benefit. This was the genius of the original reCAPTCHA concept: the time taken to 'confirm humanity' could be channeled into the socially-useful endeavor of digitizing books. Capture some of the heat emissions of impeding bots for a useful purpose, rather than letting it all go to waste.
Now, yes, Google is using it to train their self-driving car AI, and plenty else is happening in it that connects to Google's surveillance apparatus. There's much to legitimately criticize there. I personally don't view training Google's proprietary AI as the same kind of intrinsically altruistic purpose as digitizing the world's pre-digital books.
But putting the entire concept on blast with erroneous history that can be corrected with about 60 seconds on Wikipedia doesn't help the argument at all.
The risk of sharks is tempered by our experience with them. Few people swim at deep-water beaches (they have signs saying "Danger! Sharks!"), and those that do typically take appropriate precautions and maintain awareness.
Sharks don't want to eat you and will quickly let go of swimmers they attack, but that's irrelevant because by then the damage has already been done. When I was young, a lot of educational effort went into stating that shark attacks were rare, and that's true both in absolute numbers and by comparison with how feared they are. Jaws and its knockoffs spread irrational fear in the 70s/80s, and my early-90s childhood came with the counterpressure to that, but that counterpressure caused many in my peer group to misunderstand the risks. Shark attacks drop off hard to zero if you're swimming in shallow water. Even at 10 meters, which is not uncommon for surfers, they are a real risk. But surfers spend little time at 10 meters out. All of this forms a balance.
This covers any number of circumstances -- why travel is broadening, why rare / degenerative / mental health conditions are so frustrating to explain to healthcare providers / family / others, trying to communicate specialised knowledge, historical bias, what the old know that the young do not (and rarely, vice versa). Tacit vs. explicit knowledge. Theory vs. experience.
There's probably far better existing terminology than what I've come up with (Hume, Kant, and Berkeley address this, as does Plato, within philosophy). But it's also a major concern in a highly diverse yet tightly interconnected world.
This article also doesn't seem to touch on the newer reCaptcha that tracks you everywhere on a website (you'll notice a little blue box on the bottom right with the logo where this happens), not just on login or user input pages.
There is a lot to criticize about reCaptcha, including privacy concerns for sure, and there were some other posts about it on HN before.
I have a hunch that von Ahn knew this would happen and the same scan is shown to multiple users before a word is chosen.
reCAPTCHA v3 has no user interface; it only returns a score on which the site operator can act, often by delaying or blocking access to services. In this case the responsibility falls entirely on the site, while Google is no longer at risk of being found liable for the damage caused by its discriminatory service.
reCAPTCHA v3 works best when it is embedded on every page of a site. The service collects detailed interaction data on every site you visit that has implemented it. The extent of tracking is similar to Google Analytics, but you cannot block it without losing access to large portions of the web.
The collected data is highly sensitive: it contains not only your browsing history but a detailed snapshot of your actions on sites. Mouse movements can reveal health issues that affect your motor functions, and your interests and desires are laid bare based on how you interact with content.
You must resist adding reCAPTCHA v2 and v3 to your sites. There are alternatives that can offer the same level of protection for your services when used the right way. Their implementation may not be as convenient as reCAPTCHA's, but that is the price you must pay to prevent Google from mining our personal data and our every interaction on the web.
People are forced to hand over their personal data to Google at all times, otherwise they face losing access to services, and being excluded from societal processes that are increasingly happening exclusively online.
This is where privacy rights and human rights are violated, and it is upon all of us to make our voices heard, so that existing legislation is enforced, and new laws are put in place to prevent companies from abusing and exploiting us.
Handing over our data to Google must not be a condition to fully participate in society.
i'd even support a ban for other core services like utilities and banking that may not be public entities.
Well specifically, Niantic, which was a google internal thing.
Nor does an entirely fallacious premise. reCAPTCHA v3 is entirely transparent and non-invasive to users. In fact it's retroactive, to help the site admin figure out what to do with the score:
Except when you don't opt into google tracking you by blocking third party scripts, in which case your life still gets to be hell.
... who are fully and completely opted into all Google tracking and have previously participated in Google's ecosystem.
Pretty odd definition of "entirely fallacious".
Except for the invasion of my time and attention, used to train Google's AI to get better at recognizing traffic signals. I took that as the main point of the article.
v3 is "invisible" and is supposed to be deployed to every page on the site, and the site is the one who decides how to punish you for not matching their normal audience.
What I really hate about recaptcha v2 are those artificial delays before loading the next image (which can happen several times on a single card). And then in the end you frequently fail, despite answering everything correctly.
> My understanding of recaptcha v3 is, that it just gives you a score for how well google can track you and then leaves it up to the site operator to block users who aren't transparent enough.
Yeah, that's the one where you're supposed to put it on every page of your website so that Google can collect more information on your users. If they can't, they'll return a low score that you'll use to mark users as "bots".
> What I really hate about recaptcha v2 are those artificial delays before loading the next image (which can happen several times on a single card). And then in the end you frequently fail, despite answering everything correctly.
I think this is just what Google does when it thinks you're a "bot": i.e. they don't know who you are.
I do wonder why, though. For a long time, I assumed it was a rate limiter, but then another HN commenter pointed out to me that time is more valuable to humans than to bots. Bots can work on multiple captchas in parallel.
I've become accustomed to just closing any page that presents me with a v2 reCAPTCHA.
I'm waiting for the day when it pops up "find the humans" and there's someone clearly wearing CV-camouflage. For anyone that doesn't get it: you let that human hide, because it could be you in twenty years.
Edit: Unless, it occurs to me now, Google is monitoring how the user responds to the fade in. When in the fade process do they click the square, for instance.
Still seems like it wouldn't be all that helpful, though.
Maybe this is to prevent adversarial learning? If the images reload immediately, then the bot can learn (via a neural network) whether its solution was good or not. If there's a delay, it's the same, but the learning is slowed down by the same factor.
No idea if this is true; it just popped into my head.
There's more saleable data to be collected tracking people's interactions in a website, under the guise of predicting who's a bot.
Or, for absolutely everyone: pretend you're blind and click on the audio captcha option. It takes seconds to solve and is trivial 90% of the time :)
It just scores you and the dev can punish if needed
Can't think of a "normal" website using it.
Sadly, I don't think the author knows the difference between v2 and v3. The article is definitely not talking about v3.
> I don't think the author knows the difference between v2 and v3
The author doesn't seem to have ever visited google.com/recaptcha to know what Google refers to as different versions of its service. The author is instead talking about "versions of bot detection", i.e. the "detect human mouse movement" stage was v2, "scan your Google history to see how human-like your activities are" is v3, and the author envisions "v4" as doing the silly things from the latter half of the article.
This seems to be the best strategy for getting to the top of the front page these days. Even better if the target of the emotionally charged language is Facebook, Google, or Amazon.
- Running a high-profile, or even low-profile, service which attracts automated or targeted attacks makes you appreciate reCAPTCHA
- Web services and their users are often low value, and reCAPTCHA offers free medicine
- Cleaning up after attacks as a devops/webmaster is a pain in the ass - getting all those alerts at 11:00pm on a Saturday in a bar - and you do not want to cover that from your $100 budget
- reCAPTCHA makes many problems go away for a service provider
- People complaining about reCAPTCHA are often low-value users (they do not buy anything) - though I have only subjective evidence to confirm this
The only long-term solution is moving away from CAPTCHAs to humans strongly authenticated by a trusted party:
- Strong human authentication on every service, controlled by Apple/Google/Facebook, who have vast data to keep bots in check
- Start paying for the services - though you still need to do a CAPTCHA at least once during card authorization to prevent carders
An alternative to reCAPTCHA - though I cannot vouch for its quality yet: hcaptcha.com/
Bonus: Micropayments instead of ads or make botting too expensive - welcome to cryptocurrency land
That's a government function that's hard to apologize for.
Bots usually fail in the 0.0-0.3 range, so you can run it with a threshold around 0.7-0.8 and most people won't even notice it. Shame about the gross privacy invasion, but it's probably not much worse than running Analytics?
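For what it's worth, the server-side check is just a POST to Google's siteverify endpoint and a comparison against whatever threshold you pick. A minimal Python sketch (the endpoint and JSON fields are per Google's public docs; the threshold is the one suggested above, and the key/token names are placeholders):

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"
SCORE_THRESHOLD = 0.7  # lower bound of the 0.7-0.8 range mentioned above

def passes_threshold(result: dict, threshold: float = SCORE_THRESHOLD) -> bool:
    """Decide from a siteverify JSON response whether to treat the user as human."""
    return bool(result.get("success")) and result.get("score", 0.0) >= threshold

def is_probably_human(secret_key: str, client_token: str) -> bool:
    """POST the client's token to the siteverify endpoint and apply the threshold."""
    data = urllib.parse.urlencode({
        "secret": secret_key,      # your server-side reCAPTCHA secret
        "response": client_token,  # token produced by the client-side widget
    }).encode()
    with urllib.request.urlopen(VERIFY_URL, data=data) as resp:
        return passes_threshold(json.load(resp))
```

What you do with a failing score (block, delay, require e-mail confirmation) is entirely up to the site, which is exactly the point the comments above are making.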
It will kill your site speed score on Lighthouse for mobile.
Also, when it asks me to select all the cars: is a bus a car? Is a truck a car? I really don't know what it's expecting, and I must pick wrong, as I often fail.
People generally don't take pictures of rolling green hillsides. But they very often take pictures of rolling green hillsides with sheep on them. So if you ask the robot to draw a picture of rolling green hillsides, it will include sheep. Or, if you ask it to draw a picture of the savanna, it will want to include giraffes.
Now you're being asked to find a bus in a photo without a bus because it's a street scene, and every street scene has a bus in it.
I haven't read her book, yet, but her Twitter is often full of amusing anecdotes like this.
Don't humans do this too, though? If I asked someone to draw a picture of rolling green hills, they may well add sheep as an additional detail.
I haven't done the experiment, but I'd posit that if you walked up to a group of 8-year-old children, gave them crayons, and asked them to draw pictures of "rolling hills", a significant portion would add sheep, cows, flowers, or some other details—even though a majority of rolling hills in the world don't have any of these features.
Your goal shouldn't be to answer the question earnestly, but to confirm the machine's biases. Going with the flow is expedient, as well as giving Google less support.
It gets more frustrating when it's less clear. Is "click the hills" with a picture of a mountain a mistake, or a trick? Should I click all the tiles that contain a bus, even if it's only one pixel, or should I only click the ones that mostly contain a bus? etc.
Hopefully these rules of thumb will help someone reading this find these captchas less frustrating.
I did find out later that some legitimate users were getting rejected due to auto fill or something.
Does anyone know of a reCAPTCHA-v2-like service/software that would help solve such tasks (OCR, object recognition, ...) on a custom-provided dataset?
It could provide an alternative to recaptcha, and a "Mechanical Turk"/crowdsourcing for universities, institutions or companies to help solve some of their repetitive tasks.
Hahahahahah hahhaha ahhah hah. Lol.
I'm sorry, I just can't help myself. This cultural tendency toward naming everything you prefer as a right is just... It's hilarious to the extent it's not just sad. I'm willing to give the author the benefit of the doubt that it was just hyperbole. In which case, bravo.
1. No (or at least piss poor) localisation. It asks me to locate English words, sure, but the images are of things familiar only from American films - sorry, movies. ReCaptcha is how I know (and my only use for knowing) what a 'crosswalk' is.
2. Sometimes it's just wrong. But I have to select the images that it incorrectly thinks is a bridge or whatever anyway, otherwise I'm not allowed to login.
Everyone's one of N top complaints:
- How much of the damn structure counts as a traffic light?!
The idea is to develop a framework for Captcha generators. A few sample generators are provided out of the box, but new ones can be written easily. The framework takes care of storing entries in the database, serving them as challenges through an HTTP API, and checking the responses.
From the README, why libreCaptcha:
* Eliminate dependency on a third-party
* Respecting user privacy
* More variety of CAPTCHAs, tailored to your audience
But we are not trying to create an unsolvable Captcha. For those websites that need something good enough to deter generic bots while not compromising privacy of their users, this might be a good enough alternative to reCaptcha.
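The generator/store/check split described above is easy to picture. Here's a hypothetical in-memory Python analogue (this is not libreCaptcha's actual API, and a real generator would render a distorted image rather than plain text):

```python
import random
import secrets

class MathGenerator:
    """Toy generator: returns a (challenge, expected answer) pair."""
    def generate(self):
        a, b = random.randint(1, 9), random.randint(1, 9)
        return f"What is {a} + {b}?", str(a + b)

class CaptchaStore:
    """Hands out one-time challenges and checks responses, standing in for
    the framework's database and HTTP API layers."""
    def __init__(self, generator):
        self.generator = generator
        self.pending = {}  # challenge id -> expected answer

    def new_challenge(self):
        challenge, answer = self.generator.generate()
        cid = secrets.token_hex(8)  # opaque id the client echoes back
        self.pending[cid] = answer
        return cid, challenge

    def check(self, cid, response):
        # One-shot: the id is consumed whether or not the answer is right,
        # so a bot can't retry the same challenge.
        expected = self.pending.pop(cid, None)
        return expected is not None and response == expected
```

Plugging in a new generator class is all it takes to get a new CAPTCHA style, which seems to be the framework's main selling point.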
So I wish them luck while hoping that they’ll go die in a fire.
Manifest v3, tracking everything, this... Only validates my decision to scrap all Google services 2 years ago.
> Hundreds of millions of CAPTCHAs are solved by people every day. reCAPTCHA makes positive use of this human effort by channeling the time spent solving CAPTCHAs into annotating images and building machine learning datasets. This in turn helps improve maps and solve hard AI problems.
https://www.google.com/recaptcha/intro/v3.html (under "Creation of Value - Help everyone, everywhere - One CAPTCHA at a time.")
Yeah, if you're signed into Chrome and Google has enough information to know who you are (yes, I know the new one is score-based…this just means that anyone who isn't the above is going to get a poor score and be blocked.)
No, I'm not joking.
If this were about not being able to see cat pictures, alright, but this is about accessing money that I'm supposed to own. This is so backwards to me.
1) Google bought reCAPTCHA in the first place.
2) Google's latest captcha isn't even a captcha, you just click a button saying you are human and it analyzes your mouse movement and probably a fingerprint of sorts and you are in.
The article just isn't accurate and seems unnecessarily hateful about things that are not exactly true.
I guess I didn't mean a button, but a checkbox.
Apple is working on their SSO project (Sign in with Apple); I hope they will also consider the use-case of being able to tell a site that you're a human without sending any information.
Basically the same they'd do with e-mail.
Several times I was prevented from complaining about something to a company that didn't have a public e-mail address, but only a contact form with a captcha, just because that day Google decided they'd fail all my reCAPTCHA attempts, tell me "sorry we think your computer is sending automated queries", and wouldn't even allow me to try any further.
It really doesn't help if I'm in a negative mindset about the company already. I've completely stopped using shipping companies that put tracking info behind reCAPTCHA, for the same reason, although that's a different thing from contact forms and a bit harder to manage. But contact forms should not be blocked with reCAPTCHA.
Outright blocking communication is in poor taste. Accept all communication and use automated mechanisms to filter through it on your side.
BTW, this is even worse for users that block re-captcha via adblocker or something, because you often lose the entire text you were about to send. So the next attempt you're even angrier.
And because I value my time you risk simply not having me contact you if you use any kind of captcha.
I switch services whenever it is possible and I see a reCAPTCHA prompt.
(I am a human.)
If you're a non-Chrome user, don't even try playing with the images - Google forces you to click 3-5 times more than Chrome users. It's just stupid.
Hmm, if only I could read Bengali, I could tell if this were a storefront or not.
(submitted to HN here: https://news.ycombinator.com/item?id=20158386)
This makes for the shittiest experience ever. When I try to use such a website, I give the middle finger to the person who decided they need a reCAPTCHA there and to the person who put it there, and I tell them the F word in my mind. The disregard for people's privacy cannot get much worse than with reCAPTCHA.
Using it with Tor is almost certainly not a good idea because it distinguishes your behavior from that of other Tor users, thus compromising your anonymity (and the Tor folks are not in favour of PrivacyPass, because they think the solution is that CloudFlare shouldn't be putting the reCAPTCHA in the way in the first place). And that's assuming that the cryptography is actually solid and there is no way to distinguish between different PrivacyPass users. Tor has decades' worth of research put into it -- what level of scrutiny does PrivacyPass have? How many people actually use it, and how many have tried to break it?
> When 80% of traffic from an IP is malicious and the other 20% is regular traffic, but both sources look like the same traffic (impersonating browser headers, sometimes running headless chromium), what else can you do? Cookies and stateful cookie-like objects, such as privacy pass.
Google claim not to use DNS in this way:
> Is any of the information collected stored with my Google account?
> Does Google correlate or combine information from temporary or permanent logs with any personal information that I have provided Google for other services?
The proposed solution would replace captcha entirely, but to my knowledge nobody has tried it.
i've had success with bayesian filters and shadowbanning myself, but it does require some effort.
If 10% of web developers used Tor on weekends, no website would use reCAPTCHA because they'd realise how painful it is to certain users. I run a Tor relay (non-exit) at home, and now I get more reCAPTCHA even though there's no possible reason to assume my home IP is "bad". I'm still going to run my Tor relay -- I just think it's interesting to note that users are being punished by a giant MITM-as-a-Service company for trying to help other people use the internet anonymously.
I imagine it's partly because I don't block cookies (I whitelist sites that get to store them across sessions and everything else is then session only).
Picture captcha every time.
I'm not sure why they bother, since I'd find it more suspicious if say someone is coming from a Russian IP address but has UTC set and not a Russian timezone...
I would happily pay a monthly fee to get around these ridiculous captchas, even though it's absurd to have to do so.
Of course, the Tor folks told the CloudFlare folks about this many years ago and CloudFlare still acts as a giant censorship machine and continues to block anonymous users from reading content on the internet. Not to worry though -- you can install their extension[+] to "protect your privacy" to bypass the reCAPTCHA that CloudFlare themselves erected in front of other people's websites! It's definitely not in any way comparable to an arsonist selling fire insurance as a side gig -- at least with fire insurance you actually got something out of the exchange!
[+] Which does have a paper that explains the security of the cryptosystem, but a single paper does not make a protocol secure by default. I'm not a cryptographer, but the Tor folks did raise some concerns in the issue where PrivacyPass was discussed, and there's no doubt that combining Tor with a system that is nowhere nearly as battle-tested should be a major point of concern.
Do I wish there was an alternative to aggressive ads and tracking? Hell yes. Do I want to pay every website individually for what I view? Nope, and companies don't either because it would massively hurt their growth for people coming in new.
I would argue a "better" framing designed to emotionally manipulate would be "why are you trying to block people in oppressive regimes from being able to read about the outside world and organise themselves, putting them in danger of being murdered by their government"? But it would be dishonest to make the discussion about "why do you want to kill people", just as it is dishonest to make the discussion about website business models.
Advertising is related here because reCAPTCHA uses tracking, primarily built for advertising, as a factor in determining its score for users, and also because blocking ads/tracking is part of the cause behind people's issues with reCAPTCHA.
Contact forms should just send an e-mail and let the e-mail's content based statistical filter deal with spam.
Blocking people from communicating with your company based on Google's whims is really not smart. It's giving too much power to Google.
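That approach is also trivial to build: the form handler just wraps the submitted fields in an ordinary e-mail and hands it to the mail system, whose existing statistical spam filter then deals with junk submissions. A rough Python sketch (addresses and field names here are hypothetical):

```python
import smtplib
from email.message import EmailMessage

def build_contact_email(form: dict, inbox: str) -> EmailMessage:
    """Turn a submitted contact form into a plain e-mail addressed to the
    company inbox; the inbox's spam filter does the rest."""
    msg = EmailMessage()
    msg["To"] = inbox
    msg["From"] = "webform@example.com"  # hypothetical sender address
    msg["Subject"] = "[contact form] " + form.get("subject", "(no subject)")
    if form.get("email"):
        msg["Reply-To"] = form["email"]  # so staff can reply directly
    msg.set_content(form.get("message", ""))
    return msg

def forward_to_inbox(form: dict, inbox: str, smtp_host: str = "localhost") -> None:
    """Hand the message to the local MTA; no captcha involved."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(build_contact_email(form, inbox))
```

The sender sees no friction at all, and the filtering cost lands on the company's side, where it arguably belongs.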
There's an audio captcha option.
Okay, he isn't really a friend, I only met him once. We have a friend in common who speaks sign language and was able to translate. Seeing him read sign language with his hand was interesting.
"You probably don't need ReCAPTCHA" is an article discussing techniques that has had decent discussion on HN before.
There's an argument to be made that this would incentivize even more investment in bots that can pass them. My reply: Google should just create a bounty for anyone who successfully beats it worth more than the black hat value.
"Click on the photos of humans with weapons."
Sometimes I feel like I'm only half-joking, though.
The internet became practically unusable thanks to the constant, unsolvable CAPTCHAs. You can click the correct image tiles until your finger falls off but you still won't get through.
Instead, use the protocol-independent CAPTCHA. It is a SASL mechanism which sends a challenge of plain ASCII text (possibly including line breaks), then accepts a single response of plain ASCII text, and then the server decides whether or not the response is acceptable. A similar thing can also be done with a simple HTML form, but using SASL would allow it to work with any protocol, and with a command-line interface just as well as an HTML interface.
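To make the shape of that exchange concrete, here is a toy Python sketch of the server side (not a registered SASL mechanism, just the same challenge/response pattern the comment describes):

```python
import random

class TextCaptcha:
    """Toy server side of a plain-text challenge/response exchange."""
    def __init__(self):
        a, b = random.randint(1, 20), random.randint(1, 20)
        # The challenge is plain ASCII and may include line breaks.
        self.challenge = ("Please prove you are human.\n"
                          f"What is {a} + {b}? Reply with digits only.")
        self._expected = str(a + b)

    def verify(self, response: str) -> bool:
        # A single plain-ASCII response; the server alone decides.
        return response.strip() == self._expected
```

Because both sides of the exchange are plain text, the same mechanism could run over IRC, SMTP, a TTY, or an HTML form without any changes.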
Now that was evil!
Accounts don't really cost me anything, and they get automatically deactivated if they didn't get any activity in the first 30 days anyway. Activity on my service costs money, so if someone wants to make a bot that pays me money, I have no problem with that.
The only time I'd consider implementing something like reCAPTCHA is if I was giving something away for free (e.g. a free trial) such that a signup actually had a cost for me.
I can't imagine it actually taking 30 seconds to solve a reCAPTCHA. That needs a citation.
You will start getting challenged more often, you'll find you're asked to solve several multi-select challenges in a row even if you get them right, the multi-select challenges will replace tiles after you select them, and the tiles will start fading in and out very slowly.
See https://www.youtube.com/watch?v=en5KSZSpDFY for an example video.
These are (AFAIK) intentional features deployed by Google - you just don't see them if they can already track you via something like your Google account.
Took me all of 5 seconds to solve
Google still has plenty of other tracking points on you. And of course reCAPTCHA looks reasonable to a prospective developer - it wouldn't be adopted otherwise!
If someone tells you it routinely takes 30 seconds to solve, you should really just accept their experience rather than discarding it with "works fine for me!". If you still need to see it with your own eyes, go set up a Tor Browser and become one of the undesirables, rather than just imagining it. You just might end up adopting the marginalized view.
Before being so brazenly dismissive of other people's experiences, take the few measly minutes it would take to actually try it out. Even doing a simple Google search with Tor Browser usually gives you a reCAPTCHA to solve.
?? I did try it exactly as was requested of me. I didn't know where to find a recaptcha so I went to the demo page.
It's possible the video is the result of a bug and not normal behaviour. I wouldn't know as I don't often browse with tor.
I once had to request a new password for my online bank account. I ended up asking the bank manager to reset my password, pretending that reCAPTCHA is preventing me from resetting the password myself. My bank is not paying me for solving captchas for Google's benefit (this is so screwed up…).
The biggest offender is CloudFlare for me.
Well, now you do (though I was a bit of an asshat in my parent comment, sorry about that):
> Even doing a simple Google search with Tor Browser usually gives you a reCAPTCHA to solve.
You can get the Tor Browser from .
It just so happens that Google's method for telling if I'm a bot or a human isn't just to show me mangled text, but also to use ad-network-style tracking.
In Google's defence this is probably an attempt to deal with captcha-solving-by-humans-as-a-service or something like that. I can't imagine Google employs many privacy enthusiasts who would spot a problem like this by dogfooding.
These users appear to be Firefox users who sandbox Google cookies. That said, it seems to have improved a lot recently (only one challenge)
It happens to me quite frequently... but I block a lot of tracking. Sometimes I give up if I was browsing just for fun... but my bank sometimes uses reCAPTCHA.