
Introducing the Invisible reCAPTCHA - taytus
https://www.google.com/recaptcha/intro/invisible.html
======
macrael
Google using captchas to get humans to read street addresses captured by
street view cars to improve maps results remains one of the most Googly things
they've ever done. Genius, lateral, and a little weird.

~~~
mrighele
Having to do free work for Google is more than weird for me. For this reason
unless the page is really important I end up closing the tab. If it were for
an open service (such as OpenStreetMap) it would be a different matter though.

~~~
danso
How is that "free work"? Automated usage can threaten a site's bottom-line,
and even its existence. A site resorts to Google's CAPTCHA service because
they don't have the money to build their own detector. It's not a service that
was free for Google to create.

edit: Also worth noting that some of this "free work" that users do for Google
is used to improve bot-detection overall. The comment to this blog post (the
post itself, and its paper are great reads) is a nice example:

[https://security.googleblog.com/2013/10/recaptcha-just-got-e...](https://security.googleblog.com/2013/10/recaptcha-just-got-easier-but-only-if.html?showComment=1407032409681#c2120495904244057201)

Essentially, the user is complaining (and justifiably so) of being served the
pre-Street-View versions of CAPTCHA; I had forgotten how bad they could get:
[http://i.imgur.com/01F2eES.png](http://i.imgur.com/01F2eES.png)

~~~
dsp1234
_How is that "free work"?_

The original recaptcha was used to help clear up text from book digitization
projects where OCR couldn't understand the data. The appeal of this was that
these works were in the public domain, and thus proper digitization of these
works is a societal-wide benefit.

Since Google purchased it, they've been using it to:

1.) Clean street view data for a proprietary product (Google maps)

2.) Build training sets for unknown ML purposes

These are activities that Google could very much pay a group of people to do.
Instead, through recaptcha, they are getting that work from the end user for
no payment. A case could be made that it's not free for the owner of the site
that deploys recaptcha (because they get value out of the service, and Google
gets data/ML services). However, the actual end user who has to fill out the
recaptcha does not benefit in any significant way. Since a recaptcha is an
inconvenience to the end user, that user pays both with the time to fill it
out, and the data gathered by Google.

TL;DR Some people do not like that Google benefits from a transaction where
Google is not a party, and where otherwise, Google could generate the benefit
using their own resources.

~~~
danso
Recaptcha was not a free service. Or if it was, why did Google end up paying
millions for it? And it's not free now, unless the engineers and scientists
working on it are working pro bono.

According to Wikipedia and the New York Times, reCaptcha was not developed for
public domain works. Its pilot project was to digitize the NYT archives,
archives which were not released to the public domain nor are fully available
without being a subscriber:
[http://www.nytimes.com/2011/03/29/science/29recaptcha.html](http://www.nytimes.com/2011/03/29/science/29recaptcha.html)

I'm not a machine learning expert but I'm going to laugh at your suggestion
that Google could "pay a group of people to do". In the above referenced NYT
article from 2011, recaptcha's creator says several million words were being
processed by recaptcha per day.

And again, have to disagree that the end user "does not benefit in any
significant way". We would not be discussing this if Google hadn't learned
from massive user data to iterate their captcha from distorted word mush to
what it is today. Captcha was a serious drain of user energy and patience,
that's why recaptcha was invented in the first place. And the worst captchas
were tolerated because automated usage was a financial threat to websites that
end users use.

~~~
eriknstr
I had totally forgotten how bad the word mush was. In light of this, I
agree that we _do_ get something back from doing unpaid work for Google, and
that is a HUGELY improved user experience with the current reCAPTCHA like
"click all the pictures of sandwiches" or "click the portions of the picture
that has a street sign in it" compared to the earlier reCAPTCHA and other
CAPTCHA.

~~~
wst_
Which still makes you work for them. Just, instead of letters, you get a bunch
of images to recognize.

------
wisebit
"Powering these advances is machine learning and a combination of threat yadda
yadda"

Looks like they do little more than just check for a Google cookie [1].

[1].
[https://www.blackhat.com/docs/asia-16/materials/asia-16-Siva...](https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf)

edit: still, it's far better than the previous state of captchas. I'm glad
they did this. But it's like for anything to be considered "advanced" or
"good" in tech lately, it has to have been powered by "machine learning".

~~~
xg15
Well, technically, it is machine learning. Only that the machine learning was
likely part of the usual data mining on google accounts and not much specific
to the captcha problem...

(That said, whenever I used that checkbox widget they had before this
announcement, there was a noticeable framerate drop in the browser while the
thing was doing its magic. So I suspect, they are at least doing some browser
fingerprinting/benchmarking to see if the widget runs inside selenium or a
stock browser.

I also remember rumors that they analyze keyboard/mouse input on the page and
check if it looks "human", but I'm not sure if that's true.)

~~~
mpeg
Yeah it's basically browser fingerprinting (incl. GPU fingerprinting, hence
the slowdown) plus google cookie.

If your browser is standard (AKA no anti-fingerprinting plugins) and your
advertising cookies are not blocked (privacy or adblocker plugins) you'll
probably pass with no issues.

If either of those is not true, you have to solve a bunch of image captchas.

Mouse/keyboard input analysis was just marketing talk; at least when they
first released the nocaptcha it wasn't even captured.

------
vidyesh
This is so confusing. It doesn't explain how it works nor a demo page nor the
reason behind why it went invisible.

~~~
markdown
> This is so confusing.

Really?

> It doesn't explain how it works nor a demo page

Imagine a web form without reCaptcha. Do you really need a demo of that?

> nor the reason behind why it went invisible.

Because Recaptcha had an annoying, bad, terrible UX.

~~~
mulmen
You have done nothing to answer the GP.

How does Google determine if the captcha should be shown?

What are the "adaptive captchas" that are shown to suspect users? A demo would
do a great job here.

How does an invisible captcha "create value by applying human bandwidth" if
the premise is that humans never see the captcha?

------
malikNF
While I like the idea of not having to deal with these annoying ReCAPTCHA
prompts, something somehow feels "intrusive ?". I mean does this mean google
is going to keep track of what I would be doing when I visit a site?

Say for instance I am signing up for a website, does the password I enter get
sent to google servers to be analyzed now?

~~~
Sir_Substance
>I mean does this mean google is going to keep track of what I would be doing
when I visit a site?

Oh buddy, I have bad news for you...

[https://support.google.com/dfp_premium/answer/1716364?hl=en](https://support.google.com/dfp_premium/answer/1716364?hl=en)

[https://www.google.com/analytics/#?modal_active=none](https://www.google.com/analytics/#?modal_active=none)

[https://www.doubleclickbygoogle.com/solutions/measurement/](https://www.doubleclickbygoogle.com/solutions/measurement/)

~~~
popol12
Wow, I never heard of Pixel Tracking. That sounds super evil. Are there
browser plugins to fight back on this one?

~~~
r3bl
They're an even bigger thing in emails. Usually, companies include a pixel
from an external resource and then figure out whether you've read the email
by checking whether that image was loaded.

IIRC, GitHub does that. If you read the email of some notification, it won't
show you that notification in your notifications. The solution is to block
external image loading in your email client (I know that Thunderbird has that,
and I know that Zoho's email client on Android has that).
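
To make the mechanism concrete, here is a minimal sketch of how an email tracking pixel could work. The endpoint `tracker.example`, the `t` query parameter, and both function names are made up for illustration; real senders differ in the details.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical sender-controlled endpoint that returns a 1x1 image.
TRACKER_BASE = "https://tracker.example/open.gif"


def pixel_tag(token: str) -> str:
    """Build the invisible <img> tag a sender embeds in an HTML email."""
    query = urlencode({"t": token})
    return f'<img src="{TRACKER_BASE}?{query}" width="1" height="1" alt="">'


def record_open(requested_url: str, opened: set) -> None:
    """Sender side: note which token's image was fetched."""
    token = parse_qs(urlparse(requested_url).query)["t"][0]
    opened.add(token)


opened_tokens = set()
tag = pixel_tag("user-42.msg-7")
# A mail client that loads remote images requests the src URL,
# which is all the sender needs to mark this message as read.
record_open(tag.split('"')[1], opened_tokens)
```

If the mail client never fetches remote images, `record_open` is never triggered, which is why blocking external image loading defeats this.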

~~~
username223
Yep, tracking pixels are among the oldest forms of internet surveillance,
predating the far more aggressive and intrusive JavaScript companies use
today. They're one of the reasons why most mail clients don't load images by
default. They only give information on page loads, not obsessive behavior
tracking, but they're harder to block.

They're why I block doubleclick.net, google-analytics.com, etc. in my hosts
file rather than just blocking their JavaScript.

------
daviding
It's done by binding it to one of your own buttons, where I guess it does the
'jitter test' on that element (if it's a low risk IP).

[https://developers.google.com/recaptcha/docs/invisible](https://developers.google.com/recaptcha/docs/invisible)
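
For the server half, a rough sketch: the widget hands your form a token, and your backend asks Google's `siteverify` endpoint whether it is valid. The helper names below are my own, the secret and token values are placeholders, and error handling is left out; treat it as an outline rather than a reference implementation.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret: str, token: str) -> Request:
    """Prepare the POST that asks Google whether the token is valid."""
    body = urlencode({"secret": secret, "response": token}).encode()
    return Request(VERIFY_URL, data=body, method="POST")


def token_is_valid(raw_json: bytes) -> bool:
    """Interpret the JSON reply; only an explicit success counts."""
    return json.loads(raw_json).get("success") is True


def verify_token(secret: str, token: str) -> bool:
    """Network call; needs a real secret key and a token from the widget."""
    with urlopen(build_verify_request(secret, token), timeout=5) as resp:
        return token_is_valid(resp.read())
```

The token arrives with the form submission (in the `g-recaptcha-response` field), so the check happens on your server, not in the browser.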

~~~
retrogradeorbit
What is "the jitter test"?

~~~
gregschlom
I would guess testing that the mouse pointer moves semi-randomly over the
button, in a fashion typical of the way humans browse a web page
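
Purely as an illustration of that guess (nobody outside Google knows what they actually measure), a toy heuristic could compare the straight-line distance with the distance the pointer actually travelled. The function names and the threshold are arbitrary assumptions.

```python
import math


def path_straightness(points):
    """Ratio of direct distance to travelled distance; 1.0 is a perfect line."""
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    direct = math.dist(points[0], points[-1])
    return direct / travelled if travelled else 1.0


def looks_scripted(points, threshold=0.999):
    """Flag paths that are suspiciously straight (threshold is made up)."""
    return path_straightness(points) >= threshold


robot_path = [(0, 0), (5, 5), (10, 10)]   # dead straight
human_path = [(0, 0), (3, 4), (6, 0)]     # wanders off the direct line
```

A bot can of course add noise to its path, which is why a check like this could only ever be one signal among many.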

~~~
reaktivo
I can't imagine it being limited to jitter; I'm curious how they handle
touchscreens.

~~~
Klathmon
By seeing how you scroll the page, how you tap with your finger, and how fast
you tap elements.

Honestly I feel mobile is easier to validate

------
garganzol
ASP.NET AJAX Control Toolkit provided similar functionality since 2007. The
corresponding component is called NoBot.
[http://www.ajaxcontroltoolkit.com/NoBot/NoBot.aspx](http://www.ajaxcontroltoolkit.com/NoBot/NoBot.aspx)

"NoBot is a control that attempts to provide CAPTCHA-like bot/spam prevention
without requiring any user interaction. This approach is easier to bypass than
an implementation that requires actual human intervention, but NoBot has the
benefit of being completely invisible."

Works like a charm even now, 10 years later.

------
greenhouse_gas
Just curious. What happened to all the data made by the old reCaptcha (the one
that OCRed books).

I know that they stopped the project, but did Google at least release the data
(old public domain text of books)?

~~~
dingdongding
I think that data is what made a lot of old books searchable on Google Books.

~~~
fiddlerwoaroof
Wasn't recaptcha originally associated with Project Gutenberg's digitization
efforts?

~~~
fiddlerwoaroof
I guess not.
[https://en.wikipedia.org/wiki/ReCAPTCHA#Origin](https://en.wikipedia.org/wiki/ReCAPTCHA#Origin)

~~~
greenhouse_gas
It says there that it was.

Seems like a bait-and-switch. Do free labor for a good cause (PD books), turns
out you're just growing Google's library which can be taken down at a whim.

~~~
icebraining
_It says there that it was._

No, that was the Distributed Proofreaders project, which is unrelated (just
used as an early example of crowdsourced OCR).

reCaptcha originally helped digitize the archive of the New York Times, but
that was finished years ago.

------
colept
What I want to know is what happens if you get 'trapped' in this invisible
ReCAPTCHA?

The most frequent encounters with CAPTCHAs I see are rejected API requests
over VPN.

~~~
chrisacky
I was browsing Upwork the other week and I got trapped.

I couldn't load any more jobs, or view any more workers. I had to inspect the
Network tab in Chrome, open the API request and then click a "I am not a bot"
on their API page.

That was a poor implementation tbh.

------
Geee
Doesn't anyone else see it as a problem that Google's bots (and possibly
CIA/NSA) can access any system using Google ReCAPTCHA? They can create
an unlimited number of accounts on social networks, blogs, forums etc. and thus
have unlimited social voting power on the Internet.

There is a need for a decentralized Captcha that can't be circumvented by
anyone.

~~~
StavrosK
Would not having ReCAPTCHA prevent this?

~~~
iainmerrick
Using some other bot-filtering system would prevent it. Of course the makers
of _that_ filter would be able to bypass it, but I guess you could use
multiple filters...!

~~~
StavrosK
Can't you do that now, then?

------
visarga
That whole page wasn't able to explain WHY it is invisible and how it works
if it's invisible.

~~~
tyingq
Guessing it's your browsing history (courtesy of including js from a Google
domain) plus mouse tracking.

It doesn't seem much different than the current "click here" one to me. They
are just letting the page owner substitute their own button in lieu of the
check box.

~~~
StavrosK
Is that why I _always_ have to solve a challenge? Because I use Privacy Badger
and uBlock Origin?

~~~
mpeg
Yes, recaptcha works based on advertising cookies.

------
Grue3
I remember when they advertised it with "you just need to click a checkbox".
In reality, I always have to solve a streetview captcha, possibly several
times in a row.

~~~
mpeg
Let me guess, you use privacy enhancing browser extensions or adblockers? :)

------
zumu
I always wondered how blind people entered captchas. Does this finally make
captchas accessible?

~~~
ry_ry
There is usually an audio version available, presumably screen readers are
pre-configured to present the audio option on popular captchas by default.

I'm curious now, will have to give it a whirl when I get into the office :D

Suspect Google will rely on their vast knowledge of people's browsing habits
based off IP/account/ad-tracking/browser-fingerprinting to skip the user input
aspects. Although that said, a screen reader won't have the standard physical
interaction clues client-side that a user is a real person, mouse tracking for
ex. is probably a moot point. Not really sure how Google will handle those, or
if blind users will get a degraded always-on captcha experience.

Either that or a headless screen reader becomes the scraping/botting tool of
choice.

------
andy_ppp
I think this sounds like further and deeper (maybe multi page) tracking to
build up a profile that Google deems acceptable. I think I'm off to train a
bot to behave more like a human when it interacts with a webpage...

------
cedex12
Concerning the general problem of captchas: What about paying a small fee
instead of filling ever increasingly complicated forms. Say 10 cents. You
could set up a service where the website's owner could decide whether the
money goes to them or to, say, a foundation of their choice.

------
albeebe1
I'd love to hear some commentary from someone involved in the business of
bypassing captchas. I have to assume "google invisible captcha hack" is
getting bid up right now.

~~~
turblety
Me too. If you want to spam or hack surely you'd just outsource the captcha
solving to a cheaper country. Or even sites like 2captcha that give you 1000
solves for around 50c. You can still make a robot but just send the captcha to
a service like that, and that's not even one of the cheapest ones. Especially
when you consider what the real value of the data (or even spamming) is.

------
d--b
They should add a contest to break it. Cause that sounds like the only real
thing they can do is track the movement of the mouse / keyboard, and that
sounds quite simple to fake.

~~~
espadrine
Google Analytics and Ads are present in a large portion of the Web. So it is
much more difficult than that. It requires you to program the bot to follow a
sequence of links unrelated to the bot's goal, idle on some pages, click links
that make sense to the session, enter some text sometimes in comment fields —
in essence, react to content the bot has never seen before.

Google is betting that they can extract meaning from web pages better than
bots, and they have had a lot of experience with that. On Web pages, each link
does not have the same probability of being clicked by a human given the list
of pages seen before. Knowing that probability requires the bot to understand
what a human would see, and to perform actions that match a given goal which
corresponds to the sequence of pages browsed.

And that goal mustn't always be the bot's goal. Bots have business incentives:
they want to get people to do something by writing text that will be seen by
humans. Humans, on the other hand, only do so once in a while.

------
mattbgates
I started making my own as Google Recaptcha was getting annoying. At its most
basic, hide a field. On your PHP page that does the checking, simply see if
that field has anything in it. If it does, it's a bot. If it doesn't, it's not
a bot.
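
A minimal sketch of that check, written in Python rather than PHP. The field name `website` and the helper name are placeholders, and the form is modelled as a plain dict of submitted values.

```python
from typing import Mapping

# Hypothetical name of the hidden field; real visitors never see or fill it.
HONEYPOT_FIELD = "website"


def looks_like_bot(form: Mapping[str, str]) -> bool:
    """Return True when the hidden honeypot field came back non-empty."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


human_submission = {"name": "Ada", "comment": "Nice post", "website": ""}
bot_submission = {"name": "x", "comment": "spam", "website": "http://spam.example"}
```

Bots that blindly fill every input trip the check; a bot written for one specific site can simply leave the field empty.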

~~~
delta1
Do you use inline styles to hide the field or "obfuscate" it through a CSS
class?

~~~
mattbgates
It just doesn't show up at all. Placed right before the form ends. I also use
a combination of PHP and Javascript to create my own captcha. PHP generates a
random string which I send to Javascript to hold. Upon clicking a button, if
what is in the captcha does not match what is in the Javascript, thou shalt
not pass. Haven't had any bot break through it yet, but I did have a Russian
hacker email me, pissed off (entire email was in Russian, but I'm pretty sure
the translator did well to know he wasn't happy), because he had created a bot
specifically for my website that managed to create over 2000 posts in less
than an hour.

You can view the captcha at [https://mypost.io/](https://mypost.io/) .. you
cannot create a post without entering the captcha. I ended up creating my own
because despite entering the correct answer (selecting the right images),
Google Recaptcha would not recognize it.

------
lightedman
Yet another CAPTCHA failure.

Here's the problem Google needs to fix. If I'm logged into my account, years
old, obviously not a complaint against it, I still end up getting this captcha
nonsense. Doesn't matter the site.

------
bitmapbrother
So I didn't see how it was done, but in a way that makes sense. I guess Google
doesn't want to divulge what they're doing so as not to give the bot makers any
sort of insight on how it works.

------
return0
nice, but that was the most uninformative video ever.

~~~
pymai
It's quite simple. The boot kicks the ball, the balloon carries it up to the
ferris wheel, the ball drops onto some vertical pulley system and raises the
red thingy, which then tells the browser that you are not a robot.

------
rodionos
Is there a sample page somewhere, where I can 'see' the invisible recaptcha at
work?

~~~
kgdinesh
You'll never know.

------
vasundhar
What does it mean to #Privacy ? Does it mean they track more of our behaviour
?

------
Sir_Cmpwn
I would still like to see captchas being replaced by proof of work systems.
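
For anyone unfamiliar with the idea, a hashcash-style sketch: the server issues a challenge string, the client burns CPU finding a nonce whose hash has enough leading zero bits, and the server verifies with a single hash. The function names, the `challenge:nonce` format, and the difficulty are illustrative assumptions, not any particular deployed scheme.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty hashes on average)."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


nonce = solve("form-token-123", 12)
```

The cost is negligible for one human submitting one form but adds up for someone submitting thousands, though it also taxes slow phones and does nothing against an attacker with cheap compute.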

------
SN76477
I can't wait to get blocked out of a website and not know why.

~~~
dbbk
You get presented with the traditional reCAPTCHA if you're 'blocked'.

~~~
SN76477
Oh cool, OK, then that isn't so bad.

------
vslira
I don't mean to imply this is some kind of evil monopoly conspiracy, but it is
a bit ironic that Google helps websites avoid scraping.

Edit for clarification: I'm not even saying it's wrong, just plain ironic

------
kuschku
I'm wondering if this might not be illegal under several countries' data
privacy laws.

Users need to be directly and visibly informed, if and when data of theirs
will be transmitted to third parties.

As Google's ReCaptcha is based on tracking what websites you visit, what
search terms you enter, correlating this data, and comparing this tracking
profile of yours, it's quite problematic that the user doesn't even see any
captcha anymore. With the previous captchas, site owners could keep pushing
the legal problems to Google, partially.

But this new solution fits neither the intention nor the letter of the EU
Data Privacy Directive.

IANAL, this is not legal advice. You can not use this in court.

~~~
flamtap
I can see this easily addressed by a TOS. Even the subtle ones like "clicking
that button means you agree to [link to TOS]" covers it, no?

~~~
kuschku
Nope, not at all. ToS terms and EULA terms are basically null and void for
such things under EU law.

You actually have to make a separate, opt-in checkbox, directly informing the
user what will happen.

