
Google ReCaptcha targeting non-Google browsers - hashhar
https://www.reddit.com/r/firefox/comments/9vmsyp/google_recaptchas_targeting_firefox_and_other
======
dual_basis
It used to be that I never had to fill out reCAPTCHAs, but now I get prompted
almost 100% of the time, even when I am logged in to my Google account. I'm
not doing anything unusual, and the only other person on our home internet is
my wife. (Besides, I still get prompted to fill them out on other internet
connections anyway.) Some of them take up a minute or two, especially the
image segmentation ones (apparently I disagree with the general population on
where the dividing lines should be).

Is there anything I can do about this? I'm being held hostage for about 20
minutes a day to help train Google's AIs. I used to think CAPTCHA was
excellent - we were fixing up OCR from books and allowing the Library of
Congress to be digitized and available to all. This gradually morphed, and now
(as I understand it) the benefits from this human training are kept locked
behind Google's walls. If they at least released the resulting dataset it
wouldn't bother me as much.

~~~
beatgammit
I started putting some BS answers, and they seem to be accepted. It seems you
need to select at least 2 tiles, but not necessarily the right ones.

It doesn't get rid of them, but I can usually get through them a bit faster
since I don't need to actually look at the image or text.

Relevant XKCD [https://xkcd.com/1897/](https://xkcd.com/1897/)

~~~
dual_basis
I mean, the reason this works is generally because there is some information
which Google does not have, that's why they are asking you to provide it. So,
for instance, in the case of words, they used to present two words which were
scanned in, one which they knew was correct and one which they didn't know. As
long as you enter the one they know correctly, they assume you've also
answered the one they don't know correctly. (Maybe they do some variant of
this, but that was the general idea.)

I assume they're doing something similar with the image segmentation tasks, so
if you randomly select two tiles and click "next" there are some images where
it actually doesn't know if you are correct or not, so it ends up storing your
response and lets you through.
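
A minimal sketch of the two-word scheme described above, under stated assumptions: the function names, the vote counter, and the `min_votes` threshold are all hypothetical and only illustrate the idea, not how reCAPTCHA is actually implemented. Pass/fail depends only on the known ("control") word, and the answer for the unknown word is merely recorded, which is why junk answers can still get through.

```python
from collections import Counter


def normalize(word):
    """Compare answers case-insensitively and ignore surrounding whitespace."""
    return word.strip().lower()


def check_two_word_captcha(control_truth, control_answer, unknown_answer, votes):
    """Decide pass/fail from the control word only.

    If the control word is right, the answer for the unknown word is
    stored as one vote. Nothing checks that the vote is correct.
    """
    passed = normalize(control_answer) == normalize(control_truth)
    if passed:
        votes[normalize(unknown_answer)] += 1
    return passed


def consensus(votes, min_votes=3):
    """Return the most common answer once it has enough votes, else None.

    min_votes is an assumed threshold for illustration.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


votes = Counter()
# Correct control word: the user passes, and the unknown answer is stored.
print(check_two_word_captcha("morning", "Morning", "upon", votes))
# Wrong control word: the user fails, and the unknown answer is discarded.
print(check_two_word_captcha("morning", "mourning", "junk", votes))
```

In a scheme like this, a wrong answer for the unknown item only matters if enough other users disagree with it, so a single bogus response is accepted on the spot.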

------
breakingcups
There is something incredibly worrying about the fact that choosing to not
have a Google account means a large portion of the web is walled off from you.

Choosing to not have a Google account means you are co-opted into doing labor
for free (helping their image classifiers recognize cars, shop signs, etc.)
for Google. Have a browser with tracker blocking built in? Aah, too bad.
Please "solve" these 5-10 pages of work for our algorithm. Don't want to? Too
bad, you can't use this site.

And yet, if clients come to me looking for a good solution to determine who is
a bot and who isn't, I don't know what else to recommend. Other Captcha
solutions seem completely inferior.

~~~
dual_basis
I agree completely. The worse this gets, however, the less reCAPTCHA makes
sense to me. The main reason to use it was the fact that it worked invisibly
for users it deemed as not being a bot. Based on the responses here (as well
as my personal experience) this seems to work less and less now - maybe
partially because Google wants more users to train its algorithms. Therefore
there is not much to distinguish reCAPTCHA from some other CAPTCHA system,
including simple ones you could make yourself. The image recognition stuff
requires more work but, arguably, we are even worse at basic NLP tasks, which
are even easier to make CAPTCHAs for. All you need is some private dataset,
which you can create manually fairly quickly, and a corresponding task. For
example:

Select which of the following are fruits:

1. banana
2. firetruck
3. baseball
4. apple

On top of that, if there were independent implementations of CAPTCHA systems
the exact mechanics of interacting with the CAPTCHA would be sufficiently
different that the very act of submitting responses would, itself, be a bot
test.
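
As a rough sketch of the fruit example above, assuming a small hand-built private dataset: the `DATASET` contents and the names `make_challenge` and `check_response` are hypothetical, and a real deployment would need a much larger dataset plus rate limiting.

```python
import random

# Hypothetical private dataset: each word is labeled with one category.
DATASET = {
    "banana": "fruit", "apple": "fruit", "cherry": "fruit", "mango": "fruit",
    "firetruck": "vehicle", "baseball": "sport", "hammer": "tool",
    "sofa": "furniture",
}


def make_challenge(category, n_options=4, rng=random):
    """Build a shuffled option list with at least one match and one non-match."""
    positives = [w for w, c in DATASET.items() if c == category]
    negatives = [w for w, c in DATASET.items() if c != category]
    n_pos = rng.randint(1, min(len(positives), n_options - 1))
    options = rng.sample(positives, n_pos) + rng.sample(negatives, n_options - n_pos)
    rng.shuffle(options)
    return options


def check_response(category, options, selected):
    """Pass only if the selection is exactly the set of matching options."""
    expected = {w for w in options if DATASET[w] == category}
    return set(selected) == expected


options = make_challenge("fruit")
print("Select which of the following are fruits:", options)
```

Requiring an exact match (rather than "at least two tiles") means random guessing passes far less often, at the cost of rejecting humans who disagree with the labels.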

------
Tsubasachan
I care quite a bit about privacy on the internet. If you do things like use a
VPN, delete cookies and not let your browser leak information Google will make
your user experience hell.

~~~
heroprotagonist
I haven't found a perfect solution to avoiding tracking, but I frequently look
around for new ways to improve my privacy.

One interesting thing I have found was to use Firefox's container tabs with
the Temporary Containers extension and a per-domain isolation configuration
which will keep Google search in a temporary isolated container but still
let me use mail.google.com in a persistent container.

Combine this with 'Google search link fix' so that the search result page no
longer uses local stubs to track links followed. Configure Temporary
Containers so that links opened from Google always open in a new temporary
container.

Then use uBlock Origin to not load Google's analytics when you're on the
remote page.

Hopefully this prevents Google from tracking search history, though it's
possible they could use some IP and browser fingerprinting to establish a
weaker correlation on a shadow profile.

I'd love to also have per-container proxy configuration which would use a
random proxy per temporary container, but I guess that's not possible yet.

I haven't looked much into ways to randomize the browser data sent to server,
but it's also something I'd like to be able to do if it were automatic and
random enough (eg, a per-request randomization of user-agent and reporting of
things like fonts, plugins, screen resolution, etc, which are used as
correlating factors in fingerprinting tools).

------
jocoda
On principle I don't use Chrome/chromium, unless a project requires it. It's
not a bad browser but I wonder if running a google executable on your computer
is like inviting a thief into your home. I still remember when chrome first
launched and they would check for updates multiple times a day, every hour or
something like that. And what's this virus scanner they offer?

Anyway, I run ff most of the time, and there's a site that I use daily that
requires a login. Used to be I would receive a recaptcha challenge maybe once
or twice a week, even though logged in. No problem, probably using recaptcha
to throttle bots I think.

For the last week plus there's been a fundamental change and I'm receiving
multiple recaptcha challenges in the same session, even though I'm signed in,
max so far has been 4 in the same session covering maybe 30 minutes or so.

Not sure if it is the site itself which is to blame for a poor implementation,
or if this is google.

But coincidentally this happens just as google announces recaptcha 3.
Conspiracy nuts might say this is google leveraging their effective captcha
monopoly to drive users to chrome.

~~~
dual_basis
I use Chrome, and I also am getting an annoyingly large number of reCAPTCHA
requests.

That said, I don't think it's that much of a conspiracy. It is easy to defend,
from Google's point of view - if you use Chrome they have much more
information about your web traffic to therefore discern if you are a bot or
not. I could absolutely see them implementing rules which, while they do not
explicitly prioritize Chrome traffic, in practice end up doing exactly that.

------
adtac
A showerthought I recently had: as long as CAPTCHAs exist in their currently
inefficient form, it could be argued that we'll never have strong artificial
intelligence because CAPTCHAs are basically reverse Turing tests. I haven't
developed this thought thoroughly, but I thought it'd be interesting to share.

~~~
jobigoud
The T in CAPTCHA stands for Turing Test.

The moment we can't make them efficiently distinguish bots from humans would
be the moment the test is passed.

------
jammygit
For the last 6 months it's been taking 10-30 minutes to log in to certain sites
due to the captchas. It's like an entire afternoon after dinner to log in. I
use a VPN, which is the cause, but it's gotten insane. How much bot spam are
sites getting now that this became necessary?

~~~
leibwiht
It's even worse for Tor users, logging into Discord makes me solve about ten
or twelve rounds of reCAPTCHA. It's maddening.

~~~
gcb0
and they are not even careful with their data sources.

most of my IPs run tor relays without any exit traffic, and google still flags
the whole company for 30 minutes of their image classifier training every time
a captcha shows up.

I wonder if I can start to write that time off with the IRS as a donation to
google's image classifier program

------
DeathArrow
Are there any decent alternatives to invisible reCAPTCHA?

