
Beating Google ReCaptcha and the funCaptcha using AWS Rekognition - nailer
https://bitbucket.org/Pirates-of-Silicon-Hills/voightkampff
======
Hitton
Interesting. Two things that I found interesting:

1) The IP addresses he uses probably have a really good reputation, because
even with rather bad image recognition it still lets him in.

2) surprisingly enough google's captcha apparently doesn't consider mouse
movements at all - the bot selects images rapidly one immediately after
another and always clicks in the same place of image.

~~~
lights0123
> surprisingly enough google's captcha apparently doesn't consider mouse
> movements at all - the bot selects images rapidly one immediately after
> another and always clicks in the same place of image

Yep, it's based on Google cookies, not behavior on the page. You can complete
the captcha using only the Tab and Enter keys if you have enough Google
cookies.

~~~
jb_s
Isn't it ultimately just a series of API calls to solve the captchas? What's
to stop someone hooking the JavaScript, solving the images and then raising
the right event?

~~~
milankragujevic
I would assume Google would send all mouse events to the server, compare them
with the mouse events from all other users who solve reCaptchas, decide
internally on a "humanness rating" for the data, and use that alongside all
other parameters to estimate the probability that the user is a bot. You can't
fake that. You can send fake events, but you can't trick Google for long:
sooner or later it would figure out a pattern and mark that pattern as a bot.
Well, if it did track mouse movements, which it apparently does not. But it
could start at any time.
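
A minimal sketch of what such a "humanness rating" could look like, purely as
an illustration. The function name, the two features (path curvature and
timing jitter) and the equal weighting are all assumptions made up for this
example; nothing here reflects how reCAPTCHA actually scores anything.

```python
import math
from statistics import pstdev


def humanness_score(events):
    """Toy score in [0, 1] from (x, y, t_ms) mouse samples.

    Hypothetical heuristic: perfectly straight paths with perfectly even
    timing score 0, wobbly paths with irregular timing score higher.
    """
    if len(events) < 3:
        return 0.0  # too few samples to judge; treat as bot-like

    # Feature 1: how much longer the travelled path is than a straight line.
    path = sum(math.dist(a[:2], b[:2]) for a, b in zip(events, events[1:]))
    direct = math.dist(events[0][:2], events[-1][:2])
    curvature = 0.0 if path == 0 else 1.0 - direct / path

    # Feature 2: relative spread of the gaps between samples, capped at 1.
    gaps = [b[2] - a[2] for a, b in zip(events, events[1:])]
    mean_gap = sum(gaps) / len(gaps)
    jitter = 0.0 if mean_gap <= 0 else min(1.0, pstdev(gaps) / mean_gap)

    # Equal weighting is an arbitrary choice for this sketch.
    return 0.5 * curvature + 0.5 * jitter


# A scripted pointer: straight line, one sample every 100 ms.
scripted = [(0, 0, 0), (10, 0, 100), (20, 0, 200), (30, 0, 300)]
# A hand-made "human-like" trace: wobbly path, uneven timing.
wobbly = [(0, 0, 0), (4, 7, 35), (9, 3, 160), (30, 0, 210)]
```

A scorer this simple also shows the weakness of the idea: once the features
are known, a bot only has to add noise to its path and timing to pass.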

~~~
guru4consulting
Google is probably already working on it.

But what if the device is mobile or tablet (or touch screen laptop) where
captchas are solved without mouse movement? I guess that could be the reason
why Google has not implemented it yet.

~~~
milankragujevic
Possibly. They could also track device information like motion sensors, which
are available without a prompt by default on Chrome, and use the motion
information from the device to decide on the randomness of the movement. In
the worst case, they can just show you 5 walls of captchas and pass you.

------
nkingsy
Step 1: Use Captcha to get free AI training data.

Step 2: Sell improved AI.

Step 3: Add edge cases to the Captcha that your AI can't handle yet. Go to
step 1.

~~~
EGreg
Can someone please explain to me why smart people keep saying that we’ll just
build software to detect deepfakes?

When AlphaZero plays against itself it just gets better. Once that spread is
so narrow, every deviation is in the range of possible error. And it can’t
explain WHY a certain move is better or why the picture is a squirrel. And
sometimes it can be wrong, but only on hilarious edge cases; in the rest we
just “trust it” because we don’t know one way or the other!!

Once deepfakes have covered every possible thing that could give away the
deepfake, what remains is dubious arguments as to WHY something is or is not a
deepfake. It could be wrong or not, we would just “trust it”?

I am saying that if there is a range past the uncanny valley for adversarial AI
then once the generative network is there, it’s game over. They can quickly
generate any amount of speech said by anyone for example, and we will not know
whether they said it or not. All audio and video evidence would be
inadmissible without watermarks. And then we would have to trust whoever made
the watermark.

In short - mutually distrusting byzantine consensus or signing will be
required for any claim.

~~~
Freak_NL
Sci-fi author John Varley covered this in one of his novels.

In a society where anything digital can be manipulated, the veracity of a
speech or event depends on a network of trust and people who witnessed the
event _in real life_. If someone made a speech in front of hundreds of live,
physically present spectators, these spectators can then verify the
authenticity of any recording, and as long as there are people in that group that
you trust (directly, through your trusted network, or perhaps as journalists
of good standing) you can trust that the recorded speech is what was said.

Conversely, a recording of a world leader saying they recommend Coca-Cola to
prevent tooth decay wouldn't have any credibility at all if no one credible
was actually there to witness it.

~~~
NoNotTheDuo
What happens when there's a global pandemic and everyone is forced to stay
home and cannot attend or witness any of these speeches, but instead has to
watch the broadcast over the internet and/or TV?

~~~
ByteJockey
Then we just generate the speeches and let public figures do more work dealing
with the pandemic, because it doesn't matter at that point.

Ideally anyways. Realistically, we probably just argue on the internet about
which talks were real (in this case generated by the person pictured in them).

------
jacquesm
At some point Captchas will be easier to solve with software than by people.
Hopefully by then we can retire them completely. The onus for proving that a
visitor is not a bot should be on the server side, not on the humans.

~~~
idunno246
Google is already trying to retire them. v3 has no challenges; they just
generate a score, completely transparent to the end user.
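
For context, a rough sketch of what the site owner does with that score. With
v3 the page obtains a token, the server posts it to Google's `siteverify`
endpoint, and the JSON reply carries `success`, `score` and `action` fields.
The helper names and the 0.5 cutoff below are assumptions for this example;
each site picks its own threshold and its own fallback.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def fetch_verification(secret, token):
    """POST the client token to siteverify and return the parsed JSON reply."""
    data = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    with urllib.request.urlopen(VERIFY_URL, data=data, timeout=10) as reply:
        return json.load(reply)


def allow_request(verification, expected_action, threshold=0.5):
    """Decide from a siteverify reply whether to let the request through.

    The threshold is this sketch's assumption, not a fixed rule.
    """
    if not verification.get("success"):
        return False  # token missing, expired, or otherwise rejected
    if verification.get("action") != expected_action:
        return False  # token was issued for a different page action
    return verification.get("score", 0.0) >= threshold
```

What happens below the threshold is entirely up to the site: it can block the
request, ask for another form of verification, or just flag it for review.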

~~~
ronsor
"Transparent" until Google doesn't trust you for some insane reason.

Now I can be denied access because of an opaque score, with no way to get
around it.

Yay, progress!

~~~
eggsnbacon1
Yup. Setting some Tor privacy extensions in Firefox already locks you out of
Google login. Yay!

------
spullara
I remember a Yahoo hackday from around 2009 where one of the teams created a
captcha that required you to match photos to tags. Another team made a program
that could look at a photo and assign tags. Mutually assured destruction.

~~~
normanmatrix
Lo and behold the wonder that is adversarial AI - 11 years later we can
automate this. My guess is that Google is quite happy about open source
Recaptcha solvers. Delicious fodder for improving the adversary.

~~~
spullara
They were both automated 11 years ago using Flickr data.

------
ilovefood
Woah, I've never seen something like this done with bash and command line
utilities. Thanks for sharing, it made me find xdotool, which is exactly what I
needed to replace SikuliX. I can't comment on the project since I didn't test
it, but wow, just this example
[https://bitbucket.org/Pirates-of-Silicon-Hills/voightkampff/...](https://bitbucket.org/Pirates-of-Silicon-Hills/voightkampff/src/7b01ceabbe198a2905bd9445ff3bfece51735bed/main_visionAPI.sh#lines-806)
made me wonder why OP went with Bash instead of a scripting language
(Python?). Don't get me wrong, the Bash level is strong with this one :)

~~~
WaxProlix
Sikuli has the benefit of being fairly cross-platform, so scripts written for
X will also work in Windows and in theory other more esoteric window manager
environments.

It also has the ability to identify visual cues, which this doesn't seem to
do.

~~~
ozymandias2049
Sikuli is not fast enough. I have known about it since it was developed at
MIT, because I was there in 2010.

------
clairity
while neat, it's a boon to spammers and other nefarious actors who can use
these techniques to further get around captcha... concerning to say the least.

if we want to defeat time-wasting and privacy-invading captcha and the murky
ethics aren't a concern, then we should go to every e-commerce site that
employs them, with full crap-blocking privacy mode on, load up on
products/services, and then abandon our carts at the captcha.

maybe we can have botnet operators take this idea and run with it, despite the
murky ethics.

~~~
ocdtrekkie
I am perfectly fine with rendering CAPTCHAs absolutely useless for bot
prevention if it expedites websites finally getting rid of the absolute pox
that is reCAPTCHA.

~~~
johnward
Yeah. Text captcha/Recaptcha can be solved by spammers. The more advanced the
captcha, the more CPU time it burns up, but it still does little to stop them.
It just makes it super inconvenient for me to solve. Sometimes they aren't
even human readable.

~~~
moooo99
In the worst case, there are people in poor countries solving those captchas
for relatively little money. Probably not worth it for everyone, but when
operating a botnet on a large scale, that's probably a pretty attractive
option as well.

------
gruez
I wonder how good it is at solving the "hard" (with noise added, e.g.
[https://i.imgur.com/jbf0Xfy.png](https://i.imgur.com/jbf0Xfy.png)) captchas.
I went through the "past puzzles" zip, and they look easy, compared to the
"hard" ones you get if you're on tor (or otherwise "shady" network) or the
site owner has turned up the difficulty setting[1].

[1] [https://stackoverflow.com/questions/23314528/how-to-reduce-r...](https://stackoverflow.com/questions/23314528/how-to-reduce-recaptcha-difficulty-level)

~~~
Shared404
> compared to the "hard" ones you get if you're on tor (or otherwise "shady"
> network)[.]

In my (limited) experience they serve you "hard" ones, and then continue
serving you "hard" ones until you give up and go do something else.

~~~
jaggirs
Interesting, Google technically has the power to ban whoever they want from
any website that uses their captcha.

All they have to do is give out impossible captchas.

------
hinkley
Is there really no way to passively collect enough data from the user agent to
determine the likelihood that an actual human was involved?

I suppose there are a bunch of privacy and corner cases that would mess that
up (for instance, composing a reply in a text editor and pasting it would be
indistinguishable from a bot).

Has anyone tried? Are there write-ups I could read?

------
eeegnu
They mention their own captcha system but don't really reference it anywhere
as far as I can tell. I'd guess it'd be based on something like ARC (Abstract
Reasoning Corpus) instead of object recognition, which already has decent
solving methods if you can get enough data.

~~~
hoerzu
couldn't find anything...

------
sktguha
Did you beat reCAPTCHA v3 also?

~~~
ozymandias2049
See videos for Vision API. Score over 0.7 for 40 hours

------
dylan604
Are we saying that the Google image recognition is inferior to Amazon's with
this?

------
knoebber
Does it beat ReCaptcha v3?

~~~
ozymandias2049
Apparently yes. V3 is just the system used by V2 to set puzzle difficulty. I
scored 0.7 for 40 hours on Vision API videos.
[https://www.youtube.com/watch?v=7tPcs06fgbg](https://www.youtube.com/watch?v=7tPcs06fgbg)

Go to the channel for the other 3.

~~~
zikohh
Thank you for the link, I couldn't find it!!!!

------
jaflo
What is the Project Touch-Captcha (두 터치) mentioned in the readme?

~~~
ozymandias2049
I will announce it in a couple of days.

------
pknerd
Does it work with Invisible Captcha? I do not think so.

------
ozymandias2049
Media Inquiries: pirates.of.silicon.hills@gmail.com

------
yters
what we need is an unbeatable captcha

~~~
echelon
Pay a micro transaction to perform an action. In the end it's more about cost.
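
The cost argument can be written as a one-line inequality: an automated action
is only worth running while the revenue it brings in exceeds the fee plus
whatever the bot pays to get past the check. A small sketch, where the
function name and every number are invented for illustration:

```python
def spam_is_profitable(fee_per_action, solver_cost_per_action, revenue_per_action):
    """True if one automated action still earns more than it costs.

    All three amounts are in the same currency unit; the values used below
    are made-up examples, not measurements.
    """
    return revenue_per_action > fee_per_action + solver_cost_per_action


# Hypothetical spammer: earns 0.002 per action, pays 0.001 per solved captcha.
without_fee = spam_is_profitable(0.0, 0.001, 0.002)
with_fee = spam_is_profitable(0.01, 0.001, 0.002)
```

With these invented numbers the action pays off when the fee is zero and stops
paying off once a fee of 0.01 is charged, while a human performing the action
a handful of times would barely notice the same fee.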

