I have a feeling this won't last very long now that they've publicized the fact that they profile a user's interaction with the system. Things like the rate of hitting captchas, mouse movements, characters typed before pressing send, etc., are all easy to mimic and control if you know they're analyzing that info.
I thought maybe I was the only one who struggled with reCAPTCHA. On some sites I have literally tried 10-15 times before becoming too frustrated and giving up, because whatever it was (a form to fill out for a download, a comment on a site) just wasn't worth the time I was wasting. Hugely irritating.
Initially, I was using reCAPTCHA on my site. But after hearing from irate users, and especially after seeing it prevent new users from registering, I decided to disable it and use a simplified CAPTCHA. Losing potential new users to a CAPTCHA wasn't acceptable.
If you own a website and use reCAPTCHA for human verification, you know it's been broken for years: it's either too hard for humans, too easy for bots, or bypassed using third-party labor for pennies per solve.
I find it difficult to believe the captchas served over the last year or so were actually scanned from books: they were so completely illegible and nonsensical. I usually had to click refresh about half a dozen times before I could even find a sample that I could read correctly.
Half of the captcha (the illegible nonsense) is the actual test. The other half is usually easy to read; that's the scan from the book. You can actually answer anything for that part and still pass the test, although obviously if you do, you're not helping digitize books.
I figured out a while ago that you only ever need to type the nonsensical string.
I think it's pretty clear the reading-books bit was abandoned long ago. I never get non-test words that are in any way a struggle for a competent OCR system. And on the occasions that I do, it's impossible for me to read them either. If they provided context it would be much more helpful.
As an aside, if you've ever had to solve one of these through Tor and you happen to be running through some Eastern European countries... good god, those are the most frustrating captchas I've ever seen. Long strings of "mnnmrnrmnm" with contrasting colors and JPEG artifacts... a few attempts at solving those makes me want to kill someone. I feel bad for people trying to do anything on the internet from those countries. I wonder what the rationale is for making captchas nearly impossible to solve in specific regions.
It's likely the case that the house numbers are from Google Street View, and they are using them to improve the addressing for Google Maps.
I'm a bit puzzled by this update, though. I have reCAPTCHA on a wiki that I maintain, and I still see the traditional text based ones, not anything like these new number based ones. Are they rolling this out slowly?
While it's nice that they're trying to design the captchas to be more user friendly, these new number captchas are quite easy to crack. This might lead to an uptick in spam as reCAPTCHA-solving OCR tools become easier to create.
You are missing the whole point. The new "easier" captchas are shown only to people they are already sure are human. I suspect they are looking at the number of previous captchas you have solved correctly, whether you're logged into Gmail, etc.
You're correct. Refreshing the page a couple of times reverts the captcha to the old one with letters. So I guess they keep track of how many times they serve captchas to your IP address in an hour, and if you go over some limit they start serving you the harder captcha.
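If that guess is right, the server-side logic would be something like this sketch (the one-hour window and the limit of 5 are invented numbers, and the real system is surely more involved):

```python
# Hypothetical per-IP throttle: count captcha serves in a sliding
# window, and switch to the hard variant once an IP exceeds the limit.
import time
from collections import defaultdict, deque

WINDOW = 3600  # sliding window in seconds (assumed: one hour)
LIMIT = 5      # serves allowed before switching to the hard captcha

served = defaultdict(deque)  # ip -> timestamps of recent serves

def captcha_difficulty(ip, now=None):
    now = time.time() if now is None else now
    q = served[ip]
    # Drop serves that fell out of the window.
    while q and now - q[0] > WINDOW:
        q.popleft()
    q.append(now)
    return "hard" if len(q) > LIMIT else "easy"
```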
Google knows your searches, and I am pretty sure google knows every site you visit, via google analytics, which most sites run.
Combine this with browser fingerprinting (your browser's fingerprint is nearly unique), and the fact that you probably have a Google account. There is a high probability they know who you are, and from your history they can determine whether you're human.
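To see why fingerprints are so identifying, consider this toy illustration: hashing just a handful of browser attributes already yields a value that very few other browsers will share (the attribute values here are invented examples, not a real fingerprinting protocol):

```python
# Toy browser fingerprint: concatenate a few attributes in a stable
# order and hash them. Small differences in any attribute produce a
# completely different fingerprint.
import hashlib

def fingerprint(attrs):
    blob = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(blob.encode()).hexdigest()[:16]

browser = {
    "user_agent": "Mozilla/5.0 (Macintosh) ...",
    "screen": "1920x1080x24",
    "timezone": "UTC-5",
    "fonts": "Arial,Helvetica,Times",
    "plugins": "pdf,quicktime",
}
```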
I find your statement weirdly amusing. You are deploring a lack of progress regarding something (it is increasingly difficult to tell humans and computers apart) that you would probably view as progress if stated differently (computers are getting increasingly smarter). I feel your pain, but this is a very tough problem and it won't get any easier soon.
Perhaps in some glorious future utopian society, humans won't have to see CAPTCHAs at all.
Of course some bots may make use of high-level browser engines (such as those provided by acceptance testing frameworks) to try and get around this, plus you'll always have cheap human labor. But ultimately, anti-spam is an arms race and simple tactics like this will get rid of most unwanted agents.
What worked in the end was a points system for spammy behavior: First post has URL in it? +1 point. User fills out linkedin field on profile? +1 point (seriously, none of our legit users did this...). User posts a word on the blocklist? (viagra, cialis, cvv2, etc) +1 point. User Agent is IE? +1 point (we're a Mac site). After a certain number of points, the user was banned and all their generated content deleted. After a certain number of posts without triggering the ban, they're greenlighted. Spammers quickly noticed their posts disappeared instantly and left the site.
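A minimal sketch of that points system (the thresholds, field names, and the ban cutoff of 3 are made up for illustration; the real rules were whatever fit the site):

```python
# Score a post for spammy behavior, as described above. Each rule that
# fires adds a point; past some threshold the user gets banned.
import re

BLOCKLIST = {"viagra", "cialis", "cvv2"}

def spam_points(user, post):
    points = 0
    # First post contains a URL.
    if user["post_count"] == 0 and re.search(r"https?://", post["body"]):
        points += 1
    # Filled out the linkedin profile field (legit users never did).
    if user.get("linkedin"):
        points += 1
    # Post contains a blocklisted word.
    words = set(re.findall(r"[a-z0-9]+", post["body"].lower()))
    if words & BLOCKLIST:
        points += 1
    # User Agent is IE, on a Mac site.
    if "MSIE" in post.get("user_agent", ""):
        points += 1
    return points
```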
Umh, anything that will end up in the POST request will be reproduced by a bot, I don't even actually look at the page when implementing screen scraping modules, but just at the Network tab of the Chrome Dev Tools.
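To make the point concrete: everything the browser sends, hidden inputs included, is just fields in the POST body, so a bot can rebuild the request from what the Network tab shows (the URL and field names below are hypothetical):

```python
# Rebuilding a form POST verbatim from captured request data. Hidden
# inputs and tokens are just more key/value pairs in the body.
from urllib.parse import urlencode
from urllib.request import Request

fields = {
    "username": "bot",
    "comment": "spam",
    "csrf_token": "abc123",   # copied straight from the captured request
    "hidden_marker": "1",     # a "hidden" input is indistinguishable here
}
req = Request(
    "http://example.com/comment",
    data=urlencode(fields).encode(),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
```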
What I think have the potential to remove the need for conscious CAPTCHA solving is what Google is supposedly doing here: machine learning on behavioral patterns in the user interaction with the form (instead of just with the CAPTCHA).
No no, the hidden input fields would not be filled in by a human (because they're positioned off screen by CSS, for example), whereas a bot would fill them in because the bot is just scraping the HTML for all input fields and filling them in with something.
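The server-side half of that trick is tiny. A sketch, assuming the form includes a CSS-hidden field (the name "website" is arbitrary):

```python
# Honeypot check: the form ships a field humans never see, e.g.
#   <input type="text" name="website" style="position:absolute;left:-9999px">
# A scraper that fills every input it finds will populate it; a human
# never will, so any non-empty value flags the submission as a bot.

def is_probably_bot(form_data):
    return bool(form_data.get("website", "").strip())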
It seems like you had that backwards -- hope that clears it up.
It's meant as a defense against non-targeted bots—the ones that roam the web looking for forms to fill out.
I think that's where the issue lies: these tools do different things. Honeypot fields defend against general bots. Captchas defend against specific bots, too, but also have greater friction, so are only used when specific bots are an issue.
I did my mom's website for her small law firm, and I was getting tons of bots even with hidden form fields, and they didn't look targeted. A captcha helped a lot, but the spam didn't stop until I used both, combined with special rules, like requiring the phone number field to contain a certain number of digits.
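That last rule is a one-liner. A sketch, with invented digit bounds (7 to 15, roughly local-number to full-international length):

```python
# Reject phone fields that don't contain a plausible number of digits,
# regardless of formatting characters like spaces, dashes, or parens.
import re

def plausible_phone(value, min_digits=7, max_digits=15):
    digits = re.sub(r"\D", "", value)
    return min_digits <= len(digits) <= max_digits
```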