If you want to post on 4chan and don't have a Pass, you need to solve a captcha for every single post. It becomes easier with practice; I fail maybe 1 in 10 captchas. And the more captchas you solve correctly, the easier the captchas for your IP get.
>Checking for offensive words is only one of the filters we have. Even without that filter, it’s essentially impossible to get recaptcha to return a false result.
Famous last words :P
Mankind will probably hate me, but I'm the kind of person who answers the control word correctly and writes an incorrect but similar word for the unknown one.
This is my way to protest against recaptcha.
Sure, maybe sometimes you get a weird one and fail it. But typically the next challenge is easy to pass. Seems the author cherry-picked some of the worst reCaptcha examples for the article, but wrote it in a way that made it seem they were presented back-to-back.
Besides this -- the article makes no attempt to offer a better solution.
Captchas are really the best way we have right now to "prove" someone is not a bot. Hidden form fields, etc., don't work and are easily spoofed. Sure, captchas can be beaten by bots sometimes -- but I trust Google's scale/volume with ReCaptcha to handle that for me (for the most part).
Captchas are not going anywhere anytime soon.
That's completely irrelevant. Criticism is not about solving the problem. It's about pointing out that the current solution is inadequate.
Most movie critics never wrote, directed, or acted in movies. It doesn't invalidate their criticism.
In fact, your criticism of the other poster's criticism doesn't offer a better solution than criticism either. You simply criticize that post. (And that's OK, if ironic.)
I hate with a passion the attitude of "Don't bring me problems. Bring me solutions". Sure, if you've got a solution as well, that's great. But I'd much rather know there's a problem that you don't have a solution to than be completely ignorant of it.
As a tools developer, I observe that user-provided solutions almost never address anything outside their specific problem, which is potentially only one of many things the feature with the issue is designed to address.
Some of my colleagues tend to point to user feedback as gospel, including the ways suggested to "fix" the issue. But those fixes are often myopic and laden with technical debt.
But I will never disbelieve a claim that something is confusing, or hard to use (almost never, anyway; some people are idiots). Just don't be offended if I don't fix it in the way you came up with.
Well it's a pointless criticism when everyone knows CAPTCHAs suck. People have spent a lot of time working on them trying to find better ones that work consistently at scale and have failed.
It's similar to someone now writing a criticism of the testing procedures at the Chernobyl nuclear power plant.
It's turning into the new version of "3rd world outsourced phone support": a strong indicator the user simply doesn't care about the customer experience.
For some industries / companies, this is perfectly OK and BAU. For others it can be a company-killer.
I see a captcha and I know the company doesn't like me, doesn't like what I'm doing, and doesn't care if I know how they feel about me. And for some situations that's perfectly OK. Certainly not all.
Which is truly unfortunate, as they're a fucking abomination, an embarrassment to the IT industry in general.
Try solving these actual examples:
I don't quite see why people assume recaptchas are real words; they haven't been for a really long time. The control is always a made-up word and almost always solvable. If you can't solve the other one, enter whatever.
First one: a rhaval
Second one: a onsupsel
You need to realize that only one word needs to be entered. For the other word just enter any string (or even no string works in some/all cases).
 the idea was to always write the correct word and just enter "nigger" as the other word. This eventually led to reCaptcha disregarding 4chan answers from the pool of resolving weird words.
However, the whole point of them having the humans do that work is /they do not know the correct answer/. Since they do not know the correct answer they cannot be basing the test of the CAPTCHA on it. From there it is not a big leap to surmise that you are only actually being tested on the word that actually looks distorted on purpose.
It's probably written on the reCAPTCHA website or in the Wikipedia article, but basically you get presented with one word that machines can't read (the distorted letters) and something out of a book that Google has digitized and uses you to OCR. You also get a lot of address numbers for Google Street View.
Just input the distorted word and type anything for the other, and it will work. Of course the instructions won't tell you that, or else people would act accordingly and Google would lose this free labor source for the tedious work of proofreading digitized versions of books.
I can totally understand why those would frustrate non-technical users, but on a site like HN I would expect people to know how reCaptcha works.
That said I think reCaptcha has been getting much harder recently due to the arms race with bots. I now sometimes fail 3-4 times in a row.
Why should you even expect that? Recaptcha was interesting in 2006, but anyone not following the news around that time or around its acquisition by Google might not have learned this.
You're probably right. But the fact remains that captchas aren't good enough. They can be partially automated; blackhats can use captcha solving farms which will be at least as accurate as the average human (probably more accurate, I imagine).
A better solution might employ heuristics similar to DDoS mitigation techniques. I really don't know, but there is a need for something better here.
That might be the problem right here. Try browsing with Tor or passing through an anonymizing proxy. The more you solve correctly, the easier they get. The more unknown you are, the harder.
I personally experienced this and can't wait.
2. Free human OCR
So why don't you stop using Google products if you hate it so much?
I think any site that uses reCAPTCHA must not have any regular vision impaired users.
The mission statement of the reCAPTCHA project is "Protect your website from spam and abuse while letting real people pass through with ease."
I don't think anybody is passing through the audio captchas with ease. Nor are they helping to digitize anything.
I think that the use of captchas as a reverse Turing test was always secondary for reCAPTCHA anyways. If someone can write software to solve the visual captchas, that is a great accomplishment. Once that is accomplished, we will need a new type of reverse Turing test. Perhaps we are approaching that point.
I'm interested to hear the perspective of users with a visual impairment on the audio captchas.
while yours is
The Yahoo captcha used rotating, bouncing letters on a scrolling background of more letters - ridiculous. Microsoft's was just a typical smeared mess, but no easier to actually solve.
I think I failed each at least 3 times.
It's not just difficult captchas, but use of them everywhere. The site my university recommends for ordering textbooks starts inserting captchas if one searches more often than perhaps twice within a minute. Another I can't recall the details of requires a captcha solve to make any sort of profile change despite being previously authenticated.
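For what it's worth, the "captcha after more than about two searches a minute" behaviour described above is probably just a sliding-window rate check. Here is a minimal sketch of that idea; the class name, the limit of 2, and the 60-second window are all assumptions for illustration, not anything known about that site's implementation.

```python
from collections import defaultdict, deque


class CaptchaTrigger:
    """Hypothetical helper: flag clients whose recent request rate is too high."""

    def __init__(self, max_requests=2, window_seconds=60.0):
        # Both thresholds are illustrative guesses, not taken from any real site.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client id -> timestamps of recent requests

    def needs_captcha(self, client_id, now):
        """Record a request at time `now` (seconds); return True if it exceeds the limit."""
        stamps = self._history[client_id]
        # Forget requests that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        stamps.append(now)
        return len(stamps) > self.max_requests
```

With these numbers, the third search inside a minute is the first one that gets challenged, and a client that waits out the window goes back to searching without a captcha.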
ReCaptcha only requires one (1) out of two (2) words to be correct in the challenge.
It presents one known-by-the-system-word, and one not-known word. If you get the known word correct (the easier of the two to read) then it passes the challenge.
ReCaptcha then pools the answers for the second not-known word and after pooling thousands (or more) responses, then that word becomes "known" based on the average answers (and then that word is "digitized" and used by google maps, or ebooks, etc).
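To make the description above concrete, here is a rough sketch of the control-word / unknown-word logic as described in this comment. It is not Google's code: the class, the vote threshold, and the agreement ratio are hypothetical, and it uses a simple majority vote where the comment says "average answers".

```python
from collections import Counter, defaultdict


class TwoWordChallenge:
    """Hypothetical model: grade only the control word, pool guesses for the unknown one."""

    def __init__(self, min_votes=3, min_agreement=0.75):
        # Illustrative thresholds; the comment above suggests thousands of responses.
        self.min_votes = min_votes
        self.min_agreement = min_agreement
        self.votes = defaultdict(Counter)  # unknown word id -> tally of transcriptions

    def check(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Pass or fail on the control word alone; count the other answer as a vote."""
        passed = control_answer.strip().lower() == control_truth.lower()
        guess = unknown_answer.strip().lower()
        if passed and guess:
            self.votes[unknown_id][guess] += 1
        return passed

    def consensus(self, unknown_id):
        """Return the agreed transcription, or None if there is no clear winner yet."""
        counts = self.votes.get(unknown_id, Counter())
        total = sum(counts.values())
        if total < self.min_votes:
            return None
        word, n = counts.most_common(1)[0]
        return word if n / total >= self.min_agreement else None
```

Note that in this model a wrong control answer fails the challenge no matter what is typed for the other word, and a junk answer for the unknown word only matters if enough people agree on the same junk.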
And for those wondering, I find it easiest to read captchas by just looking at the letters by shape.
Going down the list in the article:
Again, it's important to note, you only have to get one of the two words correct to pass the challenge. So.. probably 99% of the above list would pass.
Edit in response to your edits
You've deleted your previous edit. Still, even with your current edit it is clear you are not actually reading the article. You say:
However, the author of the article explicitly states that he did this:
I decided to just guess the first word and hope “secretary” was the control. It wasn’t.
So the author correctly identified one of the two words (and makes the same identification as you did), but was still rejected because it was not the control word.
You obviously don't have an accurate understanding of ReCaptcha implementation, and you apparently are not reading the article with comprehension, despite claiming several times that you have.
I do (I've implemented them many times), but no point in arguing.
I must be the only person who finds the level of security a captcha provides worth the 1 to 2 seconds it takes to type in a Captcha. And if done properly, you should only have to type a captcha once per site.
Which is easier? Allow form spam on your site, or have a user type a captcha once the first time they visit and decide to post a comment or something? Captchas have provided a tradeoff between inconvenience and protecting your site.
Also, you say that you have an accurate understanding of ReCaptcha implementation based on the qualification that you have "implemented them many times". ReCaptcha was created by Google, so unless you work for Google on the team that implemented ReCaptcha, it doesn't seem possible for you to have "implemented them [ReCaptcha] many times".
From the article, in the context of ReCaptcha, it seems like Google has stabbed itself in the face with the sword of data.
Google may think the data is telling it something, but what it's really managing to do is irritate legions of humans with terrible (borderline hostile in this case) UI/UX.
I do have to admit most of those are cases where both words are difficult or impossible. But we can at least assume that the easier of the two (the one not cut in half) is the control in most of those.
The article's largest complaint is not being able to read one (1) of the two (2) words in the captcha challenge.
I was pointing out that this complaint is not valid, since reCaptcha (where all of the article's screenshots are from) only requires one of the 2 words to be correct.
It’s important to note the way reCAPTCHA works. Each user (or bot) is presented with a control word, and a word unrecognized by OCR. This control word is already known to Google (who runs reCAPTCHA). If you get this first word right, it is assumed that you get the second word correct as well. So, in reality, you only need to guess the key word correctly.
The author explicitly addresses your point and if you looked at his examples, most are very difficult for both words. In many of his examples, the control word is distorted beyond reasonable recognition, and the new word is cut in half or worse.
The article does make this point.
After your 'edit' my comment makes no sense.
I don't want to spend more than a second or two working out what a captcha says - if it wasn't something I absolutely needed I'd probably have gone away long before the author's patience ran out.
Basically I have to fill in the number and then guess whether it was the first or second set of characters and fill out bogus before or after the number and hope I got it right. The numbers weren't even hard for a computer to read. The only thing it does is waste everyone's time.
Met a couple people who work at the company (I'm in Detroit and think that's where it is based) at some startup events a couple years ago. That's how I found out about it.
It was for a contact form on a vendor's website. Ended up going with another vendor who had identical product
the on/off metaphor is not clear either - at least make the "login" button be not enabled until the switch is moved
It's been sitting on the login form for at least a minute now, filling up the switch background btw
Seems to be doable. The user pays 1 usd/month and gets 100 credits. The extension author can outsource the solving to http://antigate.com/ and get the answer in 15 seconds.
If someone could use [something like this](https://github.com/mekarpeles/captcha-decoder) to make an extension it would be great.
Then we just hope that the spammers create a perfect solver again :)
This is why visual recognition is just one of the signals you need to use to tell humans and computers apart http://googleonlinesecurity.blogspot.com/2014/04/street-view...
And generally it is a very bad idea to choose the most popular service among the alternatives, as by doing so you are contributing to the centralization and monopolization of the Internet.
Think of it in a Bayesian sense.
If 10% of anonymous users end up being bots (the prior), and the "hard" recaptcha has a 1% false-negative (incorrectly identifying someone as a human) rate, then of the anonymous users who succeed in getting past the recaptcha, .1% will be bots (the posterior).
But if 1% of sign-in users are bots (probably less than that), you only need a recaptcha with a 10% false-negative rate to achieve the same bot throughput limit. And, those users are less frustrated.
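The arithmetic above is easy to check. This is a small sketch of the Bayes calculation using the numbers from the comment; the function name is made up, and it assumes every human passes the captcha (`human_pass_rate=1.0`), which the comment does not state and which is clearly optimistic.

```python
def bot_share_after_captcha(prior_bot, bot_pass_rate, human_pass_rate=1.0):
    """Posterior probability that a visitor who passed the captcha is a bot."""
    bots = prior_bot * bot_pass_rate               # bots that slip through
    humans = (1.0 - prior_bot) * human_pass_rate   # humans that get through
    return bots / (bots + humans)


# Anonymous users: 10% bots, "hard" captcha with a 1% false-negative rate.
anonymous = bot_share_after_captcha(0.10, 0.01)
# Signed-in users: 1% bots, easier captcha with a 10% false-negative rate.
signed_in = bot_share_after_captcha(0.01, 0.10)

print(f"anonymous: {anonymous:.3%}  signed-in: {signed_in:.3%}")
```

Both cases come out at roughly 0.1% bots among those who pass, which is the point being made: a lower prior lets you get away with a much easier captcha. If humans fail the hard captcha more often than the easy one, the anonymous case gets a little worse, not better.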
Pretty neatly conveys the feelings on this topic.
They are getting ridiculous.
One simple way for minimizing junk going through automated submits. Idea without using recaptcha at all:
It works only with JS enabled and uses randomization in order to stop bots learning how to avoid it.
I don't even get the point of it, since you can get past them by just hiring people at sites like http://antigate.com/ for as little as 70c per 1000 captchas
Time to switch to next, harder, AI problems as captchas :)
Also a couple of examples http://alicious.com/hard-recaptcha-huh/.
Also, '“Onightsl”? “Onighisl”? Are those even words?' No, my understanding is that dictionary words are never used as the control, so as not to be vulnerable to dictionary attacks.
Edit: I'm not suggesting that these captchas are in any way good; they do clearly have issues. I'm just saying that the storyline in the blog post seems contrived. To me it would be more convincing if presented in a more genuine manner. However, perhaps he was simply very unlucky.
There you are, talking on and on and on about some tiny unimportant but extremely specific implementation detail no one should ever have to care about. People shouldn’t have to read a manual about the inner workings of this captcha implementation (and have some experience with what types of text computer vision is good and bad at recognising!) to have any chance solving it.
In this case the author clearly had no idea how that control/unknown system works in detail (it seems like they, just like me, only know that you do not have to recognise both, but they didn’t really understand the reason for that – nor should they have to) but that doesn’t really matter for their argument even a tiny bit.
For me at least, the point would have come across better if that (seemingly) false ignorance were dropped. (Either that, or frame it in terms of, "Here's what an average user sees when they try to log in," or something along those lines.)
Except for some untrusted websites / users who can get really difficult captchas sometimes: https://i.imgur.com/6pAatnC.png
The whole thing is a technology arms race. The best solution would be one where you simply verify fixed private information. We use captchas for verifying a human being is not a bot, right? And we do that because we assume the user is anonymous for a short time.
Instead we could simply provide a secured authentication gateway where one could provide private information that is linked to a human identity. That way it can't be abused unless they have an unlimited supply of stolen identities. Even better would be if everyone signed up for a TOTP service provider and used their token generator and service-account to prove their human-ness without needing to put in sensitive information. But that's probably too much work.
I know what you're trying to say here, but consider today's xkcd as a counter-point. I think "most people" are quite capable of solving a lot of puzzles. This issue is that any puzzle that we can solve in a reasonable timeframe is often a good target for a computer-generated solution as well.
You can still come up with new ways to verify someone is a human for specific uses where you want anonymity, but they will always be part of the tech arms race if you want them frictionless. To avoid them getting more annoying you need a way to authenticate an individual identity, as that allows you to rate-limit access.
You could, of course, do TOTP and totally preserve anonymity. Unless the TOTP service provider is compromised, in which case all bets are off (but perfect-forward secrecy might solve that?)
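For anyone unfamiliar with TOTP, here is a minimal standard-library sketch of the RFC 6238 algorithm (HMAC-SHA1, 30-second steps). The function names and the one-step drift tolerance are my own choices for illustration; note that it only proves possession of a shared secret, so whether that counts as "human-ness" is exactly what is being debated here.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time password with HMAC-SHA1."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, unix_time: int,
           step: int = 30, drift_steps: int = 1) -> bool:
    """Accept codes from the current step and a few neighbours to allow clock drift."""
    for d in range(-drift_steps, drift_steps + 1):
        t = unix_time + d * step
        if t >= 0 and hmac.compare_digest(totp(secret, t, step, len(submitted)), submitted):
            return True
    return False
```

The service and the user's token generator share `secret`; the site only ever sees the short-lived code, which is why this can in principle be done without handing over identifying information.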
I agree that anonymity is orthogonal to the purpose of captchas, but usually a captcha is only required when you don't have identity. This can be because you haven't established identity, or because the identity is in question, but also because the site does not want to require identity. In fact, outside of first time user sign ups, most captchas are used specifically to allow people to engage without needing an account. So in most cases you use a captcha because you want to allow anonymity.
There already exist several systems like you describe: login with your Google account, Facebook, Twitter. There are already several comment systems (Disqus for example) which make using these as simple as using a captcha for sites that don't care about anonymity. We don't need to integrate identity into captchas.
Although...maybe you could outsource the question and answering to Mechanical Turk. Turn the whole thing on its head. Have a real person write a question to try to trick the bot into revealing its botness, have the real human grade the answer.
Out of curiosity, I went and opened the demo page (https://www.google.com/recaptcha/demo/ajax) in a new incognito window and timed myself. I can do about 8/minute at maybe 90% accuracy.
Captchas are only a problem if you compulsively refresh in hopes of getting something clear.
You're ignoring people with visual impairment or cognitive impairments.
If it's those, I guess google uses recaptcha to get data for streetview.
I think we really, really need a replacement solution for them that works as reliably vs. bots.
Scroll to the bottom.
An attacker can choose the frame that's easiest to attack, and can segment better with the help of motion vectors and differences between frames.
Can't remember where I saw it. Does anyone know?
Edit: From checking the source, it looks like they're using NuCaptcha (http://www.nucaptcha.com/). Looks like O2, Groupon, and StumbleUpon are also NuCaptcha customers. You can see examples on this page: http://nucaptcha.com/features/security-features
Yesterday I had to go through a moving captcha when trying to log into flickr. I got redirected to the yahoo login webpage where I copied and pasted those 20-something random characters yahoo had me working on for a cumulative time of an hour (I had to tweak pwgen to get some reaaally random stuff, only to see yahoo reject it as "too easy" and then wait for an hour or two before I could try again).
Then they had me confirm I was not a bot by asking me to type the moving letters in a captcha.
But point taken.
A simple solution is google Authenticator (or similar systems).
The only problem is finding a system that works for all kinds of users and equipment.