Hacker News

The article's largest complaint is not being able to read one (1) of the two (2) words in the captcha challenge.

ReCaptcha only requires one (1) out of two (2) words to be correct in the challenge.

It presents one known-by-the-system-word, and one not-known word. If you get the known word correct (the easier of the two to read) then it passes the challenge.

ReCaptcha then pools the answers for the second, not-known word, and after pooling thousands (or more) of responses, that word becomes "known" based on the average answers (and then that word is "digitized" and used by Google Maps, ebooks, etc.).
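For anyone who wants the scheme above spelled out, here is a rough sketch of it. This is not reCAPTCHA's actual code: the function names, the in-memory vote store, and the thresholds are all made up for illustration, and "most common answer" is my assumption for what "average answers" means in practice.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store of guesses for each not-yet-known word.
unknown_votes = defaultdict(Counter)


def normalize(word):
    """Compare answers case-insensitively and ignore stray whitespace."""
    return word.strip().lower()


def check_challenge(control_answer, control_guess, unknown_id, unknown_guess):
    """Pass only if the control word matches; the unknown word is never graded."""
    if normalize(control_guess) != normalize(control_answer):
        return False
    # A correct control word means the other guess is trusted and recorded.
    unknown_votes[unknown_id][normalize(unknown_guess)] += 1
    return True


def promote_if_agreed(unknown_id, min_votes=3, min_share=0.75):
    """Return the agreed reading once enough users concur, else None.

    The thresholds are illustrative assumptions; the comment above talks
    about thousands of responses.
    """
    votes = unknown_votes[unknown_id]
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None
```

Note that under this model the user cannot tell which word is the control. Typing the legible word correctly only helps if that word happens to be the one being graded.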

~~~~

And for those wondering, I find it easiest to read captchas by just looking at the letters by shape.

Going down the list in the article:

onightsl secretary.

. phaRega

o ndaaar

proximity rsgrrem

and khseeke

. azedcg

elearsal 5

ination amesye

se ebtyR

Reomi now

ivestshm nwre

Again, it's important to note, you only have to get one of the two words correct to pass the challenge. So.. probably 99% of the above list would pass.




No, the problem the author mentions is explicitly with ReCaptcha. He addresses your edit in the article, which you would know if you actually read it and didn't just skim. The problem, as evidenced by the author's many examples, is that the control word is often distorted beyond reasonable recognition, and the new word is not valid data. So neither of the two words is solvable.

Edit in response to your edits

You've deleted your previous edit. Still, even with your current edit it is clear you are not actually reading the article. You say:

> Again, it's important to note, you only have to get one of the two words correct to pass the challenge. So.. probably 99% of the above list would pass.

However, the author of the article explicitly states that he did this:

> I decided to just guess the first word and hope “secretary” was the control. It wasn’t.

So the author correctly identified one of the two words (and makes the same identification as you did), but was still rejected because it was not the control word.

You obviously don't have an accurate understanding of ReCaptcha implementation, and you apparently are not reading the article with comprehension, despite claiming several times that you have.


> You obviously don't have an accurate understanding of ReCaptcha implementation

I do (I've implemented them many times), but no point in arguing.

I must be the only person who finds the level of security a captcha provides worth the 1 to 2 seconds it takes to type in a Captcha. And if done properly, you should only have to type a captcha once per site.

Which is easier? Allow form spam on your site, or have a user type a captcha once the first time they visit and decide to post a comment or something? Captchas provide a tradeoff between inconvenience and protecting your site.
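As a rough illustration of the "once per site" idea, here is a minimal sketch. It assumes the site already has some session identifier for each visitor; the names and the in-memory set are hypothetical, and a real site would persist this state and expire it.

```python
# Hypothetical store of sessions that have already solved a captcha.
verified_sessions = set()


def handle_post(session_id, captcha_passed=False):
    """Return 'accepted', or 'captcha_required' if the visitor is unverified."""
    if session_id not in verified_sessions:
        if not captcha_passed:
            return "captcha_required"
        # Remember the visitor so later posts skip the captcha.
        verified_sessions.add(session_id)
    return "accepted"
```

The first post from a session gets challenged, and every later post from the same session goes straight through.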


Pardon my confusion, but wasn't your original comment arguing against homebrew Captchas?

Also, you say that you have an accurate understanding of ReCaptcha implementation based on the qualification that you have "implemented them many times". ReCaptcha was created by Google, so unless you work for Google on the team that implemented ReCaptcha, it doesn't seem possible for you to have "implemented them [ReCaptcha] many times".


Minor point, but reCAPTCHA was purchased by Google, not created by them.


I don't think you understand how to implement a captcha on your website. Unless you use a 3rd party CMS where implementing a captcha is just a checkbox and pasting in your api key, then it's a lot more work.


Don't re-invent the wheel? That is, please continue to give useful data to Google for free?

From the article, in the context of ReCaptcha, it seems like Google has stabbed itself in the face with the sword of data.

Google may think the data is telling it something, but what it's really managing to do is irritate legions of humans with terrible (borderline hostile in this case) UI/UX.


Yes, please do. Because Google has and will continue to make way better use of that data than anyone else.


The original article was specifically about reCAPTCHA, not homebrews, and how difficult they now are (something I've also noticed lately). Either give it a (re-)read, or if you're saying you were able to easily read the examples in the article please share the answers! :)


ReCaptcha only requires one (1) out of two (2) words to be correct in the challenge.

It presents one known-by-the-system-word, and one not-known word. If you get the known word correct (the easier of the two to read) then it passes the challenge.

ReCaptcha then pools the answers for the second, not-known word, and after pooling thousands (or more) of responses, that word becomes "known" based on the average answers (and then that word is "digitized" and used by Google Maps, ebooks, etc.).


Again, sorry, I've got to point you back to the original article. The author explains the details of reCAPTCHA's known/unknown word-pair clearly, as you have done, but goes on to explain that the impossible-to-read word was actually reCAPTCHA's "known word", so the CAPTCHA was impossible to pass.


Yes, but the author actually only tried once. Every other example he gives, he claims he hit refresh to get a new one instead of attempting it. Also, I have to wonder if he was simply mistaken about that first attempt. Are we sure it didn't just fail because his username/password was wrong and display a new captcha, causing him to assume he had failed the first captcha?

I do have to admit most of those are cases where both words are difficult or impossible. But we can at least assume that the easier of the two (the one not cut in half) is the control in most of those.


You might want to read the link... he's complaining about reCaptcha. The basic complaint is that the ability to recognize letters is no longer good enough to distinguish human intelligence. We need to identify some other trait that's simple for humans and difficult for computers. Those "home grown" captcha solutions are likely better in this regard than reCaptcha, since they don't have the possibly contradictory goal of digitizing books.


Please re-read the article


I have.

The article's largest complaint is not being able to read one (1) of the two (2) words in the captcha challenge.

I was pointing out that this complaint is not valid, since reCaptcha (where all of the article's screenshots are from) only requires one of the 2 words to be correct.


You must not have read very carefully then. From the article:

> It’s important to note the way reCAPTCHA works. Each user (or bot) is presented with a control word, and a word unrecognized by OCR. This control word is already known to Google (who runs reCAPTCHA). If you get this first word right, it is assumed that you get the second word correct as well. So, in reality, you only need to guess the key word correctly.

The author explicitly addresses your point and if you looked at his examples, most are very difficult for both words. In many of his examples, the control word is distorted beyond reasonable recognition, and the new word is cut in half or worse.


Actually, reCaptcha requires a specific word to be correct. Specifically, the illegible one.

The article does make this point.


You must have missed the part where he tried to guess with the easier of two words and still failed. The control words are rarely any easier to read than the unknown word.


Your original comment said that the biggest problem was home-grown captchas, and that people should use something established like recaptcha - the article was specifically complaining about recaptcha.

After your 'edit' my comment makes no sense.

I don't want to spend more than a second or two working out what a captcha says - if it wasn't something I absolutely needed I'd probably have gone away long before the author's patience ran out.


Google exploits ReCaptcha to recognize street numbers. When typing a numerical ReCaptcha, you are doing OCR for Google Maps for free.


Well, not exactly for free. It's a trade. You're willing to donate your small amount of time to Google in exchange for Google providing security benefits to the website which you are attempting to use.



