There was a good podcast about this [0] just a couple of weeks ago. They interviewed the guy who invented CAPTCHA as well as the head engineer on reCAPTCHA v3.

The gist of it was that in a few years, all CAPTCHAs will be useless because machine learning is too easy and cheap. The only way to defeat spam will be to use reCAPTCHA v3 or something like it, because those services will use what they know about you to determine whether you're a bot, plus their own machine-learned model of what "normal" behavior is for your website. It sounds like reCAPTCHA v3 is basically an app-level IDS.

[0] https://www.npr.org/sections/money/2019/04/24/716854013/epis...

> The only way to defeat spam

No, no, no. There are many ways to combat spam, and there is no silver bullet. "Determining whether you're a bot or not" is just one tool among many, and one that lets human spammers through, or gets too intrusive and starts blocking humans.

IMO the best approach still is to focus on the content they post rather than trying to figure out who/what they are.

Google wants you to think reCAPTCHA is the ONLY tool. That way they can get more user data.

> those services will use what they know about you

This is inherently user-hostile, as it presupposes tracking and identification. I don't want them to know anything about me!

Which is kind of the trick - there may be a point not too far in the future where it is nearly impossible to tell the difference between hostile bots and users who are just really into privacy and not being tracked.

I'm fairly certain that Google doesn't care.

It's not about machine learning, it's about a spray and pray business model.

If your goal is to generate, say, 5,000 spammy backlinks you're going to have the choice of building smarter and smarter bots to bypass CAPTCHAs and filters, or just tossing the same dumb bots at a wider pool of target sites. The latter is always cheaper, if you're focused around your basic blog-spam sort of scenario.

I could see it being different if you had a specific high-value service that was worth bot writers targeting -- think of registering email accounts en masse, or an ecommerce site getting thousands of test charges an hour on stolen cards. But even then it's still just a matter of being "faster than the other guy the lion is chasing" -- you just need to be inconvenient enough that the malicious user finds a more accommodating service. That needs little in the way of an AI arms race; it can be something as simple as rate limiting.
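
To make "as simple as rate limiting" concrete, here is a minimal sketch of a per-client sliding-window limiter. The class name, the `limit`/`window` parameters, and keying by something like an IP address or account ID are all assumptions for illustration, not anything a particular site is known to use.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client key.

    Hypothetical helper for illustration; the key could be an IP address,
    an account ID, or a card fingerprint, depending on what is being abused.
    """

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behavior can be tested
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key):
        now = self.clock()
        recent = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # over the limit: reject without recording
        recent.append(now)
        return True
```

Something like `SlidingWindowLimiter(limit=5, window=3600)` in front of a signup or checkout endpoint would cap each key at five attempts an hour. It keeps state in memory and never evicts idle keys, so a real deployment would probably want a shared store and some cleanup.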
