The end of the road for Cloudflare CAPTCHAs (cloudflare.com)
44 points by jgrahamc on April 1, 2022 | 25 comments



Can somebody tell me why we don't just classify bots as legitimate users and get rid of all this shit? If you want to prevent insane amounts of traffic, you can already use rate-limiting strategies. Why does anyone care if it's a bot or a person? Are we just unable to scale our systems to meet such demand? Cloudflare even offers queuing in extreme cases, if I'm not mistaken.

I have a silly anecdote. So, my wife wanted a Louis Vuitton purse - one of the more modest ones, which is relevant.

Being more affordable, it is presumably among the most in-demand of their offerings. But they artificially create scarcity so it's more desirable.

This then leads to bots farming the page endlessly and individuals struggling to get one. So, the bots win anyway.

Why don't they just accept a waitlist if they're going to be so silly about it? Or verify you are a human being using some kind of private identity service when you attempt to purchase your cart? How does throwing up blocks on the page solve anything?

Edit: Added anecdote


It's not (just) about load: it's about unwanted scraping, retail bots grabbing inventory before humans can, credential stuffing done slowly and widely. All manner of unwanted interaction with a website or API.


There are other solutions to at least one of the cited problems. I added an anecdote above and I'm hoping to see what you think.

I would argue that there is no such thing as unwanted traffic if the site is publicly accessible. If you want it to be stricter, you should require accounts with strict purchasing limits matching those of a normal person by default. No "anonymous" purchases, and more appropriate identity vetting on account creation.


I think the weakness in your assumption is that CAPTCHAs exist solely to solve retail scalping by bots.

They serve a much larger purpose: mitigating bot load in aggregate across an entire domain or network. Maybe you’re assuming that DDoS/bot attacks are rare? They’re not (1). If you let all of them continue willy-nilly, the server will obviously get overwhelmed at some point. Plus, why waste so much server energy/time serving malicious load…

(1) https://blog.cloudflare.com/network-layer-ddos-attack-trends...


The consumer web is an all-you-can-eat buffet and bots have unlimited appetite. Also, you can’t monetize their robotic eyeballs. So the economics wouldn’t work, mostly.


A large part of the internet is after "growth and engagement". They don't have a real business model where the consumer pays for the service (otherwise it wouldn't matter whether the paid content was accessed by a human, bot or even alien). So the best they can do is one of two things. They can "grow", which means recruiting human users (or at least users with enough plausible deniability to claim they're all human, whether they actually are doesn't matter) so they can get more investors to pour money into the dumpster fire (though thankfully that strategy seems to have died down now). Or they can obtain an ever-increasing amount of "engagement", which involves wasting human users' time so they can sell that crowd to advertisers.


Note that captchas are a symptom of anonymous web browsing and privacy.

If you are logged in to an account with a company, they can use your entire user interaction history with that company to decide if you're human or not. That means they won't have to make me click trains.

With my permission, they can also share that info with third parties. I'd be completely happy to trust the right company to tell the world that I'm human. They would only be attesting that I am human, not which human I am, so there isn't any substantial loss of privacy.


Something along those lines is actually highlighted for Private Relay connections:

“All connections that use Private Relay validate that the client is an iPhone, iPad, or Mac and that the customer has a valid iCloud+ subscription. Private Relay enforces several anti-abuse and anti-fraud techniques, such as single-use authentication tokens and rate-limiting. This is designed to ensure only valid Apple devices and accounts in good standing are allowed to use Private Relay.”

https://developer.apple.com/support/prepare-your-network-for...


This is what I'd like to see attempted more seriously, as I think it solves the problem more appropriately.


Managed Challenge

Depending on the characteristics of a request, Cloudflare will dynamically choose the appropriate type of challenge from one of the following rotating actions:

  * Show a non-interactive challenge page (similar to the current JS Challenge).
  * Present an invisible proof of work challenge to the browser.
  * Show a custom interactive challenge (such as click a button).
  * Show a CAPTCHA challenge.

This doesn't seem like the end of the road. Is this still gonna suck behind my corporate firewall?


I want to know more about

> Present an invisible proof of work challenge to the browser.

Obviously they're not going to be mining bitcoin, but what could they possibly be doing for "proof of work" that proves you're human?


I imagine this would be used in the case where the content doesn't care about human eyeballs, and just needs a semi-hard rate-limit.


Why is proof-of-work the best model for a rate limit? What's wrong with timers?


Not sure what the reasoning is, but off the top of my head: storing a timer per request is a non-trivial (and flat!) burden on the CF server, while proof-of-work puts a burden on the client proportional to how hard they're hitting it.
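
To make the "timer per client" cost concrete, here is a minimal sketch of a token-bucket limiter. The class name, parameters, and the `clock` argument are all hypothetical, and nothing here is claimed to be how Cloudflare does it. The point is only that the server has to keep one entry per client in `buckets`.

```python
import time


class TimerRateLimiter:
    """Hypothetical per-client token bucket; an illustration, not Cloudflare's code."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable so the example can be tested
        self.buckets = {}         # client_id -> (tokens, last_seen): the stored state

    def allow(self, client_id):
        now = self.clock()
        tokens, last_seen = self.buckets.get(client_id, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.rate)
        allowed = tokens >= 1
        if allowed:
            tokens -= 1
        self.buckets[client_id] = (tokens, now)
        return allowed
```

Every distinct `client_id` adds an entry that lives until something evicts it, which is the flat server-side burden described above. It also needs a stable client identifier, and an IP address behind a shared NAT is a poor one.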


Do you not have to store the value to check the proof of work against?


Isn't verifying that a PoW solution is correct usually much less resource intensive than calculating the solution? And if the request format includes the challenge along w/ the solution, the server doesn't need to do any calculations until the user finishes the challenge. (A lot of this is implementation dependent of course, but this is my high level understanding)


The client could receive a signed problem to solve and then send it back with the answer. Any server could verify it.
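
A rough sketch of that idea, assuming a hashcash-style puzzle (find a counter whose SHA-256 hash has a given number of leading zero bits) and an HMAC as the "signature". `SERVER_KEY`, `DIFFICULTY_BITS`, the payload format, and all function names are made up for illustration; the article does not say what Cloudflare's proof-of-work actually looks like.

```python
import hashlib
import hmac
import os
import time

SERVER_KEY = os.urandom(32)  # hypothetical secret shared by all verifying servers
DIFFICULTY_BITS = 12         # assumed difficulty: about 4096 hashes on average


def issue_challenge(now=None):
    """Return (payload, tag). The server stores nothing."""
    issued = int(time.time() if now is None else now)
    payload = f"{os.urandom(16).hex()}:{issued}:{DIFFICULTY_BITS}"
    tag = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload, tag


def leading_zero_bits(digest):
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def solve(payload):
    """Client side: brute-force a counter that meets the difficulty."""
    bits = int(payload.rsplit(":", 1)[1])
    counter = 0
    while True:
        digest = hashlib.sha256(f"{payload}:{counter}".encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return counter
        counter += 1


def verify(payload, tag, counter, max_age=120, now=None):
    """Server side: one HMAC and one hash, no lookup of stored state."""
    expected = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False  # challenge was not issued by us, or was tampered with
    _, issued, bits = payload.split(":")
    current = time.time() if now is None else now
    if current - int(issued) > max_age:
        return False  # challenge expired
    digest = hashlib.sha256(f"{payload}:{counter}".encode()).digest()
    return leading_zero_bits(digest) >= int(bits)
```

Usage would be `payload, tag = issue_challenge()`, then the client calls `solve(payload)` and sends the payload, tag, and counter back for `verify`. One caveat with this sketch: a solved challenge can be replayed until it expires unless the server remembers used nonces, which brings back some of the state the scheme was trying to avoid.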


I wish they'd stop using IP address as a signal. My ISP uses carrier-grade NAT. Cloudflare captchas are the main problem with using Tor or a VPN.

Forcing sites to require JavaScript is even worse.

Together, captchas and the new replacement from the article mostly just undermine client side security, which opens sites to much juicier attacks than unauthenticated bots reading from CDN cache!

On top of that, I regularly fail Captchas. I'd happily pay for a browser plugin or something that would have my computer complete them for me. It's probably better at it than me!


Reducing challenges to real people is everyone's goal, including everyone working on modern CAPTCHA / bot mitigation platforms.

And no one will ever succeed at bringing them to zero.

Perennial favorite explainer on the topic: https://www.hcaptcha.com/post/why-captchas-will-be-with-us-a... Why CAPTCHAs Will Be With Us Always

(disclaimer: work on this stuff)


One thing I couldn't find in the post is how big the estimated error rate is. Particularly false positives or negatives. Probably impossible to track by definition, but I'm not so much interested in reducing the number of captchas as I am in ensuring bad actors are kept out, even if it has the unfortunate side effect of inconveniencing legitimate traffic.


Captchas don't keep bad actors out though. There are plenty of Mechanical Turk style services that will solve captchas for close to free.

I might not want to pay slave wages to people to mindlessly click buttons, but I'm not a bad actor. (Also, these services have serious security / usability issues for pretty much everyone except bad actors.)

The captcha war was lost long ago.


Are Challenges / CAPTCHAs on by default?

I see in the article reference to specifying what kind of challenge in the context of Firewall Rules.

But what happens if I’m a new Pro customer, are Challenges on by default? Said differently, if I sign up example.com on a Pro plan and make no changes to my settings - would potential users who visit example.com be Challenged?


april fools?


They've made a point of doing launches on April 1st in the past. For instance, 1.1.1.1 was launched on 4/1


No, we don't do that.



