Hacker News
Tell HN: Cloudflare verification is breaking the internet
615 points by statquontrarian on April 28, 2023 | 334 comments
Across many different pages including science journals, ChatGPT, and many others, CloudFlare verification goes into an infinite loop of:

1. "Verify you are a human"

2. Check the box or perform some other type of rain dance

3. "Please stand by, while we are checking your browser..."

4. Repeat step 1

I'm on Fedora Linux 37 using Firefox 110.

The workaround is to use Chrome.

After experiencing this dozens of times and getting annoyed at needing to use Chrome, I finally went and deleted all my cookies and cache, which I had been dreading.

It did not help.

I don't have a CloudFlare account so I wrote up a detailed post on their community forums. I offered a HAR file and was willing to do diagnostics. It received no responses and it was auto-closed.

It's unacceptable that CloudFlare is breaking the internet while offering no community support.

Edit: I'm in Texas. I'm not using a VPN or Tor, just AT&T Fiber. I don't have ad-blockers. No weird extensions. Nothing special (besides being on Linux).

Edit2: Since this got traction, I opened a new community post: https://community.cloudflare.com/t/infinite-verify-you-are-a-human-loop/503065

To be clear, I'm not against CloudFlare doing DDoS protection, etc., but it can't be breaking the internet while ignoring community posts on it.

Edit3: The CloudFlare team has engaged. Thank you HN!





The purpose of CAPTCHA is supposedly to test whether the user is a human or a bot, not to break or violate user privacy protections. It appears Cloudflare and others would rather dangle websites as "carrots" and see if they can get users to disable their ad blockers or any other privacy protections to get access.

The Cloudflare verification has become a sick or sadistic joke now. It's often just used to annoy people and, even if they pass the tests, denies access anyway. If the test is not going to determine access, then don't provide it, and just be up front about mindlessly or frivolously blocking people and entire IP ranges wholesale.


I thought the purpose of captcha was to train AI


Cloudflare's captcha alternative Turnstile doesn't have anything to train AI on: no images, descriptions, or anything else really. It's just a single click.


There's a natural contradiction between security and privacy.

For security, an actor needs to be tested and marked as secure, or else tested again before every interaction.

For privacy, an actor must not be marked, lest observers could correlate several interactions and make conclusions undesirable for the actor.

It does not make the infinite loop produced by Cloudflare any more reasonable though.


Ever heard of zero-knowledge proofs?

CloudFlare claims to support Privacy Pass, which is supposed to use a zero-knowledge scheme to solve for this for Tor users.

Unfortunately, the integration has been broken for a very long time and bug reports aren't tended to.

https://blog.cloudflare.com/cloudflare-supports-privacy-pass...

https://privacypass.github.io/

https://github.com/privacypass/challenge-bypass-extension/is...


I don't understand why an actor needs to be tested and marked as secure on first interaction. There must be signals so that the server could initially trust an actor in some cases. For example, why can't the server trust a never-before-seen IP attempting to sign into an account that hasn't been experiencing incorrect password attempts? Is Cloudflare just a case of a one-size-fits-all solution?


the problem is it's too easy to make a botnet attack a site by having each computer try a password for a unique account once per day. this would let you get a few million chances per day per website at guessing user passwords without detection.


In theory, this could be countered by moving to one wrong password attempt per IP over any web site protected by Cloudflare. I have a better understanding of the threat and there might be other drawbacks.


I disbelieve there is no way for a client to prove that it has been challenged and cleared in the past without disclosing a persistent unique identifier.


Without a unique identifier, it would be easy for an attacker to clear one challenge and use the result for all nodes in a botnet.


Why can't the identifier be merely yet another bit of data whose existence and properties can be proven by cryptography without transmitting the data itself? It's done all the time with other data.


He's saying that won't work, because the goal is not actually to fingerprint or mark users. It's to ensure that the thing connecting to their servers at that moment is a web browser and not something pretending to be a browser. Give away tokens that say "I'm a browser, honest" and they'll just get cloned across all the bots.


Rate-limit the number of different source IPs that the token can be used from within a given period of time, or the number of requests per second that can use that token without having to re-verify?
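
A minimal sketch of that idea in Python, assuming the server can see the token itself on every request. The class name, the limits, and the window length are hypothetical and not anything Cloudflare is known to use:

```python
import time
from collections import defaultdict, deque


class TokenUsageLimiter:
    """Hypothetical limiter: caps distinct source IPs and request count per token."""

    def __init__(self, max_ips=3, max_requests=10, window_seconds=60.0):
        self.max_ips = max_ips
        self.max_requests = max_requests
        self.window = window_seconds
        # token -> deque of (timestamp, ip) for requests still inside the window
        self._seen = defaultdict(deque)

    def allow(self, token, ip, now=None):
        """Return True if the request may proceed, False if it must re-verify."""
        now = time.monotonic() if now is None else now
        events = self._seen[token]
        # Drop events that have aged out of the sliding window.
        while events and now - events[0][0] >= self.window:
            events.popleft()
        ips = {seen_ip for _, seen_ip in events}
        # Reject a new source IP once the token is already spread too widely.
        if ip not in ips and len(ips) >= self.max_ips:
            return False
        # Reject when the token has been used too often within the window.
        if len(events) >= self.max_requests:
            return False
        events.append((now, ip))
        return True
```

A rejected request would be sent back through the challenge rather than blocked outright. Note that this only works because the server keys its state on the token, so the token acts as an identifier for as long as it lives.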


If they can track the token that way, that blows the whole point, the token becomes a persistent unique id.

The idea was to prove that a token exists without disclosing the token itself, nor any sort of 1:1 substitution.

That sort of thing is definitely possible; that's not the conundrum. What they said is one of the conundrums, I have to admit. If the server doesn't know who the user is, then the server doesn't know whether it's a valid user or a bot.

But I only agree it's a problem. I don't agree it's a problem without a solution.


I'm at a loss for how this could be implemented reliably (where it never fails to stop bots). Ideas?


I don't think the burden of proof/R&D is on us. But there are many smart people around, I'm sure Cloudflare can pay some of them (even more surprising things are possible with cryptography).

One far-fetched idea is to use zero-knowledge proofs to prove that you were verified, without disclosing anything about your identity. But that's likely overkill.

Anyway, I think Cloudflare already works on something better with turnstile, the "privacy preserving captcha" and private access tokens [0].

[0] https://blog.cloudflare.com/turnstile-private-captcha-altern...


What do you see as the problem with this attempt?

https://privacypass.github.io/


It allows for unlimited tries. Let's say a current ML system could solve 1% of the captchas; then an attacker could try a million captchas and generate privacy passes for the equivalent of 10k captchas.

Theoretically, to penalize the user you need to identify the user. And for that you need to maintain long term identity.


You still wouldn't need that.

Trivial counter-examples include proof-of-work (see HashCash) or cryptocurrency micropayments (not necessarily settled on-chain, so transaction fees are not an issue for the user).
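
For illustration, a simplified Hashcash-style proof-of-work in Python. This is a sketch under assumptions, not the real Hashcash stamp format (which uses SHA-1 and a dated header); the challenge string, the `solve` and `verify` names, and the difficulty value are all made up for the example:

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce whose hash has enough leading zero bits."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


# The server would issue a fresh random challenge per request (hypothetical value here).
nonce = solve("example-challenge", 12)
print(verify("example-challenge", nonce, 12))
```

The asymmetry is the point: solving takes about 2^difficulty hashes on average, verifying takes one, and nothing in the exchange identifies the client.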


Isn't the client's IP address a sufficient unique identifier?


Absolutely not. Dozens if not hundreds of legitimate clients can appear on the same public IPv4 address, being home internet customers behind a NAT. The same client can trivially change their IPv4 and likely IPv6 address on a mobile network by toggling flight mode to reconnect.


There's more to it than just anti-fingerprinting. There's also some other fingerprinting going on, and I think there may be some kind of IP reputation system that influences these prompts as well. I've put privacy protections up to max but never see Cloudflare prompts.

I see them using some VPNs and using Tor, but that makes sense, because that's super close to the type of traffic that these filters were designed to block.

I suspect people behind CGNAT and other such technologies may be flagged as bots because one of their peers is tainting their IP address' reputation, or maybe something else is going on at the network level (i.e. the ISP doesn't filter traffic properly and botnets are spoofing source IPs from within the ISP's network?).


Every IPv6 thread we get someone saying "Oh v6 is worthless, we can stay on v4 forever, there are no downsides to CGNAT". I still have no idea how they can think that.


Those responses baffle me. I don't think most of those have ever been on the receiving end of anti-abuse features targeting shared IP addresses. I wonder if they're the same people who consider IPv4 a scarce resource that needs to be shared carefully.

Try ten Google dorks for finding open Apache directory listings; your IP address gets reCAPTCHA prompts for every single search query for minutes. Share that IP address with thousands of people, and suddenly thousands of people get random Google/Cloudflare prompts.


Yeah, ever try to use Google through Tor? If you're lucky, it will let you do a captcha and get your result, but mostly it just says the IP is temporarily blocked for abuse.


IPv6 addresses are effectively the same as shared IPv4 addresses in anti-abuse systems. All anti abuse systems treat a /48 or /56 level the same as a single IPv4 address. It's the only way to actually detect one system doing abuse.
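
A rough sketch of what that aggregation could look like, using Python's standard `ipaddress` module. The function name and the default /56 are assumptions for illustration; a real anti-abuse system may pick different prefix lengths:

```python
import ipaddress


def reputation_key(address: str, v6_prefix_len: int = 56) -> str:
    """Map a client address to the key an abuse counter would be stored under.

    IPv4 addresses are used as-is; IPv6 addresses are collapsed to their
    enclosing prefix so that rotating the low bits does not reset reputation.
    """
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        return str(ip)
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{v6_prefix_len}", strict=False)
    return str(network)


print(reputation_key("2001:db8:1234:5678::1"))       # 2001:db8:1234:5600::/56
print(reputation_key("2001:db8:1234:56ff:abcd::2"))  # 2001:db8:1234:5600::/56
print(reputation_key("192.0.2.7"))                   # 192.0.2.7
```

Two addresses from the same /56 end up under the same key, so they share one reputation in the same way that everyone behind a single IPv4 address does.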


> All anti abuse systems treat a /48 or /56 level the same as a single IPv4 address.

With the difference being that you get your own /48 or /56 and suffer from only your own behaviour.

If you're behind CG-NAT because your ISP can't get enough IPv4 addresses, then you suffer from the behaviour of other people.


I don't know of a single ISP that gives /48s out to customers. Maybe a /56, but I think even that is rare.

IPv6 is way better than cgnat, but ISPs are still doing their own internal routing for much smaller blocks. Meaning the block itself is functionally the equivalent of a shared IPv4 for abuse prevention purposes.

But also, I could just not know about the ISPs giving out /48s. My window to this is from the abuse prevention side.


Residential /56s are ubiquitous in my community, and /48s are offered by one major isp, though not the one I personally use.


I'm with Andrews & Arnold (a UK ISP) and they provide a /48 by default.


You get that with v6 it's all disposable? You can use an address for 1 min and throw it away.

You'll be able to get them from any geo-location easy as pie.

So it's worse. You'll be even less trustworthy unless you register as trustworthy and keep it, which means tracking. The same as having a fingerprint or login now.

As a pro argument that sucks; it's the opposite.


The second half of the address is disposable, plus a few more bits. The first 56 bits or so are allocated just like non-CGNAT IPv4 addresses are currently allocated.


So then you can build up a good reputation by sticking with one IPv6 address, and you shouldn't have to deal with any silly bot restrictions at all.


>I suspect people behind CGNAT and other such technologies may be flagged as bots because one of their peers is tainting their IP address' reputation, or maybe something else is going on on a network level

This is a thing that is absolutely happening, I got temporarily shadowbanned for spam on Reddit the day I switched to T-Mobile Home Internet which is CGNAT'd, and I didn't post a single thing


I'm curious why you seem to think that Tor is more legitimate to block than those behind CGNAT. There's been plenty of research showing that on a per-connection basis, Tor is no more prone to malicious activity than connections from random IPs, and that it's only on a per-IP basis that malicious activity is more likely. I.e., it's the same phenomenon as why CGNAT causes collateral damage. You could argue that Tor is opt-in and therefore less worthy of protection, but saying "users who want extra privacy deserve to be blocked, even when we know (as much as one can know) that they're not using it for malicious reasons" seems like a fairly dystopian premise.

I'm actually kind of glad more people are becoming aware of this problem, and hope it finally spurs more interest in mechanisms that divorce network identity from IP addresses -- including the work Cloudflare is doing on Privacy Pass!


In my opinion Tor is as good a privacy-preserving technology as VPNs and should be treated very similarly. I use Tor sometimes and I'm as annoyed as you are with all the CAPTCHAs and outright blocks when I just want to read an article on a website.

However, the sad fact is that Tor is abused for a LOT of malicious traffic, much more so than any VPN provider, let alone normal ISPs using CGNAT. The anonymity combined with its free nature make it very attractive for bad people to use Tor for bad things without any reasonable fear of getting caught.

An outright block for Tor traffic is definitely out of the question, but adding CAPTCHAs to sensitive things (like account signups, expensive queries, etc.) is sadly a requirement these days.

Blocking exit nodes does nothing to protect your website's security, but it sure as hell cleans up the logs and false positives in your security logs. It's not just Tor, though, there are also some home ISP networks that don't seem to care about the botnets operating inside their network.


"I'm curious why you seem to think that Tor is more legitimate to block than those behind CGNAT."

Who said that? I don't see anyone saying that.


How else would you interpret "I see them using some VPNs and using Tor, but that makes sense, because that's super close to the type of traffic that these filters were designed to block"? They seem to be implying that Tor is a form of acceptable collateral damage, but the likely problem here, i.e. the CGNAT instantiation of collateral blocking, is not.


That only says why it might be blocked, not that it's right.


I never said they claimed it was right, just that it was more acceptable. Again, I don't see how one could interpret it otherwise?


They didn't suggest it was acceptable or more acceptable either, or any other equivalent words for ok, or agreeable, or understandable, or justified, or proper, or reasonable, or...


... I am saying they are clearly making a comparative statement, not an absolute one. Again, how else am I supposed to read that sentence? I feel like I'm going crazy here, this isn't some nuanced point, it's literally what they seemed to be trying to say. Can you please tell me what you think the point they were making with the "but" in that sentence?


"I see them using some VPNs and using Tor, but that makes sense, because that's super close to the type of traffic that these filters were designed to block."

All this says is "This explains why I get blocked while using Tor or VPNs". It does not say they agree with it or accept it etc.

It only says they are not surprised that it happens, that they understand the mechanism by which it happens, not that they accept or agree with it.

They might or might not also think it's fine and reasonable. I can't say they don't approve any more than you can say they do.


Some sites I have already visited keep popping them up. And I'm on public IP that should have been associated with my computer for a while...

Maybe it is just per use case. Or they think I'm a bot as I keep looking at sites every couple hours... Which might be actually common with these sites.


it may be anecdotal but I see Cloudflare prompts more often on Firefox than on Chrome.


The most entertaining part of when I first ran into endless verification loop/Cloudflare error codes is that I couldn't access their official forums/support articles for information due to the same problems.


Had the same issue a long time ago, it was surprising how much of the internet was just "turned off": https://blog.dijit.sh/cloudflare-is-turning-off-the-internet...


Got SSL_ERROR_UNSUPPORTED_SIGNATURE_ALGORITHM when I went to the site and a redirect to https when I manually changed the protocol to http. I turned off https-only mode in Firefox so it appears to be a redirect that your server is sending back.

When I change the protocol and get the redirect back to https there's another "/" which is added after the domain such that "domain/path" becomes "domain//path". This repeats if I continue to change the protocol and hit the redirect such that "domain//path" will become "domain///path" (I noticed this because there was like 6 of them).

Apologies if this is indeed caused by my browser settings; I've been unable to find the cause if that's the case.


The slow march of progress, I suppose. That machine is running OpenBSD 6.0, which apparently is too old for modern ciphers; I had an A+ a year ago on Qualys.

I suppose I better update it now, sorry for the inconvenience.


It is concerning how the recommended security practice is essentially planned obsolescence.


Interesting find but that's not the issue for me. about:config shows privacy.resistFingerprinting=false by default (maybe Fedora sets that default?). There were various sub-settings (privacy.resistFingerprinting.*), some of which default to true, so I explicitly set them to false, and refreshed, but that didn't help. I also changed layout.css.font-visibility.resistFingerprinting from 1 to 0. I also tried adding the domain I'm testing to privacy.resistFingerprinting.exemptedDomains and that didn't help.


I wonder at what stage we can consider the damage Cloudflare is doing to the internet as naughty under anti-trust or similar?


Lucky me, I haven't yet found any site I regret giving up on when I'm presented with the "verify you're human" garbage, which by the way you can also get on Windows Firefox from Google.


The breadth of sites that have this is increasing. I've had problems with everything from a website that sells eggs to science journals to ChatGPT.


> This is because Cloudflare is not happy with Firefox 'resist fingerprint' feature.

"Cloudflare is not happy with anything that is not Cloudflare"

ftfy :)


Yes, I was going to mention something like this. I use a custom Firefox cookie setting and get many sites that are broken. The sign that it is a security setting within Firefox is the fact that Chrome will work fine.


> I'm not using a VPN or Tor, just AT&T Fiber. I don't have ad-blockers. No weird extensions. Nothing special (besides being on Linux).

Even if you were doing any, or all of these things, you are no less a legitimate internet user than anyone else. This whole "rain dance" supplication to show you are worthy of browsing a web site has got to go. Stop visiting sites who treat their users this badly!


This reminds me of the origin of "jaywalking". People used to walk wherever they wanted, but when cars became a thing, drivers found that people were in their way. So they started to blame people for "jaywalking", turning it into a bad thing that pedestrians do rather than framing it as cars wanting to take some of the road away from pedestrians.

We are trying to frame people who are trying to protect their privacy as "suspicious" rather than saying that we want to track them better.


Likening packets on the internet to people in a street is not an accurate analogy. The reason people use these solutions is that they're inundated with garbage traffic that is often automated. The internet is more like a street with 5 real people and 1,000 malicious humanoid robots.


You get 1005 requests for a file. They are all real requests. You simply send back the data.

You want to determine who is manually asking vs automated so you can ignore requests that aren't manually generated.


It's more like adding a turnaround on footpaths to block motorbikes and other two-wheeled scooters. CAPTCHAs also exist in the real world.


Instead of Cloudflare there should be a protocol that would allow a host to notify upstream providers that they should reject incoming traffic from specific address ranges. If there were such a protocol, Cloudflare's whole business would be zeroed in a minute, but sadly that is not going to happen.


>there should be a protocol that would allow host to notify upstream providers that they should reject incoming traffic from specific address ranges

At the server or datacenter level, it's called a firewall :)


Considering that a lot of the internet is still behind a NAT I’m not sure that’s a completely bulletproof solution.


If it needs a checkbox to confirm you're a human then I would say that's a lost battle. A bot would be just as able to click that as I am. And I keep being hit with these over and over again. It's well beyond just annoying, especially when it is sites that I have a long standing relationship with. Which I wonder: are those sites even aware that Cloudflare keeps popping up that check dialog?


You are right that scripts can check a checkbox. Which is why the "checkboxes" that cloudflare/recaptcha display to you are not actually checkboxes. They're fingerprinting and behavioral analysis scripts.


There is no reason whatsoever you should need to prove you are not a bot in order to view an effectively static website. For interaction you may need anti-spam measures, for viewing no.


The FUD and moralization of groupthink conformance.

When not in a vehicle and there are no cops around, I do the New Yorker thing: I completely ignore signals and focus on traffic. The prima facie and prime directive is safety over conformance. I will not waste my life at the behest of some Christmas lights.


Statistically you're at such a higher risk on the road than anywhere else, it's best to minimize the amount of time spent in a motor vehicle. This is why I don't waste time putting on my seatbelt, run all red lights, and always floor the gas pedal for the duration of my trips.


You forgot the /s


Same thing with fraud against a business being turned into “identity theft”.


What is the alternative though? We had millions of requests from hundreds of thousands of IPs from all continents a few months ago; literally the only thing that got our site back up was bot fight from Cloudflare. How do you do this another way?


Personally, I have no problem with CloudFlare or their verification and protection products. But something's broken if it works in Chrome but not in Firefox (and I'm not doing anything special in Firefox).


there is no alternative. it sucks, and so people complain. the only solution is to just let people complain.

there's no way to solve this problem without having some sort of tracking system to determine who's a legitimate user.


So, if somebody so wishes to take down a website they dislike, we should just put up with it? If a state actor DDoSes a journal documenting war crimes, we just ask them nicely to stop?


that's absolutely not what i'm saying. if somebody wishes to take down a website they dislike, we (as website operators) should block their bot traffic. and we should use whatever reasonable methods we have to detect what traffic comes from bots and what traffic doesn't come from bots. that includes putting cloudflare in front of our sites.

and when some legitimate users really, really look like bot traffic because they circumvent whatever methods we use to determine whether traffic is coming from real people, they might sometimes get blocked along with the bots. they're going to complain about that, and the only thing we can do is listen to their complaints.


That only works for you while you're not involved in the second group. We get to complain and make noise and push on both websites and CF, so that the "non-bot" user group doesn't become "latest chrome user on latest windows in the approved country running proprietary CF extension for id verification" one day in the future when it's an easier solution than dealing with the actual issue.


sure, you get to complain. but what you don't get is for your complaints to change anything. unless you have a solution for this? because that would be great.


Sure. Solution 1: CF starts interacting with the reports of them blocking users and actually fixing those issues. If they can't achieve that, they can relax the rules and refund their customers for failing to provide the advertised service.

It's not a hard concept. Unless you think you're too big to care, you fix issues you cause. (If you are too big to care, I hope the laws regulating anticompetitive behaviours hit you)

Solution 2: Everyone in engineering and management at CF can access the internet only while marked with the same level of trust as the lowest one currently assigned to CGNATs and Tor exits. No exceptions, but they can contact support like any external user.


Maybe it could get solved by paying a couple of cents to the website administrator, in the form of cryptocurrency, and in exchange you get a few dozens of requests that the website agrees to reply to.


You don't need additional tracking, every user has a unique IP address. What is missing is a protocol that allows rejecting traffic from specific IPs. Imagine if someone with IP address 1.1.1.1 sends 100 Gbit of traffic to your host; your provider doesn't want to pay for this traffic so they nullroute you to stop the attack. If there were such a protocol, you could simply block all those gigabits at the upstream provider, and if it doesn't comply with the protocol then it has to cover all your losses. Then Cloudflare would become unnecessary.


Whenever we get a flood of unwanted traffic dumped on us, it's coming from thousands of different IPs. They hijack everyone's old IoT trash and un-updated printers and wifi routers and Android 3.1 phones and use those to blast traffic. If it were coming from one IP address nobody would be bothered by it, it would be easily solved with rate-limiting rules on the firewall.
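
To illustrate the point, a minimal per-IP token bucket sketch in Python. The class name, rate, and burst values are made up for the example. It throttles one noisy address, but every new address starts with a full bucket, so a flood spread over thousands of IPs passes straight through:

```python
import time


class PerIpTokenBucket:
    """Hypothetical per-IP rate limiter using one token bucket per source address."""

    def __init__(self, rate_per_second=5.0, burst=10.0):
        self.rate = rate_per_second
        self.burst = burst
        self._buckets = {}  # ip -> (tokens_left, last_seen_timestamp)

    def allow(self, ip, now=None):
        """Spend one token for this IP if available; return whether the request passes."""
        now = time.monotonic() if now is None else now
        tokens, last = self._buckets.get(ip, (self.burst, now))
        # Refill in proportion to the time elapsed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[ip] = (tokens - 1.0, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False


limiter = PerIpTokenBucket(rate_per_second=1.0, burst=2.0)

# One address hammering the site is cut off after its burst allowance.
single = [limiter.allow("192.0.2.1", now=0.0) for _ in range(5)]
print(single.count(True))  # 2

# A thousand distinct addresses sending one request each are all let through.
spread = [limiter.allow(f"10.0.{i // 256}.{i % 256}", now=0.0) for i in range(1000)]
print(spread.count(True))  # 1000
```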


Unless you are a small one-man company it is easy to find those IPs. The problem is how to block them because their traffic can use all your upstream bandwidth and blocking them on your host doesn't change anything.

> If it were coming from one IP address nobody would be bothered by it, it would be easily solved with rate-limiting rules on the firewall.

DDoS works by sending more traffic than your upstream bandwidth can carry (e.g. you have a 100 Gbit link and they send 40 Tbit of UDP packets to you). A firewall won't help here. The protocol I am talking about in a comment above would solve the problem by blocking this traffic close to its source.


Push out proof-of-work challenges.


Can you link to some resources where I could learn how to implement this for my sites?


> every user has a unique IP address

Not by any stretch of the imagination.


I think there are potential alternatives that could evolve.

My preferred solution would be domain validated identities with long lived, global reputation alongside some type of attestation. For example, if I have a GitHub account with 'example.com' as a verified domain, GitHub could attest 'example.com seems to be a real user or organization that behaves well'. It would be similar to the web of trust concept in GPG, but technology is to the point where it could actually be built in a way that makes it usable. Money that you're spending, or the way you interact in well known communities, could have the side effect of bolstering your reputation everywhere.

My most feared solution would be a similar system of attestation, but using Passkey since it would solidify the role of the current big tech companies as the arbiters of everything online. For example:

    You look like a bot.  How do you want to prove you're human?
        Microsoft
        Google
        Apple
        Facebook
Those companies, as Passkey providers, would, for all intents and purposes, be your 'anchor identity' online and they'd be in a good position to attest to you behaving like a normal, non nefarious participant.

I think Apple would be the company that could sell that kind of change to normal users. It could be done in a way that's anonymous because all you really need is an attestation that says 'Apple certifies this user is in good standing'. Apple is very good at selling those kinds of changes as being privacy focused and I think their user base would go for it if it were framed as 'good people' (aka Apple device owners) getting a superior experience that isn't available to the 'bad people' (aka bots, bad actors, and outliers).

If it worked, Google would follow with Android. Anyone else large enough for their opinion of you to count (Microsoft, Facebook, etc.) could probably compete, but it doesn't work for startups or small, less known providers.

In my opinion, as soon as authentication moves to something like domains or digital signatures where 3rd party attestations become simple, we could see a lot of new ideas that focus on reputation and related solutions / services.


But I don't want any of those companies knowing which websites I visit. I only do business with one of them, and even then, they have no need to know what I'm doing outside of interacting with their sites. These companies have enough power already. Leaving it to them to decide whether you're trustworthy or not is just as dystopian as what's happening now. You've just moved the problem from Cloudflare to one of those companies. Plus, if they suddenly decide your account is invalid for some arbitrary reason that you aren't allowed to know, now you're completely fucked.


This is already real:

https://support.apple.com/en-us/HT213449

https://developer.apple.com/wwdc22/10077

While it's built with privacy in mind, I have some deep concerns about making these current big players gatekeepers to identity on the web.


Curious, why do you have a bot problem?


I do not know; I run a little SaaS and out of nowhere it got flooded. Didn’t happen before and hasn't happened since (bot fight is off now).


> Stop visiting sites who treat their users this badly!

The problem is the individual sites aren’t making these highly technical decisions, people are using what seems to them an innocuous security product.

Not visiting a random website places no pressure on CloudFlare to change, since there’s no way to correlate your choice with the decision to use CloudFlare.


Not to mention that you may not have a choice. I've seen government sites have this shit on them. We're quickly approaching the satirical society of the movie _Brazil_.


Unverified: 27B/6 derives from George Orwell's address.

I'm wondering how long it will be before we have memory holes considering how, apart from the internet archive, there is perpetual bitrot and silent updates.


It's a form of digital totalitarianism. Submit to the rule of a few corporations or be left out socially, economically, etc.


> Stop visiting sites who treat their users this badly!

Too bad that basically means you can't surf the internet anymore as a majority of websites use Cloudflare. One of my Firefox installations on Linux is also plagued by this. I can't use Firefox to browse the web.


I already do that tbh. The internet is pretty redundant and you can find what you want anywhere.

CloudFlare blocks me from a part of the internet when I use anonymizing tools like Tor. I assumed they just do that to fingerprint and track you. Even the crypto thing to get a dozen or so passes after solving a riddle never worked.

So I have just moved on to websites protected by Akamai, or virtually anything but CloudFlare. It's not just a political decision btw. It's just easier to move on than to try to fight CloudFlare or to become viral on HN to get support.

It shouldn't be up to the user to adapt, but to the website.


agreed, especially when you are trying to BUY something. the modal popups trying to get you to sign up for newsletters, the demand to prove you are human, fuck right off.


I see you're using an ad-blocker. You must disable it to see my low-effort content that's available on the next search result.


It's funny how many sites with news or (propaganda) opinion pieces do that.


I get CAPTCHA fails from my work's corporate network. We are on VPN and it makes us look like a sketchy VPN provider. Heck, StackOverflow blocks us half the time without a CAPTCHA challenge.


this is what I do. "Fuck 'em" if they think everyone is trying to hack their site. They could use any number of standard protections but they choose to use a hammer. The only place I'll kind of jump through hoops for is my personal bank or CC companies. I set up a socks5 server for that so I wasn't using the VPN that cloudflare and IAmVeryImportant.com sites hate.


> This whole "rain dance" supplication to show you are worthy of browsing a web site has got to go.

This is just whining. I don't necessarily like it either, but you conveniently ignore all the reasons why that rain dance supplication exists in the first place. All ears if you have a better solution for DDoS attacks, malicious bot traffic, etc.


I know CloudFlare has market share that would push their complaints to the top, but they aren't the only bot traffic blocker, DDoS shield, etc. Do other providers get a (proportionally) similar amount of complaints?


Even though they will engage on your ticket, the problem is a business level problem they help create and solve at the same time.

https://rasbora.dev/blog/I-ran-the-worlds-largest-ddos-for-h...

It was also discussed previously via https://news.ycombinator.com/item?id=32709329

> "Without CloudFlare's "neutral" security service offerings I couldn't have facilitated millions of DDoS attacks."

For those of you who are blaming website operators;

> "As someone who has previously justified their actions by saying "I am not directly causing harm, the responsibility flows downstream to my end users" I can tell you it is a shaky defense at best. "

The crux of the issue is this:

> "CloudFlare is a fire department that prides itself on putting out fires at any house regardless of the individual that lives there, what they forget to mention is they are actively lighting these fires and making money by putting them out!"

The crooks and the ilk of the internet get a free ride to do their 'shark infestations' everywhere online thanks to CF. However the real humans are the ones harmed here. One person complaining loudly got a ticket addressed. The other 10000 affected won't.


> CloudFlare is a fire department that prides itself on putting out fires at any house regardless of the individual that lives there, what they forget to mention is they are actively lighting these fires and making money by putting them out!

This doesn’t seem like a fair analogy. When I read the quote I expected to dig into the article and find that Cloudflare was somehow intentionally optimizing their network for carrying out DDoS attacks against non-customers in some sort of shady under the table dealings.

In this case the fire department is not lighting fires. They are not committing arson. They are saving all houses including the houses of arsonists.

It doesn’t seem like this kid used Cloudflare to carry out DDoS attacks (burn down houses). It seems like they used Cloudflare to keep their own house from burning down and then went and committed arson on their own.


I emailed John Graham-Cumming about this on March 15th and was told he was looping in the right people.

Small browsers (like mine) are basically unusable now because of this. They're significantly squeezing everyone into chrome/safari. Ours is even chromium based, so super annoying.


Is it because you have a different UserAgent? Otherwise, how would CloudFlare even know your browser is different if you're Chromium based?


No kidding, I had to set curl user agent to chrome in order to call some API service hosted behind cloudflare or it'll get blocked intermittently.
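For anyone hitting the same thing from a script rather than curl, overriding the User-Agent looks roughly like this. A minimal sketch using only the Python standard library; the endpoint and the UA string are made-up placeholders, and nothing here is known to pass any particular check.

```python
import urllib.request

# Hypothetical endpoint and browser-like UA string, for illustration only.
API_URL = "https://api.example.com/v1/status"
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
)


def build_request(url: str, user_agent: str = BROWSER_UA) -> urllib.request.Request:
    """Build a GET request that sends a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request(API_URL)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

With curl itself the equivalent is the `-A` / `--user-agent` flag.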


Fingerprinting.



Update: The problem has been resolved. I can no longer reproduce the issue. I'm not sure if there was a fix on CloudFlare's side or if it was because I cleared cookies and cache and restarted my browser after resetting general.useragent.override.

If it was the latter, I'm sorry to CloudFlare as this was user error.

However, I do think the two meta points still stand:

1. Better diagnostics: perhaps a FAQ page that lists common issues such as an overridden general.useragent.override, etc. (obviously without giving anything away to bad people, but I'm sure certain things such as this can be pointed out)

2. Better responsiveness in the community forum particularly to this category of errors which blocks public internet activity.


> If it was the latter, I'm sorry to CloudFlare as this was user error.

The fuck it was. None of user agent, stale cache or cookies should have any bearing on you being allowed to view websites.


This is even worse for RSS. Website admin enables Cloudflare for DDoS protection, and RSS clients start getting errors, because they cannot prove their humanity. Would be great if some workaround would be built into Cloudflare, as contacting website admin probably won't do any good.


This is, in fact, a problematic case. RSS is expected to be consumed by other applications and bots. To make things worse, it might not be immediately obvious to the site owner when CF is interfering with the access to his content.


Website admin can solve this and still have protection by enabling caching of the rss feed, using a transform rule to drop all fields that could mess with the cache key, and then reducing the security level for that url. The cache works fine as a DDoS defense as well as long as you don't let people mess with the key.
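To illustrate the "don't let people mess with the key" part: the idea is that every variant of the feed URL collapses onto one cache entry, so cache-busting query strings can't force a trip to the origin. This is only a sketch of the concept in Python, not Cloudflare's rule syntax; the function name and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalized_cache_key(url: str) -> str:
    """Collapse request variants onto one key by dropping query and fragment."""
    parts = urlsplit(url)
    # Keep only scheme, host and path; anything a client can vary freely is dropped.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), parts.path, "", ""))


plain = normalized_cache_key("https://example.com/feed.xml")
busted = normalized_cache_key("https://Example.com/feed.xml?cachebust=123#top")
print(plain == busted)
```

Dropping the whole query string is only safe for URLs like a feed where the query carries no meaning; a real setup would scope this to that path.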


Same with API access. I had to change curl's user agent to chrome in order to use some API service that is hosted behind cloudflare reliably.


That changing the user agent string helps just shows how absurd these checks are.


I've had this happen to me. I ended up configuring a proxy feed in Feedburner.


so you gave up more control of your content because Cloudflare's a belligerent actor :(

depressing you got stuck in such a mess


You're kinda railing against locks on doors ("I just want them all to easily open for me!") without realizing why they are there.

You can thank abusers and spammers for ruining the internet for you, not website operators trying to deal with spam/bots.

I've had my most inconsequential service taken offline with a $5 booter because the user wanted to brag on Discord. You can bet I default to Cloudflare now.

It's not just for the website operator either. All of my users suffer when $5 botnets take down my server too. And it's cheaper and cheaper to do that every year thanks to the internet of shit.

So I'm not sure who this "Tell HN" PSA is for. Are the baddies going to read about your inconvenience and stop being baddies so we don't need to use captchas anymore?


I'm fine with CloudFlare doing DDoS or spam protection. I'm not doing a DDoS nor spam. I'm happy to help them fix their algorithm. Not only did they not respond to the community post, but they auto-closed it to add insult to injury.


Well, until you have an algo that can mind read, "I'm not a spammer guys, gosh!" isn't good enough, I'm afraid.

And yes, it's annoying that we live in that world. In 1999 you could probably assume a request was human with a User-Agent regex.

In 2024, your smart toaster could be saturating your AT&T Fiber uplink without you even knowing while you're rage-posting in Cloudflare's forums about HAR files and how you're not a bot.


> until you have an algo that can mind read, "I'm not a spammer guys, gosh!" isn't good enough, I'm afraid.

As mentioned, it works fine in Chrome on the same computer. CloudFlare has engaged and is investigating, thanks to this HN post.


A single Chrome install is easier to identify than a single Firefox install with default settings. Firefox is also an outlier in terms of global browser traffic (3-5% for normal websites).


If there is some Firefox privacy feature that CloudFlare considers overbearing, I'd consider turning it off, but I don't even know what the problem is. CloudFlare provides zero diagnostics and didn't engage in the community post. These two latter points are what annoy me. If CloudFlare has some philosophical disagreement with Firefox, then fine, but tell me what it is so that I can consider disabling the Firefox feature.


> Well, until you have an algo that can mind read, "I'm not a spammer guys, gosh!" isn't good enough, I'm afraid.

Yet read-only access to websites, which by definition can't be used for spam, is also locked behind Cloudflare. The same old excuse every time - they're given a legitimate inch for security, but take a mile.

Most telling is that you don't even get heavily rate-limited access to a website without passing Cloudflare's filter. Because then your actual behavior could be used to determine if/how much of a DDoS threat you are. But that would take away Cloudflare's excuse to monitor users, so they prefer to use absolutes.


But it can be used for DDOS, especially when the content is dynamic


Most websites don't have any business serving truly dynamic content to anonymous users. If a basic page request costs enough that you need to engage in this kind of overzealous blocking, then you should fix your page generation and caching.


I propose we begin implementing some responsibility for internet actors. If my car leaks oil on the road, that is my responsibility to fix, yet I did not manufacture the car.

I propose that we make owners of shitty devices responsible for their actions. if my internet of shit thermostat begins spamming people, that would be my responsibility, if it participates in a ddos, that would become my responsibility.


That's already true. If you're found sending abusive traffic, you might get sued, get sent a C&D, and/or your ISP might cut off your internet.

But similar to somebody's leaky car, good luck finding them and enforcing they actually clean it up.


> You're kinda railing against locks on doors

No, definitely not. I'm completely incapable of logging into several different services that have Cloudflare's protection (including their own website) if I use Chrome on my iPad. If I try on mobile Safari on the same device (which has basically an empty history), it goes through just fine.

Something is broken.


The broken thing is that anyone can send any unsolicited traffic anywhere, making Cloudflare a requirement for hosting a website. If we had properly authenticated traffic only that verifiably comes from a human, we would not need all these error prone defenses with false positives.


>If we had properly authenticated traffic only that verifiably comes from a human (...)

Sounds very dystopian and DRM-y to me. You would have to enforce this at the OS or even hardware level (because otherwise bots will just lie). And it would probably have to be mandatory by law.

Unfortunately I don't think we can force devices to differentiate between "user-originated" and automated traffic without changing internet fundamentals and locking it down.


Yes, that is the point. The options we have are changing these fundamentals, or continuing down the path of ever more aggressive stochastic methods for filtering out most of the bots.

At some point balance will tip where an internet with only authenticated traffic will be seen as more usable and preferred by the masses, then anonymity is over. New AI systems generating unfathomable amounts of human-like garbage to flood all UGC platforms might actually become the tipping point fairly soon.


So far actual humans have no problem copy pasting AI crap. Verification that they are in fact real humans is not going to fix that.


> The broken thing is that anyone can send any unsolicited traffic anywhere, making Cloudflare a requirement for hosting a website.

If a Cloudflare-like service is a required part of internet infrastructure, then each ISP and hosting provider should offer their own, equivalent service. By law, if necessary, if the economics don't work out otherwise. Because having a single company be the arbiter and monitor of who may visit any website is, well, bad.


But they can't. The service only works because of Cloudflare's scale and view of large portions of internet traffic.


Cloudflare is not a requirement for hosting a website.


Only if, in your analogy, putting the key into the lock, turning it, hearing click, and having the door open reveals the same fucking door instead of what's behind it.


This isn't nearly the intractable problem you seem to think it is. Requiring intense tracking/fingerprinting is done because it's easy and/or profitable. Enough pushback on those decisions will make the internet a better place.


There are many less inhumane ways of treating clients than CF does. Just because you needed them to protect your host doesn't justify their abuse of power.


This isn't true, though. Or at least it's not true if you want a free, set-it-and-forget-it solution, which people do for hobbies and side projects. You might want to take a look at https://news.ycombinator.com/item?id=21719793, which is a story about somebody who started out trying to avoid CloudFlare and eventually had to surrender because there was no other way to keep his site online against attackers.


> There are many less inhumane ways of treating clients than CF does.

>> This isn't true, though.

What??

Also, you are probably missing my point: it's not like sites don't need protection, it's the unfriendliness of how CF implements it.


Well how would you do it, then? It's not like CF hasn't thought about this a lot. Spam deterrence has been a problem since the start of the internet. It's clearly not an easy problem to solve.


Cloudflare DDoS protection and Cloudflare captcha are two different services. As a website owner, you can opt into the first without the latter.


Website owners usually don't realise that some "nicely advertised tech" they're ticking "to protect my poor website from evil hackers" is a damn grenade launcher in an infant's hands. Ironically, they're also shooting themselves in the feet by blocking their own customers.


Losing as much as a couple percent of annual sales to prevent card-stuffers from getting through—which can knock you off your payment processor completely—is a pretty easy call for a lot of businesses.

Not sure how the math works out for ad-supported sites, but it pretty strongly favors "moderately-aggressive automated blocking" for those taking direct payments.


It may be understandable if it's on a checkout page.

But Cloudflare often enough blocks users from reading content pages. Cloudflare could just serve their cached static content instead of showing Captchas.


There are several system level and application level ways of dealing with automated traffic, card stuffers, etc.

Sure, a general solution is better, but since everything today is docker running node.js running without a modicum of caching or appfw in front, not surprised things are so fragile


I haven't really had bad luck with Cloudflare, but for reCaptcha, I make it a point of contacting orgs that use it and telling them they've lost a sale as a result of their choice. The replies I've gotten are usually along the lines of "we have to use it for security" and I know they don't really care, but all I really see that I can do is complain, and if they get enough complaints hopefully they try something else.


As someone who had actually recommended a team to use reCAPTCHA and implemented it, it's really not that they don't care about losing a sale, it's that they lose more money by not using reCAPTCHA and letting bots run rampant. It's a business decision: they are still ahead even after accounting for lost sales due to a small minority of people who are opposed to reCAPTCHA and the money they pay for reCAPTCHA (which may be zero).

Obviously most small sites are not actively targeted by bots and using reCAPTCHA is a waste of money and people's time. But if you are, reCAPTCHA is a godsend.


> small minority of people who are opposed to reCAPTCHA

It's not so much that "people … are opposed to reCAPTCHA", but that for some they can't make it work.


Even if the loop is just one iteration, it's already breaking the internet. I cannot stand web sites that show the CloudFlare verification page before you can access. It's just ridiculous.


The page sometimes keeps refreshing literally forever. Completely ignoring my unconfirmed "Allow this page to reload?" prompt. I left it "checking" for hours once. No luck.


This is exactly what the OP is complaining about - endless loops that never complete.


I agree

I've got a Firefox extension that tells me if a site appears to be using Cloudflare - and I avoid all the ones I can

But I'm stuck with that stupid Cloudflare slowdown screen for the portal to my dr's office


Isn't this for stopping DDoS?


Yes, but aren't there more viable options? Like: a transition page that just waits for 5 seconds before loading. Then I don't have to, as an Asian, wonder how American school buses look like when I "click on all squares that have a bus". As though stop signs, buses and yachts are somehow universally the same all over the world.

CAPTCHA/reCAPTCHA is the internet version of the infamous "regatta" question on the SAT [1].

[1] https://www.clearchoiceprep.com/sat-act-prep-blog/the-most-i...


> Then I don't have to, as an Asian, wonder how American school buses look like when I "click on all squares that have a bus"

It is funny how our five year old daughter can recognise what American school buses look like, simply through media exposure, despite the fact that buses in our country look completely different (and our school buses don't look different from public transport buses, since they are the exact same buses and drivers, just scheduled on school routes instead of public ones.)

Sometimes I can get rather critical of American cultural imperialism, but this kind of thing is more at the amusing than concerning end of that spectrum. It is a good example though of how many American businesses are happy to offer their products outside the US with minimal or no attempt at localisation–and either don't realise the reality of that lacking localisation, or do yet don't care. It is particularly a problem I think with other English-speaking countries, where people just assume that if the language is the same everything else must be, or else their idea of the differences is limited to a handful of well-known items like date formats


That's what it is for, but most setups don't have it set up correctly (the verification page should ONLY appear during an actual DDoS, and even then only against IPs that appear to be participating).

It wants to do a bit of cryptography, which means that if scripts/WASM/etc are disabled, you can be out of luck.


I have noticed my CPU spike during these checks; however, I have factory settings for Firefox and haven't disabled scripts/WASM/etc. Is there some setting that Firefox might default to that could cause this?


No idea as I use brave, but check the console log for blocking or anything like that.


No. Cloudflare offers different levels of protection. One level is ‘prevent DDoS.’ Another level is ‘prevent bots from accessing the site at all.’ Not all bots are part of a DDoS. The problem is that many website owners turn on the second setting, because ‘bots are bad,’ without realizing that this means that some of their users are going to have to fill out Captchas.

(Comment written from memory, I may have details wrong.)


Sometimes it's a lesser evil. Cloudflare blocks about 1.6 million bot search queries per day on my search engine. Simply could not operate it without this inconvenience.


> I'm currently looking for hosting for a large term frequency data file that is necessary for several of the search engine's core functions.

Did you get that sorted out?

Asking because we (sqlitebrowser.org, dbhub.io) have a bunch of Hetzner dedicated servers that are nowhere near fully utilised. Could probably figure something decent out using those, as Hetzner doesn't charge for bandwidth.


Yeah that's solved itself, I eventually got a cheap VPS @ downloads.marginalia.nu for providing these files. It solves the immediate problem of hosting the data that can't go in git.

How much space have you got by the way?


Checking just now, there's at least 1/2 TB spare on all except one machine.

For these boxes, once they're set up they tend to not grow all that much in disk space.


On a different note, is IPv6 supposed to be enabled for "downloads.marginalia.nu"?


1.6 million out of how many total?


50k legitimate queries / day on a slow day. A HN hug of death is maybe 100-150k/day.


I think those who haven't operated a publicly-visible server on the open Internet in some time might be surprised at just how shark-infested these waters are now. It's, like, mostly sharks.


But you know the website might have sooper sekret information they want to protect, which is why it's been published on a public website.

Speaking of bullshit restrictions designed to encourage compliance with surveillance, have imgur links just straight up stopped working for anyone else recently? I'm coming from a datacenter IP. I assume it's just some heavy handed part of the cost cutting push they announced.


Verification isn't about keeping secrets, obviously, it's about restricting the velocity of bots and their ability (intentional or not) to degrade your site's performance/availability.

There are too many bots out there that are very inconsiderate and do not limit or throttle themselves.

We have one right now that crawls every single webpage (and we have 10's of thousands) every couple days, without any throttle or limit. It's likely somebody's toy scraper, and currently it's doing no harm, but not everyone has the server resources we have.

The point is - if you are dealing with inconsiderate bots, a captcha of some type is pretty nearly a bullet proof way to stop them.

With that said, Cloudflare usually is smart enough to detect unusual patterns, and present a challenge to only those who they believe are bots or up to no good. If every person gets a challenge, then the website operator is either experiencing an active attack, or has accidentally set their security configuration too high.


I do know the common narrative. FUD -> more snake oil "solutions". I myself rely on a special type of igneous rock that keeps hackers away. In reality:

1. Most sites only have this problem due to inefficient design. You are literally complaining about handling 1 request every 2 seconds! That's like a "C10μ problem."

2. How many IPs are these bots coming from? Rate limiting per source IP wouldn't be nearly as intrusive.

3. There are much less obtrusive ways of imposing resource requirements on a requester, like say a computational challenge.
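To make point 3 concrete, here is what a hashcash-style computational challenge can look like. A minimal Python sketch under my own assumptions: SHA-256, a random server-issued challenge, and a 12-bit difficulty are all illustrative choices, not anything a particular provider is known to use.

```python
import hashlib
import itertools
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce whose hash has enough leading zero bits."""
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


DIFFICULTY = 12  # hypothetical setting: about 4096 hashes on average
challenge = os.urandom(16)
nonce = solve(challenge, DIFFICULTY)
print(verify(challenge, nonce, DIFFICULTY))
```

The asymmetry is the point: the requester pays thousands of hashes, the server pays one, and nobody has to identify a bus.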


Not every website is the same, folks.

> You are literally complaining about handling 1 request every 2 seconds

I don't know where this came from. The inconsiderate bots tend to flood your server, likely someone doing some sort of naïve parallel crawl. Not every website has a full-stack in-house team behind it to implement custom server-side throttles and what-not either.

However, like I mentioned already, if every single visitor is getting the challenge, then either the site is experiencing an attack right now, or the operator has the security settings set too high. Some commonly-targeted websites seem to keep security settings high even when not actively experiencing an attack. To those operators, remaining online is more important than slightly annoying some small subset of visitors 1 time.


> crawls every single webpage (and we have 10's of thousands) every couple days

100,000 / (86400 * 2) = 0.58 req/sec.
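Spelled out, taking "10's of thousands" as an upper bound of 100,000 pages (my assumption) and "every couple days" as two days:

```python
# Assumed upper bound on page count and a two-day crawl window.
pages = 100_000
window_seconds = 2 * 86_400

req_per_sec = pages / window_seconds
sec_per_req = window_seconds / pages

print(round(req_per_sec, 2), "requests per second on average")
print(round(sec_per_req, 2), "seconds between requests on average")
```

That is an average only; it says nothing about how bursty the crawl is.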

I acknowledge that those requests are likely bursty, but you were complaining as if the total amount was the problem. If the instantaneous request rate is the actual problem, you should be able to throttle on that, no?

I can totally believe your site has a bunch of accidental complexity that is harder to fix than just pragmatically hassling users. But it'd be better for everyone if this were acknowledged explicitly rather than talked about as an inescapable facet of the web.


> But if the instantaneous request rate is the problem, you should be able to filter on that, no?

Again, not every website is the same, and not every website has a huge team behind it to deal with this stuff. Spending 30-something developer hours implementing custom rate limiting and throttling, distributed caching, gateways, etc is absurd for probably 99% of websites.

You can pay Cloudflare $0.00 and get good enough protection without spending a second thinking about the problem. That is why you see it commonly...

If your website does not directly generate money for you or your business, then sinking a bunch of resources into it is silly. You will likely never experience this sort of challenge on an ecommerce site, for instance... but a blog or forum? Absolutely.


Actually I get hassled all the time on various ecommerce sites. Because once centralizing entities make an easy to check "even moar security" box, people tend to check it lest they get blamed for not doing so. And then it gets stuck on since the legitimate users that closed the page out of frustration surely get counted in the "attackers protected against" metric!

I'd say you're really discounting the amount of hassle people get from these challenges. Some sites hassle users every visit. Some hassle users every few pages. Some hassle logged in users. Some just go into loops (as in OP). Some don't even pop up a challenge and straight up deny based on IP address!

And since we're talking about abstract design, why can't Cloudflare et al change their implementations to throttle based on individual IPs, rather than blanket discriminating against more secure users? Maybe you personally have taken the best option available to you. But that doesn't imply the larger dynamic is justifiable.
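For the sake of argument, per-IP throttling in its simplest form is a token bucket keyed by source address. A rough Python sketch; the class name, the rate, and the burst size are hypothetical, and a real deployment would also have to deal with shared IPs, state eviction, and distributed attackers.

```python
import time
from typing import Dict, Optional, Tuple


class PerIpTokenBucket:
    """Allow each IP a burst of requests, refilled at a steady rate."""

    def __init__(self, rate_per_sec: float, burst: float) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        # ip -> (tokens remaining, timestamp of last update)
        self._state: Dict[str, Tuple[float, float]] = {}

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        if now is None:
            now = time.monotonic()
        tokens, last = self._state.get(ip, (self.burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._state[ip] = (tokens, now)
        return allowed


limiter = PerIpTokenBucket(rate_per_sec=1.0, burst=3)
# Five back-to-back requests from one documentation-range address.
results = [limiter.allow("203.0.113.7", now=0.0) for _ in range(5)]
print(results)
```

Only the noisy address gets slowed down; a different IP arriving at the same moment still has its full burst.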


> why can't Cloudflare et al change their implementations to throttle based on individual IPs, rather than blanket discriminating against more secure users

Cloudflare does not do this - I've made that point several times. The website operator either has the security setting cranked to a paranoid level (which is not the default, btw), or they are experiencing an attack. Those are the only two scenarios where Cloudflare is going to inject a challenge as frequently as you assert.

Normally Cloudflare will only challenge after unusual behavior has been detected, such as inhuman numbers of page requests within a short duration, or the URL/forms are being manipulated, etc. The default settings are fairly unobtrusive in my experience.

If you are also complaining about generic captchas on forms and what-not, that's a different thing entirely. Those exist as anti-bot measures, naturally, but also as anti-human measures. We simply do not want a pissed customer to send us 900 contact-us form requests one drunken evening...


> Cloudflare does not do this - I've made that point several times. The website operator either has the security setting cranked to a paranoid level

This is a bit of intent laundering. By Cloudflare providing ridiculous options, some people are going to take it because more "security" must be better.

> Normally Cloudflare will only challenge after unusual behavior has been detected, such as ...

or people using more secure browsers like Firefox with resistFingerprinting = 1. I suspect this is a significant blind spot for site operators. Have you personally tried your own site with RFP=1, TOR browser bundle, VPN from a datacenter IP, etc?

> generic captchas on forms ... exist as anti-bot measures, naturally, but also as anti-human measures. We simply do not want a pissed customer to send us 900 contact-us form requests one drunken evening

My whole point is it's a bit disingenuous to throw out large quantities of things as the argument, when the hassles are often thrown up on the very first request. I'm not complaining about the sites that throw up CAPTCHAs after the third failed login, but rather the ones that do it on the first attempt!

And sure, I don't have a good map of which types of hassles are specifically Cloudflare versus others of their ilk. And I certainly don't know how often Cloudflare doesn't cause problems, as it doesn't stand out. I just know there is too much indefensible surveillance-based user-hassling in general and OP's anecdote is right in line with my standard browsing experience on many sites these days.


> or people using more secure browsers like Firefox with resistFingerprinting = 1. I suspect this is a significant blind spot for site operators. Have you personally tried your own site with RFP=1

Yes, and it is not an issue for us. Again, this is up to site operators to decide for themselves. The defaults are sane, and Cloudflare makes it very clear what each level of their security configuration does. It is up to the site operator to decide how they want their site to behave. Perhaps, simply avoid sites that bother you? That list will grow by the day, unfortunately.

> TOR browser bundle, VPN from a datacenter IP, etc

Nobody, and I mean nobody, cares about this traffic. We're in the ecommerce space, so perhaps by that I mean nobody in the ecommerce space cares. We do not want TOR traffic. We do not want random-cloud-ip-vpn traffic. These are more often than not where our fraud bots/attempts originate, and we are not alone.

Recognize, if you are using TOR, or browsing regularly via a datacenter IP VPN - you are in an extreme minority and unfortunately lots of folks before you have used these services for bad things.

I personally like TOR, and VPNs. This is no slight against them - but the facts are undeniable here.

> surveillance-based user-hassling

You also referenced canvas-based fingerprinting, and seem to assume that's how these things work. Some might, but many are much more dumb than that. Usage-pattern based challenges are fairly simple when you understand what normal traffic looks like.


They may recrawl every 2 days and make 100,000 parallel requests.


In my experience, if bots start flooding a server, it's the ISP/hosting provider that gets angry and contacts the owner first.


> The point is - if you are dealing with inconsiderate bots, a captcha of some type is pretty nearly a bullet proof way to stop them.

Not any more.


Most bots do not handle javascript, still to this day. They want to scrape HTML and catalog prices, etc.

At least in our experience.


OK, fair enough. Not in about six months to a year. Because publicly available ML can now solve pretty much any CAPTCHA a human can solve, there's now an incentive to start deploying and improving the already existing JavaScript-capable bot frameworks.


Most bots are either search engines (of all kinds, not just your google's and bing's), competitors, or academic/fun projects.

Of the three common types, only one has a serious interest in breaking captchas - but they also have a serious interest in not getting too much attention by abusing your services deliberately. i.e. if a bot is misbehaving, it's going to get our attention, and we're going to look into what it's doing, where it came from, who operates it, etc, and possibly take some action if appropriate or available. Those actions may not necessarily be limited to the technical space either...

This is just our experience. Other industries will have their own sets of challenges to deal with.


CloudFlare is usually there to mitigate bots attacking. Without which, the site wouldn't be available to view in the first place.

CloudFlare is merely the symptom of a greater set of problems, which it attempts to mitigate.

If you want to be angry about something, be angry that bruteforce attacks are common, guzzle resources, and usually yield zero legal repercussions.


Personally, I have no problem with CloudFlare's bot protection. My problem is with CloudFlare's lack of diagnostics and community involvement to resolve/explain false positives. I have no idea what obscure default setting to change in Firefox to make it work.


[deleted]


Perhaps. Ask your government about it if you genuinely don't think the alternative is going to be far worse. People demand x solutions for y technologies like crypto or AI. Demand solutions for the problem.

The centralized solution is going to be a government-owned/controlled MITM service like CloudFlare. No doubt with actual ID for verification.

I don't see the decentralized solution happening any time soon.

Massive attacks existed long before CloudFlare ever did. If you're implying there's a conspiracy that CloudFlare is attacking others directly or indirectly to sell their solutions, I'd be extremely careful as that's defamatory and almost certainly false.

Furthermore, most CloudFlare users only use the free plan and thus cost CloudFlare money. Isn't that curious?


Imgur links haven't worked on my VPN for a long time.

Even if they did, I'd still avoid imgur since they censor even worse than reddit.


Adding another comment in general response to folks saying "spam/bots sucks, internet is broken"

I get the reason for these pages. But there needs to be an escape hatch in there somewhere. After N cycles of poor fingerprinting, give me some way of asserting I'm human-ish or even slow me down sufficiently where bots are stifled. I'm happy to pay a tax of some sort as long as there is an escape hatch.

As of now, the page keeps looping. For the sake of curiosity I've let it do its thing for a few hours and it never stops. I'd even take logic games or math problems at this point if captchas are too easy to break. Give me an escape hatch that isn't "use chrome".
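
To make the "escape hatch" idea concrete, here is a rough sketch in Python of a policy that stops re-challenging after N failed rounds and falls back to a capped, growing delay instead. Everything in it (the function name `next_step`, the thresholds, the delay values) is a hypothetical illustration and says nothing about how Cloudflare actually works.

```python
from typing import Tuple


def next_step(
    failed_rounds: int,
    max_rounds: int = 3,       # hypothetical: challenge rounds before giving up on fingerprinting
    base_delay: float = 2.0,   # hypothetical: first fallback delay, in seconds
    max_delay: float = 60.0,   # hypothetical: upper bound on the "tax" a visitor pays
) -> Tuple[str, float]:
    """Decide what to do with a visitor who has failed `failed_rounds` challenges.

    Returns (action, delay_seconds). While under the round limit the visitor is
    challenged again; past it, they are slowed down and then let through rather
    than being looped forever.
    """
    if failed_rounds < max_rounds:
        return ("challenge", 0.0)
    # Past the limit: double the delay for each extra failure, capped at max_delay.
    extra = failed_rounds - max_rounds
    delay = min(base_delay * 2 ** extra, max_delay)
    return ("delay_then_allow", delay)
```

The point of the cap is that a human eventually gets in after paying a bounded cost, while a bot making many requests pays that cost on every one of them.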


I get the need... but that is one perverse incentive right there.


Yes, this is annoying as hell. It's gotten to the point where I just close out of a site when I see that interstitial come up.


Yeah, same here (for example, I can't access forum.xda-developers.com anymore).


No kidding, these days if I get prompted by Cloudflare, I bail. I also noticed that if you're using a VPN, Cloudflare will block your access in some cases.

Maybe time for a boycott of sites using cloudflare /s :)

I also wonder how hard this is for people who are blind; I think they would have a very hard time. Seems to me blind people in the US could sue Cloudflare under the Americans with Disabilities Act.


>Maybe time for a boycott of sites using cloudflare /s :)

Pull the "/s" off, and you've already got one person (me) on your side :)


Cloudflare's verification and blocking means that I regularly have to use a VPN to access sites because having an HK ip address is reason enough to get those verifications or be outright blocked.

In the same way that Google breaks email by blocking any small servers, Cloudflare breaks the internet by blocking people randomly, not supporting Firefox on Linux, etc...

Both are cancers that make the world a worse place.


> I don't have a CloudFlare account so I wrote up a detailed post on their community forums. I offered a HAR file and was willing to do diagnostics. It received no responses and it was auto-closed.

Cloudflare has some weird thing going on there if you want to report bugs. If you try to open a support request to report the bug it'll be auto-closed stating only paid accounts can submit support tickets. Then it says if you really are sure then post it in the community. Did that but the post was auto deleted as spam. All I was trying to do was report a bug in their dashboard. Did someone internally game the KPI for open support issues? :)


I reported an abuse of their DNS system a few years ago to support@ and legal@ ... got told (paraphrase) "not our problem - you figure it out"


I'm not kidding, I've basically stopped using Google Search at this point because I refuse to disable my VPN or log in, and under these conditions I've been unable to do a Google search without passing several purposefully slowed recaptchas.

I used Google because I got a quick result for what I'm looking for. Now that I can't get that, I'm better off using a marginally worse search that doesn't force me to spend 2 minutes passing recaptchas to use their service.

I'm probably in a minority of people who use fresh incognito windows frequently, disable fingerprinting, and are always behind a VPN though.


Certainly a minority, but at least not alone.


You can email me (jgc@cloudflare.com) the HAR file and I'll get people to look at it.


Thank you very much! Email sent at 11:38 CT although it's 1.5MB so check your spam.


I have received it, spoken to the team and they are looking at it.


I very much appreciate your help and I'm glad to do any other diagnostics. However, respectfully, I think the deeper issue is the lack of community support. On the verification page, there is no "help" button. Even with my motivation to find the community support page (which is also non-trivial), there was no response on my original community post and it was auto-closed which was particularly offensive to me (at least just keep it open). In my opinion, something of the magnitude of "we're possibly going to lock people out of large parts of the internet" deserves more careful engagement with the public.

I also understand Linux is an obscure use case but I do wonder how many other "normal" use cases out there have been ensnared. Given the lack of a "help" link on the verification page, an average user is powerless.


I'm chatting with folks about how the community stuff is being handled. Linux isn't obscure (it's widely used internally, too!).


And in a day's time this thread will be off the main page and nothing will have been done.

How's the "chatting" going?


No news yet; however, one of the other HN comments suggested creating a new Firefox profile using about:profiles and that seems to have worked (whereas clearing cache/cookies didn't work), although I'm still trying to find the root cause because it's going to be annoying migrating to the new profile. I think the deeper issue stands that the process to find the cause of why CloudFlare is blocking large parts of the internet for me is too opaque, so I hope CloudFlare has a broader solution such as a diagnostic code or detailed help page.

Right now I'm reviewing about:config for non-standard settings. I did find that I did set general.useragent.override at some point and I forgot about it; however, unsetting it didn't help. I went through all other non-default settings and haven't found anything yet.


Update: The problem has been resolved. I can no longer reproduce the issue. I'm not sure if there was a fix on CloudFlare's side or if it was because I cleared cookies and cache and restarted my browser after resetting general.useragent.override.

If it was the latter, I'm sorry to CloudFlare as this was user error.

However, I do think the two meta points still stand:

1. Better diagnostics: perhaps a FAQ page that lists common issues such as an overridden general.useragent.override, etc. (obviously without giving anything away to bad people, but I'm sure certain things such as this can be pointed out)

2. Better responsiveness in the community forum particularly to this category of errors which blocks public internet activity.


I don't think this is a user error, nor was it ever in the first place. Cloudflare shouldn't have glitched in such a way. The service should accommodate however the browser is configured.

Call me skeptical, but this feels more like an "okay, this feature we wanted is breaking stuff and people are noticing. Let's turn it off for now".

Something new will break in probably a month.


I regularly get these infinite captchas on Firefox as well. A couple of days ago I noticed that switching to a different Firefox container lets me pass the captchas.


I had the same experience within an app to manage doctor's appointments on iOS. For some reason I was locked out of the app by Cloudflare and I ended up having to call my doctor's secretary to move my appointment. It took 15 minutes on the phone for something that should have taken less than a minute within the app. Even my friend who worked for that company had no way to fix it. I don't know whether to blame the app developers or Cloudflare here.

I wouldn't call Linux an obscure use case, it's particularly great for workstations and old laptops that struggle with running Windows.


> Tell HN: Cloudflare is breaking the internet

Fixed that for you. Cloudflare is a dark force of centralization operating under the threat of "but what if my forum with 10 users gets DDoSed?!" or "I'm too busy to set up Let's Encrypt so I let some random third party who leaks secrets all over the open internet terminate TLS on my behalf."

And bonus now we all have to jump through 15 captcha hoops to load some stupid website barely worth visiting anyway. Who gives a flying fuck if bots look at your ugly website anyway?


... I don't think people are using cloudflare to protect something willy nilly.

My general experience is, if you host a popular site, it will be DDOS'd.

If you host a site in a 'competitive' space, you will get DDOS'd.

I've seen it all personally, forums, image upload sites, NFT galleries, and SAAS health tech even, people will spend a couple hundred dollars to make you miserable.

If you don't have protection, they can literally see how you are falling and it only encourages further spend.


> ... I don't think people are using cloudflare to protect something willy nilly.

I do, it's a buzzword. Cloudflare, you don't have that? You're not cool unless you do.

With young apprentices learning the ropes of SRE/SysAdmin, DDoS protection has been painted as a #101 of the web when realistically you don't need it.


Also, Cloudflare is one of the easier free gateways to set up for your domain. Lots of people are using that (and Cloudflare apps) and they probably enable these protections without giving it a second thought.


Is putting up a HAProxy instance with some DDoS mitigation rules really that hard?


Worth mentioning that Cloudflare also hosts those DDOS services and prevents them from being shut down.


Do you have a sauce on that? It's a really big charge.


I spoke too strongly to say they keep them online intentionally. The services do per se violate their AUP but because CF protects their identity, CF becomes the arbiter of taking them offline, and doesn't always agree to do so.


Like what?


Congrats you won a self made up argument


Put blame where blame is due. Poor security practices in operating systems of Internet-connected devices are breaking the Internet. Bandwidth is not cheap and only botnets can afford to DDoS major Internet sites. Cloudflare is the mitigation to terrible security practices in software development and system administration that allows botnets to persist. Cloudflare is simply the Schelling point people have arrived at to minimize harm until we have better-secured peers on the Internet (if ever).

The incentives are unfortunate; bandwidth is not free but it's cheap enough that individual owners don't really care if their hosts are part of a botnet until their ISP starts complaining or disconnects them. Individuals also don't really have good choices available to them; consumer devices rarely get patched for very long compared to their useful lifetime.

I think the current compromise is better than some alternatives like an Internet Passport or harsh penalties for making mistakes on the Internet or FDA/FCC levels of scrutiny on Internet-connected devices.


CloudFlare's been "breaking the internet" for years


My favourite is how depending on what hosting provider you use you can't access their own blog's feed. You get a 403 because you are a robot. Imagine that! A robot accessing a machine-readable feed so that humans can read your marketing material! How awful!


My service uses Cloudflare and we get hundreds of millions of bots trying to abuse our service.

The other day I stopped the Cloudflare CAPTCHA for a day just to see what would happen and the next day I saw fake orders with disputes and credit card testing, which cost my business thousands.

I don't think this is a major problem for consumers, but for merchants, going without CAPTCHA is even worse.

I think I'll keep the CAPTCHA turned on, not sure if there is an alternative though.


Would a fraud detection solution that you can query for IPs and cards, without a captcha, work?


(if so, I have one =) )


I also hit the CloudFlare verification merry-go-round several times per day using Ubuntu / Chrome.


Reminds me of when I logged into a gmail account I forgot about for like 13 years.

Google asked me to verify I'm not a robot, so I did. Then it said I "couldn't be verified" anyway so I did it again, but it gave me like 20 questions in a row.

It said I once again "couldn't be verified" at the end of it (I clearly didn't fail) and I would need to verify my phone number and email. So ha! Got you there.

...But I did that, I verified both which was clicking links or entering authentication codes from multiple devices and multiple linked accounts. After running out of excuses it just eventually said something like "You cannot log in at this time," despite having completed every security challenge.

I absolutely didn't fail any, and if I had, it would have immediately kicked me out and stated so, which has happened before on other computers in previous years for different accounts. I wasn't on any VPN and didn't have any abnormal operating system or other settings. This was either mainstream, up-to-date Firefox or Chrome or both. It was on my main regular computer in the USA in a popular tech professional city.

I never got the password wrong while it asked me to log in or anything, which it did about 10 times. I got everything and all security questions correct on the first try without any level of failure in regular human time.

Absolutely nothing should be setting off major red flags... If they're not going to approve my login, they shouldn't have me dancing through hoops for hours. I passed every test and verified registered devices associated with my account and verified security emails sent to other accounts that it was indeed me. If I pass every security check, why do they get to still decide no after wasting hours of my time? Why not just reject me straight away?

It's like winning the lottery and jumping through every hoop to verify that I legitimately bought the ticket in a legitimate circumstance with absolutely my money and they keep going through a checklist of loopholes to not pay out. When I don't meet any of the loophole conditions that they're trying to stretch to meet, they just give up and say "No, you didn't win." Actually, that sounds like a recurring real major problem that actually happens in the US now that I think about it.


There are entire websites that simply will not work for me on Linux+Firefox because of Cloudflare. Never before have I wished for a company to go out of business, until now.


I assumed it was due to me being on a VPN and/or having privacy.resistFingerprinting turned on in Firefox, but I encounter this several times a day. Agreed that it sucks. I know Cloudflare is probably damned if they do, and damned if they don't, because they're warring with bots, and some of us are collateral damage. It's that privacy vs. convenience tradeoff our bearded cyber prophets warned us about in the 90s.


I'm working on a crawler and CloudFlare is the cause of 99% of all the headaches and random bugs I encounter doing simple HTTP requests.

I literally have implemented custom logic to deal with sites returning the "Server: Cloudflare" header.
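
For anyone curious what that kind of special-casing might look like, here is a minimal sketch in Python. The function name and the set of status codes are assumptions made for illustration, not a documented contract; it only combines the `Server` header mentioned above with status codes that blocked requests often come back with.

```python
from typing import Mapping

# Assumption: status codes treated as "probably a block or challenge page".
# This set is illustrative, not an official list.
CHALLENGE_STATUSES = {403, 429, 503}


def looks_like_cloudflare_challenge(status: int, headers: Mapping[str, str]) -> bool:
    """Heuristically flag a response as a Cloudflare block/challenge page.

    Header names are case-insensitive in HTTP, so they are normalized first.
    The header value is compared case-insensitively as well.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    served_by_cloudflare = normalized.get("server", "").strip().lower() == "cloudflare"
    return served_by_cloudflare and status in CHALLENGE_STATUSES
```

A crawler could call this on each response and route flagged URLs to a slower retry queue instead of treating them as ordinary errors.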


I had the exact same issue for months, and I just checked and it is gone (on Firefox), which has to have happened in the last day, without me having changed a single thing on my end. I'm certain it's because you made some noise, so thank you. It was absolutely ridiculous how many websites Cloudflare was able to render unusable in Firefox. Truly a terrifying power for a company to wield.


CloudFlare is as evil as Google if not more and people don't seem to realize that. Solving their captcha doesn't help; it requires solving about 15 captchas before granting access. I gave up and I always use Chrome when that happens. It wastes a lot of time, it slows humanity - not only by solving those captchas but also by this slow "checking your browser is secure" javascript, it adds up on a massive scale. As more services adopt CloudFlare, users are hostage. And they no longer decide which browser to use. At the same time all traffic goes through CF which acts like a massive surveillance hub. Very depressing. It started as a DDOS protection (which could help some people) but ended up with all this WAF js "browser security" crap. I wish CF never existed.


Cloudflare are just selling a tool to solve a problem in the best way they can conjure.

All of this comes from there being no universal way to prove you are a human on the Internet. If somebody were to invent a physical device (think YubiKey) that attested that your activity is human without it being usable to identify/track you, we might have a shot at solving this without CAPTCHAs.

The device would be issued to you as an individual and any signs of it being abused could be reported to deactivate it. I have no idea how such a device would work, but I'm sure it's possible. With machine learning becoming more powerful, this is going to be needed one day.

And before somebody makes the argument of "but that's centralised, big brother, blah blah whatever bullshit", let me remind you that every payment you make goes through either Mastercard or Visa.


> All of this comes from there being no universal way to prove you are a human on the Internet. If somebody were to invent a physical device (think YubiKey) that attested that your activity is human without it being usable to identify/track you, we might have a shot at solving this without CAPTCHAs.

Which is good. That's a desirable property. The distinction isn't available without also allowing fingerprinting. Further, the line between bot and user-agent is not perfectly clear. Something like cost-based attestation where humans and bots are treated equally is ideal.

> And before somebody makes the argument of "but that's centralised, big brother, blah blah whatever bullshit", let me remind you that every payment you make goes through either Mastercard or Visa.

That's an even bigger problem!


>Which is good. That's a desirable property

Is it? That's Cloudflare's whole selling point - keep the bots out. I can understand from a hacker perspective wanting bots to be able to roam the Internet as freely as people but that causes massive headaches for sysadmins, SREs, and DevOps. robots.txt is no good because it's opt-in.


Cloudflare's decidedly _not_ about keeping the bots out. It's about keeping out malicious traffic. This seems like a tautology, but I'll explain why they are not the same: When I hit refresh in my RSS client and it GETs 250 different servers, on my behalf, is that a user agent or bot activity? How are you going to differentiate the two by their behavior? Some bots are let in, on purpose, like search engine crawlers. Some users are kept out, on purpose, because they use anonymity tools.

Since we don't have chips that detect one's heart's intentions yet, the best we can do is treat bots and user agents the same, and address the problem of malicious activity in other ways. This can be rate limiting, paying per request (i.e. hashcash) or other mechanisms I don't have top of mind. But bot=deny and user=allow is not what Cloudflare does or seeks to do.
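
As a rough illustration of the "pay per request" idea, here is a simplified proof-of-work sketch in Python, loosely modeled on hashcash. It is not the real hashcash stamp format (which uses SHA-1 and a dated header); the function names, the `resource:nonce` encoding, and the use of SHA-256 are all assumptions for the example. The client burns CPU to find a nonce, and the server checks it with a single hash, regardless of whether the client is a bot or a browser.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a non-zero byte, the leading zeros are 8 minus its bit length.
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, difficulty: int) -> int:
    """Client side: search for a nonce whose hash has `difficulty` leading zero bits.

    Expected work is about 2**difficulty hash evaluations.
    """
    for nonce in count():
        digest = hashlib.sha256(f"{resource}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(resource: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{resource}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```

For example, `nonce = mint("GET /feed.xml", 12)` followed by `verify("GET /feed.xml", nonce, 12)` costs the client a few thousand hashes and the server one. A real deployment would also need to bind the stamp to a timestamp and reject reused nonces, which this sketch leaves out.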


Let's all note just how much market share cloudflare got before throwing this switch. While they took over a huge part of the web this sort of thing never happened. Now it seems very much harder to even attempt to browse with slightly more anonymity.

Ladies and gentlemen start your conspiracy theories.


Right observation, but wrong conclusion.

Cloudflare skews towards a monopolistic monoculture. (Fastly and Akamai also exist, but present more friction.)

The issue is that with one transparent proxy and application firewall for a large fraction of web traffic, it has to cover uncountable edge-cases to not leave out nonzero users from a large number of sites. It's unlikely to be malicious intention here, but more likely accidents, oversights, and lack of alternatives.


This in fact happened all along and we warned you and got downvoted for it until now.


> I finally went and deleted all my cookies and cache which I had been dreading to do.

You could have just tried it in porn mode. Another option is to use a different profile or a portable version.

https://support.mozilla.org/en-US/kb/profile-manager-create-...

https://portableapps.com/apps/internet/firefox_portable (Windows only, I guess)


I just tried creating another profile and it seems to help! I didn't realize clearing cache/cookies might be insufficient. Maybe it's some other setting that I have in my default profile, although I don't remember changing much at all from factory Firefox settings.


Glad to hear!

There are some things around local storage which aren't cleared even if you clear cookies.


I noticed that when you browse with Linux or a VPN and sites go crazy, HCaptcha seems far friendlier than Recaptcha, in that I never get stuck in a loop like Recaptcha does where it glitches out, or bugs out and makes it so you'll spend 5+ minutes going through 5 or 6 rounds of matching images because it oddly fails.

If captchas are so important - serious point, perhaps different ones are the way to go?

I apologize in advance if this is more of a setting of difficulty from Cloudflare on Recaptcha, and Hcaptcha potentially being able to be set just as difficult/cost you as much time to get past/etc


> bugs out and makes it so you'll spend 5+ minutes going through 5 or 6 rounds of matching images because it oddly fails.

That isn't just a reCaptcha thing. HCaptcha will definitely do that as well -- and if anything it's worse, because some of the "identify this AI-generated image" challenges are pretty awful. (At one point, I recall it asking me to "select the ladybugs" with nine images all containing round spotted bugs in slightly different shades of red and orange.)

