Blocked by Cloudflare (jrhawley.ca)
632 points by jrhawley on Aug 8, 2023 | 455 comments



So many privacy nuts use Chrome and don't realize this:

> What about Google Chrome?

> I tried all of the above in Firefox. So I naturally tried to access the same page in Google Chrome to see if I’d still be blocked. Thankfully, I wasn’t.

> But of course I wasn’t because Chrome doesn’t have the same privacy- and security-enhancing designs that Firefox does. Chrome will happily collect as much private information about me and my browsing history and share them with select parties, as needed. It also doesn’t resist fingerprinting or let me modify settings to the same degree that Firefox does because Chrome relies on those fingerprinting technologies to ensure that I am targeted by ads it deems necessary for me to see.

> Being blocked on Firefox and not blocked on Chrome also tells me that Cloudflare is blocking me based on the fingerprint (or lack thereof) of my browser. Everything about my connection is identical between the two requests, aside from the browser being used. It’s the same security certificates, same corporate VPN, same machine, even the same timeframe when I try to access the site.

If you care about anything these days, don't use Chrome.


I’m no Google fanboy but I wasn’t satisfied with this:

> Chrome will happily collect as much private information about me and my browsing history and share them with select parties, as needed

What information does Chrome provide in this scenario that Firefox doesn’t? It feels like backward logic: it worked in Chrome therefore it must be because Chrome gave extra info. In reality it could be a whole bunch of things, something as mundane as Firefox being a rarer user agent so subject to more filtering.

It strikes me that all of this is an inexact science. I've run into rate limit messages on sites before that go away when I switch browsers, no matter which browser it is. I assume it's because, with the limited information given, the DDoS protection software assumes that same IP + different UA = different computer.

I have no clue, but I wasn’t persuaded that this specific scenario works in Chrome because it was giving away more information. At a bare minimum, try a third browser!


I don't mean to support or refute the author's main points or analysis, but you might like to know that the Chrome team is currently working towards shipping the Topics API. I have strong opinions about it but I will try not to editorialize.

My high-level understanding is that they're going to run an ML model over your browsing history (locally on your device) to build a list of "topics" that you care about. Sites you browse can use the Topics API to pull a set of these interests from the browser to show you "relevant" ads. Mozilla has taken a negative position against this standard.

https://privacysandbox.com/proposals/topics/

https://github.com/mozilla/standards-positions/issues/622


How is that relevant to the topic?


You asked:

>> Chrome will happily collect as much private information about me and my browsing history and share them with select parties, as needed

> What information does Chrome provide in this scenario that Firefox doesn’t?


Key words: "in this scenario"

Is Cloudflare using an as yet unshipped API as part of DDOS protection?


No, the idea is they're abusing existing APIs for fingerprinting purposes that Firefox privacy settings disallow: canvas font rendering difference detection, detecting your GPU model, and things of that nature.

But this new API demonstrates that Google is not on the consumers' side when it comes to limiting tracking/data-gathering ability, as the new API is explicitly for fingerprinting.


> No, the idea is they're abusing existing APIs for fingerprinting purposes that Firefox privacy settings disallow

But that’s exactly what I’m saying: the author asserts as fact the reason Chrome worked was because it gives up more personal information but there’s no interrogation of whether that’s actually true and if true, how it’s achieved.

I’m no defender of Google; I just believe we should be making arguments we’re able to actually back up.


Fingerprinting is one of the techniques used to track you across the web.

If the site is serving Google, Meta, or ads from other networks, your unique browser fingerprint is one of the tools that makes it possible to target and retarget you.


I think we’re all aware of that. Where’s the specific evidence that Chrome passed the Cloudflare DDOS protection because it gave up more private information than Firefox did?


especially since the author had to change the privacy.resistFingerprinting setting in Firefox to true to get it to work (meaning that it was able to bypass Cloudflare's loop by being MORE secure). But that appeared to break other non-Cloudflare sites.

I think the fingerprinting is a red herring. Yes, Chrome is less secure. But Chrome worked.

It's quite possible someone at the author's workplace updated their Cloudflare WAF settings and made things more strict, causing more checks. I'd even offer that a Firefox extension might be contributing.

But the argument that Chrome worked because it offered Cloudflare personal information is pretty out there ;)


I thought it was the opposite: that instead of fingerprinting users, web services would instead just ask the browser which topics the user is interested in and display the relevant ads. It's an explicit design goal to reduce the dependence on fingerprinting users; otherwise why would they do it? Topics are supposed to be the locally sourced, privacy-preserving alternative to invasive tracking.

Whether Mozilla/Apple/others agree is a different story. The blowback has mostly been around how Topics isn't perfect and the design still leaves room for abuse, and therefore effectively devolves to traditional tracking: https://mozilla.github.io/ppa-docs/topics.pdf.


For me the issue is that a browser shouldn’t be making information about the topics of sites I visit available to anyone who asks.


Browsers don’t do that today, and the result is that ad networks fingerprint and track you to try and serve you more relevant content.

The argument from supporters is that this is a step away from the “fingerprint and track” status quo MO. The argument from detractors is that it doesn't quite achieve that goal.

All you need to address your concern is for access to the API to be user-configurable.


Anyone who believes that ad networks won't continue to do fingerprinting in addition to whatever privacy leaks Chrome adds is a fool.


Not if browsers actually limit access to the data needed to do so.


The API should be off by default, i.e. it’s opt-in and not opt-out.

And it should be behind a permissions prompt


That's a distinction without a difference. In both cases, user privacy is compromised. If anything, the proposal to make "user agents" snoop on the user is even more infuriating. That sounds more like a trojan horse than a "user agent."


When I started having this problem logging into a certain credit card co.'s website, beginning around Firefox 105.0.2 on Fedora 38, I was told by their apparently outsourced customer service that I had to use Chrome, which I don't have installed there and couldn't try. Yeah, they wanted me to use LogMeIn so they could fix the problem, too. Right.

Firefox on Android was still working, though, loath as I am to put passwords of any significance on my phone. Doesn't directly address your question, which I'd like to know the answer to as well.


Brings me back. My company "upgraded" the time entry system at the beginning of this century. The issue: our whole dev team was on Unix (HP-UX, Solaris) and used Firefox, which didn't work anymore (IE only). Their solution was to have 3 separate terminals we would "Citrix" into an NT machine from, to do our time entry on Internet Explorer...

Sigh


PayPal's "secure browser" effectively becomes broken by Firefox's first part isolation. that took some time to figure out.

In terms of being blocked by CloudFront (not Cloudflare), I actually got a website to fix their policies by just emailing their tech support and showing that simple user-agent changes bypass their policy anyhow.


[flagged]


> Completely reasonable and expected response from customer support

Absolutely not, it is not reasonable or expected that a credit card company launch a website that doesn't work with Firefox.

> Back in the day, my university would load balance based on the browser being used.

What on earth?


So cancel your credit card with them? They have a reason field on the cancellation form.


If my own bank/credit card blocked Firefox I would cancel with them. I'm pointing out that this isn't really normal or justifiable.

To your specific point about just moving elsewhere, complaining in public about bad industry practices is part of Capitalism and part of how consumers regulate the free market. "Take your business elsewhere instead of complaining" has never really been how this has worked; businesses don't get to opt out of being shamed just because they have a cancellation form, and they shouldn't have any expectation that users will or should be quiet about their bad business practices. The free market is not a replacement for criticism within social spaces; the free market works alongside that criticism and is reinforced by that criticism.

Public complaining is an essential part of how consumers within a free market coordinate with each other and educate each other about abusive corporate behavior, and it serves as an additional mechanism alongside boycotts and cancellations to help punish bad actors in the market.


> I'm pointing out that this isn't really normal or justifiable.

Oh well, what can you do? Vote with your wallet. Tell everyone on HN and Reddit. I agree. But at a certain point it wastes too much of my energy, so I'll basically just cancel and tell them I cannot use their service because reasons, very disappointed, bye.


Why would they load balance based on user agent? I can’t think of a scenario where that was a reasonable solution.


Maybe back when standards were on shaky ground and different versions of the same content were made? I too can't see the performance advantage of it. Deprioritizing less mainstream browsers to mess with the nerds?


Ahhh yes, I remember those days... if you wanted to use advanced IE-only features, send to one codebase; if you wanted broader compatibility, send to another. Similar to how mobile websites used to work. Thanks for the ideas! Any other hypotheses?


A third browser... like what? Chrome and Firefox are all that exist now, unless you have access to a Mac with Safari.


My "third" browser is GNOME Web, however, I uninstalled it thanks to performance issues. I installed Chrome from Flathub, but with limited permissions, which I only use for cross-browser testing. My main browser is Firefox.


There are a handful of WebKit-based browsers out there, though none that are popular except for Safari.

But yes, 3 is all we're left with outside of a few bespoke projects...


Honestly the SerenityOS browser (+ its Linux port, Ladybird) is probably the funniest. I wonder if that passes Cloudflare...


Servo seems to be more viable than Ladybird


I remember back when you could run the Servo app on macOS; it was a doge inside a cog, and you could actually browse the internet, with an address bar and back/forward buttons. But now they've removed that sort of stuff and given up on making a standalone browser in Rust, in favor of augmenting Firefox instead. See Firefox Quantum.


Mozilla actually fired the Servo developers to focus solely on Firefox (they still employ Rust developers, just not on Servo). But after some years, other companies picked up development on Servo.

Servo doesn't have a browser, but I'd wager that writing a full-featured browser on Servo would be much more useful than another Blink browser.


I think Servo has already served to bootstrap a bunch of Rust-ecosystem things, and that's why they yeeted it. Though webrender and some other offshoots from Servo are still useful for a lot of projects.


Chromium isn't Chrome. Microsoft Edge is popular. And Opera is still used: my teen daughter seems to have bonded with it on her own.


Edge is now Chromium and Opera is also Chromium, but touché that I said "Chrome" in my original comment.


It's time to pull out Lynx again.


Check out Vivaldi...?


You mean "Chromium with extra steps"? I know it's a fork, but the actual engine is still mostly Chromium.


I've had sporadic issues with Firefox not working on work-related sites one day when the previous day it worked just fine.

I have uBlock, Privacy Badger, Decentraleyes, Canvas Blocker, Facebook Disconnect, and DuckDuckGo Privacy Essentials installed.

I would go through and disable each extension in order to see if it was the cause of the issue, and so far, every single time it has been DuckDuckGo Privacy Essentials that is breaking websites for me.

I think I should remove it at this point, but who knows? Maybe it's protecting me from something that I don't see.


With Firefox you can toggle some settings that will make it much harder to generate useful fingerprints. That's already a massive privacy difference.


Why would chrome give that information away? That's Google's most valuable resource.


Maybe they're directly delivering your information for a price. From you to them, directly, via Chrome.


https://privacytests.org/ shows some good data on what each browser lets through/exposes to websites.


Caveat: (default settings)

I harden my Firefox installations, and therefore this website's comparison isn't useful to me.


It does have LibreWolf and Mullvad listed, which are hardened Firefox forks. But it's still not your exact scenario, my bad :)


@afavour: The topic isn't as simple as having an HTTP header with a unique identifier. Browser fingerprinting is a complex process that uses unintentional implementation details, like how things are rendered with different graphics drivers, or details you can get from APIs that are intended for other purposes (like WebRTC).

The site that morjom posted gives you a simple overview, and Firefox is known for the privacy-preserving features it comes with. However, you are right that it is an inexact science as long as we don't know the logic of the Cloudflare implementation.


Chrome will indeed divulge more information than other browsers, but only on the condition that you have opted in to such collection.

“The Chrome User Experience Report (CrUX) provides user experience metrics for how real-world Chrome users experience popular destinations on the web. This data is automatically collected by Chrome from users who have opted in, . . .”

Taken from https://web.dev/crux-and-rum-differences/


It's not a real-time API, though. It's an aggregated dataset available via BigQuery. I don't think Cloudflare could use it as part of DDoS protection except in very vague ways.


You're conflating a downside of using Chrome and the reason they think Cloudflare blocked them.


> So I naturally tried to access the same page in Google Chrome to see if I’d still be blocked. Thankfully, I wasn’t.

> But of course I wasn’t because Chrome doesn’t have the same privacy- and security-enhancing designs

Maybe I’m missing something but it seems the conflation was by the article author, not me?


Seems like the author mentioned that in Firefox disabling "privacy.resistFingerprinting" worked. So it looks like Chrome by default is allowing the server to collect fingerprinting data. If Cloudflare is using that, then it is a big red flag.


The opposite: enabling the flag fixed the issue, although it broke other sites.

  > Eventually, I found some suggestions that if you’re using Firefox you can disable the privacy.resistFingerprinting option in the about:config page. But that was already listed as false for me when I got stuck, so I switched the value to true just to see if that would do anything.

  > And that worked!


Of course they are. That's the whole point of the 'Integrity Check'. Besides, almost every website you visit collects your fingerprint nowadays.


No. And there’s still the central issue of the author really hand-waving the specifics of their accusations about Chrome. It really seems to come down to “Google bad”.

To be clear, I don’t even use Chrome, in part because “Google bad”. This just isn’t intellectually honest.


The heuristics used to attempt to differentiate between a so-called "bot" and a "human" are, IMHO, inadequate as long as there are "humans" that are allegedly mistaken for "bots" and blocked. "Use Chrome" is not a solution. A person using Firefox or some other non-Google software is still a "human". But not according to these brilliant "site protection" schemes. What level of false positives is acceptable?

Using JS to "verify that this is not a bot" is a way to force users to enable JS and expose themselves to more advertising.


Blocking bots in the first place should not be acceptable, since bots only act on behalf of humans. What should be blocked is abusive behavior that actually impacts the site; a single one-off GET to what should be a static page should never be blocked, yet that's what CF does.


Furthermore, all bots worth their salt as far as threats go enable JS and do everything they can to appear like a normal browser.


That's fine, as processing the JavaScript at least increases the cost.


I'd love to know if puppeteer passed that test (probably). I have had exactly this problem many, many times and it is incredibly frustrating.


There are GitHub projects that are forks of things like Selenium and Puppeteer that are specifically designed to avoid detection, for things like scraping Google search results, etc.


Puppeteer passes the test if you run it from a machine that already has a good Cloudflare reputation score. Try it from an AWS instance and it definitely fails 100% of the time.

(I've tried it, that's how I know)


Easy to say don't use Chrome, harder to say don't use Cloudflare.

And if we're taking things to task for monopolizing a market and being a threat to the future of the open internet, I'd say Cloudflare is and will always be a bigger threat.

The moment the Cloudflare dictatorship becomes less benevolent, everyone is gonna feel it.


> The moment the Cloudflare dictatorship becomes less benevolent…

In my eyes they have already done that. ICYMI, I highly suggest checking out their response and subsequent blog post around the Kiwi Farms incident.

That whole debacle was enough to prove to me they learned nothing and are going to continue down this path. I migrated web services and closed my account with them shortly after that whole thing.

Cloudflare routinely ignores abuse reports for its network and takes no responsibility for the utter garbage being carried across their network. It’s almost comical how they so desperately cling to the claim that they are “just a dumb pipe” on one side of the house and on the other a “serious security vendor” who is “protecting the web” while blocking out users simply for the “crime” of trying to preserve their privacy.

If they wanted to convince me they had the web’s best interest at heart they wouldn’t host half the sites they do. They would actually respond to abuse reports and take abusive websites offline rather than wait for it to hurt their bottom line and reputation before taking action but they don’t.


Wait, Cloudflare stopped being benevolent by NOT abusing their power enough? You hold two different opinions: one is that Cloudflare should respect privacy, and one is that it should moderate the internet; these are fundamentally at odds.


Website owners can just stop using Cloudflare though…


Yes, but how can end users opt out of using Cloudflare?


By end users, you mean people browsing the internet? I think you're conflating Cloudflare DNS with site owners leveraging Cloudflare CDN and WAF/Security.


> If you care about anything these days, don't use Chrome.

Or Cloudflare.


funny enough... I called out Cloudflare for the pariah it is, and got downvoted and flagged


I have done the same, to the same result. We must be the lunatics, as everyone keeps defending their decision to put everything, even their personal blog, behind a single company, because "they might get DDoSed".

The absolute state of software engineers and systems administrators in here, man. Talk about overengineering and premature optimisation, let alone being totally oblivious that their laziness is what creates a monopoly.


People immediately assume if you dislike CF you’re defending one site in particular and once they do that no further discussion is possible.


I'm out of the loop I guess. Which site would that be?


Probably Kiwi Farms or whatever it has evolved into these days.


Funnily enough, KF has its own buttflare-style bot protection script that doesn't really like Firefox. Or whatever provider they are using now does.


and someone's come and done it again


I seriously never get people that love CF (or any company, for that matter). Praising 1.1.1.1, giving it free advertising. CF is basically handing over your website in return for some less work on your part. I get the advantages of it (like fewer engineer credits wasted, less server maintenance work and probably cost, faster), but actively giving it free PR just doesn't sit right with me. Pay your bucks and sit. They are a Big Tech company; they don't need your prayers.


>CF is basically handing over your website in return for some less work on your part.

The older you get, the more valuable being able to just dump your shit on other people becomes.


Quite the opposite for me. The older I get (42 now) the less patience I have for people who sacrifice freedom for convenience. You're ruining it for my children when you do this.


I’m a similar age. My view of tech was shaped by sites like Slashdot and the thought processes which went into that culture, which put control of my own equipment as the important part. It was an eye opener when I entered the workforce and found tech people who genuinely loved the Microsoft ecosystem culture, which makes sense given how much stuff was built for IE only.

I wonder how world views were shaped for those entering the industry post-GFC, with Google and AWS in the ascendancy.


No, I totally get it; I can see myself making the same compromise. I can't see myself recommending such a practice, however.


I don't praise 1.1.1.1 because it's just a DNS server. My firewall's set to redirect all DNS queries there. It works fine. Could've used 8.8.8.8 instead, but I trust Cloudflare slightly more.


There are several other providers, none of whom are as large as Cloudflare, including Quad9 and OpenDNS, and of course your own ISP.


Yeah, I could've used any one of them; before 1.1.1.1 I usually used Google.


> So many privacy nuts use Chrome

Really? That's news to me.


Well, Chromium is quite popular with the security-conscious on Linux. At least it was when I was using Arch Linux; they had some good custom build script versions.


Some particular build of Chromium and Chrome are vastly different systems. A lot of this is philosophical: the Mozilla way is to support standards and tut-tut at websites for doing overtly malicious things like looking at user-agent or asking for Widevine; the Chromium way is to treat the web as a hostile actor and offensively subvert anti-user behaviors.

Any modern browser that doesn't actively fingerprint as either most-common Chrome on a laptop, most-common Android browser, or most-common iPhone is written by such hopelessly naïve nerds that they shouldn't be trusted with user-facing software with real security considerations.


Security conscious and privacy conscious aren't the same thing, although there's overlap. I can be concerned about the security of my system without caring about whether I'm being targeted for ads.


This is untrue, but frequently misunderstood: Privacy and security are two facets of the same problem. If you don't have security, your privacy is at risk. If you don't have privacy, your security is at risk.

Case in point: Many of those targeted ads contain malware. :)


Do you have any evidence for your claim?


Evidence of what? That malvertising exists?


This loop happened all the time for me in Kiwi Browser on mobile. I have a couple of fingerprint-reducing extensions installed there. I also use other extensions, like Dark Reader, to make website backgrounds pitch black to reduce OLED display drain and improve readability in darker environments. It appears to be better lately, happening more often while I am travelling and changing IPs, less when I am at home. Still, it wastes time when it does the loop; it forces me to use unmodified Chrome, wasting more battery power and harming my eyes in the dark with those white backgrounds. Unfortunately more and more websites are proxying through CF, thinking they are 'protecting' their website. But CF acts like the Chinese Great Firewall, deciding who can and cannot access the site.


I don't quite understand the "ads it deems necessary for me to see" comment. You will always get ads on sites that serve ads. The thing the tracking might do, is change which particular ads you get. The right solution to that, is to use an ad blocker, and to pay for sites that have an ad-free alternative.

Also, fingerprinting isn't always "bad" -- any business who takes credit cards online, wants to try to exclude people who will commit fraud (because they might have done it before.) Preventing fingerprinting, means you prevent certain anti-fraud, which means that you see higher prices and more friction doing commerce online, which also affects your experience. The connection is just much less direct.


> Also, fingerprinting isn't always "bad" -- any business who takes credit cards online, wants to try to exclude people who will commit fraud (because they might have done it before.) Preventing fingerprinting, means you prevent certain anti-fraud, which means that you see higher prices and more friction doing commerce online, which also affects your experience. The connection is just much less direct.

By the same argument you could say it should be fine for a physical store to refuse service to anyone who they get a bad feeling about or don't want to serve. But if you permit that then you're immediately opening the door to racism etc., which we consider socially unacceptable. It should be the same for websites too - I bet all these browser fingerprinting techniques just happen to mean better service for people who can afford the latest iphone.


Tracking is establishing your identity. Try using a private-mode Firefox via a VPN. Half of the web is completely unusable. You get put in unsolvable captcha hell as punishment for being anonymous.


Try walking into a real place with a mask on and you might also get treated less pleasantly.


Walking into real places with a mask on has been normal for the past three years.


OK, but not with a balaclava.


Will a fake nose, moustache and glasses do?

(My point? Characterising anonymity with an item of clothing associated with paramilitaries has associations that don't need to be there.)


I think that a randomly generated completely real looking expert disguise is probably best. And a pain in the ass.


Have you visited many stores since 2020? There was an event around that time.

I still today wear a mask in every store I enter and I can completely honestly say that I have never gotten a weird look from staff over it; it's never been a problem.


> I still today wear a mask in every store I enter

But why.


Halfway through Covid a witch cursed me so that anyone who looks at my face in public immediately goes into horrible convulsions and dies. It was a confusing week until I found out what was happening, and the bodies were very hard to dispose of discreetly.

----

More seriously (and more relevantly), in the context of the current conversation about privacy and user autonomy, the correct answer to "why do you need to be able to do X" should usually be, "that's none of your business."

"Why do you need to run a VPN?" None of your business. "Why do you need to wear a mask?" None of your business. "Why do you have WebGL disabled?" None of your business. "Why does your browser not have this font installed?" None of your business.

A big part of autonomy and agency is that you don't need to ask permission or justify to anybody why you're doing the things you have agency to do. If you need to explain then it's not autonomy, it's permission. I don't feel I need anyone's permission to wear a mask indoors in a public space regardless of my reasoning (and in practice I'm never asked to explain; mileage may vary, but my experience is that nobody really cares). And similarly I shouldn't need Cloudflare's permission to run an obscure browser or to customize my computer setup.


Try a balaclava.


I am pretty sure I could wear a balaclava to Walmart. In fact early on in Covid during mask shortages I'm pretty sure I did wear basically the functional equivalent of a balaclava into a Walmart because I couldn't find N95s.

Admittedly France has tried this bullcrap with burkas before, but that's not exactly something anyone should be emulating, I think we'd pretty much all agree that "I'm sorry but for security reasons you can't buy groceries wearing a burka" is not an acceptable argument. Security doesn't grant free license to override other people's rights.

Bear in mind that the actual real-world examples of the argument "people shouldn't be able to wear masks in stores because of security risks" have for the most part mostly been examples of security being used as a justification to infringe on religious rights or to block marginalized/disabled people from taking reasonable safety measures to protect themselves from infectious disease.

If you're going to bring up an example of security overriding other concerns, at least bring up an example where security hasn't observably immediately become a slippery slope to infringing on people's rights and excluding them from society. Is "stores can ban you for wearing a mask" supposed to make me more comfortable with websites fingerprinting me? I mean, I know where that argument ends up in the real world, it never ends with balaclavas, we've had that argument in the real world and where it actually ends is with immunocompromised people not being able to buy groceries.

So I'm not sure any of this is really supporting your point. Anonymity should not be punished in physical or virtual spaces, and there are huge debates about de-anonymization, facial recognition, and tracking in both public and private physical spaces and for the most part we don't accept security as a justification for de-anonymization.


That's implausible. Using fingerprinting for fraud detection would only catch someone using different cards on the same machine. Once a card is deemed stolen it stops working, so it's unnecessary for that scenario. That doesn't even get into browsers/plugins that fake fingerprints.

The price is the highest the market will pay. Increasing that price means fewer customers and lower revenue. Fraud is a cost to the business they must pay out of profits, because if they tried to increase prices, demand would drop.


>Using finger printing for fraud detection would only catch someone using different cards on the same machine.

In this context the goal of fingerprinting is to detect requests coming from an attacker. It does not care about the ability to distinguish between individual machines.

>Once a card is deemed stolen it stops working so it's unnecessary for that scenario.

The whole point of automating it is so you can cash out many stolen credit cards. If you only have one you might as well do it manually.

>Increasing that price means few customers lower revenue

Making more revenue doesn't matter if that extra revenue ends up getting eaten by chargebacks.


It can be an aspect of it. For example, if there are suddenly many unique fingerprints making purchases from the same residential IP, that might look suspicious.

Granted, I'm not aware of a lack of fingerprint being penalized. That said, there are products that allow custom rules, in which case anything is possible.

I work for a company in this space. Opinions are my own.


> business who takes credit cards online, wants to try to exclude people who will commit fraud

How bad is it nowadays? Can't you just enforce 3DS2?


>If you care about anything these days, don't use Chrome.

I care about a lot of real world stuff - human rights, wars, the environment, friends etc. I don't care if Chrome knows who I am and tries to show me ads which uBlock then blocks. There are more important things to worry about than privacy geekery.


Famous last words: "There are more important things to worry about than privacy".

If you've read history (and maybe you have, or not), privacy is a human right. When privacy goes away, then everything else goes away. Ask anyone over 60 in Germany or Romania (who was not WITH the army or the police/security services) and they will tell you how nice life is without privacy.

But hey, sure, 1) privacy doesn't matter, 2) you got nothing to hide, etc etc.


Ads are used to manipulate people into doing things they would not otherwise do, which very much affects "real world stuff". Mostly into wasting money on useless crap, but also worse. What ads uBlock can block is limited by Chrome. What "ads" disguised as content you are shown is affected by what information Google collects and lets other people collect. The internet isn't a nerd safe space anymore - what goes on here often affects real people.


I care about accessing the sites I use quickly and efficiently, with a minimum of auth and compatibility dance.

Since Chrome is so common that it's basically guaranteed to have been tested against the site I'm trying to access, I use Chrome.


Hi there, I'm the PM for Cloudflare's challenge platform. I'd love to look into what the cause of the problem is, so you don't see these difficulties.

> Cloudflare detected the high frequency of requests and denials (but not their faulty loop that caused this pattern of requests, of course), and tagged my browser as suspicious.

I can tell you at least that we don't penalize users for this looping behavior, so this wouldn't cause us to see your browser as suspicious. I hope we can dig into this more and uncover the cause of the problem.

Personally, I'm a big Firefox user, and this isn't behavior I see. If there were a widespread Firefox wide issue, automated alerts would trigger and we'd consider this a critical incident.

You can drop me an email at amartinetti at cloudflare if you're interested in troubleshooting.


The cause of the problem is that your software is faulty by design.

1. IP addresses are to be used for packet routing. Certainly not for assigning "behavior scores" to users in the background. IP addresses say nothing about your visitors, my IP address could have been a complete stranger's IP address yesterday.

2. Deciding who can access half the web based on their TLS signature achieves nothing in the long run except reinforce browser monopolies, and goes completely against the spirit of the open web.

I guess now I have to use Chrome for browsing the web from home. Yes, I do run a crawler-like bot as a hobby project; I got what I was asking for. (Funnily enough, it still works if I just emulate Chrome's TLS signature.) But I also have friends who have done absolutely nothing of the sort (no technical skills) and still got caught up in this latest ban wave.

Let's be honest here. Your service has likely caused harm to millions of people who from one day to the next are suddenly blocked from half the WWW; not just nerds, who can get around it one way or another, but real users who just got unlucky and are now potentially blocked from accessing websites required for their daily lives (welcome to the 21st century). This is not a one-time problem; it has been going on for years, and this time it just came too suddenly for too many people. And this kind of harm is a logical conclusion of the heuristics you use for determining who can view a website.

Never mind that it's ridiculous how a single company from outside my country has the power to decide on whether I can use the web or not. That's kind of on website owners unconditionally giving this power to CF anyway.

Now, allow me to return to purchasing proxies from shady sources for myself, so I can keep using Firefox. Thanks and keep up the good work.


I sympathize with your frustration, but you also have to admit that Cloudflare is tasked with an impossible problem: from a sea of requests, identify those that are coming from robots that are disguised as humans.

So there is no perfect solution. You can't use strong identity because a user can share their identity with a robot. You have to use a crappy heuristic that only works most of the time (or tell site owners it's an application-layer problem and use this SaaS solution to solve it).

I mean, you admitted that you run a crawler. Cloudflare has detected that you run a crawler and wants you to prove that you're human to access sites on their network. It actually sounds like their product worked.

In any event, there should probably be better regulation around how this blocking is handled so that users aren't being unjustly blocked. If you want to run a crawler, how do you do it ethically so that you aren't targeted and your traffic blocked? If Cloudflare blocks you from accessing one site should that block extend across their whole network? How long should it last? How do you appeal the block if Cloudflare's heuristics falsely block you? If you're in a life and death situation and need immediate access to medical information and Cloudflare unjustly blocks your access and it causes harm, who's at fault? Etc.


> but you also have to admit that Cloudflare is tasked with an impossible problem

They're not tasked with anything. They choose to sell a bot detection and mitigation platform as a product, and that's a hard business to be in. If they think they can do it, great. If they can't, they shouldn't try.


The thing I don't understand is why all of the blame is being placed on Cloudflare as a company.

Why not place the blame on the people who are configuring Cloudflare to behave in this way?

I'm a happy Cloudflare Enterprise customer, and our DDoS settings are "Off", we don't present captchas to end users, we don't block any traffic, and we've disabled all of Cloudflare's managed rulesets.

It's very possible to use Cloudflare with all of the security features switched off. The features causing the author's issues are features that can be disabled by the site owner. Cloudflare has power over what they recommend as the default settings, but ultimately it's up to the site owner to choose how to configure Cloudflare for their site.

I think there could be a healthy debate around Cloudflare's default account settings, but I'm surprised by the number of people here dismissing the fact (or maybe not aware of the fact?) that all of these are features that can be turned off. The owner of the site chose to keep bot protection, visitor verification and related features turned on.


I agree 100%. While I wouldn't go so far as turning off all of the DDoS settings and managed rulesets (why pay for it then?), you can certainly set the "secure/strict" level to medium or low and still retain benefits.

I'm wondering if it's related to Cloudflare's new/updated Bots features, especially the "Super Bot Fight Mode" feature -- which I believe gets a default setting that is super strict.

As others have mentioned, saner defaults might help, but I guess they want to err on the side of "more secure" vs. a less secure default.


If the "feature" says "block bots", and it is blocking people, then cloudflare is to blame, not the users who enabled the feature.


> Why not place the blame on the people who are configuring Cloudflare to behave in this way?

Sane defaults. Of course everyone would turn DDoS protection on.


So are you declaring nobody should be in that business of bot protection then?


Yes.

Blocking all crawlers except Google bot is itself a problem.

There should not be any bot protection, only abuse (e.g. DDOS) protection. Block disruptive behaviors, not fingerprints.


But they are doing it and succeeding. No product is 100% perfect. The problem is that when it’s not perfect people can ostensibly (and arguably actually) be harmed if they can’t access content on the Cloudflare network. This is why we need more scrutiny around how large internet platforms deploy bot mitigation technology. We don’t need to tell people “sorry just suffer DoS attacks”.


Is only Google allowed to crawl?


Cloudflare is not tasked with anything, they have chosen to take on a task. That that task happens to be impossible does not get them any sympathy for the collateral damage they do while trying.


Why are only humans allowed? Shouldn't we be proactive and accept robots as equals now? We have a history of prejudice against groups, and we seem clueless that we are heading there again.


Have you ever run an open resource with significant traffic before? People are absolutely abusive with their use of public websites and APIs. “This is why we can’t have nice things” is as relevant as ever.

Cloudflare provides a vital service that solves a real problem, one that breaks non-pragmatists' brains.


> that breaks non-pragmatists' brains

Oftentimes when people say this, what they really mean is that they have different opinions about which tradeoffs are tolerable and which tradeoffs aren't.

Captchas are a nightmare for accessibility. Turnstile was designed to solve that problem, but is a nightmare for privacy-oriented and non-standard setups. Getting rid of both systems and blocking based purely on behavior or building entirely new metrics to block on would absolutely be a nightmare for website security.

It's all tradeoffs, but some of those tradeoffs get labeled as "pragmatic" and some of them get labeled as "idealistic" -- mostly just based on the personal values of whoever is making that distinction. The reality is that no matter which direction we go, somebody is going to get the short end of the stick. We all want to minimize harm, but we disagree about who that somebody getting the short end of the stick should be and how short of a stick they should get.

I agree that it's idealistic to claim that we can just let automated agents access any website and that it wouldn't be a nightmare for security. However it is equally idealistic to claim that it is possible to fully secure websites against automated attacks without restricting disabled people, violating user autonomy, or harming the overall health of the open web. I do have sympathy for Cloudflare; they are trying to solve an impossible challenge. That's the key word: it's actually impossible. It's a challenge that can't be solved, we can only do the best we can do and that means accepting tradeoffs both for site security and for accessibility and access.

I disagree with Cloudflare about the exact degree to which solving that challenge justifies and excuses harming the open web and I disagree with Cloudflare's idealistic fantasy that fully solving that challenge is possible without significantly harming the open web. I disagree with some of their product directions and metrics not because I'm idealistic about alternatives but because I'm realistic about the outcomes of what Cloudflare is doing right now.


So block clients that are being abusive, not "bots".


Of course I'd agree that if a robot is following the rules and behaving indistinguishably from a human but maybe just a little more quickly, then it shouldn't be pre-judged (and our detection should accommodate). But here we're talking about robots without agency being e.g. used in botnets to abuse services, or otherwise not following the rules.


All clients follow the rules if you enforce them. Break rate limit and get a timeout. Settle your payment before you send the product, using Bitcoin instead of Visa, which is not able to do this.


You’re so close to getting it.

  > Break rate limit and get a timeout
And what exactly should the rate limit key be? From your username I’m sure you are aware that it can’t be the IP address.

It sounds like you’re coming at this from an authenticated API perspective where client identity is a given and anonymous access is the exception. The web inverts this, making everything much more difficult and necessitating the sort of fingerprinting that is at issue in this article and I presume you are opposed to.


Isn't the point that Cloudflare is essentially enforcing the rules, then?


You're being a little dramatic. It's incredibly unlikely that millions of innocent users have been blocked, and unless you have data to the contrary you shouldn't make such a claim.

You know what else is harmful to the concept of the open internet? The enormous malicious botnets and other endemic problems that require a solution like Cloudflare.


Data point of N+1, but I haven't been able to place online orders at Petco for about a year now because they use some Cloudflare feature that hates my browser + home internet connection. Other Cloudflare-proxied sites seem unaffected, and I'm not doing any botting/crawling, nor do I have any IoT devices on my home network. There's not enough information provided to be able to do any substantive troubleshooting.

This became irritating enough that it caused two side effects: (a) I stopped shopping at Petco, and (b) I moved a pile of sites off of Cloudflare and stopped recommending them, and now sometimes recommend against them.

Cloudflare is still a good, quick, cheap option for sites that receive unusual volumes of malicious traffic, so I'll still recommend them as a solution to some problems. But, they're not a good default.


So you're mad at Cloudflare because Petco enabled a feature that blocked you? If Petco had developed something in-house that blocked you, would you be mad at the compiler?


Cloudflare offers this service. If Cloudflare offered a service that enabled Petco to do something amazing would you be grateful to Cloudflare? If Cloudflare advertised on its homepage about blocking a DDOS attack on a website would you say, "meh, Cloudflare wasn't responsible for blocking that attack, they only provided a feature. The website blocked the attack."? If not, then why should Cloudflare be immune from criticism when the opposite happens?

Cloudflare offered Petco the features to do this as a product and makes money off of Petco's usage of those features. I do sympathize with the perspective that ultimately tools need to be somewhat neutral and it can be dangerous to pass responsibility around. But "tools are neutral" can also be taken to an absurd degree. This isn't 5 levels of indirection here, and it's not Petco going and installing a neutral piece of software that they downloaded from GitHub. Petco is a client. They're turning on toggles that Cloudflare built into their user interface and advertises as features.

There's some level of moral accountability there for how those features are abused. I'm not saying it should be illegal, I'm not saying it shouldn't be allowed, but Cloudflare is definitely at least eligible for criticism. This is a product, it's not Petco abusing Cloudflare's infrastructure; they're using the product as intended and advertised.


...no, I've changed my recommendations for Cloudflare because it may prevent ordinary users from using a site, and insufficient information is provided for troubleshooting purposes, and those users are likely not going to go to extraordinary lengths to report the problem. Even if they do report it, the site won't be able to troubleshoot it either. So, if you don't need it, you're probably better off without it.


> It's incredibly unlikely that millions of innocent users have been blocked

Is there a 'town square' where we can talk about being presented captchas and similar things from 3rd party intermediates?

I think it's incredibly likely that millions of hours have been wasted on such challenges.


On that note...

https://www.folklore.org/StoryView.py?project=Macintosh&stor...

"Well, let's say you can shave 10 seconds off of the boot time. Multiply that by five million users and thats 50 million seconds, every single day. Over a year, that's probably dozens of lifetimes. So if you make it boot ten seconds faster, you've saved a dozen lives. That's really worth it, don't you think?"

Imagine if people still thought like this about computers and software.


Yes. And cookie splash screens! I admire GDPR's intention, but hasn't it been a massive human time sink?

Not to take away from your point, just that it's all a hindrance.


That's more on the websites that track your personal data for non-essential purposes. No tracking means no banners are necessary.


Those are finer points; my point is just about people wishing to view web pages.


I don't know that most web admins can tell if they should float a banner, so vague is the law.

Technically, I think if you have the default Apache logging configured and you read those logs, you should probably float that banner.


I believe you're mistaken. GDPR allows you to record IP addresses for normal operation of a site, which specifically includes logging. No banner is required.

GDPR is not "vague" about this; perhaps you haven't read it (as laws go, it's pretty easy to read).


It's easy to read because it's vague, and it's going to allow some regulator to decide whether my use of IP addresses constitutes "normal operation." Puts a hell of a lot of trust in government officials to decide who is worthy of prosecution.

It reminds me of the war on drugs in a lot of ways.


@adammartinetti: maybe you could consider developing a new product where you display a GDPR consent banner once, and then those settings apply to all Cloudflare-proxied websites (by passing this consent information as an additional header to the proxied site).


Sounds inferior to the "no cookies no banner" solution.

The GDPR does not mandate gratuitous and pointless personalised spying, which is the only case that requires consent. Normal operations (say a shop collecting payment details and shipping address to fulfil an order) do not require a consent banner.


Those can at least be blocked with ad blockers and/or disabling JS.


ReCAPTCHA was designed with this in mind: given that we had the need to distinguish humans from bots, it presents problems that are hard for bots to solve, where the resulting output is valuable. So the time consumed isn't wasted.


It's wasted from the perspective of the end user.


Not when the end user turns around and uses Google Maps which is now populated with higher quality fine feature information due to the training of the machine learning system on what traffic controls look like.


Valuable to whom?


I don't get your second point. Two things can be harmful to the open web at once. Cloudflare is definitely not taking the right approach, which damages the open web alongside botnets. Also, botnet owners are for some reason extraordinarily nerdy and smart, so they will probably find a way to fool CF every other month. It's a cat and mouse game for them, while actively harming everyone else, both with their botnets and with the increased aggressiveness of CF caused by their incorrect solution.


>You know what else is harmful to the concept of the open internet? The enormous malicious botnets and other endemic problems that require a solution like CloudFlare.

You know what's infinitely worse? Monopolies.

Half these problems can be fixed by banning certain parts of the world. It's just politically shifted out of the Overton window to do that so CF profits greatly.


Every user that makes their way here and finds and posts on this thread probably represents 1,000,000-plus normal users.

An open web is open for everyone/thing, not just classes of beings you select. Bots and users can both be malicious and both can be positive.


I agree with the premise that most people don't know how to identity or visibly complain about a given technical problem, and so an HN thread with N anecdotes about the problem likely corresponds to N * F actual amount of real-world incidents, for some value of F > 1.... but claiming it's a factor of a million without any backing evidence is absolutely an overreach.

> An open web is open for everyone/thing not just classes of beings you select. Bots and users can both be malicious and both can be positive.

This I agree with. I run an archiver ~monthly on a subset of my month's browsing history, and I'd hate if that got me blacklisted from Cloudflare-backed sites for a benign purpose. (See also the idea of remote attestation)


That's a pretty good idea. Do you randomly sample, or just exclude some domains? Is there some tool out there that does it for you?


Assembling the list of links to archive is a manual process--I just log them in an Obsidian notebook with a category and summary, and I later post it to my blog. (I don't really think other people care, it's more for me to be able to find past things I've found interesting.)

For the archival process I use ArchiveBox[1] running as a container on my NAS; I just grep through the note for `http|https` and feed the resulting list to the archiver. For everything not-hackernews I set the depth to 1, but for HN threads I do 2 so I grab whatever people may have linked in the comments.

I think there's ways to hook into like, ALL Firefox history or saved posts on reddit, but that's way heavier than what I care for.

[1]: https://archivebox.io/


Interesting! Firefox history is just SQLite. I might do something like take all non-search URLs and archive them once a month or so. Thanks for the inspiration.


Cloudflare blocks me every time I open an incognito window. No VPN; just having no cookie for a domain automatically means I'm a bot…


Right. It feels silly to point out specific things when just about everything about these verification checks deployed by every megacorp throws you into a universe of suffering.

If there's a place to start, it would be with eliminating the infinite challenge loops. Bad enough that IP ranges get outright blocked. Bad enough that I have to decide whether or not that blurred sliver of the edge of the wheel+shadow constitutes being part of the bicycle. Not to mention the humanitarian betrayal of the absolute highest form: farming free human labor to train AI models from people who are simply trying to browse the $#@%ing internet.


You're going by the specified, designed use cases of those technologies.

Every spec is a three-edged sword: the spec, the intent of the spec, and the use of the spec in the wild.

In practice, Cloudflare does a far-better-than-average job of gluing together some heuristics in an unspec'd way to filter traffic. It sucks because you can't plan around it, but that's rather the point, because the malicious actors are trying to plan around it also.

(ETA: Hacker News rate-limited this post. In theory, I could have set up a sock-puppet to try and work around that, but then they would catch that too and I'd be out two accounts. So I just waited out the limit. Measure and counter-measure. ;) ).


Why does HN have such aggressive and seemingly illogical post rate limits anyway?

Is it a theory about increasing quality of communications? It can't be a tech bottleneck.


I don't have this information first hand, but that's my assumption.

Dang was handing them out like candy on January 6th. And I think he was justified in doing so; there was a coup in progress in the United States, so discourse here went completely off the rails.

But it's a very easy to implement method of throttling volume, which helps improve the conversation by minimizing opportunities for people to gish-gallop. You can email and ask to have it removed; I have refrained from doing so because it serves as a gentle reminder not to get dragged down in the lowest common denominator of what passes for discourse on the site from time to time.


> Let's be honest here. Your service has likely caused millions of people harm who one day to the other are suddenly blocked from half the WWW

If this were true, Cloudflare wouldn't be a good product used by a lot of sites.


Excluding people who are poor, weird, privacy-conscious or otherwise inconvenient from your site is a feature not a bug, especially when you can pretend it's an accident.


You're assuming website owners are aware of the issue… How would they know? Cloudflare is just telling them it blocked a bunch of bots.


They have before-and-after analytics...


They will presume the before traffic was bots? Unless they also see a drop in sales or ads they won't notice.


I mean if sales didn't drop why would they care?


If it's a website that doesn't sell anything, they won't notice.


That's a false dichotomy. It is both a good and a bad product, depending on perspective.

To a large firm, 1% failure is acceptable. To the affected 1%, it's a disaster. Consider wrongful imprisonment as an example.

The penalties for being excluded from the web are fairly severe, and looking to become more so. CF is fairly lean; there is no available human to operate an escape hatch for when things go wrong.

When I'm king, every block or account suspension must provide a phone number, and hang the inefficiency.


And even when it doesn't block you completely, it delays website loading, makes you jump through frustrating captchas, etc.

It's probably third in the list of frustrating web behaviors in the past couple of years (behind GDPR popups and registration/paywalls that seem to have gotten much worse recently).

And somehow there are some sites that I get CF delay walls on every time I visit.

This feature is utterly broken for a good web experience; it pushes users away from sites which use it.

Every time that "checking your browser" page comes up for a legitimate user should be considered a failure. Sure, it can maybe happen a few times in a thousand, but the feature is utterly broken if it comes up every time I visit the same site from the same browser not in private mode.


It's worth noting that the websites that you are visiting chose Cloudflare, and have enabled the features that irritate you. They have browser integrity enabled, have bot protection enabled, maybe turned the security level up (gitlab famously is a nuisance because they lean heavily on Cloudflare for protection). Sometimes they've wholly barred VPNs or entire geographic areas! And that is entirely the decision of website operators, and note that they did all of this before Cloudflare came along.

Cloudflare's customers are website operators, not you the end user. Those website operators seem pretty pleased with the service, so clearly they are doing a good job for the people who they are building it for.


And every Cloudflare customer is a company I won't do business with (unless there is absolutely no way around it)

Cloudflare is running the single biggest, most blatant man-in-the-middle attack in history, and far too many people are happy about it


Agreed. The same goes for the 3rd party "data privacy" popups which simply hide a long list of opt-outs several layers deep in a Vendors list. I refuse to use such sites and I let them know by email.


In what way is it an attack? (I know what a mitm is, I'm not asking you to explain that - I'm pretty conversant in the concept of a proxy, I'm asking you to explain why it's an attack specifically)


they block and/or slowdown vast swaths of the internet

if that's not an "attack", I don't know what is


I don't get it. They offer a service that people choose to sign up for and take active steps to use. I don't see how that's an attack. Honestly, I'm still trying to understand who is being attacked.

Like is it an attack on the site owner - are you saying cloudflare is extorting them or something? That seems unlikely but I agree that would be a form of attack... it also doesn't seem to be what you're saying.

Is it an attack on the user of the website because the website owner successfully denies visitors it does not want? Does that mean that login credentials are a form of attack too? Would an on-prem load balancer or WAF that dropped all traffic from a region or matching patterns still be an attack?

It just doesn't make sense that it's an attack.


They block and/or slowdown over 20% of the internet

How can you not see that as anything but an "attack"?


The cloudflare customers who benefit from the bot protection do not see it as an attack. On the contrary, they see it as a defense from an attack.

Also, it’s quite disingenuous to label cloudflare as only slowing things down. One of their primary functions is a global CDN/cache which significantly speeds up otherwise bandwidth constrained sites.


A man-in-the-middle attack typically implies an unwanted third party, which in this case is not true, since Cloudflare is explicitly and voluntarily trusted by the host server. It wouldn't be all that different if the web server had developed the browser-integrity checks itself.


>an unwanted third party

This is precisely what Cloudflare is doing to end users - causing problems like OP (and myriad others) experience by slowing down and/or blocking major chunks of the internet


I understand that it may be viewed as unpleasant, but ultimately, if you install a proxy on your end that the server does not like (say an ad-blocker), I don't think it would be fair for the server to say it's suffering a MITM attack. Likewise, even if the client is not happy with the third party the server is requesting, it still doesn't make sense to call it a MITM, IMO.


And this is precisely why I don't bother reporting Cloudflare's failures to site operators anymore. I used to do it, when it was pretty infrequent. Site operators were usually concerned that something was blocking customers, but most were clueless about what was causing it or how to fix it.

Eventually I gave up. I don't even bother with their captchas or other stupid human tricks anymore. Whenever Cloudflare gets between me and the site I'm trying to use, I move on and shop somewhere else. Life's too short for this.


Why not take a screenshot of the CF error and send it to the website owner? It would freak me out if I thought a significant number of my website's users were being blocked by CF.


I’ve done this before, and the response is always “this is the first time I’ve seen this” and “you must be a bot operator”.


+1 to anybody who creates a site to name and shame CF customers who block legitimate traffic. For a few months now I've been taking screenshots every time this happens, but with no end goal. Complaining to the individual site owners feels like a lifetime commitment, and there are virtually none I need that badly.


I have done that numerous times. Even sent a screen recording of the Cloudflare spinner of hell. The response is always the same: you must be running some shady software on your machine.

Cloudflare is acting as judge and executioner, and site owners never accept that the product may be faulty.


They will just tell you to use unmodified Chrome.

And soon with Web Integrity API they may start telling you to use Chrome on Windows or MacOS, rendering Linux completely unusable.


The work needed to maybe get it past outsourced customer support is not at all worth the effort for any site I don’t actually need to use


How do you send them that when you can't access the contact form and/or contact information on their site because Cloudflare blocks it? (assuming a normal visitor, not someone who knows about whois etc.)


Send it to the domain contact from WHOIS information.


Your alt solution is what? Everyone should build their shit to handle millions of TB/s of DoS traffic?


Block the countries it comes from?


This is what Cloudflare already does, and it's hellish for users.


It sure would be nice if there was a reason DoS could only come from countries other than those of your customers/users. But that's not the case.


This is a very nasty comment. I was wondering if I could find some things that could lead to an exception.

But I'm pretty sure that millions of users aren't using stuff like w3m pager ( https://news.ycombinator.com/item?id=34175754 )

We're all technical here; we are the edge cases. We use exotic software/combos. Let's not get carried away here.

The Cloudflare PM uses Firefox, I sometimes use Firefox, and I don't notice any difference (concerning this use-case, at least).

If you want help, perhaps describe the actual use-case that is blocking you to him. He shared his email.

- country

- software ( VPN, ... )

- browser

- OS

- traceid

- ...

Either way, buying shady proxies as you mentioned is already a warning flag.

While using Firefox is not :)


Anecdotally... I use Firefox and have noticed the Cloudflare interception pages verifying I'm human appearing more often recently. Usually it's all automatic and isn't a big deal, but I have noticed an increase in how often I see these over the past week.


I use a VPN 99% of the time and I definitely see more Cloudflare on my systems (as opposed to when I'm bored and surfing in the library), including the "checking" screen. It does seem to be a lot better than last year, though, when it seemed I was getting a captcha every time I turned around.


I have the same issue, and often just reconnecting through a different node/city resolves it. Annoying as hell, but sometimes the problem is that nodes in the US show as being in the EU for some reason.


Same for me, although I have uBlock and canvas blocking, and I have 3rd-party cookies blocked.

2-3x per day I get some sort of "click here if you're a human" thing from Cloudflare.


Same for me


Also anecdotally, I use Firefox and I haven't noticed an uptick in the amount of CAPTCHAs I need to solve. I don't even see the "connection secure" page.

Could it have something to do with that ticket extension I'm using (Privacy Pass, looks like it's called)? I don't know if it does anything.


Adam, the problem I'm running into is due to the IP proxy I normally use having been moved from ARIN to RIPE after an ownership change at the hosting datacenter, which is still in NYC. As a result I show up as coming from the UK, it looks like, when I access Cloudflare-protected sites in the US, and I'm running into more and more of them: the local newspaper, the grocery store, credit card companies, etc. It seems that Cloudflare's IPv6 geolocation is broken, and it interferes even if you're coming in over IPv4. This is just asking for trouble, if you ask me.

Troubleshooting done. If it's any consolation, I don't think Cloudflare is the only offender. Geolocation is a crappy idea to begin with, if you ask me.


FWIW I see this with Firefox when I route my traffic through ProtonVPN.

It could be caused by someone else's bad behavior on the VPN but I'd hazard a guess that it's more than that.


I've definitely seen this from time to time. I used to work for an ISP and we would occasionally, in the office, get "Your system is sending too many automated requests" from Google. Usually one of our customers had gone off the rails with some sort of amateur scraping, but this was always a pain to debug. I think we talked with Google's NOC and just had our rate limit increased or something like that.


Do you see the challenge and then you're able to pass? Or does the challenge loop forever?


I see the challenge almost always and most of the time it passes after I manually interact with it but I'll get a looping challenge every once in a while and it persists until I change VPN servers.

I don't think I've seen it for a week or two now but I've certainly encountered it in the past for spans where it'd occur at a frequency of maybe once every two or three days and then go away for a while.


> I'd love to look into what the cause of the problem is

No, you don't. Tor Browser is constantly blocked by Cloudflare and the captchas cannot be solved. And you know it.


I don't know which flavor of Firefox you use, but any reasonably tuned browser (in the privacy sense) fails your systems. I literally haven't had a single instance of passing them without handing over a pixel-perfect fingerprint.


Would you be able to send me a rayID of a failed challenge so I can take a look? It sounds like you can use https://gitlab.com/users/sign_in to generate one.

You can either reply in the comments with the ID (no PII), or email me at amartinetti at cloudflare.com and I'd love to dig into it.

We're building Turnstile because we want to make challenges a better system than CAPTCHA. It sounds like for you it's worse, and we want to fix that.


7f3bfdf6bee5b9ea

Here is one. I'm not on my PC, so I used a privacy-enhanced fork of mobile Chrome. Enabled WebGL, WebRTC and WASM. Disabled all the fingerprint-resistance features I easily could (the only things still messing with your systems could be the HTTP referrer or the timezone).

Perhaps it's a DNS-level issue? Does Cloudflare use any Google-related APIs to provide the integrity check?


> Perhaps it's a DNS-level issue?

I run dnscrypt-proxy and I have seen Cloudflare-protected sites reject me based on that. I haven't been able to pinpoint which upstream resolver provider causes it, as I have it set to cycle automatically, and it's intermittent enough that I haven't bothered getting to the bottom of it (i.e. it doesn't happen for several weeks and then happens for 15 minutes or so before resolving).

Perhaps it's one of the default DoH providers being used by Firefox?

https://wiki.mozilla.org/Security/DOH-resolver-policy#Confor...

Presumably not Cloudflare themselves, though.

I'm not a Chrome user, but does Chrome automatically use Google DNS these days?


I was using my PiHole and switched to NextDNS to check. Didn't work.


Not OP, but GitLab always cycles for me on LibreWolf, even with "enhanced tracking protection" turned off. Likely because I disable WebGL?

7f3b42d2bee22efb


Could it also be web workers, if you're restricting those? Turnstile won't even load if web workers are disabled; it has no backup logic for that scenario.
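(The detection a page could do itself is trivial; a sketch, where "challenge-note" is a made-up placeholder element:)

  // Check for the APIs the challenge widget needs before injecting it,
  // and show *some* feedback instead of an eternal spinner.
  const missing: string[] = [];
  if (typeof Worker === "undefined") missing.push("Web Workers");
  if (!document.createElement("canvas").getContext("webgl")) missing.push("WebGL");

  if (missing.length > 0) {
    // "challenge-note" is a hypothetical element; the wording is up to the site.
    document.getElementById("challenge-note")!.textContent =
      `Verification needs: ${missing.join(", ")}. Enable them and reload.`;
  } else {
    // ...safe to load the challenge script here...
  }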

I can get into the linked site but only if I turn on web workers (I also have WebGL turned off), and while I don't have the RayIDs on me, I have run into scenarios where Turnstile refuses to let me on websites before. I'll add a second vote on here that Turnstile has been worse for me than the system it replaced.

It's kind of wild to me that Turnstile doesn't seem to have a fallback. Users can specify one I guess? But they're not required to, and Cloudflare does have some responsibility for giving website operators the option to just turn off alternate challenges.

The end result is that if something goes wrong while Turnstile is loading, it's just... done. It just sits there. No captcha, no advice, no feedback, no error message, we couldn't load the code we wanted and now you get to look at a spinner for eternity with no indication of whether you're blocked because of a browser config or because you don't have cookies turned on or what. And captchas have a ton of problems, but Turnstile is openly designed to test for browser API presence, it's openly designed to use black-box AIs to test how similar your browser is to other people's who have passed before. It's no wonder at all to me that it's tougher on less common browser setups. I'm grateful there are people from Cloudflare willing to help debug these issues, and I don't doubt Cloudflare's intentions, but if I was trying to build a system to encourage browser homogeneity, Turnstile is what I would build.

I used to resent being asked to prove I wasn't a robot. Now I resent not even being given the option to prove I'm not a robot.


Enabled them and restarted, still nothing (although there's a chance I messed it up somewhere, I haven't tested it thoroughly). But honestly, I don't really care given that 99% of the web works otherwise.


7f3d42342ecb4d7f


I can’t get through any cloudflare challenges on a standard iPhone when I use iCloud Relay. Your product is as anti-user as it gets. It’s obvious you avoid user testing and instead look the other way because claiming to have solved the bot problem is just too profitable.


I see lots of challenges on Firefox, but I attribute this to my use of container tabs. Is that a reasonable assumption?


Anecdotally I notice this same issue. In your Firefox install do you have `resistFingerprinting` turned on, and do you have Firefox's anti-tracking protections turned on? It's possible if you're using a default install and if you're not using VPNs that you might never see a difference between behaviors. But that's only a guess.

My experience is that Firefox as a policy is not blocked, but if anything about my setup looks sketchy (I'm on a VPN, I have Javascript disabled, I'm blocking cookies, etc...) being on Firefox seems to make Cloudflare a lot less "tolerant" for lack of a better word.

I don't think Cloudflare has a policy against Firefox, but I do vaguely suspect that certain behaviors that wouldn't trigger blocks for Chrome do trigger blocks for Firefox (particularly if it's hardened). I don't have any hard data to back that up, maybe it's my imagination -- but it is what I personally notice.
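To illustrate why "hardened" is itself detectable, here's a sketch of a few values any fingerprint script can read. With resistFingerprinting on they come back suspiciously uniform; the spoofed values noted in the comments are from memory and vary by Firefox version:

  // A few surfaces fingerprint scripts probe. The inline notes are roughly
  // what a hardened profile reports; the normalization itself is a signal.
  const probe = {
    timezone: Intl.DateTimeFormat().resolvedOptions().timeZone, // RFP: "UTC"
    cores: navigator.hardwareConcurrency,                       // RFP: fixed low value
    screenSize: `${screen.width}x${screen.height}`,             // RFP: rounded steps
    webgl: !!document.createElement("canvas").getContext("webgl"), // often disabled entirely
  };
  console.log(JSON.stringify(probe));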


I think your automated alerts are probably tuned to too low a sensitivity (understandably, because it's probably an impossible scale to handle if they're supposed to catch false positives). FWIW I've seen similar for a short period of time, and I know people who've had it more persistently.

But my biggest practical complaint at the moment with Cloudflare is that it intermittently inserts captchas into the JSON responses sent by Roundcube webmail. Pretty amazing.

(The webmail server in question is hosted on a uni network that pays for Cloudflare to sit between itself and the internet, so as indirect Cloudflare "customers" we have no support channel. Hooray for scale.)


I've seen this behavior on sites with CSPs that break the challenge; the challenge scripts somehow get loaded from cache and cause failed requests.

This somehow even persisted into the browser's incognito mode, and I had to use an entirely different browser. This wasn't on a small unknown site, either.

(It looks like pinned CSPs are a dead standard, but did anyone implement it?)


Is there an easy way to report false positives instead of having to attract the attention of an employee on social media?


What are cloudflare’s plans regarding browser attestation?


I have noticed that on Starlink some sites behind CF go into "prove you are human" loops that are impassable.

What causes such loops? Just a challenge over and over.


> Just a challenge over and over.

It must be intentional. Not unlike the endless loop of frustratingly slow-fading reCAPTCHA challenges that don't go anywhere. The user gives up after some time, but doesn't see any explicit error or page blocking their access. I imagine it must be quite effective.


In my opinion, the admins don't enable the option to outright ban lower-trust users (or they set the threshold high), so CF tries over and over again instead.


There's a patent for that. The captcha loop is basically a honeypot for bots… and privacy-conscious legitimate folks.


I've had the exact same problem for a while. Here are some of the sites I've been unable to access (found by searching for "just a moment" in my browser history):

- https://gitlab.com/users/sign_in

- https://steamdb.info/login/

- https://www.zabbix.com/forum/

- https://casetext.com/

- https://namemc.com/login

- https://spinroot.com/

- https://camelcamelcamel.com/

It's really annoying and Cloudflare is apparently doing nothing to fix it as this has been going on for months if not years. I guess Cloudflare just hates the open web and really wants to enforce Chrome/Chromium/Blink hegemony.


Would you be willing to share a rayID you see during one of these looping challenges? I'm the PM for Cloudflare's challenge platform, and we'd love to look into this. RayIDs contain no PII so you can share publicly, or feel free to drop me an email at amartinetti at cloudflare.

We'll also release a reporting mechanism soon, so in the future you can let us know when you see these issues and we can react to them quickly.


Such a classic and incredibly annoying SaaS PM move. Pinky-promise that you mean well, pretend to be invested in the issue, ask customers to supply evidence and say you'll look into it, followed by radio silence and no follow up whatsoever.

Incidentally, another Cloudflare PM for Pages asked me to do the same thing--I shared my account ID, the request, the problem, timestamps, etc...never heard back ever, request went straight into the void.


Yup. It's all show.

A service has injected itself between you and your goal, it's going to periodically impede you from reaching that goal and then lie to you about why, all while making money off of the arrangement.


Isn't it more like the owner of the website has intentionally gone out of their way to add a service between you and the website, to solve issues the website owner feels are more important than you?


Here's some loop samples;

- Gitlab; Ray ID: 7f3961b4ec46c443

- Zabbix; Ray ID: 7f39624d982bc32e

- NameMC; Ray ID: 7f3962e68d251871

- Camelcamelcamel; Ray ID: 7f3962eb9cbb421f

I can easily recreate at least the never-ending loop by flipping on uBlock Origin's 3rd-party script and 3rd-party frame blocking, which matches their recommended medium settings.


Thanks so much to you and everyone else who's supplied these. I'm collecting them now, and the team is looking into this.


It would be nice, once the investigation is concluded, if you guys posted the findings on the cloudflare blog. Otherwise it would just feel like a "your call is very important to us, please hold" kind of situation.


I think this is fair! I can promise a public blog update in the next 90 days that includes a progress update on the work we're doing now to reduce real humans being blocked and announcing the feedback form users can click on to easily let us know when there's a problem.


Would you be able to clarify your comment about uBlock Origin? Cloudflare's challenge page (like any captcha provider's) is a third-party script. If I enable these settings I don't see the challenge load at all. Are you enabling uBlock Origin before entering the challenge or sometime later?



Here's a handful:

- 7f395b5ddfe43a54

- 7f395ca09bfa3a54

- 7f395d8afaf73a54

- 7f395f075e33690d

- 7f396102afef35fd


Thanks for the examples! Would you be able to share browser and extension information with me? If you don't want to share publicly I've dropped my email in this thread.



I also cannot access my VPS provider when using Firefox.

ray id 7f3a169d4e630306

I previously had the same problem with ungoogled-chromium as well (regular chromium worked), but I guess it works now after 2-3 loops.


Would you be able to drop me an email at amartinetti at cloudflare dot com with more information on your setup? Some of the signals we're getting from your browser don't seem to match what we'd expect to see. We'd love to better understand what's causing the mismatch so we can improve our logic.


all from Opera Mini:

- https://gitlab.com/users/sign_in 7f3e45c3cebfb90f

- https://steamdb.info/login/ 7f3e4a04bf7a0e39

- https://www.zabbix.com/forum/ 7f3e4b681f8f1cc6

- https://casetext.com/ 7f3e4cab4af40b05

- https://namemc.com/login 7f3e4debdf6cb7f1

- https://spinroot.com/ loads normally, no delay or blocking

- https://camelcamelcamel.com/ loads normally, no delay or blocking

Adammartinetti, I appreciate your interest in doing this, but I would love to hear that CF maintains a giant whiteboard in the developer area with the name of every TLS 1.3-capable web browser known to mankind (the same data on a Group Policy-enforced internal home page would be even better), to reinforce the idea that it takes more than Google to make the world go round.

Personally, I'll add myself to the list of people who think you've created a game you can never win, and thus shouldn't be playing.


gitlab 7f39759e1abe1bce

casetext 7f39762f693733e4

steam 7f397694995aa3b7

all over firefox


- Gitlab: 7f39707d0fa023af

- Zabbix: 7f3970eabe8ff196

- SteamDB: 7f396f534b0400d2

- Casetext: (works)

- NameMC: 7f3971a01a22d5a8

- Spinroot: (works)

- Camelcamelcamel: (works)


I'd love to get more information from you on this. We don't see any suspicious signals from these attempts, and it looks like they were completed 100% successfully from our perspective. You can drop me an email at amartinetti at cloudflare dot com.


It happens to me all the time, and it has been going on for years, but it's getting noticeably worse over time. One way or another you have to pay to use the web, be it losing access because of your strict privacy settings or paying by giving away your privacy. There's no win here.


I haven't had any problems on Waterfox. However, it is absurd to me that I need JavaScript to simply visit a website anymore.


Sadly, the fact a given site works for you or me is no guarantee it works for someone else.

These bot detection systems tend to use all manner of imprecise statistical heuristics and weird fingerprinting.

Perhaps AegirLeet has a graphics card that a popular web scraper pretends to have. Maybe they're in a suspicious timezone. Maybe they've installed a font usually only found on a different operating system. Maybe I'm never blocked because I have an excellent IP reputation, due to regular visits to approved websites.


Fingerprinting is scarily accurate now, strangely. https://fingerprint.com/


Somehow, it doesn't matter. Likewise, the fact that the author shared non-repudiable identification information with the site didn't matter either; he was classified as a robot anyway.


As a data point, enabling privacy.resistFingerprinting on Firefox defeats this website.


Yeah; the thing is that invisible pixels and cookie tracking are way in the past, and unless you're using something like Tor (and not changing the resolution), they know who you are. I mean, you can still block ads and cookies, but I figure they really do know who you are.


that’s because websites have evolved into web apps.


Most websites haven’t done that


Yeah, GitLab also blocks me from logging in (via its Cloudflare use). It did so even when we paid for it. We no longer do (for other reasons, but anyway, good riddance).


Users want the Chrome hegemony and don't care about the open web or Firefox. It's the #1 browser on desktop even though it doesn't come with the OS. Windows comes with Edge and Macs come with Safari. Users have to go download Chrome.


All browsers so far have come and gone in popularity. Even when it seemed unimaginable.


But today everything but Firefox is Chromium.

That's a little different.


And 20 years ago everything was IE (at >90% penetration)


The problem is Chromium, not Chrome.

You have the illusion of choice; that's what I'm talking about, and that's different.


Firefox entered into a per-install contract with Google back when it "did no evil", while Google, IIRC, was secretly building its own browser.

Browser engines are now open source a lot more than they were.

I don’t think it’s about illusion of choice as much as some browsers actively working to degooglify themselves from Chromium and maintain it.

Some browsers are maintaining their own forks, others aren’t.


I've noticed the same issue for years. The Microsoft acquisition pushed me from Github to GitLab. CloudFlare pushed me back.


If only there was some open standard for browsers to verify that a real human is visiting a website, so that website owners wouldn't have to rely on bespoke hacks that only work in chrome.


This would be great if it didn't have any downsides. China has a system like that: a QR code to log in everywhere. Everything is linked to your phone number, which is issued only after taking a picture of you and your official ID.

We are gonna have to live in a slightly bot-rich society to keep this at bay.

It starts with browser control, and it ends with needing human verification to ssh into a server that you own. Let's just build better security.


The problem isn't that the hack only works in Chrome, it's that the system being proposed is inherently terrible regardless of how it's implemented.

There is no such thing as a reliable standard for browsers to verify that users are human that does not harm the open web or threaten user autonomy and accessibility. Every single accessibility standard and user choice about extensions and access is abusable by malicious actors, and every security measure to block abuse of automated scraping or access also blocks valid use cases.

Making it a web standard won't change that fact.


Yes, an open standard that any browser could use to prove human interaction would be great. It's also impossible, of course; all attempts so far lock in specific software or hardware stacks and then pretend that bots can't use those systems, guaranteeing both false negatives and false positives.


They're booing, but you know you're right. ;)


I see the "are you human" on there with a click, but no looping, it goes straight to the website


I can access them all fine using Firefox on Android.


Any time a large portion of internet traffic is controlled by a single source, it brings problems like this with it. All Cloudflare has to do is arbitrarily decide who can and can't use the internet, and effectively their word becomes law. Like most things, it starts with an innocent premise (e.g. "an easy way to stop bad actors") and ends up extended to any number of arbitrary things. Worse, the argument from privacy advocates rings hollow, because defending privacy means you have to allow Bad People (TM). The average drooler using the internet cannot understand the nuance. Even the most innocent of cases (a bad commit getting merged) can bring down the internet. It has happened before with Cloudflare.

Companies like Cloudflare, Google, Meta, etc are the reason anti-trust law exists. Unfortunately, it appears there is no one with any power that is willing to use the laws for their purpose. The internet in 20 years will be nothing like we've seen before. That's not a good thing.


You don't pay Cloudflare money, nor are you the customer. The customers are the sites that pay them money to protect their infrastructure. The customers have many choices in CDN/WAF providers, and Cloudflare isn't even the largest; Akamai is. The fastest-growing CDN is CloudFront. There is healthy competition, so why would antitrust laws apply?


Cloudflare controls about 19% of websites. Of the websites that use a CDN, 80% of them use Cloudflare [0] (in a note at the bottom of link).

[0] https://community.cloudflare.com/t/statistically-speaking-wh...


That's traffic by bandwidth/number of users. Akamai powers Apple, Adobe, Microsoft, the US government, Walmart, and a bunch of the major streamers.


Maybe some kind of public nuisance law applies, then? Akamai knows its place and doesn't tell me to go pound sand when I'm trying to read somebody's obituary. Incidentally:

legacy.com

Ray ID: 7f3e7bad3afbb731

That's the difference to me.


Everyone forgets that websites become almost unusable because of crawlers and bots.

Website owners specifically choose Cloudflare to protect against this; it's not forced upon them by Cloudflare.


And who defines what an allowed crawler is? Is Google the only company allowed to crawl the web? Isn't that the definition of a monopoly?

Could anybody still create a new search engine nowadays?


A valid crawler is one that follows robots.txt.


That's not going to be enough to pass Cloudflare.




Nice links!

But Cloudflare references robots.txt a lot (which I mentioned before: respect robots.txt).

Additionally, they solve the authentication problem here. As a website owner who got hit by bad crawlers (ones that spoofed user agents), I just whitelisted Google's IPs and blocked all the other crawlers.

It seems that cloudflare is actually fixing this problem and making competition for Google possible here.

I recall an article saying that Bing actually circumvented robots.txt a bit, because site owners were allowing only Google and blocking all the rest, Bing included, which gave Google an unfair advantage (searched for it, couldn't find it).

Similar article to highlight the issue: https://www.fastcompany.com/90709672/the-little-known-reason...

Our opinions about this seem to differ severely. Cloudflare actually enables good bots to start competing (while respecting robots.txt).


>and honor robots.txt.


Website owner complains elsewhere in this topic of blocked users (country-specific):

https://news.ycombinator.com/item?id=37050774


That's not a lot of information for such a complex subject.

E.g., if you live in a dictatorship and use a VPN, your traffic is lumped together with a lot of other people's.

The website owner can disable Cloudflare's checks, and that will leave their site unprotected. That choice is up to the website owner, no?


It seems like a lot of people are pretending that because there's a contractual relationship between Cloudflare and many websites, Cloudflare can't cause outsized harms to other parties.

It really sucks for a third party to claim you're a bot and as a result you lose access to resources. And the potential for harm is only increasing.

Worse still, the false positive rate appears unacceptably high (look at all the people here with substantial issues) and there's no recourse unless you get a highly voted thread on hacker news.


Tbh, there are a lot of technical folks here, worldwide. That's a lot of edge cases in comparison.

And their technical skills probably skew perception (exotic browsers, behavior, VPNs, Tor, ...).

As such, can you say, even roughly, what percentage of web surfers "a lot" is? (Including non-technical users.)

I don't disagree that people can have problems. I'm just wondering how representative "a lot" in reality is.


Even if it's .01%, 1 out of 10,000 people losing a substantial fraction of utility without recourse sucks.


> The website owner can disable Cloudflare's checks, and that will leave their site unprotected. That choice is up to the website owner, no?

Yeah but what sort of transparency does Cloudflare offer website owners about what kind of traffic was blocked and *why*?


Every blocked request is logged with the corresponding rule. I’ve had multiple times where someone complained about being blocked and didn’t realize that they had malware on their PC.


Hah, you know, most crawlers were fine. The only one that actively DDoSed websites was fucking Yandex. It doesn't respect robots.txt, and it will actively fight any rate limits by spawning connections from new IPs the moment one is blocked.


Search engine crawlers are a subset of crawlers. Sometimes you’re dealing with aggressive screen-scraping from competitors or various marketing tools.

I’ve had to deal with these things easily bringing sites down.


Same, and when more than half your links are dynamic search results, it can pile on and really bring things to a crawl. I worked on a fairly popular auto classifieds website, and more than 90% of traffic was various scrapers, some of which were definitely a burden. Worse, it doesn't show up in analytics, since it's not running client-side JS... Ironically, it was equally bad when Bing started scraping with JS and skewed Google Analytics.

If all we had to deal with were the users, we wouldn't need nearly the spend on the site. We started manually blocking some of the worst offenders.


? I have personally experienced many misbehaving crawlers/bots not respecting robots.txt and masquerading as another user agent.

Feel free to prove me wrong and disrupt cloudflare by only handling that use-case


Found an old reference of mine on what I did to block crawlers/bots:

https://news.ycombinator.com/item?id=34101988

> Had a lot of spammers with Russian language. Implemented expanding xml-bombs, Google Captcha, hidden input fields and a couple of other things against bots. But the block on the russian language was most effective ( and since I was dogfooding it, I didn't see the harm at the time. But it's out of scope at this very moment, yes).


> Everyone forgets that websites become almost unusable because of crawlers and bots.

No one "forgets" this because it isn't true.


I don't agree. Maybe don't run a WordPress install that makes 100+ DB queries before first page load on some cheap $2 VPS.

There are all kinds of tools with which you can easily deal with bots, and the large DDoS your ISP can handle for you if you are willing to pay for it.


> the large DDoS your ISP can handle for you if you are willing to pay for it

That's exactly what people are paying Cloudflare for, because unlike your local ISP, they are actually competent at blocking a DDoS attack.

People use services like Cloudflare exactly because they don't want to spend a fortune on complex infrastructure just to deal with abuse. Even a mostly-static page running on a reasonably-specced server can easily be overwhelmed by an attack. Why spend $1000 / month on hardware when you can spend $100 / month on Cloudflare's protection?


> That's exactly what people are paying Cloudflare for

Cloudflare actually provides this service for free (for simple use cases at least).

I don't know how to come down on this issue. On one hand, I am against the centralization of cloudflare and the risks that come with it.

On the other hand, cloudflare allows almost anyone to set up a simple website and serve it to large numbers of people with very little resources/cost and advanced protection from DDOS attacks.


Similar mindset... I'm also really intrigued by their developer tools: Workers, Pages, D1, KV, etc. I was playing with a static site generator that deploys directly to a Cloudflare Pages setup, and it's lightning fast everywhere.


Psst: https://ai.cloudflare.com/

Note: light use-cases (or should I say shared models?). Not heavy GPU tasks.


And yet, here's why I'm not a fan of a lot of this...

input: This is fucking badass!

result: [ { "label": "NEGATIVE", "score": 0.9994425177574158 } ]


Well, if some website you like uses Cloudflare to block bots, maybe you can offer them some help paying for these tools and setting them up?


The current internet is nothing like the internet of 20 years ago.


Cloudflare does not have nearly enough market share to be an anti-trust concern.


Cloudflare controls about 19% of websites. Of the websites that use a CDN, 80% of them use Cloudflare [0] (in a note at the bottom of link).

[0] https://community.cloudflare.com/t/statistically-speaking-wh...


Afaik, neither of those numbers legally constitutes a monopoly, and they wouldn't qualify for antitrust action.


They should


It will likely be up to the EU to step up again.


> Companies like Cloudflare, Google, Meta, etc are the reason anti-trust law exists

Only if Cloudflare stops you moving to a competitor.


They [effectively] do prevent you from moving: they're guilty of vendor lock-in and of running a passive, public man-in-the-middle attack platform (how many folks are using Cloudflare for DNS, either directly or because their ISP is?)


If you've ever tried to take apart Cloudflare's various session cookies, MITMed scripts sent for "high integrity" pages (or when in "super bot-fight" mode), etc., you'll have observed that it's basically running a web-worker to heuristically do browser-integrity checking. That is, Cloudflare is trying to run a series of tests that real browsers operated by users pass, but which headless browsers operated by bots will fail.

These range from pretty simple things that check that the browser is actually a browser rather than a raw HTML parser (e.g. "draw an image on a <canvas>, export it to PNG, hash the PNG, compare to an expected result"); to things that check for low-effort headless-browsing techniques like the one you get by default using Puppeteer in a Lambda/Cloud Function (e.g. "do we have the weirder fonts you'd expect to exist on a consumer OS, but which these default batteries-included container images don't bother to bake in"); to things that work really hard to detect the "scent of humanity" through the browser (e.g. "before the user activated the integrity-check prompt, did we record a sequence of 'extraneous' mouse movements and key events that look like a human making individualized mistakes on their way to completing the form, and don't look like a recorded capture of such similar to other ones we've seen recently.")

If you're getting caught in a verification loop, it's because you're using a browser or device or extension that obscures/disables enough of these heuristics that Cloudflare can't get proof positive that you're a person rather than a bot — and so, under whatever settings the site-owner has it set at, it will just keep trying to get that proof, rather than telling you you've failed and been blocked. (Why? Because telling a bot they've failed tells them that they should stop trying something that's not working and instead — in the words of Star Trek technobabble — "rotate their shield frequency" before trying again.)
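A minimal sketch of the canvas-hash flavor of check (illustrative only; the real probes are obfuscated, and the expected hashes live server-side):

  // Render fixed content to a canvas and hash the exported PNG. Real
  // browsers differ subtly by GPU/driver/fonts; headless stacks and
  // anti-fingerprinting modes produce unexpected or randomized hashes.
  // (crypto.subtle requires a secure context.)
  async function canvasProbe(): Promise<string> {
    const c = document.createElement("canvas");
    c.width = 200;
    c.height = 50;
    const ctx = c.getContext("2d")!;
    ctx.textBaseline = "top";
    ctx.font = "16px Arial";
    ctx.fillStyle = "#f60";
    ctx.fillRect(0, 0, 100, 25);
    ctx.fillStyle = "#069";
    ctx.fillText("integrity-probe \u{1F600}", 2, 2);
    const bytes = new TextEncoder().encode(c.toDataURL("image/png"));
    const digest = await crypto.subtle.digest("SHA-256", bytes);
    return [...new Uint8Array(digest)].map((b) => b.toString(16).padStart(2, "0")).join("");
  }

  canvasProbe().then((h) => console.log("canvas hash:", h));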


From the PoV of the user, it just looks like the website is broken. No exception in the console, constant reload-looping of the same page, etc.


> Why? Because telling a bot they've failed tells them that they should stop trying something that's not working

In my humble opinion if your bot is stuck in a CloudFlare loop for 10 minutes that's a pretty strong signal that something's not working...


Keep in mind that a bot will be picking some permutation of its stock library of UA+metrics info, and generating truly-random values for other more continuously-valued parameters (e.g. timing between actions), to try to find a combination that satisfies a backend integrity-check.

A "try again" just means "you haven't succeeded yet." If that's all you get, you're getting zero bits of new information — so you can't do anything other than to assume it was your timing that looked weird, and keep trying. (And you might be dealing with even more noise, e.g. trying to have the bot calibrate itself toward a very low human-tuned request rate limit, where above-rate-limit responses look no different than integrity-fail "try again" responses.)

Suddenly getting a (maybe permanent) hard-fail, meanwhile, means that you said something the integrity-checker really didn't like.

Presuming you have a lot of IP addresses to send requests from, you can then do many experiments to bisect the difference between a hard-fail and soft-fail, and use that to blacklist values from your UA+metrics library. It's free entropy!
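(To make the "free entropy" point concrete: the bisection is plain binary search over your parameter library. A sketch of the idea only, where `hardFails` stands in for actually issuing probe requests with a subset of values:)

  // Assumes exactly one value in the library triggers the hard-fail.
  // Each probe yields one bit; "try again" responses would yield zero.
  async function bisectBadValue<T>(
    values: T[],
    hardFails: (subset: T[]) => Promise<boolean>,
  ): Promise<T> {
    let pool = values;
    while (pool.length > 1) {
      const half = pool.slice(0, Math.floor(pool.length / 2));
      pool = (await hardFails(half)) ? half : pool.slice(half.length);
    }
    return pool[0]; // the value the checker dislikes
  }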


While I'm inclined to agree, it's an old rule-of-thumb to give a potential attacker as little information as possible so they have to do the legwork to get un-broken.

(I yearn for a world where auth challenge failures give proper error messages so I can figure out why my regular, human-used authentication channels aren't working).


How does that explain blocks that happen to less common browsers and/or less common platforms?


Heuristics are about optimizing between false positives and false negatives.

Many headless-browser stealth techniques involve rotating between the signatures and reflected metrics of real — but niche and/or ancient — User-Agents. (For some reason, the developers of these stealth systems think that variety beats commonality. Maybe it makes sense if they're specifically trying to overcome Apache mod_security's signature-based UA blocking or something.)

It turns out that when you actually see one of these UAs in your server logs, it's far more (99.99%) likely to be a stealthed bot that picked that UA out of a bag, than it is to be an actual niche/ancient UA.

In the case of the niche UAs, this is a tragedy of the commons.

In the case of the ancient UAs, though, there's no downside to blocking them entirely — because if the traffic is going through Cloudflare at all, then you're already requiring of the client a minimum version of TLS that the real old UAs can't even speak. So the only things actually saying they're that old device — but managing to get through an HTTP request at all — are stealthed bots.


> Many headless-browser stealth techniques involve rotating between the signatures and reflected metrics of real — but niche and/or ancient — User-Agents

I work in this space and we just use fingerprints we collect from actual users over the previous month. The hardest work is: 1) reverse-engineering the JavaScript of the CAPTCHA/fingerprinting solutions so that we can collect and encapsulate the fingerprints correctly, in a way that looks native to Cloudflare/reCAPTCHA/etc., and 2) training AI models to solve the captchas well enough.

Sounds like most of the people you catch are using ancient user-agents. But I doubt most of the people you want to catch are.


I mean, when I'm talking about stealth techniques, I'm not talking about what I'm seeing in my own server logs, but rather about what I find built into various bits of, ahem, "anti-detect software for affiliate marketing" tech that I dredge up from fraud Telegram groups, carding marketplaces, etc. that the attackers I do catch seem to frequent. (Gotta stay five steps ahead!)

I suppose there are verticals where the data is so valuable, and the garden around it so walled-in, that you could build a whole IT business with a custom scraping stack just around extracting that data to then resell it. (I presume that's the business you're in.)

But for most verticals, the "attackers" you'll see in your logs aren't people building a data-broker business, and so aren't building their own secret-sauce anonymity from scratch; rather, they're end-users who want to do an end-run around your rate-limits, commit promotion fraud, etc., and so want to buy anonymity as a product, script-kiddie style. And "anonymity as a product", sold publicly (rather than through high-value contracts) tends to suck. It's script-kiddies buying from script-kiddies, with no real engineering in sight.

> I work in this space and we just use fingerprints we collect from actual users over the previous month.

Are you sure you're not in a citogenesis cycle? How sure are you that some of those "real users" aren't your peers' stealthed bots, who in turn picked up those fingerprints from unknowingly observing other stealthed bots in their logs, who...


> Are you sure you're not in a citogenesis cycle? How sure are you that some of those "real users" aren't your peers' stealthed bots, who in turn picked up those fingerprints from unknowningly observing other stealthed bots in their logs, who...

Doesn't matter; they work (at least when used in combination with high-quality proxy IPs). If they stopped working I'd do something else. We only apply hard science when absolutely needed; otherwise it's mostly wire and duct tape holding things together, with a ruthless focus on creating business value.

We definitely only sell this via high-value contracts, so you're probably mostly correct there. Though puppeteer-stealth deserves at least a quiet shout-out for not completely sucking.

That said, we do pay attention to a lot of the research in the field, even if we only apply the absolute bare minimum needed to create business value. Eric Wustrow[0] at CU Boulder does really, really good work in an adjacent space, and we've found some of his papers/software to be helpful, as well as those of some of the colleagues he works most closely with. I don't think he'd love our applications of his research, but our technological needs dovetail well with the needs of the anti-censorship research that he works on.

If you were interested in the degree of "citogenesis", I think that's something that academic researchers like Wustrow et al. would be very well-positioned to investigate. Highly recommend any of their papers; they make the front page of HN surprisingly often.

0: https://ericw.us/trow/


One thing that sometimes gets lost is site owners that use cloudflare have sort of global options for how paranoid they want to be, then they can make specific WAF rules that can be as granular and aggressive as they want. So at least in some cases, cloudflare gets blamed for website owners setting really aggressive rules. The effect on the end user usually looks exactly the same.

Case in point: I set a WAF rule that blocked all non-verified-bot traffic from several big datacenters (Google Cloud, OVH, DigitalOcean, etc.). That turned out to be a mistake, because a lot of corporations were routing their traffic through those ASNs for some reason. Now they're blocked. They could have gotten pissed at Cloudflare, since the error page looks the same, but it was really me misconfiguring it.
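For reference, the rule was along these lines (Cloudflare's expression language; field names from memory and the ASNs are illustrative, so double-check against the docs; action set to Block):

  (ip.geoip.asnum in {15169 16509 14061 16276}) and not cf.client.bot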


Anecdote: for my programming classes, one example I use is a simple browser. It doesn't do CSS or JavaScript, so the display is primitive, but it works.

On some sites. Many sites, especially the big ones, see that it's an unknown browser, and refuse to send content. Probably they think it's a bot. But even if it were, what's wrong with bots, as long as they're well-behaved?

What kind of closed web have we let the megacorps build?
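(For the curious, the core of the example is barely more than this: a fetch-and-strip sketch, honest User-Agent included, which is exactly what gets it refused:)

  // Minimal "browser": fetch a page, strip tags, print the text.
  // (Needs a recent Node: global fetch and top-level await in ESM.)
  const url = process.argv[2] ?? "https://example.com";
  const res = await fetch(url, { headers: { "User-Agent": "ClassroomBrowser/0.1" } });
  const html = await res.text();
  const text = html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, " ") // drop script/style bodies
    .replace(/<[^>]+>/g, " ")                        // drop remaining tags
    .replace(/\s+/g, " ")
    .trim();
  console.log(`[${res.status}] ${text.slice(0, 2000)}`);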


I mean, the site gets to decide if it will service a request. There's no requirement for a service to respond to everyone.


This is the tech equivalent of "we don't serve your kind here", and limits any attempt at competition.


Playing the devil's advocate: Why shouldn't a server get to decide which clients it wants to talk to?


Why shouldn't a store get to decide the $protected_class of which customers it will do business with?


User Agent is not a protected class. Neither is Intentionally Obtuse for that matter.


Long term, I can imagine this becoming a corollary to Net Neutrality. Whenever that finally becomes a thing, the next step could be a law/rule that "public web sites need to be accessible by the public". Not that developers need to test their work in dozens of different browsers, but that they can't actively choose their customers.


In principle? Of course. I mean, I remember blocking Yandex bots from hammering some e-commerce site on a shoestring budget.

But each person developing a web-scraping bot realizes after a week at most that being honest with the User-Agent has a negative impact on how well it works, and changing it to an existing browser's takes literally seconds.


It's also annoying how that check page ends up breaking page reload in Firefox. When Cloudflare redirects you back to the page, it happens via POST. That initial POST gets captured by Cloudflare, but if you reload the page, the POST goes to the page itself, and there's a pretty good chance it doesn't know what to do with it and just shows an error.

The only fix is to navigate back to the page somehow, either by going to the address bar and pressing Enter (navigating there again instead of reloading) or by finding some link that points back to the page.

I wouldn't be surprised if those POSTs end up getting you banned from some websites, since they "know" you shouldn't be POSTing to that page, so clearly you are an evil bot trying to hack them.


The amount of times Cloudflare is making me sit through their 15 to 30 second "checking your connection" page is insane.

For people going through life with ADHD, such as myself, the impact of all these delays and disruptions throughout the day can be severe. Even properly medicated, I find this measure absolutely debilitating; it makes for a dreadful and very taxing online experience.


That's typically deployed on sites under heavy DDOS attacks. Nobody wants for their users to see that, they are forced to.


No need to simplify so much that only false dichotomies remain. My brain might be wired up differently but its capacity for nuance and reason is fully intact :)

No one is forcing the website owners to sign up with Cloudflare to enable this service with these aggressive configurations, and yet I understand why they would even just pre-emptively. It's cheap and effective, there's no denying that.

It is Cloudflare Inc. (US$66.59, +US$23.57 / 54.79% YTD), however, that architected the solution, markets it as a service, and controls it as a core part of their (i.e. everyone's) internet architecture.

As a service provider they could be better at informing their customers of these unintentional side effects and how they hit otherwise innocent visitors: people whose mental disorders or impairments cause them to be flagged, and made to undergo additional verification steps, disproportionately more than others, likely due to the atypical behavioural patterns they show and their often-adjusted hardware/software setups producing an unconventional signature.

Some modifications to the system could probably be made on the architectural level too. We can get people in wheelchairs to the top of the empire state building, surely we can also find a solution that allows us to enjoy the benefits of these protective measures without wrecking the web's inclusivity and accessibility this much every time the measures need to be stepped up.

Am I asking for too much, what do you think?


I'm pretty sure anyone who deploys aggressive cloudflare DDOS protection knows the impact and believes it's the lesser evil.

If there was just as effective a way to tell cheap bots from legitimate browsers without making users wait I'm pretty sure it would have been used.


Being "pretty sure" is an opportunity and a starting off point, not a dead end. Not to forget the classic "Assumptions are the mother of all fuck ups".

Things aren't set in stone either. The most effective method to communicate over long distances used to be carrier pigeons, but only because they hadn't yet invented the telegraph.


Cloudflare isn't making you do that. The customer using Cloudflare has configured it to do that…


  "There must be some way out of here," said the joker to the thief
  "There’s too much confusion, I can’t get no relief"


Cloudflare's Privacy Pass may help here: https://privacypass.github.io/

It should significantly reduce the amount of CAPTCHAs you see in a way that's not terrible for privacy.

For Safari, you can enable Private Access Tokens: https://blog.cloudflare.com/how-to-enable-private-access-tok...

Both of these mechanisms are similar to Google's web DRM proposal in that they rely on external issuers to generate tokens, but unlike Google's attempt they don't guarantee that ad blockers are disabled on pages that try to use tokens.


Wow, that doesn't sound like a terrible idea!

Which is honestly surprising in this area where it feels like privacy, anonymity and human verification are incompatible with each other.

I am trying to minimise my time wasted by websites, which is hard to balance with privacy. Another big one is the repetitive consent forms (if you don't retain cookies, it's a never-ending process). I think consent forms and human verification are the two biggest human time-wasters.


> Which is honestly surprising in this area where it feels like privacy, anonymity and human verification are incompatible with each other.

The thing is, it still allows for some correlation between attestation provider and the websites themselves, potentially exposing part of your browsing history to these companies based on how many tokens you use and what websites consume them.

That doesn't matter much for Cloudflare's implementation (now Cloudflare knows when you visit Cloudflare, oh no!) but with Apple's attestation provider the risks increase. The smaller the attestation provider gets or the fewer parties trust that particular attestation provider, the higher the risk becomes.

It's better for your privacy than the current norm (de-anonymisation through fingerprinting while you fill out a CAPTCHA), but it's still not great. It also allows attestation providers (and their algorithms) to arbitrarily deny you access to the web if other websites decide to start using them.

Privacy in exchange for power, I'm not so sure about that. I imagine for someone suffering from ADHD the small risk that Cloudflare decides to screw you in particular is worth the massive improvement in browsing experience, but everyone will have to determine the pros and cons for themselves.


Cloudflare is "helping" more than enough already, but thanks for the suggestions.

Besides, those solutions have far too much in common with blackmail/extortion to my liking. Either you continue to suffer this structural harassment, or hand over all your bits and maybe in specific cases suffer slightly less! :)


Pay us with your information to make your problem go away.


What information do you think this addon gathers? Cloudflare already receives every bit of information this addon collects when you visit a Cloudflare website without it.


Un-CF'ed websites? I'm speculating, however.


This should be illegal.


Just yesterday I realized that I couldn't log into Paypal on Safari or Firefox, only a Chromium-based browser. We're getting deeper all the time into "this site is best viewed in Google Chrome".


I have this on Twitch. I can't log in with Firefox + fingerprint resistance. Apparently 2FA isn't strong enough for account protection, I have to let Twitch uniquely identify my computer or it won't let me log in.

I just stopped watching Twitch streams.


I've been experiencing the PayPal one for a while - Firefox on Windows, Linux, or Android. Thankfully the app is still signed in, but I've had to use a credit card for things even though I have PayPal money just sitting there ready to spend, because I can't get into PayPal on Firefox.


Time to switch from PayPal to FedNow. FedNow is run by banks and the Fed, which are regulated as to whom they can refuse to serve.


It's not available for consumers yet, and there are only 40 or so banks currently on the network.


Never seen an online shop accept that, nor do I know if it's open to UK citizens. But thanks


In the UK, you have Single European Payment Area payments. (Despite Brexit, the UK chose to stay within the SEPA zone.) The US didn't have a bank-level national service for consumer to consumer payments until FedNow. Just PayPal, Venmo, etc., which are second-tier services, which becomes an issue when they break.


Still better than cutting off most of the world. PayPal have their problems but neither a US-only nor an EU-only system is an improvement.


> The next day, I tried accessing a web page internal to my company… […] I couldn’t get past a security check page because of issues in Cloudflare’s software. […] The silliness of it all is that I was on my work device the whole time, which was behind my workplace VPN.

This seems more like an "IT department gone mad" problem than a Cloudflare problem. I'm surprised they'd rather switch to Chrome than submit a support ticket.

Having used passkeys for a month+ now via macOS/iOS/1Password betas, I don't understand how they're related or the author's concerns. Couldn't you just replace "passkey" with "password" in all of their questions?


Passkeys have optional attestation payloads, which is basically what WEI is doing. Google in particular doesn't recommend requiring attestation except in corporate-security scenarios, but the fear is that banking and media sites will require attestation anyway, which locks users into whatever attestation mechanisms the server supports; so basically Google, Apple and Microsoft.
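
For reference, attestation is a one-line opt-in at registration time. A rough TypeScript sketch of what a relying party's registration call might look like (the RP name, user details, and challenge handling here are illustrative, not from any particular site):

  // "direct" asks the authenticator for an attestation statement that
  // identifies its make/model; the default, "none", does not.
  const cred = await navigator.credentials.create({
    publicKey: {
      challenge: crypto.getRandomValues(new Uint8Array(32)),
      rp: { name: "Example Corp" },                        // hypothetical RP
      user: {
        id: new TextEncoder().encode("user-1234"),
        name: "user@example.com",
        displayName: "Example User",
      },
      pubKeyCredParams: [{ type: "public-key", alg: -7 }], // ES256
      attestation: "direct",                               // the opt-in in question
    },
  });

In practice the challenge comes from the server, and the server decides whether to reject credentials whose attestation it doesn't trust.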


You know I knew there was going to be something I really didn’t like about passkeys, and here it is


Yeah, unfortunately the point of passkeys is to replace multi-factor authentication. Usually you have a username+password as the primary factor, and a secret that's hard to copy and replay as a second factor (TOTP, non-resident WebAuthn credential/FIDO, SMS code). Passkeys replace the primary factor with a signed challenge, but the second factor is up to the authenticator (such as biometrics). WebAuthn relying parties verify that the authenticator is locking the primary factor behind the second factor, and they do that with attestation.
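
Concretely, the authenticator data the relying party receives starts with a 32-byte rpIdHash followed by a flags byte, and the server checks the "user verified" bit there. A minimal sketch (the helper name is mine):

  // WebAuthn authenticator data: bytes 0-31 are the rpIdHash, byte 32 is
  // a flags byte where bit 0 = user present (UP), bit 2 = user verified (UV).
  function requireUserVerification(authData: Uint8Array): void {
    const flags = authData[32];
    if (!(flags & 0x01)) throw new Error("user presence not asserted");
    if (!(flags & 0x04)) throw new Error("user verification (PIN/biometric) not asserted");
  }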


Does Microsoft have much control over Windows Hello's key attestation? It's not clear to me how they could pull that off, other than just relying on TPM attestations which are easy to obtain as long as you can buy a TPM that works with your motherboard.


This has been my experience for a handful of years, but of course it's getting worse. In the past I'd just be blocked from accessing commercial websites, applications, and things I didn't really need. But in the last few years many scientific publishers have put all their content behind Cloudflare walls. Pretty much my only hope of being able to read a paper these days is that it came out long enough ago to be on Sci-Hub, or that the pre-print was published on arxiv/bioarxiv/etc. Once arxiv goes behind Cloudflare I don't know what I'll do.


Suing Cloudflare for interference with contract[1] might be an option. Cloudflare is not protected against lawsuits by some EULA, because the outside user has no contract with them. They're a third party in the middle. Talk to a lawyer.

Most contract law lawsuits are settled out of court. The great advantage of suing someone is that you get past the low-level customer support people and talk to someone who's authorized to settle.

[1] https://www.lodhs.com/blog/interference-with-contractual-or-...


Can you provide examples of people suing Cloudflare to get around being blocked from accessing certain websites?


I've been getting stuck in the “browser integrity check” loop a lot on Firefox lately. Not an issue in Chrome, not using a VPN, etc. I assume it is some combination of extensions and/or settings in Firefox.


It's happened a few times here too.

I also have maxed-out anti-fingerprinting etc. on FF, so it comes with the territory. I have to slowly enable JS on some sites to see if the loop will break, or I just navigate away.

I use all browsers except Chrome, but I only navigate the web with FF.


Users in Egypt are unable to visit my fitness website https://musclewiki.com

Cloudflare is a huge part of the internet. Often they won't respond and it appears that for whatever reason, their IP range is blocked in Egypt. We probably get 10 support emails per week. I contacted Cloudflare and they simply said there is nothing they can do.


Sounds like you should stop using Cloudflare


I'm surprised to hear that. I actually used Cloudflare Tunnel to connect to a corporate intranet about 8 months ago while in Egypt, not sure if things have changed though.


If you don't want to stop using Cloudflare, or you need a temporary solution, ask your users in Egypt to use a VPN - many already do, as they run into similar problems with other services.


Egypt blocks VPNs in a much more aggressive way than it blocks Cloudflare. Also, this is the first time I've heard that Cloudflare is blocked in Egypt. People will even complain that they cannot connect to their corporate VPNs.

Blocking Cloudflare IP addresses would mean that half of the internet is inaccessible from Egypt. It's close to blocking port 443 because some people use DNS over HTTPS.

disclaimer: I'm Egyptian living in the US.


Have someone in Egypt look at our site. It's been blocked for about 3 months.

Apparently there is a pool of IPs that Cloudflare uses for its CDN, and some of them are blocked in Egypt. If you are unlucky enough to be one of the websites using such an IP, you're blocked. Apparently they rotate them, but I haven't seen it yet, so chances are, when they do rotate them, more will be blocked.


This is how it feels these days: whether I'm free to use or access a service or website is decided not by whether I claim to be human (in CAPTCHA tests), but by the data people collect about me - data that is judged behind invisible doors, by rules I never get to see.

I thought privacy was on the rise after the data leaks, the irresponsibility of the big tech companies, and the public's growing engagement with individual privacy, but it seems like everything is still a step backwards.


The fact that this person thinks Cloudflare has their MAC address leads me to believe they shouldn't be speculating on the "implications for the web"


"Checking if the site connection is secure" what a blatant lie. If you can read this you already have a TLS connection.


I'm surprised Apple PAT and Google WEI weren't mentioned in the article.

Especially since Apple has partnered with Cloudflare on PAT.


They do mention it (Google's at least) in this section [0]

[0] https://jrhawley.ca/2023/08/07/blocked-by-cloudflare#implica...


Also, the scope of PATs is vastly different than WEI. Think of PATs as a "probably a human" signal that mostly replaces the need for CAPTCHAs.

https://blog.cloudflare.com/how-to-enable-private-access-tok...


Their scope is the same [0], both in terms of stated intent as well as what kind of things they could be used to attest. The only significant differences are that one is already deployed in prod while one isn't, one got a marketing blitz while the other didn't, and that they're done by different companies.

There are no vast technical differences, only incredibly subtle ones.

[0] https://www.snellman.net/blog/archive/2023-07-25-web-integri...


I cannot access flyertalk.com, which hosts a lot of useful airline content, from any IP in my country. I tried reaching out via email as mentioned on the error page, but the admin doesn't even have a valid email posted anywhere.

I know Cloudflare is not to blame here, but they make it far too easy for bad admins to block people.


Boils down to gatekeepers, doesn't it.

Unfortunately there are also bad actors on the web (and the definition of "bad" varies). I understand the reasons for trying to centralise the removal of that so-called bad, but obviously a central group deciding what counts as "bad" just isn't democratic.

Ironically, when ChatGPT mentioned their UA on a web page the other day, users were presented with an anti-bot challenge.


Increasingly more sites are getting stuck in a Cloudflare verification loop on my end. I use Firefox on the beta channel, and I have a few privacy extensions and a heavily modified user.js. If you want to give in to the browser fingerprinting, I have found that enabling WebGL, enabling performance timing (wow), and setting network.http.referer.XOriginTrimmingPolicy to 0, among other tweaks, helps me break out of the verification loop.

In other words, if Cloudflare can't reliably fingerprint your browser, you are treated as a "bot" and denied access to a huge chunk of the web. Well, in that case, I would rather be a bot than a human. Being a human seems to be increasingly annoying nowadays :)
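
For anyone wanting to try the same tweaks, here's roughly what they look like in user.js form. Pref names are from memory, so double-check them in about:config before relying on this:

  // Trade some fingerprinting resistance for fewer verification loops.
  user_pref("webgl.disabled", false);                         // re-enable WebGL
  user_pref("dom.enable_performance", true);                  // performance timing
  user_pref("network.http.referer.XOriginTrimmingPolicy", 0); // don't trim cross-origin referers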


Insanely enough, Cloudflare sometimes puts these pages in front of API endpoints, as if that JSON were for human eyes only.


The End of the Road for Cloudflare CAPTCHAs

https://blog.cloudflare.com/end-cloudflare-captcha/

Discriminate against all but "major browsers". Why.

https://developers.cloudflare.com/fundamentals/get-started/c...


> Anyone who uses a de-Googled Android phone has to go to great lengths to ensure hardware attestation is working correctly [...] or else they can’t using banking apps.

I have a relatively Google'd Android running LineageOS. It passes SafetyNet on a fresh install, but even that isn't good enough for one of my banking apps (or Netflix) - they both also perform a CTS Profile (Compatibility Test Suite) check and block me from using the app if they don't like what they see.

I ultimately had to root the phone to be able to use my bank's app. Rooting allowed me to use a fake CTS Profile, and then, because it was rooted, SafetyNet started failing and I had to install a bypass to work around that.

Now everything works great, except OS updates un-root the phone and then "secure" apps stop working again.

(Oh, and if you mention that you're rooted, the LineageOS folks will refuse to provide any support, even for unrelated issues. Making you choose between friendly help and a usable phone is probably the only thing I don't like about LineageOS and, to my view, the biggest break from its CyanogenMod roots.)


> I temporarily disabled extensions. I opened a private browsing window.

FYI: when using Chrome, an incognito window carries a lot of baggage. For issues like this, use a Guest profile, as it doesn't include extensions, caches, storage, etc. Optionally, do a Google search first to seed it with cookies.


Doing a Google search will not populate any cookies Cloudflare could access.


Of course not, but it is less likely to trigger manual recaptcha on the site you're trying to visit in Guest mode.


Cloudflare does not use Google's reCAPTCHA anymore, and hasn't for some time.


Mind expanding on what you mean by baggage? Or linking to something to start research.


I personally experience this loop all the time on different sites. I’ve completely given up - if a site loops I don’t use it and try again a few weeks later. If it’s something extremely urgent I use my mobile device which for some unknown reason never loops.


This is a bit tangential to the author's point but it does seem to indicate that IPv6 is mostly pointless for human users for exactly this reason.

It's so much easier to hide behind a new unique address, compared to IPv4, that any service such as Cloudflare would need to be extremely aggressive in blocking to meet its internal metrics and customer-advertised minimum thresholds.

So much so that it actually costs more to use IPv6 than to stick with IPv4.

I imagine the scenario described by the author will become more and more common as more of the world's internet users become harder to distinguish.


Not really.

IPv6 doesn't allow you to easily get a new completely random address. You get a subnet allocated by your ISP, and you can use any address within that subnet. Rather than blocking a single IPv6 address, a service like Cloudflare can just block the entire IPv6 subnet prefix and get the same result as blocking an IPv4 address.
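
A rough TypeScript sketch of what "block the prefix, not the address" looks like, assuming the ISP hands out /56s; the function name is mine, and it assumes a fully expanded address for brevity (no "::" handling):

  // Reduce an IPv6 address to its /56 prefix so the whole customer
  // allocation is treated as one unit, not one of 2^72 addresses.
  function prefix56(addr: string): string {
    const groups = addr.split(":").map((g) => parseInt(g, 16));
    let bits = 0n;
    for (const g of groups) bits = (bits << 16n) | BigInt(g);
    const masked = bits & (((1n << 56n) - 1n) << 72n); // keep the top 56 bits
    const out: string[] = [];
    for (let i = 7; i >= 0; i--) {
      out.push(((masked >> BigInt(i * 16)) & 0xffffn).toString(16));
    }
    return out.join(":") + "/56";
  }

  // prefix56("2001:0db8:0012:3456:aaaa:bbbb:cccc:dddd")
  //   -> "2001:db8:12:3400:0:0:0:0/56"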


They obviously don't do it this way, and implement more roundabout methods instead, because it's not as simple or straightforward as that.


You're basically saying this behavior is acceptable and should be considered normal and should be expected to become the norm.

If you think IPv6 is mostly pointless, I think you're unaware of the fact that a significant majority of phones already use IPv6 most of the time they're on cellular.


Last time I checked when I hotspotted my T-Mobile (US) phone, hotspot clients got only an IPv6 address, with 6to4 [edit: no, NAT64] translation used to reach IPv4 addresses.

This breaks OpenVPN, which insists on both endpoints being one or the other.


> Last time I checked when I hotspotted my T-Mobile (US) phone, hotspot clients got only an IPv6 address, with 6to4 [edit: no, NAT64] translation used to reach IPv4 addresses.

You would have the same problem without IPv6 - your phone doesn't have a spare IPv4 address to give out, it would have to give you some kind of internal-only address and then NAT it when talking to the public internet.

> This breaks OpenVPN, which insists on both endpoints being one or the other.

That seems unlikely (also why would your server not have a v6 address?). You can't connect to IPv4 addresses by IP because you don't have an IPv4 connection, but connecting to your server by hostname or by v4-address-embedded-in-v6-address should work.


T-Mobile uses NAT64. 6to4 is a (mostly deprecated) protocol for tunneling IPv6 over IPv4.


> You're basically saying this behavior is acceptable and should be considered normal and should be expected to become the norm. If you think IPv6 is mostly pointless, I think you're unaware of the fact that a significant majority of phones already use IPv6 most of the time they're on cellular.

Can you point to where I suggested that? Or did you misread the comment?


He quoted you:

>it does seem to indicate that IPv6 is mostly pointless for human users for exactly this reason


> He quoted you:

> >it does seem to indicate that IPv6 is mostly pointless for human users for exactly this reason

Huh? There is no quote in that comment: https://news.ycombinator.com/item?id=37051011.

Unless you are referring to a different comment?


I'm referring to your comment that he quoted - https://news.ycombinator.com/item?id=37050359

Maybe you're misremembering - but you wrote it :)


> I'm referring to your comment that he quoted - https://news.ycombinator.com/item?id=37050359 Maybe you're misremembering - but you wrote it :)

'johnklos' did not quote a single word from anyone in the comment I linked, nor a single word from my original comment you linked, let alone an entire phrase, and that was the only response he made from what I can see.

Are you confusing him with a different HN user?

You don't have to believe me, it's accessible to every passing reader right there in the comment chain...


I have to believe you're attempting (but failing) to gaslight me

you wrote: "it does seem to indicate that IPv6 is mostly pointless for human users for exactly this reason" (https://news.ycombinator.com/item?id=37050359)

he wrote: "If you think IPv6 is mostly pointless, I think you're unaware of the fact that a significant majority of phones already use IPv6 most of the time they're on cellular" (https://news.ycombinator.com/item?id=37051011)

As you so helpfully put it, "You don't have to believe me, it's accessible to every passing reader right there in the comment chain..." (https://news.ycombinator.com/item?id=37056142)

Feel free to continue to deny what you wrote, how he responded, and my repeatedly showing you where he quotes your words, if you want - no one can stop you from ignoring your own words (even though they're right there on display for all) ...but he used your own words in response to your own comment :)


What are you on about?

There's no sign of 'johnklos' quoting me as you claim. It literally appears in his own words, and he even wrote it in a manner that could not possibly be confused with my style of writing. Just check my recent comment history?

Frankly, it's just impossible for there to be a 17-word phrase in that comment that could be interpreted as a quotation, even by a teenager who has only just started learning English.

And in any case, you are free to contact dang, or any other HN user you trust, and ask them to double check whether such a thing occurred.

You can also review HN norms, other examples of HN users quoting each other and explicitly acknowledging it, the popular dictionary definitions of 'quote', etc...

Since your HN account is nearly 11 years old I'm fairly confident you know various methods to verify these things to a reasonable degree of certainty.


Please accept this award for your outstanding contributions to the field of pedantry.


Since you're presumably a different person from 'warrenm', though with some degree of uncertainty due to the pseudonym, why lower your account's built-up credibility with obvious low-effort trolling?

'warrenm' clearly needs some help with resolving confusion regarding HN norms, or something along those lines, not piling on and exploiting the odd comment chain at his expense.


This MichaelZuo character seems to be pretty lost when it comes to communicating (and then denies doing what he's actively doing).

I think he's in it for the humor value


MichaelZuo is defending an unnecessarily precise definition of the word "quote" and you are refusing to acknowledge it.

I doubt that either of you are in this for the humor; that simply emerges from the duration and pointlessness of the argument.


I get it now!

you're a comedian!

thanks for the laughs - anyone who can straight-facedly deny what they wrote, while people reference it in such an obvious manner, and still deny it, must be in it for the humor value


Like I said, contact dang, or any other HN user you trust, and ask for their opinion on this 'gaslighting', on the meaning of quoting another HN user, that you believe to be happening.

If they confirm your view then post it here, because I think many passing readers would be curious in seeing it too.


> any service such as Cloudflare would need to be extremely aggressive in blocking

Cloudflare needs an algorithm to deduce the IPv6 prefix size controlled by a given entity, but the details of that algorithm are not obvious. You are jumping to the conclusion that they must be doing a bad job because the problem is challenging.

IPv4 abuse detection is also challenging because of (e.g.) the prevalence of CGNAT with multiple users sharing an IP address.

Which problem is harder? Which solution is better? I don't know, without a lot of proprietary data and analysis.


I would imagine they have a list of the default/max DHCPv6 blocks that ISPs hand out. These are generally public knowledge, so it would be easy for them to say "Comcast hands out /56s" and, when blocking, block at that level for the ASN.
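
Something like this, presumably; the ASNs and prefix lengths below are illustrative placeholders, not real measured data:

  // Per-ASN allocation sizes, falling back to /64 (the common per-LAN
  // size) when the ISP's hand-out policy is unknown.
  const asnPrefixLen: Record<number, number> = {
    64496: 56, // hypothetical: a cable ISP handing out /56s
    64497: 48, // hypothetical: a business ISP handing out /48s
  };
  function blockPrefixLen(asn: number): number {
    return asnPrefixLen[asn] ?? 64;
  }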


It probably makes more sense to derive that information from measurements. There are a lot of ISPs in the world, and allocation policies can change, even if you manage to find a person who knows what they are.


No problem for me accessing GitLab without using a web browser. Moreover, one can use the Internet Archive, Archive.today, Google's cache, etc. to avoid SNI.

The author did not specify which project he was trying to access, so I picked a random one to test from /explore/projects/topics/bioinformatics. No problem accessing it without a web browser. TLS 1.3. No SNI.

https://one-touch-pipeline.gitlab.io/otp/


Some comments I have on this post:

> Worse yet, I know that Cloudflare knows I have those certificates. Why? Because it asked for them!

Not really. Cloudflare's servers ask for a client certificate during the TLS handshake, and your browser, noticing that you have certificates available, prompts you for one. That's really annoying, but it's part of the protocol spec. Your browser won't send this information unless you pick a certificate and hit OK.

Disable your ad blocker and you'll find that many trackers will also ask you to identify yourself this way. It's really annoying; browsers need to design better UX for this type of authentication.

> · MAC address of my machine that I have previously used to access this site

How does it gather your MAC address? Did you disable IPv6 Privacy Extensions? Unless the website is sitting behind the same switch as your computer or you run some kind of native application that sends the MAC address, websites can't read the MAC of your network interface. Enable the MAC randomisation that's present (sometimes even turned on by default!) in every modern OS if you consider the local switch or WiFi network to be a privacy risk.

> Will I be able to create and sync these passkeys myself?

Yes, assuming they follow the standard

> Can only certain types of software use passkeys? If so, who decides what software meets this standard?

I don't really understand the question. Any software supporting passkeys will be able to prompt you for generating or using a passkey.

> Will I only be able to generate passkeys on a device with specific hardware/software requirements like a TPM, DeviceCheck, or Integrity API?

According to the spec, keys can be stored in software with no trouble. Websites and apps can ask for securely generated keys, but I don't think those requests are all that common. Hardware can also be faked relatively easily in most circumstances.

> Can I, at any time, export my passkeys from one service provider and switch to another provider?

Ask your service provider for export options. Most likely, you can't just dump the keys and import them elsewhere (that would defeat the point).

> If a passkey is invovled in a suspicious event, will that suspicious mark propogate to any other device that uses that same passkey? Do devices that contain suspicious passkeys also get marked as suspicious? If so, would that impact the ability of that device to access other independent websites?

That depends on the software using the key for authentication. Maybe?


I've seen the misconception a lot; people seem to think that random websites can grab your MAC like they can get your IP address.


MAC addresses are a weird thing; they're important for Ethernet to work efficiently. But in actuality a MAC only travels about as far as the next router from your machine...


With WiFi, MAC addresses become more of an issue (as you're often scanning and roaming), but we have randomisation algorithms for that.

The same was true of IPv6 for a while, but that too was solved a long time ago.

For a SLAAC client whose privacy extensions have been disabled for some reason, it's very much possible to figure out your MAC address. This may be a problem on old hardware that doesn't receive updates anymore!
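
For the curious, here's why a disabled-privacy-extensions SLAAC address leaks the MAC: a modified EUI-64 interface ID is just the MAC split around ff:fe, with the universal/local bit (0x02) of the first byte flipped. A small TypeScript sketch (the function name is mine):

  // Recover a MAC from the last 8 bytes of a non-privacy SLAAC address.
  function macFromSlaac(iid: number[]): string | null {
    if (iid[3] !== 0xff || iid[4] !== 0xfe) return null; // not EUI-64 derived
    const mac = [iid[0] ^ 0x02, iid[1], iid[2], iid[5], iid[6], iid[7]];
    return mac.map((b) => b.toString(16).padStart(2, "0")).join(":");
  }

  // e.g. 2001:db8::0211:22ff:fe33:4455 -> "00:11:22:33:44:55"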


The big problem I have with Cloudflare's integrity check is that all the spam domains use a fake version which mimics it, and tries to trick you into completing a captcha.


This happened to me just a few days ago. I tried to open a link in an app, which then tried to open it with an in-app WebView. Thus, the Cloudflare captcha loop of death. I could even see a "human verification failed" string appearing after clicking on an "I am human" checkbox. Alongside the annoyance of not being able to browse, this kind of language is awful! Literally being told to my face I am not a human.


Actually, Cloudflare doesn't have access to your MAC address, so it's a bit more difficult to attest that you are a legitimate user.


Prediction: if Google manages to ram Web Environment Integrity down our throats, CloudFlare will implement it as a part of these checks.


I was going to assume that the corp VPN is the reason - maybe someone is abusing that connection for something else and it's getting flagged - but the fact that the site worked using Chrome says otherwise.

Would using Chromium for such cases work, while keeping Firefox for the rest of the sites?

And what Cloudflare alternative provides similar services for free, including traffic analysis?


Does anyone who knows about GDPR know if being blocked by a CDN comes under "Automated individual decision-making"?



Yeah, I noticed the same thing for a while. Cloudflare actively blocks non-Chrome browsers.


I use Firefox (stable and dev) and Waterfox, and Cloudflare doesn't block me. My settings are close to the defaults though, I don't enable things like privacy.resistFingerprinting.

For a while I did notice that when using certain IP blocks, they would show me more captchas when using Firefox than when using Chrome, but I haven't had that problem in a while.



Off topic - but how can Cloudflare block access to an internal website when accessed by via VPN? And if CF has some kind of request verification API that the internal server is using, why would you use it for an internal resource?


As a Firefox mobile user, I've never been able to get past that page for ~2 years.

And so I've stopped visiting websites that use that system (several per day).

There's no way to report that to Cloudflare so f*ck'em.


Anyone else find it odd that the author's company-internal work intranet, which requires a VPN to access, is deployed behind a Cloudflare CDN? Why would anyone do this?


The same goes for Safari. If your website is available in only one browser, you need to change your content delivery provider.


I use ungoogled-chromium on my OpenBSD machine, and I think it's better for privacy than Firefox :-).


On one of my machines I run OpenBSD with Firefox with "Strict" privacy settings and "privacy.resistFingerprinting" enabled. There are so many websites I can't access; I get a straight-up 403 Forbidden page because Cloudflare has decided I am not trustworthy. I mean pretty big companies like DigiKey, Home Depot, Canadian Tire, etc. I simply cannot use their websites, or I can load the initial page but then the API calls that provide the functionality all fail with a 403. DigiKey did something to unblock me and I can use their site again, but I know it's just a matter of time before it happens again. It's also a frequent problem on smaller sites that simply use Cloudflare, and I never know when I'm going to be blocked from a site arbitrarily. It's especially egregious when it's a plain old text-based site like hamuniverse.com, or a small independent vendor like digirig.net...

This is one of the things that makes it so clear to me that the web is diverging into two: the "clean walled-garden capitalism web", and the continuation of the original web that was open, freely accessible, and built around sharing and knowledge.


Cloudflare sucks, plain and simple, no idea why anyone uses it. So many better alternatives.


For example? Especially when taking price into account.


Fingerprinting is probably load-bearing for captchas and other anti-fraud stuff that many Internet services and businesses depend on:

https://xkcd.com/2347/

It should be replaced with something better. Unfortunately all attempts to do something better get attacked by people who don’t realize that you can’t just get rid of it, or important things will break.


I use Cloudflare for hosting my sites. Can I disable this functionality, and if so, how?


For good or bad, this is why I have the Privacy Pass extension installed.


Yeah, also noticed I couldn't get past certain sites recently.


> Worse yet, I know that Cloudflare knows I have those certificates. Why? Because it asked for them!

It doesn’t make sense for Cloudflare to request any client certificates.

I think there are real bugs somewhere.


Cloudflare allows you to enable mTLS for websites:

https://developers.cloudflare.com/ssl/client-certificates/en...

https://developers.cloudflare.com/cloudflare-one/identity/de...

This then requires Cloudflare to request a client certificate. It's great for securing websites with corporate identity derived from AD certs, for example, to make sure the device being used has a valid cert on it.

Alongside MDM forcing the certificate to have a short lifespan, for example (my $CORP uses 7 days), you can validate that the device has the correct security posture to access the resources.

If, for example, I don't update my device's version of macOS often enough, my cert expires and I can't access internal resources until I update my OS; the MDM software checks that and provisions me a new device certificate.
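
For a sense of what happens under the hood when the browser prompts for a certificate, here's a rough non-browser equivalent using Node's https module; the host and file paths are hypothetical. The server (e.g. Cloudflare with mTLS enabled) sends a CertificateRequest during the handshake and rejects the connection if the presented cert doesn't chain to its configured CA:

  import https from "node:https";
  import { readFileSync } from "node:fs";

  const req = https.request(
    {
      host: "internal.example.com",     // hypothetical mTLS-protected host
      path: "/",
      key: readFileSync("client.key"),  // the device's private key
      cert: readFileSync("client.crt"), // the short-lived device cert
    },
    (res) => console.log(res.statusCode),
  );
  req.end();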


It's requesting client certs for their internal intranet stuff hosted behind Cloudflare.


Please rename to "Blocked by Firefox"


Is there something more going on here like you're using some kind of blocker and it's stopping the captcha/security 'widget' from loading?


Fuck Cloudflare


The problem with Cloudflare is that they purposely attempt to break user privacy by dangling websites as carrots. The claim that they are determining whether the person is human is deceptive, because often they won't even show you the CAPTCHA. And even if you do get to the CAPTCHA, successfully completing it usually won't give you access to the website either. So, what is the point?

They want people to disable their privacy protections, or to switch to browsers with little to no privacy protection, in order to access the websites they are blocking. This has nothing to do with whether a user is an actual threat or a bot; it is more a strategy to shape which browsers are used and to destroy user privacy.

Cloudflare is also very aware of the numerous, constant complaints about what they are doing, which have come from users for years. The complaints are ignored, because Cloudflare has something else in mind.


Using something like Edge (for work), Vivaldi, or Brave would be better than Chrome, and would probably still let you in.



