
It's cool and all that you're making an exception here, but how about including a "no, really, I'm actually a human" link on the block page rather than giving the visitor a puzzle: how do you report the issue to the page owner (hard on its own for normies) if you can't even load the page? This is just externalising issues that belong to the Cloudflare service.





I am not trying to "make an exception", I'm asking for information external to Cloudflare so I can look at what people are experiencing and compare with what our systems are doing and figure out what needs to improve.

Some "bots" are legitimate. RSS is intended for machine consumption. You should not be blocking content intended for machine consumption because a machine is attempting to consume it. You should not expect a machine, consuming content intended for a machine, to do some sort of step to show they aren't a machine, because they are in fact a machine. There is a lot of content on the internet that is not used by humans, and so checking that humans are using it is an aggressive anti-pattern that ruins experiences for millions of people.

It's not that hard. If the content being requested is RSS (or Atom, or some other syndication format intended for consumption by software), just don't do bot checks; use other mechanisms like rate limiting if you must stop abuse.
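To make that concrete, here is a rough sketch of the rule, not anything Cloudflare actually runs. The media-type list, the `decide` function, and the `TokenBucket` class are all made up for illustration: feed responses skip the bot check and only go through a rate limiter, everything else takes the normal path. It only works when the origin sends a proper feed Content-Type; a feed served as `text/xml` or `text/html` would still land in the bot-check branch.

```python
import time

# Hypothetical set of syndication media types (an assumption for this sketch).
FEED_TYPES = {"application/rss+xml", "application/atom+xml", "application/feed+json"}


def is_feed(content_type: str) -> bool:
    """True if the Content-Type header names a syndication format."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in FEED_TYPES


class TokenBucket:
    """Very small token-bucket rate limiter (one bucket per client, say)."""

    def __init__(self, capacity, refill_per_sec, now=None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def decide(content_type: str, bucket: TokenBucket, now=None) -> str:
    """Feeds are never challenged, only rate limited; other content is bot-checked."""
    if is_feed(content_type):
        return "serve" if bucket.allow(now) else "rate_limited"
    return "bot_check"
```

The `now` parameter is only there so the behaviour can be checked without sleeping; in real use you would leave it out and let the monotonic clock do the work.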

As an example: would you put a captcha on robots.txt as well?

As other stories here can attest to, Cloudflare is slowly killing off independent publishing on the web through poor product management decisions and technology implementations, and the fix seems pretty simple.


From another post, if the content-type is correct it gets through. If this is the case I don't see the problem.

It's a very common misconfiguration, though, because it happens by default when setting up CF. If your customers are, by default, configuring things incorrectly, then it's reasonable to ask if the service should surface the issue more proactively in an attempt to help customers get it right.

As another commenter noted, not even CF's own RSS feed seems to get the content type right. This issue could clearly use some work.



I had a conversation with a web site owner about this once. There apparently is such a feature, a way for sites to configure a "Please contact us here if you're having trouble reaching our site" page...usage of which I assume Cloudflare could track and then gain better insight into these issues. The problem? It requires a Premium Plan.

Some clients are more like a bot/service: imagine Google Reader, which fetches and caches content for you. The client I’m currently using is Miniflux, and it works the same way.

I understand that there are some more interactive RSS readers, but from personal experience it’s more like “hey I’m a good bot, let me in”.


An RSS reader is a user agent (i.e. software acting on behalf of its users). If you define RSS readers as bots (even good bots), you may as well call Firefox a bot (it also sends off web requests without the user explicitly approving each one).

Their point was that the RSS reader does the scraping on its own in the background, without user input. If it can't read the page, it can't; it's not initiated by the user where the user can click on a "I'm not a bot, I promise" button.

It was a mental skip, but the same idea. It would be awesome if CF just allowed reporting issues at the point something gets blocked, regardless of whether it's a human or a bot. They're missing an "I'm misclassified" button that would let the people actually affected report it without the third-party runaround.

Unfortunately, I would expect that queue of reports to get flooded by bad faith actors.

Sure, but now they say that queue should go to the website owner instead, who has less global visibility on the traffic. So that's just ignoring something they don't want to deal with.


