We run a small, dynamic (non-static) e-commerce site focused on a single EU country. It has proper robots.txt instructions and rate limiting on an nginx/php-fpm setup, which worked perfectly well in the "era" when the only bots were search engines & co. Without CF it is now kinda struggling to handle 15,000 requests per 15 minutes coming from Chrome "users" on IPv6. Best so far was an avg. server load of 40 in htop on an 8-core server x_x
That's about 16.7 rps. A single guy holding the F5 key in Chrome can generate that much traffic and take down your website. That kind of performance was never acceptable.
People will always reframe their request numbers to avoid stating their pitiful requests-per-second numbers, it's hilarious. "This thing is handling hundreds of thousands of requests per day!" Like cool, you're barely making it to double-digit requests per second.
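For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion. The function name is made up for illustration, and the 100,000 and 900,000 per-day figures are just assumed endpoints for "hundreds of thousands"; only the 15,000 per 15 minutes number comes from the thread.

```python
def requests_per_second(requests: int, seconds: int) -> float:
    """Convert a request count over a time window into an average rate."""
    return requests / seconds


SECONDS_PER_DAY = 24 * 60 * 60

# Figure from the thread: 15,000 requests in 15 minutes.
print(round(requests_per_second(15_000, 15 * 60), 1))           # 16.7

# Assumed endpoints for "hundreds of thousands of requests per day".
print(round(requests_per_second(100_000, SECONDS_PER_DAY), 1))  # 1.2
print(round(requests_per_second(900_000, SECONDS_PER_DAY), 1))  # 10.4
```

Keep in mind these are averages over the whole window; real traffic is bursty, so peak rates will be higher.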
Maybe a plain WordPress install. Run something like WooCommerce and install a bunch of plugins to get the functionality that WordPress and WooCommerce should have built-in, and suddenly a cheap VPS can only handle 2 or 3 requests per second.
It's phenomenal how inefficient the WordPress/WooCommerce stack is.
Though the main issue I'm seeing is credit card testing, not scraping.
And I'm ideologically opposed to using a CDN (because it shouldn't be needed for such a small site!) so it's somewhat a self-inflicted problem...
"Security" plugins are also a HUGE problem here. Most of them turn "a few cached DB SELECTs" (or a static file read, if you use a caching plugin) into a bunch of INSERTs, just to log/analyze the "offender" IP and maybe block it. In many cases that makes "blocking the offender" more costly than serving the page without the security plugin would be.
You can calculate traffic stats for a day by IPs/subnets and probably bots will stand out. If they are using IPv6 you can figure out the ASN and block it completely.
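A rough sketch of the per-subnet counting, assuming an access log where the client IP is the first whitespace-separated field (as in nginx's default combined format). The function names and the /24 and /64 grouping sizes are my own assumptions, not anything prescribed above. Mapping a noisy subnet to its ASN needs an external data source (whois or a routing-table dump), which is not covered here.

```python
import ipaddress
from collections import Counter


def subnet_of(ip_text: str, v4_prefix: int = 24, v6_prefix: int = 64) -> str:
    """Return the enclosing subnet of an address as a CIDR string."""
    ip = ipaddress.ip_address(ip_text)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False masks off the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def top_subnets(log_lines, n: int = 10):
    """Count requests per subnet and return the n busiest ones."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            counts[subnet_of(fields[0])] += 1
        except ValueError:
            # First field was not an IP address; skip the line.
            continue
    return counts.most_common(n)


# Made-up sample lines using documentation address ranges.
sample = [
    '2001:db8:1:2::1 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '2001:db8:1:2:aaaa::5 - - [01/Jan/2025:00:00:01 +0000] "GET /cart HTTP/1.1" 200 512',
    '2001:db8:1:2:ffff::9 - - [01/Jan/2025:00:00:02 +0000] "GET /shop HTTP/1.1" 200 512',
    '192.0.2.10 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 512',
    '192.0.2.77 - - [01/Jan/2025:00:00:04 +0000] "GET / HTTP/1.1" 200 512',
    '198.51.100.5 - - [01/Jan/2025:00:00:05 +0000] "GET / HTTP/1.1" 200 512',
]

for subnet, hits in top_subnets(sample):
    print(subnet, hits)
```

Grouping IPv6 by /64 matters because a single client can rotate through a huge number of addresses within its own /64, so per-address counts would hide it. For a real day's log you would pass the open file object instead of `sample`.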