

How do you stop scripters from slamming your website hundreds of times a second? - yield

My server is on a VPS, so bandwidth is limited.<p>How can I prevent robots other than Google from crawling my website into the ground? How do I stop a bot from making hundreds of requests a second to my site?<p>Thanks.
======
byoung2
The major crawlers should respect robots.txt, so you can put directives there
to restrict access by useragent. For other types of traffic, look into rate
limiting and throttling settings on your web server. Apache has mod_evasive,
for example, and others should have similar features. Since it is a VPS, your
hosting company should also have some protections in place such as a load
balancer or firewall that has some way to throttle requests to your VPS. If
the traffic is malicious, they should be able to filter it based on IP or
region. If it is legitimate traffic, you should consider upgrading to a more
scalable setup, for example several servers behind a load balancer, maybe with
some way to programmatically add servers based on load. This is easy to do
with Amazon EC2, ELB, and CloudWatch.
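As a minimal sketch of the robots.txt approach mentioned above, a file at the site root that allows Googlebot but asks all other crawlers to stay out could look like this (note that only well-behaved crawlers honor these directives; abusive bots ignore robots.txt entirely, which is why the rate-limiting layer is still needed):

```
# /robots.txt
# Googlebot may crawl everything (an empty Disallow permits all paths).
User-agent: Googlebot
Disallow:

# All other user agents are asked not to crawl anything.
User-agent: *
Disallow: /
```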

~~~
yield
very helpful

thank you very much!

------
gtani
<http://news.ycombinator.com/item?id=1787354>

this is about (defending) brute force/ssh attacks, but the iptables rules are
similar
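As a rough sketch of adapting those SSH rules to HTTP (the threshold and window here are illustrative, not recommendations), the iptables `recent` match can drop source IPs that open too many new connections to port 80 too quickly:

```shell
# Record each new TCP connection to port 80 in a per-source-IP list.
iptables -A INPUT -p tcp --dport 80 -m state --state NEW -m recent --set

# If the same source IP has opened more than 20 new connections within
# the last second, drop the packet (tune --seconds/--hitcount to taste).
iptables -A INPUT -p tcp --dport 80 -m state --state NEW \
  -m recent --update --seconds 1 --hitcount 20 -j DROP
```

This only limits connection churn, not request volume over keep-alive connections, so it complements rather than replaces web-server-level rate limiting.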

~~~
yield
helpful! Thank you very much!

------
fak3r
for nginx look at Limit Requests <http://wiki.nginx.org/HttpLimitReqModule> -
or if not running 0.7.20 or newer, try Limit Zone. This is the best if you're
just looking to control things with nginx, but I would also recommend putting
HAProxy in front of it and having it handle the connection limits; it has a
much richer feature set for such things. It handles slow DDoS attacks like
Slowloris automatically
<http://www.snapt-ui.com/haproxy/defend-against-slowloris-with-haproxy/> - and
is able to do tons more. Here Willy (the HAProxy developer) talks about other
things he's done with it
<http://www.mail-archive.com/haproxy@formilux.org/msg00291.html>
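As a minimal sketch of the Limit Requests module linked above (the zone name, zone size, rate, and burst values here are illustrative and should be tuned for your traffic):

```nginx
http {
    # One shared 10 MB zone keyed by client IP, allowing 5 requests/second.
    limit_req_zone $binary_remote_addr zone=perip:10m rate=5r/s;

    server {
        listen 80;

        location / {
            # Permit short bursts of up to 10 queued requests per IP;
            # anything beyond that is rejected.
            limit_req zone=perip burst=10;
        }
    }
}
```

Keying on `$binary_remote_addr` rather than `$remote_addr` keeps the per-client state smaller, so the zone holds more IPs in the same memory.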

~~~
yield
very helpful

thank you very much.

------
spooneybarger
you can use robots.txt to stop the ones that behave. those that don't behave,
you would need to cut them off upstream. depending on how you are metered,
that might mean dropping the connection when it gets to your box or it might
mean having your hosting provider do it before it gets to your vps.

~~~
yield
thank you very much

I use nginx as the web server currently. Is there a best practice for
this?

thank you.

~~~
spooneybarger
you want to drop connections either upstream of your machine or before nginx
ever accepts them, assuming, that is, that you want to block certain
connections entirely.

