

Ask HN: How do you deal with rogue bots? - zader

I run a niche dating and social networking website. After I built my own traffic analysis suite, instead of just relying on Google Analytics, I realized a ton of my bandwidth and server resources are devoted to serving up content to bots.<p>Many are valid bots whose spidering I welcome, but some of the most aggressive do not identify themselves as bots and are from places like North Korea and Russia. So I'm researching solutions, and the best I've come up with so far is using a bad_ips table in my Rails app to block addresses such as the ones listed in the blacklist at myip.ms.<p>How are other online entrepreneurs dealing with this phenomenon? Are there pre-existing solutions out there that are worth using, or should I proceed with my own custom model? Can any of you recommend best practices or seasoned advice in this area?
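
To make the bad_ips idea concrete, here is a minimal sketch of the lookup, written in Python for brevity rather than as Rails code. The `BAD_NETWORKS` entries and the `is_blocked` helper are hypothetical; a real list would be loaded from the bad_ips table or from an imported blacklist. Storing CIDR ranges instead of single addresses is an assumption that keeps the list shorter.

```python
import ipaddress

# Hypothetical blocklist entries (documentation-only address ranges).
# In practice these would come from the bad_ips table or a blacklist file.
BAD_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # a whole range
    ipaddress.ip_network("198.51.100.7/32"),  # a single address
]


def is_blocked(remote_ip: str) -> bool:
    """Return True if remote_ip falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Unparseable address: do not block here, let other layers decide.
        return False
    return any(addr in net for net in BAD_NETWORKS)
```

A linear scan like this is fine for a few hundred entries. A much larger list would probably want an indexed lookup or blocking at the firewall or proxy instead of in the app.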
======
paulsutter
You may want to use countermeasures that the bot owners can't easily detect,
so that you lengthen their feedback loop. If you block them outright, they
know instantly and can just move to a different server at Amazon.

For example, you could put them in a tarpit and serve each successive page
more slowly than the last, using exponential backoff or something similar.
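
A rough sketch of the tarpit delay, assuming a per-IP hit counter kept in memory. The `Tarpit` class, the 0.5 second base, and the 30 second cap are all illustrative choices, not anything from the thread.

```python
from collections import defaultdict


class Tarpit:
    """Tracks suspected bots and computes a growing delay per request."""

    def __init__(self, base: float = 0.5, cap: float = 30.0):
        self.base = base              # delay for the first tarpitted request
        self.cap = cap                # upper bound so workers are not held forever
        self.hits = defaultdict(int)  # client IP -> tarpitted requests seen

    def next_delay(self, client_ip: str) -> float:
        """Return the delay in seconds for this request, doubling each time."""
        n = self.hits[client_ip]
        self.hits[client_ip] += 1
        # Clamp the exponent so very long sessions cannot overflow the float.
        return min(self.cap, self.base * (2 ** min(n, 32)))


pit = Tarpit()
delays = [pit.next_delay("203.0.113.45") for _ in range(8)]
# delays == [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

The caller would wait for the returned delay before responding. Sleeping inside an app worker ties that worker up, so it may be cheaper to apply the delay in an async handler or at the proxy layer.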

You could also serve bad data back to them once they get far enough into the
tarpit.
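
One way the bad-data step might look, as a sketch only. The `DECOY_AFTER` threshold and the `choose_response` helper are hypothetical, and shuffling the words of the real page is just one cheap way to produce plausible-looking junk.

```python
import random

DECOY_AFTER = 5  # hypothetical: tarpitted requests before decoys start


def choose_response(tarpit_hits: int, real_body: str, seed=None) -> str:
    """Return the real page early on, and a scrambled copy past the threshold."""
    if tarpit_hits < DECOY_AFTER:
        return real_body
    rng = random.Random(seed)  # seedable so the behavior can be tested
    words = real_body.split()
    rng.shuffle(words)         # same words, useless order
    return " ".join(words)
```

Because the decoy has roughly the same size and vocabulary as the real page, the bot owner gets no obvious signal that anything changed.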

It all depends on how much work it's worth doing.

I'd love to hear more ideas; it's a fascinating question. So many sites have
this problem, and it's an endless cat-and-mouse game. Perhaps there is a
product opportunity in here somewhere.

------
TeHCrAzY
Just ban the most intrusive bots; it's just going to turn into whack-a-mole
otherwise.

