
A well-behaved crawler is self-limiting in this regard: it avoids requesting URLs from the same domain too often. So the better you are at spreading requests over as many domains as possible, the less you'll be affected by such traps.

You can do a lot of other things too, but the above already takes most of the sting out of such traps unless the trap itself spans a huge number of domains.
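
To make the idea concrete, here's a minimal sketch of such a per-domain limiter in Python. The class name, method names, and the 10-second default are illustrative, not taken from any particular crawler:

    import time
    from urllib.parse import urlparse

    class PoliteScheduler:
        """Per-domain rate limiter: defers URLs whose domain was hit too recently."""

        def __init__(self, min_delay_seconds=10.0):
            self.min_delay = min_delay_seconds
            self.last_request = {}  # domain -> time of the last fetch

        def ready(self, url):
            """True if enough time has passed since we last hit this domain."""
            domain = urlparse(url).netloc
            last = self.last_request.get(domain)
            return last is None or time.monotonic() - last >= self.min_delay

        def mark_fetched(self, url):
            self.last_request[urlparse(url).netloc] = time.monotonic()

A fetch loop would only pop URLs for which ready() returns True and requeue the rest. Since a trap typically lives on one domain (or a handful), this caps it to one request per delay interval per domain, no matter how many URLs it generates.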



