
> we only noticed them if they hit us too hard, we told them to back off, and then they didn't

Response code 429 is your friend: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#429




If you're writing a scraper, you should handle any HTTP error code and back off.

And if you want to get really pedantic, 429 didn't exist when we did this. It wasn't approved until April 2012, and the first patches for it didn't show up until around 2014. We could have monkey-patched it in if we'd really wanted to, but we didn't really want to.


You should back off whatever the error. This is on the client to implement. HTTP libraries generally don't act on a 429 themselves (they won't make the client wait), so I don't think it would do much to get misbehaving bots to slow down.
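
For anyone wondering what "on the client to implement" might look like, here is a rough sketch in Python, not anything from the scrapers discussed above. The names `fetch_politely` and `backoff_delay` are made up for illustration, and `fetch` is assumed to be any callable you supply that returns `(status, headers, body)`. It treats every 4xx/5xx as a reason to slow down, honors a `Retry-After` header when it is given in whole seconds (the HTTP-date form is ignored here), and otherwise falls back to exponential backoff with jitter.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def backoff_delay(attempt: int, headers: Mapping[str, str],
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (0-based)."""
    # Assumes the header key is spelled exactly "Retry-After".
    retry_after = headers.get("Retry-After", "").strip()
    if retry_after.isdigit():
        # The server named a delay in seconds; honor it, up to the cap.
        return min(float(retry_after), cap)
    # No usable hint: exponential backoff with full jitter.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def fetch_politely(fetch: Callable[[], Response], max_attempts: int = 5,
                   sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `fetch` until it returns a non-error status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status < 400:
            return status, headers, body
        if attempt < max_attempts - 1:
            # Any error status, 429 or otherwise, means slow down.
            sleep(backoff_delay(attempt, headers))
    # Out of attempts: hand back the last error response.
    return status, headers, body
```

You would wrap whatever HTTP call you already make in a small function that returns that tuple and pass it in as `fetch`. The `sleep` parameter is only there so the waiting can be swapped out in tests.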



