
Well, this is just a Python web scraper, and Instagram does in fact attempt to detect and prevent/rate-limit this kind of scraping. They rely very heavily on the source IP to help them determine when to cut you off.


... indeed. Isn't the script calling the API under the hood anyway (looking at 'scrolldown' here)?

I reverse-engineered the API myself a couple of weeks ago, which was great fun, especially figuring out Instagram's rate limits on interactions such as comments and likes per day/hour.


Not the Instagram developer APIs, but the one that Instagram's frontend consumes. The script scrapes Instagram's frontend here.


I put a delay of 2-3 seconds between each image download. Do you think this will prevent Instagram from detecting the scraper?
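
For reference, a pause like that could be sketched roughly as below. This is only an illustration, not the script's actual code: `polite_delay`, `download_all`, and the `fetch` callable are hypothetical names, and the 2-3 second range is taken from the comment above. Randomizing the pause only spaces requests out unevenly; whether that avoids detection is an open question, since the source IP reportedly matters a lot.

```python
import random
import time


def polite_delay(min_s=2.0, max_s=3.0, sleep=time.sleep):
    """Sleep for a random duration between min_s and max_s seconds.

    Returns the chosen duration so callers can log it.
    """
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def download_all(urls, fetch, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive downloads.

    `fetch` is a hypothetical callable supplied by the caller; it stands in
    for whatever the scraper uses to download one image.
    """
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # No pause is needed before the very first download.
            polite_delay(sleep=sleep)
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter is just a convenience so the loop can be exercised without actually waiting.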


Interesting. I might reverse-engineer it and see if we can introduce some kind of backoff and then resume the scraping.
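
A backoff-and-resume loop along those lines might look something like this. It is a minimal sketch under assumptions, not how the script works today: `RateLimited` is a hypothetical exception that the caller's `fetch` function would raise when it thinks it has been cut off (how to recognize that on Instagram's side is not something this thread establishes), and the delay values are placeholders.

```python
import time


class RateLimited(Exception):
    """Hypothetical signal raised by fetch() when the server cuts us off."""


def scrape_with_backoff(items, fetch, start=0, base_delay=1.0,
                        max_delay=60.0, max_retries=5, sleep=time.sleep):
    """Fetch items[start:] in order, backing off exponentially on RateLimited.

    Returns (results, next_index). If next_index < len(items), the run gave
    up after max_retries failures on one item, and the caller can resume
    later by passing start=next_index.
    """
    results = []
    index = start
    retries = 0
    while index < len(items):
        try:
            results.append(fetch(items[index]))
        except RateLimited:
            if retries >= max_retries:
                # Give up for now; index marks where to resume.
                return results, index
            # Wait 1x, 2x, 4x, ... base_delay, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** retries))
            retries += 1
            continue
        # Success: reset the retry counter and move to the next item.
        retries = 0
        index += 1
    return results, index
```

Returning the index of the first unfetched item is one simple way to make the scrape resumable; persisting that index to disk between runs would be a natural next step.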



