Hacker News

Ohhh yea, I run into this memory issue very quickly when scraping (especially with a large URL dataset, it will inevitably hit a website with a giant chunk of markup). So I have to set timeouts and blacklist hosts whose requests time out, but also completely reset the (headless) browser every 2-3 requests (which is overkill, but those workers are memory-constrained). Feel free to drop me an email sometime (should be on my HN profile)
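A minimal sketch of what I mean, in Python. `launch_browser` here is a hypothetical factory you'd implement with whatever headless driver you use (Playwright, Selenium, etc.); the point is just the bookkeeping: blacklist a host on timeout, and tear down and relaunch the browser every N requests so memory can't accumulate.

```python
class ScrapeWorker:
    """Fetch pages through a short-lived headless browser.

    The browser is fully closed and relaunched every
    ``max_requests_per_browser`` fetches, which caps memory growth
    at the cost of relaunch overhead. Hosts whose fetches time out
    go on a blacklist and are skipped from then on.
    """

    def __init__(self, launch_browser, max_requests_per_browser=3):
        # launch_browser() is assumed to return an object with
        # .fetch(url) -> html (raising TimeoutError on a deadline miss,
        # e.g. via the driver's own page-load timeout) and .close().
        self._launch = launch_browser
        self._max = max_requests_per_browser
        self._browser = None
        self._count = 0
        self.blacklist = set()  # hosts that previously timed out

    def fetch(self, url, host):
        """Return the page HTML, or None if skipped or timed out."""
        if host in self.blacklist:
            return None
        # Hard reset: close and relaunch the browser every N requests.
        if self._browser is None or self._count >= self._max:
            if self._browser is not None:
                self._browser.close()  # frees everything the pages leaked
            self._browser = self._launch()
            self._count = 0
        self._count += 1
        try:
            return self._browser.fetch(url)
        except TimeoutError:
            # Slow/huge page: never try this host again.
            self.blacklist.add(host)
            return None
```

Resetting every 2-3 requests is aggressive; with more headroom you'd raise `max_requests_per_browser` and rely on the timeout blacklist to catch the pathological sites.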
