
It's public data! You could scrape and index it now for free if you wanted...


They even throttle the FriendFeed scraper, which graciously pulls all of its users' data at once.

You can't pull out their data with a simple scraper unless it is distributed across hundreds of machines around the web.


I heard that there is this thing called "the cloud" where you can rent servers and pay only for the time they run. That makes cheapo servers both realistic and quite simple ;)

Actually I just noticed you get 750 hours of free micro instance time from AWS... I wonder if it would be worth doing. I imagine the links + tags are <100GB in total.


Though I've noticed that only pages up to 200 work when going back through history. That only gets you a few days back on the most popular tags.
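For what it's worth, here is a rough sketch of how a single-machine crawler might live with both the throttling and the 200-page cap mentioned in this thread. Everything named here is hypothetical: `crawl_tag`, `fetch_page`, and the one-second delay are assumptions for illustration, not anything the site documents. The page fetcher is passed in as a function so the loop itself makes no claim about the real URLs or markup.

```python
import time
from typing import Callable, Iterator, List

# Assumption taken from the comment above: pages past 200 reportedly stop working.
MAX_PAGE = 200


def crawl_tag(
    tag: str,
    fetch_page: Callable[[str, int], List[str]],  # hypothetical: returns the links on one page
    delay_s: float = 1.0,                         # assumed polite delay, not a documented limit
    max_page: int = MAX_PAGE,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield links for one tag, page by page, pausing after each request."""
    for page in range(1, max_page + 1):
        links = fetch_page(tag, page)
        if not links:
            break  # history ran out before we reached the page cap
        yield from links
        sleep(delay_s)  # pause so requests stay spaced out
```

With a one-second delay, a full 200-page pass over one tag takes a bit over three minutes, so the cap on history matters more than the throttle for popular tags.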


Sure, but user pages go back farther than that...


"... It's public data! You could scrape and index it now for free ..."

That is the most insightful thing I've read today.



