
I work on a lot of web scraping and we have business agreements with every site that we scrape explicitly allowing us to do so and with pre-approval for the scraping rate (which we carefully control).

None of this gets around over-eager Cloudflare or Akamai rules set up years ago by some contractor that the businesses have no real ability to change.
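For anyone wondering what "carefully control the rate" might look like in practice, here is a minimal sketch, assuming a simple fixed-interval limit. The `RateLimiter` class and its `max_per_second` parameter are made up for illustration and are not the poster's actual setup; a real scraper would call `wait()` before each request.

```python
import time


class RateLimiter:
    """Spaces calls so that at most max_per_second happen per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        # Hypothetical parameter: the rate agreed with the site owner.
        self.min_interval = 1.0 / max_per_second
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Schedule the earliest time the following request may go out.
        self._next_allowed = now + self.min_interval
        return delay
```

Usage would be something like `limiter = RateLimiter(2)` and then `limiter.wait()` before every fetch. Of course, this only keeps you within the agreed rate; it does nothing about a WAF rule that blocks you regardless.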




Why scrape at all if there are agreements in place? Seems like an API task.


> None of this gets around over-eager Cloudflare or Akamai rules set up years ago by some contractor that the businesses have no real ability to change.

If the business has no ability to even change the Cloudflare settings, I don't expect them to be able to provide an API.


Only about 1 in 20 of our partners has the in-house tech necessary to provide us with an API. In other words, we were able to accelerate our integrations by 20x by scraping.


In that case, the web site IS the API… the only difference is that the data is wrapped in HTML, not JSON.
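To make the "HTML instead of JSON" point concrete, here is a rough sketch using only the Python standard library. It assumes the data lives in a plain `<table>` with a header row; the sample markup, the `TableParser` class, and the `html_table_to_records` helper are all invented for illustration, and real pages are usually messier.

```python
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_records(html):
    """Turn the first row into keys and every later row into a dict."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample page standing in for a partner's site.
SAMPLE_HTML = """
<table>
  <tr><th>sku</th><th>price</th></tr>
  <tr><td>A-100</td><td>19.99</td></tr>
  <tr><td>B-200</td><td>5.00</td></tr>
</table>
"""

records = html_table_to_records(SAMPLE_HTML)
print(json.dumps(records))
```

The same records an API would have returned as JSON come out the other end; the scraper just has to do the unwrapping itself, and it breaks whenever the markup changes.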



