
The whole point of using a headless browser is to work around web sites that attempt to block simple "curl"-style scraping (or where you need to execute JavaScript to scrape).

So making it detectable (intentionally, even, right there in the user agent!) is really absurd.

Or actually, it makes one wonder about Google's motives.




That's one use case for headless browsers. Most people actually use headless browsers to test their own website, i.e. for functionality, performance, and rendering.


That's definitely not the whole point of headless browsers; it's more of a side effect. The whole point of headless browsers is rather automation and testing.


The same way torrents are for the distribution of legal content. That was the original idea, and they're still used for that, but I'd bet the majority of headless browser requests crawl websites not owned by the scraper.


Making the web harder to crawl would make it harder to create a Google competitor. I doubt that that's their intention, though.



