
For sure, but they shouldn't be hypocritical about it. If they don't consider themselves content parasites, they shouldn't consider people scraping their site to be content parasites, either. (Some sites really are just parasites, though.)


There's nothing hypocritical about it. Googlebot respects robots.txt configured on pages it scrapes. Google in turn expects that their own robots.txt will be respected. What's the issue?

https://www.google.com/robots.txt
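
For anyone unfamiliar with what "respecting robots.txt" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `EXAMPLE_ROBOTS_TXT`, the `ExampleBot` user agent, the `example.com` URLs, and the `is_allowed` helper are all made up for illustration; they are not Google's actual rules and not how Googlebot is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
# It is NOT a copy of any real site's file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Allow: /about
Disallow: /search
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/about", "/search"):
        url = "https://example.com" + path
        print(path, is_allowed(EXAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

A well-behaved crawler would run a check like this before each request and skip any URL for which it returns `False`. Nothing enforces that; compliance is voluntary on the crawler's side.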


Can I politely point out that the conversation is not about respecting robots.txt?

If you want to talk about this in terms of robots.txt, Google is thriving on the fact that other companies don't block their content in robots.txt, but at the same time Google blocks all of its content in its robots.txt.


> If you want to talk about this in terms of robots.txt, Google is thriving on the fact that other companies don't block their content in robots.txt, but at the same time Google blocks all of its content in its robots.txt.

It seems like you're stating this as though to cast some sort of moral aspersion. I don't get it. If other companies don't want Googlebot to scrape them, they just have to say so. Most companies want Googlebot to scrape their content. Google doesn't want other people's scrapers to scrape Google's content. Nobody involved in any of this has done anything unreasonable or morally objectionable.



