Hacker News

Isn't that just creating a duplicate internet then - one that taxpayers have to pay to host instead? That creates a ton of issues around being forced to pay for hosting other people's speech. For example, I don't want to pay to host anti-vaxxer sites, but because of the First Amendment, this public service couldn't remove them.


Why would you care about the cost to store a copy of sites you disagree with? It's a crawl; it's not the government making value judgements one way or the other. I'm sure the Library of Congress also contains material you disagree with.

I don't think it would count as a duplicate internet any more than archive.org would.


The Library of Congress is curated by selection officers for material value; the internet is not.


I didn't mean to say that the Library of Congress was indiscriminate, just that it likely also contains material that you disagree with. And that doesn't really matter, because it's not the purpose of the repository to contain only non-controversial material.


Without an example, I can't agree with you, and no - the point is that the Library of Congress is curated content; the internet is not.


Old books from the 1800s with very outdated views on race, sex, etc. would be an obvious example.

Edit: and why is curation relevant to the question of whether there should be a common crawl that can be accessed by all, a publicly hosted mirror of the web that can be used to create competing search engines?


I really don't care about old books from the 1800s.



