> Whereas Google was previously a way for sites to be discovered and for sites to generate revenue, it is increasingly becoming the sole source system where data is scraped and imported into Google, and Google keeps all of the revenue to itself.

I wondered yesterday: if you provide microdata, Google scrapes it, and you later decide to remove your sites from Google - is Google allowed to keep the microdata and continue to publish it?




If you once communicated the fact that 2+2 is 4 and that Joe makes very good spaghetti, you own the copyright to the text you published, but neither the fact nor the opinion belongs to you in any meaningful way, nor should it.


That's true, but a collection of facts ("a database") falls under copyright.


Not sure why you are downvoted. In the EU, databases fall under copyright, which indeed leaves the question of how Google legally deals with this (technically database right isn't copyright, but in this context that's a technicality).

Also, to quote from the Wikipedia article [1]: "An owner has the right to object to the copying of substantial parts of their database, even if data is extracted and reconstructed piecemeal."

[1] https://en.wikipedia.org/wiki/Database_right


> Not sure why you are downvoted.

Because they didn't say "in the EU", and it not being copyright is not just a technicality. Copyright is about creative expression, and utilitarian collections of facts aren't creative expression.


> Because they didn't say "in the EU"

They also didn't say "in the US". From context you can only assume "in some jurisdiction Google cares about".

> Copyright is about creative expression

That's not true, or at least a very US-centric view. The Berne Convention, the international standard for copyright, reads:

"[...] shall include every production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression, such as books, [...] works expressed by a process analogous to photography; works of applied art; illustrations, maps, plans, sketches and three-dimensional works relative to geography, topography, architecture or science."

also

"Collections of literary or artistic works such as encyclopaedias and anthologies which, by reason of the selection and arrangement of their contents, constitute intellectual creations shall be protected as such"

That's lots of things that are not exactly "creative expression" (even though exceptions for pure statements of fact do exist).

https://en.wikipedia.org/wiki/Berne_Convention

https://wipolex.wipo.int/en/text/283698


"by reason of the selection and arrangement of their contents"

If there was no selection, or if you make the original selection irrelevant while also supplying your own arrangement, then there's no violation of copyright.


https://www.bitlaw.com/copyright/database.html#data

This doesn't provide any protection for the underlying facts.


No, but if Google imports a database then they are still affected by the compilation copyright. It's too obvious a hack to "just" claim that "yeah, we imported that entire database, but then we cracked all the facts apart and they're all separate now and it's just as if we never imported the database". That's not how the law works.

Even more interestingly, we've still never resolved the question of why Google gets to lift your entire site's contents and re-serve them in arbitrary ways to their own profit in the first place. It's really just a thing that happens on the internet because it was happening on the internet before the lawyers got there. I've said before and still believe that if there were no such thing as a search engine and they were invented today, they'd be annihilated in court as nothing but one big copyright violation.


I'd rather give up copyright than search engines. Anyone who wants to push too hard ought to consider whether an entire nation might make the same choice.


IANAL, but as long as Google is only distributing the individual facts (not the database of facts), they would be in the clear, legally.


Removal is irrelevant because Google doesn't rely on a license for its index.

robots.txt is a courtesy, not a legal obligation.


I am not sure there's a specific copyright applicable there. You can ask Google to remove your website's data from their index (primarily via robots.txt)... but of course, that also delists you from search. Essentially Google has left you with the impossible choice of either letting them steal your data for free or accepting not being findable in the primary search engine on the Internet.


Yeah, but if you don't get any clicks, that choice is no longer impossible: you're providing value without getting any in return.

Granted, we're still a while away from that territory; I think most sites still profit from Google.



