You're going to need to explain how scraping publicly available information on a website is theft.
If information is your competitive advantage maybe you shouldn't have it on a publicly accessible website, and should instead stick it behind an API with pay tiers and a very clear license regarding what you may do with it as an end user.
Note, a simple sign up being required to view a website makes it not publicly available information any longer and you can cover usage, again, in a license.
Then you have a whole bunch of legal avenues you can use to protect your work. Assuming you can afford it that is.
How practical is this really though? Like, imagine you're a newspaper. Unless you're the FT or Wall Street Journal or something like that, nobody is making an account to read an article. They'll just go somewhere else.
Do I need to explain that copyright is practically unenforceable in the 21st century? Data is trivially copied and there's nothing you can do to fight that, no amount of laws will ever make it non-trivial again. Even if you successfully sue somebody for this, it won't stop them.
At some point people are gonna have to accept this.
For what it's worth, I didn't think it was rude. And I do think it was a substantive aspect of his response. It's a valid perspective even if I personally think it's somewhat orthogonal and incomplete.
I was replying to a similar sentence, but it is true that in the end it did nothing but escalate the situation. I apologize and yes, I will try to be more polite.
Well if you want to stick to the hard facts then it's even simpler: copyright infringement is not theft - those things are covered by entirely separate laws.
That may be true. A list of prices might not be copyrightable in your particular jurisdiction. However I was only responding to the raw assertion made without any such qualification.
On the opposite end of the spectrum might be a photographer's website containing a gallery of their sample work. The fact that the gallery is openly published doesn't represent a relinquishing of copyright over those images.
No, but those are substantively different situations, so much so that this exact thing is being argued in the highest courts of the US. It's not quite the clear-cut case you seem to believe it to be.
Interesting. From what I can gather, the material in HiQ Labs v. LinkedIn was not claimed to be under copyright, and the argument being fought over in court was with respect to mechanically subverting access to a competitor.
You appear to have claimed that rights over material are broadly relinquished if it's published in public:
> If information is your competitive advantage maybe you shouldn't have it on a publicly accessible website
And your distinction was further clarified when you argued that placing barriers to access fundamentally changes the equation:
> Note, a simple sign up being required to view a website makes it not publicly available information any longer and you can cover usage, again, in a license.
Perhaps you meant to speak only of material which is not subject to copyright? In which case I think your argument does track.
Mmm, I’ll go with that. To be honest, I saw that LinkedIn had initially made a complaint under the DMCA as well (which HiQ then got an injunction for), and given how the case played out I was uncertain to what extent the case was signaling that you may be waiving certain rights by making some content publicly available with no gating like a sign up.
No, but there is a legit philosophical argument about theft when it comes to copyright. There are two ways to look at theft: acquiring something you didn't earn vs. someone losing something they did earn. Generally, we tend to focus on the latter. From that perspective, "copying" is really not "theft", and arguably "copyright" does more net societal harm than any benefit it provides.
No? If you place information publicly on a website it's pretty much fair game, no copyright violation, especially regarding user-generated information. That's my take, but legally it's a gray area and it's still going back and forth in the courts (at least in the US). For a while, before a decision was vacated by the Supreme Court, scraping publicly available information on a site was legally protected, seemingly in line with my thoughts on it.
If we are to live in a mutually prosperous society, how is the labor and therefore well-being of the content creator improved by a web scraper? Does this precedent not injure future opportunities for exercising one’s life to making website data available for others to scrape?
By my understanding any website with a copyright disclaimer warrants its data as exclusively its own and is granting permission for other web users to generate it, i.e. people are not entitled to share their web data with anyone. So if they are, and we agree that it’s good that they do, and continue to create information for others to know, how do we avoid the implicit harm in extracting data without anything being given in return but possibly harming the internet’s experience for everyone accessing the same information?
I'm actually cool assuming there is implicit harm and no benefit, but by that logic we need to tear Google down too. I'm cool making that trade but it has to be done equally.
If you can't make that trade, then you've weighed the value provided by an organization like Google as greater than the copyright of these content creators. In that case I want other players who may want to challenge Google to have the same protections and access Google does, so they have a chance at providing the same value.
Well actually, isn’t Google improving the value of the content, and ergo the property itself, by improving its accessibility? I was implying a one-way street with the accumulated data that can lead to server crashes - which I don’t believe Google’s web crawl does at all (in fact that would be counter-productive).
I'd say most crawlers are looking to provide enriched value for content at their end use. Google is just an aggregator (the biggest by far) but other aggregators are looking to provide similar value.
So then an aggregator is different than a scraping service with respect to the value given to the rest of humanity? In that, in principle one adds value to the content creation and the other deducts through potential harmful interference with its reciprocity?
I'd say many aggregators do offer value to the rest of humanity, though I imagine there are probably some exceptions. Also, not all scraper services offer no value; it's just different value to different people.
Some scraping services make their money by offering scraping services to companies for specific information and you could argue they provide value to other businesses that way, but not to the broader "rest of humanity".
So I'm not sure it's as simple as just "aggregator" good, "scraping service" bad, as value provided takes on many different forms, and that's what makes this difficult.
I guess it may come down to your take on what you think of middlemen, because they are all effectively middlemen in the data economy.
Edit: I was rereading your comment. With respect directly to the value added to the content, then yes, maybe it is clearer that aggregators are in principle different, because they do add that value, whereas scraping services that sell the data do not offer any enrichment to the content creator. I personally think protecting content aggregators that republish the data to create visibility or other value for the content creator, to the extent that they're not worried about being sued for that, is probably worthwhile because of the net benefit to our ability to find information/content.