

Ask HN: Do Startup Databases have copyright over the data on their platform? - flylib

I have a question regarding Startup Databases that charge money to use their platform such as Mattermark (https://mattermark.com), Datafox (http://www.datafox.co), CBInsights (https://www.cbinsights.com), and Tracxn (http://tracxn.com)

who owns the data? They are sometimes or mostly just scraping publicly available info in potentially illegal ways in the first place so how can they then claim copyright on the data they have that is publicly available anyway?

Example:

CBInsights admits to scraping data of newspapers which I believe is illegal to then use for commercial gain, are they claiming they have legal contracts with every newspaper?

"We crawl over 12,000 information sources daily ranging from local, national, and international news publications to SEC filings to investor and corporate websites to social media sources and parse out structured data elements including company name, investor names, amounts, Board of Director names, etc. programmatically."

http://www.cbinsights.com/venture-capital-database
======
postjock
The quoted text indicates only that they parse out "structured data elements."
The test for copyright infringement is a four-prong test.

1. Access to the original work
2. Copying of the original work
3. Substantial similarity between the original and the copy
4. Damages

If you were to take select elements from the database and compare each
element to the work from which it was scraped, would you find that there is a
substantial similarity between the data element and the work in its entirety
from which the element was obtained? If the answer is no, then the third
prong of the test fails, and there is no infringement.

~~~
flylib
The scraping they are doing is illegal in the first place. Do they have
written permission for, and do they cite and source, every data point they have?

By their logic, someone can scrape their site and do enough manipulation on
the data to avoid similarities, hence no infringement.

