
Dataset of VCs investing in seed and Series A+ rounds - scherbak
https://unicorn-nest.com/dataset/
======
bloudermilk
Judging by the 100% match rate with Crunchbase I'm guessing this is an
analysis of CB data. It's still valuable, but I'm curious if/how they managed
to secure a license to redistribute this publicly. I've spoken to their sales
team before and they're very particular about how you can export & share their
data.

------
dsaavy
Can't be used for analytical purposes... Welp looks like no one should even
download or look at it. You could violate the ToS by looking at the data and
drawing conclusions!

~~~
munchbunny
Feels like someone in legal wanted that written in without appreciating that
you don’t release datasets just to look pretty. You release datasets so people
can look for insights.

Or was PR the point?

~~~
smoe
Looks to me like it is basically a sitemap to their website with some
additional info sprinkled in.

I'm not really involved much in VC so I don't know how useful and reliable
that information is, or how hard it would be to come by otherwise.

~~~
gunshai
I had the exact same question.

Edit: and judging by the site's robots.txt you could just scrape the
information yourself and avoid any ToS problems.
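
For anyone who wants to check what a robots.txt actually permits before
scraping, here is a minimal sketch using Python's standard library. The rules
in SAMPLE_ROBOTS are made up for illustration and are not copied from the
site's real file, and is_allowed is a hypothetical helper name. Note that
robots.txt only tells you what crawlers are asked to avoid; it says nothing
about ToS or licensing.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, NOT the actual file from any real site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(SAMPLE_ROBOTS, "mybot", "https://example.com/investor/csv"))
    print(is_allowed(SAMPLE_ROBOTS, "mybot", "https://example.com/admin/panel"))
```

To check a live site you could instead call set_url() and read() on the
parser, which fetches the real file.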

------
klohto
> Shall not be used for any scientific or academic research, in commerce, for
> analytical purposes, for or any mailout or information distribution
> purposes, as well as for any illegal purpose

Well that's a bummer

~~~
icedchai
Not to worry. It says "shall not" instead of "must not."

~~~
oefrha
RFC 2119:

> MUST NOT This phrase, or the phrase "SHALL NOT", mean that the definition is
> an absolute prohibition of the specification.

[https://tools.ietf.org/html/rfc2119](https://tools.ietf.org/html/rfc2119)

~~~
icedchai
RFCs aren't legal documents. See
[https://feltg.com/shall-will-may-or-must/](https://feltg.com/shall-will-may-or-must/):
"The only word of obligation
from the list above is must – and therefore, the only term connoting strict
prohibition is must not. The interpretation of everything else is up for
debate."

------
scherbak
Today we published the database of funds for free. This is currently the
largest, most complete and most relevant dataset in the world. It contains
more than 500,000 data cells about 26,000+ venture funds and more than 30,000
employees who make investment decisions, including the rules of their email
formation (we cannot share emails directly because of GDPR). We will update
investor profiles on our website on a weekly basis, and this database on a
quarterly basis.

~~~
axlee
Incredible. What led your company to divulge this valuable data?

~~~
scherbak
We don't see any intrinsic value in owning the dataset. We are working on a
tool that optimizes the search of investors and saves dozens of hours. We are
happy to share the dataset with those entrepreneurs who know what they need
and what type of an investor they are searching for.

------
bt1a
If I want to publish a public visual on some of the most common words in the
field "Some of TOP industries" - does this fall broadly under analytical
purposes? I'm not sure I even understand the purpose of releasing the data if
it's not to be analyzed. In a commercial manner I understand...
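
For reference, the kind of word count I have in mind is roughly the sketch
below. The column name is the one from the download; the sample rows, the
top_words helper name, and the assumption that the field is free text that can
be split on non-alphanumeric characters are all mine.

```python
import csv
import io
import re
from collections import Counter


def top_words(csv_text: str, column: str = "Some of TOP industries", n: int = 10):
    """Count the most common lowercase words in one column of CSV text."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Missing or empty cells are treated as empty strings.
        cell = (row.get(column) or "").lower()
        counts.update(re.findall(r"[a-z0-9]+", cell))
    return counts.most_common(n)


# Invented sample rows for illustration only, not taken from the dataset.
SAMPLE = (
    "Fund,Some of TOP industries\n"
    'A,"Software, FinTech"\n'
    'B,"Software, Health Care"\n'
)

if __name__ == "__main__":
    print(top_words(SAMPLE, n=3))
```

The resulting (word, count) pairs could then be fed to whatever charting or
word-cloud tool you prefer.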

~~~
scherbak
We meant to prohibit use in a commercial manner. We will have to adjust the
terms accordingly. Thanks for pointing this out.

------
mistrial9
Is there not massive irony in publishing something "very valuable" that is
then restricted from any practical use? Seems to echo the bizarre and tortuous
legal world these funds, and their people, live in...

------
kragen
wget [https://backend.unicorn-nest.com/investor/csv](https://backend.unicorn-nest.com/investor/csv)

They purport to assert some undefined kind of intellectual property rights
over the data using a clickwrap contract. I modified the DOM to say "Disagree"
before clicking. In the US, I'd think _Feist_ would protect you from any
copyright claims on the data, although I haven't looked at it in detail, so it
might be creative.

~~~
bowmessage
"Your honor, I simply modified the DOM to my liking and now claim ownership of
all copyrighted material everywhere!"

~~~
kragen
That does seem to be the reasoning behind clickwrap contracts in general, yes
— that by someone clicking on a button or opening an envelope or whatever,
they are signaling their agreement to whatever contract terms are in the DOM
or written on the paper or whatever. It's absurd, but if you accept it, it
seems that you would have to accept that the fact that the button I clicked
said "Disagree" means that clicking on it did not signal any such agreement.

------
anticsapp
I think this is cool. Not going to bother anyone just yet. For any VCs reading
this, do you like this sort of thing or is it intrusive? Feels like your inbox
would get powerbombed with shitty dealflow.

~~~
tixocloud
Most shitty deal flow gets ignored and it doesn’t serve founders well. The
best approach is to spend time to research the firm and the partner you’re
looking to seek investment from. You’ll get results when what you’re building
matches what the partner’s thesis or perspective is.

~~~
anticsapp
I agree. It just seems like basically giving out their email addresses is a
bridge too far, even though it's not that hard to figure them out.

What if this data was clustered not just by top industries invested in, but by
thesis and ethos?

~~~
tixocloud
Yes agreed. That would be very helpful indeed. Even more so, which partners
invested in which companies would also be good.

------
bernardlunn
Hit Agree but no idea if 3 is true. If they scraped it, then they do not own
it.

1. Is for information use only

2. Shall not be used for any scientific or academic research, in commerce,
for analytical purposes, for or any mailout or information distribution
purposes, as well as for any illegal purpose

3. Is the property of Unicorn Nest, all rights reserved.

------
gumby
I used to find the reviews on thefunded interesting (especially the companies
that tried to get them suppressed). Anything recent like that?

------
Yaroslav1024
Very useful. Big job. Thanks Denis

------
fludlight
Saving this as a binary Excel file (.xlsb) speeds the spreadsheet up a bit.

------
sturza
From the ToS it seems to belong to Disney

------
vkedyk
WOW! Thanks for this!

------
MaxWellMax
I bet this dataset will be much more popular in investors' networks.

------
oi_enderturing
You guys rock! Thanks

------
wolco
For those who want to use this dataset for analytical purposes or research,
use this link:

[https://backend.unicorn-nest.com/investor/csv](https://backend.unicorn-nest.com/investor/csv)

No terms are required to be agreed to if downloading directly.

~~~
oefrha
Wait, are you an owner of this?

> No terms are required to be agreed to if downloading directly.

That’s not how terms work... Downloading something from a deep link doesn’t
mean it’s automatically WTFPL. (Unless you’re an owner and you say so, of
course.)

~~~
wolco
Not the owner.

Those terms are only presented, and you are only asked to agree to them, if
you go through the landing page. This provides a direct link. Follow local laws.

~~~
bepvte
This doesn't seem to hold up at all. What about bots that scrape a page? They
don't see the EULA but can still violate licenses.

