

The startup world, ranked and visualized - dbrush
http://endrank.com/crunchbase

======
artursapek
The way this is laid out it looks like the entities that are in a row are
associated, it's confusing. You name the columns at the top and label the
ranks on the left.

I love the use of different-sized circles, but I feel like you're comparing
apples to oranges in a way. I think the circles should all be relative to one
master size (#1) within their own category only; rather than comparing Ron Conway to
Google, his circle could be of equal size. This way when searching for entities
their circles would make more sense because the user is aware of the
consistent max circle size. By comparing Conway to Google you're giving up a
wider scale you could be using for the Peoples' circles.

Still though, cool project. I found the 5-person startup I'm interning at this
summer :) Makes for a great crash-course on who matters if someone were trying
to study up on the startup scene.

~~~
herdrick
Thanks!

We tried both options and liked using one scale for all categories. The idea
is that you are comparing influence, or something like it, so that can be
visually compared across categories. You do have the downside of having very
small dots for the people after only a few pages, but since there are >
100,000 people, most are going to have the minimum sized dot anyway.
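
The two sizing options discussed above (one scale for everything versus one scale per category) can be sketched roughly as follows. This is only an illustration, not the site's actual code: the entity names, scores, pixel sizes, and the choice to make circle area proportional to score are all assumptions.

```python
import math

def radius(score, max_score, max_radius=40.0, min_radius=2.0):
    """Radius such that circle area is proportional to score / max_score.

    max_radius and min_radius are hypothetical pixel sizes.
    """
    if max_score <= 0:
        return min_radius
    return max(min_radius, max_radius * math.sqrt(score / max_score))

def global_scale(entities):
    """One master size: every circle is relative to the overall top score."""
    top = max(e["score"] for e in entities)
    return {e["name"]: radius(e["score"], top) for e in entities}

def per_category_scale(entities):
    """Each circle is relative to the top score within its own category."""
    tops = {}
    for e in entities:
        tops[e["category"]] = max(tops.get(e["category"], 0.0), e["score"])
    return {e["name"]: radius(e["score"], tops[e["category"]]) for e in entities}

# Made-up scores, purely for illustration.
entities = [
    {"name": "BigCo", "category": "company", "score": 100.0},
    {"name": "SmallCo", "category": "company", "score": 25.0},
    {"name": "TopAngel", "category": "person", "score": 4.0},
    {"name": "OtherAngel", "category": "person", "score": 1.0},
]

global_radii = global_scale(entities)
category_radii = per_category_scale(entities)
```

With a global scale the two people end up much smaller than the companies, which keeps sizes comparable across categories. With a per-category scale the top person gets the same radius as the top company, which uses the full size range within each category but loses the cross-category comparison.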

------
stevenj
I like the interface design.

Questions:

Dave McClure at #3, Paul Graham at #943; whereas 500 Startups is at #41, and Y
Combinator is at #15?

Why is 500 Startups classified as a financial firm, whereas Y Combinator is
classified as a company?

Also, I'm surprised that Andreessen Horowitz is ranked #67 of financial firms
(given that Marc Andreessen is ranked #14), and Elon Musk is #119 for people.
I would have thought they'd both be ranked much higher.

~~~
herdrick
Thanks! We like to keep it clean.

In the data source, TechCrunch's Crunchbase, the influence of Paul Graham,
Jessica Livingston, et al. is captured in YC the company, and Marc Andreessen
is split between his personal identity and AH. So you get a lot of odd things
like that.

As for why YC got classified as a company, who knows? It's an accident of history.
They should change it.

~~~
bootload
_"... In the data source, TechCrunch's Crunchbase ..."_

TechCrunch as a single data source. Are there any other sources used?

~~~
dbrush
At the moment, no. This was one dataset we used to get started, but we intend
to incorporate data from Angel List and other sources as well.

~~~
bootload
_"... This was one dataset we used to get started, but we intend to
incorporate data from Angel List and other sources as well. ..."_

The sooner the better; TechCrunch data is iffy ~
<http://www.flickr.com/photos/bootload/2913315731/>, though useful as a starting
point. Did you do a select on the companies to check for multiple listings?

<http://www.crunchbase.com/company/y-combinator>

<http://www.crunchbase.com/company/ycombinator-4>

<http://www.crunchbase.com/company/y-combinator-2>
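
One rough way to check for multiple listings like the three URLs above is to group permalink slugs by a normalized key. This is a heuristic sketch, not anything the site is known to do: the normalization rule (drop a trailing `-<number>` and all hyphens) is an assumption. It can over-merge, since a company whose real name ends in a number would be caught too, so the output is best treated as a list of candidates for manual review.

```python
import re
from collections import defaultdict

def normalize_slug(slug):
    """Lowercase, drop a trailing '-<number>' suffix, then drop all hyphens."""
    slug = slug.lower()
    slug = re.sub(r"-\d+$", "", slug)
    return slug.replace("-", "")

def find_duplicates(slugs):
    """Return {normalized_key: [original slugs]} for keys seen more than once."""
    groups = defaultdict(list)
    for slug in slugs:
        groups[normalize_slug(slug)].append(slug)
    return {key: group for key, group in groups.items() if len(group) > 1}

# The three Y Combinator slugs come from the URLs above;
# "500-startups" is a made-up control entry.
slugs = ["y-combinator", "ycombinator-4", "y-combinator-2", "500-startups"]
duplicates = find_duplicates(slugs)
```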

------
herdrick
Co-builder of this here. Hope you like it! Don't forget to click on a dot -
then you can walk the graph of relationships among startups, people, and
financial firms.

------
aba_sababa
This reminds me of an analysis I once did:
<https://github.com/astanway/Crunchbase-Network-Analysis>

The data from Crunchbase is very, very dirty, but I managed to clean it up a
lot. Feel free to fork if you want pre-built NetworkX graphs.

~~~
herdrick
This is really interesting. I'm surprised you found that startups with higher
centrality raised less money. Not sure how to explain that. Your investor
graph has edges pointing both from investors to startups and back, right?
Anyway, thanks.

~~~
aba_sababa
Mm, no, just from investors to startups. Startups don't invest in investors :)

------
jmspring
Great to see Pejman on the list. His Plug and Play facility has some great
companies renting spaces and interesting events come through.

I miss attending events when a prior startup was based there.

------
dmor
Want to hustle? This is a great list to go through and familiarize yourself
with these people, what they make, what they need, how you can help them, what
you might ask them for if they ever offered to help you, etc. Great way to
visualize Crunchbase, which is a great resource.

~~~
joshbetz
Except for the part where there are 10 companies ranked higher than Apple...

~~~
drats
Given what this is ranking I am surprised they are so high, this has nothing
to do with market cap, cash on hand or the devotion of fans.

~~~
joshbetz
What does it have to do with?

 _The 217,000 most important companies, financial firms, and people..._

~~~
yariang
Why did you stop there though?

"...in the startup world..."

In fact, I am surprised to see a lot of non-startup companies like Nokia
listed quite highly.

~~~
joshbetz
So Google, Yahoo, and Microsoft are in the startup world, but Apple isn't?

------
dbecker
There will be a few surprises in any system...

But Yahoo ahead of Facebook, Amazon and Apple?

Myspace ahead of Apple?

Paul Graham is #943 among people?

Cool project, even with the eyebrow-raising results.

~~~
herdrick
Paul Graham's credit is mostly subsumed into Y Combinator's rank (#15 among
companies). Yahoo has bought a lot of startups. But yeah, there are some
things that don't make sense.

------
blo
Can you augment the data set with Angellist? There's more info there than on
crunchbase for newer startups.

------
staunch
Minor pagination issue: when you click "next" the "prev" link takes its place.
I went back and forth between the first page and second page a couple times
before noticing.

Very nice work!

~~~
herdrick
Glad you like it! You're right - that pagination would be better if the 'next'
link behaved better. We'll fix it sometime soonish.

------
graiz
MySpace ahead of Apple? That about sums up my opinion of crunchbase.

------
AndyNemmity
Did some random searches. Noticed quite a bit of data missing, but I was
searching for older stuff.

Good luck, useful.

------
athst
I think you can only say that this is a visualization of how things are in
Crunchbase, not "the startup world" in general. It's ranking things based on
the number of connections each thing has. For angel investors this may be a
good indicator, but not necessarily for companies or VC firms.

------
joshu
Yay! I'm #21 on the angel list. I should get more of my stuff in crunchbase
(more than half are missing)

------
hsshah
Nice work. Interesting way to visualize.

One request: in the UI, where you have checkboxes, can you add an 'exclusive'
select? So, for example, I want to see ONLY 'acquisitions' made; right now, we
have to deselect everything else manually.

------
pedalpete
Very cool, but you really need to work on the algorithm. It's easy to pick out
the reason why MySpace might be found higher than Facebook or, as a few people
have pointed out, PG vs YCombinator.

But my last start-up, which I shut down over a year ago (HearWhere, 19,883), is
ranked as more influential than companies with significantly larger
traffic that are still operating (example, AllRecipes, 23,087).

Nice to think I'm that influential, but I can assure you, I'm not (yet ;))

------
jquery
The "impartial algorithm" line was good for a laugh.

~~~
herdrick
It's impartial. Did you click thru to Wikipedia?

~~~
necubi
I'm not sure if you're joking or not, but there's no such thing as an
"impartial" ranking of things.

Any ranking will put importance on certain factors which are chosen from all
the possible factors by the designer of the ranking. You have chosen to rank
based (I presume) on incoming links on CrunchBase. However, this is
intrinsically no more impartial than a ranking based on yearly revenue, or
number of employees, or size of their offices, number of mentions in the New
York Times, or any other of an unbounded number of metrics.

I would argue that PageRank, while it may have done a good job of ranking
websites for the purposes of search, is a pretty poor choice here. It's highly
susceptible to reporting bias, where some companies will be better-represented
on CB. Empirically, you also get weird artifacts, like the ranking of MySpace
ahead of Apple.
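
To make the reporting-bias point concrete, here is a minimal PageRank sketch using plain power iteration. It is not the site's implementation (the thread never shows it), and the node names and edges are invented: two companies that differ only in how many relationships have been entered for them.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Basic PageRank by power iteration over a list of (src, dst) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank

# Hypothetical data: same kind of company, different profile completeness.
edges = [
    ("investor_a", "thin_profile_co"),
    ("investor_a", "full_profile_co"),
    ("investor_b", "full_profile_co"),
    ("competitor_x", "full_profile_co"),
]
ranks = pagerank(edges)
```

Here `full_profile_co` outranks `thin_profile_co` purely because more of its relationships were recorded, which is the kind of artifact described above.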

Choosing a good metric requires a theoretical explanation of why that metric
is important and why it helps answer your question (which you don't state, but
I suspect is something like, "which are the most important startups", whatever
that might mean). Just choosing _something_ doesn't necessarily tell you
anything useful.

[If you're interested in high stakes debates about rankings, look into the
hoopla that comes about every year when US News & World Report releases their
college rankings. These numbers can have huge effects on colleges' prestige
and, lower down the food chain, their bottom lines.]

~~~
herdrick
It certainly is susceptible to reporting bias, like most data out there.

Perhaps the better word, instead of 'impartial', would be 'disinterested'.

------
SkyMarshal
Nice reminder that Yahoo is far from down and out, despite what you hear
amongst the technorati these days.

------
franze
crunchbase is to the "start-up world"-metrics what alexa is to web-metrics.

------
googletron
Apple #11??

------
mwmnj
myspace?

~~~
herdrick
Yep, there are a number of surprises. For one thing, this is cumulative, so
deals made and competitor relationships from 2005 are as good as from 2012.
For another, having a well-fleshed-out Crunchbase profile makes all the
difference.

Try clicking on MySpace's blue dot and see what is connected to it.

~~~
pg
You might want to experiment with weighting data by recency.

~~~
herdrick
Unfortunately most of it isn't dated. But yes it'd probably help.
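
A sketch of what weighting by recency could look like, given that most records have no date. Everything here is an assumption for illustration: the exponential decay, the two-year half-life, and the fixed fallback weight for undated records are arbitrary choices, not anything from the site.

```python
from datetime import date

def recency_weight(event_date, today, half_life_years=2.0, undated_weight=0.5):
    """Exponential-decay weight: 1.0 for today, halving every half_life_years.

    Undated records (event_date is None) get a fixed fallback weight,
    since there is nothing to decay from.
    """
    if event_date is None:
        return undated_weight
    age_years = max(0.0, (today - event_date).days / 365.25)
    return 0.5 ** (age_years / half_life_years)

today = date(2012, 6, 1)
weight_now = recency_weight(date(2012, 6, 1), today)
weight_2010 = recency_weight(date(2010, 6, 1), today)
weight_2005 = recency_weight(date(2005, 6, 1), today)
weight_undated = recency_weight(None, today)
```

These weights could then multiply each edge's contribution in whatever ranking is used, so that a 2005 relationship counts for much less than a 2012 one. How undated records are treated matters a lot when they are the majority, so the fallback value deserves some experimentation.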

~~~
drx
You could try scraping archive.org for some estimation of dates.

