The startup world, ranked and visualized (endrank.com)
90 points by dbrush 2115 days ago | 46 comments

The way this is laid out, it looks like the entities in a row are associated, which is confusing. You name the columns at the top and label the ranks on the left.

I love the use of different-sized circles, but I feel like you're comparing apples to oranges in a way. I think the circles should all be relative to one master size (#1) only within their own category; rather than comparing Ron Conway to Google, his circle could be of equal size. This way, when searching for entities, their circles would make more sense because the user is aware of the consistent max circle size. By comparing Conway to Google you're giving up a wider scale you could be using for the People circles.

Still though, cool project. I found the 5-person startup I'm interning at this summer :) Makes for a great crash-course on who matters if someone were trying to study up on the startup scene.


We tried both options and liked using one scale for all categories. The idea is that you are comparing influence, or something like it, so that can be visually compared across categories. You do have the downside of having very small dots for the people after only a few pages, but since there are > 100,000 people, most are going to have the minimum-sized dot anyway.
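To make the trade-off between the two options concrete, here is a minimal Python sketch. It is not the site's actual code: the radius limits, the function names, and the scores are all made up for illustration, and it assumes circle area (rather than radius) scales with score.

```python
import math

MIN_RADIUS = 2.0   # assumed minimum dot radius, in pixels
MAX_RADIUS = 40.0  # assumed radius for the top-scoring entity on a scale


def radius(score, max_score):
    """Scale circle area with score, clamped to the minimum dot size."""
    r = MAX_RADIUS * math.sqrt(score / max_score)
    return max(MIN_RADIUS, r)


def global_radii(entities):
    """One scale for everything: sizes are comparable across categories."""
    top = max(score for _, _, score in entities)
    return {name: radius(score, top) for name, _, score in entities}


def per_category_radii(entities):
    """One scale per category: each category's #1 gets MAX_RADIUS."""
    tops = {}
    for _, category, score in entities:
        tops[category] = max(tops.get(category, 0.0), score)
    return {name: radius(score, tops[cat]) for name, cat, score in entities}


# (name, category, score) with invented scores, purely for illustration.
entities = [
    ("Google", "company", 100.0),
    ("Ron Conway", "person", 4.0),
    ("Some Founder", "person", 0.01),
]

global_sizes = global_radii(entities)
category_sizes = per_category_radii(entities)
```

With one global scale, the top person gets a much smaller circle than the top company, and low-scoring people hit the minimum dot size. With per-category scales, the top person's circle matches the top company's, at the cost of no longer being comparable across categories.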

I like the interface design.


Dave McClure at #3, Paul Graham at #943; whereas 500 Startups is at #41, and Y Combinator is at #15?

Why is 500 Startups classified as a financial firm, whereas Y Combinator is classified as a company?

Also, I'm surprised that Andreessen Horowitz is ranked #67 of financial firms (given that Marc Andreessen is ranked #14), and Elon Musk is #119 for people. I would have thought they'd both be ranked much higher.

Thanks! We like to keep it clean.

In the data source, TechCrunch's Crunchbase, the influence of Paul Graham, Jessica Livingston, et al. is captured in YC the company, and Marc Andreessen is split between his personal identity and AH. So you get a lot of odd things like that.

As for why YC got classified as a company, who knows? It's an accident of history. They should change it.

I don't think it's an accident. It's the way they talk about what they're doing. Partly because they don't want to be seen as VCs or an Incubator.

"... In the data source, TechCrunch's Crunchbase ..."

TechCrunch as a single data source. Are there any other sources used?

At the moment, no. This was one dataset we used to get started, but we intend to incorporate data from AngelList and other sources as well.

"... This was one dataset we used to get started but, we intend to incorporate data from Angel List and other sources as well. ..."

The sooner the better; TechCrunch data is iffy (http://www.flickr.com/photos/bootload/2913315731/), though useful as a starting point. Did you run a select on the companies to check for multiple listings?
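A check for multiple listings could look something like the Python sketch below. It only assumes you have a list of company names; the normalization rules, function names, and example names are hypothetical, and real data would likely need fuzzier matching.

```python
import re
from collections import defaultdict

# Assumed list of suffixes to ignore when comparing names.
SUFFIXES = {"inc", "llc", "ltd", "corp"}


def normalize(name):
    """Lowercase, drop punctuation, and strip common corporate suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    words = [w for w in cleaned.split() if w not in SUFFIXES]
    return " ".join(words)


def find_multiple_listings(names):
    """Group names by normalized form and keep groups with more than one entry."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


# Invented names, purely for illustration.
names = ["Acme, Inc.", "ACME", "Acme Inc", "Widgets LLC", "Gadgetry"]
dupes = find_multiple_listings(names)
```

Here the three "Acme" variants collapse to one key and are reported together, while the other two names are left alone.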




And we already have an API for you: http://angel.co/api


Co-builder of this here. Hope you like it! Don't forget to click on a dot - then you can walk the graph of relationships among startups, people, and financial firms.

This reminds me of an analysis I once did: https://github.com/astanway/Crunchbase-Network-Analysis

The data from Crunchbase is very, very dirty, but I managed to clean it up a lot. Feel free to fork if you want pre-built NetworkX graphs.

This is really interesting. I'm surprised you found that startups with higher centrality raised less money. Not sure how to explain that. Your investor graph has edges pointing both from investors to startups and back, right? Anyway, thanks.

Mm, no, just from investors to startups. Startups don't invest in investors :)
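For readers following along, a graph with edges only from investors to startups can be sketched in plain Python as below. This is not the code from the linked repository (which provides NetworkX graphs); the edge list and names are invented, and in-degree is used only as the simplest stand-in for a centrality measure.

```python
from collections import defaultdict

# Hypothetical funding edges, each pointing investor -> startup.
edges = [
    ("Investor A", "Startup X"),
    ("Investor A", "Startup Y"),
    ("Investor B", "Startup X"),
]


def degrees(edge_list):
    """Count incoming edges per startup and outgoing edges per investor."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for investor, startup in edge_list:
        out_deg[investor] += 1
        in_deg[startup] += 1
    return dict(in_deg), dict(out_deg)


in_deg, out_deg = degrees(edges)
```

Because no edge points back from a startup to an investor, investors never appear in `in_deg` and startups never appear in `out_deg`.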

I actually did a lot of Crunchbase clean-up for http://seedtable.com - if people are interested in this perhaps we should collaborate on some kind of data cleansing project.

Great to see Pejman on the list. His Plug and Play facility has some great companies renting space, and interesting events come through.

I miss attending events when a prior startup was based there.

Want to hustle? This is a great list to go through and familiarize yourself with these people, what they make, what they need, how you can help them, what you might ask them for if they ever offered to help you, etc. Great way to visualize Crunchbase, which is a great resource.

Except for the part where there are 10 companies ranked higher than Apple...

Given what this is ranking, I am surprised they are so high; this has nothing to do with market cap, cash on hand, or the devotion of fans.

What does it have to do with?

The 217,000 most important companies, financial firms, and people...

Why did you stop there though?

"...in the startup world..."

In fact, I am surprised to see a lot of non-startup companies like Nokia listed quite highly.

So Google, Yahoo, and Microsoft are in the startup world, but Apple isn't?

Me too.

Apple is in the startup world. How different would the startup economy be without Apple's platforms to develop, deploy, and discover on?

Can you augment the data set with AngelList? There's more info there than on Crunchbase for newer startups.

MySpace ahead of Apple? That about sums up my opinion of Crunchbase.

Minor pagination issue: when you click "next" the "prev" link takes its place. I went back and forth between the first page and second page a couple times before noticing.

Very nice work!

Glad you like it! You're right - that pagination would be better if the 'next' link behaved better. We'll fix it sometime soonish.

There will be a few surprises in any system...

But Yahoo ahead of Facebook, Amazon and Apple?

MySpace ahead of Apple?

Paul Graham is #943 among people?

Cool project, even with the eyebrow-raising results.

Paul Graham's credit is mostly subsumed into Y Combinator's rank (#15 among companies). Yahoo has bought a lot of startups. But yeah, there are some things that don't make sense.

I think you can only say that this is a visualization of how things are in Crunchbase, not "the startup world" in general. It's ranking things based on the number of connections each thing has. For angel investors this may be a good indicator, but not necessarily for companies or VC firms.

Yay! I'm #21 on the angel list. I should get more of my stuff in Crunchbase (more than half are missing).

Did some random searches. Noticed quite a bit of data missing, but I was searching for older stuff.

Good luck, useful.

Nice work. Interesting way to visualize.

One request: in the UI, where you have checkboxes, can you add an 'exclusive' select? For example, I want to see ONLY 'acquisitions' made; right now, we have to deselect everything else manually.

Very cool, but you really need to work on the algorithm. It's easy to pick out the reason why MySpace might be found higher than Facebook or, as a few people have pointed out, PG vs. Y Combinator.

But my last start-up, which I shut down over a year ago (HearWhere, 19,883), is ranked as more influential than companies with significantly more traffic that are still operating (for example, AllRecipes, 23,087).

Nice to think I'm that influential, but I can assure you, I'm not (yet ;))

The "impartial algorithm" line was good for a laugh.

It's impartial. Did you click thru to Wikipedia?

I'm not sure if you're joking or not, but there's no such thing as an "impartial" ranking of things.

Any ranking will put importance on certain factors which are chosen from all the possible factors by the designer of the ranking. You have chosen to rank based (I presume) on incoming links on CrunchBase. However, this is intrinsically no more impartial than a ranking based on yearly revenue, or number of employees, or size of their offices, number of mentions in the New York Times, or any other of an unbounded number of metrics.

I would argue that PageRank, while it may have done a good job of ranking websites for the purposes of search, is a pretty poor choice here. It's highly susceptible to reporting bias, where some companies will be better-represented on CB. Empirically, you also get weird artifacts, like the ranking of MySpace ahead of Apple.

Choosing a good metric requires a theoretical explanation of why that metric is important and why it helps answer your question (which you don't state, but I suspect is something like, "which are the most important startups", whatever that might mean). Just choosing something doesn't necessarily tell you anything useful.

[If you're interested in high-stakes debates about rankings, look into the hoopla that comes about every year when US News & World Report releases their college rankings. These numbers can have huge effects on colleges' prestige and, lower down the food chain, their bottom lines.]
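The reporting-bias point about PageRank can be seen in a small Python sketch. This is a textbook power-iteration version, not the site's implementation (which the thread does not describe in detail); the damping factor, iteration count, and the tiny link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic PageRank by power iteration.

    links maps each node to the list of nodes it points to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical profile links: "Hub" has a well-filled-in profile that many
# entries point to; "Quiet" has no recorded relationships at all.
links = {
    "A": ["Hub"],
    "B": ["Hub"],
    "C": ["Hub"],
    "Hub": ["A"],
    "Quiet": [],
}
ranks = pagerank(links)
```

In this toy graph "Hub" comes out on top and "Quiet" near the bottom purely because of how many relationships were recorded, which is the artifact the comment above describes: the score reflects how well an entity is represented in the data, not necessarily how important it is.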

It certainly is susceptible to reporting bias, like most data out there.

Perhaps the better word, instead of 'impartial', would be 'disinterested'.

Nice reminder that Yahoo is far from down and out, despite what you hear amongst the technorati these days.

Crunchbase is to "startup world" metrics what Alexa is to web metrics.

Apple #11??


Yep, there are a number of surprises. For one thing, this is cumulative, so deals made and competitor relationships from 2005 are as good as from 2012. For another, having a well-fleshed-out Crunchbase profile makes all the difference.

Try clicking on MySpace's blue dot and see what is connected to it.

You might want to experiment with weighting data by recency.

Unfortunately, most of it isn't dated. But yes, it'd probably help.

You could try scraping archive.org for some estimation of dates.
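If dates could be recovered for at least some relationships, weighting by recency might look like the Python sketch below. Everything here is an assumption made for illustration: the exponential half-life, the fallback weight for undated relationships, and the function names are not from the site.

```python
def recency_weight(event_year, current_year, half_life_years=3.0):
    """Exponential decay: a relationship loses half its weight per half-life."""
    age = max(0, current_year - event_year)
    return 0.5 ** (age / half_life_years)


def weighted_connection_score(event_years, current_year, undated_weight=0.5):
    """Sum decayed weights; undated relationships (None) get a fixed fallback."""
    total = 0.0
    for year in event_years:
        if year is None:
            total += undated_weight
        else:
            total += recency_weight(year, current_year)
    return total


# A 2012 deal counts fully, a 2005 deal counts for much less,
# and an undated relationship gets the assumed fallback weight.
score = weighted_connection_score([2012, 2005, None], current_year=2012)
```

The fallback weight matters a lot when most of the data is undated, so it would need tuning against whatever dates can actually be estimated.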
