Moneyball for startups? PG, Fred Wilson, Chris Dixon discuss (splatf.com)
41 points by asanwal 1528 days ago | 10 comments

AngelList has the largest data set of startups, investor profiles and activity on the planet. That makes Nivi and Naval some of the most interesting players in Silicon Valley. I suspect we'll be seeing a very interesting investment thesis coming from them in the next couple of years. I've talked with the Kauffman Foundation, and one of the reasons Kauffman funded them was to have access to that data.

Chris Farmer at General Catalyst is pretty open in talking about his data driven investment approach. He is doing a big data play, mining source code repositories, LinkedIn profiles, CrunchBase, etc... He wrote up a little slice of his research here: http://techcrunch.com/2011/05/25/top-10-vc-firms-investorran...

I know that SVAngel keeps pretty good stats on their investments, and relies on a good bit of data to make their investments.

A number of top tier VC firms I've talked to this past summer are actively building systems to do more data driven investment.

Brendan Baker's Anatomy of Seed study is one of the best data driven studies of early stage investment I know of. He is working with a grad student to replicate the study to see if his conclusions are reproducible. ref ( http://www.quora.com/Brendan-Baker/Anatomy-of-Seed-An-Inside... )

BlackBox.vc did the startup genome project, and are actively pursuing early stage investments based on that data. http://blackbox.vc/


A good entrepreneur is up "at bat" half a dozen times. I don't think there's enough data.

OTOH, as an angel, I've invested in ~50 startups. Lots of patterns (YC vs non-YC, equity seed vs convertible seed, etc.) emerge.


Can you list some of the patterns you encountered?


Well, you could say Yuri Milner's YC investment approach is based more on data than anything else. The data is on YC itself, not on the individual companies, but meta-heuristics are still heuristics.


It does not really matter if you are using hard data like number of previous companies, or soft data like impressions based on personality. In fact, these days I would say success depends more on the VC(s) involved than the founders. Mark Cuban was at least half-right when he compared tech investments to a Ponzi scheme (http://blogs.wsj.com/venturecapital/2011/08/15/mark-cuban-th...).

This is, obviously, a generalization -- there are a few companies that stand on their own feet, and have even done so without outside investment. But in the majority of cases it seems like startups are acquired because some kind of "inside" deal is going on. When you think about it, companies rarely benefit from acquisitions and mergers - instead they slowly die. To list a few: AOL, Bebo, Myspace, Flip, MapQuest, AltaVista, Netscape, Broadcast.com, Excite, Lycos, Ask Jeeves, Sun Microsystems...

What startups are still going strong post-acquisition? I can think of YouTube off the top of my head... what else? If the parent companies are (generally) not benefiting from acquisitions, why are they happening?


Reddit? Whatever became Google Maps? Hotmail? Android?

There are plenty of startups which have thrived after acquisition. There are also spectacular failures. M&A is risky, but can have a very high payoff.


The interesting thing here is how alike the two scenarios are - despite how prevalent statistics are in baseball, they're still not the be-all and end-all.

There's quite a story from not too long ago involving the LA Dodgers - for the life of me I can't remember the names or the time period, but it will hopefully come to me when I'm more awake (or someone here might refresh my memory). Every day, an expert in sabermetrics (aka baseball stats) would prepare a huge load of paperwork for the Dodgers manager ahead of that day's game, and every day the manager would say thanks, wait until he had left the room, then chuck it all in the trash.

Tech investors aren't the only people who value gut feelings over statistics.


Re: predicting it based on qualities of the founders: pg has said determination is very important - based on data from startups, it turned out much more important than anticipated. Therefore, one would expect an objective measure of determination to have (some) predictive power. Maybe not as much as the present YC process; but it would be hard data, and a new perspective which might be revealing. Certainly, scientific at least!

It sounds like a sloppy thing to measure; but Martin Seligman (learned helplessness/learned optimism) has quizzes to measure optimism/pessimism, which have experimentally demonstrated predictive power (and actually have been used to supplement job interviewing, with measurable success). So measuring determination might be possible. There are even checks to prevent/detect cheating. Of course, extra smart candidates with million dollar (VC) motivation may quickly subvert any test. Still, it would be interesting, intellectually, to see if it did work.

Might also be interesting to ask one of Seligman's students to assess YC candidates for optimism (his definition means that you bounce back from failure with energy - e.g. a pivot). It seems plausible to me that that would also be predictive of startup success (and also predictive of determination - the ability to keep going).


It's funny, I would assume pg - as in "A Plan for Spam" pg - would be supremely interested in running some kind of Bayesian predictive model over YC candidates. Maybe the subtext of that quote was that he tried it and it didn't give any useful data.

An interesting conjecture is that numerically predicting startup success (defined as a high standard deviation of return on investment) might actually be impossible because any venture risky enough to get those kinds of results would fall outside the acceptable error bars of the predictive model. The equivalent in spam filtering would be if you wanted a system to show you only messages that were 99% likely to be spam, but still not spam.

I'm not sure if that actually makes any sense, someone feel free to jump in and tear it to shreds.


This data-driven approach doesn't even work well for other sports, e.g. soccer. I guess the reason it works so well for baseball is that it's made up of a large number of well-defined, tiny one-on-one face-offs.

