

Moneyball for startups? PG, Fred Wilson, Chris Dixon discuss - asanwal
http://www.splatf.com/2011/09/moneyball-for-tech-startups

======
iamelgringo
AngelList has the largest data set of startups, investor profiles and
activity on the planet. That makes Nivi and Naval some of the most interesting
players in Silicon Valley. I suspect we'll be seeing a very interesting
investment thesis coming from them in the next couple of years. I've talked
with the Kauffman Foundation, and one of the reasons Kauffman funded them was
to have access to that data.

Chris Farmer at General Catalyst is pretty open in talking about his
data-driven investment approach. He is doing a big data play, mining source code
repositories, LinkedIn profiles, CrunchBase, etc. He wrote up a little slice
of his research here: [http://techcrunch.com/2011/05/25/top-10-vc-firms-
investorran...](http://techcrunch.com/2011/05/25/top-10-vc-firms-
investorrank/)

I know that SVAngel keeps pretty good stats on their investments, and relies
on a good bit of data to make their investments.

A number of top-tier VC firms I've talked to this past summer are actively
building systems to do more data-driven investing.

Brendan Baker's Anatomy of Seed study is one of the best data-driven studies
of early-stage investment I know of. He is working with a grad student to
replicate the study to see if his conclusions are reproducible. ref (
[http://www.quora.com/Brendan-Baker/Anatomy-of-Seed-An-
Inside...](http://www.quora.com/Brendan-Baker/Anatomy-of-Seed-An-Inside-Look-
at-a-1M-Seed-Round) )

BlackBox.vc did the Startup Genome Project, and is actively pursuing
early-stage investments based on that data. <http://blackbox.vc/>

------
joshu
A good entrepreneur is up "at bat" half a dozen times. I don't think there's
enough data.

OTOH, as an angel, I've invested in ~50 startups. Lots of patterns (YC vs
non-YC, equity seed vs convertible seed, etc.) emerge.

~~~
Caligula
Can you list some of the patterns you encountered?

------
DanielRibeiro
Well, you could say Yuri Milner's YC investment approach is based more on data
than anything else. The data is about YC itself, not the individual companies,
but meta-heuristics are still heuristics.

------
gavanwoolery
It does not really matter if you are using hard data like number of previous
companies, or soft data like impressions based on personality. In fact, these
days I would say success depends more on the VC(s) involved than the founders.
Mark Cuban was at least half-right when he compared tech investments to a
Ponzi scheme ([http://blogs.wsj.com/venturecapital/2011/08/15/mark-cuban-
th...](http://blogs.wsj.com/venturecapital/2011/08/15/mark-cuban-this-tech-vc-
bubble-is-like-a-ponzi-scheme/)).

This is, obviously, a generalization -- there are a few companies that stand
on their own feet, and have even done so without outside investment. But in
the majority of cases it seems like startups are acquired because some kind of
"inside" deal is going on. When you think about it, companies rarely
benefit from acquisitions and mergers - instead they slowly die. To list a
few: AOL, Bebo, Myspace, Flip, MapQuest, AltaVista, Netscape, Broadcast.com,
Excite, Lycos, Ask Jeeves, Sun Microsystems...

What startups are still going strong post-acquisition? I can think of YouTube
off the top of my head... what else? If the parent companies are (generally)
not benefiting from acquisitions, why are they happening?

~~~
emmett
Reddit? Whatever became Google Maps? Hotmail? Android?

There are plenty of startups which have thrived after acquisition. There are
also spectacular failures. M&A is risky, but can have a very high payoff.

------
corin_
The interesting thing here is how alike the two scenarios are - despite how
prevalent statistics are in baseball, they're still not the be-all and end-all.

There's quite a story from not too long ago involving the LA Dodgers - for the
life of me I can't remember the names or the time period, but it will
hopefully come to me when I'm more awake (or someone here might refresh my
memory). Every day, an expert in sabermetrics (aka baseball stats) would
prepare a huge load of paperwork for the Dodgers manager ahead of that day's
game, and every day the manager would say thanks, wait until he had left the
room, then chuck it all in the trash.

Tech investors aren't the only people who value gut feelings over statistics.

------
6ren
Re: predicting it based on qualities of the founders: pg has said
determination is very important - based on data from startups, it turned out
much more important than anticipated. Therefore, one would expect an objective
measure of determination to have (some) predictive power. Maybe not as much as
the present YC process; but it would be hard data, and a new perspective which
might be revealing. Certainly, scientific at least!

It sounds like a sloppy thing to measure; but Martin Seligman (learned
helplessness/learned optimism) has quizzes to measure optimism/pessimism,
which have experimentally demonstrated predictive power (and actually have
been used to supplement job interviewing, with measurable success). So
measuring determination might be possible. There are even checks to
prevent/detect cheating. Of course, extra smart candidates with million-dollar
(VC) motivation may quickly subvert any test. Still, it would be
interesting, intellectually, to see if it did work.

Might also be interesting to ask one of Seligman's students to assess YC
candidates for optimism (his definition means that you bounce back from
failure with energy - e.g. a pivot). It seems plausible to me that that would
also be predictive of startup success (and also predictive of determination -
the ability to keep going).

------
sgentle
It's funny, I would assume pg - as in "A Plan for Spam" pg - would be
supremely interested in running some kind of Bayesian predictive model over YC
candidates. Maybe the subtext of that quote was that he tried it and it didn't
give any useful data.

An interesting conjecture is that numerically predicting startup success
(defined as a high standard deviation of return on investment) might actually
be impossible because any venture risky enough to get those kinds of results
would fall outside the acceptable error bars of the predictive model. The
equivalent in spam filtering would be if you wanted a system to show you only
messages that were 99% likely to be spam, but still not spam.

I'm not sure if that actually makes any sense, someone feel free to jump in
and tear it to shreds.

------
Jun8
This data-driven approach doesn't even work well for other sports, e.g. soccer.
I guess the reason it works so well for baseball is that it's made up of a
large number of well-defined, tiny one-on-one faceoffs.

