

The Man Who Knows Whether Any Startup Will Live or Die - InternetGiant
http://www.wired.com/2015/01/growth-science/

======
colanderman
> According to the U.S. Bureau of Labor Statistics, about half of all
> businesses fail within five years.

> He [...] thinks that even a model that’s only right about 50 percent of the
> time could help investors and entrepreneurs avoid particularly bad ideas

...does he have a bridge to sell me too? What am I missing?

(One can simply predict "always succeeds" and will be right half the time.)

~~~
brudgers
Perhaps what has been missed is the difference between a single coin flip and
a combination of coin flips?

Consider one startup.

    
    
        f(x) = #fail
    

succeeds better than 50%, and

    
    
        f(x) = #succeed
    

succeeds less than 50%. This is due to the nature of startups.

Sure, it's easy to get about 50% accuracy for one startup by flipping a coin:

    
    
        f(x) = if rand(1) > 0.5 then #fail else #succeed
    

But consider the case of two companies A and B. There are now four outcomes:

    
    
       A = #fail, B = #fail
       A = #succeed, B = #fail
       A = #fail, B = #succeed
       A = #succeed, B = #succeed
    

If we flip a coin, we have to flip it twice. Our probability that two coin
flips match the correct tuple is 25%, and bumping that up to 50% is a massive
improvement.

Investors diversify their portfolios. In a portfolio of 100 startups there's
probably a winner. Improving the selection of companies means reducing the
number of a fund's portfolio companies necessary for a reasonable probability
of a winner. More smaller yet successful funds makes capital more efficient.

Better pruning of boolean search spaces has real value. Hence:

    
    
        When predicting that a company will fail,
        he adds, they’re right 88 percent of the time.

~~~
x1798DE
That's... not accurate. There are two success conditions (Fail, Fail) and
(Succeed, Succeed), and two failure conditions (Succeed, Fail), (Fail,
Succeed). If you flip two coins, the chance that they come up _both heads_ is
25%, but the chance that they come up _the same_ is 50%.
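
A minimal sketch of the coin arithmetic above, enumerating the four equally likely results of two fair flips. The function and variable names are illustrative assumptions, not anything from the article or the thread.

```python
from itertools import product

outcomes = ["succeed", "fail"]

# All four equally likely results of two fair coin flips.
pairs = list(product(outcomes, repeat=2))

def match_probability(target):
    """Chance that the two flips reproduce one specific ordered pair."""
    hits = sum(1 for pair in pairs if pair == target)
    return hits / len(pairs)

def same_face_probability():
    """Chance that the two flips agree with each other, whatever the face."""
    hits = sum(1 for first, second in pairs if first == second)
    return hits / len(pairs)

p_match = match_probability(("fail", "fail"))
p_same = same_face_probability()
print(p_match, p_same)
```

Matching one specific pair happens in one of four cases, while the two flips agree in two of four cases.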

~~~
brudgers
The odds of (#fail, #fail) for two startups (A, B) are much greater than 50%.
When investing in two companies (#fail, #fail) is not #success.

At a 1% #success probability the failure rate is ~98%. The 1% #success rate is
based on a knowledgeable person choosing A and independently choosing B.
If that person can obtain information that lets them improve their selections
to a 2% #success probability, they can reduce the total number of investments
necessary to achieve any particular expected return on investment.

Reducing the number of investments may improve the investor's ability to
influence the outcome of each company in their portfolio, because the
investor can allocate more time, energy, and resources to each company in
their portfolio [presuming the investor brings business expertise to the
table].
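
A rough sketch of the portfolio arithmetic in this comment, assuming every investment succeeds independently with the same probability. The 1% and 2% rates come from the comment; the 50% target and the function names are assumptions made only for illustration.

```python
import math

def p_all_fail(p_success, n):
    """Probability that all n independent investments fail."""
    return (1.0 - p_success) ** n

def p_at_least_one_winner(p_success, n):
    """Probability that at least one of n independent investments succeeds."""
    return 1.0 - p_all_fail(p_success, n)

def investments_needed(p_success, target):
    """Smallest n for which P(at least one winner) reaches the target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_success))

# Two companies at a 1% success rate: both fail about 98% of the time.
print(p_all_fail(0.01, 2))

# Portfolio size needed for an even chance of at least one winner.
print(investments_needed(0.01, 0.5), investments_needed(0.02, 0.5))
```

Under these assumptions, doubling the per-company success rate from 1% to 2% roughly halves the portfolio size needed for an even chance of one winner (69 companies versus 35).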

~~~
x1798DE
The article is claiming that the guy can _predict_ which things are going to
succeed and which are going to fail, not _make_ them succeed or fail. His
evidence for this is that 50% of the time, he's right (e.g. [success,
success], [fail, fail] are both success conditions _for him_ ). For him to be
adding any information to the system, he has to get it right more often than
either choosing randomly or using a fixed zero-information strategy (always
bet fail/always bet succeed). You explicitly said that the joint probability
matters, but then miscalculated the joint probability of him guessing
correctly by random chance.

~~~
brudgers
I apologize for not making myself clear enough to communicate as effectively
as might be hoped.

------
Fede_V
I would be incredibly interested to know how the model was crossvalidated. It
is utterly trivial to cherry pick features post-facto to correctly predict
winners, however people who are unfamiliar with machine learning might not
know this.

------
tarkofski
"He admits the models will never be perfect, but thinks that even a model
that’s only right about 50 percent of the time could help investors and
entrepreneurs avoid particularly bad ideas that, to the untrained eye, look
like excellent opportunities."

He basically admits his model is no better than a monkey

~~~
falcolas
> He basically admits his model is no better than a monkey

Well, technically, a monkey with a coin to flip. "Heads this startup will
succeed, tails it will fail"

~~~
gerhardi
Now coin flip wouldn't be anything near 50% accurate, unless there is a world
where more than a small fraction of startups survive!

~~~
Hasu
What? A coin flip would still be 50% accurate. If 10% of startups succeed, and
we say that heads is "succeed" and tails is "fail":

5% of startups will be predicted to succeed and will succeed (correct
prediction)

5% of startups will be predicted to fail and will succeed (false prediction)

45% of startups will be predicted to succeed and will fail (false prediction)

45% of startups will be predicted to fail and will fail (true prediction)

50% true predictions, 50% false predictions. A coin flip is always 50%
accurate.
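
The four cases above can be written out as a joint table of (prediction, outcome). This is a small sketch assuming the coin is fair and independent of the outcome; the 10% success rate is the one used in the comment, and the names are illustrative.

```python
def coin_flip_table(p_success, p_heads=0.5):
    """Joint probabilities of (prediction, outcome) when heads means 'succeed'."""
    return {
        ("succeed", "succeed"): p_heads * p_success,
        ("fail", "succeed"): (1.0 - p_heads) * p_success,
        ("succeed", "fail"): p_heads * (1.0 - p_success),
        ("fail", "fail"): (1.0 - p_heads) * (1.0 - p_success),
    }

def accuracy(table):
    """Total probability of the cells where prediction equals outcome."""
    return sum(p for (predicted, actual), p in table.items() if predicted == actual)

table = coin_flip_table(p_success=0.10)
for cell, probability in table.items():
    print(cell, round(probability, 4))
print("accuracy:", round(accuracy(table), 4))
```

Changing `p_success` moves probability between the cells, but the two correct cells always sum to one half when the coin is fair.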

------
ballpoint
I'm not sure having better predictors of what businesses are good is
necessarily a good idea. Part of the attraction of silicon valley is that it
takes some of the risk out of trying new things, even if they might be bad
ideas. This culture of trying things leads us to find the occasional really
good idea. If we sit around all day plugging our ideas into models to see if,
statistically speaking, they will succeed, we won't find the really novel ideas
that look bad but are actually good.

------
JonoBB
Retroactively fitting a model to prove history is not that difficult.
Accurately predicting the future may be a bit trickier. Just ask technical
stock traders.

~~~
darkmighty
Well it depends on proper cross-validation, sample size and of course
predictability (i.e. 4 * P(Suc.|data) * P(Fail.|data)<<1). But online
prediction is of course what will tell the long term reliability of any model.
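
To make the quantity in that parenthesis concrete, here is a small sketch that evaluates 4 * p * (1 - p) for a few conditional success probabilities. The function name is made up for illustration, and the example probabilities are arbitrary, not figures from the article.

```python
def uncertainty_index(p_success_given_data):
    """4 * P(success | data) * P(failure | data).

    Equals 1 when the data leaves the outcome a toss-up (p = 0.5) and
    approaches 0 as the data makes the outcome nearly certain either way.
    """
    p = p_success_given_data
    return 4.0 * p * (1.0 - p)

for p in (0.5, 0.75, 0.95, 0.99):
    print(p, round(uncertainty_index(p), 4))
```

A value far below 1 is the case the comment calls predictable: the data pushes the conditional probability close to 0 or 1.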

------
sickpig
from the article:

 _He admits the models will never be perfect, but thinks that even a model
that’s only right about 50 percent of the time could help investors and
entrepreneurs avoid particularly bad ideas_

dunno why but I can't stop thinking that tossing a coin could achieve the same
goal :)

~~~
joshyeager
That only works if startups fail 50% of the time. If they fail more
frequently, a coin flip would be wrong.

~~~
sickpig
I stand corrected, you're right. I'm supposed to know better.

~~~
jgeralnik
Nope, you're still right.

Assume startups fail 90% of the time. 50% of the time your coin comes up
tails, and you claim the startup will fail. 50% of the time your coin comes up
heads and you claim the startup will succeed.

You guess correctly 0.5 (the chance your coin comes up tails) * 0.9 (the
chance the startup fails) + 0.5 (heads) * 0.1 (success) = 0.5 of the time.
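
The same result can be checked empirically. This is a quick simulation sketch, assuming startup outcomes and coin flips are independent; the seed and trial count are arbitrary choices, and the 90% failure rate is the one assumed above.

```python
import random

def simulate_coin_flip_accuracy(p_fail, trials, seed=0):
    """Fraction of trials in which a fair coin 'predicts' the startup's fate."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    correct = 0
    for _ in range(trials):
        startup_fails = rng.random() < p_fail
        predict_fail = rng.random() < 0.5  # tails means "will fail"
        if predict_fail == startup_fails:
            correct += 1
    return correct / trials

estimate = simulate_coin_flip_accuracy(p_fail=0.9, trials=100_000)
print(estimate)
```

The estimate should land close to 0.5 regardless of the value passed as `p_fail`.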

------
darkFunction
I don't understand what the inputs to the model are.

------
normloman
I wish this guy the best in improving his algorithm. If it really worked, it
could do a lot of good. But the economy is so complex, I doubt he'll ever make
the model more accurate than a coin toss. I predict the model will just make
people overconfident in their investment decisions.

~~~
TeMPOraL
Investing is anti-inductive; if his algorithm actually starts to be used in
investment decisions, people will keep gaming it until it will no longer be a
useful signal.

~~~
kylebrown
Reflexive is the term used by George Soros:
http://en.wikipedia.org/wiki/Reflexivity_(social_theory)#In_economics

------
zupa-hu
Too bad that the very best startups are always the outliers (aka without
historical data).

~~~
bluedevil2k
Do you have any data to back that up? It seems a blanket statement to call "the
best" start-ups outliers. What defines "the best" startup?

~~~
zupa-hu
As the topic was about investments, in the context the best means whichever
makes more money I guess. You are right, I have no data to back that up.

------
bbody
Misleading title but an interesting article nonetheless. The comments here
really show how much statistics are misunderstood by the public, even by a
more technical-minded crowd.

------
gtirloni
> even a model that’s only right about 50 percent of the time could help
> investors

That's equivalent to a coin flip. I didn't get the point he was trying to
make. 66% looks good though.

------
krampian
title: "The Man Who Knows Whether Any Startup Will Live or Die"

text: "He admits the models will never be perfect, but thinks that even a
model that’s only right about 50 percent of the time could help investors and
entrepreneurs..."

Which is not surprising... if the title was really true, this man would likely
be richer than Warren Buffett at this point.

~~~
rimantas
Wouldn't a coin toss be right 50% of the time?

------
dia80
The data mining bias / model risk here is huge. Let's see how he gets on truly
out of sample.

------
skmurphy
I wonder if he has applied the algorithm to his firm to improve his chances
of survival

~~~
agarden
Yes. From the article: "According to Thurston’s own model, Growth Science’s
own chance of survival following its current business model is about 69
percent. Adding the automated service would actually improve its chances, he
says."

------
thurstont
Hi Y'all, Thomas here (guy in article). Just want to start by saying (1) this
is a very intelligent thread, and (2) I didn't write the article, was just
interviewed for it. You never know what's going to be written, no matter what
you say.

Here's how the models really play out. We compare our accuracy against the 10
year survivorship benchmark of 25% (not the 5 year). When you look at small
businesses, VC-backed, and corporate ventures (ex. new products coming out of
companies), the 10 year survival rate is around 25%, plus or minus 10%
depending on the industry.

Our models have made thousands of predictions for around nine years now - all
the predictions were live, real-time and forward looking (no back-testing
included here). From those predictions, around 3,400 have matured to date.
That is, only around 3,400 of the results have happened - the businesses have
either become big successes (ex. Uber) or failed. In our research, we have to
actually wait for businesses to live or die to test our accuracy.

From the roughly 3,400 predictions that have matured, we were right 66% of the
time when predicting survivors, and 88% of the time when predicting failures.
When we scratched beneath the surface, we were really around 66% accurate in
both cases (just most businesses fail, which is why gloomy predictions were
22% more accurate - just a function of dumb luck since most things die).

So we consider our algorithms to be 66% accurate, which is much more accurate
than anything we're aware of in human history (remember, the baseline we're
compared against is 25%).

If you do a statistical analysis (to make sure our predictions weren't just
luck), the models maintained a statistically significant correlation with 99%
confidence. There was less than 1 chance in over 500,000 that the results were
a function of luck (definitely not a coin toss).

We've used these models in venture, and our performance puts us in the top 5%
of all VC funds for our vintage years, so we've monetized these models
effectively with real dollars and made considerable gains.

I hope this gives folks a better sense for how it works. There's been a very
emotional backlash to the Wired article today (not accusing this thread, just
thinking of some others) and it's weird because it's just basic scientific
research. Pretty drab stuff on most days, but apparently offensive to some
people. Not sure why. We're using statistics to improve venture and startup
odds, just like stats have been used to improve just about every other field
humans have ever taken seriously. Seems obvious that stats are similarly
useful in the startup world, and my dream has always been to help more
businesses use stats to succeed.

Anyway, definitely a lot more controversy and emotion than I would have
expected. Otherwise pretty basic science, not claiming perfection, just
striving for improvement, using data as best we can, etc. I hope at least some
folks see this for what it is - nothing out of the ordinary in any other
domain of science. Why should entrepreneurship be any different?

~~~
kylebrown
So if the baseline is 25% survive, then 75% fail, for a sum of 100%. How is it
correct to compare that baseline to two numbers (66% and 88%) which don't sum
to 100? Is failing different from not surviving?

~~~
thurstont
Great question - as this thread showed it's a 2x2

Possible outcomes:

1. The models predicted survival, and the business survived

2. Predicted survival, but the business failed

3. Predicted failure, but the business survived

4. Predicted failure, and the business failed

Here's how our results stacked up:

#1. 69% of observations (it wobbles around 66%, but was 69% in the last
update)

#2. 31% of observations (notice #1 + #2 = 100%. Out of 100% of the times we
predicted survival, we were right 69% and wrong 31%)

#3. 12%

#4. 88% (again, #3 + #4 = 100%. Out of all the times we predicted failure, we
were right 88% of the time and wrong 12%).

Again, there's more luck when you predict failure, so it's really around 66%
for both positives and negatives when you dig deeper into the stats.
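
One way to read these figures against the 25% baseline is sketched below. It assumes that a predictor with no information would be right about survivors at the base survival rate and right about failures at the base failure rate. The 25%, 66%, and 88% figures are the ones quoted in this thread; the comparison itself is an illustration, not the analysis Growth Science describes.

```python
base_survival_rate = 0.25        # 10-year survival benchmark quoted above
reported_right_on_survive = 0.66  # share of "will survive" calls that were right
reported_right_on_fail = 0.88     # share of "will fail" calls that were right

# With no information, a "will survive" call is right at the base survival
# rate, and a "will fail" call is right at the base failure rate.
baseline_right_on_survive = base_survival_rate
baseline_right_on_fail = 1.0 - base_survival_rate

lift_survive = reported_right_on_survive / baseline_right_on_survive
lift_fail = reported_right_on_fail / baseline_right_on_fail

print("survive calls:", baseline_right_on_survive, "->", reported_right_on_survive,
      "lift", round(lift_survive, 2))
print("fail calls:   ", baseline_right_on_fail, "->", reported_right_on_fail,
      "lift", round(lift_fail, 2))
```

Under this reading, the 66% figure is compared with 25% and the 88% figure with 75%, which is why the two percentages do not need to sum to 100.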

------
claypoolb
The output of his model is contrarian to what every accelerator and VC firm in
the Valley says about investing. They believe the team is the most important
factor of success. Thurston says it's 12%. We love pushing the envelope!

------
pp19dd
"The Machine That Won the War."

------
michaelochurch
Ignoring the already-discussed issue of overfitting, I'm going to discuss
something that surprises many, but shouldn't.

 _Using this process, he discovered some surprising things—most notably that a
company’s team is only about 12 percent predictive of a company’s success._

I'm surprised that it's even that high. In the real world (e.g. outside of the
VC-funded world) the team is important, because they're going to grow the
company from scratch, and that requires getting more things right than wrong
over 5+ years. On the other hand, if you're Snapchat, you have several
investor-level people working to protect the company from its founder-quality
problems and its own worst impulses, and the company will IPO or be bought
before its cultural rot reaches a critical point.

The Valley's founder-quality problem creates a lot of awful corporate
cultures, and it has dragged the status of engineers way down, and it's
generally been bad for the world... but it doesn't kill businesses _because_
the investors are able to keep enough of them on track to produce successes.
The influence of VC "rocket fuel" and guidance is why a company can have
terrible founders and still succeed.

~~~
mojuba
There is a factor at play here, I think, and it's the age. Many investors are
simply older and are more experienced individuals. I'm not even sure if their
VC status matters here any more than their age.

(In a rather comical example, two youngsters want to put together another chat
app with cool new smiley icons, then two VCs step in and ask: how are you
going to monetize on smileys?)

Investors should definitely be viewed as part of the team - one, and two, age
should also be taken into account. I wonder if Growth Science considers this
as well.

------
Simp
>> Using this process, he discovered some surprising things—most notably that
a company’s team is only about 12 percent predictive of a company’s success.
“You need to find a good team that won’t ruin the company, but hiring ‘rock
stars’ isn’t that great,” he explains. The market the company is entering is
far more important than who’s running the company.

Makes you wonder why we're paying these people so much money.

------
mojuba
I wonder if they use Bayesian logic, because they should.

I also suspect that even though the article does not disclose a lot, the major
factor at play in their model is the market sizes. I'm pretty sure their
online version would heavily rely on the industry/segment you select. In other
words it's a business-plan-looking-good approach which in today's rapidly
changing world becomes less and less relevant. So I'll remain skeptical about
it.

~~~
Someone1234
> I wonder if they use Bayesian logic, because they should.

Looking at his qualifications, I think it is safe to say he too took
statistics 101.

