

Are we in another bubble? Empirical data from CrunchBase - keven
http://www.kevenlin.com/entry/are-we-in-a-bubble-empirical-data-from-crunchbase

======
daniel_levine
I worked at TechCrunch for a year focusing mainly on CrunchBase data analysis.

There are a number of problems with your analysis:

I suspect you are using founding year. Unfortunately there is a tremendous lag
between when a company is founded and when it is entered into CrunchBase. The
same is true of funding data in particular, where we saw only around 20% of
fundings entered within a quarter of happening and only about 70% a year out.
That is due to CrunchBase's continued growth (it's much better known now) as
well as a natural reporting lag.
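
To make the lag concrete, here is a minimal sketch of how recent counts could be scaled up before comparing years. It uses only the two rough figures above (about 20% entered after a quarter, about 70% after a year). The linear interpolation between them and the function names are assumptions for illustration, not anything CrunchBase provides.

```python
# Completeness figures are the rough ones quoted above; the linear
# interpolation between them is an assumption made for illustration only.
FRACTION_AFTER_ONE_QUARTER = 0.20
FRACTION_AFTER_ONE_YEAR = 0.70


def estimated_completeness(age_in_quarters):
    """Estimate the fraction of fundings already entered for a period of this age."""
    if age_in_quarters <= 1:
        return FRACTION_AFTER_ONE_QUARTER
    if age_in_quarters >= 4:
        # No figure is given beyond one year, so stay at 70% rather than guess.
        return FRACTION_AFTER_ONE_YEAR
    span = FRACTION_AFTER_ONE_YEAR - FRACTION_AFTER_ONE_QUARTER
    return FRACTION_AFTER_ONE_QUARTER + span * (age_in_quarters - 1) / 3


def lag_adjusted_count(observed_count, age_in_quarters):
    """Scale an observed count up by the estimated share still missing."""
    return observed_count / estimated_completeness(age_in_quarters)


# 100 fundings recorded for a quarter that ended one quarter ago
# would suggest roughly 500 once reporting catches up.
print(round(lag_adjusted_count(100, 1)))
```

Even a crude correction like this shows why the most recent year can look like a drop when it is mostly unreported data.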

Second, CrunchBase is a very new product and as it turns out data is only
reliable as far back as 2007 and even that took a lot of work. Some time has
been spent pushing to get more accurate data further back but it is
scattershot at best.

(Possibly, can't tell) CrunchBase investments are stored in a number of
currencies. Did you make sure to convert them to a common one? Yen can really
cause problems :)
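
A small sketch of why that matters, assuming each round comes as an (amount, currency code) pair. The exchange rates below are made-up placeholders, and a real analysis would want the rate at the date of each round.

```python
# Placeholder rates to USD, for illustration only. They are not real quotes.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.4, "JPY": 0.012}


def to_usd(amount, currency):
    """Convert an amount to USD, failing loudly on an unknown currency."""
    if currency not in RATES_TO_USD:
        raise ValueError(f"no exchange rate for {currency!r}")
    return amount * RATES_TO_USD[currency]


# Two hypothetical rounds: $5M in dollars and 500M in yen.
rounds = [(5_000_000, "USD"), (500_000_000, "JPY")]

naive_total = sum(amount for amount, _ in rounds)  # treats yen as dollars
usd_total = sum(to_usd(amount, currency) for amount, currency in rounds)

print(f"naive: {naive_total:,.0f}  converted: {usd_total:,.0f}")
```

With these placeholder rates, a single unconverted yen round dominates the naive sum, which is enough to distort a yearly total.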

Lastly, your NASDAQ chart is from 94 - 2005 which never overlaps with reliable
CrunchBase data even by your own admission. I suspect that graph will be a bit
more telling and worrisome potentially:
[http://www.google.com//finance?chdnp=1&chdd=1&chds=1...](http://www.google.com//finance?chdnp=1&chdd=1&chds=1&chdv=1&chvs=maximized&chdeh=0&chfdeh=0&chdet=1299226777603&chddm=1699990&chls=IntervalBasedLine&q=INDEXNASDAQ:.IXIC&ntsp=0).

I do not necessarily think we are in a bubble, and I am happy to see people
diving in on data. I just wanted to point these things out, as it would be
irresponsible not to.

~~~
notahacker
Wouldn't it also be the case that well funded startups and super angels are
much more likely to disclose sources and levels of funding to TechCrunch at an
early stage for the PR than they were back when TC was a popular blog rather
than a big-name publisher?

Presumably there are also some sort of editorial policies over what sort of
startups merit inclusion in CrunchBase? The 2010 _drop_ in startups covered by
CrunchBase could easily reflect reduced interest in covering smaller startups
that don't effectively court TC and don't disclose relevant funding data.

~~~
daniel_levine
I'm not sure the amount or willingness of sources has changed that much since
the AOL acquisition though it is possible.

But there is definitely a huge selection bias of contributors. People love to
disclose when they invested in the hot startup and neglect to mention their
big mistakes retroactively.

I suspect the biggest reason for the drop is just a smaller team and less
commitment, like the OP said. There has been a lot of headcount flux at TC for
a while and even more so since the AOL deal.

------
andraz
Take this with a grain of salt.

CrunchBase does not by any means offer a stable and comprehensive picture over
the years. It has been maintained with different levels of commitment,
especially for entering old data (since CrunchBase did not exist in 2000).

CrunchBase is a great resource, but doing those kinds of statistics without
appropriate research into how consistent the coverage is, is at least sloppy
if not willful negligence.

~~~
keven
I agree. The data is not consistent, but to my knowledge it's probably the
most well-maintained and accessible data (with its API).

That is the reason I look at investments from 2005 onward, since most
investments prior to that were not entered.

~~~
andraz
The data is useful, but not in the way you used it.

I don't think you did a good job of disclosing the problems that might lie
in the data.

Since I've been following CrunchBase both from the data perspective and from
the perspective of how many resources TechCrunch is devoting to it, I can
assure you it varies wildly.

Also, the editorial policy has changed a lot in that period. In about
2007/2008 they started putting much more emphasis on international start-ups.
So there are specific skews that you should be aware of and disclose in the
blog post.

So I think you did a nice job, but the conclusions are not to be trusted at
all. Yes, it might be the most well-maintained open data out there, but that
does not make it any more useful for this kind of analysis.

------
tuhin
A 38% decline in number of startups and a 31% increase in investments!

Also, the underlying assumption seems to be that an exponential increase is
the only reason for a bubble.
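
Taken together, those two figures imply a sharp rise in investment per new startup. A quick back-of-the-envelope check, assuming both percentages cover the same period and that "investments" means total amount invested (the post may mean something else):

```python
# Figures quoted above; treating them as directly comparable is an assumption.
startups_change = -0.38     # 38% fewer new startups
investments_change = 0.31   # 31% more investment

per_startup_ratio = (1 + investments_change) / (1 + startups_change)
print(f"investment per new startup: about {per_startup_ratio:.2f}x")
```

That works out to a bit more than double per startup, if the two numbers really do line up that way.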

~~~
Tyrannosaurs
This was what struck me. It seems to be a fairly clear pointer to a lowering
of standards (unless someone can come up with a convincing reason why startups
have suddenly got significantly more viable in the last couple of years).

"This is actually a great time to be a startup founder" - yep, so was 1999, no
business plan needed, just a half baked idea and people will start throwing
money at you. This is not necessarily a good thing.

------
dools
There's no discussion here about revenues. Wouldn't the only reasonable metric
indicating a bubble be over-valuation of companies?

~~~
keven
It would help but most private companies don't report annual revenues.

~~~
dools
Sure, well any conclusions in the absence of valuation or revenue data seem
moot to me.

------
gojomo
I wouldn't trust CrunchBase for reliable trend data because its coverage over
time is unlikely to be consistent. But, a similar analysis based on legally-
required (and thus comprehensive) SEC Form Ds might give stronger insight.

At least two Form-D watching services have been mentioned previously on HN:

<http://stealthmodewatch.com>

<http://www.formds.com>

Though, I think they're relying on easier electronic access that may not go
back more than a few years.

------
hessenwolf
"In 2010, there was actually a 38% drop in number of new startups."

Pffttt.... where did this come from? Is it not that after two years of a lousy
economy everybody's savings were tapped out?

------
funthree
Based on the last graph we aren't in a tech bubble _yet_

We are just entering the last quarter of 1998.

~~~
Tyrannosaurs
Or that we're in a bubble, it's just not quite as large as the last one?

------
warrenwilkinson
I doubt the failed startups of 2000 - 2006, etc. are fully represented in
CrunchBase's database. This alone could account for the curve.

