
The TechCrunch Bubble Index: Parsing Headlines to Quantify Startup Hype - lil_tee
http://toddwschneider.com/posts/techcrunch-bubble-index/
======
minimaxir

        article.published_at = Time.zone.parse(elm.css(".byline time").first.to_h["datetime"])
        # timezone seems inconsistent, no big deal because we only care about date anyway
    

I've written a TechCrunch scraper myself, and it turns out the reason the
time zone looks inconsistent is that TechCrunch outputs the time in 12-hour
AM/PM format but forgets to include the AM/PM. So the hours just loop around
from 0 to 12.
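
A minimal sketch of why the date-only approach in the quoted comment still
works, using plain Ruby rather than the Rails `Time.zone` call above. The
timestamp layout (`YYYY-MM-DD HH:MM:SS`) and both helper names are
assumptions for illustration, not TechCrunch's actual markup:

```ruby
require "date"

# Hypothetical helper: keep only the leading YYYY-MM-DD part of the raw
# attribute value and ignore the ambiguous hour entirely.
def published_on(raw)
  Date.strptime(raw[0, 10], "%Y-%m-%d")
end

# Hypothetical helper: a 12-hour value with no AM/PM marker maps to two
# possible 24-hour values (12 means either midnight or noon).
def candidate_hours(hour12)
  base = hour12 % 12
  [base, base + 12]
end

puts published_on("2014-10-07 03:15:00")   # date survives the missing AM/PM
puts candidate_hours(3).inspect            # both readings of "3 o'clock"
```

Each hour has two readings, so the time of day cannot be recovered from the
value alone, but the calendar date is unaffected.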

~~~
samcrawford
I too wrote a little TC scraper a few months back, as a friend's startup was
expecting an article there and I wanted to see it first. In the process I
found a WordPress bug that meant unpublished article headlines were leaked.
Reported it to TC, no response at all. Reported it to WordPress via HackerOne,
they've fixed it for WordPress VIP sites, but not in the main codebase yet.
That was four months ago, and they've stopped responding to emails now.

~~~
jdbiggs
Who did you email? I don't see anything. john@techcrunch.com

------
josefresco
I stopped visiting TechCrunch almost 5 years ago when it changed from
highlighting new and upcoming startups to simply covering news of the already
established and huge tech companies such as Twitter, Facebook etc.

I haven't been back, but do catch an article or two when it's linked from HN.
Does anyone here still frequent TC enough to comment on the current editorial
direction?

~~~
photorized
They have a separate startup section now, but even that's getting repetitive.

------
logicallee
This was pretty good, until you published it. Now you have to ask how much
incentive TC's owners have to show a certain graph. (That is just one example
of the gaming this can invite.) They can publish whatever headlines they
want, so they can track and manipulate this graph directly.

I think it would have been better to keep this pretty powerful heuristic for
yourself :)

------
photorized
I remember doing a very basic analysis (via Google search) of TC headlines
back in 2012, but I was curious about their preference for funded companies vs
bootstrapped, in terms of coverage. Predictably, they were mostly covering
funding rounds:

[http://blog.itrendcorporation.com/2012/07/23/no-coverage-for...](http://blog.itrendcorporation.com/2012/07/23/no-coverage-for-self-funded-companies/)

Your data is very interesting. Any chance you could also "group by" company
and list companies which are more frequently (repeatedly) covered by TC? I
have a theory about that, would be interesting to test.
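
A rough sketch of what that "group by" could look like, assuming the company
name is whatever precedes a funding verb in the headline. The pattern, the
verb list, and the function name are hypothetical, and the sample headlines
in the test of this idea would have to come from the real data set:

```ruby
# Hypothetical pattern: capture the text before "Raises $", "Lands $", etc.
FUNDING_PATTERN = /\A(?<company>.+?)\s+(?:Raises|Lands|Secures|Closes)\s+\$/

# Count funding headlines per company, most frequently covered first.
# Headlines that do not match the pattern are skipped.
def coverage_counts(headlines)
  counts = Hash.new(0)
  headlines.each do |headline|
    match = FUNDING_PATTERN.match(headline)
    counts[match[:company]] += 1 if match
  end
  # Sort by descending count, then by name for a stable order.
  counts.sort_by { |company, count| [-count, company] }
end
```

A prefix match like this would miss headlines that put the company later in
the sentence, so the counts are a lower bound at best.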

------
debacle
Maybe the equation is that TechCrunch articles about startup funding get more
"help" from those startups on sites like HN/Reddit, thus they get much more
exposure, thus more views.

This is an amazing analysis. Thank you. I really enjoyed the aggregated "x for
y" headlines.

------
softdev12
This is really awesome. It really says something about the quality of the
product being churned out at TC.

I imagine that most of the funding articles are short. It would be interesting
to see how much of the total words were about fundraising (i.e. not just
headlines, but the entire article).

------
001sky
Fundraising is very envy-inducing. Envy tends to be good for advertisers (look
at women's print magazines, if you have any doubt). I'd be interested to see
how this impacts their advertising business metrics.

~~~
minimaxir
For my TechCrunch scraper, I obtained the number of Facebook shares for each
TechCrunch article to use as a proxy for popularity/page views. There appears
to be a correlation, as Facebook social media shares of TechCrunch articles
have doubled in the same months as the peak of fundraising announcements.

[http://i.imgur.com/Xqhahjs.png](http://i.imgur.com/Xqhahjs.png)

EDIT: reduced unintentional financial verbiage.

~~~
nerfhammer
In the same months? Can't you just look at # of shares post by post and see if
it correlates with whether it's a fundraising post?
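
Something like this sketch, assuming each post is a headline plus a share
count. The `Post` struct, the regex, and the function names are made up for
illustration, and comparing group means is a simplification rather than a
proper correlation test:

```ruby
# Hypothetical record for one scraped article.
Post = Struct.new(:headline, :shares)

# Hypothetical classifier: a funding verb followed by a dollar amount.
FUNDRAISING_HEADLINE = /\b(?:raises|lands|secures|closes)\s+\$/i

# Arithmetic mean, or nil when there is nothing to average.
def mean(values)
  return nil if values.empty?
  values.sum.to_f / values.size
end

# Split posts into fundraising and other, then compare average shares.
def mean_shares_by_type(posts)
  funding, other = posts.partition { |post| post.headline.match?(FUNDRAISING_HEADLINE) }
  { fundraising: mean(funding.map(&:shares)), other: mean(other.map(&:shares)) }
end
```

A large gap between the two averages would support the post-level link;
medians might be a safer choice if a few viral posts dominate the shares.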

~~~
minimaxir
I had the chart+data on hand: doing that analysis would take a little more
work.

------
ultimape
I'd love to see this done with the new hackernews api.

------
billclerico
It's interesting to see a spring surge and a fall surge in headlines, perhaps
a result of the summer slowdown at VC firms.

------
jstoiko
Very interesting. I wonder if any changes in TC's editorial line could have
biased this graph.

------
randomname2
This article indicates startup funding is way down since this spring.

