Show HN: $4.4B in Startup Funding Rounds Visualized (machete.io)
85 points by viggity on June 24, 2014 | 36 comments



I created this with a new service I released (Machete). Machete is built on top of dc.js (which in turn is built on d3.js and crossfilter). dc and d3 both have fairly steep learning curves, so Machete is aimed at non-programmers, or at developers who want a prototyping tool.

The visualization works best in Chrome or any modern browser. On the off chance you're a huge Luddite and surfing HN with IE8, here are a few extracted stats across the 2,040 rounds:

YC dwarfs other accelerators with $2.8B in funding, next largest is AngelPad with $185M.

Average A Round - $4.3M, B - $17M, C - $58M, D - $98M

Airbnb and Dropbox skewed the stats so much that I created a filterable attribute for them called "Freakish_Outlier".

I should be on for a couple hours, please feel free to comment or ask questions.


Looks good. There are two things that will be really important:

- How is data imported? It would be great to be able to set up custom SQL queries, CSV imports, and other service gateways to do analytics on.

- How do I define graphs? I'll want to be able to do time-series, but easily switch between day/week/month/year resolution, and create trailing period averages without having to think about it.

I signed up for the beta. I'd love to play around and do some deep diving on stats we have collected.


Right now we only support CSVs, which get served directly from blob storage to the browser (without going through our webserver). We obviously want to expand on those options, but CORS makes serving from internal company data sources a PITA. We could use our domain as a proxy to the relevant info, but that has its own issues (auth, perf). I'm sure we'll be able to work it out though. Automated CSV imports will probably be the best way to do things initially.

When you import your CSV, we take educated guesses about the data in each column (number, category, date), but you can change it if necessary (sometimes numbers should be treated like categories). We give you the ability to group dates by day of week and month. Month-year isn't in there yet, but should be by launch.
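
To make the "educated guesses" idea concrete, here is a minimal sketch of per-column type inference with a manual override. It is an illustration under stated assumptions, not Machete's actual logic: the function names, the accepted date formats, and the `override` parameter are all hypothetical.

```python
from datetime import datetime

# Hypothetical list of accepted date formats; a real importer would need more.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y-%m-%d %H:%M:%S")


def _is_number(text):
    """True if the cell parses as a number (commas and a leading $ allowed)."""
    try:
        float(text.replace(",", "").lstrip("$"))
        return True
    except ValueError:
        return False


def _is_date(text):
    """True if the cell matches any of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def guess_column_type(values, override=None):
    """Guess 'number', 'date', or 'category' for one column of CSV strings.

    An explicit override wins, e.g. zip codes that look numeric but
    should be treated as categories.
    """
    if override is not None:
        return override
    cells = [v.strip() for v in values if v and v.strip()]
    if not cells:
        return "category"  # nothing to inspect, fall back to the safest type
    if all(_is_number(c) for c in cells):
        return "number"
    if all(_is_date(c) for c in cells):
        return "date"
    return "category"
```

For example, `guess_column_type(["94105", "10001"])` guesses a number, while passing `override="category"` keeps the column as a category.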

It works best on un-aggregated source data (since we do all the fine-level aggregation ourselves in the browser). Dealing with weighted averages is possible and we've done it a lot in custom work, but Machete doesn't yet have support for it.
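
A small sketch of why pre-aggregated data needs weighted averages: averaging group averages directly gives the wrong answer unless each group is weighted by its row count. The numbers and function names below are made up for illustration and are not taken from the funding dataset or from Machete.

```python
def mean(values):
    """Plain arithmetic mean of raw values."""
    return sum(values) / len(values)


def weighted_mean(pairs):
    """Mean reconstructed from (group_average, group_count) pairs."""
    total_weight = sum(count for _, count in pairs)
    return sum(avg * count for avg, count in pairs) / total_weight


# Hypothetical raw rows (round sizes in $M) for two groups.
group_a = [2, 4]
group_b = [10, 10, 10, 12]

# What a pre-aggregated source would contain: one average and count per group.
aggregated = [(mean(group_a), len(group_a)), (mean(group_b), len(group_b))]

true_mean = mean(group_a + group_b)               # computed from raw rows
naive_mean = mean([avg for avg, _ in aggregated])  # ignores group sizes
fixed_mean = weighted_mean(aggregated)             # weights by group size
```

Here `fixed_mean` matches `true_mean`, while `naive_mean` is pulled toward the smaller group.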

Thanks for the kind words :)


I changed the primary sort to the count, then filtered on just "Fan Go". Then, I wanted to compare the total amount Fan Go had raised in comparison to other companies, so I changed the primary sort back to the total, but then was unable to see "Fan Go" on the "Total Raised - Company" bar chart. I had to hover over the bubble in the graph instead to see the total amount Fan Go had raised.


Yeah, things can get a bit screwy when you filter and then change the active measure. It isn't a simple fix, unfortunately. I've added a bug to our backlog. Thanks!


Very nice! I missed the currency symbol ($) in the labels, I felt that I wasn't sure sometimes if I was seeing a "count" of something, or a measure of currency.

Perhaps it's all currency so it's redundant, but I still think proper units are a good thing to have when presenting data. Better one too many than one too few, imo.


Under "Region" (with raised_total), what is the region between Atlanta and Seattle with no label?


It's just data that wasn't provided :(


You could use median instead of filtering out outliers explicitly so your averages don't get skewed.


Median might solve some of that issue, but unfortunately it is computationally heavy to compute a median on a rolling basis, unlike an average. Part of the reason the filters work so quickly is that when I add or remove items from the active set, I can just add to or subtract from the total and the count for that one item. With median, I'd have to keep the active list sorted, which even with a binary tree under the covers is still more expensive than two math operations. The filtering library under the covers is crossfilter.
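
The trade-off described here can be sketched in a few lines. This is a minimal illustration, not crossfilter's implementation: the class names are hypothetical, and a sorted Python list stands in for the binary tree mentioned above.

```python
import bisect


class RunningMean:
    """Mean kept as a running total and count; add/remove are two math ops."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def remove(self, value):
        self.total -= value
        self.count -= 1

    def value(self):
        return self.total / self.count if self.count else None


class RunningMedian:
    """Median kept via a sorted list; every add/remove must find a position."""

    def __init__(self):
        self.items = []

    def add(self, value):
        # Binary search for the slot, then shift later elements to insert.
        bisect.insort(self.items, value)

    def remove(self, value):
        i = bisect.bisect_left(self.items, value)
        if i < len(self.items) and self.items[i] == value:
            self.items.pop(i)

    def value(self):
        n = len(self.items)
        if n == 0:
            return None
        mid = n // 2
        if n % 2:
            return self.items[mid]
        return (self.items[mid - 1] + self.items[mid]) / 2
```

With made-up round sizes of 1, 2, 3, and 250, the mean is dragged up by the single outlier while the median barely moves, which is the suggestion above; the cost is the extra bookkeeping on every filter change.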


Anyone know why Dropbox is considered a Dublin-based company? Shouldn't their SF office be considered the main HQ?


What are the tax laws of Ireland like this time of year?


Soft day, thank god


12.5%


Where are you getting the data for this? It seems to be missing a fair number of startups that have had significant (>20M) rounds.


http://www.seed-db.com/companies/funding?value=1 plus a little extra meta-data from crunchbase. Seed-DB (so far as I can tell) only tracks startups that went through an accelerator. We've got a larger list that includes everyone, but it made viewing the accelerator info more difficult because there are so many startups that haven't gone through one.

We'll probably create another board with all startups but will probably do it on a company basis instead of a funding round basis. There are a lot of different ways you could want to see this data.

If you sign up for the beta, we'll let you know when you can create your own. We're following the GitHub model: public projects are free, private projects have a modest fee.


> Seed-DB (so far as I can tell) only tracks startups that went through an accelerator.

You should alter the HN post's title to reflect that.


You think so? I read the title and it never crossed my mind to think it was all funding. To me the fact that it says "this much of it" already implies it's a selection.

Edit: that said, the page should definitely mention it (the source and selection criteria).


It wasn't apparent from the title that this data was mostly from Seed-DB, and partly from Crunchbase.

Seed-DB is accelerator based funding, and the crunchbase data that was used is incomplete.

It lists Seattle behind Atlanta in terms of Startup Funding.


For example, you are missing Groupon, which raised multiple large rounds.


Also, some of the existing data does not seem up to date. Just looked up Instacart and CrunchBase has them at 54M raised whereas your chart says 11M.


Surprised to see so little startup activity in the Seattle area. Is there a reason for this, in spite of what I expect to be a pretty serious pool of talent drawn in by Amazon and Microsoft?


The data is missing a lot of Seattle based startups.

Including some of the larger non-exited startups:

Chef raised 65 million, INRIX 78.1 million, Avalara 84.6 million.


Awesome idea. The name, "Machete" is fine, but the logo is too aggressive. Consider something more playful, like a smiling machete, or ASCII art of Danny Trejo :D, http://upload.wikimedia.org/wikipedia/commons/8/86/Danny_Tre...

Just kidding. But seriously, a friendlier logo would make me more likely to try the service. Best of luck!


I like the fact that the logo is aggressive given the startup scene is very much so. So many startup logos are cutesy and playful.


ASCII art of Danny Trejo, lol. I can certainly say we did not consider that for our logo. I like where your head is at though. The name "Machete" was/is a bit of a risk, and I'll definitely remember your feedback when we redo the logo. The current one was just a quick and dirty Fiverr job :)


It's an awesome logo, don't wimp out. :)


Keep the logo :)


What's the data source?

Is the rapid increase in total raised from 2012 to 2014 (even without Freakish_Outlier) a data artifact, or has the amount of funding actually exploded?


Every company in this dataset went through an accelerator, so my guess is that the increase is due to the explosion in accelerators after YC's success. (Hell, I'm from Iowa and we have one now, and two more are opening in the summer.)

We've got another dataset that has more companies (ones that didn't go through accelerators). There is more data so we wanted to re-work and re-aggregate some of the data on a company level instead of a round level. We'll probably release it in a couple days.


It's almost impossible to click on Grant, and other. Suggestion: Make the Title a part of the Label area.


Took me a second to understand what you are saying, but YES, that is definitely on our backlog. It's a PITA when an item has a very small clickable bar.


Wow, YC-accelerated companies have captured 64% of the $4.4B.


You can thank Dropbox for a large chunk of that.


This is a crazy stat... Seriously.


dc.js, as always, did a good job.



