I created this with a new service I released (Machete). Machete is built on top of dc.js (which in turn is built on d3.js and crossfilter). dc and d3 both have decent learning curves, so Machete is aimed at non-programmers, or at developers who want a prototyping tool.
The visualization works best in Chrome or any cutting-edge browser. On the off chance you're a huge Luddite and surfing HN with IE8, here are a couple of extracted stats across the 2,040 rounds:
YC dwarfs other accelerators with $2.8B in funding; the next largest is AngelPad with $185M.
Average A Round - $4.3M, B - $17M, C - $58M, D - $98M
Airbnb and Dropbox skewed the stats so much that I created a filterable attribute for them called "Freakish_Outlier".
I should be on for a couple of hours, so please feel free to comment or ask questions.
Looks good. There are two things that will be really important:
- How is data imported? It would be great to be able to set up custom SQL queries, CSV imports, and other service gateways to do analytics on.
- How do I define graphs? I'll want to be able to do time-series, but easily switch between day/week/month/year resolution, and create trailing period averages without having to think about it.
I signed up for the beta. I'd love to play around and do some deep diving on stats we have collected.
Right now we only support CSVs, which get served directly from blob storage to the browser (without going through our webserver). We obviously want to expand on those options, but CORS makes serving from internal company data sources a PITA. We could use our domain as a proxy to the relevant info, but that has its own issues (auth, perf). I'm sure we'll be able to work it out though. Automated CSV imports will probably be the best way to do things initially.
When you import your CSV, we take educated guesses about the data in each column (number, category, date), but you can change it if necessary (sometimes numbers should be treated like categories). We give you the ability to group dates by day of week and month. Month-year isn't in there yet, but should be by launch.
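To make the "educated guess" idea concrete, here is a minimal Python sketch of column type guessing with a manual override. It is not Machete's actual code: the function names, the accepted date formats, and the sample values are all assumptions made for illustration.

```python
from datetime import datetime

# Assumed date formats; a real importer would likely try many more.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def _is_number(text):
    """True if the cell parses as a number after dropping commas and a leading $."""
    try:
        float(text.replace(",", "").lstrip("$"))
        return True
    except ValueError:
        return False


def _is_date(text):
    """True if the cell matches one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            pass
    return False


def guess_column_type(values):
    """Guess 'date', 'number', or 'category' from a column's non-empty cells."""
    cells = [v.strip() for v in values if v.strip()]
    if not cells:
        return "category"
    if all(_is_date(c) for c in cells):
        return "date"
    if all(_is_number(c) for c in cells):
        return "number"
    return "category"


# Hypothetical columns: the zip codes look numeric but are really labels.
columns = {
    "announced": ["2014-03-01", "2013-11-15"],
    "raised": ["4300000", "17,000,000"],
    "accelerator": ["YC", "AngelPad"],
    "zip": ["50309", "52240"],
}
guessed = {name: guess_column_type(cells) for name, cells in columns.items()}

# The user can overrule a guess, e.g. treat zip codes as categories.
overrides = {"zip": "category"}
final_types = {**guessed, **overrides}
```

The override step is the part that matters for the "numbers that should be treated like categories" case: the guess is only a default, and the user's choice wins.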
It works best on un-aggregated source data (since we do all the fine-level aggregation ourselves in the browser). Dealing with weighted averages is possible and we've done it a lot in custom work, but Machete doesn't yet have support for it.
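For anyone wondering why pre-aggregated data needs weighted averages at all, here is a small Python sketch. The numbers and the `weighted_average` helper are made up for illustration and are not part of Machete.

```python
def weighted_average(groups):
    """Average of pre-aggregated rows.

    groups is a list of (group_average, row_count) pairs; each group's
    average is weighted by how many source rows it summarizes.
    """
    total_rows = sum(count for _, count in groups)
    if total_rows == 0:
        return None
    return sum(avg * count for avg, count in groups) / total_rows


# Hypothetical pre-aggregated input: one group of 1 row averaging 10,
# and one group of 3 rows averaging 20.
groups = [(10.0, 1), (20.0, 3)]

naive = sum(avg for avg, _ in groups) / len(groups)  # ignores group sizes
correct = weighted_average(groups)                   # matches the raw rows
```

With un-aggregated rows the problem never comes up, because every row counts exactly once.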
I changed the primary sort to the count, then filtered on just "Fan Go". Then, I wanted to compare the total amount Fan Go had raised with other companies, so I changed the primary sort back to the total, but then was unable to see "Fan Go" on the "Total Raised - Company" bar chart. I had to hover over the bubble in the graph instead to see the total amount Fan Go had raised.
Yeah, things can get a bit screwy when you filter and then change the active measure. It isn't a simple fix, unfortunately. I've added a bug to our backlog. Thanks!
Very nice! I missed the currency symbol ($) in the labels; sometimes I wasn't sure if I was seeing a "count" of something or a measure of currency.
Perhaps it's all currency so it's redundant, but I still think proper units are a good thing to have when presenting data. Better one too many than one too few, imo.
Median might solve some of that issue. Unfortunately, it is computationally heavy to do median on a rolling basis, unlike average. Part of the reason the filters work so quickly is that when I add or remove items from the active set, I can just add/subtract from the total and the count for that one item. With median, I'd have to keep the active list sorted, which even with a binary tree under the covers is still more expensive than two math operations. The filtering library under the covers is crossfilter.
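Here is a rough Python sketch of that trade-off, not the actual crossfilter code. The class names are made up, and the median version uses a plain sorted list with `bisect` rather than a binary tree, so its add/remove cost is a binary search plus a list shift.

```python
import bisect


class RunningMean:
    """Tracks only a total and a count, so add/remove are two math operations."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def remove(self, value):
        self.total -= value
        self.count -= 1

    def value(self):
        return self.total / self.count if self.count else None


class RunningMedian:
    """Keeps every active value in sorted order, which costs more per add/remove."""

    def __init__(self):
        self.values = []

    def add(self, value):
        bisect.insort(self.values, value)  # search plus shifting the tail

    def remove(self, value):
        i = bisect.bisect_left(self.values, value)
        if i < len(self.values) and self.values[i] == value:
            self.values.pop(i)

    def value(self):
        n = len(self.values)
        if n == 0:
            return None
        mid = n // 2
        if n % 2:
            return self.values[mid]
        return (self.values[mid - 1] + self.values[mid]) / 2


# Hypothetical round sizes in $M, with one outlier.
mean, median = RunningMean(), RunningMedian()
for amount in [1.0, 2.0, 3.0, 100.0]:
    mean.add(amount)
    median.add(amount)

# Filtering the outlier out of the active set updates both incrementally.
mean.remove(100.0)
median.remove(100.0)
```

The mean only ever stores two numbers per group, while the median has to hold on to every active value, which is the extra cost described above.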
http://www.seed-db.com/companies/funding?value=1 plus a little extra metadata from Crunchbase. Seed-DB (so far as I can tell) only tracks startups that went through an accelerator. We've got a larger list that includes everyone, but it made viewing the accelerator info more difficult because there are so many startups that haven't gone through one.
We'll probably create another board with all startups but will probably do it on a company basis instead of a funding round basis. There are a lot of different ways you could want to see this data.
If you sign up for the beta, we'll let you know when you can create your own. We're following the GitHub model: public projects are free, private projects have a modest fee.
You think so? I read the title and it never crossed my mind to think it was all funding. To me the fact that it says "this much of it" already implies it's a selection.
Edit: that said, the page should definitely mention it (the source and selection criteria).
Surprised to see so little startup activity in the Seattle area. Is there a reason for this, in spite of what I expect to be a pretty serious pool of talent drawn in by Amazon and Microsoft?
ASCII Art of Danny Trejo, lol. I can certainly say we did not consider that for our logo. I like where your head is at though. The name "Machete" was/is a bit of a risk. I'll definitely remember your feedback when we re-do the logo; the current one was just a quick and dirty Fiverr job :)
Is the rapid increase in total raised from 2012 - 2014 (even without Freakish_Outlier) a data artifact, or has the amount of funding actually exploded?
Every company in this dataset went through an accelerator, so my guess is that it is due to the explosion in accelerators after YC's success. (Hell, I'm from Iowa and we have one now and two more are opening in the summer).
We've got another dataset that has more companies (ones that didn't go through accelerators). There is more data so we wanted to re-work and re-aggregate some of the data on a company level instead of a round level. We'll probably release it in a couple days.