Crunchbase admits it may need to learn more about Creative Commons (techcrunch.com)
63 points by pkmehta on Nov 6, 2013 | 11 comments

I was part of the team that created Crunchbase at Techcrunch and it is really sad to see what has happened, since the way People+ are using Crunchbase is exactly what the design intended.

I can remember the specific conversation about picking CC-BY+A. It was because we wanted other apps to use the data, even if they were competing with us (and they were).

It also seems they have dumped the open API. Previously you could just add '.js' to any page and get a JSON representation of it: no developer signup, no stupid API key, etc.
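To illustrate the '.js' convention described above, here is a minimal sketch of how a client might have derived the JSON address from a page address. The function name and the example path are made up for illustration, and since the comment says the open API appears to be gone, the resulting URL should not be expected to respond today.

```python
from urllib.parse import urlsplit, urlunsplit


def json_url_for(page_url: str) -> str:
    """Return the page URL with '.js' appended to its path.

    Hypothetical helper: it only builds the URL and makes no network call.
    """
    parts = urlsplit(page_url)
    # Drop a trailing slash so we get '/company/foo.js', not '/company/foo/.js'.
    path = parts.path.rstrip("/")
    if not path.endswith(".js"):
        path += ".js"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Example with an invented company page path.
print(json_url_for("http://www.crunchbase.com/company/example-co"))
```

The appeal of the scheme was that any page a person could browse was also a machine-readable record, with nothing extra to register for.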


My impression now is that since the AOL acquisition Crunchbase has tried to retract a lot of its openness, which was a big reason for its success.

I think the key difference is that we saw Crunchbase as a free service to the startup community that would reinforce Techcrunch, while AOL see it as having a distinct and independent commercial value.

Nik - good stuff. Was there ever a monetization plan when you were building it, or was it always just viewed as a complementary asset to TC as you highlight above?

Comment from TC article left by IronToby which is a great point:

"Sounds like they wanted people to contribute content for free while they're the only ones allowed to profit from it."

We incubated a few projects (Crunchbase, Crunchpad, Crunchies, Techcrunch40) and even acquired some (Inviteshare) - they were all developed with little or no emphasis on revenue or commercialization, but to offer products/services to the startup community (eg. Techcrunch40 didn't charge startups to appear, Crunchbase was CC licensed and open, the Crunchpad was going to be sold at cost).

There was some upside for us, but in those first years the entirety of revenue was poured back into developing other projects (Crunchbase and Crunchpad were both 7-figure investments). The projects did help the brand, but likely not in any real ROI sense. I don't know how Crunchbase was valued in the acquisition, it likely wasn't, since the acquisition price was close to a market multiple on Techcrunch ad revenue alone.

They were all a lot of fun to work on. We had a great team of people, some excellent interns, developers and partners, and it made working at Techcrunch all the more interesting (at many points we had more developers and people working on these projects than we did working on the main site). It was a philosophy of doing well while doing good.

The idea of Crunchbase was for it to be a co-op with other blogs, and that the writers would keep it updated while they wrote posts on each company. That didn't work because the other blogs saw it as a threat (a lot of them ended up signing with a commercial alternative that had raised $5M and spent a lot of that money signing up blogs as part of 'revenue guarantees').

The active decision was then made to make Crunchbase open. The reasoning is alluded to in some of the other comments in this thread: if you develop the site as a commercial service, users expect the data to be up to date, correct and deep. The alternative is an open platform, where users and other blogs are part of it, the argument being that when Wikipedia has poor information users don't blame Wikipedia, they roll their sleeves up and fix it themselves. That was at the core of Crunchbase - not commercial, owned by users, and therefore we wouldn't have to make the tradeoff of 'owning' it completely and having to invest extensively to keep the quality at the level expected of a full commercial service (although we did have almost a dozen people working on Crunchbase).

I don't think AOL have thought that through in the same way. You can't on one hand take advantage of the openness of user contributions, not keep the quality up with investment but at the same time expect to protect it as a commercial service. The only commercial ideas we had for Crunchbase were as additions, eg. custom reporting, research, etc. By design the platform was open, free, forkable and insulated from commercial acquisition.

That philosophy just came about naturally because of the group of people involved. There was a little skepticism, but it didn't take long for everybody to start throwing out ideas on making and keeping things as open as possible. It shouldn't surprise us that with an entirely new group of people (bar one, IIRC) and a new corporate parent, parts of it might be altered, but in reading Matt's response I can see that 95% of what was in place is still there; they were just caught out and mishandled this one case. Very strange to see now that Techcrunch and the EFF are on opposite sides of an argument, I hope they clear it up. We used to monitor implementations[0] and send an email requesting a link back or a logo, and never had a problem with it - it definitely never reached the point where a lawyer would be drafting a C&D. I don't think we ever had a clone or API user that we didn't get along with.

[0] We set up a number of Crunchbase records that served as canaries - fake company names (on domains we registered) that were only in Crunchbase. If we found those company names anywhere online, it would be a hint that somebody took the data. If they didn't cite the source, we would email them, and in almost all cases it was resolved amicably. We wanted nothing more than to enforce the 'attribute' part of CC+A. A big part of the reason why Crunchbase records rank so well in SERPs (they are usually in the top 5-6 results when searching for the name of a startup) is the "free data, but link back" policy. It meant crunchbase.com was always the canonical source for Crunchbase data.
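The canary check in [0] can be sketched roughly as below. This is only an illustration of the idea, not the tooling that was actually used: the canary names, the function names, and the choice of "crunchbase.com" as the attribution marker are all assumptions made for the example.

```python
# Hypothetical canary names; real canaries would be fake companies that
# exist only in the dataset being protected.
CANARY_NAMES = {"Zorvexa Labs", "Quindle Systems"}


def find_canaries(page_text, canaries=CANARY_NAMES):
    """Return the canary names that appear in page_text (case-insensitive)."""
    lowered = page_text.casefold()
    return sorted(name for name in canaries if name.casefold() in lowered)


def needs_attribution_request(page_text, canaries=CANARY_NAMES,
                              source_marker="crunchbase.com"):
    """True if the page contains a canary but never mentions the source.

    The source_marker check is a crude stand-in for "did they link back".
    """
    hits = find_canaries(page_text, canaries)
    cites_source = source_marker.casefold() in page_text.casefold()
    return bool(hits) and not cites_source


page = "New this week: zorvexa labs, a stealth analytics startup."
if needs_attribution_request(page):
    print("Canary found without attribution:", find_canaries(page))
```

A hit only hints that the data was copied; as the footnote says, the follow-up was a friendly email asking for a link back, not a legal notice.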

I think the comments by Matt Kaufman (president of Crunchbase) about the damage that can be caused by "Crunchbase replicas" reflect the knowledge on AOL's part that Crunchbase is a potentially very interesting data asset. And so the reality is that they'll prob revisit rights to Crunchbase, and it'll go from very open to something less so (how far is the question). I would also guess that Crunchbase has a paid subscription offering in the works.

I would also speculate that People+ may be the first "offender" Crunchbase will target. Datafox.co, Mattermark, Inkwire.io might all find themselves in the crosshairs.

This also underscores a common mistake developers make: building on someone else's platform, where the rules can and will change significantly, especially when monetization becomes important. And it seems monetization is on Crunchbase's mind.


This post on the Crunchbase blog articulates that monetization is coming and that the above "offenders" may be hearing from AOL counsel soon.


"CrunchBase must remain open to anyone who wants to contribute, and retrieving that data for non-commercial benefit must remain open as well. That said, to invest in CrunchBase’s constant improvement requires building a business around CrunchBase in a way that successfully takes into account our terms of service and our openness."

How reliable is Crunchbase, anyway? In my experience, it's frequently outdated and incorrect, but I don't search it often.

I've used it for a few projects. If you're using it as an only source, keep that caveat in mind, but it's a good starting point.

I love Crunchbase but I would have more sympathy for them if their long tail content was given a bit more maintenance love. If your content stagnates, you deserve competition. To be fair this takes effort, but how many young people would knock themselves out for a low paid or unpaid internship there to get this stuff done? It seems more like a lack of caring, although admittedly I'm an ignorant outsider.

AOL are used to doing the content exploiting, not being the ones having all the value extracted from their work.

I've caught Crunchbase employees blatantly copying and pasting content from AngelList - which is not Creative Commons as far as I'm aware. I'm surprised they're up in arms over this.

AngelList is one of CrunchBase's partners, with syndication rights.

Sounds like Twitter -- "It's open, contribute all you want, until you want to get all of the data back out; Then you have to pay"
