
The Future Is Big Data in the Cloud - bpung
http://gigaom.com/2009/10/25/the-future-is-big-data-in-the-cloud/
======
aristus
"disclosure: Accel is an investor in Cloudera, the company behind Hadoop"

Um, kinda... no.

This is a decent article, and he's free to give a plug to his portfolio even
if it's disguised as a disclosure. But it's disingenuous at best to call
Cloudera "the" company behind Hadoop when 80% of the code flows from Yahoo.
It's like calling Transmeta "the" company behind Linux.

~~~
jhammerb
Hey,

As a founder of Cloudera and a participant in the Apache Hadoop community, I'm
dismayed whenever the press writes something about Cloudera "creating" Hadoop.
No one here believes that, and we try to acknowledge the great work done by
the community, especially at Yahoo!, Facebook, Amazon, Powerset, et al.,
whenever we can.

To the best of my knowledge, we're the only company to provide a commercially
supported distribution of Hadoop. I'm pretty sure that's what Ping meant when
he said "behind"--supporting, not necessarily solely creating.

Anyways, it's an open source project, and the copyright belongs to the Apache
Software Foundation. In the case of Linux, no one organization contributes
more than 13% of the code (e.g. <http://lwn.net/Articles/222773/>). The same
distribution of contributions will likely be the case for Hadoop as it
matures, and it's reasonably close to that state now: there are subprojects to
which neither Yahoo! nor Cloudera is the primary contributor, and it is most
certainly not the case that 80% of all Apache Hadoop code comes from any
single organization.

Either way, Hadoop is the most interesting open source project with which I've
been involved, and I hope the increasing attention doesn't distract the
community from its primary goal of producing kick-ass code.

Later, Jeff

~~~
spidaman
Ping seems generally confused about how to discuss commercial support of open
source projects, not just Cloudera. His reference to "Northscale, parent
company of Memcached" is even sillier. AFAIK, the only "parent" I'd think of
for memcached, Brad Fitz, is still ensconced at goog.

I've talked to the Northscale guys; sounds like they're onto good stuff, and
I'm not faulting them (just as I wouldn't fault Cloudera). Ping just needs a
copy editor.

------
mey
A future is big data in the cloud.

~~~
loup-vaillant
Good point. I'd rather have small data (e-mail, personal website, shared
files…) at home. It has a smaller "Big Brother" potential.

~~~
vicaya
All the traffic to/from your home is monitored. Encrypted data in the cloud is
much safer and more available.

~~~
mey
Why can't traffic to/from your home be encrypted?

