
Meet Big Data equivalent of the LAMP Stack - yarapavan
http://gigaom.com/2010/08/01/meet-big-data-equivalent-of-the-lamp-stack/
======
yarapavan
Summary:

The Big Data Stack comprises:

* Hadoop Distributed File System (HDFS) for storage

* MapReduce for distributed processing of large data sets on compute clusters

* HBase for fast read/write access to tabular data

* Hive for SQL-like queries on large data sets as well as a columnar storage layout using RCFile

* Flume for log file and streaming data collection, along with Sqoop for database imports

* JDBC and ODBC drivers to allow tools written for relational databases to access data stored in Hive

* Hue for user interfaces

* Pig for dataflow and parallel computations

* Oozie for workflow

* Avro for serialization

* ZooKeeper as a coordination service for distributed applications
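
For anyone who hasn't seen the MapReduce model mentioned in the list, here is a minimal single-process sketch of the idea in plain Python. It is not the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real cluster would run the map and reduce steps in parallel across machines rather than in one loop.

```python
from collections import defaultdict


def map_phase(line):
    # Emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Group values by key, the step a framework performs between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Collapse all counts for one word into a single total.
    return key, sum(values)


def word_count(lines):
    # Run map over every line, shuffle by word, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["pig hive pig", "Hive flume"]))
```

The point of the split is that map_phase and reduce_phase only ever see one line or one key at a time, which is what lets a framework spread them over many nodes.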

~~~
adnam
HMHHFJHPOAZ doesn't quite have the same ring to it, although I look forward to
listing it on my CV.

~~~
gojomo
Excellent point. So I propose instead "HART" -- "Hadoop And Related Tools".
The HART stack.

~~~
ratsbane
Upvoted, seconded, and added to my CV. I was just thinking, in reading that
article, that one of the biggest obstacles to selling the HART stack to
management types is the names. Can you really, with a straight face, imagine
standing before a room full of grey-haired VPs and saying "yes, we want to
implement this with Hadoop using Hive, Flume, Hue, Pig, Oozie, Avro, and
Zookeeper"? And then explaining that "Hadoop" is named after a teddy bear?

~~~
anamax
> And then explain that "Hadoop" is named after a teddy bear.

Hadoop is named after a stuffed elephant.

------
rbranson
So this stack will curse us all with having to fix awful programming mistakes
for decades?

~~~
d2viant
Could you be more specific?

------
earl
Sweet christ. That reads like fucking trendy Hadoop word soup spat out to get
suckers to hire this firm for $500 an hour.

Also, how does he think that installing this crap is ever going to be like
installing Excel, unless you're installing and configuring Excel into your
custom datacenter topology?

