

Brisk = Cassandra + Hive + Hadoop - yarapavan
https://github.com/riptano/brisk/tree/brisk1

======
ajays
The main selling point is that Cassandra has replaced HDFS as the underlying
filesystem for Hadoop.

I'm curious: what are the performance implications of using CFS instead of
HDFS? Is it faster, slower, or more fault-tolerant? (It appears to be more
fault-tolerant, given that there's no single point of failure in CFS.)

~~~
jbellis
That's one point: CFS means it's more fault tolerant and simpler to run (only
one type of node), with equal-or-better performance. You also get multi-
datacenter support essentially for free.
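
To make the fault-tolerance point concrete, here is a toy availability model, not a benchmark of CFS or HDFS. It assumes nodes fail independently, that any one live replica can serve a read, and that the single-master design needs its one metadata node up for every read. The function names and the 0.99 per-node uptime are made up for illustration.

```python
def single_master_availability(p_master: float, p_node: float, replicas: int) -> float:
    """Chance a block is readable when one master must be up (assumed model).

    A read needs the master AND at least one of the replicas to be alive.
    """
    at_least_one_replica = 1.0 - (1.0 - p_node) ** replicas
    return p_master * at_least_one_replica


def peer_ring_availability(p_node: float, replicas: int) -> float:
    """Chance a block is readable when every node is a peer (assumed model).

    There is no master term: a read only needs one live replica.
    """
    return 1.0 - (1.0 - p_node) ** replicas


if __name__ == "__main__":
    uptime = 0.99   # hypothetical per-node uptime
    replicas = 3    # hypothetical replication factor
    print("single master:", single_master_availability(uptime, uptime, replicas))
    print("peer ring:    ", peer_ring_availability(uptime, replicas))
```

Under these assumptions the single-master figure can never exceed the master's own uptime, however many replicas you add, while the peer figure keeps improving with the replication factor.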

The other big win is you can run analytic queries against your realtime data
(data in Cassandra columnfamilies/tables, rather than a blob interface like
CFS/HDFS) instantly, with no ETL. (An example:
<http://www.datastax.com/docs/0.8/brisk/brisk_demo>.)

Finally, Brisk takes care of integrating the Hadoop job/task trackers with
Cassandra automagically, which otherwise requires fairly deep knowledge of
both Hadoop and Cassandra (<http://wiki.apache.org/cassandra/HadoopSupport>).

Brisk also integrates the full stack (including Hive queries) with DataStax
OpsCenter:
[https://s3.amazonaws.com/uploads.hipchat.com/6528/23268/uohl...](https://s3.amazonaws.com/uploads.hipchat.com/6528/23268/uohlm21erckytzx/jt.png)

------
yarapavan
Documentation: <http://www.datastax.com/docs/0.8/brisk/index>

