

Cassandra driver for Spark - tjake
https://github.com/datastax/cassandra-driver-spark

======
anko
I'm really interested in Spark, but know next to nothing about Hadoop. What's
the best way for me to get started?

~~~
pkolaczk
You don't really need to know anything about Hadoop Map/Reduce to start using
Spark. Spark has its own, more powerful "map-reduce".
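
To give a rough feel for what that looks like, here is a minimal word-count
sketch meant to be pasted into the Spark shell. It assumes the shell has
already defined `sc` (the SparkContext), and the two input sentences are made
up for illustration. It doesn't touch Hadoop or Cassandra at all.

```scala
// Assumes the Spark shell, where `sc` is predefined.
// The input data below is invented purely for illustration.
val lines = sc.parallelize(Seq("to be or not to be", "to do is to be"))

val counts = lines
  .flatMap(_.split(" "))      // "map" side: split each line into words
  .map(word => (word, 1))     // pair each word with a count of 1
  .reduceByKey(_ + _)         // "reduce" side: add up the counts per word

// collect() brings the results back to the driver; the order is not guaranteed.
counts.collect().foreach(println)
```

The same chain of transformations works unchanged on a real dataset; only the
line that creates `lines` would differ.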

You need familiarity with one of the storage platforms supported by Spark -
currently these are the Hadoop Distributed File System (HDFS) and Apache
Cassandra. The easiest way to play with Cassandra is:

1\. grab a copy of DSE (free to test or develop) and install it (download
here: http://www.datastax.com/download)

2\. launch 'cqlsh', create a Cassandra keyspace and a table and insert a few
rows into it

3\. launch 'dse spark' and query your data with e.g.
sc.cassandraTable("keyspace", "table").toArray
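
As a rough sketch of steps 2 and 3 together: the keyspace `test`, the table
`kv` and its two rows are hypothetical names picked for this example, and the
CQL is shown only as comments because it belongs in cqlsh, not in the Spark
shell. The Scala part assumes the 'dse spark' shell, where `sc` is predefined
and the driver is already loaded; with plain Apache Cassandra you'd first need
the import described in the driver's README.md.

```scala
// Step 2, run in cqlsh beforehand (hypothetical keyspace, table and rows):
//   CREATE KEYSPACE test WITH replication =
//     {'class': 'SimpleStrategy', 'replication_factor': 1};
//   CREATE TABLE test.kv (key text PRIMARY KEY, value int);
//   INSERT INTO test.kv (key, value) VALUES ('key1', 1);
//   INSERT INTO test.kv (key, value) VALUES ('key2', 2);

// Step 3, run in the 'dse spark' shell, where `sc` is predefined.
val rdd = sc.cassandraTable("test", "kv")

// Fetch all rows to the driver and print them (fine for a handful of rows).
val rows = rdd.collect()
rows.foreach(println)

// Read a column by name and add the values up on the cluster.
val total = rdd.map(row => row.getInt("value")).reduce(_ + _)
println(total)
```

`collect()` does the same job as `toArray` here; either is only sensible for
small tables, since it pulls every row into the shell's memory.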

Doing it with Apache Cassandra (not DSE) is going to be slightly harder,
because besides installing Cassandra, you'll have to set up a standalone Spark
cluster (see the Spark docs), then follow the instructions in the driver's
README.md.

