

Cutting cost and power consumption for big data - tpatke
http://newsoffice.mit.edu/2015/cutting-cost-power-big-data-0710

======
jcr
The (_unmentioned_) title of the paper is "BlueDBM: An Appliance for Big
Data Analytics"

Abstract:

> _" Complex data queries, because of their need for random accesses, have
> proven to be slow unless all the data can be accommodated in DRAM. There are
> many domains, such as genomics, geological data and daily Twitter feeds,
> where the datasets of interest are 5TB to 20TB. For such a dataset, one
> would need a cluster with 100 servers, each with 128GB to 256GB of DRAM, to
> accommodate all the data in DRAM. On the other hand, such datasets could be
> stored easily in the flash memory of a rack-sized cluster. Flash storage has
> much better random access performance than hard disks, which makes it
> desirable for analytics workloads. In this paper we present BlueDBM, a new
> system architecture which has flash-based storage with in-store processing
> capability and a low-latency high-throughput inter-controller network. We
> show that BlueDBM outperforms a flash-based system without these features by
> a factor of 10 for some important applications. While the performance of a
> ram-cloud system falls sharply even if only 5%~10% of the references are to
> the secondary storage, this sharp performance degradation is not an issue in
> BlueDBM. BlueDBM presents an attractive point in the cost-performance
> trade-off for Big Data analytics."_

[http://people.csail.mit.edu/wjun/papers/ISCA15_Sang-
Woo_Jun....](http://people.csail.mit.edu/wjun/papers/ISCA15_Sang-Woo_Jun.pdf)
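
As a rough sanity check on the abstract's sizing claim, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and it assumes decimal units (1 TB = 1000 GB) and that every byte of DRAM is available for data, which a real cluster would not manage once the OS, indexes and replication take their share.

```python
import math

GB_PER_TB = 1000  # assumption: decimal units, not 1024


def aggregate_dram_tb(servers, dram_gb_per_server):
    """Total DRAM across the cluster, in TB."""
    return servers * dram_gb_per_server / GB_PER_TB


def servers_needed(dataset_tb, dram_gb_per_server):
    """Minimum servers to hold the whole dataset in DRAM (no overhead)."""
    return math.ceil(dataset_tb * GB_PER_TB / dram_gb_per_server)


if __name__ == "__main__":
    # The abstract's configuration: 100 servers with 128GB or 256GB each.
    for dram_gb in (128, 256):
        total = aggregate_dram_tb(100, dram_gb)
        print(f"100 servers x {dram_gb}GB = {total} TB of DRAM")

    # The other direction: servers required for the quoted dataset sizes.
    for dataset_tb in (5, 20):
        for dram_gb in (128, 256):
            n = servers_needed(dataset_tb, dram_gb)
            print(f"{dataset_tb}TB dataset at {dram_gb}GB/server: {n} servers")
```

So 100 servers give roughly 12.8TB to 25.6TB of aggregate DRAM, which brackets the upper end of the 5TB to 20TB range the abstract mentions. The no-overhead lower bound for a 20TB dataset is 79 servers at 256GB each, so "100 servers" looks like a reasonable round figure once some headroom is allowed.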

