
RAMClouds: Scalable High-Performance Storage Entirely in DRAM - DavidSJ
http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf
======
vicaya
Ousterhout and gang are credible guys, but this is completely back-of-the-envelope vaporware. Some numbers are a bit dubious as well, like 1M requests/s per server. Say the requests are simple messages like Twitter's, with an average length of 200 bytes (including message overhead): that would need cross-sectional switch bandwidth of 2 Gbit/s on small frames. Anything more interesting, say 20KB web pages, would need 200 Gbit/s of switch bandwidth, which is not going to happen anytime soon. It's no big deal to do 1M/s in-process, but I'd love to see a real implementation that can do that over a commodity network.
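
The arithmetic behind those figures is easy to check. A minimal sketch, using the 1M requests/s rate and the payload sizes assumed in the comment above (and ignoring protocol overhead):

```python
# Back-of-envelope cross-sectional bandwidth for a server handling
# 1M requests/s, at the payload sizes assumed in the comment above.
REQUESTS_PER_SEC = 1_000_000

def required_gbits(payload_bytes: int) -> float:
    """Aggregate bandwidth in Gbit/s, ignoring framing/protocol overhead."""
    return REQUESTS_PER_SEC * payload_bytes * 8 / 1e9

print(required_gbits(200))     # 1.6  -> ~2 Gbit/s with framing overhead
print(required_gbits(20_000))  # 160.0 -> the "200 Gbit/s" ballpark above
```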

Real RAM clouds already exist; Bigtable etc. have durable in-memory tables that can take advantage of SSDs as well.

~~~
neilc
_this is completely back of the envelope vaporware_

Sure -- the linked-to paper is a position paper that basically just explains
the problem they are trying to solve, not the specific details of their
solution (the project is just beginning).

As for your particular example: presumably maximum throughput involves batching requests together, so the need to achieve that performance for small frames is reduced. You can also use Valiant Load Balancing or similar techniques to avoid needing the full cross-sectional bandwidth in a single switch.

I think the latency number (5-10 microseconds) is actually more interesting: you can use batching and load balancing to improve throughput, but not latency. Given that 5-10 microseconds is typically well below the port-to-port forwarding time of _a single switch_, achieving that latency figure will require work at many levels of the stack (network hardware, kernel <=> user space transitions, etc.).
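
The throughput-vs-latency distinction can be made concrete with a toy model (the overhead numbers below are illustrative assumptions, not figures from the paper): batching amortizes a fixed per-round-trip cost across many requests, but every request still pays at least one full round trip of latency.

```python
# Toy model: each network round trip has a fixed overhead, and a batch
# of B requests shares one round trip. Numbers are illustrative only.
ROUND_TRIP_US = 50.0   # assumed fixed per-trip overhead, microseconds
PER_REQUEST_US = 0.5   # assumed incremental cost per request in a batch

def throughput_rps(batch: int) -> float:
    """Requests/s when `batch` requests share one round trip."""
    trip_us = ROUND_TRIP_US + batch * PER_REQUEST_US
    return batch / trip_us * 1e6

def min_latency_us(batch: int) -> float:
    """Each request waits for the whole trip, however large the batch."""
    return ROUND_TRIP_US + batch * PER_REQUEST_US

# Batching helps throughput enormously but only hurts latency:
print(throughput_rps(1))     # ~19,802 requests/s unbatched
print(throughput_rps(100))   # 1,000,000 requests/s batched
print(min_latency_us(1))     # 50.5 microseconds
print(min_latency_us(100))   # 100.0 microseconds
```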

------
rythie
This for the most part assumes traditional disks and the problems that come with them. The problem with RAMCloud is that it's expensive.

The problem is that memory is expensive, and when you can only fit ~64GB in a 1U server, the rack space costs add up too. For about the same money as 64GB of RAM you can buy 1TB of Intel SSD storage (6x160GB drives), which also fits in a 1U server. SSDs have very good random read performance, and that is likely to get significantly better in the coming years. RAM is already fast enough for this type of job; the real problems are lack of capacity and high cost.
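
The comparison is easy to put in capacity-per-dollar terms. A sketch using the rough figures implied above (same spend buys ~64GB of RAM or ~1TB of SSD; these are the comment's 2009-era numbers, not current prices):

```python
# Capacity-per-dollar comparison implied by the comment above:
# for similar money, ~64GB of server RAM vs 6x160GB of Intel SSD.
ram_gb = 64
ssd_gb = 6 * 160  # 960 GB in the same 1U server

# At equal spend, SSD gives ~15x the capacity per dollar:
advantage = ssd_gb / ram_gb
print(advantage)  # 15.0
```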

I suspect that companies like Facebook, which have relied mostly on RAM so far, will start moving to SSDs to cut costs -- they have already moved away from NetApp for their storage to cut costs.

~~~
DavidSJ
But as the article points out, capacity per dollar/watt/cm^3 is increasing exponentially for all of these storage media, and is expected to keep doing so for many years to come. Throughput and latency per dollar/watt/cm^3, however, are not improving nearly as quickly.

This means that throughput and latency will be the scarcest resources in the future. Regardless of your application's demands, a time will eventually come when SSD has enough capacity at $X but HDD does not have good enough throughput or latency at $X, so you switch to SSD. And eventually, the same logic will apply for SSD -> RAM.
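
That crossover argument can be sketched numerically. All growth rates, prices, and dataset sizes below are illustrative assumptions, not figures from the article; the point is only that steady exponential growth in capacity per dollar makes the crossover inevitable for any fixed budget and any slower-growing dataset.

```python
import math

def years_until_fits(start_gb_per_dollar: float,
                     doubling_years: float,
                     budget_dollars: float,
                     dataset_gb: float) -> float:
    """Years until the budget buys enough capacity for the dataset,
    assuming capacity per dollar doubles every `doubling_years`."""
    start_capacity = start_gb_per_dollar * budget_dollars
    if start_capacity >= dataset_gb:
        return 0.0
    return doubling_years * math.log2(dataset_gb / start_capacity)

# Hypothetical numbers: RAM at 0.05 GB/$ doubling every 2 years,
# a $100k hardware budget, and a 50TB dataset.
print(years_until_fits(0.05, 2.0, 100_000, 50_000))  # ~6.6 years
```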

~~~
rythie
Well, HDD access times have barely improved in the last 20 years [as they state], because the problem is basically rotational latency: to get better speeds than a 15k RPM disk you'd need 30k RPM, 60k RPM, 120k RPM disks, etc., which would be crazy.

SSDs are pretty new and are very fast despite several problems the makers haven't quite worked out yet. Random access performance can be improved by adding more chips, and access times will no doubt improve with faster clock speeds and reduced feature sizes, the same way CPUs and memory improve now.

SSDs use less power than HDDs, while RAM is power-hungry -- not to mention the savings from having fewer servers.

SSDs are simply too new for people to have designed for them yet. Look at what <http://www.rethinkdb.com/> are doing, or the TRIM feature, or log-structured file systems: there is quite a lot to be done to update the software people use to take advantage of SSDs. It has all been written with the limitations of HDDs in mind.

------
koblas
Most of the time we're working somewhere between the legacy solutions and the ideal solutions. Only a few companies have devoted the time to focus on a specific technology deeply enough to yield genuinely interesting results.

What's most interesting is to treat this as an ideal solution to a class of problems and then think about how you might solve any one of them. For instance, if access to large amounts of data is in essence "free", then the problem becomes RPC latency, and it might take you a while to figure out where the underlying bottlenecks are in your RPC framework, or what an ideal RPC framework would look like. That's really the point: to start thought and start investigation.

