
Superfast SSDs are coming, but will they be used the right way? - evo_9
http://arstechnica.com/business/news/2010/10/super-fast-ssds-are-coming-but-will-they-be-used-the-right-way.ars
======
bryanlarsen
tl;dr: operating systems need to support SSDs as a cache for the main hard
drive rather than as a hard drive in their own right.

IMO: sure, that would be nice, but he's overstating his case. SSDs are even
more useful as the main hard drive than as just a cache. 80-160GB (the current
sweet spot for cheap SSDs) is big enough for 80% of all users; in 12 months,
we'll have hit 95%. The only operating system that moves fast enough to add
support for something like this in that sort of timeframe is Linux, which is
only used by 1% of desktop users anyway. (And Linux works well in a 10-20GB
partition, let alone 160GB.)

~~~
ergo98
>The only operating system that moves fast enough to add support for something
like this in that sort of timeframe is Linux

<http://en.wikipedia.org/wiki/ReadyBoost>

The Ars article isn't very clear, but it isn't talking about a system having
a magnetic hard drive plus a flash hard drive (in a standard 2.5"/3.5" form
factor); rather, a magnetic disk and some sort of built-in or add-on flash
memory. There are upcoming motherboards with a decent amount of onboard flash
memory for precisely this usage model.

In the SAN market there is a raging trend called Easy Tier, FAST, or whatever
other marketing glop the vendor dreams up. It lets you mix a couple of SSDs
with lots of magnetic disks, finding a sweet spot between volume and speed as
it automatically moves hot spots to the SSDs. It's a perfect use of
varying-speed technologies.

The new hybrid hard drives bring this, with some success, to the consumer
market. They just need a lot more flash and a better algorithm and it's a
killer solution: a 2TB magnetic disk with 32GB of high-speed flash would be
incredible.
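
Roughly, all the tiering logic has to do is count accesses per chunk of disk
and promote the hottest chunks to flash on some interval. A toy sketch of
that idea in Python (the extent size, flash budget, and names are my own
assumptions, not any vendor's actual algorithm):

    from collections import Counter

    FLASH_EXTENTS = 32 * 1024 // 8   # e.g. 32GB of flash, managed in 8MB extents

    access_counts = Counter()        # extent id -> accesses in this sampling window
    on_flash = set()                 # extents currently promoted to the SSD tier

    def record_io(extent_id):
        access_counts[extent_id] += 1

    def rebalance():
        """Periodically promote the hottest extents, demote everything else."""
        hottest = {e for e, _ in access_counts.most_common(FLASH_EXTENTS)}
        for extent in hottest - on_flash:
            pass    # copy the extent from magnetic disk to flash (I/O omitted)
        for extent in on_flash - hottest:
            pass    # write the extent back to magnetic disk if dirty
        on_flash.clear()
        on_flash.update(hottest)
        access_counts.clear()        # start a fresh sampling window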

~~~
chapel
You have to remember that the focus is on moving past old hard drive
interconnects like SATA, so even if you got one of the new hybrid hard drives
with flash on it, you would be limited by the SATA port it was connected to.

That is why all those companies are moving to PCIe directly, for the extra
speed. If the OS handled caching and the rest so the user didn't have to
worry about it, the experience would be much better and faster all around.

~~~
ergo98
The gross throughput figure is _hugely_ oversold for the vast majority of
uses: in most scenarios IOPS is Godzilla to throughput's mouse. IOPS is
reality; throughput is something that generally only comes up in benchmarks.
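
A back-of-the-envelope comparison makes the point; the figures below are
rough assumptions, not benchmarks of any particular drive:

    # How much of the advertised throughput a random 4KB workload actually uses.
    block = 4 * 1024                      # 4KB random reads

    disk_iops, disk_seq = 100, 120e6      # ~100 IOPS, ~120 MB/s sequential (assumed)
    ssd_iops, ssd_seq = 50_000, 250e6     # ~50K IOPS, ~250 MB/s sequential (assumed)

    for name, iops, seq in [("disk", disk_iops, disk_seq), ("ssd", ssd_iops, ssd_seq)]:
        random_tput = iops * block
        print(f"{name}: random 4KB workload = {random_tput / 1e6:.1f} MB/s "
              f"({100 * random_tput / seq:.0f}% of sequential)")

The spinning disk delivers well under 1% of its headline MB/s once the
workload is random; the SSD delivers most of it. That's why IOPS, not the
interface's throughput ceiling, is what you actually feel.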

------
bensummers
Look into what ZFS can do with a spot of flash.

ZIL: <http://blogs.sun.com/rdm/entry/zil_ssd_and_other_fun>

L2ARC: <http://blogs.sun.com/brendan/entry/test>

I believe this is the "right way" the article is advocating. You can use it
now.

~~~
Nate75Sanders
Am I missing something or is this pure filesystem cache? What happens instead
if you configure an SSD as your swap device? Wouldn't the OS then be able to
use the SSD more generally as a piece of the memory hierarchy?

~~~
CJefferson
If the OS used the SSD as a swap device, it would still try to write the data
back to the underlying drive within a reasonable period of time, which isn't
necessary for an SSD.

~~~
Nate75Sanders
Unless I'm mistaken, you're still thinking of file writes. I'm talking about
using your SSD as "slow RAM". Your statement is true, but it doesn't account
for using an SSD to hold very large ephemeral data structures.
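
For example, a memory-mapped scratch file on the SSD behaves exactly like
that: the kernel pages it in and out on demand, and the file can simply be
deleted afterwards. A minimal sketch (the mount point is hypothetical):

    import numpy as np

    # A large ephemeral array backed by a scratch file on the SSD mount.
    scratch = np.memmap("/mnt/ssd/scratch.bin", dtype=np.float64,
                        mode="w+", shape=(500_000_000,))   # ~4GB, may exceed RAM

    scratch[12_345] = 3.14        # a random access costs an SSD read, not a disk seek
    scratch[499_999_999] = 2.71

    del scratch                   # dirty pages only ever go to the SSD scratch file,
                                  # which we can just delete when we're done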

------
Roboprog
Question about the article and its suggested use of hierarchical file
storage: why would you want to make the user "mark" files? Why not just have
the OS keep the most recently used files (perhaps weighted by size?) on the
fast disk, and migrate inactive files to the slow disk?

Or, just buy a hybrid disk and let the hardware do it.
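
For the OS-side version, something like this per-file policy is roughly what
I have in mind (pure sketch; the mount points and the exact size weighting
are just guesses):

    import os, shutil, time

    FAST, SLOW = "/mnt/ssd", "/mnt/hdd"   # hypothetical mount points
    FAST_BUDGET = 80 * 10**9              # keep at most ~80GB on the fast disk

    def staleness(path):
        st = os.stat(path)
        age = time.time() - st.st_atime   # seconds since last access
        return age * st.st_size           # big, stale files get demoted first

    def rebalance(files_on_fast):
        """Demote the stalest files until the fast disk is back under budget."""
        used = sum(os.path.getsize(f) for f in files_on_fast)
        for path in sorted(files_on_fast, key=staleness, reverse=True):
            if used <= FAST_BUDGET:
                break
            dest = os.path.join(SLOW, os.path.relpath(path, FAST))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            size = os.path.getsize(path)
            shutil.move(path, dest)       # migrate the inactive file to the slow disk
            used -= size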

~~~
eru
Or even most recently used blocks, instead of files.

------
comatose_kid
"$8,000 for a 160GB model" according to the article. But looking at
Newegg.com, it seems that you can buy 24GB of RAM for $750 so 8K could buy you
240GB with some cash left over.

And looking at supermicro's page, their 4U tower allows you to populate up to
192GB....

Naive analysis, since it doesn't take into account um..persistence, but still
interesting to see how expensive these fusion io parts are...

~~~
olegkikin
Where are you going to put all this RAM? Did you calculate the number of
motherboards you would need and how you'd connect them all?

Also, RAM loses its contents as soon as you restart your computer.

~~~
gxti
You can buy a Dell R715 today with default configuration and 256GiB RAM for
the low, low price of $26,231.

------
jbooth
Nobody mentions OCZ?

"And right now, they're also wallet-bustingly expensive, even for enterprise
customers."

Check out the RevoDrive: a 480GB PCIe drive, bootable, that costs about the
same as a standard 512GB SSD with the SATA bottleneck.

~~~
bconway
The article mentions it in the third paragraph:

 _If you want to go slower, smaller, and cheaper, you can pick up a consumer-
grade, 80GB OCZ PCIe SSD card from Amazon for $300 and change, but you'll get
only 540MB/s read and 450MB/s write._

~~~
jbooth
Well, they have higher-end options too that go to 1.4GB/s, they're pricier but
still way cheaper than fusionio -- roughly comparable to 1.25X the price per
GB of a 2.5" drive for 5-10X the bandwidth.

------
alecco

      > Intel, Dell, IBM, EMC, and a host of other component makers and
      > OEMs have announced a partnership aimed at developing a standard
      > interface for PCIe-hosted solid-state disks
    

And this is how a company like OCZ can beat the big guys. Why use the high-
latency PCIe at all? SSDs are several orders of magnitude faster than other
I/O.

~~~
bradleyland
I'm curious: if they don't use PCIe, what interface would they use?

~~~
adbge
Light Peak was designed with this in mind, so that _should_ be a viable
alternative whenever it hits production.

Right now the only viable alternatives to PCIe (that I know of) are
InfiniBand and Fibre Channel, though I imagine such systems don't come cheap.

------
wedfvgberfgbh
"in that they put the OS partition and __swapfile __on a fast SSD"

So at $8,000 for 160GB, we now use ultra expensive SSD as an alternative for
more cheap faster RAM? You do remember why swap was invented - because memory
was scarce and disk was cheap.

~~~
fragmede
Violin Memory does just that (no relation):
<http://www.violin-memory.com/products/dram/>

I'm not finding it on their page, but they had one product that was 1TB of
RAM backed by a 1TB disk with a backup battery. You get super-fast storage,
and if the power goes out, the TB of RAM gets saved to disk.

~~~
moe
Yup, I second violin.

They have it about right when you're really starving for IOPS. It naturally
comes at a price. But last time I checked, the Violins were a much more
natural next step after maxing out local SSDs and RAM than the FusionIO
devices (which seem more like a stopgap solution than a scaling path).

~~~
qq66
"I second violin"

:)

------
swah
Are they also SuperHot?

------
dutchbrit
SSDs should result in huge performance boosts, especially with NoSQL systems
like MongoDB & CouchDB. Can't wait until the prices start to drop!

~~~
Nate75Sanders
I would have thought the biggest performance boost would come for databases
based on B-tree-like structures over large datasets.

You have to read to read, and read to write. Every level of the tree that
doesn't fit into main memory costs you a random read, whether you're reading
or writing.
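
Rough arithmetic (the numbers are assumptions, but plausible): with a fanout
of a couple hundred, a billion-key index is about four levels deep, and every
level that doesn't stay cached is one random I/O per operation:

    import math

    keys = 1_000_000_000                          # entries in the index
    fanout = 200                                  # children per interior node
    levels = math.ceil(math.log(keys, fanout))    # ~4 levels for these numbers
    cached_levels = 2                             # say the top of the tree fits in RAM

    random_reads = levels - cached_levels         # per lookup, and per update

    print(f"{levels} levels, {random_reads} random reads per operation")
    print(f"disk @ ~100 IOPS: ~{100 // random_reads} operations/s per spindle")
    print(f"SSD @ ~50K IOPS:  ~{50_000 // random_reads} operations/s")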

~~~
jbellis
Write amplification (<http://en.wikipedia.org/wiki/Write_amplification>)
means that naive random writes like the ones you see from B-trees will
degrade performance on SSDs too (although not as pathologically as rotational
seeks). In other words, it still pays to be smart about I/O patterns.
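
A crude illustration of the effect, with typical-ish numbers that are only
assumptions: flash erases in large blocks, so an in-place 4KB page update can
force the drive to rewrite far more data than the host asked for, while
batched sequential writes fill whole blocks and keep amplification near 1x:

    page_write = 4 * 1024           # the 4KB B-tree page the host actually writes
    erase_block = 512 * 1024        # assumed flash erase-block size

    # A naive in-place update rewrites the whole erase block in the worst case:
    print(f"worst-case write amplification ~ {erase_block // page_write}x")

    # Writing sequentially lets ~128 page updates share one erase cycle instead.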

~~~
eru
Shouldn't you be able to do mostly sequential writes if you have a B-tree
structure and a copy-on-write filesystem (like btrfs)?

~~~
eru
Plus the occasional GC. But you can do this while idle.
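
Roughly the idea: every dirty page goes to the tail of a log instead of back
to its old location, and an idle-time pass reclaims the superseded copies. A
minimal sketch of just the bookkeeping (no real I/O):

    log = []          # the sequentially written log, newest entries at the end
    latest = {}       # page id -> index of its newest copy in the log

    def write_page(page_id, data):
        latest[page_id] = len(log)
        log.append((page_id, data))   # sequential append, no random write

    def gc():
        """Idle-time pass: drop superseded copies and compact the log."""
        global log
        log = [(pid, d) for i, (pid, d) in enumerate(log) if latest[pid] == i]
        latest.update({pid: i for i, (pid, _) in enumerate(log)})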

