

Future of Cloud Storage: An inside perspective from an IaaS vendor - cloudsigma
http://www.cloudsigma.com/en/blog/2010/11/21/13-the-future-of-cloud-storage

======
timurlenk
I was wondering about what you guys think of NAS systems.

While I don't claim to fully understand the advantages brought by block
storage vs. file based storage, NAS storage should be usable for most
applications. If this is true, then the clustered or scale-out solutions
offered by IBM (SONAS), HP (IBRIX) and EMC (with the recent acquisition of
Isilon) might be a suitable off-the-shelf solution matching your scenario
(there are probably also some open source solutions that I am not aware of).

The way I understand these solutions, you can basically scale capacity and
performance independently, which in turn gives you the flexibility to strike
the ideal balance between capacity and failure-point
distribution/redundancy.

What do you guys think of this setup? Would it achieve your goals?

I'm coming from a corporate environment where having an off-the-shelf
solution that is supported by a major company is as important as the solution
itself.

~~~
wmf
The difference between NFS and iSCSI should be fairly small at this point.
Anything by IBM, HP, or EMC will cost at least 10x as much as local disk,
which is death for most cloud providers. Unfortunately, stuff like DRBD or
Sheepdog may be only 2x the cost of local disk but doesn't look ready for
prime time.

~~~
cloudsigma
Just to come back on the point regarding NFS versus iSCSI:

- Proprietary systems like IBM's, HP's etc. are both expensive and
complicated. This violates our 'simple problems' mantra as well as having
cost implications.

- Our IaaS platform has an open software layer with sole root access granted
to the user. That has huge security and privacy implications for our users,
i.e. they have access to their data; we as a vendor don't (except for
physical access, which is highly restricted and monitored). It means we see
customer drives as block storage devices. We can't see their file systems.
NFS can be implemented by a customer on top of our platform but not by
ourselves. We do in fact have many users of NFS in our cloud. It's our job to
manage the storage at the block device level. As a vendor we are a 'pure
IaaS' provider. We don't reach into the software and networking layers
wherever possible. It's a fundamentally different approach, giving our
customers the same sort of control over their computing as they would have
with a dedicated set-up.

- DRBD has de-duplication. For a public cloud like ours this will save us a
lot of space, more than the replication introduces. Why? We already use
RAID6, so replication won't be higher with DRBD (we will swap RAID6 for
DRBD). We have thousands of identical operating systems running concurrently
with their own distinct copies (each of our cloud servers is a complete
stand-alone OS on its own block device, again for privacy/security reasons).
Rough numbers below.
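To put rough numbers on that last point (every figure here is an invented
assumption for illustration, not our actual data):

    # Toy model: dedup savings vs. 2x replication overhead when
    # thousands of servers share a near-identical OS image.
    # All numbers below are illustrative assumptions.
    n_servers = 2000        # "thousands of identical operating systems"
    os_image_gb = 10.0      # shared OS blocks per server
    unique_gb = 2.0         # per-server data that doesn't dedup
    replicas = 2            # replication replacing RAID6

    without_dedup = n_servers * (os_image_gb + unique_gb)
    # With dedup, the shared OS blocks are stored once per replica:
    with_dedup = replicas * (os_image_gb + n_servers * unique_gb)

    print(f"no dedup:               {without_dedup:>9,.0f} GB")
    print(f"dedup + {replicas}x replication: {with_dedup:>9,.0f} GB")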

So, we believe that a move from our current set-up to distributed block
storage will actually REDUCE costs overall, not increase them. We will of
course also gain the added convenience and the elimination of single points
of failure in storage that distributed block storage entails.

Best wishes,

Patrick CEO CloudSigma

~~~
wmf
I was thinking more of the VMware approach of storing disk images (e.g. VMDK
files) on NFS; that's not much different than using iSCSI.

~~~
cloudsigma
OK, fair enough. Most public clouds don't use VMware for cost reasons. It
would be cheaper than some of the other commercial systems alluded to, but
still a significant cost compared with other virtualisation platforms. It's
also using a proprietary format, so you can't mix and match (whereas we are
using raw ISO files). Finally, you'll need VMware-qualified staff instead of
regular sysadmins. That's another incremental cost that isn't insignificant.

Best wishes,

Patrick

------
cloud-geek
Projects like Sheepdog should definitely kill off SANs and RAID. It can't
come soon enough; as the post says, current storage performance in the cloud
is so variable. I moved away from EC2 for exactly this reason.

------
rworthington
Really interesting to read about SAN versus local storage and also distributed
block storage. The latter sounds v. cool, can't wait to see it commercially
available!

~~~
lsc
people keep telling me to use ceph... <http://ceph.newdream.net/> but it's
obviously not ready for prime time. Like the article said, local storage is
still the best price/performance/reliability balance. (I use raid 0+1 rather
than raid6)

When distributed storage systems like ceph have been around and in production
for 5-10 years, I'll re-evaluate. But for now, they present too much risk of
data loss in terms of bugs and admin error.

there's also gluster, which is much closer to my own standards in terms of
'time in production' but even so... local disk is simple, and when it fails it
fails in a non-spectacular way.

Of course, the other problem with distributed filesystems is that it makes
having a good network /much/ more important. Hell, right now I could get away
with 100Mbps, so on a gigabit network, I can have some pretty serious network
issues before anyone notices anything is wrong. For a widely distributed
storage network? Even at my current scale, I'd at least need 10G
interconnects between the switches, and god help me if there was a network
glitch.
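Rough numbers on how fast that adds up (the workload figures are made up,
obviously):

    # Every client write becomes `replicas` network writes in a
    # distributed store, plus rebuild traffic when a node dies.
    # All workload numbers are made-up assumptions.
    client_writes_mbps = 400   # aggregate client write load
    replicas = 2

    steady = client_writes_mbps * replicas
    # Re-replicating a lost 2 TB node over 24 hours:
    rebuild = 2e12 * 8 / (24 * 3600) / 1e6   # ~185 Mbps
    print(f"steady-state: {steady} Mbps, plus ~{rebuild:.0f} Mbps rebuild")
    # Uncomfortably close to saturating gigabit already.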

~~~
cloudsigma
Yes I agree with your points. On our 1U boxes we run 8 drives in RAID6. On our
2U boxes we run 22 drives in RAID6 + 2 hot spares. So we use a similar
approach in having hot spares on the bigger boxes.
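For reference, the usable capacity works out like this (2 TB per drive is an
assumed size, purely for illustration):

    # RAID6 usable capacity: two drives' worth goes to parity;
    # hot spares sit outside the array. 2 TB/drive is assumed.
    def raid6_usable_tb(array_drives, drive_tb=2.0):
        return (array_drives - 2) * drive_tb

    print(raid6_usable_tb(8))    # 1U box:  (8-2)*2  = 12 TB usable
    print(raid6_usable_tb(22))   # 2U box: (22-2)*2 = 40 TB usable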

As pointed out in the article, using distributed block storage means you need
a pretty high performance storage network and low latency is just as important
as high bandwidth.

We already have a physically separated storage network that runs in parallel
to our public network. This is used primarily for drive traffic over iSCSI. We
have a standard redundant gigabit network for this. As a physically separated
network, it means we don't get data integrity issues caused by DoS attacks,
for example.

Moving to distributed block storage will mean we upgrade our storage back-end
to either 10Gbps Ethernet or InfiniBand. The advantage of InfiniBand is that
as a cloud provider we essentially run one large grid, which is exactly what
InfiniBand is designed for. We can also use 40Gbps per port, so with dual
networking we can go up to 80Gbps relatively cost-effectively. Just as
important is the very low latency. Suffice to say we are ahead of the
software on this one :-)

I think it's important to point out that as storage moves to these sorts of
systems, it will become increasingly difficult to replicate such a setup in a
redundant fashion on dedicated hardware. As a cloud provider we spread the
cost over many customers.

Best wishes,

Patrick, CEO, CloudSigma

~~~
lsc
I'm very interested in how infiniband works out for you. I've considered it
myself, as used infiniband switches are available for the same price as used
1g ethernet... but while I'm pretty comfortable with fixing ethernet when it
breaks... I'd be much less confident in my ability to fix infiniband by
swapping out hardware than I would be in my ability to fix gigabit ethernet.
As far as I can tell, infiniband is something of a dying technology vs.
ethernet, market wise.

On the other hand, infiniband is fucking awesome. DMA over the network,
anyone?

~~~
cloudsigma
Yes and no. The main issue we see with 10Gbps is network topology. We don't
want to roll out a star-type network, which has a big single point of
failure; the whole point of moving to distributed block storage is to
eliminate that. It's difficult to build a grid layout with 10Gbps without
going into silly money. InfiniBand, on the other hand, works very well in a
grid configuration, which is ideal for a distributed storage network. As a
technology it has at least a few years left simply because it can offer
40Gbps, and Ethernet won't be there at a reasonable cost for a while.
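A toy way to see the topology argument (an invented five-node star vs. a 3x3
grid, nothing like our actual layout):

    # Star vs. grid: removing the star's hub partitions the network;
    # a 2D grid survives the loss of any single node. Both layouts
    # here are invented examples.
    def still_connected(nodes, edges, removed):
        live = set(nodes) - {removed}
        adj = {n: set() for n in live}
        for a, b in edges:
            if a in live and b in live:
                adj[a].add(b)
                adj[b].add(a)
        seen, stack = set(), [next(iter(live))]
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(adj[n])
        return seen == live

    star_nodes = ["hub", "a", "b", "c", "d"]
    star_edges = [("hub", leaf) for leaf in "abcd"]
    print(still_connected(star_nodes, star_edges, "hub"))   # False

    grid_nodes = [(r, c) for r in range(3) for c in range(3)]
    grid_edges = ([((r, c), (r, c + 1)) for r in range(3) for c in range(2)]
                  + [((r, c), (r + 1, c)) for r in range(2) for c in range(3)])
    print(all(still_connected(grid_nodes, grid_edges, n)
              for n in grid_nodes))                         # True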

------
ccomputinggeek
Yes agreed, distributed block storage is definitely the future. Nice to see
Linux pioneering this too.

~~~
Hoff
Without intending disrespect toward Linux, DRBD-like storage has been around
for a number of years.

There are examples that have been in production for at least a dozen years in
commercial operating system deployments.

The following is multi-host RAID-1 storage, known as host-based volume
shadowing (HBVS) storage:

http://h71000.www7.hp.com/doc/84FINAL/ba554_90020/ba554_90020.pdf

Getting DRBD working is certainly interesting and not a small project, though
dealing with the many and various error cases is the truly entertaining part
of the effort. Different devices and different failures can and eventually
will toss back errors, and with different timing. And from experience with
HBVS, these errors can and do shake out in production, and that's bad.

~~~
cloud-geek
Agreed. I think the post misses the new file systems supporting the new Linux
roll-out. I believe Btrfs is extremely interesting with regard to distributed
block storage, for example.

~~~
wmf
By "extremely interesting" do you mean "not distributed at all"?

~~~
cloudsigma
You can use Btrfs in conjunction with Sheepdog to achieve de-duplication,
snapshotting etc. There are a number of possible combinations, but Btrfs is
definitely one option.
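For anyone wondering where the de-duplication savings actually come from, the
core idea is content-addressing (a minimal sketch of the concept, not
Sheepdog's or Btrfs's real implementation):

    # Minimal content-addressed block store: identical blocks are
    # stored once, however many images reference them. Concept
    # sketch only; not any real on-disk format.
    import hashlib

    class DedupStore:
        def __init__(self, block_size=4096):
            self.block_size = block_size
            self.blocks = {}  # sha256 hex digest -> block bytes

        def write(self, data):
            """Store data; return the list of block digests referencing it."""
            refs = []
            for i in range(0, len(data), self.block_size):
                block = data[i:i + self.block_size]
                digest = hashlib.sha256(block).hexdigest()
                self.blocks.setdefault(digest, block)  # dedup happens here
                refs.append(digest)
            return refs

        def read(self, refs):
            return b"".join(self.blocks[d] for d in refs)

    store = DedupStore()
    image = bytes(4096 * 100)       # 100 identical zero blocks
    a = store.write(image)          # "server A's OS image"
    b = store.write(image)          # "server B's identical image"
    assert store.read(a) == store.read(b) == image
    print(len(store.blocks))        # 1 -> stored once, referenced twice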

Patrick, CEO, CloudSigma

