The idea of pooling large numbers of drives, dividing them up by space, and then sharing them over a network (IP or not) will never get you the kind of performance you get from dedicated locally-attached drives.
For server storage, it just makes more sense (IMHO) to have local disks with unquestionably reliable performance characteristics. The flexibility of EBS-type solutions is nice, but it comes at too high a cost.
Separately, it also seems that Amazon's architecture under EBS has at least some per-AZ single points of failure, such that certain classes of issues can take down storage for a whole AZ (or at least large parts of one). There doesn't seem to be any a priori reason this needs to be the case; I could imagine an EBS-like system with per-rack storage, say, which would look the same from the user's perspective except that it might perform better and would be unlikely to disable all of a user's instances at once if it failed. No idea how Rackspace's implementation works, but I doubt all EBS-like solutions are created equal in terms of reliability.
More importantly, you probably don't want the full TCP/IP stack for disk access inside a data center. At the OS level, disk I/O has become fairly abstracted; read up on Native Command Queuing (http://en.wikipedia.org/wiki/Native_Command_Queuing) or even TCQ to get some idea of what's already going on. That opens a lot of doors for optimization, and it makes failure handling hard to generalize once a network is in the path.
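As a toy illustration of the kind of reordering NCQ/TCQ already does below the OS (the LBA numbers and the linear-seek cost model here are illustrative, not drive-accurate):

```python
# Toy sketch of NCQ-style reordering: given queued requests at various
# LBAs, the drive can service them in an elevator sweep instead of
# arrival order, cutting total head travel.
queue = [500, 30, 410, 35, 480]   # pending request LBAs, in arrival order

def travel(order, start=0):
    """Total head movement to service requests in the given order."""
    pos, total = start, 0
    for lba in order:
        total += abs(lba - pos)
        pos = lba
    return total

fifo = travel(queue)           # naive first-come-first-served service
ncq = travel(sorted(queue))    # one upward sweep, NCQ-style
print(fifo, ncq)               # 2170 500
```

The point being: by the time requests hit the platter, the drive has already rewritten their order, which is exactly the kind of optimization that gets harder to reason about once a lossy network sits in the middle.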
I do agree on the "why TCP?" point. Funnily enough, everyone is running the other way and wrapping IP inside yet another abstraction layer: NVGRE, VXLAN, etc. are slipping one more complication under your "simple" block device.
Their cloud servers already have locally attached storage, but it's limited, because they're not just going to chuck disks into servers on user demand. If you want that, you need to go with a dedicated server.
I really see this product as being more for people who want to use something like Cloud Files but whose software can't deal with it.
Hi there ehutch79, just a quick clarification. Cloud Files is object storage, whereas Cloud Block Storage is block storage, and the performance characteristics of the two are very different. One way I sometimes explain the difference is by how they are accessed: Cloud Files objects are accessed using HTTP (think REST), whereas Cloud Block Storage blocks are accessed using low-level OS I/O operations (think block read and write operations). Because of that, you could implement a database on Cloud Block Storage, but not on Cloud Files, where it would not be very performant.

Think Cloud Files when you have a website that needs media, large objects, application-specific content, files, etc., or when you need a CDN for improved performance at the edge (CDN is a great feature of Cloud Files). Think Cloud Block Storage when you could have used a regular old hard drive: you provision it, you format it, you may stamp a file system on it, or use the raw blocks for MongoDB or a relational database (with the difference that with Cloud Block Storage you can pick Standard drives or Solid State Drives).
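A minimal sketch of that access-pattern difference, using a temp file to stand in for an attached volume (the device, block size, and payload here are illustrative):

```python
import os
import tempfile

BLOCK = 512  # pretend sector size (illustrative)

# Stand-in for an attached block device (e.g. /dev/xvdb); a temp file
# here so the sketch is runnable without a real volume.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, BLOCK * 8)              # an 8-"sector" device

# Block-style access: seek to an offset, write/read raw bytes in place.
os.lseek(fd, 2 * BLOCK, os.SEEK_SET)     # jump to sector 2
os.write(fd, b"page of a database table".ljust(BLOCK, b"\0"))
os.lseek(fd, 2 * BLOCK, os.SEEK_SET)
data = os.read(fd, BLOCK)
print(data.rstrip(b"\0"))                # b'page of a database table'
os.close(fd)
os.remove(path)

# Object-style access, by contrast, is whole-object over HTTP:
#   PUT /v1/<account>/<container>/<object>   (upload the entire object)
#   GET /v1/<account>/<container>/<object>   (download the entire object)
# There is no in-place partial update, which is why a database engine
# wants blocks rather than objects.
```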
I was under the impression that it wasn't wise to run a high read/write database on SSDs because they have limited write endurance. But obviously people are finding it wise to do this. Is a 20GB PostgreSQL database going to work?
I guess considering the number of times I've had to migrate servers it shouldn't be a problem lasting 5 years ;)
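For a rough sense of the endurance arithmetic (the capacity, cycle rating, and write rate below are all illustrative assumptions, not figures for any particular drive):

```python
# Back-of-the-envelope SSD lifetime estimate. All figures are
# illustrative assumptions.
capacity_gb = 200        # drive capacity (assumption)
pe_cycles = 3_000        # rated program/erase cycles per cell (assumption)
daily_writes_gb = 100    # database write volume per day (assumption)

# Total bytes writable over the drive's life, ignoring write amplification.
total_write_endurance_gb = capacity_gb * pe_cycles   # 600,000 GB

years = total_write_endurance_gb / daily_writes_gb / 365
print(f"{years:.1f} years")   # 16.4 years at this write rate
```

Even with write amplification eating a big chunk of that, wear-out tends to arrive after the server has already been replaced, which matches the joke above.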
I'd posit that what most consumers actually want is a persistent filesystem. Clear consistency semantics are probably a bonus at this point.
Vending an fs shim to your domUs eliminates a whole mess of abstractions.
So yes, I'm well aware that "compatibility" is the general reasoning for building a block device interface. But how many unique domU kernels do these providers support? Two, maybe three at the outside? And minus Win32, they're all POSIX with shockingly similar VFS interfaces.
Which comes back to Henry Ford and his faster horse. Do customers actually want another layer in the abstraction fest so they can stack their mount option of choice on top? Or do they want a persistent file system with well-defined consistency semantics?
I've never heard a customer actually request Yet Another Leaky Block Device Abstraction. And if that customer's out there, what's their use case? Because building an fs shim into the dom0 seems to eliminate a whole mess of underlying infrastructure. So why isn't anyone doing that?
Remember Koenig’s Fundamental Theorem of Software Engineering: “We can solve any problem by introducing an extra level of indirection”? :-)
You would be surprised if I could tell you the number of customers that have asked about Cloud Block Storage. This is one of the most requested services! Since all storage is block storage at the lowest level of abstraction on top of the device, every higher-level abstraction ultimately uses it. The question is whether that level of abstraction is appropriate for your application, and for your productivity as a developer if you are writing new apps. I see block-level storage as the most versatile because it is the most fundamental: you can use it for anything (storage for emails, for the tables and indexes of your favorite database, for file systems, for archives, etc.) by building the appropriate abstraction on top of blocks.
But I agree with the basic idea of your comment: as the most fundamental layer, then it may also be too low an abstraction layer to be used directly if you are writing a new high level app (think a business app for instance). That’s why there are file systems, databases, etc., all of which use blocks to do their thing and provide higher level functionality. I like to use the example of how most people don’t stop to think that systems really do not know about files and directories, or database tables and indexes. They just know blocks. But the file system abstracts blocks and provides the appearance of files and directories and hierarchy. And DBAs and developers using our Cloud Databases product are happy to do SELECT A, B FROM T WHERE C=’value’ without having to worry about blocks. But blocks are always there, underneath all that.
Whether you would want to stay at that lower level of abstraction depends on what you are doing. The higher the level, the more specialized the abstraction: you gain flexibility at that level but lose the ability to optimize at the lower one. For example, on average, 50% of the last block used by a file is wasted in many file systems, and there is nothing you can do about that. But I bet those directories and the ability to find a file by name come in handy at that higher level.
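That internal-fragmentation point can be sketched in a few lines (the 4 KB block size is an assumption; real file systems vary):

```python
# Internal fragmentation sketch: a file system allocates whole blocks,
# so on average about half of a file's final block is wasted.
BLOCK = 4096  # bytes; a common default block size (assumption)

def allocated(size):
    """Bytes actually reserved on disk for a file of `size` bytes."""
    blocks = -(-size // BLOCK)   # ceiling division
    return blocks * BLOCK

def wasted(size):
    """Slack space in the file's last block."""
    return allocated(size) - size

# A 10,000-byte file occupies 3 blocks (12,288 bytes), wasting 2,288.
print(wasted(10_000))   # 2288
```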
I think there are two cases where block storage is needed. The first is deploying commercial applications, databases, file systems, etc., where you have no choice because you did not write the app. The second is optimizing storage for infrastructure applications that you are writing and control, where you want that optimization.
Many people writing business applications or web apps will be happy to use object storage or Cloud Databases for MySQL. No need to worry about block level because we abstract it for them. Many people will need a file system: they will just create a block storage device and format it as an fs and move on with their lives. Others will use it as a database. Others will write apps we cannot even imagine right now. But blocks will be there, humbly, doing their work.
I was assuming that it was using the same infrastructure as cloud files, and that the performance characteristics would be similar.
A word of caution, though: Rackspace is just getting into this ballgame of on-demand virtual block devices. There will probably be gaffes (though, hopefully not as bad as EBS of late), so, as per the virtualization commandments, build expecting failure.
Edit: Also, Rackspace had perfect timing for this announcement, the day after catastrophic EBS failure. Coincidence, or astute choice in launch dates? :)
Hey gtaylor, I can tell you that this was purely a coincidence. We actually were going to ship this a couple of weeks ago but decided to delay a few days to get some updates from OpenStack. I think what happened yesterday is unfortunate for all those customers who were affected, and certainly not cause for celebration. We do believe, however, that we have a great block storage service to offer and are looking forward to competing.
Glad to see competition heating up. We all win from this.
Hi epistasis, here is what I can say. Let's talk about two things: what happens at provisioning time and then what happens at runtime.
Our provisioning engine is based on OpenStack Cinder. At provisioning time, we provision an SSD or Standard volume on our storage backend. This backend is a storage system we built (called Lunr) on top of standard Linux and commercially available hardware. Once the volume is created in Lunr, it is attached to the Cloud Server compute host, which exposes the volume to the guest as a virtual device.
At runtime, the volume appears as a regular device to the compute node over iSCSI. Snapshots are created against Cloud Files, our object storage service that is based on OpenStack Swift.
I hope that is useful.
We've always positioned rsync.net as a premium offering at a higher price, but it appears this is not really the case anymore: Rackspace is 15 cents/GB per month, and while that matches our 1TB annual package, our 10TB annual is down at 7.9 cents ...
Zero IO cost has been our policy from day one. No disagreement there...
As most on HN know, it's not really a cloud storage offering, since we give people real UNIX filesystems with no abstraction layers between ... but "cloud" will do ...
Live ZFS filesystem ... so you can just go ahead and send us a single 10 TB file if you feel like it. No weird limitations...
- Rackspace standard storage vs AWS EBS (for 1TB of storage with 100 IOPS)
-- AWS US-East = $126/month
-- Rackspace USA = $150/month
- Rackspace SSD vs AWS PIOPS (for 1TB of storage with 1000 IOPS)
-- AWS US-East = $225/month
-- Rackspace = $700/month
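The figures above can be reproduced from the per-unit rates they imply (a sketch; the rates below are inferred from the totals, not taken from an official price sheet):

```python
# Reconstructing the monthly figures above. All per-unit rates are
# inferred from the totals, not official pricing.
GB = 1000                        # "1TB" priced as 1,000 GB here

# AWS EBS standard: ~$0.10/GB-month plus ~$0.10 per million I/O requests.
ios = 100 * 86400 * 30           # 100 IOPS sustained over a 30-day month
aws_std = GB * 0.10 + ios / 1e6 * 0.10
print(round(aws_std))            # 126

# Rackspace standard: flat ~$0.15/GB-month, no I/O charge.
rs_std = GB * 0.15               # 150

# AWS PIOPS: ~$0.125/GB-month plus ~$0.10 per provisioned IOPS-month.
aws_piops = GB * 0.125 + 1000 * 0.10   # 225

# Rackspace SSD: flat ~$0.70/GB-month.
rs_ssd = GB * 0.70               # 700
```

Note that the AWS standard figure is dominated by the per-request charge at sustained I/O, which is exactly the component Rackspace's flat per-GB pricing drops.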