New High I/O EC2 Instance Type - hi1.4xlarge - 2 TB of SSD-Backed Storage (aws.typepad.com)
196 points by jeffbarr on July 19, 2012 | 102 comments

What are good use-cases for on-demand high I/O servers?

At $3.10/hr, these instances work out to $2k/mo. There are probably many more cost-effective options if you want a 2TB SSD server.
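As a rough sanity check on the $2k/mo figure, here is a minimal sketch that converts an hourly on-demand rate into a monthly cost. The averaged 730-hour month (8760 / 12) is an assumption, and the function name is made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365               # 8760, ignoring leap years
HOURS_PER_MONTH = HOURS_PER_YEAR / 12   # 730, an averaged month (assumption)


def monthly_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running one instance continuously for an averaged month."""
    return hourly_rate * hours


if __name__ == "__main__":
    # $3.10/hr is the on-demand price quoted in the comment above.
    print(f"${monthly_cost(3.10):,.2f} per month")
```

Run around the clock, that lands a little above $2k per month, so the estimate above is in the right range.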

Since the benefit of using EC2 is that you can provision instances elastically, what are the sorts of scenarios in which one needs to provision high I/O servers elastically?

[edit: A few minutes of Googling, and I can't find any dedicated servers with 2 TB of SSD.]

I do genome mapping where our indexes won't entirely fit in memory. It would be very handy to be able to spin up a few of these instances, load the indexes from an EBS volume onto the local SSDs, then run for a couple of hours or so. This is a very I/O intensive job that we need to run about once a week, but then the rest of the time could be idle.

SSDs would make our jobs run significantly faster. So much so that we've toyed with the idea of adding SSDs to our in-house cluster, but couldn't quite justify the costs. This might actually shift the cost balance enough to get our lab to migrate to EC2 instead of using our in-house or university cluster.

We are also facing the same kind of problems in my company, regarding genome assembly and mapping.

That's definitely something we will look into :)

I'm working on a data visualisation app, which is getting a lot of interest from biologists and bioinformaticians. I'd like to learn a bit more about your work. Can I email you somewhere? Or please drop me an email at hrishi@prettygraph.com. Thanks!

How long would it take to read the 2 TB from EBS?

Based on this, it shouldn't take more than half a day under the worst circumstances (single EBS drive with crappy performance), and if you RAID together enough drives, you can do it in about an hour. Correct me if I'm wrong, but you pay for EBS by size, not physical disks, so the more you can split up your data in blocks, the more performance you're going to get.


2 TB in one hour is about 4.9 Gbps. Cluster Computes have 10 GbE internally, but I'd be surprised if they have that all the way out to EBS.
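The arithmetic behind that number can be sketched as follows. This is only a back-of-the-envelope calculation; whether "2 TB" means 2 x 2^40 bytes or 2 x 10^12 bytes is an assumption either way, so both are shown, and the function name is made up.

```python
TIB = 2**40   # binary terabyte (tebibyte)
TB = 10**12   # decimal terabyte


def required_gbps(num_bytes: int, seconds: float) -> float:
    """Sustained throughput, in gigabits per second, to move num_bytes in the given time."""
    return num_bytes * 8 / seconds / 1e9


if __name__ == "__main__":
    one_hour = 3600
    print(f"2 TiB in 1 h: {required_gbps(2 * TIB, one_hour):.2f} Gbps")
    print(f"2 TB  in 1 h: {required_gbps(2 * TB, one_hour):.2f} Gbps")
    # The "half a day" worst case mentioned upthread needs far less bandwidth.
    print(f"2 TiB in 12 h: {required_gbps(2 * TIB, 12 * one_hour):.2f} Gbps")
```

The binary reading gives the roughly 4.9 Gbps quoted above; the decimal reading is closer to 4.4 Gbps.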

Any chance we could get an example of the data set and the calculation that needs to be done?

You can get some of the data from the 1000 genomes project directly from Amazon, so you don't need to pay to download it. There's about 200TB of data there (so far).


What I'm working on is mapping those short sequences (50-75 bases) to the genome and then either looking for mutations or expression levels (how many of those reads map to a particular location). There are a couple of ways to do the mapping, but most these days use either a big hash table or a Burrows-Wheeler transform.


And that's all just to get the data that you can then do something else with (gene expression, variation modeling, etc...).
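For anyone who wants a feel for the hash-table approach to read mapping mentioned above, here is a toy sketch. It is not how any particular aligner works: the function names, the tiny genome string, and the "seed with the first k bases, then verify" strategy are all assumptions made for illustration. Real tools handle mismatches inside the seed, indels, both strands, and indexes far too large for a Python dict.

```python
from collections import defaultdict


def build_kmer_index(genome: str, k: int) -> dict:
    """Map every length-k substring of the genome to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def map_read(read: str, genome: str, index: dict, k: int,
             max_mismatches: int = 1) -> list:
    """Return (position, mismatches) pairs where the read fits the genome.

    The read's first k bases are looked up in the index (the seed), and each
    candidate position is then verified against the full read.
    """
    hits = []
    for pos in index.get(read[:k], []):
        candidate = genome[pos:pos + len(read)]
        if len(candidate) < len(read):
            continue  # the read would run off the end of the genome
        mismatches = sum(a != b for a, b in zip(read, candidate))
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


if __name__ == "__main__":
    genome = "ACGTACGTTAGCCGATAGGCTA"   # made-up reference sequence
    k = 4
    index = build_kmer_index(genome, k)
    print(map_read("TAGCCGAT", genome, index, k))  # exact match
    print(map_read("TAGCCGTT", genome, index, k))  # one mismatch outside the seed
```

Counting how many reads land at each position gives a crude expression level; looking at the mismatching bases is the starting point for finding mutations. The random lookups into a large index are what make this kind of job so I/O-hungry once the index no longer fits in memory.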

Well, the raw output of a typical so-called "next-gen sequencing" machine (which is actually very much current gen) is around 1TB (at least, for the ones we used here).

This is the raw file, though, so once processed (but not yet analyzed) I believe we have sizes around 50 to 100GB (but that's not really what I work on, so don't quote me fully on this).

The next steps vary depending on what exactly you want to do, but they usually involve alignment of base pairs (basically, trying to tie sequences of DNA together by their ends by seeing if they "fit").

I said by their ends, but it can also be the full sequence, depending on the job.

Essentially you sequence tons of short bits of dna and then either fit them together (assemble) or fit them to a reference (align). You can find example data sets in the Short Read Archive: http://www.ncbi.nlm.nih.gov/sra/

Cloudburst (a hadoop based aligner) has a good description of an algorithm: http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.p... Though they can get much more sophisticated and there are a number of open and closed source implementations...I only link this one because of the quality of the figure.

The data sets we work with in my group can be up to 400 GB of compressed text for the reads from a single individual.

Another example from biology with a similar computational profile would be searching through a huge number of mass spectrometer outputs to identify the components in a new sample.

[edit: A few minutes of Googling, and I can't find any dedicated servers with 2 TB of SSD.]

I am the founder of SSD Nodes, Inc. (http://www.ssdnodes.com) and we offer lower-sized SSD-backed storage in addition to custom cloud and dedicated plans that range from 1-12TB of local SSD storage, at comparable performance. [/plug]

Given your "1-12tb" range doesn't list prices online, can you tell us whether your prices are comparable? At $1249/month for 2x 200GB SSD, it seems unlikely, but maybe I'm wrong?

Typically clients requiring that much SSD-backed storage have performance targets and a very specific workload, so this affects options along with pricing. With that said, ballpark for what Amazon is advertising is comparable to our pricing, except with us you are _guaranteed_ the resources whereas they are using a multi-tenant environment (you're not using the only instance on their host and your I/O is influenced by everyone else using that host).

EDIT: Downvoted? Please offer your point of view.

EDIT2: Sample pricing offered in comment below.

I haven't downvoted (and in fact couldn't if I wanted to since you replied to me), but personally my issue with your comment is that you're being as vague as your website's prices. Would be more interesting and relevant if you actually gave a price for a box comparable to the AWS specs.

I apologize, here's a sample pricing configuration:

8 x 3.4GHz E3-1270



6000GB Outbound BW

2 x 1 GigE Public/Private Interfaces

$2369 Monthly

That's just month-to-month, if a longer term were purchased we could give a discount depending on the term.

This assumes Amazon is using SLC and not MLC drives, which are considerably cheaper. Also, the quote above is from SoftLayer I believe, which is not the cheapest provider for high end hardware.

These particular instances are not multi-tenant at this time.

That is only one of the benefits of EC2. If you are not using elasticity, then you have to factor in the reserved instance pricing, which drops the prices down by 71% (as in, to 29% of the list price; and I mean even if you include the up-front cost: that's overall savings). You, like most people who comment on the price of EC2, do not seem to be taking this into consideration. :(

What are some of the other benefits, other than elasticity or on-demand ness, and not having to spend a huge amount upfront?

One example, which stems from "on-demand ness" (which you added: I was only responding to "elasticity"), is that you can do "test runs" of migrations and deployments without even thinking about it: you can rent, for just an hour, a setup identical to your existing one, often based on a consistent and atomic snapshot of your production machine, so you can try something "likely correct but possibly horribly wrong"; then, if it works, rather than replicating the change on your "real" machine, you can just cut over to the new one and shut down the old.

Way too many people seem to believe that the only benefit of "on-demand" is "elasticity", and then make bogus arguments here that "if you can plan your traffic you shouldn't be using EC2": EC2 is cheaper than people like to claim (and is in fact quite price competitive) and your ability to turn on/off machines on a whim changes the way you look at hardware so drastically that, in all honesty, it makes traditional ways of dealing with hardware seem draconian and only worth putting up with if you are dealing with some weird corner case or have horribly special requirements.

EC2 is cheaper than people like to claim

How often do we have to repeat this argument here on HN.

When running 24/7 then EC2 instances are 2-3x more expensive than the cheapest equivalent rent-server options and orders of magnitude more expensive than the physical hardware if you buy it.

The numbers have been recited countless times, I'm not digging them up yet again.

So, no, EC2 is not cost effective for steady loads at the mid-range. EC2 shines at the very low and the very high end and in specific workloads, i.e. it shines when the benefits can be quantified to an amount greater than the price difference.

> orders of magnitude more expensive than the physical hardware

Orders of magnitude? So, like 100x or 1000x more expensive? Really?

For us it was more like a 10x, but a few things went into that:

- We found screamin' deals on hardware by snatching it up when it was available, not when we needed it.
- We were at a fairly cheap colo, and haggled hard to get the cheapest rack possible. We went on a tour, noticed they had _tons_ of empty racks, and used that as some leverage.
- We didn't add any additional ops overhead by having everyone responsible for ops.

We were in the 5k/month ballpark for EC2, and cut it to under $600 with a few grand outlay for hardware spread over the course of a quarter.

That said, all of my current projects are on EC2 for the provisioning flexibility, and because I hate having to drive down to a datacenter at 4AM to swap a drive.

Please tell me that you were including the cost of you driving to the datacenter at 4AM to swap a drive in the cost of the hardware in your price comparisons, as if you are just talking about the cost of the raw hardware and are not including the opportunity cost of all this time and energy spent haggling and performing maintenance, then this is simply a dishonest comparison: you could easily have been spending that time doing just about anything else, from working on new features for your product to improving your sales/investment pitch to simply sleeping (which will improve the quality of all of your work the next day). I'm also curious what your replacement plan is: are you intending to do this again next year, or are you intending to wait until all of your hardware starts failing and the operational overhead starts becoming painful? Finally, "having everyone responsible for ops" might mean that you didn't have to add a new explicit hire, but you can't claim that that isn't overhead: that is now state that everyone has to keep in their head and is a liability that could cause anyone to randomly get interrupted; it might even be cheaper in the long term to hire a new, more dedicated person than to reuse existing people.

Yes, 10x is common. 100x is a little contrived but possible when you spec out enough Ram in EC2 instances (>2T), then compare to a physical box over 3yrs.

I would be very interested in knowing what factors you are trading off for "equivalent" other than "on-demand". Many of my friends use co-location services for their businesses, and most of them purchase only on price, and their servers honestly suck: they have high latency, they are unstable, they don't have remote serial console access... they are living in a ghetto that burns tons of their time into "becoming server people".

If you find a company that has reasonable support, reliable servers, good datacenters, and the minimal features required to debug issues remotely, then you are looking at prices fairly equivalent to those offered by EC2 heavy-utilization reserved instances (and are going to end up with a similar contract length anyway). If this somehow doesn't work out: call Amazon AWS's sales division and see if you are compelling enough for them to negotiate with (they totally have a sales division, and they really do "want your business").

Regardless, your choice of quote is really bothersome: "people like to claim" that EC2 is as expensive as their on-demand list prices, and that's a fact clearly demonstrable by the person I'm responding to (who is quite clearly and obviously claiming EC2 is more expensive than it really is) and one that is not defensible as the price you should be looking at is the heavy-utilization reserved pricing; if you'd like to respond to my comment "and is in fact quite price competitive" then you should quote that and adjust your argument appropriately.

Honestly, the history of HN is not much better (as I scour around trying to find the "numbers" you claim "have been recited countless times"). It is actually difficult to find people who don't claim that Amazon EC2 is more expensive than it is; I'm almost wondering if you and I are living on different versions of the site...

"EC2 is about 10-20 times more expensive than dedicated hosting. Even if reserved instances save us 22% over 3 years, it still doesn't even come close." -- cmer

^ No, EC2 reserved instances save you 71% over 3 years.

"It costs $576/mo to run an extra large EC2 instance fulltime" -- stephenjudkins

^ No, even two years ago (before "heavy utilization reserved instances") you could drop this price by 66% to $195.84/mo.

"With EC2 prices at about $0.10 per hour, I can't imagine ever using a service with such a high premium." -- apinstein

^ Obviously: no, but the fact that this person is angry about the price of a small instance at $72/mo is quite telling; he isn't willing to go lower than $20/mo.

I found a price comparison by vladd from earlier this year, comparing a high-end VPS to EC2's largest offering, coming up with a nearly 10x difference, but the server is entirely useless: it is a consumer-level product running non-ECC RAM. Later comments claim the same hosting company has "competitively priced servers with ECC ram".

A couple months ago I found a thread that linked to a fairly detailed argument[1] stating that EC2 instances are 2-3x more expensive than a VPS. However, this person again is performing a comparison with non-ECC hardware. What damns this comparison, however, is that he is not taking advantage of 3-year reserved instances for a long-term high-end use case: his numbers seem to be based on 49% off, when he can easily get 71% off, nearly a 2x difference. <- Again, EC2 is cheaper than people like to claim.

[1] http://codemonkeyism.com/dark-side-virtualized-servers-cloud...

Seriously: I can't find anyone who is actually doing legitimate comparisons of Amazon's offerings. People either compare EC2 to "I spent a week of time negotiating a deal to take over a bunch of hardware from a failing company down the street" (which, for the record, will also give you a great deal on chairs and office furniture: comparing the cloud to a fire sale is inane), assume "a server is a server is a server" and find "the cheapest" option (which seems to always have unreliable RAM), or (frankly: "and") fail to take into account Amazon's reserved instance discounts.

That said, Hacker News has a really horrible search system, and I'm trying to find something kind of esoteric (as I want to search for a dollar sign, and thereby have to use proxies such as "expensive" and "cost"). I would thereby love to see an honest comparison, and am happily willing to believe that I missed it: do you have a link to such?

it is a consumer-level product running non-ECC RAM

Sorry to break that for you but EC2 instances are in all likelihood not running on ECC-Ram either[1]. If they had ECC-Ram then Amazon would probably prominently advertise that or at least respond when they are directly asked. If you can find a link to prove the opposite then I'll take that back.

I would thereby love to see an honest comparison, and am happily willing to believe that I missed it: do you have a link to such?

You have probably already seen any of the blog-posts I could cite here, so I'll instead just try to wrap your two claims up:

1. You claim that dedicated servers are more labor intensive (setup, hardware failures) and require more staff. This is not my experience at all. In fact the complexity and idiosyncrasies of the AWS platform are much harder to abstract in the beginning, and no less labor intensive in the long term. You're just trading one set of problems (hardware issues) for a different one (cloud issues). What you may save on the hardware management front you have to spend on adapting your application for a cloud-environment.

2. You claim that equivalent hardware to an EC2 instance (with comparable performance, good support, network, etc.) would be roughly the same price as an EC2 instance. Sorry, but that is laughable. When did you last benchmark an EC2 instance? Even a cheap rented dedicated server (hetzner, leaseweb, ovh) will normally give you twice the bang for the buck on every key metric (I/O, RAM, CPU). And this quickly rises to beyond an order of magnitude when you start comparing EBS to a local array or a 256G RAM box to 256G RAM in EC2 instances. Where redundancy is a concern you can usually quite literally buy two of each and still be cheaper than EC2.

I'll say what I always say: EC2 does have its place. However for deployments in the range of 10-~50 servers you will in pretty much all cases save a lot of money by sticking with dedicated servers for the base-load. That is unless your app needs the cloud-flexibility, of course (most apps don't).

What makes you believe this flexibility would come for free anyways? As all things it comes with a price-tag, and actually quite a hefty one in this case.

[1] https://forums.aws.amazon.com/message.jspa?messageID=203167

They don't advertise it because it goes without saying that servers have ECC. EC2 uses Xeons and Opterons which only support ECC. It should only be a few percent more expensive, which is nothing when you consider the premium Amazon charges (which is something I definitely agree with you about).

because it goes without saying

I've been dealing for long enough with hosters and hardware to know that nothing goes without saying.

Xeons and Opterons which only support ECC

Have you actually checked the CPU models they use? All I know is that amazon uses a range of different CPUs, and some Xeon/Opteron models do accept non-ECC Ram.

only be a few percent more expensive

In the past ECC DIMMs used to be significantly more expensive.

Either way, as said, I don't know whether they're using ECC Ram. I agree it should go without saying, but I don't share your optimism that it actually does. I also wonder why they explicitly mention it for their GPU-instances when it goes without saying otherwise.

FYI, EC2's machines do have ECC ram. They don't advertise it, though.

Can you cite a source please?

A little more than an anonymous one-liner in a forum would really help my confidence...

Phoronix' benchmarking test suite has been able to detect underlying hardware: http://www.phoronix.com/scan.php?page=article&item=amazo...

No source aside from personal experience working with them, sorry. They avoid publicizing anything about the hardware/infrastructure if possible, partly so that they can change it without customer awareness and partly because they have secret sauce in places (no, ECC isn't secret sauce).

Okay, I guess I'll take that as another datapoint, although honestly (no offense) I won't be basing decisions on it. ;)

"if it works, rather than replicating the change on your "real" machine, you can just cut over to the new one and shut down the old."

Interesting! Any other hacks enabled by EC2 and the like that make life much easier than real dedicated hardware?

I do this too, not for risky migrations, but for daily updates. The app relies on a data service that's normally read-only, but gets fresh data daily. When everything was running on the bare metal, we had to schedule the updates for the middle of the night and carefully migrate the data in-place to avoid interrupting service. It would take 8 to 12 hours.

Since we moved to EC2, updating is simpler. The service runs on a micro instance. We launch a large instance to do all the CPU- and IO-intensive processing that prepares the new dataset, then launch a new micro instance, upload the dataset, run a few smoke tests, and if all is well, cut over to it. Because we're doing it off line, we were able to optimize the data processing for speed rather than low resource usage, and cut the runtime down to 45 minutes.

One thing that's often missed in discussions of IaaS versus bare metal is that the elasticity of a particular application can be affected by its design. When we were running on dedicated machines, we smoothed out the load to avoid idle hardware, but after moving to EC2 we concentrated it into spikes to get maximum productivity from running instances. In our case, spiky load is better from a business point of view, because serving data that's 1-25 hours old is better than data that's 8-32 hours old.

There are also the network benefits (pun intended). If the rest of your app does benefit from elasticity, you've had to choose between:

1. Keeping your app on EC2 and working around the lack of high I/O options
2. Keeping your app somewhere else and working around the lack of EC2-style elasticity

In TFA, Netflix had chosen #1, and they used to run an extra memcached layer + I/O on 48 instances. They were able to bring this down to 15 I/O instances with no intervening cache, and lower overall latency.

That said, I'd guess the on-demand hi1.4xlarge won't get a lot of usage; I imagine they offer it just for orthogonality's sake (all other instances are available both on-demand and reserved), plus the ability to try before you buy.

What's really exciting is that Amazon clearly recognizes their lack of good I/O solutions. Maybe we'll see a whole range of options stem out of this... one can hope.

On-demand high I/O servers...

Say I have a batch process that has huge I/O requirements that has to run once per month and be finished within X hours of starting or SLAs are broken. (Plenty of these types of custom workloads exist in the enterprise)

I can either buy a server with specialist enterprise-grade SSD / Fusion-IO / whatever (> $20,000 most likely) for this once a month process or I can spin up one of these high I/O servers for 1 day per month for a grand total of $50.

In this scenario, this new server type is a godsend.

Monthly financial consolidation for large companies? My employer has a rack of servers for one application that are used very heavily for a few days a month then are hardly used for the rest of the time - the databases are currently being moved to internal Fusion-IO devices rather than Fibre Channel SAN drives.

The key consideration there is how many consulting hours will you need to figure out how to build up and tear down whatever Oracle/SAP/etc software that you need to do your financial calculations on the EC2 VM.

You may find that Fusion-IO is cheaper than $50 + 3 months of consulting time.

Well, I didn't think you'd be able to do it with existing consolidation applications!

NB Honestly, I think that is a business opportunity....

The closest I can find is Hetzner's EX8S[1], which is EUR99 for the box itself without disks, and about another EUR80 for 4 240 GB SSDs... This is a little less than HALF of what Amazon gives, but costs about EUR180 a month and EUR150 for setup... but you do lose the elasticity of Amazon... Mind you, it is 1/10th of the price...

[1]: http://www.hetzner.de/en/hosting/produkte_rootserver/ex8s

"Since the benefit of using EC2 is that you can provision instances elastically" That is certainly a benefit, but I think many would argue that it's far from the only benefit.

And as with all EC2 pricing, reservations drop the price substantially. A 1-year reservation for a "heavy utilization" instance comes out to $7280 per year + $0.621 per hour, for a total of $12719 per year.

> $7280 per year or $0.621 per hour.

I believe that's $7280/year plus $0.621 per hour, or about a thousand dollars a month amortized over a year.
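The amortization is easy to check. A minimal sketch, using only the figures quoted in this thread ($7280 up front, $0.621 per hour, $3.10 per hour on-demand) and assuming the instance runs all 8760 hours of the year; the function name is made up.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def reserved_yearly_cost(upfront: float, hourly: float,
                         hours: float = HOURS_PER_YEAR) -> float:
    """Total yearly cost of a reservation: upfront fee plus hourly usage."""
    return upfront + hourly * hours


if __name__ == "__main__":
    yearly = reserved_yearly_cost(7280, 0.621)
    on_demand = 3.10 * HOURS_PER_YEAR
    print(f"reserved:  ${yearly:,.2f}/year, ${yearly / 12:,.2f}/month")
    print(f"on-demand: ${on_demand:,.2f}/year, ${on_demand / 12:,.2f}/month")
    print(f"saving vs on-demand: {1 - yearly / on_demand:.0%}")
```

That matches the $12719 total and the "about a thousand dollars a month" figure, and it only holds if the instance really is running around the clock.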

Good catch, thanks. Updated my original comment

A reserved instance is about $650/month. One possible use: run your master postgres on it and use streaming replication to transmit the data to something more durable.

(Don't process cc transactions this way.)

The one that comes to mind is that you want to do a map/reduce job and need a large and fast working space to do the reductions.

Otherwise, as noted below, the most likely use case is in a reserved situation running Cassandra or Postgres or something.

What sort of map-reduce space needs fast random I/O? The only things which come to mind are disguised table joins, where the right solution is to use sorting instead.

(Not trying to be argumentative, I'm just trying to figure out what you have in mind.)

It's not just random I/O that's fast, but sequential is a lot faster too.

But what I'm thinking of is the tokenization of a large corpus of text into bigrams and trigrams for example.
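To make the bigram/trigram example concrete, here is a single-machine sketch of the idea. It is only an illustration: the tokenizer regex and function names are assumptions, and a real map/reduce job would emit the n-grams in the map step and sum the counts in the reduce step, with the intermediate data spilling to disk once it outgrows memory.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens: list, n: int):
    """Yield every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))


def count_ngrams(text: str, n: int) -> Counter:
    """Count how often each n-gram occurs in the text."""
    return Counter(ngrams(tokenize(text), n))


if __name__ == "__main__":
    sample = "the cat sat on the cat"
    print(count_ngrams(sample, 2).most_common(3))
    print(count_ngrams(sample, 3).most_common(3))
```

The number of distinct trigrams grows much faster than the vocabulary, which is why the working set for this kind of job gets large quickly.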

Good point, with all the focus on random I/O in the announcement I was forgetting about the sequential I/O performance benefit.

Hmm, that reminds me, I have an interesting toy problem which is sequential I/O limited...

Admittedly, it would be fun to pull in the Common Crawl and process it.

Note that there's a code contest for Common Crawl that recently was announced: http://commoncrawl.org/first-ever-code-contest/

I can't think of an example, but I'd guess this'd work out as a good cost-effective solution for people who need to do "big data" map-reduce type jobs, but with less utilisation than 1 week/month. Or, if you can solve the "getting the data in" problem (perhaps you've already got all your data stored on S3), you could use one of these for an hour or two at a time, perhaps running end of week or end of day batch processing, at a somewhat lower cost than having a similar sort of pay-by-the-month colo server.

We use ec2 within our telecom infrastructure, which manages peaks of several thousands of calls with full duplex call recording on (think IO here). We are using the elasticity of ec2 to scale on demand.

This kind of instance is a nice addition for us as far as our auto scaling is concerned.

on-demand doesn't matter.

If lots of your infrastructure is in ec2, you may need a good db server inside ec2 that is used constantly. eg if you have a read heavy cassandra workload, you need ssds.

This is a game changer for big sites on EC2. The key word here is local: 2 TB of local SSD-backed storage.

In this video [1], Foursquare says the biggest problem they're facing with EC2 is consistency in I/O performance. They say that the instance storage simply isn't fast enough for them, and while EBS is fast enough when RAIDed, it isn't consistent since it isn't local (EBS traffic goes over the network). Reddit has also complained about EBS, but they've been able to move onto the instance storage.

If you're willing to reserve the instance for 3 years, the average monthly cost becomes only $656. That's quite a good deal.

Foursquare says in that video they're planning to migrate off of EC2, in part due to I/O performance. I'll be interested to hear whether or not this instance type changes their minds.

[1] http://www.10gen.com/presentations/MongoNYC-2012/MongoDB-at-...

If you're willing to reserve the instance for 3 years, the average monthly cost becomes only $656. That's quite a good deal.

The only problem with reserving that instance for 3 years is that better hardware always comes along, especially with the cost of SSDs coming down significantly every year. Usually if you're in the big-data space, your hardware is likely retired after 24 months (12 months if you're well funded) so locking yourself in for 36 months might be a bad investment.

Has Amazon ever bumped the specs on existing hardware types? Or do they just create new hardware types? e.g. is it possible that if you get a 3-year reservation for an h1.xlarge, by 2015 h1.xlarge might have newer specs?

I had thought that EC2 reservations were upgradeable, but a quick check on the forums shows you're right, they're not. Of course, you can play your own "tiered usage" game, like laptops in IT departments, where the old h1.xlarge becomes cheap enough to use as a second-tier machine and you go reserve the h1.xxlarge for Cassandra.

If you sign up for a reservation, you seem to be able to send support a message in order to have them cancel it so that you can change to the new hotness. We had to do this for our three-year reservations when high usage reservations came out. We were getting shafted because our previous generic "Reservations" were converted to medium use, whereas we were using them as high use.

So it does at least appear that in some cases, they'll let you out of your reservation so that you may sign up for something similar. Or at least they let us do that.

Don't reserve for 3 years.

AWS cuts their costs at a relatively reliable rate. We've done the math and found that the 1 year reservations are absolutely worthwhile, but that the 3 year reservations are not. Granted that was for our specific workload / use case.

Money has a time value, and this stuff is getting cheaper fast.

Netflix got a huge performance boost for Cassandra using the SSD instances:


Reminder since it's a pair of SSDs and most people will probably look into using this for their DB store: If you use current generation controllers/software & SSDs, you're going to have a bad time if you turn on RAID and don't know exactly what you're doing.

TRIM ( https://en.wikipedia.org/wiki/TRIM ) isn't supported with RAID on SSD today on hardware controllers, and most distributions of Linux don't support TRIM on RAID out of the box if you're doing software RAID, so you're going to see performance plummet like a rock after you do one pass of writes on the disk. In many RAID configurations, you're going to zero-write the entire disk when formatting it, so performance is going to suck from the get-go. For this reason, even if you have a tiny database and don't expect to write 1TB worth of data, your performance might still suck. Personally, I haven't tried Linux software md TRIM in production; the patch is pretty recent, so you're on your own here. (If possible, scaling out horizontally may be a solution to consider for redundancy. I have no idea what Amazon is using for SSDs, but recent Sandforce generations fail all the time, so plan for that.)

If you don't know to look for this issue, you're going to be scratching your head when your RAID10 SSD configuration write throughput is worse than a single 7200rpm drive. On the other hand, IOPS on SSDs are AMAZING for databases/datastores. Amazon may have solved this for you already behind their virtualization layer, and they might be running their own software striping behind whatever RAID you're doing, so be sure to test it out fully first.

Don't assume that it's actually a pair of SSDs under the hood. It's very likely that it's actually ~8 SSDs md/LVM'd together as two volumes (JBOD, no RAID). I'm pretty sure they've taken steps to ensure that TRIM is working, as they'd burn through drives/complaints too quickly otherwise.

That said, you're absolutely right about being cautious/not RAIDing the volumes. There are almost no RAID configurations that support TRIM at present, so it's definitely not a good idea to be RAIDing up these drives. Just go JBOD.

What's a good way to utilize the two disks for your datastore? Run mongodb on one disk, for example, and back up to the other? Run one sharded instance on one disk and another sharded instance on the other disk?

With MySQL (and Oracle before that) it was common to simply move different parts of the database data files to different disks. I don't use Mongo so I can't speak to that, but the concept works pretty much universally. See here for more information about spreading your database around multiple disks: http://www.mysqlperformanceblog.com/2010/12/25/spreading-ibd...

Interesting. Do you happen to know if using LVM instead of mdadm is susceptible to the same issues?

In addition, keep in mind that MD (software raid) does not support discards. In contrast, the logical volume manager (LVM) and the device-mapper (DM) targets that LVM uses do support discards. The only DM targets that do not support discards are dm-snapshot, dm-crypt, and dm-raid45. Discard support for the dm-mirror was added in Red Hat Enterprise Linux 6.1.

Red Hat also warns that software RAID levels 1, 4, 5, and 6 are not recommended for use on SSDs. During the initialization stage of these RAID levels, some RAID management utilities (such as mdadm) write to all of the blocks on the storage device to ensure that checksums operate properly. This will cause the performance of the SSD to degrade quickly.
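For anyone who wants to check this on their own box rather than take the docs on faith, here is a minimal sketch. It assumes a Linux kernel that exposes `queue/discard_max_bytes` under `/sys/block/<device>/`, where a nonzero value means the block layer advertises discard for that device. The function name and the `sysfs_root` parameter are made up for illustration (the latter only so the check can be pointed at a test directory).

```python
from pathlib import Path


def supports_discard(device, sysfs_root="/sys/block"):
    """Return True if the kernel reports a nonzero discard limit for `device`.

    `device` is a block device name such as "xvdf", "md0" or "dm-0".
    Returns None when the attribute cannot be read (no such device,
    non-Linux system, unexpected contents).
    """
    attr = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    try:
        # The attribute holds a single integer; 0 means discard is unsupported.
        return int(attr.read_text().strip()) > 0
    except (OSError, ValueError):
        return None
```

Calling `supports_discard("md0")` and `supports_discard("dm-0")` on the same machine is a quick way to compare an md array against an LVM volume. A True result only says the virtual device accepts discards; it does not prove that whatever sits underneath a virtualized disk acts on them.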


Thanks for that. Very helpful and it looks like I have a new thing to research. To date, I've only employed SSDs in laptops.

This has been a long time coming, but AWS has consistently been improving their service (as long as you can ignore the particularly bad reliability as of late).

It's telling that they have only enabled this for a huge (quadruple extra large) instance type. It's probably hard to make this work for someone who just wants a 10GB disk with great IO. The problem at the low end is that disks are larger and would thus have to be divided up to make proper use of them, leading to IO contention.

The high IO options will probably only ever be available for pretty large instances.

> It's telling that they have only enabled this for a huge (quadruple extra large) instance type.

My guess (not based on any knowledge of EC2 internals) is that they don't have any way to do fair I/O sharing between guests. If they did, they could split these boxes into 32 small instances with 1 ECU, 1.7 GB RAM, and a 60 GB disk with 2500 random reads / 250-4000 random writes per second.

Xen offers easy ways of doing fair I/O sharing between guests. These servers they're using are most likely multi-tenant systems with 256-512GB of RAM and 6-12TB of SSD storage. Providers don't like keeping expensive systems around that aren't making money, especially when demand changes every hour, so I expect that they have at least 4 instances sharing the I/O of each host (especially when they mention broad ranges of expected I/O).

The most likely reason for not slicing these systems up to smaller instances is they want to maintain consistent, high performance I/O.

AFAIK, the largest tier in any AWS instance type has always been the full box, i.e., an m1.xlarge is the whole box, an m2.4xlarge is a whole box, etc.

I would agree with you, but them listing such broad write IOPS ranges makes me think otherwise. I could be wrong though.

There's a technical reason for the range, explained in the blog post:

> Why the range? Write IOPS performance to an SSD is dependent on something called the LBA (Logical Block Addressing) span. As the number of writes to diverse locations grows, more time must be spent updating the associated metadata. This is (very roughly speaking) the SSD equivalent of seek time for a rotating device, and represents per-operation overhead.

It seems like they could spread a bunch of smaller instances over an array of SSDs, maybe just offer less space at a higher price? SSDs are naturally better at concurrent access, so while you might not get 150k+ IOPS, you could get at least 10k or so, with the expected low random access times. Imagine an SSD-backed EBS.

FWIW it costs $27,156 a year on-demand or $12,720 as a reserved heavy utilization instance.
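A quick sanity check of those numbers, as a sketch. It uses the $3.10/hr on-demand rate quoted elsewhere in the thread and treats the $12,720 reserved figure as a flat all-in annual cost, which is an assumption about how that number was derived.

```python
# Back-of-the-envelope check of the hi1.4xlarge prices quoted in this thread.
# Work in integer cents so the hourly rate does not pick up float rounding.
ON_DEMAND_CENTS_PER_HOUR = 310      # $3.10/hr (US East, Linux), from the thread
RESERVED_DOLLARS_PER_YEAR = 12_720  # heavy-utilization figure quoted above
HOURS_PER_YEAR = 24 * 365

on_demand_per_year = ON_DEMAND_CENTS_PER_HOUR * HOURS_PER_YEAR / 100
savings_fraction = 1 - RESERVED_DOLLARS_PER_YEAR / on_demand_per_year
# Hours of on-demand use that cost as much as a full reserved year.
break_even_hours = RESERVED_DOLLARS_PER_YEAR * 100 / ON_DEMAND_CENTS_PER_HOUR

print(f"on-demand, running 24/7: ${on_demand_per_year:,.2f}/year")
print(f"reserved saves {savings_fraction:.0%} at full utilization")
print(f"break-even at about {break_even_hours / 24:.0f} days of on-demand use")
```

So under these assumptions the reserved price only wins if the box is up for roughly half the year or more; for a job that runs a few hours a week, on-demand is the cheaper way to go.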

For heavy analytics workloads I'd bet that Google BigQuery (https://developers.google.com/bigquery/) would be cheaper and faster and more reliable.

Can anyone report the model of SSD they use (via ATA IDENTIFY)?

My guess based on perf characteristics is each instance has 2 x 960GB OCZ Talos 2 C Series SSDs: http://www.oczenterprise.com/ssd-products/talos-2-c-sas-2.5-...

It's very likely that they are using Intel SSDs (likely 320/330, maybe 510+) for a variety of reasons:

Intel has a good drive reliability history, and is very enterprise friendly in bulk purchasing. Intel has had excellent firmware for their drives, which at datacenter scale is valuable. People who have dealt with RAID controller firmware (including Amazon) know all about this.

Traditionally, Amazon has not used SAS drives in EC2, opting for lower cost SATA drives. It's also unlikely that Amazon is using small numbers of high capacity (>=500GB) drives because they still aren't quite cost effective; price per GB is OK, but replacing a failed drive is more costly.

Also keep in mind that to get to today, Amazon has been rolling these drives in huge numbers out through two enormous data centers, so it's unlikely that Amazon has picked a brand new drive (say the latest OCZ Vertex 4).

There are other factors that Amazon has bumped into while testing drives, but they remain unreported and internal.

Would these be suitable to run a single big SQL Server on? I mean spec-wise they seem perfect for our size/use. They say data will survive a reboot, but what are the chances I would some day have to wake up in the middle of the night and restore the database to a new server? Would some of the recent Amazon downtimes be one of those cases where that could happen?

That would be a really bad idea unless you have real-time replication to another server. SSD or not, you don't want to write data to the ephemeral store without a real-time backup, unless you're willing to lose it all.

Hopefully we will start to see some other providers offer SSD-backed storage since Amazon does it now. It would be nice if they offered it on some smaller instances too though.

Storm on Demand have done SSD storage for a while now, pretty solid IOPS numbers too. I'll run a benchmark comparison tonight.

Ran some benchmarks, really insane IOPS on this plan. Pretty average CPU performance (they're using fairly old E5620's).


I just ran the numbers on the new EC2 instance, and I'm pretty skeptical about the benchmarks above. I'm not sure that, for example, a half second of dd /dev/zero really tells us much.

When interpreting any benchmarks on EC2, it's important to understand that there is a 5-10% read/write performance hit on first use because AWS uses lazy block wipes between customer instance launches. See http://www.youtube.com/watch?v=IedaYaKsb-4#t=29m49s (should pre-cue; if not, skip to 29:49). This is referenced in the docs, but it's easy to miss: http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/In...
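If you'd rather measure the first-write vs. second-write difference yourself than rely on dd output, a rough sketch follows. It writes zeros to a path, syncs, and returns throughput; running it twice against the same target shows whether the second pass is faster. The function name and defaults are made up for illustration. Pointing it at a raw device such as /dev/xvdf destroys whatever is on it, and a small file on a filesystem will mostly measure the page cache and the filesystem rather than the device.

```python
import os
import time


def timed_write(path, total_bytes, block_size=1024 * 1024):
    """Write `total_bytes` of zeros to `path`; return throughput in MB/s."""
    block = bytes(block_size)
    written = 0
    # No O_TRUNC: a second call overwrites the same region instead of
    # starting from an empty file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    start = time.perf_counter()
    try:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            written += os.write(fd, chunk)
        os.fsync(fd)  # count the time it takes the data to reach the device
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6
```

Usage would be something like `first = timed_write(target, size)` followed by `second = timed_write(target, size)`, with `size` large enough that caching doesn't dominate.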

So here you go, for hi1.4xlarge:


Summary for the impatient - After initialization (i.e., second-write), quasi-realistic I/O on the new SSD EC2 instances sustains writes @ 420 MB/sec and reads @ 6 GB/sec. The entire 8.6GB / filesystem copied over to SSD in 21 seconds.

Not bad.


    # df -h

    Filesystem            Size  Used Avail Use% Mounted on
    /dev/sda1            8.0G  1.1G  6.9G  14% /
    tmpfs                  30G     0   30G   0% /dev/shm
    /dev/xvdf            1023G   16G  957G   2% /media/ephemeral0

    (Note: /dev/xvdf and /dev/xvdg are just soft links to /dev/sdf and /dev/sdg respectively)

    Crude stats on first-use:

    # hdparm -tT /dev/xvdf

    Timing cached reads:   14788 MB in  1.99 seconds = 7446.69 MB/sec
    Timing buffered disk reads:  1066 MB in  3.00 seconds = 355.04 MB/sec

    Wipe the device:

    dd if=/dev/zero of=/dev/xvdf bs=1M& pid=$!
    while true; do kill -USR1 $pid; sleep 4; done;
    dd: writing `/dev/xvdf': No space left on device

    1048567+17 records in
    1048566+17 records out
    1099511627776 bytes (1.1 TB) copied, 1955.42 s, 562 MB/s

    Stats after zero-wipe (dd /dev/zero) to device:

    hdparm -tT /dev/xvdf

    Timing cached reads:   13260 MB in  1.99 seconds = 6673.05 MB/sec
    Timing buffered disk reads:  1124 MB in  3.01 seconds = 374.02 MB/sec

    hdparm -tT /dev/xvdf

    Timing cached reads:   11188 MB in  1.99 seconds = 5624.17 MB/sec
    Timing buffered disk reads:  1122 MB in  3.00 seconds = 373.99 MB/sec

    hdparm -tT /dev/xvdf

    Timing cached reads:   12930 MB in  1.99 seconds = 6505.78 MB/sec
    Timing buffered disk reads:  1124 MB in  3.00 seconds = 374.15 MB/sec

    Confirming Effect Of Pre-wiped I/O:

    hdparm -tT /dev/xvdg

    Timing cached reads:   11796 MB in  1.99 seconds = 5931.68 MB/sec
    Timing buffered disk reads:  1038 MB in  3.00 seconds = 345.87 MB/sec

    hdparm -tT /dev/xvdg

    Timing cached reads:   12658 MB in  1.99 seconds = 6367.41 MB/sec
    Timing buffered disk reads:  1050 MB in  3.00 seconds = 349.47 MB/sec

    hdparm -tT /dev/xvdg

    Timing cached reads:   12856 MB in  1.99 seconds = 6468.39 MB/sec
    Timing buffered disk reads:  1066 MB in  3.00 seconds = 354.80 MB/sec

    Post- vs. pre-wipe performance: 373.6 MB/sec vs. 349.3 MB/sec (6-7% speed improvement)

    Somewhat more real-world numbers:

    dd if=/dev/sda1 of=/dev/xvdf bs=1M
    8192+0 records in
    8192+0 records out
    8589934592 bytes (8.6 GB) copied, 19.7876 s, 434 MB/s

    dd if=/dev/sda1 of=/dev/xvdf bs=1M
    8192+0 records in
    8192+0 records out
    8589934592 bytes (8.6 GB) copied, 20.0365 s, 429 MB/s

    dd if=/dev/sda1 of=/dev/xvdf bs=1M
    8192+0 records in
    8192+0 records out
    8589934592 bytes (8.6 GB) copied, 21.4193 s, 401 MB/s
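For anyone who wants to re-derive the summary figures, here is a small sketch that recomputes them from the runs pasted above. The numbers are copied from the output as shown; the averages of the three displayed hdparm runs per device come out marginally different from the 373.6 and 349.3 quoted, presumably because of rounding or runs not shown, but the gain still lands in the 6-7% range.

```python
from statistics import mean

# Buffered disk read results (MB/s) copied from the hdparm runs above.
wiped = [374.02, 373.99, 374.15]      # /dev/xvdf, after the dd zero-wipe
not_wiped = [345.87, 349.47, 354.80]  # /dev/xvdg, never written to

# dd throughput (MB/s) reported for the three 8.6 GB copies.
copy_runs = [434, 429, 401]

gain = mean(wiped) / mean(not_wiped) - 1

# Cross-check dd's own arithmetic for the full-device wipe: bytes / seconds.
full_wipe_mb_s = 1099511627776 / 1955.42 / 1e6

print(f"wiped: {mean(wiped):.1f} MB/s, not wiped: {mean(not_wiped):.1f} MB/s")
print(f"gain from pre-wiping: {gain:.1%}")
print(f"average copy throughput: {mean(copy_runs):.0f} MB/s")
print(f"full-device wipe: {full_wipe_mb_s:.0f} MB/s")
```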

*Edit: formatting

Check out this long thread on WHT, started 2 months ago:


From my limited testing, a 1 Node instance from GridVirt at $30/month can compile gcc-4.6.3 within 15 minutes (not exactly an IO-intensive example...)

The pricing in the blog post is a bit unclear - the prices on http://aws.amazon.com/ec2/pricing/ are ... US East $3.10 for Linux and $3.58 for Windows. EU West $3.41 for Linux and $3.58 for Windows.

(Reserved instance prices are cheaper)

Another approach is something like OVH SSD servers (24GB ECC memory, 2x300GB SSD, £210/month) https://www.ovh.co.uk/dedicated_servers/mg_ssd_max.xml

If you are using MongoDB, you take 3 or 4 of them and shard, and you back up with "conventional" storage for the replica set. You end up with a 6 node cluster for less than the price of this Amazon instance.

Lesson: You need to have a business which can benefit from a lot of start/stop of your instances for them to make sense from a pure financial point of view.

Some notes:

1. You can't order from OVH if you're from outside their list of approved countries.

2. You're not using any RAID and those are desktop grade SSD drives, and they tend to die out, sometimes without a clear warning, as they're not really intended for 24/7 server use.

1. The list of countries from where you can order servers is expanding on a regular basis. If your country is not there yet, it should come in the future.

2. You have two 300GB drives with a RAID card (battery powered), so you can put them in RAID. As for reliability, I keep my fingers crossed (I have some of these servers), but no failures yet, and in an interview the operators said that they basically see extremely good reliability. This is not marketing in this case; it is because they need it to be financially sustainable.

By the way, do you know what kind of SSD (SLC, MLC, real disks or cards?) are used by Amazon?

Speaking as a storage admin: when you start to have gobs of hard drives, there's always one failing. You'll have dry spells where nothing fails, and then all of a sudden you're looking at 2, 3+ failures on a system, although not necessarily in the same RAID group.

If your hardware provider needs to cut down on the warranties to be financially sustainable, I'd be concerned. It looks like these are rentals and not purchases, so why wouldn't these guys be warrantying to Dell/HP or the drive manufacturer directly? Are they buying gray market to reduce the cost, trying to pass that savings off to you, but then in turn run out of recourse when they need to replace a drive?

I'm just speculating; I have no idea if this company is good or not. I'm just concerned about the statement you made about the company, whether it's from your understanding or what they actually said.

Sorry, English is not my first language, so my comment was maybe not really clear. They do not buy on the grey market (they are the largest hosting provider in Europe, 120k+ servers), but they carefully select drives with the best reliability because they cannot afford to swap the drives of the dedicated servers too often, as they operate on a low-margin approach. They are not cutting down on the guarantee: these are dedicated servers with guaranteed hardware, and they change the drives in case of failure at no cost.

If you operate on low margins, you'd better have systems with minimal need for manual operations, because as soon as you have one guy pulling a dedicated server, changing the drive and putting in another one, you have lost a couple of months of your earnings on this particular server. If you do that too often, you are not happy at the end.

They're probably using SATA MLC drives. All their EC2 storage has been SATA so far, and it's ridiculously difficult to get 1TB of SLC and remain cost effective.

It seems we all may be missing the "backed" part - which I did on my first read through. They don't seem to be revealing how much of the logical volume is actual SSD, which I think is why they're instead putting down IOPS numbers.

Still, a huge and significant improvement over anything previously available. I'm looking forward to playing with it.

It's very likely that the entire volume is SSD, otherwise the tp99 would be atrocious (and if you look at the Netflix numbers it's actually quite good). The reason for the broad ranges is more likely that the SSDs have some inherent performance variability due to wear leveling and GC processes. Additionally, as time goes on, multi-tenancy will start to come into play (if Amazon has smaller SSD instances or larger machines), which will stay within that range comfortably.

The whole volume is SSD.

Since I'm spending more than $1,000/month for an RDS master, which is backed by EBS, I'm intrigued by the idea of running our database off of these. Of course I'd lose all the awesome automated features of RDS, but it's worth considering.

Maybe AWS will release a High Performance RDS option that runs off of them. Wishful thinking.

This is nice, but they need a smaller instance that costs less that also has SSD. Until then, I'm not going to blow $5k a month on a system like this.

This would be very useful for elastic EMR workloads, will be good for killing I/O bottlenecks.

Wow, I would like to do a Redis benchmark on one of those!

Amazon will find some way to make this slower than shit and less reliable than a campaign promise.

I've used them and they are quite performant. They definitely live up to their promise.
