

Evaluating Amazon’s EC2 Micro Instances at DocumentCloud - jashkenas
http://blog.documentcloud.org/blog/2010/09/evaluating-amazons-ec2-micro-instances/

======
seldo
From our point of view, the real value of micro instances is architectural.
For tasks where the load is small but important (queue servers, memcache
instances, monitoring), where before we might have had one small instance and
a potential point of failure, we can now afford to have two micros,
simultaneously providing redundancy and halving the price (micro = 1/4 small).
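
(Working the numbers: at current on-demand Linux rates, if I remember them
right, a Small is $0.085/hr and a Micro is $0.02/hr, so a pair of Micros comes
to $0.04/hr, just under half the cost of one Small.)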

This is especially important in AWS, as the dirty secret of small instances is
that they occasionally vanish without warning -- in our case this has happened
4 times in 18 months.

~~~
lzw
When they vanish without warning, do they take the EBS with them?

Is having your OS on an EBS (as is required for micro) a dangerous thing,
because the instance vanishing without warning might result in a corrupted EBS
drive, making restoration more complicated than spinning up a new instance?

~~~
rbranson
I've had the same experience with smalls as the parent poster, though no EBS
loss. The larger instances do seem more stable, but I have no quantitative
data to back that up.

~~~
seldo
I don't know if 75+ instances over 18 months is large enough to count as
quantitative data, but we've never lost a large instance (they've had
temporary failures). All four of the permanent vanishing-acts were smalls.

------
babo
The most important characteristic of a micro instance is not the raw
performance per se but how the workload is distributed. It's not suitable for
a task where steady performance is required; micro instances are designed for
short but intensive bursts. AWS monitors micro instances and degrades their
performance when needed to allow other instances to run. Right now there are
not that many micro instances in use, so they are probably not as strict, but
this will change in the future for sure. For more info check their
documentation:
[http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGui...](http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/)

~~~
jashkenas
Take a peek at the "Real" versus "User" numbers in the linked post. We were
expecting to see that Micros perform well for short but intensive bursts, but
that wasn't the case.

When the Micro went to OCR 51 pages of document text, it spent only 6 minutes
of actual CPU time to do it. Unfortunately, those 6 minutes of CPU time were
spread over a real wait of 52 minutes, due to either other users on the
machine, or Amazon caps. That's not so great for an intensive burst.
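
To put numbers on it, the `time` output for that run looked roughly like this
(the command and seconds are illustrative; the minutes are the real ones):

    $ time docsplit text document.pdf
    real    52m0.000s    # wall-clock: what you actually wait
    user     6m0.000s    # CPU time the OCR actually consumed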

Perhaps it would work if the CPU burst lasted for seconds instead of minutes,
but I wouldn't bet on it. It would be a nice benchmark to try...

~~~
babo
According to the AWS documentation, an optimal burst is in the 200ms range,
not even seconds.

~~~
jashkenas
Thanks for looking that up. I wonder what sort of intermittent CPU-intensive
tasks are a good fit for that size of a burst. Are you using the Micros in
this fashion?

~~~
babo
I haven't found an application that follows that pattern yet. Running a proxy
in front of a bunch of micro instances is probably a good idea: it will
distribute the load evenly while adding the required redundancy and a speed
improvement over small instances.
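
Something like this nginx config is the shape of what I mean (it goes inside
the http block; the hostnames and ports are made up, and any round-robin proxy
would do):

    upstream micros {
        server micro-1.internal:8080;
        server micro-2.internal:8080;
        server micro-3.internal:8080;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://micros;
        }
    }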

~~~
saurik
A website with almost no users is going to need to execute every now and then
for only 200ms at a time.

------
conesus
Micro instances are great if you want to spend the least amount possible and
don't mind sacrificing raw speed. You are paying a penalty of having a quarter
of the processing power of a Medium instance. Oddly, though, the CPU in a
Micro is about 15-20% faster than the Medium instance, but when looking at the
CPU time versus Real elapsed times, you can see how Amazon is holding back the
power.
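
For the curious, a quick way to see which physical CPU an instance landed on
is to check /proc/cpuinfo on each instance type:

    grep "model name" /proc/cpuinfo

The throttling itself only shows up when you compare `time`'s user and real
figures, as in the post.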

~~~
rufugee
Is there any benefit to using ec2 Micro instances over the smaller Rackspace
Cloud offerings
([http://www.rackspacecloud.com/cloud_hosting_products/servers...](http://www.rackspacecloud.com/cloud_hosting_products/servers/pricing)),
unless you need to scale the Amazon way? I'm deciding between both currently
to replace a dedicated server for a few of my low traffic sites.

~~~
psadauskas
Storage. AWS gives you EBS volumes, of practically unlimited size. With
Rackspace, you get the size you get, 40GB, 80GB, etc, and have to pay for a
bigger instance if you want more storage. There's Cloud Files, but it's not as
useful as mounted disks.
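
As a sketch of what that flexibility looks like with the EC2 API tools (the
IDs and the device name are placeholders):

    ec2-create-volume --size 100 -z us-east-1a
    ec2-attach-volume vol-aaaaaaaa -i i-bbbbbbbb -d /dev/sdf
    # then, on the instance itself:
    mkfs.ext3 /dev/sdf
    mount /dev/sdf /mnt/data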

~~~
phillytom
EBS is bounded at 1TB / volume. Then you have to stripe.

~~~
danudey
But the point is that you can do that. With Rackspace, you get X amount of
storage, and that's the end of it. If you're getting full, your only option is
to get a server with twice the RAM and disk at twice the price.

With EC2, I can have a server half the price of Rackspace's cheapest instance,
but with theoretically unlimited storage space. I could use that e.g. for
backing up server data, or as an NFS share.

The one major limitation of Rackspace's service is disk space. It's severely
limited, and there's not much you can do about it.

~~~
shykes
Small correction: there's a fairly low limit to how many volumes you can
attach to an EC2 instance. Available storage is very high, but not
"theoretically unlimited".

~~~
psadauskas
Thats why I said "practically unlimited".

You get /dev/sd[b-p][1-15] as mountable EBS volumes. Thats (15 * 15) * 1TB, or
about 225TB. You may even be able to go higher than "p", but I haven't tried.
I'm also not sure, you may even be able to attach somewhere other than
/dev/sd* .

Either way, after that, your EBS bill is $22,500/mo, just for storage, so you
might be able to figure something else out for that cost.

But you CAN attach that to a $14/mo EC2 instance, which was my original point.

Edited to add:

 _Note that the default limit is 20 EBS volumes per EC2 account. You can
request an increase from Amazon if you need more._
[http://groups.google.com/group/ec2ubuntu/web/raid-0-on-ec2-e...](http://groups.google.com/group/ec2ubuntu/web/raid-0-on-ec2-ebs-volumes-elastic-block-store-using-mdadm)

Still, 20TB is "practically unlimited", for most applications.
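
The linked guide amounts to something like this (device names depend on where
you attached the volumes):

    # stripe four 1TB EBS volumes into one ~4TB block device
    mdadm --create /dev/md0 --level=0 --raid-devices=4 \
        /dev/sdh /dev/sdi /dev/sdj /dev/sdk
    mkfs.ext3 /dev/md0
    mount /dev/md0 /vol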

~~~
shykes
In my experience a volume can only be attached to /dev/sd[b-p], so it's a 15TB
limit per instance. If what you say is true, that would be great news. I'm
testing it right now.

Also, in response to your question: I can confirm that EC2 complains if you
try going above sdp, or if you try attaching anywhere other than /dev/sd*.

~~~
psadauskas
I know that the CentOS and Ubuntu AMIs don't know anything above /dev/sdp, but
I haven't tried any others. I have tried /dev/sdd1 through /dev/sdd15 and
/dev/sde1-4, and that works fine on Canonical's official 10.04 AMI.

------
lsc
the problem with benchmarking a virtual instance's CPU is that how much you
get varies quite a bit based on your scheduler weight and how much contention
you have.

If you have a system with 512MiB ram on a box and a system with 2048MiB ram on
the same box, and they each have 1 vcpu, and they are weighted appropriately,
when the cpu is idle, the 512 system will have just as much cpu as the 2048MiB
system. When there is contention, the weighting will step in and give the
2048MiB system more CPU, though.

(of course, I believe that amazon does not mix instance sizes. Even so,
assuming an even ram to cpu ratio, which is a pretty good assumption, the
2048MiB ram system on a system full of 2048MiB guests will have much better
worst case performance than a 512MiB guest on a system with a similar ram/cpu
ratio, even though if they both have 1 vcpu their best case performance will
be about the same, assuming the cpu cores on each host are of similar
speed.)

Really, if you are going for empirical data, you should do it the way Eivind
Uggedal did it. He repeatedly ran a benchmark and recorded results over a very
long period of time. Doing it just once is going to get you a pretty random
result, and on average, is going to make a small instance look much better
(cpu wise) than it is in the worst case, and really, you usually have to
design for something fairly close to the worst case.
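
A minimal version of that approach, with an arbitrary benchmark and sampling
interval:

    # run a fixed CPU workload every 10 minutes, appending timings to a log
    while true; do
        { time openssl speed sha256 > /dev/null 2>&1; } 2>> cpu_times.log
        sleep 600
    done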

------
icode
It would be nice to have the output of some simple bash one-liner that does a
CPU speed test, so everybody could compare the speed of their VMs on different
providers.
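
Something like timing a fixed-size pi calculation with `bc` would do (3000
digits is arbitrary; pick a size that runs for a few seconds):

    time echo "scale=3000; 4*a(1)" | bc -l > /dev/null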

~~~
conesus
I provided the command that I used to test the Micro instance on
DocumentCloud's server. The only requirement would be to install Docsplit

    
    
        sudo gem install docsplit
    

Then just `time` the runtime and compare.
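
i.e. something along these lines (the path is whatever test PDF you use):

    time docsplit text path/to/document.pdf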

~~~
icode
"gem install docsplit" gives me this:

-bash: gem: command not found

~~~
conesus
It's RubyGems - <http://docs.rubygems.org/read/chapter/3>

------
Periodic
It was nice to see that someone could test these with real-world usage, but I
was disappointed by the quality of the results. I didn't see mention of the
number of runs performed on each instance or the standard deviation of the
results.

The article made it sound like they only ran it once on each instance, which
means there are many factors that could have affected the results such as disk
caching, other processes, or high load.

As one of my physics professors used to say (paraphrased): A measurement is
meaningless without an estimate of the error (+/- 0 meaning).
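
For anyone re-running this, summarizing repeated wall-clock timings is a
one-liner (assuming one duration in seconds per line of a times.log):

    # population mean and standard deviation of the recorded timings
    awk '{ s += $1; ss += $1*$1; n++ }
         END { m = s/n; printf "mean %.2fs  stddev %.2fs\n", m, sqrt(ss/n - m*m) }' times.log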

