
Has Amazon EC2 become over subscribed? - blasdel
http://alan.blog-city.com/has_amazon_ec2_become_over_subscribed.htm
======
dangrossman
It really bugs me that Amazon rarely admits the many faults that go on within
their cloud. The status page just shows all green with no notes, even when you
see multiple major sites drop off the web, report problems on Twitter, etc.

Numerous times I've had EC2 and EBS go out of contact (or simply have huge
network latency, essentially the same). Since my instances run off RAID arrays
of EBS volumes, essentially everything dies until EBS reappears.

------
old-gregg
If true, does this sound like Amazon is violating the terms of use? I thought
they _guarantee_ to provide a certain level of service, such as CPU
percentage, _regardless_ of your neighbors.

I am not a virtualization wizard, but I always thought Xen and other
hypervisors have CPU usage caps. If you're not getting built-in protection
from noisy neighbors then VPS becomes just a crappier version of good old
shared hosting. True? False?

P.S. I've been a Slicehost user for 3+ years, and all the instances I've had
there performed pretty much as expected.

~~~
wmf
_I thought they guarantee to provide a certain level of service, such as CPU
percentage, regardless of your neighbors._

If they guarantee X performance but all along they were providing 2X and now
they're only providing X, customers will complain. There are different ways to
configure Xen so it's not clear exactly what Amazon is doing.

~~~
BearOfNH
_[...] so it's not clear exactly what Amazon is doing._

Shouldn't Amazon be providing customers with reports to this effect? It can't
be that difficult to say that within each 10-minute interval _i_ you got _X_i_
% of a CPU, received _N_i_ packets, etc.

This is the sort of data Amazon should collect themselves to analyze the
systems within the cloud, so the customer never (OK, rarely) sees a problem.
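
Until Amazon publishes such a report, a customer can approximate the CPU part of it from inside the guest. The following is a minimal sketch, assuming a Linux instance whose kernel reports steal time in the aggregate `cpu` line of `/proc/stat`; whether a given EC2 guest reports steal accurately is an assumption, and the function names are hypothetical.

```python
def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:]]


def steal_percent(before, after):
    """Percentage of elapsed CPU time reported as steal between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [y - x for x, y in zip(a, b)]
    if len(deltas) < 8:
        return 0.0  # kernel too old to report steal time
    # Counters 0..7 are user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest counters (8 and 9) are already included in user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()
```

Calling `read_cpu_line()` at the start and end of each 10-minute interval and passing both samples to `steal_percent` gives a per-interval figure for CPU time that went to something other than your guest. It says nothing about packets, which would need a separate counter.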

------
dotBen
The author mentions the "trick" of closing a poorly performing instance and
re-instantiating it in the hope of landing away from a bad neighbor.

This sounds reasonable, but the method could be improved by instantiating a
new instance first and then removing the old one - this ensures you don't
instantiate in the same location as before.

I don't know EC2's architecture well enough to be sure, but there may even be
ways of telling whether your new instance is located on the same problematic
host as the original one (perhaps by tracerouting to the problem instance and
finding it to be very nearby). If this happens, presumably you can instantiate
a third and repeat until you find a suitable instance, at which point you kill
the others.

~~~
swolchok
Actually, "Hey You, Get Off of My Cloud: Exploring Information Leakage in
Third-Party Compute Clouds"
(<http://people.csail.mit.edu/tromer/papers/cloudsec.pdf>) documents "the
tendency for EC2 to assign fresh instances to the same small set of machines",
and exploits this tendency in order to co-locate a malicious instance with a
victim instance. Rapid re-instantiation seems like a _bad_ way to improve
performance.

------
nas
Interesting but I don't see how he can conclude "[Amazon EC2 has] deep rooted
scalability problems at their end". One of the downsides of cloud computing is
that you don't really know what's going on with the lower layers.

~~~
nudist
Incidentally, not knowing what's going on with the lower levels is also the
primary benefit of cloud computing.

~~~
seiji
There's a difference between not wanting to know (lack of interest and/or lack
of education) and not being able to know.

------
timf
See also: <http://news.ycombinator.com/item?id=1048873>

------
Asa-Nisse
The irony is that his site is now offline.
[http://downforeveryoneorjustme.com/http://alan.blog-city.com...](http://downforeveryoneorjustme.com/http://alan.blog-city.com/has_amazon_ec2_become_over_subscribed.htm)

Cache: [http://alan.blog-city.com.nyud.net/has_amazon_ec2_become_ove...](http://alan.blog-city.com.nyud.net/has_amazon_ec2_become_over_subscribed.htm)

~~~
piramida
The author does not make _any_ attempt to measure the difference between now
and then, and just goes by the "feeling" that it is becoming slower. While
human feelings are valid grounds for personally disliking a product, I don't
see how this "article" can even be linked here.

Our "feelings" are different, and since the author does not provide a single
number and small EC2 instances still perform the way they always did, I can
only conclude that their software or web app is becoming bloated, or that they
don't know how to measure.

------
klon
We use EC2 for our site, and our system monitoring, which checks TCP
connectivity on our Elastic IP every few minutes, reports outages almost every
day now. We have tried the forums, but no one seems to know how to
troubleshoot this.
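
A check like the one described can be sketched as below. This is only an illustration, not the monitoring system the comment refers to; the function name and the host and port in the usage note are hypothetical. Recording how long each attempt took, alongside success or failure, helps separate real outages from periods of high latency.

```python
import socket
import time


def tcp_check(host, port, timeout=5.0):
    """Attempt one TCP connection; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False, time.monotonic() - start
```

For example, `tcp_check("203.0.113.10", 80)` run from a machine outside EC2 every few minutes, with each result logged next to a timestamp, gives a record that can be compared against the instance's own logs for the same period.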

------
codexon
I have a feeling this is why reddit feels slower nowadays.

