
EC2 Maintenance Update II - jeffbarr
http://aws.amazon.com/blogs/aws/ec2-maintenance-update-2/
======
theonewolf
@jeffbarr I just wanted to say thanks for not only posting this, but also
sticking around in the comments section. Makes you + Amazon seem way more
human :-)

Also, in case you have any "cloud servers" you want to decommission:
[https://news.ycombinator.com/item?id=8373394](https://news.ycombinator.com/item?id=8373394)

~~~
jeffbarr
I am always happy to help, time and circumstances allowing.

Before joining Amazon I earned my living by consulting for startups. I could
always tell when they were about to run out of money when they would offer to
pay me in servers. This was always the cue to find my next gig.

~~~
samstave
I would like to request that, if you're going to suggest _"Run instances in
two or more Availability Zones"_ as a fault-tolerant architecture method
(which everyone _SHOULD_ do), can we please have a flag or tag to say that
traffic between "Zone-1-HOST-A" and "Zone-2-HOST-A" is specifically
resiliency/HA/fault-tolerance traffic, so that it could benefit from a
discount in zone transfer fees?

We are moving many hundreds of servers between zones every day, and sometimes
by the hour, specifically to handle all sorts of constraints (Spot, capacity,
failure, load, etc)...

I have paid quite a lot in zone transfer fees, especially when dealing with
AWS network issues or spot price insanity.

I would love to classify some traffic as resiliency traffic and be charged
differently for that traffic as opposed to general traffic used to service my
user-base.

~~~
jeffbarr
There already is a tiered pricing model for this.

Data that goes between AZs is charged at $0.01 / GB.

Data that flows from an EC2 instance to the Internet is charged at $0.12 / GB
(for up to 10 TB / month, with discounts from there, and there's also 1 GB /
month of free traffic).

Details are on
[http://aws.amazon.com/ec2/pricing/](http://aws.amazon.com/ec2/pricing/)
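
As a rough illustration of the two rates quoted above, here is a minimal
Python sketch. The function names are made up for this example. It assumes
1 TB = 1024 GB and that the free 1 GB comes off the top of Internet traffic,
and it only covers the first 10 TB / month because the discounted tiers are
not listed in this comment.

```python
INTER_AZ_RATE = 0.01       # USD per GB between AZs, as quoted above
INTERNET_RATE = 0.12       # USD per GB to the Internet, first 10 TB / month
FREE_INTERNET_GB = 1       # free Internet traffic per month, as quoted above
FIRST_TIER_GB = 10 * 1024  # assumption: 1 TB = 1024 GB


def inter_az_cost(gb):
    """Monthly cost of traffic between AZs at the quoted flat rate."""
    return gb * INTER_AZ_RATE


def internet_cost(gb):
    """Monthly cost of EC2-to-Internet traffic within the first tier."""
    if gb > FIRST_TIER_GB:
        # The discounted tiers are not given in the thread, so refuse to guess.
        raise ValueError("discounted tiers above 10 TB are not modeled")
    billable = max(gb - FREE_INTERNET_GB, 0)
    return billable * INTERNET_RATE


if __name__ == "__main__":
    print(f"500 GB between AZs:     ${inter_az_cost(500):.2f}")
    print(f"500 GB to the Internet: ${internet_cost(500):.2f}")
```

At these rates a GB sent between AZs costs one twelfth of a GB sent to the
Internet, so inter-AZ spend only dominates when the inter-AZ volume is much
larger than the outbound volume.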

~~~
samstave
I know there is tiered pricing, but I spend 5X more on zone-to-zone data
transfer than on the data that needs to leave AWS for my userbase. So, even
with the tiered model, I still pay a lot just to move data around my app
because I constantly have to swap boxes between zones.

------
tptacek
This is a better announcement than Rackspace's: no spin, direct, to the point.

~~~
ericras
"An Apology":
[https://community.rackspace.com/general/f/34/t/4341](https://community.rackspace.com/general/f/34/t/4341)

~~~
tptacek
This sentence, early in that post, is where my spin shields went up:

 _Now that this issue has been fully remediated,_ without any reports of
compromised data among our customers, _I’d like to explain what happened, and
why._ [em mine]

------
freshflowers
"Pay attention to your Inbox and to the alerts on the AWS Management Console."

Especially with incidents like these (and other cases where instances are
scheduled to be taken down), it really annoys me that AWS doesn't offer any
push alerts besides emailing the account owner.

~~~
jeffbarr
What kinds of alerts would work for you? Let me know and I will pass them
along to the team.

~~~
danmactough
I would like to be able to have alerts go right into SNS.

~~~
dlg
More generally, it would be cool if every account automatically got two free
read-only SQS queues: one for events and one for upcoming events. Every known
event (startup, terminate, permission change, network error, etc) could be
published to the queue.

Not high-priority for us but the parent comment sparked the idea (I'm not in
devops so maybe this exists via a different mechanism.)

~~~
jeffbarr
These are both great ideas. I will share them with the team today. Keep them
coming!

~~~
mh-
seems like realtime CloudTrail->{SQS,SNS} could satisfy his request and
several that I have.

(instead of waiting for events to batch from CloudTrail to S3)

------
danmactough
We noticed that all of our instances -- in 3 different availability zones in
US East -- all cycled at the same time. That was pretty disappointing. Kind of
defeated the purpose of having things in different AZs.

~~~
jeffbarr
We did a zone-by-zone reboot. If you want to send me the instance ids I will
ask our team to see what happened. You can find my email in my profile.

~~~
voidlogic
How much delay is there between zones? Some people's services don't come up
instantly. Perhaps when zone 3 went down, the user's services in zone 1 hadn't
finished coming back up.

~~~
zwily
They did a different zone every day, so ~24 hours.

------
rubyrescue
We had 50 servers reboot and 15... never came back. It was painful. 10 were in
one AZ. We're putting servers in more AZs now, but preparing for a reboot and
preparing to rebuild a redis slave, 2 zookeepers, 10 cassandras, and an API
server are different things.

------
captainchaos
This explains some weird EC2 stuff we saw yesterday. A good reminder to check
up that a box starts up as expected (mounts disks, starts services, etc). It's
easy to get lazy about that sort of thing when you're moving fast.

~~~
nnx
Same here, the VM rebooted and restarted all services correctly but the
connectivity was failing intermittently. Not fun.

We concluded we'd better just start a new VM... and it worked.

Next time there is such a scheduled event, I guess being proactive and
creating a new VM beforehand is the better solution.
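
One way to be proactive is to poll for scheduled events rather than wait for
the email. A minimal sketch, assuming boto3 and its
`describe_instance_status` call; the helper name `scheduled_events` is made
up for this example, and it only parses a response dict, so it makes no AWS
call itself.

```python
def scheduled_events(status_response):
    """Yield (instance_id, code, not_before) for each scheduled event.

    `status_response` is a dict shaped like the result of boto3's
    ec2.describe_instance_status(); nothing is fetched here.
    """
    for status in status_response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            yield status["InstanceId"], event["Code"], event.get("NotBefore")


# Usage (assumes boto3 is installed and credentials are configured;
# large accounts would also need to follow pagination):
#   import boto3
#   ec2 = boto3.client("ec2")
#   resp = ec2.describe_instance_status(IncludeAllInstances=True)
#   for instance_id, code, when in scheduled_events(resp):
#       print(instance_id, code, when)
```

Anything this turns up (a code such as `system-reboot`, for example) is a
cue to bring up a replacement before the maintenance window.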

------
lubos
One of my instances was scheduled to be rebooted between 10pm and 2am.

But the reboot never actually occurred.

It's not that I feel left out but did anyone else experience the same?

------
thom_nic
So this had nothing to do with Shellshock then?

~~~
kbenson
Shellshock, being a bash bug (and affecting systems that use bash for shell
commands, etc), is OS and/or application level, and the responsibility of the
individual customers to handle (as they are responsible for administration of
the instances they run). The Xen bug affects how the hardware is virtualized,
and this is Amazon's responsibility.

------
amaks
No live migration huh?

~~~
toomuchtodo
If you've built your environment properly, you don't require live migration.

~~~
amaks
That's not true. Hardware where VMs are hosted requires repairs from time to
time, so VMs do require migration.

~~~
toomuchtodo
Stop instance, start instance, land on new hardware (AWS EC2, with EBS
storage). If your application requires a hot migration, you've failed, as it
can't tolerate failures inherent to a virtualized environment.
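
A minimal sketch of that stop/start cycle, assuming a boto3-style EC2 client
is passed in; the function name `stop_then_start` is made up for this
example. Whether the instance actually lands on different hardware is AWS's
decision and not something the code can verify, and anything on instance
store (as opposed to EBS) is lost on stop.

```python
def stop_then_start(ec2, instance_id):
    """Stop an EBS-backed instance, wait for it to stop, then start it again.

    `ec2` is expected to behave like a boto3 EC2 client.
    """
    ids = [instance_id]
    ec2.stop_instances(InstanceIds=ids)
    ec2.get_waiter("instance_stopped").wait(InstanceIds=ids)
    ec2.start_instances(InstanceIds=ids)
    ec2.get_waiter("instance_running").wait(InstanceIds=ids)


# Usage (assumes boto3 is installed and credentials are configured;
# the instance id below is a placeholder):
#   import boto3
#   stop_then_start(boto3.client("ec2"), "i-0123456789abcdef0")
```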

