

Migrating from AWS to AWS - mikeyk
http://instagram-engineering.tumblr.com/post/100758229719/migrating-from-aws-to-aws

======
legohead
We just migrated over to VPC as well, and came across a really weird "bug".

We auto-scale EC2, and _randomly_ when auto-scaling, the new server couldn't
connect to memcache (ElastiCache). Note that when you migrate over to VPC you
have to migrate everything -- launch new ElastiCache servers in VPC, EC2
servers, RDS servers, etc.

Back to the bug... I'd ssh into the EC2 server, and when I telnetted to
memcache, it wouldn't connect. I'd terminate the EC2 server, and a new server
would come up and connect fine. I made a forum post in AWS forums and got zero
responses. We then bought into AWS support and I submitted a ticket.

The problem: I launched my ElastiCache servers in the same subnet as my EC2
servers. Apparently the ElastiCache servers by default remember servers in
the same subnet by MAC address. Since we were cycling EC2 servers, eventually
we'd get one with the same MAC address but a new internal IP address, and I'm
no networking guy but apparently this caused a routing problem.

Solution: create a new subnet and launch all the ElastiCache servers in that
subnet. I did that, and it fixed the problem. The AWS support rep said if the
ElastiCache servers are launched in their own subnet it will force them to go
by IP instead of MAC address.

Anyway, hope this helps someone out ;)

~~~
mmmooo
This sounds like a stale ARP cache. It can be "fixed" with arping (forced ARP
responses without a who-has).

~~~
matthavener
Yup, EC2 instances should be issuing a gratuitous ARP on startup. I've seen
the same thing on subnets with a lot of DHCP churn due to constantly rebooting
embedded devices.
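
For anyone wondering what a gratuitous ARP actually looks like on the wire, here is a rough sketch in Python that only builds the frame bytes. The MAC and IP values are made up for illustration, and this is not a claim about what EC2 or ElastiCache do internally. Actually sending the frame would need a raw socket and root privileges, which is left out here.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6   # Ethernet broadcast destination
ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1               # gratuitous ARP is commonly sent as a request


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP announcement.

    The sender and target IP are both the host's own address, so every
    neighbour that hears the broadcast can refresh its ARP cache entry.
    """
    sender_mac = mac_to_bytes(mac)
    sender_ip = socket.inet_aton(ip)

    # Ethernet header: destination, source, EtherType
    eth_header = BROADCAST_MAC + sender_mac + struct.pack("!H", ETHERTYPE_ARP)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware address length 6, protocol address length 4, opcode
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_REQUEST)

    # Sender MAC/IP, then target MAC (zeroed) and target IP (same as sender)
    arp_body = sender_mac + sender_ip + b"\x00" * 6 + sender_ip

    return eth_header + arp_header + arp_body


# Hypothetical values, purely for illustration
frame = build_gratuitous_arp("02:00:00:aa:bb:cc", "10.0.1.25")
print(len(frame), frame.hex())
```

The point is just that the announcement ties one MAC to one IP and is broadcast unprompted, which is why it can overwrite a stale cache entry on the other hosts in the subnet.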

------
roncohen
We also migrated a while back (opbeat.com). While we run a smaller setup, I
imagine that might be the case for most readers. We run a pretty standard
setup with ELB, Postgres (master/replica), webservers and job processing
servers.

This recipe details what we did (as I recall it):

    
    
      1) Prerequisites: Running at least two of everything in
         separate AZs and expertise (or courage) to fail over
         to a replica DB.
      2) Boot up instances of everything in the VPC
      3) Set up a new ELB inside the VPC, add the web servers inside to the ELB
      4) Make sure your instances inside the VPC can talk to
         whichever service they need outside (the replica database
         and web servers need to reach the master outside, plus
         memcached). Use `telnet` to make absolutely sure :)
      5) Make sure web and job servers can reach the
         replica db and memcached inside
      6) Test out the new VPC ELB from outside
      7) Switch DNS over to the new VPC ELB, wait for it to
         propagate.
      8) Do a failover from your master to the replica inside.
      9) Same for memcached
      10) Shut down everything in EC2 classic.
      11) Drinks
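
The `telnet` checks in steps 4 and 5 could also be scripted, which is handy when there are many host/port pairs to verify. A minimal sketch, assuming plain TCP reachability is all you want to confirm; the hostnames in the usage comment are hypothetical placeholders, not anything from our setup:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike
        return False


def check_all(endpoints: dict) -> dict:
    """Map each label to True/False depending on whether it is reachable."""
    return {
        label: can_connect(host, port)
        for label, (host, port) in endpoints.items()
    }


# Example usage (hypothetical endpoints, replace with your own):
# results = check_all({
#     "postgres master": ("db-master.internal.example", 5432),
#     "memcached": ("cache.internal.example", 11211),
# })
# for label, ok in results.items():
#     print(label, "OK" if ok else "UNREACHABLE")
```

This only proves the port accepts connections, same as telnet does. It says nothing about whether replication or authentication works.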
    

EDIT: formatting

------
Someone1234
I'm used to AWS pricing but I'm still a little fuzzy on how AWS VPC pricing
works.

It is $0.05 per VPN Connection-hour. But in this context what is a "VPN
connection?" Do you literally just set up your cloud for "free" (aside from
paying for instances, etc) and then only pay $0.05 for every hour you spend
connected to the private cloud externally?

Does the VPC have any external visibility aside from the VPN connections? And
if it does, what is stopping you just setting up your own VPN server and
bypassing the $0.05/hour rate?

~~~
ceejayoz
VPC is free (and actually, new AWS accounts launch instances by default into a
default VPC Amazon sets up for you).

The $0.05 per VPN Connection-hour is only if you want to connect your own
network to your VPC.
[http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_VP...](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_VPN.html)

> Does the VPC have any external visibility aside from the VPN connections?
> And if it does, what is stopping you just setting up your own VPN server and
> bypassing the $0.05/hour rate?

Some do that, but the ~$30/month you're saving is likely eaten up by the cost
of setting up and managing it and the software VPN instance you have to run.
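
For a rough sense of the number, here is the arithmetic as a small Python sketch. It assumes the connection stays up around the clock and uses only the $0.05 per connection-hour rate quoted above; data transfer and any other charges are ignored.

```python
VPN_RATE_CENTS_PER_HOUR = 5  # $0.05 per VPN connection-hour, from the thread


def monthly_vpn_cost_dollars(days: int = 30, connections: int = 1) -> float:
    """Cost of keeping VPN connection(s) up 24 hours a day for `days` days."""
    # Work in whole cents to avoid floating-point drift, convert at the end
    cents = VPN_RATE_CENTS_PER_HOUR * 24 * days * connections
    return cents / 100


print(monthly_vpn_cost_dollars())  # 36.0 for a 30-day month
```

So an always-on connection lands in the $30 to $40 per month range, depending on the length of the month.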

~~~
iancarroll
One would assume the VPN also has a higher network throughput than any
instance under $30.

------
dantiberian
This has to be pretty high up the list of "Most creative uses of Zookeeper".
I'm really impressed with the ingenuity involved with building such a glorious
hack.

------
giovannibajo1
The details of the migration are juicy and it's probably a good idea to do it
anyway, but I can't stop thinking that the initial stumbling block (clashing
of private IP addresses) wouldn't have happened with IPv6.

------
wldcordeiro
Is Instagram still using a modified version of Django 1.1 (if I recall that
was the version they were on) or do they stay on the newest version now?

