IPv6 adoption could not happen soon enough.
Anybody who has ever tried the VPC+ElasticIP+VPN braindeadness even once should immediately file a feature request for IPv6; it's just that they probably don't think of it.
The number of times I found myself attempting to VPN into a client's network, only to find it conflicted either with my home network or with whatever coffee shop I was sitting in, was ridiculous. Depending on how many hosts you need to run on your network, there are huge numbers of possible subnets you could use for an internal network - do yourself a favour and keep off the ones set up by default on every router sold.
At my Old Job I demanded we keep a "registry" of the RFC1918 address space we allocated to Customers. We never allocated Customers in overlapping address spaces. It made VPN connectivity to Customer A while on-site with Customer B much easier. It also helped out in one case where one Customer acquired another.
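If you want to keep a registry like that honest, the overlap check is nearly a one-liner with Python's stdlib ipaddress module. A minimal sketch (the customer names and subnets here are made up):

    import ipaddress

    # Existing per-customer allocations (hypothetical)
    registry = {
        "customer-a": ipaddress.ip_network("10.17.0.0/16"),
        "customer-b": ipaddress.ip_network("10.42.0.0/16"),
    }

    def allocate(name, cidr):
        """Record a new allocation, refusing anything that overlaps."""
        new = ipaddress.ip_network(cidr)
        for owner, net in registry.items():
            if new.overlaps(net):
                raise ValueError(f"{cidr} overlaps {owner}'s {net}")
        registry[name] = new

    allocate("customer-c", "10.99.0.0/16")  # fine
    allocate("customer-d", "10.42.8.0/24")  # raises: overlaps customer-b's /16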
One downside of IPv6 on AWS is that the IPv6 tunneling protocol -- protocol number 41 -- is only available in VPCs. The EC2-Classic network allows only three IP protocols -- TCP, UDP, and ICMP -- to pass through it.
Can you explain why I wouldn't want to do this, or how I should improve my understanding of IPv6?
Even non-ULA IPv6 addresses need not be long. 2600:3c00:e000:6c::1 is the address of my server over at Linode, and I don't find that bad at all.
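The :: shorthand is doing most of the work there; it collapses the longest run of zero groups. Python's stdlib ipaddress module will show both the full and the compressed form:

    import ipaddress

    addr = ipaddress.ip_address("2600:3c00:e000:6c::1")
    print(addr.exploded)    # 2600:3c00:e000:006c:0000:0000:0000:0001
    print(addr.compressed)  # 2600:3c00:e000:6c::1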
It won't work properly across routers, at least not out of the box (tried that when configuring Tinc VPN), but maybe this would be a good direction?
“Zeroconf”¹ is a name for the sum of two interacting standards, namely “mDNS”² and “DNS-SD”³. Avahi⁴ is a free software implementation (for Linux and BSD) of a daemon with which programs can register Zeroconf services (name & port number) and have Avahi announce them on the network. The other major implementation of a daemon of this kind is from Apple, and it is called “Bonjour”⁵.
This often gets confused, so, again: Zeroconf = the umbrella standard. mDNS and DNS-SD = its component standards. Avahi = a specific free software implementation. Bonjour = a specific proprietary implementation.
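To make the division of labour concrete, here is a minimal service-registration sketch using the third-party python-zeroconf package (the service name, address, and port are invented; on a typical Linux box Avahi would normally be doing this for you):

    import socket
    from zeroconf import ServiceInfo, Zeroconf  # pip install zeroconf

    info = ServiceInfo(
        "_http._tcp.local.",                   # DNS-SD service type
        "Example Web App._http._tcp.local.",   # DNS-SD instance name
        addresses=[socket.inet_aton("192.168.1.50")],
        port=8080,
    )

    zc = Zeroconf()
    zc.register_service(info)  # announced to the local link via mDNS
    try:
        input("Serving; press enter to unregister...")
    finally:
        zc.unregister_service(info)
        zc.close()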
Edit: here are the slides from Facebook's presentation to the IPv6 World Congress in March about their internal IPv6 use. If IPv6 interests you, they're definitely worth a read: http://www.internetsociety.org/deploy360/wp-content/uploads/...
Interestingly enough, last week when they had their 45-minute worldwide outage with some kind of routing problem, it was still up and running on that address.
That's what he's suggesting, though (I think). Except because it's IPv6, Amazon's big pool of addresses would never conflict with Facebook's big pool of addresses.
That's what Linode does. Everyone gets a /64 to use as they please.
It's been one of the best architecture decisions I've ever made. At this point we only use one public IP address. (If direct access to a machine is needed then you can connect via VPN running on the one bastion host with the public IP address, and this gives your machine access to the local IP addresses of instances running inside the VPC.)
All the machines in our cluster are protected inside local VPC address space, with access from the outside world going through ELBs that expose public service endpoints like the API and website. I can't think of any good reason why you wouldn't be using VPC in the first place. Having public IP addresses on private machines sounds like a recipe for disaster if you ever accidentally miss a port in your security rules.
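To make that last point concrete, here is a boto3 sketch of a security group that only admits HTTPS from the VPC's own address space (the VPC ID and CIDR are placeholders):

    import boto3

    ec2 = boto3.client("ec2")

    sg = ec2.create_security_group(
        GroupName="internal-only",
        Description="HTTPS from inside the VPC only",
        VpcId="vpc-0123456789abcdef0",  # placeholder
    )

    ec2.authorize_security_group_ingress(
        GroupId=sg["GroupId"],
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            # The VPC's own CIDR, not 0.0.0.0/0
            "IpRanges": [{"CidrIp": "10.0.0.0/16"}],
        }],
    )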
I think you guys did an exceptional job tackling a really difficult problem. I've been in the same position, migrating from EC2 to datacenters, and we determined that EC2 -> VPC -> datacenters is really the only way; Neti solves it surprisingly well.
Going forward, I hope that acquired companies opened their AWS accounts late enough that Amazon forced them to use VPC.
In any case, the migration is daunting even at our size, not least because our devops team size is 1. I do wish they'd had VPC when we started.
Plan to change just the bare minimum needed to support the new environment, and avoid the temptation of “while we’re here.”
Good engineering is knowing how to act with surgical precision when necessary. This is what allows a craft like programming to operate in the confines of a business.
That is essentially what Neti does, except instead of static mappings, it's dynamic and software-configurable (which is pretty much the only way to go when your entire environment is virtual and the underlying network equipment is out of your control).
This is a tough problem, Neti is a heck of a lot better than tons of VPN connections everywhere.
I'm impressed by how fast they got this migration done, considering the massive scale they operate at.
Maybe you're mixing things up with Glacier?
And yet every time I read about this kind of stuff I think, how glad I am that we are building a DISTRIBUTED social network and will never have to solve problems at this massive scale! We won't have to move millions of other people's photos here or there if everything is distributed from day 1. People will be able to move their own stuff easily wherever they want.
^ That wouldn't've happened in GCE (i.e., they should have been acquired by Google).
For example, if you want to assign 10.1.1.1 specifically as a network address to a virtual machine instance, you can create a static network route that sends traffic from 10.1.1.1 to your instance, even if the instance's network address assigned by Compute Engine doesn't match your desired network address.
Meaning they could have avoided conflicts using this mechanism.
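If I'm reading the docs right, via the API that looks roughly like this (a google-api-python-client sketch; the project, zone, instance, and route names are placeholders, and the exact body fields are my best reading of the Compute Engine routes resource):

    from googleapiclient import discovery  # pip install google-api-python-client

    compute = discovery.build("compute", "v1")  # assumes application default credentials

    compute.routes().insert(
        project="my-project",
        body={
            "name": "pin-10-1-1-1",
            "network": "global/networks/default",
            "destRange": "10.1.1.1/32",  # the address you want to "assign"
            "nextHopInstance": "zones/us-central1-a/instances/my-instance",
            "priority": 500,
        },
    ).execute()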
At any rate, Instagram's been around since 2010 and GCE didn't exist until June 2012 (and wasn't generally available until this past December).
They mentioned Neti but didn't dig into details beyond "a dynamic iptables manipulation daemon, written in Python, and backed by ZooKeeper" (a rough sketch of that pattern follows below), and they mentioned the overlapping-IP blocker, which is an issue on almost every migration.
Also, taking into consideration that they haven't written a post in the past 10 months, I am sure that they can do it even better now.
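For the curious, the core pattern (watch a ZooKeeper node, rewrite NAT rules when it changes) is roughly the following. This is a toy sketch using the kazoo client, not Instagram's actual code; the znode path, JSON layout, and NETI chain name are all invented:

    import json
    import subprocess
    import time
    from kazoo.client import KazooClient  # pip install kazoo

    zk = KazooClient(hosts="zk1:2181,zk2:2181")
    zk.start()

    # Create our dedicated NAT chain; ignore the error if it already exists
    subprocess.run(["iptables", "-t", "nat", "-N", "NETI"], check=False)

    @zk.DataWatch("/neti/mappings")  # invented znode path
    def on_change(data, stat):
        if data is None:
            return
        # e.g. {"10.1.2.3": "172.16.0.7", ...}, overlay IP -> real IP
        mappings = json.loads(data)
        subprocess.run(["iptables", "-t", "nat", "-F", "NETI"], check=True)
        for overlay_ip, real_ip in mappings.items():
            subprocess.run([
                "iptables", "-t", "nat", "-A", "NETI",
                "-d", overlay_ip, "-j", "DNAT", "--to-destination", real_ip,
            ], check=True)

    # kazoo fires the watch from a background thread; just stay alive
    while True:
        time.sleep(60)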
Run in multiple clouds from day one. Take the pain. It gives you flexibility. Basic vendor management 101.
While "taking the pain" may yield flexibility in the long run, the most important thing in the short run is making sure that you are building something that people want, listening to users, and iterating the tech side of things as quickly as possible. I suspect that most devs have enough trouble dealing with a single cloud provider and that trying to work with multiple would could a significant decrease in iteration speed. I think that approach would kill most startups because of the technical overhead incurred.