
Load Balancers need static IPs - bpuvanathasan
http://blog.pagerduty.com/2010/08/31/load-balancers-need-static-ips/
======
js2
Here's how we solve this.

We host our DNS with DNS Made Easy, but you could also run your own DNS
servers. (We actually have a pair of EC2 instances that we configured as DNS
servers and then shut them down, so we're paying a minimal amount for them,
but can spin them up very quickly the next time DME is hit with a DDOS.)

We query the ELB CNAME periodically and check its IP. If the IP changes, we
update our corresponding A records with the new IP. It's a small amount of
code and a cronjob that runs every 5 minutes.

Elegant? No. Get the job done? Absolutely.
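
For illustration only, the check-and-update step described above might look roughly like this in Python. This is a sketch, not the commenter's actual code: `update_a_records` is a hypothetical callback standing in for whatever API your DNS provider exposes, and the `resolve` parameter exists only so the logic can be exercised without real lookups.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_ips(hostname: str) -> Set[str]:
    """Return the IPv4 addresses that hostname currently resolves to."""
    # gethostbyname_ex returns (canonical_name, aliases, addresses).
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def sync_a_records(
    elb_hostname: str,
    current_ips: Iterable[str],
    update_a_records: Callable[[Set[str]], None],
    resolve: Callable[[str], Set[str]] = resolve_ips,
) -> Set[str]:
    """Compare the ELB's current IPs with our A records; update on change.

    update_a_records is a hypothetical hook for your DNS provider's API.
    Returns the set of IPs the A records should now hold.
    """
    known = set(current_ips)
    latest = resolve(elb_hostname)
    # An empty answer is treated as a lookup glitch: keep the old records.
    if latest and latest != known:
        update_a_records(latest)
        return latest
    return known
```

Run from cron every few minutes, the caller would persist the returned set and pass it back in as `current_ips` on the next run.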

~~~
forkqueue
Great, except for when your users' DNS resolvers cache the DNS entry longer
than they're supposed to (many resolvers ignore TTL), and are unable to reach
your site.

~~~
js2
No. Amazon gives you an A record and asks you to make a CNAME for it. So when
they change an ELB's IP, they update their A record. So in our case, we just
make sure our A records follow along. Either way, a DNS cache holding onto a
stale record too long would cause a problem.

Which is why Amazon keeps the ELB active on both the old and new IPs for a
period of time.

------
datums
If you could use a CNAME for your MX record, what are the benefits of using
ELB over what's already in place (weighted MX priorities, with the ability to
relay mail away automatically on failure from AWS)?

Also there's additional cost involved in using ELB.

I would prefer having a good dns provider, low ttl, EIP, nginx w/haproxy.

~~~
agmiklas
(Blog author here) I totally agree -- I don't think there's ever a good reason
to use a LB for mail, for exactly the reasons you mention.

However, the problem with using CNAMEs for LBs is that you can't both host
mail and a site at the same subdomain. Ideally, what I would want to do is set
up redundant MX records for .pagerduty.com to our mail servers, and also set a
CNAME from .pagerduty.com to the ELB to handle the web traffic. The DNS
spec doesn't allow this though (it would be ambiguous).

I've thought about using round robin DNS with a low TTL instead of an ELB.
Problem there is you don't get all the fancy auto-scaling stuff. I've also
heard rumors that some ISPs have their DNS servers configured to put a floor
on the retrieved TTL values...

------
kordless
I think you are blaming Amazon for a problem that is inherent in DNS, or
possibly your approach to handling your DNS based features.

Unless I've drunk too much AWS Koolaid, I'm fairly certain you can run
wildcard DNS for just the MX records for your domain. That entry would look
something like:

*.example.com. 3600 IN MX 10 mailserver.example.com.

Your mail servers can take the DNS synthesized domain from there I think.

BTW, I agree serving naked domains is a bit of a PITA, (appengine problem
too!) but you can solve that by assigning a few elastic IPs to a few web heads
and use RR DNS for them, with some code to take them out if one fails. Zerigo,
for one, supports doing something like this IIRC.
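
The "some code to take them out if one fails" part could be sketched along these lines. This is a hedged illustration, not how Zerigo or any particular provider does it: the TCP connect on port 80 is an assumed health check, and publishing the resulting list as A records is left to whatever DNS API you use.

```python
import socket
from typing import Callable, Iterable, List


def tcp_port_open(ip: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_web_heads(
    elastic_ips: Iterable[str],
    is_healthy: Callable[[str], bool] = tcp_port_open,
) -> List[str]:
    """Return the IPs that should stay in the round-robin A record set."""
    all_ips = list(elastic_ips)
    alive = [ip for ip in all_ips if is_healthy(ip)]
    # If every check fails, the checker itself is the likelier culprit,
    # so leave the full set published rather than emptying the record.
    return alive or all_ips
```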

302 anyone using the naked domain to the www. I doubt it matters much
load-wise, as it sounds as if you're running subdomains for your app like we
do at Loggly.

~~~
kordless
I just realized you are probably running CNAME wildcards for the subdomains as
well, which would conflict with the MX one. How about running separate records
for each subdomain?

~~~
agmiklas
Even with separate records for each subdomain, I think you'd still have the
problem.

You'd need to do:

acme MX (mail_server_ip)
acme CNAME (ELB hostname)

... but that isn't allowed. The problem is with the records conflicting, not
with the wildcard.

------
loup-vaillant
I'd rather say "load balancing needs SRV records"
<http://www.anta.net/nic/draft-andrews-http-srv-01.shtml>

Is it widely implemented yet? Why not?

------
mike-cardwell
There's no need to use a load balancer for MX. The protocol itself handles
the problem of machines being offline. I agree with your point about not being
able to point the root of your domain at an ELB though... I wasn't aware that
you can't use a CNAME at the root.

~~~
agmiklas
The problem though is that you can't put MX and CNAME records at the same
point in the DNS hierarchy. So if you want to host a site on an ELB at
acme.pagerduty.com, you can't then put in MX records for acme.pagerduty.com,
because they'll conflict with the CNAME you need to put there.

This all happens because it creates ambiguity. If you want to look up the MX
record of a name that has both MX and CNAME entries (say acme.pagerduty.com),
should the name resolver:

a) Grab the MX record at acme.pagerduty.com; OR

b) Do what is usually implied by a CNAME record, and pull the target of
acme.pagerduty.com, and search for an MX record at that name?

Because of the potential for conflict, the DNS spec simply forbids CNAME
records from existing alongside most other records.
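
To make the rule concrete, here is a small sketch of a zone sanity check that flags names where a CNAME sits beside another record type. The function name and the `(name, type)` tuple format are made up for this example, and the allow-list of DNSSEC types (RRSIG, NSEC) is an assumption about which records may legitimately coexist with a CNAME.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Assumption: only DNSSEC bookkeeping types may sit beside a CNAME.
ALLOWED_WITH_CNAME: FrozenSet[str] = frozenset({"RRSIG", "NSEC"})


def find_cname_conflicts(
    records: Iterable[Tuple[str, str]],
    allowed: FrozenSet[str] = ALLOWED_WITH_CNAME,
) -> Dict[str, List[str]]:
    """Map each name that has a CNAME to the record types clashing with it."""
    types_by_name = defaultdict(set)
    for name, rtype in records:
        # Normalize case and any trailing dot so equal names group together.
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())

    conflicts: Dict[str, List[str]] = {}
    for name, types in types_by_name.items():
        if "CNAME" in types:
            clashing = sorted(types - {"CNAME"} - allowed)
            if clashing:
                conflicts[name] = clashing
    return conflicts


zone = [
    ("acme.pagerduty.com", "CNAME"),  # points at the ELB hostname
    ("acme.pagerduty.com", "MX"),     # the mail record we would like to add
    ("www.pagerduty.com", "CNAME"),   # fine on its own
]
print(find_cname_conflicts(zone))  # {'acme.pagerduty.com': ['MX']}
```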

~~~
mike-cardwell
Of course. That's a pretty poor oversight on Amazon's side.

------
pegmanm
I am not sure I understand the issue here. Sure, the AWS ELB offering is
lacking and the issues re DNS are known. But just like any of their other
offerings, why try to shoehorn the product into your requirements?

Our service required a load balanced service and just like you we identified
the issues and decided we could not live with the limitations.

So we brought up a small instance and run HAProxy on it to do all our load
balancing. We assign an Elastic IP and we get to retain control, avoid the DNS
issues, do https etc.

Essentially avoiding all the limitations of the Amazon offering.

AWS themselves use HAProxy for Elastic Load Balancing and it's solid.

------
lox
Does the IP of the AWS load balancers ever change over time? Does anyone know
the technical reasoning behind the requirement?

~~~
agmiklas
Yup -- they most certainly do. I think they switch around the name -> IP
mapping in response to traffic rates -- if you suddenly get a surge of
traffic, they'll move your virtual ELB instance to a physical load balancer
that is currently lightly loaded.

Basically, they are doing load balancing for their load balancers. :)

~~~
rbranson
Does this actually cause a disruption of service though? It seems like it
would be trivial to just keep the configuration in place for 72 hours even
after changing the DNS entry.

~~~
agmiklas
They actually do that -- they keep the old IP live for a while to prevent
problems with clients that might have cached the IP.

~~~
lox
So could we not update the A record's IP address every couple of hours and
still get, roughly speaking, the benefit of load-balanced load-balancers?

------
maccman
Yes, Amazon have been ignoring this problem for a while now. Here's a post I
wrote on their forum over a year ago:
<http://developer.amazonwebservices.com/connect/thread.jspa?threadID=32044>

------
points
Is this a joke or irony? (The blog seems to be down right now).

~~~
javanix
The title may be a little broad (his concerns are mostly in certain use cases,
not in general) but he definitely raises some real concerns.

~~~
points
Let me be clearer - the link does not load for me. <http://blog.pagerduty.com>
is dead here. <http://pagerduty.com> loads fine though.

