TCP Anycast – Don’t Believe the FUD [pdf]

namecast · on May 28, 2015

Meh, I'll bite, since it's been posted....

1) This is from a NANOG talk back in 2006. Nearly a decade later, it's appropriate to ask 'how do I properly do IPv6 TCP anycast then?', and the answer is, sorry, it's a quirk of IPv4 that anycast ever worked in the first place. The presentation mentions that the authors plan to deal with IPv6 anycast by dying before IPv6 is implemented.

2) The presentation lays out how TCP anycast works across 3 points of presence. Try anycasting a prefix out of 70 PoPs and your mileage will vary. There are definitely applications where TCP anycast has the same MTBGRW (mean time before global routing weirdness) as, say, geo-aware DNS directing to regional load balancers, but with every new PoP announcing the same prefix you're taking on a non-linear (possibly exponential) decrease in MTBGRW.

3) The one bullet point that needs to be re-re-re-re-emphasized: 'Effective BGP communities from upstreams is key' [sic]. Yeah they are, and the diligence and cooperation of your upstream providers are the difference between a working anycast mesh and turning the global routing table into your own personal split-brain hell.

TL;DR: Don't believe the FUD, but let's not start hyping things either. I'm only aware of three ASNs announcing TCP anycast prefixes in the decade since this presentation, and Edgecast was one of 'em. TCP anycast didn't take off, most likely because ASNs and IPv4 prefixes are way too scarce.

clinta · on May 29, 2015

First of all, I don't see how any of your comments are specific to TCP. Routing is all Layer 3. If you have routing weirdness, or quirks that are specific to IPv4, and you don't have good cooperation with your upstream providers, that's going to make anycast an issue with any layer 4 protocol you run over it.

Secondly, I don't see how anycast is an IPv4 quirk. Announcing a prefix out of multiple PoPs certainly isn't a quirk, and announcing out of a PoP then routing over a dedicated WAN is in practice no different than announcing over multiple ISPs in the same datacenter. These are all normal and not quirks. So why would it be an IPv4-only quirk to terminate that connection at every PoP on a local server? I don't see how it makes any difference if your anycast network is an IPv4 /24 or an IPv6 /48 (which has the benefit of being much easier to get).

Anycast in general didn't take off because IPv4 prefixes are scarce, but the abundance of IPv6 prefixes may be a good reason to expect future expansion of anycast.

namecast · on May 29, 2015

None of my comments are TCP specific, but as I said - at 3 PoPs, you won't see routing weirdness; announcing the same prefix from 70 PoPs will cause the likelihood of route instability to increase. For the case of UDP, hopefully no one will notice, as your lost packets will just go to the next available route; for TCP connections, a route withdrawal sending you to the next nearest PoP will lead to a broken connection and clients seeing badness in their browsers.
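The TCP-versus-UDP difference described above can be illustrated with a toy sketch. This is not a model of real BGP: the `Pop` class, the `route` function, the costs, and the addresses are all invented for illustration. Each PoP keeps its own table of TCP connections, and the "network" simply delivers to the lowest-cost PoP still announcing the prefix.

```python
from dataclasses import dataclass, field

@dataclass
class Pop:
    name: str
    cost: int                 # hypothetical routing cost as seen by one client
    announcing: bool = True   # is this PoP still announcing the anycast prefix?
    tcp_state: set = field(default_factory=set)  # connections this PoP has seen open

    def handle_udp(self, query):
        # Stateless: any PoP can answer any datagram.
        return f"{self.name} answered {query}"

    def handle_tcp(self, conn_id, syn=False):
        if syn:
            self.tcp_state.add(conn_id)
            return "SYN-ACK"
        # A mid-stream segment for a connection this PoP never saw gets reset.
        return "ACK" if conn_id in self.tcp_state else "RST"

def route(pops):
    """Pick the lowest-cost PoP that is still announcing the prefix."""
    live = [p for p in pops if p.announcing]
    return min(live, key=lambda p: p.cost)

pops = [Pop("east", cost=10), Pop("west", cost=40)]
conn = ("198.51.100.7", 51515)  # made-up client address and port

opened = route(pops).handle_tcp(conn, syn=True)   # handshake lands on "east"
before = route(pops).handle_tcp(conn)             # still "east", state is there
pops[0].announcing = False                        # "east" withdraws its route
after = route(pops).handle_tcp(conn)              # now "west", which has no state
udp_after = route(pops).handle_udp("example.com A?")

print(opened, before, after)
print(udp_after)
```

In this sketch the established TCP connection is reset as soon as the withdrawal moves it to a PoP that never saw the handshake, while the UDP query is simply answered by the new PoP.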

Anycast is an IPv4 quirk in that it wasn't designed; it "just works" because of the BGP route decision process. Yes, you can announce your prefix out of multiple PoPs, but how people choose to use the routes you announce is where things get "quirky".

In theory there's no difference between announcing a /24 IPv4 and asking us all to accept it, and doing the same with a /48, but in practice you'll burn an unacceptably large amount of IP space, IMHO, and if you tell ARIN that you need a second /48 block because you burned your first with less than 25% utilization, they simply won't let you have it.

Disagreed on your last point. The abundance of prefixes isn't the problem IMHO; it's the acceptance of route announcements from upstreams. If upstreams were to start accepting and passing along more granular routes, maybe we'd start seeing more anycast deployments, but I doubt it.

clinta · on May 29, 2015

Agreed that if doing TCP, you either need to limit the PoPs, or be doing it with an app that is highly resilient and capable of dealing with unexpected loss of a session and reconnections. The same is true of UDP. For stateless short-burst things like DNS and NTP it's not a big deal, but it is for video conferencing and other stateful protocols built on top of UDP. It's not a layer 4 issue as much as an application issue.

As to the second point, ARIN could not possibly require 25% utilization. That's 3*10^23 addresses on a /48. Or if they're counting utilization as the number of subnets, that's 16,384 /64 networks.
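As a quick sanity check on those two figures, here is a small sketch using Python's standard `ipaddress` module. The prefix `2001:db8::/48` is the IPv6 documentation range, used purely as a stand-in for any /48.

```python
import ipaddress

# Any /48 will do; this one comes from the documentation range.
net = ipaddress.ip_network("2001:db8::/48")

total_addresses = net.num_addresses          # 2**80 addresses in a /48
quarter_addresses = total_addresses // 4     # 25% "utilization" by address

total_64s = 2 ** (64 - net.prefixlen)        # number of /64 subnets in a /48
quarter_64s = total_64s // 4                 # 25% "utilization" by subnet

print(f"{quarter_addresses:.1e} addresses")
print(f"{quarter_64s} /64 networks")
```

A quarter of a /48 works out to 2^78 addresses, or 16,384 of its 65,536 /64 subnets, which matches the numbers above.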

Look at some sample companies. Valve has a /44 from ARIN, and they're only announcing 6 /48 prefixes right now. Netflix has a /32, and they're announcing fewer than 20 /48 prefixes. I worked at a small MSP that provides ISP services to fewer than 50 clients, and they were allocated a /32 years ago and have yet to announce a single prefix.
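For a sense of scale, this sketch counts how many /48s fit inside a /44 and a /32. The prefixes are placeholders from the IPv6 documentation range, not the real Valve or Netflix allocations, and `count_48s` is a helper made up for this example.

```python
import ipaddress

def count_48s(prefix: str) -> int:
    """Number of /48 subnets contained in the given (shorter) IPv6 prefix."""
    net = ipaddress.ip_network(prefix)
    return 2 ** (48 - net.prefixlen)

in_44 = count_48s("2001:db8::/44")   # placeholder /44
in_32 = count_48s("2001:db8::/32")   # placeholder /32

# List the first few /48s of the /44 to show what would actually be announced.
first_three = [
    str(n)
    for n in list(ipaddress.ip_network("2001:db8::/44").subnets(new_prefix=48))[:3]
]

print(in_44, in_32)
print(first_three)
```

Taking the figures quoted above at face value, that is 6 of 16 possible /48s announced out of the /44, and fewer than 20 of 65,536 out of the /32.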

Burning a /48 for an anycast network is not a big deal, and getting another /48 is trivial. Getting a /24 for an IPv4 anycast network is much harder. So yes, for IPv4 anycast to take off we need the ability to advertise longer prefixes, but /48s are abundant enough that one can be sacrificed.

tcannon · on May 29, 2015

I do it. Two data centers, east and west coast. For better or worse I peer with the same providers on each side, which does less for your traffic than it does for juggling ISP maintenance that might affect you. I've not heard a single complaint due to anything relating to anycast.

Considering that you're getting a damn robust global-scale load balancer for free, and it mostly maintains itself -- it's hard to argue against it.

justizin · on May 28, 2015

Super informative, love the detail in the slides!

Would be more impressed if the IPv6 plan wasn't:

  The plan consists of being dead by the time customers demand v6.

Why aren't these brave heroes of anycast leading the charge to ensure it works with a protocol suite that is now something like a decade old?

I don't expect the story to be:

  We are just as confident with this as what we've been running for years on IPv4.

But, maybe:

  Except for the bits that we rely on custom hardware for, it's looking really promising.

Getting IPv4 allocation from ARIN, even if it is conserved by anycast, isn't going to be possible forever. I've been meaning to get ahold of an ASN and a small block for just this reason.

Therefore, the techniques outlined here may not be usable by new companies or on new networks in a couple of years. Even if I can get IPv4 from my hosting provider, I need ARIN/RIPE/etc. for anycast, afaik.

Still, great read! :)

namecast · on May 28, 2015

It's a protocol layer thing, mostly down to "what prefix sizes can I reasonably expect other providers to accept?"

For IPv4 the answer was a /24; for IPv6 the answer seems to be a /64 at this point, and burning an entire /64 to get a couple of DNS servers (the typical use case for anycast) to resolve slightly faster is overkill. It's also probably a great way to annoy your RIR when they ask 'how the hell did you burn through your initial IPv6 delegation with only two hosts' ;)

EDIT: I meant /48, not /64! Whoops-a-doodle.

detaro · on May 28, 2015

> It's also probably a great way to annoy your RIR when they ask 'how the hell did you burn through your initial IPv6 delegation with only two hosts'

Since /56 seems to be becoming the "easily available standard size" for allocations, using a /64 doesn't seem like too much of an issue, especially since larger orgs should easily get /48s? (But are /64s really accepted as routes? I would have imagined that to be also limited to /56 or /60 or so.)

namecast · on May 28, 2015

Depends on who's doing the accepting ;) I actually meant a /48, mea culpa!

I'll leave this here:

https://labs.ripe.net/Members/dbayer/visibility-of-prefix-le...

It's from 2010-ish but still sadly relevant.

detaro · on May 28, 2015

Ok, yes, "wasting" a /48 is a problem ;)