
TCP Anycast – Don’t Believe the FUD [pdf] - luu
https://www.nanog.org/meetings/nanog37/presentations/matt.levine.pdf
======
namecast
Meh, I'll bite, since it's been posted....

1) This is from a NANOG talk back in 2006. Nearly a decade later, it's
appropriate to ask 'how do I properly do IPv6 TCP anycast then?', and the
answer is, sorry, it's a quirk of IPv4 that anycast ever worked in the first
place. The presentation mentions that the authors plan to deal with IPv6
anycast by dying before IPv6 is implemented.

2) The presentation lays out how TCP anycast works across 3 points of
presence. Try anycasting a prefix out of 70 PoPs and your mileage _will_
vary. There are definitely applications where TCP anycast has the same MTBGRW
(mean time before global routing weirdness) as, say, geo-aware DNS directing
to regional load balancers, but with every new PoP announcing the same prefix
you're taking on a non-linear (possibly exponential) decrease in MTBGRW.

3) The one bullet point that needs to be re-re-re-re-emphasized: 'Effective
BGP communities from upstreams is key' [sic]. Yeah they are, and the diligence
and cooperation of your upstream providers are the difference between a
working anycast mesh and turning the global routing table into your own
personal split brain hell.

TL;DR: Don't believe the FUD but let's not start hyping things either. I'm
only aware of three ASNs announcing TCP anycast prefixes in the past decade
since this presentation, and Edgecast was one of 'em. TCP anycast didn't take
off, most likely because ASNs and IPv4 prefixes are way too scarce.

~~~
clinta
First of all, I don't see how any of your comments are specific to TCP.
Routing is all Layer 3. If you're going to have routing weirdness or quirks
that are specific to IPv4, and you don't have good cooperation with your
upstream providers, it's going to make anycast an issue with any layer 4
protocol you run over it.

Secondly I don't see how anycast is an IPv4 quirk? Announcing a prefix out of
multiple PoPs certainly isn't a quirk, announcing out of a PoP then routing
over a dedicated WAN is in practice no different than announcing over multiple
ISPs in the same datacenter. These are all normal and not quirks. So why would
it be an IPv4 only quirk to terminate that connection at every PoP on a local
server? I don't see how it makes any difference if your anycast network is an
IPv4 /24 or an IPv6 /48 (which has the benefit of being much easier to get).

Anycast in general didn't take off because IPv4 prefixes are scarce, but the
abundance of IPv6 prefixes may be a good reason to expect future expansion of
anycast.

~~~
namecast
None of my comments are TCP specific, but as I said - at 3 PoPs, you won't see
routing weirdness; announcing the same prefix from 70 PoPs will cause the
likelihood of route instability to increase. For the case of UDP, hopefully no
one will notice, as your lost packets will just go to the next available
route; for TCP connections, a route withdrawal sending you to the next nearest
PoP will lead to a broken connection and clients seeing badness in their
browsers.
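
To make the TCP-versus-UDP difference concrete, here's a toy sketch, not a real network stack. The `PoP` class, its method names, and the client address are all made up for illustration; the only assumption is that each PoP keeps its own TCP session table and resets segments for sessions it has never seen.

```python
class PoP:
    """Hypothetical anycast point of presence with its own local state."""

    def __init__(self, name):
        self.name = name
        self.tcp_sessions = set()  # sessions are known only to this PoP

    def handle_udp(self, client):
        # Stateless request/response (e.g. a DNS query): any PoP can answer.
        return f"{self.name}: answered"

    def handle_tcp_syn(self, client):
        # The handshake creates state on whichever PoP the SYN reached.
        self.tcp_sessions.add(client)
        return "SYN-ACK"

    def handle_tcp_data(self, client):
        # A segment for an unknown session gets reset.
        return "ACK" if client in self.tcp_sessions else "RST"


near, far = PoP("near"), PoP("far")
client = ("198.51.100.7", 51515)  # documentation address, arbitrary port

near.handle_tcp_syn(client)
before_shift = near.handle_tcp_data(client)  # routes still point at "near"
after_shift = far.handle_tcp_data(client)    # route withdrawn, now at "far"
udp_after_shift = far.handle_udp(client)     # UDP query is still answered

print(before_shift, after_shift, udp_after_shift)
```

The TCP client sees a reset after the shift, while the stateless UDP query is answered by whichever PoP it lands on.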

Anycast is an IPv4 quirk in that it wasn't designed; it "just works" because
of the BGP route decision process. Yes, you can announce your prefix out of
multiple PoPs, but how people choose to use the routes you announce is where
things get "quirky".
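
A rough sketch of why it "just works": every remote router picks one best route for the prefix on its own, and nothing in the protocol cares that the origins are different machines. This toy compares AS-path length only (real BGP checks local preference before that, and origin, MED and more after it), and the PoP names, prefix, and AS numbers are hypothetical values from the documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    origin_pop: str   # hypothetical label, not something BGP carries
    as_path: tuple    # AS numbers from this router to the origin


def best_route(routes):
    # Toy decision process: shortest AS path wins,
    # with the PoP name as a deterministic tie-breaker.
    return min(routes, key=lambda r: (len(r.as_path), r.origin_pop))


# The same prefix, announced from two PoPs, as seen by one remote router.
seen = [
    Route("192.0.2.0/24", "pop-east", (64500, 64510, 64496)),
    Route("192.0.2.0/24", "pop-west", (64501, 64496)),
]

chosen = best_route(seen)

# If pop-west withdraws its announcement, traffic silently moves east.
remaining = [r for r in seen if r.origin_pop != "pop-west"]
after_withdraw = best_route(remaining)

print(chosen.origin_pop, "->", after_withdraw.origin_pop)
```

Which PoP a given client reaches is decided entirely by other people's routers, which is where the "quirky" part comes from.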

In theory there's no difference between announcing a /24 IPv4 and asking us
all to accept it, and doing the same with a /48, but in practice you'll burn
an unacceptably large amount of IP space, IMHO, and if you tell ARIN that you
need a second /48 block because you burned your first with less than 25%
utilization, they simply won't let you have it.

Disagreed on your last point. The abundance of prefixes isn't the problem
IMHO; it's the acceptance of route announcements from upstreams. If upstreams
were to start accepting and passing along more granular routes, maybe we'll
start seeing more anycast deployments, but I doubt it.

~~~
clinta
Agreed that if doing TCP, you either need to limit the PoPs, or be doing it
with an app that is highly resilient and capable of dealing with unexpected
loss of a session and reconnections. The same is true of UDP. For stateless
short burst things like DNS and NTP it's not a big deal, but it is for video
conferencing and other stateful protocols built on top of UDP. It's not a
layer 4 issue as much as an application issue.

As to the second point, ARIN could not possibly require 25% utilization.
That's 3*10^23 addresses on a /48. Or if they're counting utilization as the
number of subnets, that's 16,384 /64 networks.
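
The arithmetic is easy to check. A quick sketch; the variable names are mine, and the only inputs are the 128-bit IPv6 address length and the /48 and /64 prefix lengths discussed above.

```python
PREFIX_LEN = 48

total_addresses = 2 ** (128 - PREFIX_LEN)  # 2^80 addresses in a /48
total_64s = 2 ** (64 - PREFIX_LEN)         # 2^16 /64 subnets in a /48

quarter_addresses = total_addresses // 4   # 25% of the addresses
quarter_64s = total_64s // 4               # 25% of the /64 subnets

print(f"addresses in a /48: {total_addresses:.2e}")
print(f"25% of addresses:   {quarter_addresses:.2e}")
print(f"/64s in a /48:      {total_64s}")
print(f"25% of /64s:        {quarter_64s}")
```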

Look at some sample companies, Valve has a /44 from ARIN. They're only even
announcing 6 /48 prefixes right now. Netflix has a /32, they're announcing
less than 20 /48 prefixes. I worked at a small MSP who provides ISP services
to less than 50 clients, and they were allocated a /32 years ago and have yet
to announce a single prefix.

Burning a /48 for an anycast network is not a big deal, and getting another
/48 is trivial. Getting a /24 for an IPv4 anycast network is much harder. So
yes, for IPv4 anycast to take off we need the ability to advertise longer
prefixes, but /48's are abundant enough that one can be sacrificed.

------
tcannon
I do it. Two data centers, east and west coast. For better or worse I peer
with the same providers on each side, which does less for your traffic than
it does for juggling ISP maintenance that might affect you. I've not heard a
single complaint due to anything relating to anycast.

Considering that you're getting a damn robust global scale load balancer for
free, and it mostly maintains itself -- it's hard to argue against it.

------
justizin
Super informative, love the detail in the slides!

Would be more impressed if the IPv6 plan wasn't:

    The plan consists of being dead by the time customers demand v6.

Why aren't these brave heroes of anycast leading the charge to ensure it works
with a protocol suite that is now something like a decade old?

I don't expect the story to be:

    We are just as confident with this as what we've been running for years on IPv4.

But, maybe:

    Except for the bits that we rely on custom hardware for, it's looking really promising.

Getting IPv4 allocation from ARIN, even if it is conserved by anycast, isn't
going to be possible forever. I've been meaning to get ahold of an ASN and a
small block for just this reason.

Therefore, the techniques outlined here may not be usable to new companies or
on new networks in a couple of years. Even if I can get IPv4 from my hosting
provider, I need ARIN/RIPE/etc. for anycast, afaik.

Still, great read! :)

~~~
namecast
It's a protocol layer thing, mostly down to "what prefix sizes can I
reasonably expect other providers to accept?"

For IPv4 the answer was a /24; for IPv6 the answer seems to be a /64 at this
point, and burning an entire /64 to get a couple of DNS servers (the typical
use case for anycast) to resolve slightly faster is overkill. It's also
probably a great way to annoy your RIR when they ask 'how the hell did you
burn through your initial IPv6 delegation with only two hosts' ;)

EDIT: I meant /48, not /64! Whoops-a-doodle.
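
The "what will other providers accept?" question boils down to a prefix-length filter. A minimal sketch, assuming the /24 (IPv4) and /48 (IPv6) limits mentioned in this thread; real filters differ per network, and `MAX_PREFIX_LEN` and `likely_accepted` are hypothetical names.

```python
import ipaddress

# Assumed longest prefixes that get accepted, per IP version.
MAX_PREFIX_LEN = {4: 24, 6: 48}


def likely_accepted(prefix: str) -> bool:
    """Return True if the prefix is no longer than the assumed limit."""
    net = ipaddress.ip_network(prefix)
    return net.prefixlen <= MAX_PREFIX_LEN[net.version]


# Documentation prefixes only.
for p in ["192.0.2.0/24", "192.0.2.0/25", "2001:db8::/48", "2001:db8::/64"]:
    print(p, likely_accepted(p))
```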

~~~
detaro
_It's also probably a great way to annoy your RIR when they ask 'how the hell
did you burn through your initial IPv6 delegation with only two hosts'_

Since /56 seems to be becoming the "easily available standard size" for
allocations, using a /64 doesn't seem like too much of an issue, especially
since larger orgs should easily get /48s? (But are /64 really accepted as
routes? I would have imagined that to be also limited to /56 or /60 or so)

~~~
namecast
Depends on who's doing the accepting ;) I actually meant a /48, mea culpa!

I'll leave this here:

https://labs.ripe.net/Members/dbayer/visibility-of-prefix-lengths

It's from 2010-ish but still sadly relevant.

~~~
detaro
Ok, yes, "wasting" a /48 is a problem ;)

