
IPv6 privacy addresses crashed the MIT CSAIL network - anderskaseorg
http://blog.bimajority.org/2014/09/05/the-network-nightmare-that-ate-my-week/
======
ay
The "issues with IPv6" are really issues of education, operation, and configuration.

I personally ran WiFi networks with 8000+ wireless clients on a _single_ /64
subnet (my employer's CiscoLive conference), and assisted/consulted in running
the networks with more than 25000 clients on a single /64 subnet (Mobile World
Congress).

The defaults kinda suck, and bugs may happen, but the statement "IPv6 is not
ready for production" is wrong.

I'd be happy to volunteer a reasonable amount of time, vendor-independent, to
work with the OP or others running a network of >1000 hosts to debug issues
like this, time permitting (glass houses and all that).

There are a bazillion knobs in IPv6, and a lot of things can be fixed just by
tweaking the defaults (which, again, kinda suck).

Networks of <500-700 nodes generally don't need to bother much. The defaults
are often not optimal, but they will work.
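To make the "knobs" point concrete: on Linux, for example, many of the relevant defaults are plain sysctls. This is a sketch only; the interface name `eth0` and the values shown are illustrative, not recommendations:

```shell
# Privacy (temporary) addresses: 0 = off, 1 = on, 2 = on and preferred.
sysctl -w net.ipv6.conf.eth0.use_tempaddr=2

# Cap how many autoconfigured addresses an interface may accumulate
# (the kernel default is 16).
sysctl -w net.ipv6.conf.eth0.max_addresses=16

# Whether to accept Router Advertisements at all.
sysctl -w net.ipv6.conf.eth0.accept_ra=1
```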

EDIT:

the seeming "charity" of volunteering my time isn't charity. I want to
understand what is broken, so we can bring it back to the IETF and get it
fixed, plus produce better educational material to prevent folks from shooting
themselves in the foot. The fixes will take another decade to make it into the
stacks, but they will. IPv6 is powering nontrivial millions of hosts _now_ \-
so the correct words to use are "needs tweaking for my network", not "not
ready for production". Let's see what the tweaks are and whether we can
incorporate them into the protocol, if necessary.

~~~
MichaelGG
Why add such knobs, if they weren't there in the first place, haven't been
tested, and obviously are causing nontrivial problems?

~~~
calinet6
It seems to me that the presence of a "bazillion knobs" is a bug in itself. A
stable system cannot depend on puny humans for its operation—nor should a
design be able to blame puny humans for its failure.

~~~
api
My philosophy for a long time has been "every tunable parameter is a design
weakness, and every one that must be tuned is a bug."

It's hard to achieve this in practice, but I keep it in mind as a goal.

~~~
pshc
The maxim "every tuneable parameter is a design weakness" applies for top-down
design, but not so much for something as foundational as the IP layer.

In the 70s, TCP's inventors imbued it with solid fundamentals, but could not
have anticipated how the protocol would need to evolve in the future w.r.t.
security and performance. For IPv6, there are unknowns that will only be
encountered at scale, and implementors are hugely motivated to play fast and
loose with the spec. (TCP Fast Open converts!)

Bottom-up systems have inherent wiggle room, and shipping them with sensible
defaults is only possible in the short term. That's not to say we should
design over-complicated systems, but just that fine tuners are not a weakness.

~~~
api
I sort of disagree.

Every tunable parameter ought to be _computable_. If it isn't, it's because I
haven't been smart enough to figure out how to auto-tune it.

At least that's the stick I bash myself over the head with. :)

------
smutticus
The title should probably read, "Bugs in JunOS caused network downtime."

This isn't really news. There are bugs in all routing and switching OS's.
That's why they hire support people. This isn't me trying to rag on Juniper. I
know lots of people who work for JTAC and they're incredibly smart folks. I'm
sure they'll get this sorted out and fixed in JunOS and, like any bug, merged
into upstream releases.

This isn't me trying to claim that IPv6 is infallible. There might be some
design choices the IETF made with IPv6 that were stupid, but mostly they got
it right, and it's too late now to change most of them anyway.

This is just reality with new software. New software has bugs.

You can blame it on MLD, but really MLD is no more complicated than IGMP. You
can blame it on NDP, but really NDP isn't much more complicated than ARP.

At a minimum, IPv4 required ARP to function. In reality, though, it also
required at least IGMPv2: without IGMP, or some other way to manage multicast,
how are you going to get something like VRRP to work? Link-layer multicast is
not new with IPv6.

~~~
ay
++.

A small nit to add: IPv4 did not have one or more multicast groups per host.
That's a dramatic difference in terms of capacity on the middle-gear, one
which escapes notice if one thinks of IPv6 as "IPv4 with bigger addresses".

~~~
smutticus
Thanks for the upvote :)

And you're right about the additional mcast addresses required by IPv6 NDP.
Each host must join a "solicited-node multicast address" for every unicast
address it configures.
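For the curious, the mapping from a unicast address to its solicited-node group is mechanical (RFC 4291): take ff02::1:ff00:0/104 and append the low 24 bits of the unicast address. A quick sketch:

```python
import ipaddress

def solicited_node(addr: str) -> str:
    """Solicited-node multicast group for a unicast IPv6 address
    (RFC 4291): ff02::1:ff00:0/104 plus the address's low 24 bits."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return str(ipaddress.IPv6Address(base | low24))

print(solicited_node("2001:db8::123:4567"))  # ff02::1:ff23:4567
```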

My historical guess about why multicast was chosen over broadcast for NDP is
that it was due to NBMA networks. In the 1990s, NBMA networks (Frame Relay)
were much more common than they are today, and NDP just makes more sense than
ARP over NBMA networks. This is just a guess.

Someone else suggested DOCSIS was the reason multicast was chosen instead of
broadcast. I doubt this. I'm not that familiar with DOCSIS, but I think it has
a broadcast-type link layer. Also, IPv6 predates DOCSIS.

It could well turn out that multicast was the right choice for NDP once we get
over the inevitable roll out problems. NBMA networks could return in 20-30
years. You never know.

~~~
ay
You're right, DOCSIS has nothing to do with it.

The "classic" rationale for choosing multicast over broadcast was to limit the
time hosts spend processing NDP traffic, compared with the time they spend
processing ARP broadcast traffic in IPv4. Fifteen years ago that took a
nontrivial amount of host resources, and with the way NDP constructs the
solicited-node multicast address, even if every packet is simply flooded on
the link, you can still filter out the packets that are not for you in NIC
hardware.

And since there are 16 million unique solicited-node multicast addresses, in
principle the scaling is pretty impressive.

Multicast is definitely a good choice in the long term, though the "here and
now" interaction with some protocols is a bit tricky - e.g. 802.11 WiFi
([http://tools.ietf.org/html/draft-vyncke-6man-mcast-not-efficient-01](http://tools.ietf.org/html/draft-vyncke-6man-mcast-not-efficient-01)
and [http://tools.ietf.org/html/draft-yourtchenko-colitti-nd-reduce-multicast-00](http://tools.ietf.org/html/draft-yourtchenko-colitti-nd-reduce-multicast-00))
\- though it's not the only offender and frequently not the biggest one
(service advertisements may consume a comparable or bigger amount of bandwidth
on a volatile network).

On NBMA: you can have such a network today in a public WiFi scenario, where
you do not want hosts talking directly to each other but do want them to
access the internet. In the wired case the moniker is "Private VLANs".

With IPv6 you can clear the on-link bit and make NBMA work quite elegantly.
But, depending on the exact details,
[http://tools.ietf.org/html/rfc4903](http://tools.ietf.org/html/rfc4903) does
list quite a few interesting challenges - quite an informative read.

BTW, if you are interested more in the "why" rather than just the "how" of
IPv6, take a look at this:
[http://www.internetsociety.org/deploy360/resources/ebook-ipv6-for-ipv4-experts-available-in-english-and-russian/](http://www.internetsociety.org/deploy360/resources/ebook-ipv6-for-ipv4-experts-available-in-english-and-russian/)

It's a gem: free, very good quality material, written as if you are co-
designing the protocol with the author - by solving various problems you see
on IPv4 networks, the protocol evolves into what IPv6 is today.

~~~
smutticus
Thanks again for the excellent response, and for the history lesson on why
multicast was chosen over broadcast for NDP.

I've done tons of work with private VLANs over the years. I was part of a
months-long effort at a vendor whose sole focus was finding and documenting
bugs in private VLANs.

Residential ISPs love private VLANs because they prevent different customers
from seeing each other's broadcast traffic. DHCP snooping to prevent things
like ARP spoofing, combined with private VLANs to limit broadcast traffic,
will mostly lock things down pretty well.

It's funny you should provide a link to a resource from the Deploy360
Programme. I just finished doing a stint with the Deploy360 Programme as a
writer working on, among other things, IPv6 resources. My real name is Andrew
McConachie.

~~~
ay
Very nice to meet you!

Totally agreed on the residential ISPs loving PVLANs.

And thanks for your work with Deploy360 and for helping to get better
information to people!

------
praseodym
We've also hit this Intel Ethernet driver bug, even though we don't have IPv6
deployed. Linux will send MLD packets on bridged ports by default, triggering
the Intel driver bug on Windows machines.

With only two Windows machines saturating their Gigabit Ethernet connection
whenever they went into standby, we managed to crash the university's switches
big time (we're a group with our own VLAN within the university's network, so
we make use of their network equipment).

Naturally, because the issue only occurs during standby, and users usually
don't log off (which prevents Windows from sleeping), we first hit the bug
during the Christmas holidays (2013), even though the culprit hosts had been
in use for a couple of months by then. In the end, it took a couple of hours
to reproduce the bug during working hours!

We fixed it by using different NICs (we didn't want to rely on the Intel
driver being updated after a clean install; Windows Update doesn't have the
fixed version), and by disabling MLD snooping on the Linux hosts, since we
aren't using IPv6 yet anyway. This prevents the Intel bug from being triggered
in our environment.
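For what it's worth, on a Linux bridge the multicast (IGMP/MLD) snooping behavior is exposed as a sysfs knob; `br0` here is a hypothetical bridge name, and this is a sketch only:

```shell
# Turn off multicast snooping on bridge br0; the bridge then floods
# multicast out all ports instead of tracking group membership.
echo 0 > /sys/class/net/br0/bridge/multicast_snooping
```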

------
tgflynn
As someone watching from the sidelines, I had no idea there were such major
issues with IPv6. IPv6 has been out there for a long time (about 10 years) in
terms of being supported by OSes and networking hardware, if not ISPs. So I
would have thought that cutting-edge institutions (like MIT) would already
have years of experience with it and would have worked out most of the kinks
by now.

If this is not the case, what does it mean for more widespread IPv6 adoption?
If such adoption is significantly delayed or stalled, what will the
consequences be, both for current Internet growth in the face of IPv4 address
depletion and for new technologies like IoT?

~~~
ChuckMcM
I haven't been on the front lines of new protocol deployment for a long time
now, but the pattern then (and it appears unchanged) was that larger
deployments brought out 2nd and 3rd order issues with the protocol. The old
joke was "How can you tell someone is a pioneer?" answer, "Count the number of
arrows in their back." which expressed that folks who adopted new protocols
bore much of the burden of their failure and revision. Sounds like CSAIL has
made some great progress in this respect.

~~~
ay
A very astute observation!

But...

There are already quite large enterprise and service provider networks using
IPv6. It's somewhere between "the pioneers did not document the thorns they
hit so that others could avoid them" and "the future is not evenly distributed
yet, so we don't know about it".

I've volunteered myself to figure out which of the two it is, and to do
whatever is actionable.

------
AaronFriel

        I used Ubuntu as an example, but it is hardly the worst offender. We have seen
        Windows machines with more than 300 IPv6 addresses
    

Wow! I don't operate a very large network, but I do operate an IPv6 network
and I've never seen one of our machines use more than 2 addresses. I feel like
they've got some other configuration option or oddity going on that's causing
a lot of these problems, but I am guessing they're much smarter than I am, so
I don't know what to say.

Could someone elaborate on this? I've never seen this behavior on an IPv6
network, and I'm just running a server or two with radvd and no custom switch
configuration.

~~~
rjsw
Do you use privacy addresses ?

I don't, the /64 is identifiable as "my house" so there is no reason to hide
which particular machine is doing something.

~~~
mgbmtl
A house may have multiple people living there, multiple devices, visitors,
etc.

Without privacy addresses, you could track a person across multiple locations
via the MAC address used to derive their IP: if I connect with my laptop from
various places, the interface part of my address stays roughly the same; only
the network prefix changes.

~~~
rjsw
A house could have multiple users, mine doesn't.

I would maybe see turning on privacy addresses as part of a general "leaving
home" script that also turned on other stuff like the firewall.

~~~
ay
You are completely right. Privacy IPv6 addresses are an illusion when used in
a household that effectively has one active /64 per subscriber.

(To those who say "but IPv4...": the last time my IPv4 address changed was
half a year ago.) Anyway, this is mostly a political give-in, one that happens
to also help ruggedize the rest of the stack (the relatively rapid change of
privacy addresses uncovers more bugs than we'd otherwise find - so from that
standpoint they're to be advocated). But privacy... huh.

At the last IETF there was precisely this discussion: one might want to force
a new /64 on themselves, and privacy addresses have nothing to do with that.
(DHC WG, FWIW.)

~~~
X-Istence
Comcast is more than willing to send you more than one /64 prefix if you ask
for it!

~~~
ay
"Effectively an active /64". Comcast folks did a ton of testing with the CPEs,
and very, very few could ask for anything other than a single active /64 -
that's why the qualifier :-)

There's active (and pretty cool) work in the HomeNet IETF workgroup, with
running code and all
([http://www.homewrt.org/doku.php](http://www.homewrt.org/doku.php)) to make
the multi-subnet home network a sane reality.

But a different /64 from within the same /56 doesn't help much, and indeed, to
correctly reflect the spirit of the discussion in the DHC working group, one
would say: "I want to be able to press a button, release my currently used
allocation, and get a completely new one, be it a /48, /56 or whatever".

Thanks for the correction!

------
mrb
An issue glossed over by people is:

 _" the entire TCAM space dedicated to IPv6 multicast is only 3,000 entries"_

Some mid-level switches normally have TCAM space for hundreds of thousands, or
millions, of entries, IPv4 or IPv6. Maybe their vendor artificially crippled
their line of switches, or maybe the switches were deployed with a
configuration error. It is probably the former, though. Network vendors like
to make you believe some features cost _a lot_ to implement and that you
_really_ need their highest-end gear, when in fact even the biggest TCAM in
silicon costs a few tens of dollars, at most.

------
ghshephard
Nobody on this thread really seems to be talking about one of the issues
brought up in the IPv6 analysis (though I'm not sure if it caused their
outages - any time I read a post mortem with the phrase "bridge loops" I
usually don't look any further; that alone is enough to bring a network down).

If I read the post correctly, one of the roots of their problems is that
either (A) multicast packets were flooded across their network, causing excess
traffic, or (B) if they used MLD snooping to reduce the flooding, their
switches only support 3,000 entries for multicast groups - quickly exceeded by
the privacy IPv6 addresses generated by the hosts (each address creates its
own multicast entry, and some of their hosts had 10+ addresses).

Other than turning off privacy-based IPv6 addresses and moving to something
like RFC 7217, is there a solution? Increasing the number of multicast entries
on the switch to something larger, say around 30,000 entries, combined with
reducing the length of time for which a privacy address is valid (and
therefore requires a multicast group) from one day to, say, one hour?
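On the lifetime-reduction idea: on Linux hosts, at least, the privacy-address lifetimes are tunable sysctls (values in seconds; `eth0` and the exact figures are illustrative, not a recommendation):

```shell
# Default preferred lifetime is one day (86400 s); shorten to one hour.
# (Note: the kernel really does spell it "prefered".)
sysctl -w net.ipv6.conf.eth0.temp_prefered_lft=3600

# Default valid lifetime is one week (604800 s); shorten it too, so old
# temporary addresses (and their multicast groups) expire sooner.
sysctl -w net.ipv6.conf.eth0.temp_valid_lft=7200
```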

------
MichaelGG
What's the reasoning for dropping ARP? It seemed like a simple architecture.
The post seems to indicate IPv6 requires a ton more hardware resources. And if
Juniper doesn't have a basic feature like MLD snooping working after all this
time, uh? Shouldn't the practicality of designing a high-volume switch be part
of creating such a fundamental protocol? (I know that requiring two elegant
implementations of other protocols would have fixed a ton of things in nasty
protocols like HTTP - dumbass things like line folding and comments-in-
headers.)

Is this a case of idiocy seeping through the IETF because they can? It's
pretty easy to write something down on paper if you don't have to implement
engineering and product management on the result. Or because you're out of
touch with reality, like the source routing feature which was kept in IPv6
despite it only ever being a problem in IPv4? Or is this a case of the
protocol being superior and vendors just being very lazy?

~~~
akira2501
It seems like they really wanted to push multicast as a "first-class" mode of
the protocol, and by weaving it into the core of IPv6 they've forced everyone
to have a somewhat exercised implementation of it in their stack.

~~~
MichaelGG
So pushing something no one uses, that has significantly narrow uses, and
still isn't deployed publicly, into the core... How is that anything but self-
serving protocol design? How do they get away with that?

~~~
FeepingCreature
I'm a private customer, and I use IPv4 Multicast. My ISP uses it to distribute
IPTV streams.

Don't generalize from "I don't know anybody who uses" to "nobody uses".

~~~
jewel
A good portion of the bandwidth being used on the Internet could be done over
multicast if it were widely supported.

For example, popular content on Hulu could be multicast. You'd download the
beginning of the video over HTTP but simultaneously tune into the most
recently started multicast stream. Once the two streams meet, you drop the
HTTP connection. If widely deployed, this would reduce costs for both Hulu and
ISPs, and in places where multicast doesn't work it can fall back to HTTP.

------
p1mrx
Here's a proposed algorithm for making privacy addresses more manageable:

[http://tools.ietf.org/html/rfc7217](http://tools.ietf.org/html/rfc7217)

Essentially, the suffix is hash(secret | prefix), so your address is stable on
a given network, but changes as you roam between networks.
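A rough sketch of the idea (not the exact RFC 7217 function - the RFC leaves the PRF and input encoding to the implementation, and the interface name and secret here are placeholders):

```python
import hashlib
import ipaddress

def stable_iid(prefix: str, ifname: str, secret: bytes, dad_counter: int = 0) -> str:
    """RFC 7217-style stable-privacy address: the interface identifier
    is a PRF over (prefix, interface name, DAD counter, secret), so it
    is stable on a given network but differs across networks."""
    net = ipaddress.IPv6Network(prefix)
    data = (net.network_address.packed[:8]   # the 64-bit prefix
            + ifname.encode()
            + dad_counter.to_bytes(1, "big")
            + secret)
    iid = int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
    return str(ipaddress.IPv6Address(int(net.network_address) | iid))

home = stable_iid("2001:db8:1::/64", "eth0", b"per-host secret")
work = stable_iid("2001:db8:2::/64", "eth0", b"per-host secret")
# Same prefix always yields the same address; a new prefix yields a new one.
```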

~~~
nly
A small-domain encryption scheme, rather than a hash, would make more sense...
that way there'd be no collisions. SDE isn't that much harder to put together
in hardware.

~~~
p1mrx
The subnet space is 64 bits long. Collisions between pseudo-random values only
become likely when you have billions of nodes, so it's not worth the effort to
preemptively avoid them.

This would also require all nodes to implement the same algorithm, which means
it's not incrementally deployable.

Edit: The actual RFC 7217 algorithm includes a DAD_Counter field in the hash
input. In the event of a collision, the counter increments, generating a new
address.

------
spindritf
_the random address is changed regularly, typically daily, but the old random
addresses are kept around for a fairly long time_

I don't understand this part. I have Ubuntu machines in a network which is
technically a /48, but only one /64 prefix is announced by radvd, and they all
have only two addresses: one derived from the MAC and one private/random
address that changes over time. They certainly never have eight.

Are those previous addresses not visible in ifconfig or ip -6 addr show?

~~~
justincormack
I think this may have changed (in 14.04?), as I remember having more addresses
in the past but don't now.

~~~
spindritf
I have 14.04s and OP talks about 14.04s.

 _Thus, a typical machine — say, an Ubuntu 14.04 workstation with the default
configuration_

------
s_q_b
This is now the second major network I've heard of that ran into this exact
problem. An update to Windows caused the end-user nodes to send out lots of
IPv6 packets, the access-layer switches went to full CPU utilization, and you
ended up with packet storms across the network. There really should be an
advisory about this.

------
walshemj
It would have been interesting to see a network diagram.

------
acd
I'd advocate the engineering principle of fault domains: isolate the problem
of L2 broadcasts by putting L3 routers between the L2 domains.

------
windexh8er
Wow.

I have no idea where to even start - this article was written by someone who
has no large-scale IPv6 deployment experience. There are back-to-back errors
in the assumptions, in the expected results, and in the assertions about both
the vendor (Juniper) and the protocol's operation (IPv6).

I'm not surprised that it's towards the top of HN but it shows the relative
understanding of the HN crowd with regard to complex network related topics.

~~~
ghshephard
I'm pretty familiar with "complex network related topics" \- and I think this
is the most interesting IPv6-related post on HN in 2+ years (there may have
been others that got by me).

I'm curious - are you someone with "large scale IPv6 deployment experience"?
In particular, how would you have approached their issues with MLD snooping
and TCAM exhaustion?

~~~
windexh8er
It may be interesting, but my point was that there is nothing to learn from
this article, with the exception of problematic code from a network vendor.

Yes, I am. I deployed a four-state IPv6 overlay serving 250k subscribers on
one of the very first (fully rolled out) DOCSIS 3 HFC networks in the US. I
was responsible for the security and performance of the architecture. The
rollout started in 2010. The TCAM exhaustion was self-inflicted, based on the
hints of the design throughout the article, and the original author's
understanding of MLD is, unfortunately, just generally incorrect.

