

Juniper bug takes down core Internet routers around the world - webnzi
http://www.silicon.com/technology/networks/2011/11/07/global-outage-takes-down-sites-and-services-across-the-internet-39748193/

======
eyeareque
You can't really blame Juniper for these crashes. They released a fix for the
issue months ago. These customers must not have upgraded. See the Juniper
software alert here: <http://pastebin.com/HBWiH92j>

~~~
trout
Can't really blame a manufacturer for a bug that lets easily passed BGP
parameters crash routers? It's not possible that it's their fault this wasn't
caught in dev-test?

I've seen many environments where it's a multi-year process to roll out new
code, particularly in service providers. I wouldn't classify asking a service
provider to change all their code in a matter of months a reasonable request.
The testing they do before rolling out code is very time-consuming. It's not
uncommon for service providers to run 3-to-5-year-old code.

~~~
eyeareque
Running 3-5 year old code shows that the provider/company probably doesn't
prioritize or fund testing as well as they should IMO. I'm sure Juniper would
work with any company who asked for a patch or custom workaround if they
cannot upgrade now.

I see your point, Juniper missed a serious bug in their testing. But you can't
hold them completely accountable when they've already announced and released a
fix that corrects the issue.

~~~
marshray
The problem is, by releasing the patch they alert attackers to the existence
of the bug. I don't know this to be the case here, but for things like Windows
vulnerabilities the time between releasing the patch and it being exploited in
the wild is only a few hours.

For systems (like core routers) that are simultaneously too critical and too
available to permit timely maintenance cycles, the only solution is to not
have any bugs ever.

~~~
baq
good luck with that.

if something is too critical to be taken offline, it should have a hot
standby, right?

------
pcvarmint
It was infected with Bob Muglitis!!!

------
zdw
The frailties of monoculture are well known.

This is one of the reasons I'm hesitant to embrace the various NoSQL options
at this time - often there's only one implementation of an API, and it's tied
to that code.

Compare this to the various message queueing options that all support STOMP or
AMQP, or programming languages that have multiple implementations.

Networking needs to define a format spec for routing and switching, and then
have vendors meet the spec. Fortunately we should be getting something like
this with software defined networking projects like OpenFlow.

~~~
davidu
"Networking needs to define a format spec for routing and switching, and then
have vendors meet the spec."

Please check out the IETF (www.ietf.org) -- This is exactly how it works.

But BGP has no security, is complicated from an implementation standpoint, and
you are right, there is a bit of a software duoculture. Juniper and Cisco.
That's it.

This has happened before... too bad the routers didn't crash BEFORE
propagating the bad BGP updates. :-)

~~~
smu
I'm just wondering, is Alcatel-Lucent still a player or are they no longer
relevant?

~~~
muppetman
They still make some great gear, as do Redback (now Ericsson) and a bunch of
others. But for direct, Internet-facing devices that manage the full global
routing table, the preferred option is still Cisco or Juniper.

