
What I've learned about scaling OSPF in Datacenters - signa11
https://elegantnetwork.github.io/posts/What-Ive-learned-about-OSPF/
======
vii
This article challenges partisans in the BGP vs OSPF debate, but it goes
deeper than just pointing out that the choice of wire protocol is fairly
superficial. Obviously, at scale you have to control OSPF to avoid update
floods. But BGP also needs to be controlled so that bad entries aren't trusted
and propagated; for example:
https://blog.cloudflare.com/how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-today/
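
To make "don't blindly trust and propagate" concrete, here is a minimal
Python sketch of the kind of sanity checks a router or route server can apply
before accepting an announcement - a cap on prefix specificity and a per-peer
max-prefix limit. The function name and thresholds are mine, purely
illustrative, not any particular vendor's implementation:

    import ipaddress

    # Illustrative thresholds only; real deployments tune these per peer.
    MAX_PREFIX_LEN = 24            # reject IPv4 routes more specific than /24
    MAX_PREFIXES_PER_PEER = 10000  # stop trusting a peer that floods us

    def accept_announcement(prefix: str, peer_prefix_count: int) -> bool:
        """Decide whether to accept (and re-advertise) a BGP announcement."""
        net = ipaddress.ip_network(prefix, strict=False)
        if net.version == 4 and net.prefixlen > MAX_PREFIX_LEN:
            return False  # leaked more-specifics get dropped, not propagated
        if peer_prefix_count > MAX_PREFIXES_PER_PEER:
            return False  # peer exceeded its prefix budget
        return True

    print(accept_announcement("104.16.0.0/12", peer_prefix_count=800))    # True
    print(accept_announcement("104.20.1.128/25", peer_prefix_count=800))  # False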

The deeper message is in the linked post,
https://elegantnetwork.github.io/posts/Network-Validation-with-Vagrant/,
which is about simulating infrastructure. This is so important for
large-scale systems, because complex aggregate behaviour easily emerges from
myopic error-handling strategies that only take into account the local view of
each actor. This is true for any large-scale system, not just networks!

The wire protocol is just the surface; what matters is what information is
propagated and what lies underneath. Simulation illuminates these depths.
Whiteboard discussions are great for debugging and communicating specific
cases - for example, the aggregate response to locally sensible retry backoff
rules can cascade into catastrophe when many actors follow the same process.
However, it's hard for human beings to find these cases without the help of
tools. Tools are so important!
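
As a toy illustration of that backoff point (not from the article; the
capacity, client count, and the rule that an overloaded tick serves nobody are
all invented simplifications): many clients retrying against one server with
identical deterministic backoff stay synchronised and never drain, while
adding jitter breaks the herd up.

    import random

    CAPACITY = 20     # requests the server can absorb per tick (assumed)
    CLIENTS = 500     # clients that each need one successful request
    MAX_TICKS = 2000

    def simulate(jitter: bool) -> int:
        """Return how many clients are still retrying after MAX_TICKS."""
        # pending[tick] -> list of attempt counts arriving at that tick
        pending = {0: [0] * CLIENTS}
        for tick in range(MAX_TICKS):
            arrivals = pending.pop(tick, [])
            if not arrivals or len(arrivals) <= CAPACITY:
                continue  # an uncongested tick serves everyone in it
            # Toy congestion-collapse rule: an overloaded tick serves no one.
            for attempt in arrivals:
                backoff = min(2 ** (attempt + 1), 64)
                delay = random.randint(1, backoff) if jitter else backoff
                pending.setdefault(tick + delay, []).append(attempt + 1)
        return sum(len(reqs) for reqs in pending.values())

    random.seed(1)
    print("still stuck with fixed backoff   :", simulate(jitter=False))
    print("still stuck with jittered backoff:", simulate(jitter=True))

Locally each client is doing something reasonable; the catastrophe only shows
up in the aggregate, which is exactly the kind of behaviour a simulation
surfaces and a whiteboard argument tends to miss.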

~~~
tptacek
The BGP problem you're linking to stems from BGP being run at Internet scale
by diverse teams, not from its use as an interior routing protocol.

~~~
vii
Yes.

I linked to it because it touches on BGP optimisation techniques and
illustrates the dangers of automatically passing on each update.

Is there a better link to a public postmortem that illustrates these problems
in a datacenter context? When we were operating the Facebook Moonshot project,
we found that even internally, in controlled datacenters at a scale smaller
than Amazon's, there is plenty of diversity.

------
karambahh
>I think whiteboards are the most important tool for network design (...) I
can’t even tell you the number of disasters averted by 2-3 great network
engineers arguing over a whiteboard.

I strongly agree with this statement and think it holds true for a lot of
IT/dev/ops fields.

That's something I miss when working remotely. Has anyone found a tool that is
seamless enough to allow for true interactivity, as much as being in the same
room and sharing a whiteboard?

~~~
apple4ever
Absolutely agree as well.

I've solved a lot of IT problems with a whiteboard, from network engineering
to server engineering to software engineering.

------
m-app
> If you are running a small Clos network and don’t have IPv6 or EVPN [..]

... seem like two relevant qualifiers to choose BGP over OSPF.

Other than that, being religious about a certain technology does indeed seem
like a bad mindset to hold on to. When you reach a certain scale, it starts to
make sense to question whether the market's primary technology is efficient
for your specific use case.

Edit: essential qualifier for the mentioned mindset added

------
traceroute66
I would suggest the author also left out the consideration of skillset.

In the sort of environment described, it's pretty much guaranteed you'll
already be managing eBGP sessions to the outside world.

So why not capitalise on the BGP knowledge you have and use it internally too,
instead of bringing another protocol (OSPF) into the mix?

Jack of all trades, master of none, as the old saying goes.

~~~
windexh8er
Most network engineers are very well versed in OSPF. I came up through the
ranks of Cisco's early NetAcad curriculum in the early 2000s, when BGP was
very rarely used in internal networks, and the content was 80% focused on IGPs
and large L2 design and troubleshooting. It seems to me the IGP has fallen out
of favor partly because simplicity wins out over the right tool in the right
context. I'm sure, as you state, part of that has to do with the responsible
engineers lacking experience and shying away. But in many cases BGP can be
slower and less forgiving, where a focused IGP can add value and, most
importantly, control.

I was part of a team running a multi-state network for an ISP around a decade
ago. Even then we had a lot of design and architecture reviews about spreading
BGP as far and wide as possible. But the IGPs, at the regional node level
(HFC-type networks), always had functionality for end-user provisioning that
let us provision and troubleshoot more easily.

While I'm neither agreeing nor disagreeing that BGP can be the right solution,
what it really comes down to is understanding the use cases relative to the
protocols. I find that most engineers today only loosely understand BGP and
OSPF. Fully vetted network engineers seem to be fewer and farther between. I
moved away from that role years ago, but my experience in large networks and
my understanding of those protocols, although admittedly not as deep as when I
was studying for the CCIE labs, has been a fundamental skill I still use day
in and day out. And as we move more toward software-defined overlays, I find
it coming up more and more these days.

------
jlgaddis
Is this why AWS doesn't support IPv6? Because they're running OSPF (v2) across
their datacenters?

After all the work they (apparently) had to do to get OSPF to work, I can
understand why they'd not want to repeat all of that to test and validate
OSPFv3 too.

---

For the uninitiated, OSPFv2 _only_ supports IPv4. IPv6 would also require
deployment of OSPFv3. BGP, which is typically used in these situations,
supports both IPv4 and IPv6 (and many other features which are often desired).

~~~
jvolkman
AWS does support IPv6.

~~~
jlgaddis
That's a relatively recent change, no?

