Do we really need network automation? (mirceaulinic.net)
22 points by mirceaulinic 66 days ago | 8 comments

Back when I led Network Engineering at Square, we had a global network of production datacenters and offices. Nothing on the scale of Cloudflare, but we still had many hundreds of devices. They had all been built manually with copied-and-pasted configs. The trouble was that this made changes risky and terrifying, because there could be subtle inconsistencies between sites, or huge differences. So it was always very challenging to reason about the impact of a given change.

Thanks to a ton of grit by the team, and the insistence of one engineer in particular, we built a config management system and started tracking the percentage of our global network config that it managed.

That metric was regularly presented at the VP level to hold us accountable for getting the percentage to 100.
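As an illustration only (this is not the system described above, whose internals aren't given), a coverage metric of this kind could be computed roughly as follows. The `Device` type, its fields, the device names, and the choice to count config lines are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    total_lines: int    # lines in the device's running config (assumed unit)
    managed_lines: int  # lines produced by the config management system


def managed_percent(devices):
    """Return the percentage of config lines under management across all devices."""
    total = sum(d.total_lines for d in devices)
    if total == 0:
        return 0.0  # an empty fleet has nothing to manage
    managed = sum(d.managed_lines for d in devices)
    return 100.0 * managed / total


# Hypothetical fleet: one fully managed device, one half managed.
fleet = [
    Device("edge-sw-01", total_lines=400, managed_lines=400),
    Device("edge-sw-02", total_lines=600, managed_lines=300),
]
print(f"{managed_percent(fleet):.1f}% managed")
```

Weighting by lines rather than by device count means a large, mostly hand-built config drags the number down more than a small one, which is one plausible way to keep the metric honest.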

It was months and months of boring work to remove inconsistencies and templatize configs. But in the end, I believe it resulted in a much more reliable and ultimately safer network to operate. I'm also happy that my management chain saw the value in this work.
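To make "templatize configs" and "remove inconsistencies" concrete, here is a minimal sketch using only the Python standard library. It is not the tool the team wrote; the template contents, variable names, hostnames, and addresses are hypothetical, and the config syntax is only a generic IOS-like stand-in.

```python
import difflib
from string import Template

# Hypothetical per-site template; real configs would be far larger.
SITE_TEMPLATE = Template(
    "hostname $hostname\n"
    "ntp server $ntp_server\n"
    "logging host $syslog_host\n"
)


def render(site_vars):
    """Render the intended config for one device from its variables."""
    return SITE_TEMPLATE.substitute(site_vars)


def drift(rendered, running):
    """Return unified-diff lines between the intended and the running config."""
    return list(difflib.unified_diff(
        rendered.splitlines(), running.splitlines(),
        fromfile="template", tofile="running", lineterm=""))


site = {"hostname": "sfo-sw-01", "ntp_server": "10.0.0.1", "syslog_host": "10.0.0.2"}
expected = render(site)
# A hand-edited device whose NTP server differs from the template.
running = "hostname sfo-sw-01\nntp server 10.9.9.9\nlogging host 10.0.0.2\n"

for line in drift(expected, running):
    print(line)
```

An empty diff means the device matches its template; any remaining lines are exactly the kind of per-site inconsistency that has to be resolved before a device can count as fully managed.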

I'm quite proud of the work the team did.

Some side benefits were that once we started going through audits like SOC2, we had a really good story to tell about how we reviewed and pushed changes to production.

What did you use to automate it? Ansible, or something else? Thanks!

We wrote our own thing because, at the time, not everyone was doing that with Ansible.

I really like the way Avaya and Cisco DNA do automation. It's all centrally managed and it uses VXLAN, so no VLANs have to be present. Just say you want VLAN 20 here and here, and it takes care of the rest.

Avaya's previous network architecture (their networking business is now owned by Extreme) wasn't at all centrally managed. I'd go as far as to say SPBM is the antithesis of centralized network management. It didn't use VXLAN either.

So you wanna SDN.

What open source tools scale easily to manage 100,000 devices or more via CLI/SSH?

From what I've heard (I've never had that many devices to manage myself), Salt can scale nicely when managing a very large number of devices. See for example this story from LinkedIn: https://s.saltstack.com/saltstack-at-web-scale-better-strong.... However, they're using it in the "classical" topology, which is agent-based. For an agentless setup, you may want to look into Salt-SSH: https://docs.saltstack.com/en/latest/topics/ssh/. Hope this helps!
