Hacker News new | comments | show | ask | jobs | submit login

Never mind when you start adding fault-tolerance and failover setups/configs to the mix.

It gets really complex really fast, and even the slightest lapse in attention to extreme detail can be disastrous and a nightmare to troubleshoot.

For instance, I once had a router reboot for some unknown reason, and the firmware/config wasn't properly flashed, so it reverted to the last point release of the firmware. That was enough to cause the failover heartbeat to be constantly triggered, and the master/slave routers just kept failing over to each other and corrupted all the ARP caches. Makes for a lot of fun when things work and then stop working in an inconsistent manner.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact