That packet looks familiar, and that one, and that one (rachelbythebay.com)
10 points by zdw 3 days ago | 1 comment

At an internship we had this exact problem. In our case spanning tree had been disabled: with it on, new hosts took half an hour to show up on the network, and instead of solving that properly someone simply turned it off.

The way this eventually got diagnosed was by the activity LEDs on the switches. Of course on a busy network you expect blinkenlights, but every activity light on every port blinking at full tilt is fairly suspicious. You'd at the very least expect a few ports to be idle just by chance.

From there the conclusion "something is generating a crazy amount of packets" was fairly simple to reach. Less simple was finding which piece of equipment it was. In the end we did a binary search by unplugging network cables; the network was down anyway, so it couldn't hurt. That narrowed it down to a building real quick, and to a specific office not long after. A short trip to the network port that seemed to cause the traffic, and we found a really cheap switch in the corner with a patch cable forming a loop. Someone had just moved offices and plugged everything in, including that loop. Problem solved!
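For anyone who wants the binary search spelled out, here is a minimal sketch in Python. It assumes exactly one segment is causing the storm and that you can tell whether the storm is still going with only some segments plugged in. The names `find_faulty_segment` and `storm_persists`, and the building labels, are made up for illustration; in real life the "probe" is a person pulling cables and looking at the LEDs.

```python
from typing import Callable, List, Sequence


def find_faulty_segment(
    segments: Sequence[str],
    storm_persists: Callable[[Sequence[str]], bool],
) -> str:
    """Return the single segment responsible for the storm.

    storm_persists(connected) is a hypothetical probe: it reports whether
    the storm is still visible when only `connected` is plugged in.
    Assumes exactly one segment is at fault.
    """
    candidates = list(segments)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Plug in only the first half. If the storm continues, the culprit
        # is in that half; otherwise it must be in the other half.
        if storm_persists(first_half):
            candidates = first_half
        else:
            candidates = candidates[mid:]
    return candidates[0]


# Simulated example: the loop sits in "building-d" (hypothetical label).
segments = ["building-a", "building-b", "building-c", "building-d", "building-e"]
culprit = "building-d"
probes: List[List[str]] = []


def storm_persists(connected: Sequence[str]) -> bool:
    probes.append(list(connected))  # record each round of unplugging
    return culprit in connected


found = find_faulty_segment(segments, storm_persists)
print(found, "found after", len(probes), "probes")
```

Each round halves the set of suspects, so the number of rounds grows with the logarithm of the number of segments rather than with the number of cables. The same function can be reused one level down, with the offices or ports inside the building that was found.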

Of course we ran into way more problems with this setup later. At some point, after a new set of computers was installed, the switches started randomly rebooting, one after the other in a cascade. New switches were bought for a single rack without anyone really diagnosing the issue, and the new switches had the exact same problem. Turns out switches have limited memory for their forwarding (MAC address) tables, and having too many machines on the network will balloon the table to the point of the switch simply giving up. There was a critical number of computers that could be on at the same time, at which point booting up one more would send the switches into a reboot loop. As I was leaving the internship VLANs were finally being set up, which I assume solved the issue in its entirety.

I haven't got much of a conclusion to this story. It was a good internship: I learned a lot and had quite some autonomy in doing my job. I had great colleagues and a lovely time. I do suspect that if I had gotten an internship at a place with more IT experience I wouldn't have learned nearly as much as I did here.

If someone can point me to some resources about how to conclude stories that'd be great, but that's neither here nor there.
