I'm sure there's a cute name for this specific variety of irony, but I don't know what it is.
Not really, unless that someone doesn't know about the structure of networks. My first two hypotheses after reading the problem description were routing anomalies and timeouts.
That other story about the car and vanilla ice cream, on the other hand, is a great example of unintended correlation != causation.
The relationship didn't last, so the moral of the story ended up being always trust software defects when it comes to relationships. This turned out to be especially true later when a pretty egregious Myspace bug introduced me to my wife :)
Printers are such unimaginable sources of pain, such a crippling tax on everyone forced to interact with them in any way, that it's hard to imagine anyone working to produce them in any standard, structured way.
Well either you've a severely lacking imagination or have had a really comfortable life. I hope for the latter :)
Kidding though, yeah, printers are frustrating, and cobbling together a printer-fax-copier-scanner into one device is even more terrible. HP was really serious about process though! The test group was literally bigger (more people) than the dev team, probably by a factor of almost 2.
HP was really big on Model Based Testing, and we read books and maybe even had a class on it when joining. The QA team was all in all pretty high functioning, if memory serves, but the products would still get shipped with four figures of open bug reports (I am not kidding or exaggerating).
The sort of bugs at the start of the cycle were downright terrifying, too. If your printer has never caught fire, or vomited all of its ink on your desk in the middle of the night... thank your friendly HP QA employee!
Mostly in my era the problems with the products were that they were done on a pretty tight schedule (6-12 months) for what tended to amount to more complex/new hardware than folks would initially expect. The testing cycle also started pretty late because the initial "get it barely working" cycle ate up a couple-few months, and the non-firmware software (drivers and crapware utilities) got very little attention from the A-players on the team.
More testing: not just this file, some other file transfers from unrelated places as well - and always at the same places within the file (but different across different files).
Take a tcpdump on both sides - always the same segment of data within each of the file transfers does not make it through the PIX. Take the offending segment and convert it into a small file of its own - this file is impossible to download through the PIX; it gets dropped.
Needless to say, this all was observable only on that particular setup - not in the lab.
Finally I noticed that the CRC error counter on the inbound interface increments by one every time I try to push through the offending small file.
Replacing the Ethernet cable connecting that interface solved the problem with all of the "hanging" transfers.
We did not do any further research into the root cause (the user did not want to put back the previous cable), but the working theory was that the initial cable was badly made, but not bad enough to fail outright - and the fault only showed up on particular sequences of data.
Given that neither 10BaseT nor 100BaseT used scrambling, this seemed plausible enough of a theory, but was quite fascinating nonetheless.
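For anyone wondering why a marginal cable shows up as a hang rather than as a corrupted download, here is a minimal sketch. It assumes the receiver checks an Ethernet-style CRC-32 over the frame and silently discards anything that fails (which is what a CRC error counter is counting). `zlib.crc32` uses the same CRC-32 polynomial as the Ethernet FCS; on-the-wire bit ordering and framing are ignored, and the payload and `frame_ok` are made up for illustration.

```python
import zlib

def frame_ok(payload: bytes, fcs: int) -> bool:
    """Hypothetical receiver check: accept only if the CRC-32 matches."""
    return zlib.crc32(payload) == fcs

# Sender computes the frame check sequence over the payload.
payload = b"offending segment of the file"
fcs = zlib.crc32(payload)

# The cable flips a single bit in transit.
damaged = bytearray(payload)
damaged[5] ^= 0x01

print(frame_ok(payload, fcs), frame_ok(bytes(damaged), fcs))  # prints: True False
```

If the same byte pattern is damaged every time, every TCP retransmission of that segment fails the check too, so the transfer stalls at the same offset instead of completing with bad data.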
1) Problem report was: "I can login to remote host fine, but my interactive session hangs when I type 'ls -l'." Some diagnosis indicated the hanging only occurred when packets were above a certain size. This only happened when a command was run that had a large amount of streaming output. Turned out our provider's frame relay network had become misconfigured. Had to reduce the MTU at both ends as a workaround until I was able to convince our provider that something was wrong with their frame relay.
2) My home Linux box unable to talk to a Verizon pager gateway, but my Mac was fine. Turned out Verizon's firewall was dropping packets that had the ECN bit set. Had to disable ECN on the Linux box.
3) A Solaris host that was sporadically reachable over the LAN. It could always connect to other hosts, but other hosts could only sometimes reach it. Turns out you can disable ARP replies on Solaris by deleting the "publish" entry for the box's own MAC addr from its ARP table. And someone or something had done that by accident. So the box was only reachable on the LAN till its MAC timed out of the other hosts' ARP caches. However, if the host in question first initiated a connection to another host, the destination host would then learn its MAC again.
Another "fun" one (and hard to debug the first time) is the set of problems triggered by duplex mismatch - though the asymmetric ~1/10 difference in performance quickly becomes a signature once you have met it once or twice. It is also a good anchor from which you can go into many types of discussions.
I've seen something very similar to (3) manifest in IPv6 - due to Neighbor Discovery working over multicast, and the potentially asymmetric ways multicast propagates within the L2 segment (different L2 switches on the two sides, etc.)
But it is very interesting to learn about this asymmetry in the Solaris stack, thanks!
Another spooky issue was Skype on my iPad, which stopped working every 10 days. It turned out that I had accidentally set the WiFi switch to a static IPv4 address on a LAN with a DHCP server (and I powered off the WiFi switch every night). So every few days another device on the network got, via DHCP, the same IP address that I had statically set for the WiFi switch. Oops.
Seeing as the error happened at the same place each time, it sounds like you might have been getting errors that didn't _always_ get caught by the checksum algorithm (i.e. the error "happened" to match the checksum) (maybe).
Sounds like another good reason to do separate checksumming of files after download (read: sha1/md5).
Needless to say the whole operation made the corruption checksum-neutral from the viewpoint of the "real" checksum - so it was not caught by the TCP checksum, as a result the files were sometimes silently corrupted.
Took a lot of work to catch and debug, the symptom was that Gentoo's packages appeared to be corrupt and failed the SHA1 checksum check.
So, indeed, cryptographic checksumming of files is a good thing.
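To make the "checksum-neutral" idea concrete, here is a small sketch. It is not the specific failure described above (the mechanism there is only partly quoted); it just shows one well-known weakness of the 16-bit Internet checksum used by TCP: because it is a plain one's-complement sum, reordering 16-bit words does not change it. The sample data is made up.

```python
import hashlib

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

original = b"ABCDEFGH"
corrupted = b"CDABEFGH"  # first two 16-bit words swapped

same_checksum = internet_checksum(original) == internet_checksum(corrupted)
same_sha1 = hashlib.sha1(original).digest() == hashlib.sha1(corrupted).digest()
print(same_checksum, same_sha1)  # prints: True False
```

The TCP-style checksum cannot tell the two payloads apart, while SHA-1 can, which matches the symptom of packages arriving "successfully" and then failing their hash check.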
The system would of course refuse any single job that took you over quota but, if you wrote a script that pounded a few jobs slightly below quota right after each other, two or more would get through, underflow and you'd suddenly have 65k pages left.
I know when you left the school you could get reimbursed for any remaining pages left on your ID but I wasn't crazy enough to try that. 65k * $0.05 = a chat with the police, I'd guess.
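A guess at the mechanism, as a minimal sketch: it assumes the balance was stored as an unsigned 16-bit counter and that jobs submitted back to back were all checked against the same not-yet-updated balance before any of them was deducted. Neither detail is confirmed above; `remaining_after` and the numbers are hypothetical.

```python
def remaining_after(quota: int, jobs: list[int]) -> int:
    """Check every job against the stale balance, then deduct all accepted jobs."""
    balance = quota
    # Race: each check sees the balance as it was before any deduction.
    accepted = [pages for pages in jobs if pages <= balance]
    for pages in accepted:
        # Emulate an unsigned 16-bit counter wrapping around on underflow.
        balance = (balance - pages) & 0xFFFF
    return balance

print(remaining_after(10, [11]))    # 10: a single over-quota job is refused
print(remaining_after(10, [9]))     # 1: a normal job
print(remaining_after(10, [9, 9]))  # 65528: two jobs slip through and the counter wraps
```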
"Printer won't print on Tuesdays": http://mdzlog.alcor.net/2009/08/15/bohrbugs-openoffice-org-w...
"A member of the famous Black Team manages to create a sequence of operations that topples over the tape drive":
> The Black Team link inside
> your link is broken ...
The "in front of the press" bit is almost certainly an urban legend, but the concept did happen: http://www.catb.org/jargon/html/W/walking-drives.html
After much investigation, we traced the problem to a Rock Band drum set foot pedal. It has an AC adapter plugged into the wall, and also acts as pressing L on the PS3 controller when it's triggered. Somehow, pulling the cord on the ceiling fan made the foot pedal act like it got kicked, which would rewind the movie playing on the PS3. Unplugging the foot pedal or drum set solved the problem.
Why would anyone who has used a printer more than once think it would be impossible to reliably jam a printer...
One of these sensors was a simple conductive sensor which sensed metallic tape on the floor of the arena.
One of the teams had a freezing issue that they couldn't get to the bottom of, until they realised that it froze when their back roller (a metal sphere in a cage so it can rotate in 2 directions) went over the metallic tape.
The theory goes that the moving Lego parts generated static electricity which was then conducted somehow down to the metallic tape, which caused some fault in the single-board PC we used.
I wouldn't have believed it unless I'd seen it. Replacing the metal roller with a Lego piece cured the issue.
One of them memorably had a wire or a resistor or something that wasn't connected to anything else in the circuit. So they removed it.
The circuit stopped working.
They put it back in.
It started working again.
The best anyone could figure was that that bit of metal was interacting electromagnetically with the rest of the circuit in some immeasurably small way that the entire thing depended upon.
Shorter article: http://www.damninteresting.com/on-the-origin-of-circuits/