
The radio version: https://www.npr.org/2019/08/07/749135286/episode-931-the-it-...

The climax of this story involves the belief that “Linux servers don’t just crash”.

If he was familiar with those servers, he probably had a sense of how likely it would be that they would just "crash" out of the blue. I've seen Linux servers crash, but that's because they were set up by a dumbass and were doing all sorts of insane shit that kept filling up the disk and memory. If a server isn't doing anything stupid, it's pretty unlikely it's going to simply crash. He was working for criminals, and I think they said one guy was wanted for murder, so I can see why he'd be super suspicious.

In the Linux 2.2 days, my home Linux box had a kernel crash. Every attempt to open a tty caused a stack trace in dmesg: ssh, telnet, xterm, you name it. Amazingly, the system worked perfectly apart from that. I could troubleshoot the problem, save my work and shut down.

The point being: Windows would just BSOD in that kind of situation. Not that continuing to run with corrupt kernel data structures is a good idea, but there is something grandiose about the OS stubbornly refusing to die when it's raining kernel crashes.

Also the fact it crashed on a Sunday when no one was supposed to be at work.

I refuse to believe this wasn't just a way to add drama/tension. You can't work in IT and actually believe this.

Well, he was afraid he was going to be murdered by criminals. Under those circumstances even slight suspicion would be terrifying.

And he'd been working there long enough to know that this wasn't a common event. He even qualifies it, "You know, it's very hard for that to happen."

I mean, it depends. Services crash, sure, that might be what they meant by "the server crashed". But it's very, very difficult for an actual modern kernel to crash and also not auto-reboot and resume whatever it was doing. Hardware failure is always a possibility though.

Maybe the disk filled up?

That would cause any disk-write-dependent service to crash, yeah, but the kernel isn't one of those. Syslog will stop and a bunch of things will complain, but I've always been able to SSH in and free up some disk.
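
For what it's worth, checking how close a box is to that state is easy to script. Here is a minimal Python sketch using the standard library's shutil.disk_usage. The function name and the 90% threshold are my own assumptions for illustration, nothing from the story:

```python
import shutil


def disk_nearly_full(path="/", threshold=0.90):
    """Return (is_nearly_full, used_fraction) for the filesystem holding `path`.

    `threshold` is an arbitrary illustrative cutoff, not a recommended value.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; treat them as not full.
        return False, 0.0
    used_fraction = usage.used / usage.total
    return used_fraction >= threshold, used_fraction


if __name__ == "__main__":
    nearly_full, fraction = disk_nearly_full("/")
    print(f"/ is {fraction:.0%} used; nearly full: {nearly_full}")
```

Run from cron or a monitoring agent, something like this would flag the problem before syslog and friends start complaining.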

Did you read/listen to the story? A non-tech person called him and said "the server crashed." I am pretty sure they would say that in case of "any disk-write-dependent-services crashed, but you could ssh in and free up some disk." The criminal boss wasn't gonna "ssh in and free up some disk", that's why he had an IT guy!

Anyway, none of this is particularly important for the story, but I don't think the guy telling it is lying for dramatic effect. I think he's probably being honest that the boss saying "the server crashed" made him suspicious, because that server never crashed. I too found it amusing for this to be included in the story (as a sort of by-the-by advertisement for Linux). (Also, though: it turned out the server really did "crash" in some way.)

I just figured he was talking about an OS-level "crash", considering how he went on about how Linux never crashes.

Non-IT people may call it a crash no matter what actually happened. "Crashed", "froze", "app not working", "can't print" are often all the same thing to them.

BTW I've seen an error where an app server couldn't write files to a directory, but only for specific filenames (ones that hadn't already been created). Turns out that if you have tens of thousands of files in one directory, the hash table has collisions, so some files you can create while other names you cannot. It was a lot of fun to discover that :)

And customers described it as "server doesn't work", but when we connected, the randomly generated names it was trying to write were different, and it worked.

This has happened to me on a Sunday, albeit with a hosted server running that awful cPanel version of Linux.

The backup was what killed it: it ran out of disk space and the box keeled over. I could not believe the backup program was stupid enough to back up twice as much stuff as it had space for, and then to kill off the important processes to keep the backup running until zero bytes were left.

I have also had a close one with MySQL replication: it took the disk filling up before I configured it to purge the logs. My own stupidity is to blame for that one.

Log files are going to be the killer. Run a Linux box for long enough without any log file rotation and the disk is ultimately going to fill up. I can't imagine that a decade ago, when this server was built, there was a rack of terabyte SSDs in there.
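
To make the rotation point concrete, here is a rough Python sketch using the standard library's logging.handlers.RotatingFileHandler, which caps how much disk a log can ever take. On a real box this job usually falls to a system tool such as logrotate. The helper names, the 1 KiB limit and the two backups are made-up values for illustration only:

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_rotating_logger(log_path, max_bytes=1024, backup_count=2):
    """Build a logger whose file is rolled over before reaching `max_bytes`.

    At most `backup_count` old files (app.log.1, app.log.2, ...) are kept,
    so total disk use stays bounded no matter how long the process runs.
    """
    logger = logging.getLogger(f"rotating-demo-{log_path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backup_count
    )
    logger.addHandler(handler)
    return logger, handler


def demo(directory):
    """Write far more than the limit, then list what is left on disk."""
    log_path = os.path.join(directory, "app.log")
    logger, handler = make_rotating_logger(log_path)
    for i in range(1000):
        logger.info("message number %d", i)
    handler.close()
    logger.removeHandler(handler)
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(tmp))
```

Without the handler's size cap the same loop would just keep appending to one file, which is the slow disk-filling failure described above.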

Email is also an area that just grows and grows. The email doesn't even have to be used, just your system message stuff.

It might have been that the specific server in question usually didn't crash because of the applications being run. Also it could have been that the bad guy wasn't giving any detailed information about what happened so it seemed like a setup.

They don't "just crash"

You need to run stuff you care about on them for that to happen. If you don't, they run flawlessly for decades. I know of a switch that was up for 11 years (not a server, I know, but still a Unix/Linux-based OS), which is as much a testament to UPSs and backup generators as it is to the OS.

I set up a couple Linux boxes at work a couple of years ago for some new software we were building. A year later, I logged into the box, checked the up-time and realized it hadn't been rebooted since I installed it. So if that's his experience, I understand him.

OTOH, we also have an NGINX box that is automatically restarted once a week, because otherwise our API gets really slow. I understand that this isn't good practice, but we've spent too many hours debugging already and at the end of the day, this works.

"That's the one day I saw a linux server crash..."

Most unbelievable part of the story. Cracked me up.

It happens. I sent a dev server into a kernel panic when I kill -9'd a hung ddd debugger. This was about 15 years ago, on a 2.x kernel. Pretty sure that issue has long been resolved, but Linux isn't bulletproof.

I cringed so hard when I heard that...

Spoiler: their Linux server actually crashed.
