
Looks like a hard drive stopped working. We switched to the failover server.

Sorry everyone!




If you had an auto scaling kubernetes cluster with multiple redundancies using rust and 3 JS frameworks this wouldn't happen. ;)


That being said, is the actual production infrastructure architecture for HN described somewhere? Curious how simple it can afford to be.

We laugh at people piling layers and layers and artifacts on their sites, all in the hope of adding redundancy, handling "webscale" load, and avoiding an outage (ironically increasing the chances that _something_ will break).

However, if a single hard drive crashing somewhere can cause your site to be down for minutes or hours, some non-tech people (managers, shareholders, customers) will wonder if the site is "professional" enough - and I can sympathize with them.


From 2018: <https://news.ycombinator.com/item?id=16076041>

> We’re recently running two machines (master and standby) at M5 Hosting. All of HN runs on a single box, nothing exotic:
>
> CPU: Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz (3500.07-MHz K8-class CPU)
> FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 hardware threads
>
> Mirrored SSDs for data, mirrored magnetic for logs (UFS)
>
> We get around 4M requests a day.
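
For a sense of scale, here is a back-of-the-envelope sketch of what "around 4M requests a day" works out to as an average rate. It assumes traffic is spread evenly over the day, which real traffic is not, so peaks will be higher. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_requests_per_second(requests_per_day: float) -> float:
    """Average request rate, assuming a perfectly even spread over the day."""
    return requests_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # 4M/day is the figure from the quoted 2018 comment.
    rate = average_requests_per_second(4_000_000)
    print(f"{rate:.1f} requests/second on average")
```

That is roughly 46 requests per second on average, which goes some way to explaining why a single box can be enough.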


> FreeBSD/SMP

Good choice. :)


"Why are they using a freeware OS?" (: /s


> If you had an auto scaling kubernetes cluster with multiple redundancies using rust and 3 JS frameworks outages like this wouldn't surprise your users anymore.

FTFY


Irony aside, what's the point? In theory, yes, it could work better. In practice though, HN with its two bare-metal boxes has better uptime than 99.99% of the Web, including the biggest sites, just because complexity has its price.


Imagine a Beowulf cluster of those!!111!!eleven!!1!


Or a simple RAID array (but of course the controller should keep working).


Personally I haven't seen a server without a RAID since time immemorial[0]. Of course HN has it, too, as explained here:

https://news.ycombinator.com/item?id=32024989

[0] early 90s, that is


... but many other horrible things might


Three.js on HN? Interesting thought.


It could be like SGI's fsn[0] file manager but for tech news.

"It's a Unix system, I know this."

[0] https://en.wikipedia.org/wiki/Fsn_(file_manager)


Raymarched SDF pyramids implemented with GLSL shaders for the voting arrows! They could be so shiny!

(It's a wonder that anyone lets me near the frontend of their websites really.)


It wouldn't surprise me if someone managed to implement a 3d voting arrow in less than the 407B transfer size of the existing .gif


huh, all this time I thought the arrow was just a unicode character


HN Dashboard in 2012: https://i.imgur.com/oymP2UW.jpg

Does this still exist?


Alas no.


Well, that's ok... thanks for being up to fix it.

It's not an actual spinning hard drive, is it?


There is a good chance that it is (or was!) an actual spinning hard drive. Whatever it is, it lives in one of our boxes at M5 and it's in their hands for the moment.


It was an SSD. A 1.6TB SAS3 SSD. (M5 CEO here)


Stop making stuff up guys, I just know that someone at the YCombinator HQ tripped over the power cable of the Raspberry Pi you're hosting this on.


> one of our boxes at M5

Read that as MI5 and it gave a chuckle!


People guess the origin of our name often. Maybe this will give you even more of a chuckle. I was not aware of the name of this computer when I named the company. https://en.m.wikipedia.org/wiki/The_Ultimate_Computer


Probably a 2.5 MB one-platter Diablo hard disk drive cartridge running on a restored Xerox Alto.

https://en.wikipedia.org/wiki/Xerox_Alto

Diablo Systems Incorporated Series 30 Disk Drive Maintenance Manual

http://bitsavers.org/pdf/diablo/disk/model_30/81503-02_Serie...

Restoring Y Combinator's Xerox Alto, day 4: What's running on the system (righto.com):

https://news.ycombinator.com/item?id=12197591

http://www.righto.com/2016/07/restoring-y-combinators-xerox-...

Xerox Alto Restoration Part 16 - our disk goes down, the Alto connects to Google and draws fractals

https://www.youtube.com/watch?v=adEr2aRwHnI

Our Diablo disk goes on the fritz, but who needs a disk when you can netboot? Ken demonstrates the Alto network capabilities, connects to Google, and has the Alto calculate and display a Mandelbrot set. Ken's in-depth blog entry including the fractal demo source code is found here:

http://www.righto.com/2017/06/one-hour-mandelbrot-creating-f...

Xerox Alto Restoration Part 1 - power supply restoration, disk drive surprise

https://www.youtube.com/watch?v=xPyqQXFC2yw

We begin our very gentle and progressive power up of the seminal Xerox Alto. No magic smoke, but one power supply is faulty. Opening it up reveals that it had a tough life, having suffered a catastrophic short of some sort, hastily repaired, and some traces almost entirely corroded through. But the source of the malfunction seems to be a somewhat classic case of bad electrolytic capacitors, way too far gone for any hope of reforming. After replacing them and repairing the supply, we turn our attention to the Diablo disc drive and cartridge, and have a bit of a surprise.

Many thanks to my fellow CHM restorers Ron Crane, Ken Shirriff, Carl Claunch and Luca Severini.

See previous video introducing this historically significant machine:

https://youtu.be/YupOC_6bfMI

For many more details and references, see Ken Shirriff's blog entry corresponding to this video here:

http://www.righto.com/2016/06/restoring-y-combinators-xerox-...

A 1970s disk drive that wouldn't seek: getting our Xerox Alto running again

http://www.righto.com/2018/03/a-1970s-disk-drive-that-wouldn...

Identify It Challenge for 7-26-2012 Answer

https://reinventingscience.wordpress.com/tag/diablo-systems-...

ARTIFACT DETAILS: Series 30 disk drive

https://www.computerhistory.org/collections/catalog/10266694...

Description: "Not working cards missing heads may be bad" is handwritten in black marker on a sticker attached to the top of the machine.


You need to rewrite Hacker News in Rust to prevent these sorts of things!


No apology necessary, but I'm curious how a hard drive failure caused an outage. No RAID or mirroring? No hot spares? No clustering or distributed systems?


It was part of a mirror of identical SSDs on an LSI MegaRAID RAID card. We occasionally see "spectacular" drive failures where a single failed disk takes the whole machine down. Usually it's just a reboot to come back up, and a disk replacement, then some hours of time to rebuild the array and get back to situation nominal.


I’m curious as to what time zone you’re in. Or if there are multiple people behind your account. It’s pretty impressive how omnipresent you are :).


For two hours I thought I was finally blocked.


And just as I was on the train to work. Worst. Website. Ever.


If this site being down was that impactful to your commute, it seems like you actually feel like it has a lot of value.



