My router at home blocks Reddit and HN except from 5-7pm and 11:45-midnight (and all day Saturday). Worked amazingly well to cut back the addiction. Now if I could just get HTTPS blocking to work on FB and Twitter.
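For anyone wanting to try the same schedule, here is a rough sketch of the allow-list logic in Python. It is not the router's actual configuration (which isn't shown here); the names `ALLOWED_WINDOWS` and `is_allowed` are made up for illustration, and the windows are the ones described above.

```python
from datetime import datetime

# Allowed windows as (start, end) minutes since midnight, end exclusive.
# 5-7pm and 11:45pm-midnight, as in the comment above.
ALLOWED_WINDOWS = [(17 * 60, 19 * 60), (23 * 60 + 45, 24 * 60)]
SATURDAY = 5  # datetime.weekday(): Monday is 0, so Saturday is 5


def is_allowed(now: datetime) -> bool:
    """Return True if the blocked sites should be reachable at `now`."""
    if now.weekday() == SATURDAY:
        return True  # all day Saturday
    minutes = now.hour * 60 + now.minute
    return any(start <= minutes < end for start, end in ALLOWED_WINDOWS)


if __name__ == "__main__":
    print(is_allowed(datetime.now()))
```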
I don't know why but I find myself opening new tabs without thinking about it. On Facebook? Ctrl+T, fa, down arrow, enter. Why? Who knows, I was already there... That's when I noticed that I have some major problems.
Also, people underestimate the power of serving out of RAM. It's not unreasonable to serve 20-30K QPS off a single server if the work it needs to do is limited to minimal request parsing and fetching some data from main memory. That's roughly 1.7-2.6 billion requests/day, fully loaded. Granted, I'm thinking something more like memcached than a fully-formed webserver, but an in-memory webserver that stores its data in hashtables (like news.yc) and has a really fast templating language, or just writes output directly to the socket, could probably come close.
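To check the back-of-the-envelope numbers: a day has 86,400 seconds, so a constant 20K to 30K QPS works out as below. This assumes the server really is saturated around the clock, which is the "fully loaded" case; `requests_per_day` is just an illustrative helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_day(qps: int) -> int:
    """Requests served in one day at a constant rate of `qps`."""
    return qps * SECONDS_PER_DAY


low = requests_per_day(20_000)   # 1,728,000,000
high = requests_per_day(30_000)  # 2,592,000,000
print(f"{low:,} to {high:,} requests/day")
```

So the upper end of the range is about 2.6 billion requests/day, and the lower end about 1.7 billion.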
I use redis for this exact reason -- I prerender over 2,000 page templates twice a day, and store them in RAM. The app server has to do a little processing before sending the pages to users -- it picks a different template depending on whether the user's logged in or not, and then substitutes the user's info into the template (for logout/profile links). The session info is also stored in redis. This lets me reboot the server and be ready to serve pages again almost as soon as it's back up. With all the data, redis uses about 300-400MB RAM on a 64-bit Debian VM.
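A minimal sketch of that flow, under some loud assumptions: a plain dict stands in for redis so the example runs without a server, and the key layout (`page:<slug>:anon` / `page:<slug>:user`), the `$username` placeholder, and the function names are all invented here, not taken from the actual app.

```python
import html
from string import Template
from typing import Optional

# Stand-in for redis: with a real client these reads and writes would be
# GET/SET calls against keys of the same shape (an assumption, not the
# original app's schema).
cache = {}


def prerender(slug: str, body: str) -> None:
    """Store logged-out and logged-in variants of a page (run on a schedule)."""
    cache[f"page:{slug}:anon"] = "<nav><a href='/login'>log in</a></nav>" + body
    cache[f"page:{slug}:user"] = (
        "<nav><a href='/user/$username'>$username</a> "
        "<a href='/logout'>log out</a></nav>" + body
    )


def serve(slug: str, session: Optional[dict]) -> str:
    """Pick the variant by login state, then fill in the user's details."""
    if session is None:
        return cache[f"page:{slug}:anon"]
    template = Template(cache[f"page:{slug}:user"])
    # Escape the name before it is substituted into HTML.
    return template.safe_substitute(username=html.escape(session["username"]))


prerender("about", "<p>About this site</p>")
print(serve("about", None))
print(serve("about", {"username": "alice"}))
```

The per-request work is one lookup plus one string substitution, which is the point of prerendering.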
I use a VPS for my site, and on a VPS, the only thing you're allocated that you can depend on always being available is RAM. The processor cores might be shared with a busy user, and you can't always depend on high disk I/O speeds.
If it was always worse then every developer doing this must be stupid. Here are some ways in which a filesystem is "better":
- Zero administration
- Only configuration setting is the directory
- Trivial to test
- Trivial to examine, back up, modify, etc. with existing tools
- Works with any operating system, language, platform, libraries etc
- Good performance characteristics and well tuned by the operating system
- Easy for any developer to understand
- No dependencies
- Security model is trivial to understand and is a base part of operating system
- Data is not externally accessible
Many existing databases have attributes that aren't desirable. For example, they tend to care about data integrity and durability at the expense of other things (e.g. more administration, lower performance). For a use case like HN, losing 1 out of every 1,000 comments wouldn't be that big a deal - it isn't a bank.
Consider the development, deployment and administrative differences between doing "hello world" with a filesystem versus an existing database. Of course this doesn't always mean filesystems should be used. Developers should be practical and prudent.
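To make the "hello world" comparison concrete, here is roughly what the filesystem side could look like in Python. This is a sketch, not how HN actually stores its data: the one-JSON-file-per-item layout and the names `save_item` and `load_item` are assumptions for illustration. The only configuration is the directory.

```python
import json
import os
import tempfile
from pathlib import Path


def save_item(root: Path, item_id: int, item: dict) -> None:
    """Write one item to <root>/<item_id>.json, replacing any old copy atomically."""
    root.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a reader never sees a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=root, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(item, f)
    os.replace(tmp_name, root / f"{item_id}.json")


def load_item(root: Path, item_id: int) -> dict:
    """Read one item back from disk."""
    return json.loads((root / f"{item_id}.json").read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        root = Path(d) / "items"
        save_item(root, 1, {"by": "alice", "text": "hello world"})
        print(load_item(root, 1)["text"])
```

There is no server to install, no schema, and the stored items can be inspected with `ls` and `cat`.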
- Has average latency over 500ms when not under load
- Performs quite poorly under load (I hate to bring it up, but the most recent example was Aaron Swartz's passing. Anyone who used HN then to get news knows how poorly HN performs under load)
- Is restarted every week or two because it leaks memory
- Keeps XSRF tokens in memory and loses them across restarts
- Doesn't have a full markup language
HN is quite poorly-featured compared to typical commenting sites. People use HN because pg is here. He could remove half the features on the site (bold & italics... what features are there even to remove beside nested commenting?) and retain 90% of the audience.
Well, I guess we can agree to disagree. HN is popular for me because of the participants. pg, as epic and central as he is to ycombinator, doesn't play that much of a role on HN in terms of moderating and directing conversations, or even, in recent years, participating that much.
With regard to the commenting site itself, I can think of no more viscerally enjoyable forum I've ever participated in, with the possible exception of *Forum on MTS. There is nothing whatsoever that I would change about it, with the one possible exception of tweaking the markup so you could add fixed-width text/lists that wrapped over multiple lines. It's the only additional feature I've ever wanted out of HN. There is beauty in its simplicity. [Edit: Okay, I would also move the upvote/downvote arrows a bit for mobile usage. It's almost impossible to hit the right one without a lot of zooming]
And, with the rare exception of an MSM hit, the performance is more than adequate for an environment that should be encouraging reading, digesting, and composing.
Well, no. There are some exceptions, but most databases add a whole lot of bloat you don't necessarily need. Simple files can be just as fast or even faster than using a big database - which is the most important metric to me.
Even if that was true, I'd still tell people to start with flat files for a new project. It's like the advice to do a job yourself before hiring for it: You'll be better equipped to judge how well a database is managing your data if you've already done it yourself.
I find it interesting how easy it is to criticize a functioning system based on some aspect of non-conformance with some hypothetical ideal. The idea that HN wouldn't run on a database is no less astounding than the idea that much of Facebook runs on PHP. Design of real systems is often messy and imperfect and deviates from the ideal due to the necessity of optimizing one or another factor that may not be obvious.
Given that you've got performance issues and a fairly limiting deployment model, I never understood why you didn't get the most absurdly overpowered machine possible. (I assume you're not, because if you were, you'd be upgrading every ~6mo or so as faster single-core machines come out)
If I were speccing a machine for HN, naively, it would be a competition between a Xeon "enterprise" CPU with huge cache and memory bandwidth (interleaved up) and a gaming/desktop CPU with maxed-out single-core performance. Xeons can do single-core turboboost now, so E5-2690 which goes up to 3.8GHz is probably the best bet, but a desktop i7-3970X 4.0GHz might be an option if you don't need ECC (which also gets you slight speed improvement on memory).
This is just idle speculation, but are there any stats on HN traffic? I have always wondered just how many folks read it, how often, etc. I heard a million accounts being bandied around at one point and I cannot tell if that is excessive or not.
Because the single server that runs it is being turned off and another one is being turned on? It's probably not worth the time to write something to sync processor state from one to the other or clone some sort of vmotion type thing.
It's Friday night. My favorite band (http://www.theanatomyoffrank.com/music) is playing a house show in my town, which is the best kind of show because it's BYOB. Then, my best friend from Texas is in town for the night, and we're meeting up. I'm definitely going to go tear it up and create some memories.
And yet..... all I can think of is to stay up all night creating a "replacement HN" just for Saturday morning. I just started using Django and it would be perfect for this. Must. Resist. Must. Live. Real. Life.