Lessons Learned While Building Reddit to 270 Million Page Views a Month (highscalability.com)
43 points by ai09 on May 17, 2010 | 15 comments

"By far the most surprising feature of their architecture is in Lesson Six, whose essential idea is: The key to speed is to precompute everything and cache it."

That made me LOL. Is this really surprising? A symptom of years of high-level web development or similar... ?

This is /the/ classical optimisation strategy - it's been around since the days of pen-and-paper computation (anyone remember "log" books?).
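To make the "precompute everything and cache it" idea concrete, here's a minimal sketch (the listing data and sort orders are made up for illustration, not taken from reddit's actual code): orderings are computed once, off the request path, and page views become plain dictionary lookups.

```python
# Hypothetical example of precompute-and-cache: each sort order of a
# listing is computed once (e.g. on write), so serving a page view
# never sorts anything.
links = [("a", 10), ("b", 42), ("c", 7)]  # (link_id, score) pairs

cache = {}

def precompute():
    # Runs offline or whenever the data changes -- not per request.
    cache["top"] = sorted(links, key=lambda link: link[1], reverse=True)
    cache["new"] = list(reversed(links))

def listing(order):
    # A page view is just a cache lookup.
    return cache[order]

precompute()
print(listing("top")[0])  # highest-scoring link: ("b", 42)
```

The trade-off is the usual one: writes get more expensive (every ordering must be refreshed) in exchange for cheap, predictable reads.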

> Steve says a lot of the lessons were really obvious, so you may not find a lot of completely new ideas in the presentation.

He did also state that he went straight from Uni into this. So while some of the lessons might be obvious they may not have been to him to start with.

I think twitter does the same.

It appears they should abandon their relational database if they are not using any of the relational features.

I think they use Cassandra as their database, so they already have.

I believe they're using Cassandra as a sophisticated caching layer. It's not the "canonical" data store, but replaces memcachedb.
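The pattern being described is a look-aside cache: reads hit the cache first and only fall through to the canonical store on a miss. A rough sketch, with plain dictionaries standing in for the cache layer (memcachedb/Cassandra) and the canonical store - the key names are illustrative:

```python
# Look-aside cache sketch: the cache layer is consulted first; the
# canonical store is only touched on a miss, and the result is written
# back so the next read is a cache hit.
cache = {}                                   # stands in for memcachedb / Cassandra
database = {"link:1": {"title": "hello"}}    # stands in for the canonical store

def get(key):
    if key in cache:
        return cache[key]      # cache hit: no round trip to the store
    value = database[key]      # cache miss: read from the canonical store
    cache[key] = value         # populate the cache for subsequent reads
    return value
```

Swapping memcachedb for Cassandra in this role changes the durability and scaling characteristics of the cache layer without touching the canonical store.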

I don't know why people keep using reddit as a reference for how to do scaling. Reddit is a slow website with frequent outages. It should be a how not to example if anything.

Well, why not? They know much more about the problem than I do, because I don't have a website in the top whatever that generates that much traffic. I would guess that neither do you, so it's worth listening to what he has to say. Each story is different, and you'd be silly to ignore someone's experience on the matter if it is relevant to you.

For example, who wouldn't like to hear the stories (possibly horror stories) of the early Twitter days? Even though there was lots of downtime, that's precisely why there is a lot to learn.

You see this with any service as it scales up, that's where the learning occurs.

As the number of requests climbs you reach limits, and those limits manifest to the end user as suckage: slowness, errors, and in the worst cases lost data. You can't plan those limits out ahead of time; you discover them as you go.

I'm not sure they are using Reddit as a reference for the correct way to scale.

>I don't know why people keep using reddit as a reference for how to do scaling.

I don't know if people really use them as a reference, but Reddit is one of the few sites in the sweet spot of getting a lot of traffic and being pretty open about their architecture. Any presentation done on behalf of Reddit (PyCon '09, '10, this) is great fodder for the technical Reddit users, such as myself.

Not sure if you know this, but Alexis Ohanian is a member on this forum.

Not sure why that matters.

Indeed - especially since I've retired from reddit. Though I do still feel obligated to respond to complaints/criticism; in this case, though, I can't comment on technical decisions - except to say that reddit's recent scaling problems are likely due more to their dev count (4 - yes, there are four developers working on reddit; that's 1 dev for every ~2 million uniques a month) than to technical decisions.
