Poor Man's Scalability (codypowell.com)
174 points by greenshirt on Oct 15, 2011 | 61 comments

Cross posting my reply from another site:

> We were now rendering our site index in 1.5 seconds or less, compared to 3 seconds before.

This is the real WTF. What kind of queries take three seconds to run? That is ridiculous. Even with indexes, 1.5 seconds is insane.

Deciding against premature optimisation isn't permission to throw away good, efficient design. It's clear from the rest of the article that these guys really have no idea what they're doing. Quoting the results of all their optimisations:

> In the end, we did not go down. The last round of load tests with httperf showed us handling 200 GETs a second indefinitely, with an average response time of about 1.2 seconds. We served thousands of requests without even a hiccup, and we did it without spending any money.

This is really poor performance, and it shows in the site when you change page or select a different category/filter. I'm guessing that they also chose a nosql solution for the wrong reasons, as it seems everything there could be done much faster/better by using an ACID compliant database.

Thanks to the author for taking time to write up his experience. That said, this is really basic stuff. How in the world could a home/index page have so many queries that it took 9 seconds? Oh, no indices... Then how could you have built a DB-backed web app and not considered indexing? A better title for the post might be "Scaling Mistakes I Made While Building Our Website and How I Fixed Them".

I'm very very reluctant to call those mistakes. From the author:

> I didn't get too upset here; I expected us to do poorly. After all, we'd spent all of our development time in actually building something people want, not on scaling up for thousands of imaginary users.

Perhaps it would be relevant to the conversation if the author could chime in with how many projects or ideas they went through that failed. Perhaps they had to do consulting on the side for income.

Premature optimization is a mistake. It's time wasted on delivering minimal value to you or anyone else.

Indices on a database aren't "premature optimization". They're part of understanding your data model, how it will be requested, and--more in a SQL database than a MongoDB database, but still--constraints on the data stored in your database.

Claiming basic development practices to be "premature optimization" is a fantastic way to paint yourself into a corner because of stupid decisions in your haste to "get it out the door." Your MVP isn't V if it still takes a second to render your homepage to users, because they'll leave. (Site responsiveness is a huge factor in bounce rate, even for sites that people actually want to look at.)

It's not hard to apply an index after the fact. You don't paint yourself into a corner by leaving them off initially.

And very often, you can't understand your data model and how it will be requested until you've actually built an app and gathered some data. Users will surprise you, and do things you never expected, and probably render 80% of your app (oftentimes, the 80% you worked hardest on) obsolete immediately.

I can attest to that. We have two main "sides" of our app, one side that allows for editing of rules and another that just applies those rules on transactions. It took us much longer to build the editing side than the transaction side, and when we hit users, we found out that they didn't even WANT to edit the rules. That was more than half of our app that users weren't using, even though in initial mockups/user tests they indicated that they would edit the rules.

I consider Indices to be premature optimisation. Moreover, they're very low level, potentially dangerous premature optimisation. Imho, they should be the very last trick you use to optimise.

First, you try and figure out if you can reduce the number of repeated queries. Then you try and figure out if you can get rid of chunks of code that spawn lots of queries altogether, by tweaking the algorithm to do things differently, without needing that data. Then, finally, once you've used every trick to make it all faster, then you apply indices to speed up that handful of remaining queries.
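As a rough illustration of the first step (cutting repeated queries), here is a minimal sketch of per-request query de-duplication. `runQuery`, `fakeDb`, and the query name are hypothetical stand-ins, not anything from the article's codebase.

```javascript
// Wrap a query function so identical queries within one request run only once.
// `runQuery` is a hypothetical stand-in for whatever database call the app makes.
function dedupeQueries(runQuery) {
  const seen = new Map(); // key: serialized query, value: cached result
  return function query(name, params) {
    const key = name + ":" + JSON.stringify(params);
    if (!seen.has(key)) {
      seen.set(key, runQuery(name, params));
    }
    return seen.get(key);
  };
}

// Usage: count how many times the underlying "database" is actually hit.
let hits = 0;
const fakeDb = (name, params) => {
  hits += 1;
  return { name, params };
};
const query = dedupeQueries(fakeDb);
query("categoryById", { id: 7 });
query("categoryById", { id: 7 }); // served from the per-request map
query("categoryById", { id: 8 });
// hits is now 2, not 3
```

Creating a fresh wrapper per request keeps this from turning into a long-lived cache with its own invalidation problems.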

If you apply indices first, then you won't spot the other potential optimisations, and when your indices stop covering up for your poor coding it will be much later, and much harder to fix.

No, caching should be the last trick you use to optimize.

Indices are the standard way to achieve basic performance levels of a database. They may have their downsides but "potentially dangerous" is dramatically overstating the case. Furthermore, the dangers of premature optimization are about taking extra time or adding complexity to something that ultimately doesn't matter, not about using very basic features in a sane way.

The correct optimizations to make first are ones that make the biggest impact percentage-wise as well as being the most elegant in the code. Indices typically fit both categories very well. Unless you are doing a lot of stupid things, there's not going to be much lower-hanging fruit, but even if there is, after you apply sane indices that will be when your profiling will start to reveal the real interesting possibilities for optimization. The idea that indices are a good final optimization does not show much interest in real performance.

I would take swombat's advice to look at query patterns, eliminate repeated queries, and tweak your algorithm to require less data well before I thought about adding indices (or caching, which I agree is the last thing you should do).

You want to make the riskiest, most invasive changes first, because those are the ones that the rest of your codebase builds upon. If you've changed your query patterns and the app still isn't fast enough, it's relatively trivial to add indices on top of that. The speed benefits are cumulative, and none of the work you've done examining your data-access patterns has been undone by making that data access faster.

If you add indices first, however, and it still isn't fast enough, you have to examine your query patterns anyway. And this time, the work you did is undone by your further optimizations. The benefits of indices depend a lot on how you access your data: they slow down writes to speed up certain queries. If it turns out that the queries you're speeding up don't occur frequently anymore, your index is counterproductive.

(That said, since adding indices is often a task that takes 5 minutes of your time, you might want to give it a try and see what sort of results you get before investing a lot of effort in other stuff. If it doesn't work, you can back out the change and then start looking at your query patterns.)

Disagree on caching as the last thing you do.

If you're building an app that will hopefully be bigger than you expect, build caching into your data layer. It's not complex and will pay off sooner rather than later.
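For anyone wondering what "cache in the data layer" might look like, here is a minimal read-through cache sketch with a time-to-live. The `loader` function, the one-second TTL, and the injectable clock are all illustrative assumptions.

```javascript
// A tiny read-through cache with a time-to-live, meant to sit in the data layer.
// `loader` is a hypothetical function that performs the real (slow) lookup.
function makeCachedLoader(loader, ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, expiresAt }
  return function load(key) {
    const entry = entries.get(key);
    if (entry && entry.expiresAt > now()) {
      return entry.value; // fresh hit: skip the database
    }
    const value = loader(key);
    entries.set(key, { value, expiresAt: now() + ttlMs });
    return value;
  };
}

// Usage with a fake clock so the expiry behaviour is easy to follow.
let clock = 0;
let loads = 0;
const loadGame = makeCachedLoader(
  (id) => { loads += 1; return "game-" + id; },
  1000,
  () => clock
);
loadGame(1);   // miss: calls the loader
loadGame(1);   // hit
clock = 1500;  // advance past the TTL
loadGame(1);   // expired: calls the loader again
// loads is now 2
```

A real deployment would more likely put this behind memcached or similar so that all app servers share it; the in-process Map is only there to keep the sketch self-contained.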

You should be thinking about indices as you design your data model. You're going to put them in eventually so the ad-hoc performance testing you're doing as you're building the site should at least somewhat reflect the final, real-world scenario. Better to know sooner rather than later that your data model is so broken that even extensive indexing can't make your queries fast.

That's just silly. Indexes are a fundamental component of databases and give the DB clues about how the data in your table is going to be used. You might as well say that having different column data types is premature optimization.

Should you spend days analyzing things and creating a million indexes? No. But you should have some idea how the tables will be used and set up a few of the obvious ones, at least. Read any book on DB admin.

I agree with you, but he was using MongoDB and AFAIK, indexing is a must with this kind of db. Actually, I hope that he does not expect a big write load because he seemed to have added the indexes pretty quickly without thinking too much about it...

Just to reiterate, with an RDBMS with a decent query planner, indexing early is premature optimisation.

Would someone care to explain the downvote?

Ahem, this is so wrong it's not even funny. I'm just calling it out because there are people here who are just learning how to code. You always create an index, on pretty much every query. That's as basic as commenting your code.

I'd hate to think how long your writes take...

Better a write with an index than a read without an index, my grannie always said.

Indexes can most certainly be premature optimisation. There's a cost associated with an index.

The cost increases significantly by doing it later on. Indexing when you have little data is easy. Indexing on a live site with millions of rows and lots of users can be much harder.

Taking it to that extreme is not a valid argument. There's a lot of room between not doing it first and only doing it on a live site with millions of rows and lots of users.

That might be true for some esoteric indexes, but if we're talking about indexing customer_id on the invoices table, it's not premature -- it's freaking inevitable.
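To make the customer_id example concrete, here is a toy model of what such an index buys and costs. It is only a sketch: real databases use B-trees or similar structures rather than a Map, and the `InvoiceTable` class and its rows are made up for illustration.

```javascript
// Toy model of a secondary index: a Map from customer_id to matching invoices.
// Real databases use B-trees or similar; this only illustrates the trade-off.
class InvoiceTable {
  constructor() {
    this.rows = [];
    this.byCustomer = new Map(); // the "index" on customer_id
  }

  insert(invoice) {
    this.rows.push(invoice);
    // Extra work on every write: keep the index in sync.
    const bucket = this.byCustomer.get(invoice.customer_id) || [];
    bucket.push(invoice);
    this.byCustomer.set(invoice.customer_id, bucket);
  }

  // Without the index: examine every row.
  scanByCustomer(customerId) {
    return this.rows.filter((row) => row.customer_id === customerId);
  }

  // With the index: jump straight to the matching rows.
  findByCustomer(customerId) {
    return this.byCustomer.get(customerId) || [];
  }
}

const invoices = new InvoiceTable();
invoices.insert({ id: 1, customer_id: 10, total: 25 });
invoices.insert({ id: 2, customer_id: 11, total: 40 });
invoices.insert({ id: 3, customer_id: 10, total: 15 });
// Both lookups return invoices 1 and 3; only the amount of work differs.
```

The write path does a little more on every insert, which is the cost people are arguing about in this thread; the read path stops growing with the size of the table.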

Yes--and this is overwhelmingly the most common case for a query in most applications. From reading the blog post, it seems like the blog author was either doing pretty frighteningly complex stuff just to render his homepage or he didn't think ahead enough to add fairly standard queryable indices to his collections.

Neither strikes me as terribly smart, and the latter strikes me as writing it right the first time, not "avoiding premature optimizations."

>Premature optimization is a mistake.

Wholly agreed and I often say the same thing. But I do not consider basic indexing on a database to be "premature optimization". I do consider leaving it out to be a mistake, since you will almost certainly run into performance issues (unless the table is small enough that indexing is itself a mistake), then you'll have to track down the cause of the issues (commonly called "a bug"), and then you'll have to add indexes anyway.

Oops, I think I didn't express myself very clearly. When I was sending 20 requests a sec in a load test, the average response time for all the requests was 9 seconds. In a normal case, the page was loading in a fraction of that.

Also, we didn't lack all indexes, just a couple of important ones. We've been iterating quickly on the site and we weren't analyzing the performance as our queries were refactored.
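One way to read those load-test numbers is through Little's law (average requests in flight = arrival rate times average response time). The sketch below assumes the test had reached a steady state, which may not hold for a short run; the figures are the ones quoted in this thread and in the article.

```javascript
// Little's law: average requests in flight = arrival rate * average time in system.
function requestsInFlight(requestsPerSecond, avgResponseSeconds) {
  return requestsPerSecond * avgResponseSeconds;
}

// First load test: 20 requests/sec at an average of 9 seconds each.
const beforeTuning = requestsInFlight(20, 9);

// Final load test from the article: 200 GETs/sec at about 1.2 seconds each.
const afterTuning = requestsInFlight(200, 1.2);
```

Under that assumption the server was juggling roughly 180 concurrent requests in the first test and about 240 in the last one, so the 9-second figure says more about queueing under load than about how long a single page takes to build.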

Please stop blogging and start reading about how to construct an application properly.


It's fairly basic stuff this, but important none the less.

Kudos for recommending YSlow (or PageSpeed) - it's indeed brilliant. You didn't mention much about HTTP caching, which I think is probably as important as the other things you mentioned. Not only does it improve performance for the user, but it also reduces load on your server and it enables you to put up an edge-side cache for extra performance.

As for the problems mentioned with the load balancer - you could have simply provisioned a new instance and installed your own load balancer. There are dedicated packages like haproxy, but you could also just put up nginx or lighttpd. This machine can later double as your edge cache (Squid).

Btw. 1.5s to render the front page? That's way beyond acceptable, in my book. But I suppose it depends a bit on what the web site does.

> Btw. 1.5s to render the front page? That's way beyond acceptable, in my book. But I suppose it depends a bit on what the web site does.

Agreed. This will have an extremely negative impact on your user engagement. Now if this is measuring end user response time rather than server-side page generation it might not be quite so bad, but generally I think most common pages should target < 150ms.

So what would you both consider "the next steps" in page speed optimizations?

If you really do have a legitimate 1500 msec process occurring on the home page, try to flush as much of the HTML to the browser before you start the heavy lifting. Ideally, you could have the entire HTML shell delivered immediately, and the content elements slotted in at the very end (using DOM manipulation).

If you can put all the content inside a <script/> tag at the very end of the HTML, it will at least give the browser a chance to find and download all the page's resources (CSS, jQuery, logo image, etc.).


  <head><link rel="stylesheet"></head>
  <body>
    <h1>Home Page</h1>
    <p id="content1">cheap content</p>
    <p id="content2">place-holder content</p>
    <p id="content3">more content</p>
    <script>
      $("#content2").html("expensive content");
    </script>
  </body>
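On the server side, the same idea might look roughly like the sketch below. `write` stands for whatever sends a chunk to the client (for example `res.write` in Node), and `loadExpensive` and `/site.css` are hypothetical placeholders rather than anything from the article.

```javascript
// Send the cheap HTML shell first, then slot in the expensive content at the end.
// `write` is any function that sends a chunk to the client (e.g. res.write in Node);
// `loadExpensive` is a hypothetical async function standing in for the slow work.
async function renderHomePage(write, loadExpensive) {
  // 1. Shell goes out immediately so the browser can start fetching CSS, images, etc.
  write(
    '<html><head><link rel="stylesheet" href="/site.css"></head><body>' +
    '<h1>Home Page</h1><p id="content2">place-holder content</p>'
  );

  // 2. Do the heavy lifting while the browser is already busy.
  const expensive = await loadExpensive();

  // 3. Final chunk: a script that swaps the placeholder for the real content.
  //    "<" is escaped so the payload cannot close the script tag early.
  const payload = JSON.stringify(expensive).replace(/</g, "\\u003c");
  write(
    "<script>document.getElementById('content2').textContent = " +
    payload +
    ";</script></body></html>"
  );
}
```

This only helps if nothing between the app and the browser buffers the whole response, so it is worth checking what any proxy or framework in front of it does with partial output.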

Make sure that http cacheability is good (send proper http headers, e-tag and expires; serve assets from cookie-less sub-domain). But if the page renders in 1.5 secs, I would try to address that also. That's highly specific to the application and its technical platform, so I couldn't really say how to approach that - But break out a profiler and go for the big chunks for a start.
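For the header part, a minimal sketch of ETag and Expires handling using Node's built-in crypto module. The SHA-1 content hash and the one-hour lifetime are arbitrary illustrative choices, and `respond` is a made-up helper rather than any framework's API.

```javascript
const crypto = require("crypto");

// Build caching headers for a response body. The one-hour lifetime is an arbitrary
// illustrative choice, not a recommendation from the thread.
function cachingHeaders(body, maxAgeSeconds = 3600, now = new Date()) {
  const etag = '"' + crypto.createHash("sha1").update(body).digest("hex") + '"';
  return {
    ETag: etag,
    "Cache-Control": "public, max-age=" + maxAgeSeconds,
    Expires: new Date(now.getTime() + maxAgeSeconds * 1000).toUTCString(),
  };
}

// Decide between a full 200 response and an empty 304 for a conditional request.
// `ifNoneMatch` is the value of the client's If-None-Match header, if any.
function respond(body, ifNoneMatch) {
  const headers = cachingHeaders(body);
  if (ifNoneMatch === headers.ETag) {
    return { status: 304, headers, body: "" };
  }
  return { status: 200, headers, body };
}
```

A returning visitor who sends back the ETag gets a 304 with no body, which saves bandwidth; note that the server still has to generate the body to hash it, so this does nothing for a slow page by itself.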

Impossible to say without looking at the profiled request. One thing I can say is that I wouldn't look at any page-level / HTTP caching until page generation is at least under 300ms for the most common requests.

Because I'm in a particularly pedantic mood: I don't think ELB (Amazon's load balancing) is free, so "without spending any money" is a little misleading.

In fact:

* $0.025 per Elastic Load Balancer-hour (or partial hour)

* $0.008 per GB of data processed by an Elastic Load Balancer

Source: http://aws.amazon.com/ec2/pricing/
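To put those rates in perspective, here is a quick back-of-the-envelope calculation. The rates are the ones quoted above; the 30-day month and 100 GB of traffic are made-up inputs.

```javascript
// Rates as quoted above (2011 pricing); hours and traffic are made-up inputs.
const ELB_HOURLY_USD = 0.025;
const ELB_PER_GB_USD = 0.008;

function elbCostUsd(hours, gigabytesProcessed) {
  return hours * ELB_HOURLY_USD + gigabytesProcessed * ELB_PER_GB_USD;
}

// One balancer for a 30-day month (720 hours) handling 100 GB:
// 720 * 0.025 = 18.00, plus 100 * 0.008 = 0.80, so about 18.80 USD.
const monthly = elbCostUsd(24 * 30, 100);
```

So not free, but at that traffic level the hourly charge dominates and the total is small.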

Obviously these are problems loads of people face / hope to face, so it would have been nice to get some more 'meat' in this post.

Edit: formatting

Neither is a second EC2 instance, or were they running that already? Was it just not running a web server?

No CDN? Adding CloudFront is really simple and the costs are insignificant.

CloudFlare is easier. And free.

Paying for CloudFront doesn't leave me wondering what their revenue model is and how it might affect how they serve my content.

Your story closely resembles the challenges we had scaling sommelierwine.ca before a media event at our latest launch location. I knew that our site would receive an increase in traffic, and the increase would only be temporary.

Being a Rails site hosted on Heroku gave us the ability to scale our site using the gamut of widgets that Heroku offers, but we didn’t want to spend the money and cheat ourselves out of the satisfaction of scaling our site.

Using New Relic (a gem available on Heroku) I identified our first performance bottleneck - database queries. Certain queries were taking over 30 seconds to complete! These database queries all involved a table of BLOBs (on the order of 20MB) that needed to be searched and sorted. I tried adding indexes but that only marginally reduced the query time. The solution we found included moving the BLOBs to their own table while keeping an index to them in a table of their attributes [1]. Doing this, we were able to reduce query times down to less than 100ms.

Fixing the remaining bottlenecks, which we found using YSlow, has helped us reduce the overhead in loading pages substantially.

Even though we were able to weather the storm of visitors to our site, we did leave some tasks for the next one including implementing caching, background processing, and remote file storage. All in time.

Does anyone have other wallet-friendly Rails/Heroku scaling stories to share?

[1] This is database 101 stuff - a class I never took.

> [1] This is database 101 stuff - a class I never took.

Yes, yes, a thousand times yes.

The best and most important way to improve database performance is to have a sensible schema. Query planners can do magic with stuff that's sensibly normalised and with things that are sensibly denormalised.

Where they choke is on schemata where everything is just squished into a bunch of tables with fields holding internal datastructures.

For the new Cedar stack, make sure you serve static assets from some sort of CDN, as each static file served ties up one of your dynos. With Rails 3.1, setting up CloudFront takes about 5 minutes and it's relatively cheap.

The next step after ensuring you're doing the basics (CDN, minimizing data over the wire, etc.) for us was to throw the entire site behind Varnish. Set a "don't cache" cookie if users log in, add a comment, etc., so that their changes are visible immediately.
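In Varnish itself this rule would be written in VCL; the sketch below only shows the decision logic in JavaScript. The cookie name `nocache` is a hypothetical choice, since no name is given here.

```javascript
// Decide whether a request must skip the shared cache and go to the app server.
// The cookie name "nocache" is a hypothetical choice; the comment above does not name one.
function shouldBypassCache(cookieHeader, cookieName = "nocache") {
  if (!cookieHeader) return false; // anonymous visitor: safe to serve from cache
  return cookieHeader
    .split(";")
    .map((pair) => pair.trim().split("=")[0])
    .includes(cookieName);
}
```

For example, `shouldBypassCache("session=abc; nocache=1")` is true, while a request with no Cookie header, or with only unrelated cookies, can be served from the cache.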

We took a slow site running on hardware, put it on EC2 (slower hardware obviously), and with varnish are saving about 60% cost of hosting, and decreased page load time on some of the popular pages by upwards of 50%, all for minimal engineering effort.

This seems like basic stuff. I'm more interested in the story about how you got this to the top of the front page.

Edit: Less obnoxious.

Basic stuff is valuable to people who need to learn the basics. Not everyone on HN is an experienced programmer, much less an experienced web programmer.

Thank you.

As I was scanning the comments on this post, I was a bit surprised by all the "This post was too basic. You are clearly an amateur and wasting my time" comments. Well, to those authors, not everyone on HN is an l337 hax0r such as yourselves.

Yes, the author still has some work to do. Maybe if he spent another couple weeks just on this, he could get response times down. Maybe he should use a reverse proxy or maybe he shouldn't. As with most problems of this type, the answer is, it depends.

To a young programmer with a few years' experience who may be working on his/her first high traffic website, I thought the blog post was fantastically written. It was clear, concise, explained well the low-hanging fruit of optimization, explained the difference between performance and throughput, and covered the decision-making/tradeoffs involved in preparing for a traffic spike. Well done.

If you're a good enough programmer to know how to build the site, it's likely you know about profiling. I'm surprised there are enough people who find themselves in this middle spot (can build complicated site but are still unaware of this info) to push the article up to the top page.

Having said that, nothing I submitted ever made it close to the top of HN. Maybe my understanding of the community is off.

Blackmail. I have pics of pg and Steve Ballmer sharing an ice cream cone, and I'll make them public unless I get the proper upvotes.

I actually don't know what happened here either. I was at a kid's birthday party, and I look on Twitter to see one of my cofounders had submitted this and that it was shooting up the front page.

My tone was meant to imply there was some funny business going on but more in jest than reality. Clearly people are finding this article helpful.

Nice job.

Cody, just one note on your website - I couldn't tell at all what Famigo offers without clicking on something. Instead of having "Famigo helps families find and manage mobile games and apps" in a hover-over pop-up, have a similar tagline somewhere people can see easily.

Thanks for pointing this out. We JUST pushed a change this week where we removed a huge banner saying "We help families find and manage blahblahblah" and replaced it with a few other things. Maybe we need to revert that!

"Moving the Javascript to the bottom" - wouldn't the best thing to do would be to invoke the JS from onLoad? And put the code in a separate minified JS file, so it can be cached?

Yes, but moving the actual script tag that loads that minified JS file to the bottom of the HTML can help in some situations. Less so now that most UAs have decent speculative parsers, but it used to be that seeing the script tag would block any resources below it from being requested until the script was done loading.

And being cached won't help for first-time visitors to the site, of course.

Nice writeup and valuable experience. Just curious why no extra money was spent? Spinning up more EC2 instances costs money, right?

With the rise of cloud services, and their relative inability to serve naked domains, there are a couple of easy fixes for this.

The first is to create a vhost in your webserver that serves an index.htm page on your NON-www url. The contents of that page would simply redirect users to the www.yourdomain url.

If that seems like too much work, an even easier fix (truncated for brevity) is to do something like this:

  var url = window.location.href;
  if(url.indexOf("http://www") != 0) {
    window.location.replace("http://www.yourdomain.com/"); // placeholder host
  }

You'd obviously want to capture the remainder of the querystring and add it in, but that's the general idea.
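A sketch of what capturing the rest of the URL could look like, using the standard URL class. `toWwwUrl` is a made-up helper name, and the example host is a placeholder.

```javascript
// Return the www form of a URL, or null if it already has it.
// Keeps the scheme, port, path, query string, and fragment intact.
function toWwwUrl(href) {
  const url = new URL(href);
  if (url.hostname.startsWith("www.")) return null;
  url.hostname = "www." + url.hostname;
  return url.toString();
}

// In a browser this could be used as:
//   const target = toWwwUrl(window.location.href);
//   if (target) window.location.replace(target);
```

For example, `toWwwUrl("http://example.com/games?age=5#top")` gives `http://www.example.com/games?age=5#top`.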

A more efficient and search-engine friendly approach is to perform a 301 redirect. For a single site, in Apache this would look like

    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^example\.com [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

Why not plain old HTTP 301/302 redirect?

Please, please use a redirect. The GP's suggestion is not something you want to do.

Even better, just use http://wwwizer.com/ (look for the free redirect service banner on top).

ooh, this looks really problematic on multiple levels (why wait for JS etc).

A simple solution, which I've used--and I think the OP's situation is similar--is to just have an additional A-record pointed to the ELB. This should work unless the OP is doing something really fancy. No need for Route 53; even GoDaddy will let you do this.

However, Route 53 gives you much better TTL, so if you ever need to re-route..

ELBs change IP if they have to scale up. You really shouldn't point an A record at one.

...which is why they recommend you never use the ip, instead AWS provides a hostname, which is invariant...

Which is why you can't use an A record, as I stated.

The other thing to do is to specify your canonical page, using a link tag with rel="canonical" in the page head, so search engines know to use the www subdomain.

