In addition, I allocate the workers to their own instances so they can scale independently. I also use Capistrano to automate deployment. Whenever additional capacity is needed, I simply clone an instance and add it to the load balancer.
I use 2 Nginx workers and 8 Unicorn workers per app instance, and they serve plain HTTP on ports 80 and 443 (yes, they respond with HTTP on port 443, and the app just assumes that traffic was originally HTTPS). The app instances are firewalled inside a private network and use private IP addresses.
The load balancer has two network interfaces, one public-facing and one private-facing. It's basically a reverse proxy that serves traffic over both HTTP and HTTPS (HTTPS requests are forwarded to the app instances as plain HTTP on port 443), so the load balancer handles the encryption entirely.
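The reverse-proxy half of this could be sketched as an Nginx config along these lines; the upstream name, internal IPs, and certificate paths here are placeholders, not the actual setup:

```nginx
# Hypothetical sketch of the load balancer described above.
upstream app_servers {
    server 10.0.0.11:443;   # app-1: plain HTTP on port 443 (IPs are made up)
    server 10.0.0.12:443;   # app-2
}

server {
    listen 443 ssl;
    ssl_certificate     /etc/nginx/ssl/example.crt;  # placeholder paths
    ssl_certificate_key /etc/nginx/ssl/example.key;

    location / {
        # TLS is terminated here; the app instances receive plain HTTP
        # and assume anything arriving on port 443 was originally HTTPS.
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```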
The assets are compiled during deployment, and everything is cached. I bundle all the CSS and JS into a single package, so on subsequent visits there's nothing left to load other than the HTML and images (which are natively async). The site is simply far more responsive deployed this way. I also use CloudFront to host the assets, by pointing the distribution at the website itself as the origin; Rails handles the asset host automatically as well.
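Pointing Rails at the distribution is a one-line config change; the CloudFront domain below is a placeholder, not the real one:

```ruby
# config/environments/production.rb -- a minimal sketch.
# "d1234abcd.cloudfront.net" is a made-up distribution domain.
config.action_controller.asset_host = "d1234abcd.cloudfront.net"
```

Rails then rewrites all asset URLs (stylesheet_link_tag, image_tag, etc.) to that host, and CloudFront pulls each asset from the site on first request.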
Whenever I need to deploy or manage the database directly for any reason, I connect to the private network via VPN. It's secure and just neat.
If I need to build a local mirror, I install another Nginx box, connect it to the VPN, and reverse-proxy HTTPS requests directly to the app instances on HTTP port 443. This cuts the TLS handshake latency while maintaining security. And it just works.
EDIT: Added a link to make it easier for the curious people.
This does not seem to be "ready for horizontal scaling in the future".
I don't know whether you have multiple slaves, but TFA itself says that they'd scale the database _vertically_. I'm just pointing out that there's a pretty important mismatch between what you say and what the article says.
Of course, it's easy to add master-slave replication to a single database instance when the need arises, possibly even without downtime. So I consider most RDBMSs to be horizontal-scaling-ready.
The difficult part is the app stack. We need to be able to deploy more instances in minutes, not hours or days. If, at the beginning, we cram everything (Nginx, Unicorn and MySQL) into a single dedicated server, it can be really difficult to increase the capacity later without downtime or additional expense. Even if you don't need the power of several dedicated servers, it's worthwhile to virtualize a single server, split it into a few boxes, and make them horizontal-scaling-ready. Then you can clone, resize and migrate the instances.
This is what I do to increase the capacity of my app in a few minutes:
1. Create a new instance, and load the pre-built app image
2. Assign an IP address and add it to the DNS (like app-3.nameterrific.net pointing to an internal IP)
3. Add the app server to deploy.rb (for Capistrano)
4. cap deploy
5. Add the IP to load balancer
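Step 3 above, in (old-style) Capistrano 2 terms, is roughly a one-line change; the hostnames below follow the made-up example naming from step 2:

```ruby
# config/deploy.rb -- sketch only; hostnames are illustrative.
role :app, "app-1.nameterrific.net", "app-2.nameterrific.net",
           "app-3.nameterrific.net"   # <- the newly cloned instance
role :web, "app-1.nameterrific.net", "app-2.nameterrific.net",
           "app-3.nameterrific.net"
role :db,  "db-1.nameterrific.net", primary: true
```

After that, `cap deploy` (step 4) pushes the current release to all listed app servers in one go.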
Really? I think the database is usually the bottleneck in most web apps. That's why you cache data: to avoid hitting the database.
I'm using Nginx as the load balancer, and there are no session problems for me because I use the built-in secure cookie store.
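For reference, the Rails 3 cookie session store is a one-liner; the app and key names here are placeholders:

```ruby
# config/initializers/session_store.rb -- sketch; "MyApp" and the key
# name are placeholders for whatever the app actually uses.
MyApp::Application.config.session_store :cookie_store, key: "_myapp_session"
```

Because the session lives in a signed cookie on the client, any app instance can serve any request, so the load balancer needs no sticky sessions.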
Cache me if you can! If you're a high-traffic site and you are not caching static content (I highly recommend Varnish), then you are throwing money or good user experience out the window.
I think that in a growing site with a lot of custom per-user content (like a social network), the extra complexity of a cache layer and managing expiry is more pain than it's worth while you're iterating on the product quickly. If you're mostly a content site, it's definitely the #1 thing you should be doing.
Realising that we're both at the same time, depending on the user or the page, means that sometimes caching is the right thing to do and sometimes it isn't. I was leaning too far towards not, and I'm happy now with the balance we've picked.
(We have a separate cookie that is present for signed-in users, so the frontend knows whether it should fire the annotation request.)
The result is that we can serve a sudden influx of unauthenticated users (e.g. from Google News or StumbleUpon) from nginx alone, which gives us massive scale from very little hardware. It's likely that the network is actually the bottleneck in this case, and not nginx.
An extra AJAX request grabs the user's logged-in status, CSRF token and similar data as JSON, and then modifies the page so the user sees what they expect (a logout button, a comment form, etc.).
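One way to get this split in Nginx is to bypass the page cache whenever the signed-in cookie is present; the cookie name, cache zone, and upstream below are assumptions, not the site's actual config:

```nginx
# Hypothetical sketch: serve cached pages to anonymous visitors only.
# "signed_in" is an assumed cookie name; "pagecache" and "app_backend"
# are placeholder names.
location / {
    proxy_cache        pagecache;
    proxy_cache_bypass $cookie_signed_in;  # skip the cache for signed-in users
    proxy_no_cache     $cookie_signed_in;  # and never store their responses
    proxy_pass         http://app_backend;
}
```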
If you're on Django I think cache-machine does the same thing. There are some things you won't get cached this way that you could manually (functions and procedures), but I think they're both conservative enough that you won't return stale resources.
One of the great things about Rails 3 is how easy it is to start caching. It can, and does, cause issues, so make sure you're covering caching in your test suite. I'm definitely looking forward to seeing how Rails 4 pushes this forward.
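The core pattern behind Rails' cache API is fetch-style memoization: look up a key, and only run the expensive block on a miss. Here's a dependency-free sketch of the idea in plain Ruby (TinyCache is an illustration, not a Rails class):

```ruby
# Plain-Ruby sketch of the fetch pattern that Rails.cache exposes:
# return the cached value if present; otherwise run the block,
# store its result, and return it.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

cache = TinyCache.new
first  = cache.fetch("report") { 21 * 2 }        # miss: block runs, returns 42
second = cache.fetch("report") { raise "miss" }  # hit: block is skipped, returns 42
```

The second call never executes its block, which is exactly why stale-expiry bugs are worth testing for: the cache happily returns whatever was stored first.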
And I agree completely. Varnish is goddamn magical, and a fascinating exercise in a principled approach to kernel-oriented memory management (a mindset I don't normally adopt).
This sort of information helps people building the next group of startups save both time and money.
Keep up the great work!
It read more like marketing for the company than anything of much worth.
Don't use the words "web scale".
It is a meaningless term. How many requests per second, and how many guest and user sessions for how long, is "web scale"? Does "web scale" just mean Christmas-shopping traffic on your crappy e-commerce site that no one visits? Does it mean you can survive being the top link on HN on a Friday morning, or getting slashdotted, or DoS'd by Anonymous? Or surviving a fucking flood where the data center lifts off the ground, with enough flexibility and strength in the trunk lines to handle a tsunami?
Don't just throw users at it.
Unless testing is very costly and you need as many users' eyes on your untested code as possible, that is just stupid. Look at Tsung, Grinder or JMeter, or the many other ways you could generate load, as a first step before you do that.
Don't gloss over the details.
Sure, you said you were using Rails 3.2 and Postgres, and a tad bit about the architecture, but who in the hell doesn't know that you need to load balance, and need to put the DB servers on different VMs/servers than the apps? Although having everything installed on both, with some of it just not turned on and live, is not a bad idea for emergency fault tolerance, and you didn't mention that.
1. Is it possible to spin up two apps on Heroku?
2. What load balancers are available with the above?
3. Anyone have a link to a run down of how to back up your Postgres DB periodically?
"Pgbackups" answered this question.