This is what it always comes down to for me. Once we can get our app running efficiently, with reasonable memory requirements, on a single core, scaling it is really just a matter of launching more processes and load balancing across them. Scaling out boxes is always easier than scaling out code, for better or worse.
I'm not talking about load balancing servers, mind you, but load balancing applications on any given host. There are two layers of load balancing. Once I'm comfortable with the resource requirements, I launch as few or as many instances of the app as I need, in a predictable manner, across any number of nodes. It's comforting to know that given X cores and Y RAM, I can run N services.
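
To make that last bit of arithmetic concrete, here's a rough sketch. It assumes each instance is single-threaded (so one core each by default) and that you've measured a peak memory footprint per instance. The function name, parameters, and numbers are made up for illustration, not taken from any real tool.

```python
def max_instances(cores: int, ram_mb: int, ram_per_instance_mb: int,
                  cores_per_instance: int = 1) -> int:
    """Estimate how many app instances fit on one host.

    Hypothetical helper: the host is limited by whichever resource
    runs out first, cores or RAM.
    """
    if ram_per_instance_mb <= 0 or cores_per_instance <= 0:
        raise ValueError("per-instance requirements must be positive")
    by_cpu = cores // cores_per_instance      # instances the cores allow
    by_ram = ram_mb // ram_per_instance_mb    # instances the memory allows
    return max(0, min(by_cpu, by_ram))


# An 8-core box with 16 GB, each instance peaking at 512 MB: CPU-bound.
print(max_instances(cores=8, ram_mb=16384, ram_per_instance_mb=512))  # 8
# Same cores but only 2 GB of RAM: memory is now the limit.
print(max_instances(cores=8, ram_mb=2048, ram_per_instance_mb=512))   # 4
```

In practice you'd probably hold back some headroom for the OS and the host-level load balancer, so treat the result as an upper bound rather than a target.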