

Ask YC: Scaling questions. - izak30

Ok, so, from most posts on the subject, it seems that people around these parts like AWS. Contrary to that opinion, I chose (mt)(gs) to build my application on, partly due to funding and the fact that (mt) is solidly $20/mo, every month.

The question: I'm growing, and I need to know how scaling _should_ work. I have a hosted, subscription-based CMS. There is one version of the toolset, hosted on the same server as a number of domains that use the tools, and I also provide e-mail via webmail, IMAP, and POP (via (mt)).

As everybody knows, (mt) is a little slow and a little prone to failure.

My current direction (needs work): when I start to max out one (gs) account, get a second one and run two copies of the program (repeat as necessary).

This, to me, will cause many headaches once I have more than, say, two servers (which could be quite soon).

So, if my option were AWS, how would that change? I can't just add resources to an already-running server, right? This just-in-time stuff is pretty new to me.

Would it run the same way, but with n servers pulling the app and client files from S3?

Could I have n servers pull the app and files, with server x running the app or starting a new server based on load?

How do the DNS issues work for AWS?

What if my users want webmail? Should that all be on a separate server instance, or can I scale that too?
======
mpc
Have you considered re-architecting to leverage the processing power of the
client? Shifting where some of the computation takes place can help you get
over some serious scaling problems.

Case study from past experience: imagine a nested commenting system such as
this one, with 10x the data and 20x the usage. Instead of building everything
on the server and sending a massive HTML file, you could serve a small page
that requests a JSON file on load and then builds the comments dynamically on
the client. An architecture shift like that would result in a serious
improvement in network bandwidth, CPU, and RAM.
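
A minimal sketch of that shift (TypeScript; the Comment shape, the
/comments.json endpoint, and the element id are all invented for
illustration):

    // Hypothetical sketch: ship a small static page, then fetch the
    // comment tree as JSON and build the DOM on the client.
    interface Comment {
      author: string;
      text: string;
      children: Comment[];
    }

    function renderComment(c: Comment): HTMLElement {
      const div = document.createElement("div");
      div.className = "comment";
      const p = document.createElement("p");
      p.textContent = `${c.author}: ${c.text}`;
      div.appendChild(p);
      // Recursion makes arbitrary nesting depth free on the client.
      for (const child of c.children) {
        div.appendChild(renderComment(child));
      }
      return div;
    }

    fetch("/comments.json")
      .then((res) => res.json())
      .then((comments: Comment[]) => {
        const root = document.getElementById("comments")!;
        for (const c of comments) root.appendChild(renderComment(c));
      });

The server's job shrinks to serving one small static page plus a JSON blob,
which is much cheaper to generate and cache.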

~~~
izak30
Yeah, I am actually already doing quite a bit client-side. (I have no idea
what got you downmodded, btw.) But as far as overall bandwidth goes, I still
can't use one server forever, no matter how the software is architected.

------
showerst
I think, rather than looking at how your hosting should be structured, you
need to pay more attention to how your application is structured.

As carpal said, where are your bottlenecks? Is your app structured as one web
server talking to one or more replicated database servers, or as multiple
'app' servers that can each take any request, or ...?

In my (rather limited) experience, the most common scaling problem for apps is
database performance, so it may be in your best interest to get a host where
you can change your DB config to raise the (rather conservative) default
resource limits. (And make sure to take a good hard look at your indexing and
queries first.)

If you're having RAM or CPU problems, you may just need a beefier server; if
it's bandwidth or I/O, then a scalable service like AWS may be a good
solution.

AFAIK, the key for most small growing apps is structuring things so that your
bottlenecks sit on sharded databases, letting you throw more hardware (virtual
or physical) at the problem when you hit some user limit, possibly even on the
fly.
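
For illustration only, the basic shape of that routing might look like this
(TypeScript; the shard hostnames and the modulo scheme are assumptions, not a
recommendation):

    // Hypothetical sketch: place each user on a database shard by id, so
    // growing means adding a shard (plus migrating the affected users,
    // which this sketch glosses over).
    const shards = ["db1.internal:3306", "db2.internal:3306"];

    function shardFor(userId: number): string {
      // Simple modulo placement; consistent hashing would cut down how
      // much data moves when a new shard is added.
      return shards[userId % shards.length];
    }

    // Every query for user 42 lands on the same box.
    console.log(shardFor(42));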

I think I may be wrong about this now, but at least in early trials it seemed
like most people were using S3 and AWS for backups and backend processing/data
mining rather than on-the-fly work, due to data-integrity concerns.

I'd also recommend eventually moving to a more full-featured host, once your
business can sweat the $300-500 per server per month fees you'll run into with
a host like Rackspace.

Having great backups and the ability to get someone on the phone at 3am with
no hold is essential when your server is the only thing paying your bills =P.

~~~
izak30
>AFAIK, the key for most small growing apps is structuring things so that your
bottlenecks sit on sharded databases, letting you throw more hardware (virtual
or physical) at the problem when you hit some user limit, possibly even on the
fly.

Joshwa's link at the bottom of the page suggests just the opposite: using many
cheap servers and distributing well. So it's not only a 'program architecture'
issue, but also a question of where your data is stored and how redundant it
is across those 'shards' of cheap servers.

~~~
showerst
That's actually exactly what I was trying to say.

You set up your app so that when you hit a limit (users, or whatever your main
DB chokes on), you just add another cheap slave to the pool, as opposed to
having one or two super-important servers that need hardware upgrades to
expand.
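
Roughly this shape, say (TypeScript sketch; the hostnames are made up, and
real replication-lag handling is glossed over):

    // Hypothetical sketch: reads rotate across cheap slaves, writes go to
    // the single master. Scaling reads = appending a hostname to the pool.
    const master = "db-master.internal";
    const slaves = ["db-slave1.internal", "db-slave2.internal"];
    let next = 0;

    function hostFor(isWrite: boolean): string {
      if (isWrite) return master;
      const host = slaves[next];
      next = (next + 1) % slaves.length; // round-robin over the pool
      return host;
    }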

------
joshwa
Using AWS doesn't magically make your application 'scale'. You still need to
architect it in such a way that you can add additional resources (app server,
db server, web server, etc) to accommodate your load.

What AWS (or Slicehost, or another VPS) _does_ take care of is the server
provisioning/bootstrapping process: if you have images set up for each of
your server roles, you don't have to go through the headaches of calling up
your server company, ordering servers, doing burn-in testing, etc. You just
whip out your CC, and 10 minutes later you have a working server. (Or, in the
case of AWS, you can automate the whole provisioning process and scale in
near-real-time according to load.)

For your situation, where you're hosting domains, you probably want one or
more frontend/"master" boxes hosting the domains that internally forward the
traffic to whichever server is actually hosting that instance of the app.
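
A rough sketch of what that frontend box might do (TypeScript on Node; the
domain-to-backend map is invented):

    // Hypothetical sketch: a frontend "master" inspects the Host header
    // and forwards the request to whichever internal box runs that
    // customer's instance of the app.
    import * as http from "http";

    const backends: Record<string, { host: string; port: number }> = {
      "customer-a.example.com": { host: "10.0.0.11", port: 8080 },
      "customer-b.example.com": { host: "10.0.0.12", port: 8080 },
    };

    http.createServer((req, res) => {
      const target = backends[req.headers.host ?? ""];
      if (!target) {
        res.writeHead(404);
        res.end("unknown domain");
        return;
      }
      const proxied = http.request(
        { ...target, path: req.url, method: req.method, headers: req.headers },
        (upstream) => {
          res.writeHead(upstream.statusCode ?? 502, upstream.headers);
          upstream.pipe(res); // stream the backend's answer straight back
        }
      );
      req.pipe(proxied);
    }).listen(80);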

~~~
izak30
I wouldn't suggest that anything is 'magic' about any web service, and AWS is
just an example. This is more of a web-theory question.

Most startups with low funding start out on cheap servers and then scale to
multiple cheap servers (Google). If that is the right answer, then multiple
(gs) accounts _could_ be better than a single (dv) account or even multiple
AWS accounts. But suppose there is something about another service that makes
scaling easier, or some way to architect your program so that you can easily
scale across multiple servers (and not just run multiple instances of the
program on multiple servers), i.e. to divide as necessary rather than to
pre-divide your resources. Is that the way to do it from the start, or does
everybody start very, very small?

~~~
joshwa
>some way to architect your program so that you can easily scale across
multiple servers (and not just run multiple instances of the program on
multiple servers)

See this article:

http://highscalability.com/unorthodox-approach-database-design-coming-shard

In fact, see that whole site. There's a lot of stuff there.

~~~
izak30
Thanks.

------
izak30
Ok, so same question, different form.

Is anybody managing servers in-house? PStamatiou points out that you can have
a cheap, low-power PC for less than $200 (I'd make the majority of them
diskless, for less than $150 now), so what kind of break-even point is there?
Has it been worth it to you to deal with the hardware headaches for the $$? At
that sort of price, you're paying Amazon or Linode or somebody to deal with
the hardware problems, to have generators, etc.

What if I were to host one other app in my in-house datacenter, in exchange
for having some other sysadmin from YC host mine?

The DNS issues with using EC2 as a web host (EC2 servers are not directly
addressed) were pretty unfamiliar to me before I wrote the parent post.

I think using in-house cheap processing and S3 for storage/major bandwidth
isn't a bad idea at all. Any suggestions?

------
jey
What are mt, gs, and dv?

~~~
PStamatiou
Media Temple (the company), Grid Server (their low-end server offering), and
Dedicated Virtual (their mid-range server offering).

~~~
Zak
To be a bit more accurate, GS is fancy shared hosting that lets you do things
like host Rails apps on Mongrel on a separate server from your static site. DV
is a managed VPS - nothing more, nothing less.

If you're not afraid of server administration, there are more cost-effective
VPS options. That said, I have a customer on MT, and their service and support
are great.

------
webwright
Link to the site? If it's a content play, 1 server at ServerBeach could last
you a long time.

If it's a video sharing site, not so much.

------
PStamatiou
"As everybody knows, (mt) is a little slow, and a little prone to failure."

I think that's a slightly ignorant pov. You can't judge all of (mt) with your
(gs) experience. I've been on a mid-range (dv) for years and it's been nothing
but kittens and daisies. Which brings me to this point - why get 2 (gs)'s and
not just get a more powerful (dv)?

~~~
izak30
Thanks for the insight. "As everybody knows" was a bit presumptuous, but it
seems to be the feeling around here, and it's been my experience that once the
west coast wakes up, my response times get much slower. I know from your other
posts that you've had an overall very positive experience with them, but one
(dv) still poses the same problems as above once I need multiples.

~~~
llimllib
Can I have a glossary of what the heck you're talking about?

~~~
izak30
AWS - Amazon Web Services (cluster servers and simple storage)

(mt) - Media Temple hosting

(gs) - Media Temple Grid Server (their lowest offering, akin to Mosso)

(dv) - Media Temple Dedicated Virtual

------
carpal
It depends on what your bottleneck is. RAM? Processor? Bandwidth?

If it is RAM, you'd be better off getting fewer servers with more RAM on each.

If it is CPU, you'd be better off trying some software configuration options
before scaling horizontally.

If it is bandwidth, you'd be better off getting a server with a fatter pipe.

------
dawnerd
If you call Media Temple, they can help you set up a system that works exactly
the way you want. The Grid is a good starting point, though.

~~~
izak30
>If you call Media Temple, they can help you set up a system that works
exactly the way you want. The Grid is a good starting point, though.

Yeah, after an hour on hold (with no real person talking to me at all), I hung
up. I can't deal with an hour of hold time for tech support.

------
agentbleu
I just set up a new Linode. It's like 30 USD a month for 512MB RAM / 300GB
bandwidth, support rocks since it's via IRC and other users, and you have
total control over the setup; plus, it supports Rails, although I use LAMP.
It's great having the control, and it's worth learning how to set up the
server the way you like. I also use Virtualmin as a control panel, which makes
for a very elegant system.

~~~
rms
That's the cheapest 512MB VPS I've seen... I might switch instead of upgrading
my VPS on Slicehost.

