

Scaling: what to worry about first? and then? and then? - niels_olson

Our goal is 'making med school easier, one less click at a time' (tmedweb.tulane.edu). We have no business model; we're just trying to make our own lives easier. We went from Bluehost to our own server on the university's network, and we stayed with CentOS along the way. The box is a spare dual-core Dell desktop (the highest load average I've seen was about 0.30 0.14 0.10). We have a 10 Mbps card and a 100 Mbps card (our current bottleneck), a little 500 GB RAID from G-Tech (total used is ~8.4 GB, but if we take on lecture audio or video, this will get eaten fast), a UPS (enough to drive in if needed), etc. We end up working a lot with the university's very helpful netsec guy for ports, bandwidth throttles, that sort of thing. We get about 500-600 visits a day.

I'm also getting ready for a different project, to take on the military's medical records system, which could get huge fast. So the tmedweb project is also a bit of a sandbox for what to do for that bigger project.

Gigabit Ethernet is probably next on the todo list; what else should we be getting ready for? In what order do people generally arrive at which scaling issues, going from where we're at now to "big"? Not Google big, but, let's say, Navy big.
======
keefe
Here is my advice, having struggled with scalability for the last two years;
please take it with a grain of salt. Regarding the previous discussion of EC2
etc.: EC2 just gives you headless Linux boxes that you can do with as you
please. You have to configure the disks specially to get persistent data,
or use S3 or SimpleDB. It also assumes a lot of background knowledge of Linux.
For scalability, I would first concentrate on the engineering
quality of your codebase. Get lots of tests and get them running nightly and
automatically. Automation of tasks is your friend, so if you don't already
know Ant, spend the two days it takes to learn it. To really scale well, you are
going to need a cluster of machines - this means your app is going to be
running in multiple address spaces so you will need to handle that issue - if
you haven't already done so, look up memcached. I think you should focus on
stability first: make sure your app can run "forever". The two big blockers
for you there are memory usage and concurrency issues. Memory usage means
cleaning up after yourself and avoiding memory leaks; instrumentation and a
profiler are your friends here. For concurrency issues, you should do a
thorough study of thread management and make sure your locks are consistent.
Once you have all these stability issues under control, I think it is about
having good interfaces for key components of your code, identifying
bottlenecks in performance (again, instrumentation) and then rewriting these
bottlenecks. There is also the issue of making sure that you are transmitting
the minimum amount of data required and strategies for scaling up. If you are
doing work for the Navy, be prepared for fairly serious scrutiny of your
security practices as well - security, like scalability, is much harder to
tack on later. Last but not least, remember that premature optimization will
kill your time.

------
fusionman
Maybe you should focus on building your product and utilize a service that
will be able to scale better than you will be able to do on your own. Check
out Amazon Web Services <http://aws.amazon.com/>. There are a lot of big
sites using their EC2 and S3 services, so they are worth a look. That said,
I am not a scaling expert, so if you're hell-bent on doing it yourself, good
luck! I'm sure somebody on here will be able to offer you "do it yourself"
advice.

~~~
niels_olson
Oh, we have no desire to do it ourselves. We only started using our own server
because of the politics of hosting university content offsite. It makes the
relations with the professors easier.

I've been looking for ways to move storage and bandwidth to AWS, but I only set
up my own personal account with JungleDisk a couple of weeks ago. What
utilities do you recommend for shifting webserver stuff to AWS? What can be
pushed? As I understand it, it's strictly blobs, and all data manipulation
still has to come back to the site's CPU for processing. Am I mistaken?

