

Ask HN: On scalability and memory footprint - jgalvez

One of my co-workers constantly argues that we should not worry about memory footprint. For instance, let Rails use as much memory as it wants, the reasoning being that it would be pointless to try to cut down memory usage since we can just expand to as many slices with as much memory as we need on EC2.

Now, is there a significant speed increase in trying to keep memory usage low? What would be the educated answer to this? I could surely argue yes using my programming knowledge and practice alone, but I think a truly informed answer requires a deeper understanding of things.
======
ezmobius
Yes, there is a big advantage to keeping your memory consumption low with Ruby
apps. Ruby's garbage collector is not the best: it has to walk all the objects
in the process when it collects, so the more memory your process uses, the
longer and more often GC will run. This will degrade performance the more
memory you use. In fact, I've seen really leaky apps spend _most_ of their
wall clock time in the garbage collector.

Sure, throwing more EC2 instances at the problem is one solution, but if you
care about your app's performance you will try to optimize for a smaller
memory footprint.
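
The effect is easy to observe on modern Ruby (1.9+, not the 1.8 of this era), where GC.stat exposes the collection count; this is a minimal sketch, not a benchmark:

```ruby
# Sketch: the more live objects the heap holds, the more work each GC cycle
# does, since the collector must walk every live object in the process.
GC.start                      # start from a clean slate
before = GC.stat[:count]      # total collections so far in this process

junk = Array.new(200_000) { |i| "string-#{i}" }  # many live objects to walk
GC.start                      # every one of those strings is traversed here

after = GC.stat[:count]
puts "GC ran #{after - before} more time(s) with #{junk.size} extra live strings"
```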

------
wheels
"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil." -Donald Knuth

That said, watch your _n_'s. If you scale linearly, you can keep throwing
servers at the problem. If you end up with some critical piece of your code
that's, say, quadratic in run-time or memory consumption, you can't. And when
you get there: profile, don't guess. Monte-Carlo code optimization is the
beginning of the end for an otherwise clean code-base.

~~~
alecco
Only consider that quote if you've been through all or most of Knuth's TAOCP.

Good design lets your site stay lean and mean.

~~~
ericb
While going through TAOCP is good advice, I think the quote stands on its own.

------
furiouslol
Here is my experience. I used Rails for one project of mine that required
processing millions of data rows. Because the Rails ORM creates an object for
each data row, we ended up using a lot of memory. We had to get a server with
2GB of memory to host the project. Even after we avoided the ORM (which
removes the pleasure of coding in Rails), the memory usage was still high.

Sure, we could process the data rows outside of Rails in C, but because the
processing of data rows is an integral part of the project, that would mean
coding 80% in C and 20% in Rails. Not exactly an enjoyable experience.

So we rewrote it in PHP, avoiding objects in favor of plain functions and
hashes/arrays. And it worked very well for us. The site render time dropped
from 0.8s to 0.03s, and memory usage rarely exceeds 100MB.
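
The object-per-row vs. plain-arrays tradeoff can be sketched in Ruby itself; the names here are illustrative, not the poster's actual code:

```ruby
# Illustrative sketch: an ORM materializes one wrapper object per row, while
# the "functions and hashes/arrays" style works on the raw rows directly.
RowObject = Struct.new(:id, :amount, :created_at)  # stand-in for an ORM model

raw_rows = Array.new(100_000) { |i| [i, i * 2, '2008-01-01'] }

# ORM-style: 100,000 extra wrapper objects for the GC to track
orm_rows = raw_rows.map { |r| RowObject.new(*r) }

# Lightweight style: aggregate over the plain arrays, no per-row wrappers
total = raw_rows.inject(0) { |sum, r| sum + r[1] }
puts "total amount: #{total}"
```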

~~~
davidw
You do know that you can connect to databases and do stuff in Ruby without AR,
right?

~~~
furiouslol
Yes. That's what we did. But the memory usage was still higher than the PHP
option.

------
tdavis
"Memory usage" is pretty vague. If you're just talking about writing poor code
that naturally requires more memory to store objects and such, it's not a big
issue if you can scale linearly. What you really need to watch out for is
memory leaking. This can cause processes to slow down dramatically, and there's
nothing you can do to scale around it when that happens; your app server will
take requests and probably even have RAM to spare, but the actual processing
of each request will take forever and your site will be sloooow. This is
especially relevant for long-running processes. I have accidentally written
memory leaks into programs that, despite having more than enough RAM to grow
into, eventually slowed down to an absolute crawl. I'm talking 1 loop per
second to begin with, and by morning it's doing 1 loop every 4 hours.

Premature optimization? Avoid it. Writing good code? Don't avoid it because
bad code is easier and memory is cheap. There are some times when it makes no
logical sense to allow something to use tons of memory, despite how much you
have available. For instance, if you request the same page from your site over
and over again and the memory usage continues to increase, that's probably a
bad thing. What is that process storing in local memory and not garbage-
collecting after a request? It's HTTP, so there should be no persistence at
the web framework level (i.e. in Rails). It should handle a request, send the
response, and move on to the next one, shedding all the local variables
created in the process. If your memory footprint is increasing while your real
load remains the same, that's bad. If you're storing the same object in memory
in 5 different places, that's bad -- but not that big a deal so long as
they're all gone when the request has been served.
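
The "same page, growing memory" symptom usually comes down to state that outlives the request. A hypothetical sketch of the pattern (not from any real app):

```ruby
# Sketch of an accidental leak: class-level state accumulates across requests,
# so memory grows even though the real load (5 distinct pages) stays constant.
class LeakyRenderer
  @@seen = []  # survives between requests in a long-running process: the leak

  def handle(path)
    @@seen << path.dup   # appended every request, never cleared
    "rendered #{path}"
  end

  def self.retained
    @@seen.size
  end
end

r = LeakyRenderer.new
1_000.times { |i| r.handle("/page/#{i % 5}") }  # the same 5 pages, repeatedly
puts LeakyRenderer.retained   # 1000 retained entries for only 5 distinct pages
```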

(Flamebait P.S. -- save memory, kill Rails.)

------
ashleyw
1) Build your app how you would normally build it (obviously don't expect to
put massive objects into memory for every request — just don't worry about the
small things)

2) Go back and refactor the things that really stick out as bulky

3) Deploy

4) Continue your release cycles and refactor things as you come across them


As the saying goes — hardware is cheap, your time isn't.

~~~
jamesbritt
"As the saying goes — hardware is cheap, your time isn't."

Don't many small companies find that they have more time than money?

~~~
ashleyw
I guess, but that "extra hardware" wouldn't be that much. I was talking about
like 1 second extra CPU time per 50 page loads or something — I wouldn't worry
about that kind of thing, even if I had time to make my code slightly faster,
I'm sure I could spend my time better. Maybe I'd refactor it later on when I'm
adding extra functionality to that area of code.

I think if you're a small company you shouldn't be worrying about small issues
like this, even if you have more time than money. It may make your app more
efficient, but it's the same trap that makes people polish their apps forever
and never get them out the door! Spend the time on things that truly matter,
and fix slightly slower code when it begins to cause a problem. :)

------
ericb
If you're building a fairly standard web app, you probably shouldn't be
worrying at this stage. Keeping memory usage low does not necessarily mean a
significant speedup; it means your memory usage will be low. Sometimes you
trade one thing for another, depending on what you're doing.

When the app is written, if you run a load test (or you can let your users be
your load test, like Twitter), there will be a bottleneck for some n of users.
Remove this bottleneck, which could be cpu, database, memory, connections,
bandwidth, etc., and there will be a new bottleneck at some > n users. Rinse,
lather, repeat.

If you optimize now, you're most certainly optimizing the wrong thing. When I
load tested my first production rails app, I found cpu was the first
bottleneck I hit, not memory.

------
anotherjesse
I run userscripts.org, a Rails site that runs on a single ServerBeach box.

To keep memory usage low, I find that recycling mongrel processes via monit
helps a bunch.

Before monit, my mongrels would quickly hit 2GB; after adding a monit rule to
restart any mongrel that goes over 100MB, my memory usage is around 800MB for
15 mongrels. (I'm assuming you are using HAProxy or a similar balancer that
can deal with changes in availability.)
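
A monit rule along these lines implements the 100MB restart policy; the paths, port, and pidfile here are hypothetical, not taken from the site's actual config:

```
check process mongrel_8000 with pidfile /var/run/mongrel.8000.pid
  start program = "/usr/bin/mongrel_rails start -p 8000 -d -P /var/run/mongrel.8000.pid"
  stop program  = "/usr/bin/mongrel_rails stop -P /var/run/mongrel.8000.pid"
  if totalmem > 100.0 MB for 2 cycles then restart
```

The `for 2 cycles` guard keeps monit from restarting a process on a single momentary spike.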

<http://userscripts.org/articles/2-scaling-a-rails-site>

Another useful tip is to make sure your queries aren't doing stupid stuff. A
long time ago I had integrated Beast (a Rails forum project) into my site.
Unfortunately it was loading EVERY topic on the forum index page, which
resulted in slowness as well as large jumps in memory usage. (The culprit was
a .last call that caused all the records to load just to grab the last value.)
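
The .last trap can be sketched in plain Ruby (hypothetical data; the accessor stands in for an association that loads the whole table):

```ruby
# Sketch of the .last trap: the accessor loads every record, then Array#last
# throws all but one of them away.
ALL_TOPICS = Array.new(50_000) { |i| { :id => i, :title => "topic #{i}" } }

def all_topics
  ALL_TOPICS.dup  # stands in for an association materializing every row
end

newest = all_topics.last  # 50,000 rows loaded to read a single one

# The fix is to push the work into SQL with an ordered, limited query --
# in Rails 2-era terms, something like:
#   forum.topics.find(:first, :order => 'id DESC')
puts "newest topic: #{newest[:title]}"
```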

So - I tend to agree with "don't worry" because you can "patch" over memory
usage pretty easily.

My site currently serves 17 req/s on a single box - it is never what you
think will be the issue that you actually have to fix.

~~~
jgalvez
That's the thing that makes me nervous. When you've gotten to a point where
you need to periodically restart your processes because they somehow
uncontrollably start consuming more memory than they should, something must be
failing badly. Where there's smoke, there's fire.

------
iamelgringo
Your bottleneck in a web app is almost always the database. In some cases,
it's not. So, it depends.

If, like furiouslol, you're trying to access a 2-3 million row table and
Active Record is topping out at 2GB of RAM and taking close to 1 second on a
production machine just to render a page, then yes, you might need to worry
about memory usage.

But, if you're prototyping a site, and just getting it off the ground, I would
worry about memory usage later.

If you're really freaked out about performance, you can always try PHP or
Django. They tend to be a bit more sparing on system resources.

------
Hoff
Arguing performance and scalability and footprint can be less than fruitful; a
distraction.

If you're serious about this, build your test cases, and benchmark.

But before you invest here, make sure you have nothing better to do with your
time, and that the probable payback can be justified against the aggregate
investment: against the costs of the testing and of the migration. And make
sure you have nothing better to debate, for that matter.

