
Ask HN: Where/How can I learn more about general webapp maintenance? - ha470
Particularly things like fault tolerance, speed, unit testing, code abstraction, etc.

I've been hacking away/making webapps for a while and although they're functional they aren't optimized for heavy use. Now that I'm working on a startup/production application, I've realized that general maintenance is a skill set I really haven't perfected. Any good sources for these - especially related to Ruby/Rails? I'm working through Code Complete/The Pragmatic Programmer for general good code practice, but would love some other books/blog posts/anything on the matter.

Thanks so much!
======
mattmanser

      1. Database, database, database
      2. Database
      3. It's probably your database unless you're doing something CPU or disk intensive, for example resizing and prettifying pictures, rendering a 3d image, etc. Realize that these days looping a few thousand times is trivial.
      4. It's your database
    

Starting with _fault tolerance, speed, unit testing, code abstraction, etc._
is starting at the wrong point. 90% of performance problems are at the
database level.

What's most likely wrong with performance in a webapp:

    
    
      1. No/bad indexes/missing foreign keys on your db
      2. Stupid joins
      3. You're doing stupidly complicated things in an ORM
      4. You're doing complicated things in the DB instead of loading a large chunk into memory and doing calculations or aggregations in code. Simple DB queries are a lot faster than you think they are. Complicated ones are a lot slower than you think they'll be. There are also weird gotchas in DBs: using a function in a WHERE or SELECT clause, for example, can cause a massive performance hit.
      5. You're not caching things that change infrequently but are queried regularly, either in memory or in something like memcached. Memcached is actually overkill most of the time. Actually think about how much memory storing X would take compared to how much memory your machine has. Be surprised at how insignificant it is these days.
      6. 1-5 are especially true if you're using MySQL - It's great and all, but all the other big DBs piss all over it for out of the box performance. You have to give it some love. I expect some dissent here. They're wrong. MS SQL is a shit ton better at handling a poorly designed db/db queries out of the box than MySQL. I can't stress this enough.
    

tl;dr Start looking at your database performance before anything else.
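
To make point 5 concrete, here is a minimal sketch of an in-process cache with a time-to-live, in plain Ruby. `TinyCache` and its `fetch` method are hypothetical names made up for this illustration, not part of Rails or memcached, and the sketch assumes a single process where a slightly stale value is acceptable.

```ruby
# A tiny in-process cache with a time-to-live (TTL) per entry.
# TinyCache is a hypothetical name used only for this illustration.
class TinyCache
  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @store = {}        # key => [value, expires_at]
    @lock = Mutex.new
  end

  # Return the cached value for key while it is still fresh; otherwise
  # run the block, remember its result for ttl_seconds, and return it.
  # `now` can be passed in explicitly, which keeps the class easy to test.
  def fetch(key, now: Process.clock_gettime(Process::CLOCK_MONOTONIC))
    @lock.synchronize do
      value, expires_at = @store[key]
      return value if expires_at && now < expires_at

      fresh = yield
      @store[key] = [fresh, now + @ttl]
      fresh
    end
  end
end

# Usage: the block stands in for a slow, rarely changing DB query.
query_runs = 0
cache = TinyCache.new(60)
3.times do
  cache.fetch(:categories) { query_runs += 1; %w[ruby rails] }
end
puts "query ran #{query_runs} time(s)"
```

The lock is held while the block runs, so concurrent callers wait rather than all recomputing the same value. That is a simplification that trades throughput for simplicity.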

~~~
jtchang
Have to agree here. It is much easier to screw up a database schema than
application code.

I am talking in terms of performance here. Also, I always feel that if you
understand the schema (whether relational or not) of an application you
understand the domain.

~~~
ha470
This seems like the common point in each of these comments. Optimizing my
ActiveRecord queries will be my first todo.

------
pbh
I'm not an expert, but this is what I've cobbled together as a fellow
Ruby/Rails startup person.

 _Fault_ _Tolerance_ : Just use Heroku. We've seen maybe an hour of downtime
in a few months?

 _Speed_ (and fault tolerance): Cache everything you can, assets on S3/CF.

 _Testing_ : Rails Test Prescriptions by Noel Rappin
[<http://pragprog.com/book/nrtest/rails-test-prescriptions>]. Then you can
choose what you like, but I like Test::Unit, Mocha, FactoryGirl.

 _Code_ _Abstraction_ : Rails is already pretty sensibly organized, and if MVC
+ tests + static assets is not a good fit for your webapp, you really should
not be using it in the first place. One minor point: Noel and others will tell
you to use skinny controllers.

Curious what other people consider best practice for Ruby/Rails startups.

~~~
rbranson
"Cache everything you can"

This is terrible advice. Telling developers to cache everything is like
walking into a rehab clinic with Lindsay Lohan's purse.

Cache what you can measure as having a performance problem, which can't be
optimized in another way. It should always be used as a last resort.
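
As a rough illustration of "measure first", the sketch below times a few named steps with Ruby's standard `Benchmark.realtime` and lists them slowest first. `profile_steps` and the two sample steps are hypothetical stand-ins for real request work, so treat it as a starting point rather than a profiler.

```ruby
require "benchmark"

# Time each named step and return [name, seconds] pairs, slowest first.
# profile_steps is a hypothetical helper, not part of any library.
def profile_steps(steps)
  timings = steps.map do |name, work|
    [name, Benchmark.realtime { work.call }]
  end
  timings.sort_by { |_name, seconds| -seconds }
end

# The lambdas stand in for real work such as a query or a render.
steps = {
  "fast_lookup" => -> { 1 + 1 },
  "slow_report" => -> { sleep 0.05 },
}

profile_steps(steps).each do |name, seconds|
  puts format("%-12s %.1f ms", name, seconds * 1000)
end
```

Only the step that shows up at the top of such a list is a candidate for optimization, and for caching only if nothing cheaper fixes it.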

~~~
pbh
Sure, of course, premature optimization is the root of all evil and all that.
I presume you're reacting to a bad experience in the past you had with some
young developer not understanding database indexes or memoizing everything
unnecessarily or something. I certainly don't want to create another one of
those experiences!

That said, he asked about how to get web application speed. One major answer
(as discussed elsewhere in this thread) tends to be "add a cache somewhere."
Of course, you have to architect things right so the cache works (possibly
with a layer of indirection like app servers, or figuring out where a cache
would be helpful, etc.) and the code that the cache is obviating needs to be
slow in the first place.

That said, the general rule (add a cache for speed and scalability) has been
the case from Slashdot to MovableType/Wordpress to Facebook to App Engine.

~~~
rbranson
Caching is probably one of those things that if you have to be told to do it,
you probably shouldn't be doing it.

------
bricestacey
I've heard good things about New Relic for performance testing.

MiniTest, which is built into ruby 1.9 has a full suite including performance
tests. It's really cool because you can measure e.g. whether an algorithm
scales linearly or not.

~~~
rprime
New Relic's RPM has become an important asset in all of my applications, and
we use it heavily for a Rails app that receives ~5 million hits per day. It's
the best way to track performance issues, and it does so in production.

Tip for OP: As you might already have read on the internets, worry about
scaling later, because there's a very high chance that your app will need
special care and will differ from other people's cases.

~~~
ha470
Ah, thanks for the input on scaling. Good call - will worry when it comes.

------
rprime
A quick tip would be to run everything process-intensive in the backend. Don't
waste frontend resources on API calls, data processing, etc. If you've got
something that takes more than 200ms to run, place it in a job queue and do
the processing later.

Some tools for the job: <https://github.com/defunkt/resque>
<https://github.com/tobi/delayed_job>
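
The idea behind both tools can be sketched in plain Ruby with a thread and a queue. This is an in-process illustration only: `InlineJobQueue` is a hypothetical name, and unlike Resque (which stores jobs in Redis) or delayed_job (which stores them in a database table), jobs here are lost if the process dies.

```ruby
# In-process sketch of a job queue: the request path enqueues work and
# returns at once; a worker thread runs the jobs later, in order.
# InlineJobQueue is a hypothetical name used only for this illustration.
class InlineJobQueue
  def initialize
    @jobs = Queue.new
    @worker = Thread.new do
      while (job = @jobs.pop)          # a nil job is the shutdown signal
        begin
          job.call
        rescue StandardError => e
          warn "job failed: #{e.message}"   # one bad job must not kill the worker
        end
      end
    end
  end

  # Queue a block to run later on the worker thread.
  def enqueue(&job)
    raise ArgumentError, "block required" unless job
    @jobs << job
    self
  end

  # Run the jobs that are already queued, then stop the worker.
  def shutdown
    @jobs << nil
    @worker.join
  end
end

# Usage: the "request" returns immediately; the email goes out afterwards.
queue = InlineJobQueue.new
queue.enqueue { puts "sending welcome email" }
puts "redirecting user"
queue.shutdown
```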

Also don't forget to cache. You shouldn't worry too much if you do these two
properly.

~~~
ha470
Thanks! Using delayed job already for certain API calls, but will think about
how to use them more.

I think I definitely have to look into caching.

~~~
rprime
Basically, if one of your controllers has more than 10 lines of processing and
you don't need to present the data to the user immediately, you can put that
into a job. For example, when a user registers, you can add the new record to
the database and redirect them, but send the welcome email 1-2 minutes
later. And for presenting data, yeah, go cache everything. Also take a look
at Rails' partial caching features: even when a page's results are dynamically
generated, you can cache things like the categories or the blog post, and
retrieve just the comments.

------
nfm
From a performance point of view: test and measure! Find the worst bottleneck,
and reduce it. Rinse and repeat.

NewRelic is a great tool to help you do this.

Common performance problems to look out for in Rails are:

* Missing database indexes

* Long running code (eg PDF generation, file uploads to a third party) that should be put in a job queue

* Innocent-looking ActiveRecord calls that trigger N+1 queries, or that fetch way too much from the DB only to reduce the result set further in Ruby
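
The N+1 pattern in the last bullet is easier to see with the queries counted. The sketch below uses a hypothetical `FakeDb` class in plain Ruby instead of ActiveRecord, so every name in it is an assumption for illustration. In a real Rails app the usual fix is eager loading with `includes`.

```ruby
# FakeDb is a hypothetical stand-in for a database that counts queries.
class FakeDb
  attr_reader :query_count

  def initialize(posts, comments)
    @posts = posts
    @comments = comments
    @query_count = 0
  end

  def all_posts
    @query_count += 1
    @posts
  end

  # One query per post: this is what drives the N+1 pattern.
  def comments_for(post_id)
    @query_count += 1
    @comments.select { |c| c[:post_id] == post_id }
  end

  # One query for many posts, like a WHERE post_id IN (...) lookup.
  def comments_for_posts(post_ids)
    @query_count += 1
    @comments.select { |c| post_ids.include?(c[:post_id]) }
  end
end

# N+1: one query for the posts, then one more for each post.
def comment_counts_n_plus_one(db)
  db.all_posts.to_h { |post| [post[:id], db.comments_for(post[:id]).size] }
end

# Batched: two queries in total, with the grouping done in memory.
def comment_counts_batched(db)
  posts = db.all_posts
  ids = posts.map { |post| post[:id] }
  by_post = db.comments_for_posts(ids).group_by { |c| c[:post_id] }
  posts.to_h { |post| [post[:id], by_post.fetch(post[:id], []).size] }
end
```

Both functions return the same counts. The first issues one query per post plus one, while the second always issues two, however many posts there are.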

------
guard-of-terra
You can read up on the architecture of various massively scalable webapps.
<http://highscalability.com/> seems to aggregate those, and you can find
articles and slides published by the developers of more services if you do
some googling.

The key to speed and load tolerance is massive multilevel caching; every
service does caching differently but they all do.

------
etothep
Pick up Release It! (<http://amzn.com/0978739213>) from the Pragmatic guys.
While many of the examples and stories are based on Java webapps there is
something in there regardless of which platform you are building atop.

Oh, and don't ever establish a blocking connection without a timeout or some
other mechanism to abort it.
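
In Ruby, one way to follow that last rule for HTTP calls is to set the timeouts on `Net::HTTP` before connecting. This is a minimal sketch: the helper names, the timeout values and any URL passed in are assumptions for illustration, and other client libraries have their own equivalent settings.

```ruby
require "net/http"
require "uri"

# Build an HTTP client that gives up instead of blocking forever.
# http_client_with_timeouts is a hypothetical helper; the default
# timeout values are placeholders to tune per service.
def http_client_with_timeouts(url, open_timeout: 2, read_timeout: 5)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)   # does not connect yet
  http.use_ssl = (uri.scheme == "https")
  http.open_timeout = open_timeout   # seconds to wait for the connection
  http.read_timeout = read_timeout   # seconds to wait for each read
  http
end

# Return the response body, or nil if the remote side was too slow.
def fetch_or_nil(http, path)
  http.get(path).body
rescue Net::OpenTimeout, Net::ReadTimeout => e
  warn "request aborted: #{e.class}"
  nil
end
```

Returning nil on a timeout is just one choice. Depending on the call, retrying later from a job queue or showing a degraded page may be more appropriate.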

