

Making Highrise faster with memcached - bdotdub
http://www.37signals.com/svn/posts/1506-making-highrise-faster-with-memcached

======
mdasen
A lot of the time, memcached won't make things faster: it's meant to make
things more scalable. On small sites, memcached won't increase your speed, but
when you start getting lots of users, it's wonderful!

Memcached allows for O(1) lookups. Your database server probably uses b-tree
indexes which are great, but they're log(n) lookups. If you aren't dealing
with a large number of hits per second, your database will do very well. Once
you start getting a lot of lookups per second, n * log(n) starts looking a good
bit slower than n * 1.
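
To put rough numbers on the log(n) versus constant-time point, here is a minimal sketch. It assumes a binary search over a sorted list as a stand-in for a tree index (a real B-tree has a much higher fanout and visits fewer nodes), and a plain Python dict as a stand-in for a hash-based cache. The function name `binary_search_probes` is made up for illustration.

```python
import math

def binary_search_probes(sorted_keys, target):
    """Return (found, probes), where probes counts the elements inspected."""
    lo, hi = 0, len(sorted_keys) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return True, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, probes

keys = list(range(1_000_000))
found, probes = binary_search_probes(keys, 999_999)

# Upper bound on probes for a binary search over n sorted keys.
max_probes = math.floor(math.log2(len(keys))) + 1

# Hash-based lookup: one hash computation and (on average) one bucket visit,
# regardless of how many keys are stored.
cache = {k: True for k in keys}
hit = cache.get(999_999, False)
```

Per lookup the gap is small, which is why it only starts to matter once the number of lookups per second gets large.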

Memcached also allows you to scale effortlessly simply by adding servers.
Databases can do replication, but only to a point. Remember, even in multi-
master replication schemes, every write must be done on every server.
Memcached shards your data by key hash, so each write is done on only one
server, the load is distributed, and the client knows where to read. And every
time you're able to read from memcached, you're lowering the load on the piece
that is less easily scaled - your database. So, even non-cached loads should
get better simply because some of the lookups that would have gone to the
database are now hitting memcached.
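
A minimal sketch of the hash-based sharding idea, assuming a simple modulo scheme. The server names and the `pick_server` helper are hypothetical, and real memcached clients often use consistent hashing instead, so that adding a server does not remap most keys.

```python
import hashlib

def pick_server(key, servers):
    """Map a key to exactly one server by hashing the key (modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Hypothetical cache hosts, for illustration only.
servers = ["cache1:11211", "cache2:11211", "cache3:11211"]

# The same key always maps to the same server, so a read knows where
# the earlier write went without asking any other server.
write_target = pick_server("user:42:profile", servers)
read_target = pick_server("user:42:profile", servers)

# Count how many of a batch of keys land on each server.
counts = {s: 0 for s in servers}
for i in range(3000):
    counts[pick_server(f"user:{i}:profile", servers)] += 1
```

Each key is written to one server only, which is the contrast with replication, where every write has to reach every server.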

~~~
palish
_Memcached allows for O(1) lookups. Your database server probably uses b-tree
indexes which are great, but they're log(n) lookups._

A B-tree lookup also requires reading from the hard drive, which can require a
_lot_ more time than reading from memory.

~~~
mdasen
Why? There's no reason that a b-tree must be stored on a hard drive. In fact,
there's no reason why MySQL can't exist entirely in memory for read purposes
(clearly you want writes written to the disk). I'm not going to comment on
configuring specific databases, but there's nothing about b-tree data
structures that means you couldn't create an entirely in-memory data store
that used b-tree indexes.

The difference between disk and memory storage is the difference between
MemcacheDB and memcached. RDBMSs love to leave stuff in memory if they can,
and that lets reads skip the hard drive.

~~~
palish
You are technically correct. My words were imprecise -- indexing into a B-tree
does not _require_ accessing the hard drive. But it is a rather large
_possibility_, and it's something to be aware of. From Wikipedia
(<http://en.wikipedia.org/wiki/B-tree>):

 _"B-trees have substantial advantages over alternative implementations when
node access times far exceed access times within nodes. This usually occurs
when most nodes are in secondary storage such as hard drives. By maximizing
the number of child nodes within each internal node, the height of the tree
decreases, balancing occurs less often, and efficiency increases. Usually this
value is set such that each node takes up a full disk block or an analogous
size in secondary storage. While 2-3 B-trees might be useful in main memory,
and are certainly easier to explain, if the node sizes are tuned to the size
of a disk block, the result might be a 257-513 B-tree (where the sizes are
related to larger powers of 2)."_

So yes, "there's no reason that a b-tree must be stored on a hard drive".
There's also no reason why a car couldn't also have a built-in toaster. :) As
far as in-memory data structures go, there are better choices than B-trees.
Why use a suboptimal data structure?
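
To make the quoted point about node size concrete, here is a rough sketch of how fanout drives tree height. It assumes every node is completely full, so it is only a lower bound on the height of a real tree, and the helper name `levels_needed` and the one-billion-key figure are made up for illustration.

```python
def levels_needed(num_keys, fanout):
    """Smallest number of levels such that fanout**levels >= num_keys."""
    levels, capacity = 0, 1
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels

num_keys = 10**9  # illustrative size, not taken from the thread

# Up to 3 children per node, as in a 2-3 B-tree.
small_fanout_levels = levels_needed(num_keys, 3)

# Up to 513 children per node, as in the 257-513 B-tree from the quote.
large_fanout_levels = levels_needed(num_keys, 513)
```

Under these assumptions the wide tree needs 4 levels where the narrow one needs 19, and each level skipped is a potential disk read avoided. That is the case the structure is tuned for.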

------
soundsop
_People are not going to feel the difference between a page rendered in 50ms
and one rendered in 100ms._

I don't think that every 50ms difference is unimportant, but the difference
between a 50ms render time and 100ms render time may be unimportant. Anyone
have a feel for what the maximum render time is for a page to feel fast?

~~~
river_styx
One thing they're missing here is that optimization is also about optimizing
resources, not just the speed of an individual request/response cycle. If each
request takes 50% less time, the same hardware can handle roughly twice as
many requests. So even though individual users don't feel a difference, the
app as a whole is much more efficient.
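
The arithmetic behind that, as a back-of-the-envelope sketch. It uses the 100ms and 50ms render times from the quote above and assumes a hypothetical pool of 10 workers that each handle one request at a time and spend all of their time rendering.

```python
def requests_per_second(workers, ms_per_request):
    """Upper bound on throughput when each worker serves one request at a time."""
    return workers * 1000 / ms_per_request

workers = 10  # hypothetical pool size

before = requests_per_second(workers, 100)  # 100ms per page
after = requests_per_second(workers, 50)    # 50ms per page

speedup = after / before
```

Halving the time per request doubles the ceiling on throughput for the same number of workers, even if no single user notices the 50ms.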

