

Database Sharding - Alternate strategy to overcome rank calculation - frosty
http://itsfrosty.wordpress.com/2009/03/20/database-sharding-basics/

======
nkurz
With due respect for a well-written article, my first impression is that the
suggested approach misses the point. If a single machine will be able to
handle the load of the combined queries, then yes, you should replicate
instead of sharding. But in that case, why shard at all? And what about the cases
that actually require sharding, where your rate of queries and rate of updates
don't allow you to run it all on a single machine?

It's possible my view is skewed. I've been looking at this problem from the
point of view of distributed full text search, where I don't see any
possibility of centralization in the manner you suggest. Still, I find the
solution of trying to handle this in the database API to be suspect. If you
know that some information will never need to be joined, why not have two
databases instead of splitting the tables?

