

Ask YC:  When is a relational DB the wrong tool? - jraines

The thread about Amazon's SimpleDB got me thinking about this, as did a recent night spent wading through some material on XML, XSLT, and XML Schema.<p>Since I've gotten into web programming, I've been  stuck (happily) in a LAMP and RoR mindset:  model data, normalize, create schema, lean almost entirely on SQL statements wrapped in whatever scripting language I'm using.<p>When have you forgone the relational db in your web apps with good results?
======
jackdied
SQL is good for doing arbitrary queries. Most web sites don't need to do that!

Look at duels.com (a simple player-vs-player game). On each page it needs to
know your stats/equipment and that of the person you are dueling. In an SQL
database you might store the person/stats in one table, the inventory in
another, and the fight record in yet another. In SimpleDB the stats and
inventory could be in one domain with fight record in another.

SimpleDB or even dumb file-per-player based systems don't pay the overhead for
allowing arbitrary queries so they scale better. Your SQL store will be slowed
down by the weight of inactive players and their inventories and all the
indexes and data they have lying around. Using in-memory caching (like
memcached) is a popular way to pay the SQL cost less often while keeping the
arbitrary-query ability on the back end.

And of course don't forget that SimpleDB is hosted by a farm of machines
tweaked and maintained by someone else. Your MySQL setup will never be that
cool.

Edit: and as for XML,if you add that to SQL, well now you have two problems.

~~~
pcowans
I'd caution against using the filesystem unless you think things out very
carefully. Typically you'll have a binary tree for the index in an RDBMS which
gives you log(N) scaling on lookups for joins etc. If you use the filesystem
your lookup time will probably be linear. You also pay a penalty in space for
the fact that files have to be a whole number of inodes, so having many small
files is a really inefficient use of disk space.

In essence, RDBMS are highly optimised for the sort of things people do with
databases, but you need to know what you're doing with them.

~~~
jackdied
Definitely, you don't want to scan the filesystem to do searches. But finding
user 1234's profile in profiles/1234.data is easy. In my experience this works
well (even with the wasted space) though operations like fsck start to get
painful when you have 1 million+ files.

An anecdote on bad linear search: in the 90s I worked on a system in C++ that
made heavy use of inheritance and other C-plus-plus-ery. There was a bunch of
file opener classes each that found files in a different way (regexp, straight
path, glob). They differed only in a method named match. The base class called
match on every file name in the directory until it returned true. That was
fine for regexps but even if you opened a file _by name_ it took N/2 syscalls
to readdir to find it! This worked OK until someone created a directory with
10k entries on a production box. Ouch.

------
davidw
Mnesia looks like something that might be interesting. I have never used it,
though, I'd be curious to hear more from those who have.

