

The SQL Trap - petewarden
http://petewarden.typepad.com/searchbrowser/2009/06/the-sql-trap.html

======
DenisM
On the flip side if you know your SQL you'll be running circles around the
competitors who are manually coding up all those joins, sorts and enforcing
the data integrity.

It's kind of like lisp in that regard...

~~~
skwaddar
man 1 join see also [http://www.strozzi.it/cgi-
bin/CSA/tw7/I/en_US/nosql/Home%20P...](http://www.strozzi.it/cgi-
bin/CSA/tw7/I/en_US/nosql/Home%20Page)

------
kragen
Was a nice surprise to open the page and see my wife's photo headlining it!

I agree that there are some things SQL databases aren't suitable for, but it's
very often the case that something that's too slow in SQL stops being too slow
once you tweak the right thing. When that's possible (as I said, that's very
often but not always), actually doing the tweak is much simpler than
reimplementing the subset of a DBMS that you need, but learning enough about
the DBMS to figure out what to tweak may not be.

Here are some things I haven't found a good SQL solution to:

* Inverted indices. Fortunately, this is a big enough use that a lot of SQL databases these days have a full-text search extension of some kind slapped on the side.

* Graph data. This is kind of the same problem; the issue is that you can store your RDF or whatever in a table with _subject_ , _verb_ , and _object_ columns, but not only are N-way self-joins an awkward way to query it, the SQL optimizer isn't smart enough to figure out that some verbs are orders of magnitude more common than others, and it should plan accordingly. (Not in PostgreSQL or SQLite, anyway, and I don't think in MySQL either.)

* Data of many terabytes. Use Hadoop.

------
dxjones
I wonder if the author could share something about his startup/web/app that is
so ill-suited to MySQL and so much more efficient when implemented using a
simple key-value store, plus algorithms in PHP?

I'm just wondering whether this observation has more to do with the nature of
your "computational problem", or maybe just a difference in skill level with
MySQL queries vs PHP algorithms.

~~~
petewarden
Probably a bit of both - I'm no SQL wizard, and I'm indexing 300 million+
Twitter messages and storing the @reply relationships (or more generally
analyzing massive communications stores, eg an organization's email server).

I'll try to put together some reproducible code, but for example I had a
relationship table that once it grew into millions of entries, took noticeable
wall-time to simply access via a primary key. Switching to InnoDB after
spotting a note in the memcache FAQ that it supports much faster primary key
fetches fixed that, but before that I was looking into the black box with no
good leads on fixing it.

~~~
falien
Sounds more like you needed to read a book about mysql + database design. Its
good to come up with workable solutions using what you know. Its better to
figure out what you need to know for a great solution and go learn that.

That's my philosophy anyway. And I would imagine reading just enough to know
whether you really should start from scratch again, or read much more to grok
the required intricacies, would be worth it.

------
lsb
This is a false binary. It's not just using SQL vs using a Key/Value store.
Caching/denormalization speeds up almost any algorithm; store a query result
into memcached to make it even faster.

Scale up and get the biggest box with the most RAM you can find, and if you
can afford your entire dataset in 15 GB of RAM for $300/mo in a 3 year
contract, congrats, use EC2 Extra Large and you're golden.

------
kevbin
The PHP trap is worse than any SQL trap.

------
javert
Newer hackers should know that parsing text files is a good alternative to
using a database for many web apps, esp. in early stages.

~~~
patio11
Coming from a Java & Rails background, with my SQL being non-existent as of 2
years ago and only fair to middling today, I'd have to disagree with that for
anything but trivial one-day apps. (And with SQLite around as an option I'd be
reluctant even then.)

You can use Rails for simple apps without learning SQL or worrying about your
database's internal black magic. It is about as simple as @user =
User.find(id) -- you don't have to worry about what poor animal had to have
its entrails removed to get that record.

You also didn't have to:

1) implement a parser (which is going to break the first time you add an
I-can't-believe-its-not-a-column)

2) write a migration framework (so that you can port your data files when you
break your parser)

3) deal with simultaneous access (which, lets be honest, you're going to
either ignore or whip up an implementation of locks which is exactly as buggy
as every other programmer's implementation of locks which is why locks are
right up there with crypto in the pantheon of Here Be Dragons, Use The
Library)

4) build your own structure for separating your I/O from your application
logic (which will asymptotically approach ActiveRecord, except with less
features and more suck)

5) do a rewrite and/or significant rearchitecturing when you eventually are
forced to move up to a database

~~~
lsb
SQLite is targeted, actually, at specifically those people who would just use
fopen() to store data: <http://www.sqlite.org/whentouse.html>

