

Can Amazon's SimpleDB substitute for a MySQL database? - limeade

I'm developing a web app with only a couple of people who have small amounts of spare time, so it would be great to "outsource" DB management to SimpleDB. I have heard SimpleDB can be slower than MySQL, but it seems pretty fast so far without a large table. Does anyone have experience with it?
======
mdasen
Short Answer: No.

Long Answer:

SimpleDB is built on what every application developer wants to hear: put your
data here and it will just be available. No worries, no cares. For
(relatively) small amounts of data, SimpleDB will definitely cut it. However,
SimpleDB also limits how you can use that data. No nice normalized schema and
joins for you. That makes development time much longer.

Of course, if you only have a small amount of data, it's pretty easy to manage
MySQL or PostgreSQL. Database administration becomes hard when you have lots
of data. But this is where SimpleDB just falls flat. In a single domain
(roughly equivalent to a database), you can only have 250,000,000 attributes
(cells). That sounds like a huge number at first, but if you have a table with
10 columns, you're down to 25M rows. Still a lot, but not exactly no-worries.
Considering that you'll probably want multiple tables with maybe 100 or more
columns, you can see that it just doesn't support large web applications.
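
To put numbers on that, here is a quick sketch of the arithmetic. The cap is
the figure quoted above, not something checked against current AWS
documentation, and the function name is made up for illustration:

```python
# Per-domain attribute cap as quoted in this comment (an assumption here,
# not verified against current AWS limits).
MAX_ATTRIBUTES_PER_DOMAIN = 250_000_000

def max_rows(columns_per_row: int) -> int:
    """Upper bound on rows in one domain if every row fills every column."""
    return MAX_ATTRIBUTES_PER_DOMAIN // columns_per_row

for columns in (10, 100):
    print(f"{columns} columns -> {max_rows(columns):,} rows")
```

At 100 columns the ceiling drops to 2.5M rows per domain.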

Amazon has set up an "infinitely scalable" database that has hard limits that
aren't particularly high. Open source relational databases are easy and
reliable when you aren't dealing with the scale of a Digg or Flickr. The only
reason to go with something less useful (no joins, limited querying) would be
to scale better. Amazon's hard limits on SimpleDB mean that it just doesn't
scale. So, why use something less easy when the benefit isn't there?

~~~
limeade
Thanks for your advice and everybody else's too... I still feel like I may at
least give SimpleDB a try by making a new domain for each table and
distributing the large tables over multiple domains (my data is amenable to
splitting up). I may make two copies of some domains that are split up in
different ways so I don't have to query all of the domains associated with a
table.
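
Roughly what I have in mind for the splitting, as a sketch only. The domain
naming scheme, the function name, and the shard count are all my own
assumptions, and nothing here calls the SimpleDB API itself:

```python
import hashlib

def domain_for(table: str, item_key: str, num_domains: int) -> str:
    """Pick the domain that holds item_key, the same way on every call."""
    # Python's built-in hash() is salted per process, so use a stable digest.
    digest = hashlib.sha256(item_key.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % num_domains
    return f"{table}_{shard:02d}"

# Two copies of the same data, split in different ways, so that either
# kind of lookup touches exactly one domain.
by_author = domain_for("comments_by_author", "user42", 4)
by_thread = domain_for("comments_by_thread", "thread7", 4)
print(by_author, by_thread)
```

The cost is that every write has to go to both copies, and changing the
number of domains later means moving data around.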

Are JOINs used that often? My thought was that I could export the data to a
MySQL database for analysis if I needed to do any very demanding crunching.

~~~
mdasen
JOINs are awesome. They're expensive operations, no doubt, but they make
creating a web application a TON easier. If you're going to limit yourself to
the way that Google or Amazon has to develop web applications, you're getting
rid of your primary advantage.

Let's say you have a site with logins and comments. In a relational database,
each comment points to its author's login with a nice foreign key. With
something like SimpleDB, you have to copy part of that person's info into each
comment (if you want to display the name next to the comment, you store the
name there).
So, then someone changes their name in the logins table - it happens. With the
foreign key and a join, all the comments appropriately show the correct name.
With SimpleDB, you have to go through and update every single comment that
person has made to reflect the changes.
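
A minimal sketch of the difference, using an in-memory SQLite database as a
stand-in for both approaches. The table and column names are invented for
the example, and the author_name column plays the role of the copied name
you would keep in a SimpleDB-style store:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logins (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        login_id INTEGER REFERENCES logins(id),
        body TEXT,
        author_name TEXT  -- denormalized copy of logins.name
    );
    INSERT INTO logins VALUES (1, 'alice');
    INSERT INTO comments VALUES (1, 1, 'first', 'alice');
    INSERT INTO comments VALUES (2, 1, 'second', 'alice');
""")

# The user changes their name: a single row in logins.
conn.execute("UPDATE logins SET name = 'alicia' WHERE id = 1")

# With the foreign key and a join, every comment picks up the new name.
joined = conn.execute(
    "SELECT l.name FROM comments c JOIN logins l ON l.id = c.login_id "
    "ORDER BY c.id"
).fetchall()

# The copied name is stale until every comment is rewritten.
copied = conn.execute(
    "SELECT author_name FROM comments ORDER BY id"
).fetchall()

# The non-relational fix: touch every comment that person has made.
conn.execute("UPDATE comments SET author_name = 'alicia' WHERE login_id = 1")
fixed = conn.execute(
    "SELECT author_name FROM comments ORDER BY id"
).fetchall()

print(joined, copied, fixed)
```

Here the fix-up is one UPDATE statement because SQLite is doing the work. In
a store without joins, that same fix-up is a loop over every affected item.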

Take something like Facebook. Every single mail message, every single wall
post, every single friend, every single Event, every single group, etc. would
have to be updated to reflect your new name if it wasn't referential. Now,
Facebook probably isn't referential. They're big. They probably spend a ton of
time/money keeping that stuff in sync. In fact, they probably run those types
of updates as low-priority background jobs (so while your new name shows up in
your profile now, it is a while before it gets propagated). That's all
guessing, btw, but you can see how much more difficult it is to keep non-
relational data in sync.

Places like Facebook have to operate differently because JOINs do have cost.
It's amazingly unlikely that a relational database won't suit your site due to
scalability. If you're looking to make a cool site, do it the easy way first,
then scale. Otherwise, someone else will build it faster than you.

Non-relational data seems easy and it is for simple things. It might be that
your site fits very nicely in a non-relational model. Do be aware of the
differences.

------
wehriam
There are fundamental differences that are more significant than speed or
administration.

SimpleDB is "eventually consistent" - that is, when you write, a subsequent
read might not immediately reflect the change. There's also no such thing as a
join, although you can approximate it.
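
One way to approximate a join is to do it in application code after two
separate queries. The sketch below uses plain dicts and lists as stand-ins
for the query results; the field names are hypothetical, and no SimpleDB
client calls are shown:

```python
# Stand-ins for the results of two separate queries.
logins = {"u1": {"name": "alice"}, "u2": {"name": "bob"}}
comments = [
    {"id": "c1", "login_id": "u1", "body": "first"},
    {"id": "c2", "login_id": "u2", "body": "second"},
    {"id": "c3", "login_id": "u9", "body": "orphan"},
]

def join_comments(comments, logins):
    """Attach the author's name to each comment.

    The name is None when the login is missing, which can happen if the
    login was written recently and is not yet visible to reads.
    """
    return [
        {**c, "name": logins.get(c["login_id"], {}).get("name")}
        for c in comments
    ]

for row in join_comments(comments, logins):
    print(row["id"], row["name"], row["body"])
```

The missing-login case matters more than it would in a relational database,
since there is no foreign key constraint and a read may lag behind a write.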

I think of SimpleDB as searchable metadata storage. It's useful, but not a
drop-in replacement for a traditional database.

------
qhoxie
I have heard reports of inconsistent slowdowns, but also quite a few
successes. I hear Amazon is steadily improving it, though.

I would look at it this way: if you do not have time to manage a database
yourself, then either the DB or other parts of the project will suffer, and
SimpleDB will likely improve on that.

------
msie
I have a hunch that you may be worried about scaling problems since you are
looking at SimpleDB instead of MySQL. For an app in its infancy you shouldn't
be too worried about scaling problems. Just use a MySQL database for now.
Thousands of webapps can't be wrong!

------
briansmith
Getting SimpleDB working well isn't going to be less work than getting MySQL
working well. SQLite might be a better alternative.

