

Ask YC: Scaling and alternatives to simpledb - utnick

I want to store hundreds of millions of records and run some queries and calculations on those records. There would be millions of batches of calculations a day.

They are pretty simple records; it would just be one table in a database (so I don't think I need a full relational database).

I am doing some research on what it would take to build and scale this out. My initial idea was to use EC2 for the calculations and SimpleDB for the datastore. However, we might go over the SimpleDB size limit, and SimpleDB is still only in beta. Is there an alternative product or approach to SimpleDB without those limitations (maybe something that runs over S3)?

Also, if I have a lot of money, would it be better to do this on my own with a big Oracle/DB2/SQL Server database server, VMware, and more servers, or is Amazon the way to go for this?

I am totally new at scaling, so any links to good articles or sites would be helpful!

Thanks!
======
keefe
How confident are you that your model for this is correct? Hundreds of
millions of records is a lot for one table... in some cases, it may be
beneficial to partition such a table even in a normal relational database.

That being said, if this is an application with complex calculations and a
very simple data structure, you should note that the biggest issue is creating
indexes to quickly find the records that must participate in a particular
calculation. Relational databases are typically very good at building indexes,
and I would personally be inclined just to use MySQL with data partitioned
across multiple tables (perhaps with the same schema), backed up to S3, and
with calculations run in EC2. You could also create a custom data structure
for this if you are into low-level pain.
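
To make the multiple-tables idea concrete, here is a rough sketch, not a
recommendation for a specific setup. It uses Python's built-in sqlite3 as a
stand-in for MySQL so it runs anywhere. The partition count, the table names
(records_0, records_1, ...), the columns (key, grp, value), and the choice of
a CRC32 hash are all assumptions made up for illustration.

```python
import sqlite3
import zlib

NUM_PARTITIONS = 4  # hypothetical; choose based on data volume


def partition_for(key: str) -> int:
    """Map a record key to a partition number using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def create_partitions(conn: sqlite3.Connection) -> None:
    """Create identical tables, each with an index on the lookup column."""
    for i in range(NUM_PARTITIONS):
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS records_{i} ("
            "key TEXT PRIMARY KEY, grp TEXT NOT NULL, value REAL NOT NULL)"
        )
        # The index is what lets a calculation find its records quickly.
        conn.execute(
            f"CREATE INDEX IF NOT EXISTS idx_records_{i}_grp "
            f"ON records_{i} (grp)"
        )


def insert_record(conn: sqlite3.Connection, key: str, grp: str, value: float) -> None:
    """Route the record to the table chosen by its key's hash."""
    table = f"records_{partition_for(key)}"
    conn.execute(
        f"INSERT INTO {table} (key, grp, value) VALUES (?, ?, ?)",
        (key, grp, value),
    )


def sum_for_group(conn: sqlite3.Connection, grp: str) -> float:
    """Run the same indexed query on every partition and combine results."""
    total = 0.0
    for i in range(NUM_PARTITIONS):
        row = conn.execute(
            f"SELECT COALESCE(SUM(value), 0) FROM records_{i} WHERE grp = ?",
            (grp,),
        ).fetchone()
        total += row[0]
    return total
```

Each per-partition query is independent, so in principle the loop in
sum_for_group could be spread across several worker machines instead of
running in one process.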

What is your application?

------
wmf
Scaling is not about the number of records or queries, it's about the _growth
rate_ of data and queries.

Google App Engine is the obvious competition to SimpleDB.

Here are some case studies: <http://highscalability.com/links/weblink/24>

