

Deploying on EC2 + S3 - palish

Hello all,

I have a little side project I'm getting ready to launch. It's a Rails app using MySQL (though database choice doesn't matter, since it's Rails). I'm considering launching the app purely on S3 + EC2. The idea is, the code will live in S3 by using the S3 filesystem component. I can develop locally and deploy via Capistrano when I push out a new release. The code is then deployed to S3 and the EC2 instance restarts.

I'm wondering if this is even viable. Has anyone else that you know of done it?

For the database, I have two options. I could either leave the database on the EC2 instance and back it up nightly and never restart the instance (since data only disappears if you restart the instance or due to a hardware failure). The advantage there is database queries are most likely much faster, since it doesn't have to hit the network for each new database query. The disadvantage is, the data may disappear at the drop of a hat, and then it's backup-restore time. The other option is keeping the database in S3, which seems totally secure but might be slow for queries. I'll, of course, be using memcached and other caching mechanisms, but database load seems like it could still play a part.

What d'you think, would you launch on EC2 + S3?
======
nailer
This is possible using the S3 storage engine for MySQL, which is currently in
beta. Your database lives on S3, your hosts live in EC2 Xen VMs.

As I understand it, if your VM is powered off, you lose cached data; your
database will then start re-pulling from the storage engine.

You can fetch the storage module from here:
<http://fallenpegasus.com/code/mysql-awss3/>

Relevant presentation:
<http://fallenpegasus.com/code/mysql-awss3/presentations/>

Computerworld article:
[http://www.computerworld.com/action/article.do?command=viewA...](http://www.computerworld.com/action/article.do?command=viewA..).

Have I done it? No. It's something I'm investigating for a startup which may
have a need for elastic scalability.

My main concern: you'd need consistent speed between S3 and EC2. Do any
current users know what this is like?

~~~
cpinto
On the Amazon forums, people wrote about 10MB/s throughput between S3 and EC2.

~~~
ntoshev
...and 200ms lag.
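
To see what those two numbers mean together, here is a rough back-of-the-envelope sketch. It assumes the 10MB/s and 200ms figures quoted above hold for every request, treats a megabyte as 1,000,000 bytes, and uses a hypothetical helper name (`fetch_time`); real S3 behaviour will vary.

```python
# Figures quoted in this thread; treat them as rough forum numbers, not guarantees.
THROUGHPUT_BYTES_PER_S = 10 * 1000 * 1000  # 10 MB/s (decimal megabytes assumed)
LATENCY_S = 0.2                            # 200 ms per request


def fetch_time(size_bytes, requests=1):
    """Estimate seconds to pull size_bytes from S3 using `requests` round trips.

    Simple model: each request pays the fixed latency once, and the payload
    moves at the quoted throughput.
    """
    return requests * LATENCY_S + size_bytes / THROUGHPUT_BYTES_PER_S


if __name__ == "__main__":
    # One large object versus one small database-page-sized read.
    print(f"10 MB object: {fetch_time(10 * 1000 * 1000):.3f} s")
    print(f"16 KB read:   {fetch_time(16 * 1024):.3f} s")
```

Under this model a small read costs almost exactly the latency, so per-query access to S3 is bounded by the 200ms lag rather than by bandwidth. That is why aggressive caching matters more than raw throughput here.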

~~~
steve
...still sounds great for my app, actually.

My crazy caching design shows its benefits after all!

------
staunch
Everyone rent a dedicated server on ServerBeach and use my referral code so I
get more free service (and you get $100 credit): 7XYHDMBU8A

[http://www.serverbeach.com/catalog/cust_ref_landing_new.php?...](http://www.serverbeach.com/catalog/cust_ref_landing_new.php?REF=7XYHDMBU8A)

Seriously though, just because EC2/S3 are useful things doesn't mean you
should use them for everything. Dedicated servers are unbelievably cheap and
powerful now. They're also way more straightforward to work with. Scale up
for a while, then scale out if you're hugely successful.

------
dood
In case you missed it, check the comments in this link
[[http://www.25hoursaday.com/weblog/CommentView.aspx?guid=f8e6...](http://www.25hoursaday.com/weblog/CommentView.aspx?guid=f8e678cc-045a-4316-917d-21b6754741ba#commentstart)],
discussed here [<http://news.ycombinator.com/item?id=33093>].

The suggestion is to constantly write db logs from EC2 to S3 instead of
nightly backup, so that in the event of a crash you're only likely to lose
minutes' worth of data.

Also note that you can restart instances with no loss of data. It's only in
failures that you would lose data.
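
A minimal sketch of the log-shipping idea, assuming MySQL binary logging is enabled with the base name `mysql-bin` (so files look like `mysql-bin.000001`). The function names and the `upload` callable are hypothetical; `upload` stands in for whatever S3 client you actually use, and this is not taken from the linked discussion.

```python
import os
import re

# Assumes log-bin=mysql-bin; adjust the pattern if your base name differs.
BINLOG_PATTERN = re.compile(r"^mysql-bin\.\d+$")


def closed_binlogs(log_dir):
    """Return binary log file names in numeric order, excluding the newest.

    The newest file is assumed to be the one MySQL is still writing to,
    so it is left alone until the server rotates to a new log.
    """
    names = [n for n in os.listdir(log_dir) if BINLOG_PATTERN.match(n)]
    names.sort(key=lambda n: int(n.rsplit(".", 1)[1]))
    return names[:-1]


def ship_new_logs(log_dir, already_shipped, upload):
    """Upload every closed log not yet shipped; return the names sent now.

    `upload(local_path, key)` is a caller-supplied function (hypothetical
    here) that copies one file to S3. `already_shipped` is a set that is
    updated in place so repeated runs skip earlier files.
    """
    shipped_now = []
    for name in closed_binlogs(log_dir):
        if name in already_shipped:
            continue
        upload(os.path.join(log_dir, name), name)
        already_shipped.add(name)
        shipped_now.append(name)
    return shipped_now
```

Run from cron every few minutes, together with a periodic log rotation, something like this would cap the loss window at roughly the rotation interval. The `already_shipped` set would need to be persisted somewhere (or rebuilt by listing the bucket) to survive a restart of the script.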

~~~
palish
Thanks, great resources. Do you, or does anyone, know of someone who has
deployed a full production website entirely on S3 + EC2?

~~~
wensing
We have.

<http://www.stormpulse.com>

Python (Pylons framework) + nginx + Ubuntu + S3 + EC2

~~~
palish
Which database do you use? If any.

~~~
wensing
MySQL.

------
dhouston
ec2 and mysql (or memcached for that matter) don't yet play well together. we
were going to deploy a big database across several EC2 instances but the lack
of static IPs and internal routing quirks make configuration and automated
failover extremely complex. sadly, the ability to deploy lots of instances !=
the ability to scale effectively, at least for mysql/memcached.

~~~
palish
Does S3DFS solve that for you though? What I got from S3DFS's description was
that you fire up your instances and it mounts a shared filesystem. Then you
could start MySQL from that, in theory.

------
nickb
Just buy a real VPS server somewhere and avoid all the issues with data
persistence and EC2. Use EC2 for heavy processing and use S3 for
backup/storage/static file server.

Going down the path you described will be full of pain.

~~~
cpinto
agreed but for different reasons. in terms of bang for the buck EC2 is awful
if you want it up 24/7. And then you get other problems with EC2 too, the
biggest one being its non-persistent storage (but that's something everyone
already knows and is happy to ignore).

Yes, it's possible to design an application to either not use a database at
all, by storing _all_ your data on S3, or to implement some workarounds that
deal with DB master-slave replication, store commit logs on S3, etc etc etc.
Question is: is it worth it? In my opinion, no.

S3 is great if you want to store images, movies, etc and aren't very popular
worldwide. EC2 is great if you need extra processing power to deal with extra
traffic coming from those Digg/Slashdot stories, or doing batch processing
like converting movie formats, but going 24/7 on simple EC2 or EC2+S3 (using
S3 as a database and not just for storing media files) is more of a hassle
than a "good enough" solution. VPS is definitely the way to go if you cannot
afford to lease a couple of servers or some rackspace for the cheap 1u boxes
you can buy at ebay.

~~~
tcwc
Even if used 24/7, is EC2 that bad value-wise? Assuming it costs about
$70/month for one instance, it is a lot cheaper than most dedicated servers.
EC2 is a lot more powerful than the average VPS, so that is not really a fair
comparison. I would say it is pretty good value if you need (almost) 2 GB of
RAM and a fast processor.
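
For anyone who wants to redo the sum, here is a small sketch of the arithmetic. The hourly rate is an assumption chosen so that 24/7 use lands near the ~$70/month figure; it covers instance hours only (no bandwidth or S3 storage), and `monthly_cost_cents` is a hypothetical helper name.

```python
# Assumed hourly rate in US cents, picked to roughly match the ~$70/month
# figure in this comment. Check current pricing before relying on it.
ASSUMED_RATE_CENTS_PER_HOUR = 10


def monthly_cost_cents(rate_cents_per_hour, hours_per_day, days=30):
    """Cost of one instance for a month, in cents (integer arithmetic)."""
    return rate_cents_per_hour * hours_per_day * days


if __name__ == "__main__":
    always_on = monthly_cost_cents(ASSUMED_RATE_CENTS_PER_HOUR, 24)
    burst_only = monthly_cost_cents(ASSUMED_RATE_CENTS_PER_HOUR, 4)
    print(f"24/7:    ${always_on / 100:.2f}/month")
    print(f"4 h/day: ${burst_only / 100:.2f}/month")
```

The same function shows why the burst case looks so different: paying only for the hours you run is where per-hour billing pays off, while an always-on instance has to be compared against a flat monthly server price.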

If Amazon sorts out the other issues EC2 could be really great.

~~~
cpinto
No, not really. Possibly this depends on where you are, but here in Portugal
(where I am), I can get a better physical server for about 80Eur/month, which
is only a bit more than what I would pay for an EC2 instance (including
network traffic). Is it worth the extra dollars? I think so.

Also, even though S3 doubles as a backup medium (and you _must_ back up your
database), if you want a database running on EC2 you _must_ go for S3 to store
your information. That too will increase your costs.

And on top of all that, you have to spend some time tweaking and testing your
configuration (and testing again). If you're short on working hands, the time
you spend on this activity may not be worth it.

I think that, unless you're doing it just for fun, you should let someone else
help Amazon sort out the kinks of their system.

In the end, having your entire system on EC2 will not give you an edge over
your competition, but the time you save by not doing it probably will.

~~~
tcwc
I was just making a comparison of pricing. Not everyone needs a database, and
not everyone will be too bothered by EC2's other drawbacks. For my app the
persistent storage issue doesn't matter, and EC2 is good value for a number of
other reasons. I agree that for the OP's needs EC2 probably isn't worth
bothering with.

------
benhoyt
I can't speak about EC2, because we haven't used that. (But from what I've
heard, it's not quite right for a web hosting service. We rate hosting your
app yourself.)

But ... we've found S3 to be _very_ good for hosting big stuff, and static
files. Very simple. And use the S3Fox Firefox plugin for testing and getting
statics on there in the debugging phase.

~~~
palish
Oh, definitely. I'm using Rails' attachment_fu plugin to store all my images
using S3. It's drop-dead simple.

------
alex_c
As someone who's not very familiar with EC2 and S3... what's the advantage of
going this route over the traditional app server(s)/web server(s)/database(s)
model?

~~~
gregwebs
S3 is storage; it is mostly used to save money (and some people have saved a
lot of money). EC2 is for computing and is attractive because of its easy
scalability, but people are still trying to figure out how to best use it with
webapps.

------
jey
We're planning on renting a dedicated server for our web and database servers,
and using EC2 for red5 instances. The red5 machines don't need to store any
persistent data, and the number needed will vary according to demand, making
it a good fit for EC2.

