Deploying on EC2 + S3
17 points by palish on July 30, 2007 | 27 comments
Hello all,

I have a little side project I'm getting ready to launch. It's a Rails app using MySQL (though database choice doesn't matter, since it's Rails). I'm considering launching the app purely on S3 + EC2. The idea is, the code will live in S3 by using the S3 filesystem component. I can develop locally and deploy via Capistrano when I push out a new release. The code is then deployed to S3 and the EC2 instance restarts.

I'm wondering if this is even viable. Has anyone else that you know of done it?

For the database, I have two options. One is to leave the database on the EC2 instance, back it up nightly, and never restart the instance (since data only disappears if you restart the instance or the hardware fails). The advantage there is that database queries are most likely much faster, since they don't have to hit the network each time. The disadvantage is that the data may disappear at the drop of a hat, and then it's backup-restore time. The other option is keeping the database in S3, which seems totally secure but might be slow for queries. I'll, of course, be using memcached and other caching mechanisms, but database load seems like it could still play a part.

What d'you think, would you launch on EC2 + S3?



This is possible using the S3 storage engine for MySQL, which is currently in beta. Your database lives on S3, your hosts live in EC2 Xen VMs.

As I understand it, if your VM is powered off, you lose cached data, and your database will then start re-pulling from the storage engine.

You can fetch the storage module from here: http://fallenpegasus.com/code/mysql-awss3/

Relevant presentation: http://fallenpegasus.com/code/mysql-awss3/presentations/

Computerworld article: http://www.computerworld.com/action/article.do?command=viewA....

Have I done it? No. It's something I'm investigating for a startup which may have a need for elastic scalability.

My main concern: you'd need a consistent level of speed between S3 and EC2. Any current users know what this is like?


On the Amazon forums, people wrote about 10MB/s throughput between S3 and EC2.


...and 200ms lag.
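
As a rough back-of-the-envelope check on those two numbers, fetch time can be modeled as a fixed latency plus size divided by throughput. The Python sketch below assumes the 200 ms and 10 MB/s figures quoted above hold for every request (real traffic will vary) and treats 1 MB as 10^6 bytes; the function name is made up for illustration.

```python
def estimated_fetch_seconds(size_bytes, latency_s=0.2, throughput_bytes_per_s=10_000_000):
    """Rough time to pull one object from S3 to EC2.

    Model: fixed per-request latency plus transfer time at a constant rate.
    The defaults are the forum-reported figures, not measured values.
    """
    return latency_s + size_bytes / throughput_bytes_per_s


if __name__ == "__main__":
    # Hypothetical object sizes, chosen only to show the two regimes.
    for label, size in [("16 KB page", 16 * 1024),
                        ("1 MB blob", 1_000_000),
                        ("100 MB dump", 100_000_000)]:
        print(f"{label}: {estimated_fetch_seconds(size):.3f} s")
```

Under this model small reads are dominated by the latency term, so a query that touches many small objects pays roughly 200 ms each time, while a large sequential transfer is dominated by throughput.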


...still sounds great for my app, actually.

My crazy caching design shows its benefits after all!


Great! Thanks.


Everyone rent a dedicated server on ServerBeach and use my referral code so I get more free service (and you get $100 credit): 7XYHDMBU8A

http://www.serverbeach.com/catalog/cust_ref_landing_new.php?...

Seriously though, just because EC2/S3 are useful things doesn't mean you should use them for everything. Dedicated servers are unbelievably cheap and powerful now. They're also way more straightforward to work with. Scale up for a while, then scale out if you're hugely successful.


ec2 and mysql (or memcached for that matter) don't yet play well together. we were going to deploy a big database across several EC2 instances but the lack of static IPs and internal routing quirks make configuration and automated failover extremely complex. sadly, the ability to deploy lots of instances != the ability to scale effectively, at least for mysql/memcached.


Does S3DFS solve that for you though? What I got from S3DFS's description was that you fire up your instances and it mounts a shared filesystem. Then you could start MySQL from that, in theory.


In case you missed it, check the comments in this link [http://www.25hoursaday.com/weblog/CommentView.aspx?guid=f8e6...], discussed here [http://news.ycombinator.com/item?id=33093].

The suggestion is to constantly write db logs from EC2 to S3 instead of nightly backup, so that in the event of a crash you're only likely to lose minutes worth of data.
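
A minimal Python sketch of that log-shipping idea follows. It is an illustration under stated assumptions, not the method from the linked discussion: it assumes the database writes rotating log files named like `mysql-bin.000001` into one directory, that the newest file is still being written, and that `upload(path, key)` is a hypothetical callable you supply to copy a file to S3.

```python
import os
import time


def ship_closed_logs(log_dir, upload, already_shipped, prefix="mysql-bin."):
    """Upload every closed log file that has not been shipped yet.

    `upload(path, key)` is a hypothetical callable that copies one file to S3.
    The newest file is skipped on the assumption that it is still being written.
    Returns the list of file names shipped during this call.
    """
    names = sorted(
        n for n in os.listdir(log_dir)
        if n.startswith(prefix) and not n.endswith(".index")
    )
    shipped_now = []
    for name in names[:-1]:  # everything except the newest (assumed active) file
        if name in already_shipped:
            continue
        upload(os.path.join(log_dir, name), "db-logs/" + name)
        already_shipped.add(name)
        shipped_now.append(name)
    return shipped_now


def run_forever(log_dir, upload, interval_s=60):
    """Poll the log directory and ship newly closed files every `interval_s` seconds."""
    shipped = set()  # in-memory only; a real setup would persist this
    while True:
        ship_closed_logs(log_dir, upload, shipped)
        time.sleep(interval_s)
```

With a scheme like this, the worst-case loss is roughly the time since the last closed log file was shipped, so how often the database rotates its logs matters as much as the polling interval.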

Also note that you can restart instances with no loss of data. It's only in failures that you would lose data.


Thanks, great resources. Do you, or does anyone, know of someone who has deployed a full production website entirely on S3 + EC2?


We have.

http://www.stormpulse.com

Python (Pylons framework) + nginx + Ubuntu + S3 + EC2


Which database do you use? If any.


MySQL.


"Its only in failures that you would lose data."

Failures, and also explicit shutdowns (ex: "shutdown -h now")


Just buy a real VPS server somewhere and avoid all the issues with data persistence and EC2. Use EC2 for heavy processing and use S3 for backup/storage/static file server.

Going down the path you described will be full of pain.


Agreed, but for different reasons. In terms of bang for the buck, EC2 is awful if you want it up 24/7. And then you get other problems with EC2 too, the biggest one being its non-persistent storage (but that's something everyone already knows and is happy to ignore).

Yes, it's possible to design an application to either not use a database at all, by storing _all_ your data on S3, or to implement some workarounds that deal with DB master-slave replication, store commit logs on S3, etc etc etc. Question is: is it worth it? In my opinion, no.

S3 is great if you want to store images, movies, etc. and aren't very popular worldwide. EC2 is great if you need extra processing power to deal with extra traffic coming from those Digg/Slashdot stories, or for batch processing like converting movie formats, but going 24/7 on simple EC2 or EC2+S3 (using S3 as a database and not just for storing media files) is more of a hassle than a "good enough" solution. VPS is definitely the way to go if you cannot afford to lease a couple of servers or some rackspace for the cheap 1U boxes you can buy on eBay.


Even if used 24/7, is EC2 that bad value-wise? Assuming it costs about $70/month for one instance, it is a lot cheaper than most dedicated servers. EC2 is a lot more powerful than the average VPS, so that is not really a fair comparison. I would say it is pretty good value if you need (almost) 2 GB of RAM and a fast processor.
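
For reference, a figure of roughly $70/month is what an always-on instance comes to if it is billed at about $0.10 per hour. That hourly rate is an assumption here, not something stated in this thread, and the Python sketch below ignores bandwidth and S3 storage charges.

```python
def monthly_instance_cost(hourly_rate_usd, hours_per_day=24, days=30):
    """Cost of running one instance continuously for a month of `days` days."""
    return hourly_rate_usd * hours_per_day * days


if __name__ == "__main__":
    # 0.10 USD/hour is an assumed rate used only for illustration.
    print(f"${monthly_instance_cost(0.10):.2f} per 30-day month")
```

Any comparison against a dedicated server or VPS would need to add traffic and storage costs on top of this base number.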

If Amazon sorts out the other issues EC2 could be really great.


No, not really. Possibly this depends on where you are, but here in Portugal (where I am), I can get a better physical server for about 80 EUR/month, which is only a bit more than what I would pay for an EC2 instance (including network traffic). Is it worth the extra dollars? I think so.

Also, even though S3 doubles as a backup medium (and you _must_ back up your database), if you want a database running on EC2 you _must_ go for S3 to store your information. That too will increase your costs.

And on top of all that, you have to spend some time tweaking and testing your configuration (and testing again). If you're short on working hands, the time you spend on this activity may not be worth it.

I think that, unless you're doing it just for fun, you should let someone else help Amazon sort out the kinks of their system.

In the end, having your entire system on EC2 will not give you an edge over your competition, but the time you save by not doing it probably will.


I was just making a comparison of pricing. Not everyone needs a database, and not everyone will be too bothered by EC2's other drawbacks. For my app the persistent storage issue doesn't matter, and EC2 is good value for a number of other reasons. I agree that for the OP's needs EC2 probably isn't worth bothering with.


Would you mind pointing me in the direction of some cheap hardware, or something to Google for? I'm totally clueless in that respect, and am doing research, but any help would be greatly appreciated.

I don't really need a monstrosity, just something small and reliable.


Before answering this let me tell you that I cannot endorse any of the services because I've never used them so I don't know how good or bad a service they provide. You'll have to check this for yourself (google for opinions, etc.).

Slicehost and other similar services (VPSes) are about as cheap as it can get, but bear in mind that you're sharing hardware with other people, so don't assume you'll be the only one using the server.

If you want real hardware, google for "managed hosting". Alternatively, if you already have a 1U server lying around or can get hold of one (or more), it may be cheaper to search for "colocation services", which should show you some companies that will provide rackspace for your servers.

Renting hardware may be cheaper but if you buy a couple of servers on ebay and go for colocation, in the event of your business going belly up you can sell the hardware and maybe get enough money to last you for a couple of months while you search for a job or make new plans for world domination.


Shucks, it sounds like you're right. From the looks of it, EC2 doesn't have enough of a reliable database solution to justify putting a full production server on it. If I use the untested methods, I feel like I'm either going to always wonder if I'm going to wake up with my data gone, or if it will die from performance problems.


I can't speak about EC2, because we haven't used that. (But from what I've heard, it's not quite right for a web hosting service. We rate hosting your app yourself.)

But ... we've found S3 to be very good for hosting big stuff, and static files. Very simple. And use the S3Fox Firefox plugin for testing and getting statics on there in the debugging phase.


Oh, definitely. I'm using Rails' attachment_fu plugin to store all my images using S3. It's drop-dead simple.


As someone who's not very familiar with EC2 and S3... what's the advantage of going this route over the traditional app server(s)/web server(s)/database(s) model?


S3 is storage; it is mostly used to save money (and some people have saved a lot of money). EC2 is for computing and is attractive because of its easy scalability, but people are still trying to figure out how to best use it with webapps.


We're planning on renting a dedicated server for our web and database servers, and using EC2 for red5 instances. The red5 machines don't need to store any persistent data, and the number needed will vary according to demand, making it a good fit for EC2.



