
How do you get around the IO issue? We've been looking at EC2 for, well, forever, but IO performance has always been a huge issue -- scaling it to match a couple of self-hosted servers would cost more than hosting it ourselves.



Since it hasn't been mentioned yet: a few months ago Amazon released EBS Optimized and Provisioned IOPS, which vastly improve EBS throughput.

http://aws.typepad.com/aws/2012/08/fast-forward-provisioned-...

-----


Their new SSD-based instances are supposed to be a lot faster, but has anyone benchmarked these?

My experience with EBS was that the performance was decent most of the time, but would brown out intermittently for reasons beyond your control.

-----


Netflix managed to discover that the new instances cut server costs in half.

http://techblog.netflix.com/2012/07/benchmarking-high-perfor...

Keep in mind this is purely research and it may (or may not) have already been implemented as part of their production systems.

-----


Not all services are I/O bound. Hell, it's routine in the modern world to have datasets for real world products that fit in, what, $20 of RAM?

-----


Which is exactly the case in our environment. As soon as all libraries are loaded from EBS, IO is not that much of an issue any more.

-----


We store all the code and everything that changes in a RAM disk, so it's really fast.

The virtual machine lies on EBS, but it is really fast to start and performance is not limited by EBS at all in our case.

-----


When I last used EC2, it was almost exclusively for async backend processes. In those cases, high throughput is important, but latency is not a factor in the same way. If there's a 300ms delay talking to the system from the outside world, it's no big deal.

That said, Netflix seems to have solved the IO issue with EC2. I'd be curious to know more about their findings.

-----


Main way we solve IO issues: We don't rely on EBS -- most data is stored on the local drives using Cassandra.

For cache throughput, our cache library adds zone affinity to memcache, so the clients are hitting the cache in the same zone first.

Also, we're now exploring their SSD options (http://techblog.netflix.com/2012/07/benchmarking-high-perfor...)
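
To illustrate the zone-affinity idea mentioned above, here is a minimal sketch. It is not the actual cache library: `CacheNode`, `order_by_zone_affinity`, and `zone_aware_get` are hypothetical names, the zone strings are placeholders, and a plain dict stands in for a real memcached server. It only shows the general pattern of trying same-zone nodes first and falling back to other zones on a miss.

```python
from dataclasses import dataclass, field


@dataclass
class CacheNode:
    """Hypothetical stand-in for a memcached client; a dict plays the server."""
    name: str
    zone: str
    store: dict = field(default_factory=dict)

    def get(self, key):
        # Returns None on a miss, like most memcached client libraries.
        return self.store.get(key)


def order_by_zone_affinity(nodes, local_zone):
    """Return nodes with those in local_zone first.

    sorted() is stable, so the relative order within each group is kept.
    """
    return sorted(nodes, key=lambda node: node.zone != local_zone)


def zone_aware_get(nodes, local_zone, key):
    """Try the local zone first, then fall back to other zones on a miss."""
    for node in order_by_zone_affinity(nodes, local_zone):
        value = node.get(key)
        if value is not None:
            return value, node.name
    return None, None


if __name__ == "__main__":
    nodes = [
        CacheNode("cache-a", "zone-a", {"user:1": "from zone a"}),
        CacheNode("cache-b", "zone-b", {"user:1": "from zone b"}),
    ]
    # A client running in zone-b is served by the zone-b node first.
    print(zone_aware_get(nodes, "zone-b", "user:1"))
```

The point of the ordering is that a hit in the local zone avoids a cross-zone network hop; the fallback loop is only one possible policy, and a real setup would also have to decide how to populate and invalidate the replicas.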

-----



