
AWS Case Study: Parse (YC S11) - goronbjorn
http://aws.amazon.com/solutions/case-studies/parse/
======
jedberg
I'm really humbled to see that the architecture we came up with at reddit,
with help from Heroku and JustinTV, and which I then taught to Amazon and
AWS, is still the one AWS is highlighting today (and presumably teaching
their customers).
It's an amazing feeling to see your work live on beyond your involvement.

* They flew me up to Seattle to teach Amazon retail how to move to AWS back when retail had just started that process. Amusingly, the guy mainly responsible for that move now works for Pinterest (a company that "does cloud right").

~~~
voltagex_
Was it a strange experience to teach a company how to use its own product?

~~~
jedberg
I was surprised when they asked, that's for sure.

It was a little odd because I felt like there was no way I could teach these
people anything they didn't already know, but I got really good questions and
feedback, so by the end I didn't feel nearly as weird about it.

------
gfodor
I realize the diagram is likely heavily over-simplified, but what's the reason
for the nginx layer? It doesn't look like you guys are serving assets through
those boxes, so it seems like you have an unnecessary extra hop from the ELB
to the app servers. My guess is this is either nginx caching, a holdover from
the pre-AWS architecture, or a measure to keep the system decoupled from the
ELB?

~~~
tcwc
I can't speak for Parse, but I've come up with something similar in the past.
Nginx/HAProxy as a combo is far more flexible than the ELB alone; you might
want to use it for rate limiting, better load-balancing algorithms, better
logging, tweaking headers, handling errors, or controlling buffer sizes, for
example.

~~~
pjscott
Or for preventing DoS attacks from slow client connections. Unicorn is not
designed to be exposed to the outside world; it needs a reverse proxy that
does request buffering, like nginx (which the Unicorn docs recommend).

~~~
tcwc
True, though the ELB also does some simple buffering of HTTP requests.

------
lolz
Can you tell us more about the RAID10 comment?

Are you doing RAID0 now instead for disk volumes?

What would happen if there's another EBS outage that takes out a lot of your
MongoDB instances? Would you just recreate them from the very latest S3
backups?

------
tszming
Why Squid behind the cloud code servers? Are you moving from MongoDB to
Cassandra?

~~~
lacker
Squid handles external http requests that are made from cloud code.

~~~
tszming
That makes sense, thanks.

------
trustfundbaby
Why both redis and memcached?

~~~
manuelflara
I can't speak for Parse, but I use both of these technologies in one site.
Memcached is, well, for any kind of data that you want to cache: results of
SQL queries, blocks of the site (HTML), etc. Redis, on the other hand, is a
final data store in its own right (like MySQL, because it's also persistent),
but it's faster than MySQL for certain things. Take something like "list of
profiles I've visited / who have visited me recently on Facebook": this seems
like a very simple MySQL table and query, but once it gets big it starts to
get slow. With Redis, you can instead have one key per user, like
"<id_user>_visit_history", which is a sorted set of <time, id_user>. You
could do the same with Memcached (although it doesn't support the same data
types), but then it wouldn't be persistent.
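For anyone unfamiliar with sorted sets, the pattern above boils down to a few Redis commands. A rough sketch (the key name and IDs are made up, matching the "<id_user>_visit_history" convention):

```redis
# User 1001 visits user 42's profile; score is the UNIX timestamp.
ZADD 42_visit_history 1672531200 1001

# A repeat visit just updates the timestamp -- the member stays unique.
ZADD 42_visit_history 1672617600 1001

# Ten most recent visitors, newest first, with their timestamps.
ZREVRANGE 42_visit_history 0 9 WITHSCORES

# Trim everything but the 10 most recent so the set doesn't grow forever.
ZREMRANGEBYRANK 42_visit_history 0 -11
```

Because the set is ordered by score, "most recent visitors" is a single O(log N) range read instead of an ever-slower SQL sort.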

~~~
trustfundbaby
Right ... that's my point. Redis is as fast as, if not faster than, memcached
for caching exactly the same things you talked about (results of SQL queries,
blocks of the site, etc.), so why not just use Redis for everything and
eliminate memcached?

~~~
shykes
Here's one possible answer, for one specific use case:
<http://engineering.pinterest.com/posts/2012/memcache-games/>

