

Why did we take reddit down for 71 minutes? - icey
http://blog.reddit.com/2010/01/why-did-we-take-reddit-down-for-71.html

======
jbellis
"Memcachedb also has another feature that blocks all reads while it writes to
the disk."

Seriously?

Wow.

~~~
davidw
Sounds like it's time to look at Redis...

~~~
simonw
Or Tokyo Cabinet / Tyrant, which is still a high-performance key/value store
but doesn't need to fit everything in RAM. Depends on how much they're
storing.

~~~
mrduncan
Virtual memory support is currently being added to Redis so that it won't need
to fit everything in RAM either. I'm sure antirez can provide some better info
on how it'll work.

~~~
antirez
Hello, VM is actually already in alpha on Git. There is some more work to do,
but I don't think the Reddit use case needs Redis VM: they are using
MemcacheDB as a persistent cache, and a cache should match the very high
performance delivered by memcached without being required to hold everything.
Redis is as fast as or faster than memcached (given equally fast clients) and
is persistent, so it's probably a good fit for this problem.

Instead of VM, I guess Reddit should use Redis EXPIRE: a time-to-live on
cached keys so that they expire automatically.

Btw, for people who don't know what Redis Virtual Memory is: with VM, Redis
can swap rarely used keys out to disk. This makes a lot of sense when using
Redis as a DB. When using Redis as a cache, the way to go is EXPIRE: rarely
used things in a cache should simply _go away_ and be expired instead of
being moved to disk.
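
For example (a sketch; the key name and TTL here are made up, not reddit's
actual schema):

    SET comments:8431 "<serialized comment tree>"
    EXPIRE comments:8431 3600

An hour later the key is simply gone, with no disk swapping involved.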

EDIT: It would be very interesting to know where the Reddit performance
problem is, but Redis sorted sets are a very good match for building
social-news sites like HN or Reddit in a scalable, distributed way, with a
few workers processing recent stories to update their scores in the sorted
set. The home page can then be generated with ZREVRANGE without any
computation.
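
For example (hypothetical keys and scores; the score is whatever your
ranking function computes):

    ZADD frontpage 1532.7 story:8431
    ZADD frontpage 1498.2 story:8432
    ZREVRANGE frontpage 0 24

ZREVRANGE returns the top 25 stories already sorted, in O(log(N)+M) time.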

It's a shame reddit is not sharing how this cache is used.

------
ShabbyDoo
I'm bothered by the need to run SW RAID on top of HW RAID. One would think
that Amazon would sell faster EBS "disks" for a premium.

And slower disks for a discount? But I guess that's what S3 is for.

~~~
psranga
I think it works like this:

Even if Amazon uses 'k' HDDs to back one EBS "disk", you're sharing the real
drives with other users, so you don't get the performance of 'k' drives, only
a fraction of it.

By RAIDing over 'n' EBS "disks", you effectively compensate for the
performance lost to sharing: if contention leaves you, say, a quarter of a
spindle per volume, striping across four volumes gets you back roughly one
spindle's worth.

~~~
ShabbyDoo
I get what the stack looks like, but it seems really broken and likely quite
inefficient. Thus far, Amazon has gone after greenfield applications which can
be written within the constraints of their cloud platform. However, there are
a ton of people hosting their own SQL-database-backed apps where a single DB is
the bottleneck. Without significant refactoring, these apps can only scale
vertically with the DB. So, while Amazon provides nice, big boxes to run
SQLServer/MySQL/etc., disk performance is that of a desktop machine -- hardly
a balanced system. How many more customers could they capture if they offered
premium, high-performance storage options?

~~~
ericd
You hit the nail on the head as to why I'm looking into a physical DB server
with RAIDed SSDs instead of hopping onto EC2. I would love to use Amazon and
not have to deal with the potential headaches of managing physical machines,
but the stories (maybe FUD) of having to RAID EBS volumes, spool up 20
instances to find the winners and kill the rest, etc. really kill the appeal.

If they could promise me consistent database performance on par with a really
nice physical machine, I would gladly fork over $500/month for it.

~~~
nethergoat
As someone who has spent the last year and a half running a
200-(persistent)node environment on EC2, including multiple m1.large and
m1.xlarge DB pools, I can assure you those stories stem from FUD and
unreasonable expectations.

Yes, EBS is not very fast, especially compared to local disk. You can work
around this, however, by combining multiple volumes into a RAID array (as you
mentioned), or by scaling out with additional nodes. The size and workload of
your database will dictate which is more cost-effective.

Spooling up many nodes to find the "best" one is completely unnecessary. In my
experience, EC2 nodes have been remarkably consistent in performance. I won't
say I run a load test on every one, but over 100,000 node launches I've never
had to shut down a poorly performing instance that couldn't be attributed to
a hardware issue (rare, and for which Amazon sends notifications).

Don't listen to the naysayers. Come on in, the water's fine!

~~~
ericd
Thanks very much for the FUD-debunk, it's always great to get advice from
someone who has thoroughly kicked the tires of something. I may start
considering it once again.

Would you mind sharing what kind of small-block IO/sec numbers you've seen
from the EBS volumes? My app tends to generate lots of IO without much
cacheability, and it has a relatively small dataset, which is why I'm
considering SSDs in the first place.

------
mark_l_watson
I really like 'behind the scenes' stories like this from big sites like
Reddit, Facebook, Heroku, etc. I work on a smaller scale of a few EC2
instances at a time, but I really enjoy the scaling info.

------
jacquesm
One major reason not to hop on the cloud bandwagon just yet is issues like
these. The more layers underneath that are not under your control the more
layers you'll have to add to remedy that.

Systems with excessive complexity are hard to debug, especially when it comes
to analyzing performance issues.

Given complete control of the hardware from the ground up, it can already be
quite hard to accurately pinpoint a bottleneck so you can solve it. Adding a
lot of stuff between your code and the hardware is not going to make that any
easier.

Typically a stack has 6 layers before you get to your application: drive,
controller, driver, filesystem, database, app.

In a cloud environment anything under the filesystem layer is effectively out
of your control and out of your ability to troubleshoot. The solution, adding
another layer of complexity in order to combat the slowdown, is really the
opposite of what an ideal cloud environment would give you.

After all, the #1 selling point of the cloud is scalability and performance.

I think it would be best if Amazon worked with the OP to resolve the issue as
a problem ticket rather than trying to solve it by adding software RAID.

Of course, that's just armchair reasoning, not being in the hot seat makes
life easier.

~~~
rms
Reddit went on the cloud bandwagon not because they thought it was superior
to managed servers, but because Conde Nast's IT department sucked. With the
ongoing growth, they were having trouble procuring additional servers as
needed, and moving to Amazon solved that problem.

~~~
jedberg
That's not entirely true. Yes, it is true that getting servers was hard, and
that was definitely a factor.

But the bigger factor for me was that I was tired of having to build, image
and rack all those servers. I liked the flexibility of EC2, and also not
having to waste resources ordering a full rack's worth of hardware every time.

Cost was also an issue. Datacenter space in SF is expensive, but it had to be
in SF, because that is where I was. EC2 proved to be much cheaper than
physical servers.

I also like the fact that I don't have to run to the datacenter anymore when
there is an issue. I just file a ticket with Amazon.

~~~
rms
OK, thanks for the clarification. I'm trying to find the post I read that gave
me that idea but can't find it. Was it in your AMA?

~~~
jedberg
Could have been. Or possibly something Spez said.

~~~
rms
I thought it could have been a spez or kn0thing post but I went through all of
them and couldn't find anything. I guess it could have been deleted, but I'm
going to chalk this up to faulty memory on my part; you were there.

------
shrike
Does anyone know if reddit is doing anything special to create the RAID? Or is
it just mdadm?

~~~
jedberg
Just mdadm.
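
A basic stripe over EBS volumes looks something like this (device names,
RAID level, and volume count here are illustrative):

    mdadm --create /dev/md0 --level=0 --raid-devices=4 \
        /dev/sdf /dev/sdg /dev/sdh /dev/sdi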

------
dgreensp
Reddit is down super-often for me. This very minute I can't log in or use the
site while logged in; it just serves me 503 errors.

------
rubyrescue
Sounds like a redefinition of 'in any way' to me...

"So why am I singing the praises of Amazon and EC2? Mainly to dispel the
opinion that the site getting slower since the move is in any way related to
Amazon...(snipped)...Unfortunately, the single EBS volumes they were on could
not handle these bursting writes."

~~~
aaronblohowiak
This isn't an EC2 issue, this is a SAN issue. Whether it is EBS or an NFS
drive, meh. This is an architecture issue, and while it is a result of the
underlying hardware, the underlying hardware is not the constraint.

~~~
rubyrescue
Agreed, it's largely an architecture issue; however, poor EBS performance is
a contributing factor, and he seems to go out of his way to say that it's
not...

~~~
pavs
He went out of his way because most reddit users keep blaming AWS for the
recent issues. As jedberg (reddit's IT guy) recently mentioned, the problem is
not with AWS scaling but with reddit's software scaling.

