

Reddit's May 2010 "State of the Servers" report - pavs
http://blog.reddit.com/2010/05/reddits-may-2010-state-of-servers.html

======
jbellis
Incidentally, on EC2 the extra CPU in an XL or HMXL makes all the difference
between "oh shit, we ran out of capacity and bootstrapping more slows things
to a crawl" and "we ran out of capacity but we can limp along during the
bootstrap." YMMV.

Reddit was originally on L instances.

(HMXL is generally the sweet spot for Cassandra price/performance on EC2,
IMO.)

~~~
jfager
Is it normal for people to run Cassandra on only 3 nodes in production
environments? I realize that one of the selling points is starting small and
then scaling out without any headaches, but 3 nodes seems extreme in that
regard.

~~~
jbellis
Well, roughly speaking, you can group Cassandra deployments into two
categories+: new products that are hoping they need Cassandra's scaling
ability someday, and existing products moving to Cassandra from something else
because the pain of scaling something that wasn't designed for it forces them
to.

The first category will start small. The smallest production deployment I know
of is a single 256MB VM, but usually, even starting small, you should not have
fewer than 2 servers (why tempt fate with a non-redundant setup when Cassandra
makes it so easy to be safe?).

The second group is where you see larger deployments out of the gate. 3
machines does seem small for reddit; I guess they made up the difference with
memcached. Unfortunately they deployed just before the 0.6.0 final release was
out, which is where we added the row cache feature that could have made
memcached unnecessary.

+There are people using Cassandra for non-scale-related reasons, though. Most
of these people are motivated by multi-datacenter replication.

------
apike
I'm not surprised that Cassandra doesn't perform ideally with only three
nodes, considering the scales it's intended for. Does anybody know how many
nodes are required for its resiliency safeguards to work properly?

~~~
megablast
It seems odd that it is configured to look up a key-pair, when that key-pair
is no longer needed. Surely it would be better to have it no longer cache
queries that are no longer needed.

~~~
jbellis
It's like how in PostgreSQL, if a client runs "select * from lots", and you
kill the client, the query keeps going even though there's nobody to hand the
answer to.

That said, there are ways we can mitigate this, primarily in
<https://issues.apache.org/jira/browse/CASSANDRA-685>

------
hello_moto
I'm in the middle of reading materials about system analysis and design, in
particular about Enterprise Application Integration.

I took a break and checked out Reddit and HN. Stumbled upon this article and
the GrooveShark AMA. I realized that these days some of the big websites are
moving toward a situation similar to that of typical enterprise apps, where
different components/sub-systems are written in different technologies. Is my
assumption wrong?

Seems like the only difference between Enterprise Apps (whatever Enterprise
means) and these big Web 2.0 apps is the users: the former are geared toward
businesses, whereas the latter lean toward customers/end-users. The technology
obstacles are rather similar.

------
trin_
"We've written to Trend Micro explaining that we're actually neither a spammer
nor an individual end user, but rather an honest website that's kind of a big
deal, and they sent us a form letter explaining how to configure Outlook
Express and encouraging us to ask our ISP for further information."

ahh the joy of robot-email-responders.

~~~
AdamN
Even for a site the size of Reddit, an outsourced emailer is the way to go. I
use AuthSMTP.net but there are probably better ones for higher volume.

------
antirez
Rule #1: when you can't scale, it's always your fault.

