
This is what 864GB of RAM looks like now - wlll
http://37signals.com/svn/posts/3090-basecamp-nexts-caching-hardware
======
rwmj
Pfft ... We were testing RHEL 6 on machines with 2 and 4 TB of RAM (in a
single machine) the other day.

They have lots of cores and are very NUMA.

~~~
mdonahoe
I was 2/4 for understanding the acronyms in your post.

RHEL 6 = Red Hat Enterprise Linux
NUMA = Non-Uniform Memory Access

~~~
Vargas
TB = terabyte
RAM = random access memory

I would have thought TB and RAM were easier than RHEL and NUMA.

~~~
mdonahoe
Haha those were the ones I understood! I was explaining for anyone else.

------
forgotusername
I'd be more interested in a writeup of what data you actually _store_. I've
used Basecamp at a customer's before, and it certainly didn't seem to justify
1TB of RAM; it looked like the kind of data you could serve quickly from 100kb
of RAM.

This sounds more like Reddit's problem, where some architectural
simplifications might net a giant win versus piling yet more gunk on top
(Reddit is still perceptibly doing random IO for every comment in a thread
during page load, or perhaps some insanely slow sorting; I have NFC how they
haven't fixed this yet).

~~~
LokiSnake
I'd love to hear how Reddit is doing it wrong and can be improved. As far as I
can tell, they want users to see the freshest content possible (latest
comments, up/down vote counts, etc.). Is it really that easy?

~~~
rbranson
Read on: <https://github.com/reddit/reddit>

------
herge
We had a server with half a TB of ram. It was a lot of fun until we had to
reboot it, and then it took half an hour to perform a memory check.

~~~
zdw
I once worked on a Mac Pro that someone had installed 64GB of RAM in... it
took the better part of a minute to get to the startup gong.

They ended up splitting the memory between a few machines as it became obvious
that the applications being used (Adobe CS stuff) weren't going to use all
that RAM in their use case, and other machines needed it more.

~~~
mashmac2
Although the new Final Cut Pro X eats RAM for breakfast. Our editing Mac Pro
at work regularly uses 16GB+ when working in FCPX with multiple projects and
events. Frustrating at times, to say the least.

------
adestefan
"...russian-doll architecture of nested caching that I’ll write up in detail
soon."

The details will be interesting, but that description sounds like a headache
waiting to happen.

~~~
raghus
The cache invalidation bit is going to be the interesting part of that
write-up.

~~~
dhh
I'll give away that secret right now: Key-based expiration.

~~~
codyrobbins
Are you defining dependencies between the keys somehow, so that a key
invalidated further down in the hierarchy propagates invalidations all the way
up the stack?

~~~
dhh
Key-dependency is achieved using touch-parent-on-update.

~~~
codyrobbins
Whoa—I can’t believe this #cache_key and :touch stuff was introduced years ago
and I somehow completely missed it. This is the most useful and elegant thing
ever!
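
For anyone else who missed it, the pattern looks roughly like this in
ActiveRecord. This is just a minimal sketch with made-up model names, not
Basecamp's actual code:

    # Hypothetical models illustrating touch-parent-on-update.
    class Project < ActiveRecord::Base
      has_many :todos
    end

    class Todo < ActiveRecord::Base
      # Saving or destroying a todo also bumps project.updated_at.
      belongs_to :project, :touch => true
    end

    # #cache_key builds a key from the class, id and updated_at timestamp.
    todo = Todo.find(42)
    todo.cache_key          # => "todos/42-20120321142725"
    todo.project.cache_key  # => "projects/7-20120321142725"

Fragments cached under those keys expire themselves: updating a todo touches
its project, the parent's updated_at (and therefore its cache key) changes,
and the stale nested fragments simply stop being read.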

------
lrobb
One of the bright spots of my career was building a distributed system that
spanned 64 different nodes and regularly did over 100Mb/sec... built on
machines holding (at the time) a top-end config of 2GB of RAM per machine...

That's more RAM than our entire system had.

~~~
foobarbazetc
That's vastly more interesting and impressive than a blog post about how much
RAM some company bought, though. :)

You've got $12K? Good for you. Here's a cookie.

You've scaled out a system onto 64 nodes? That's something worth discussing on
HN.

------
illumin8
This is what 864GB of RAM looks like after you've laid it all out on a table
without using ESD (electrostatic discharge) protection. A few of these DIMMs
are probably bad now, but you won't know which ones because you're going to
stuff them all in a Linux box that you built yourself... this is such a bad
idea for a production system.

~~~
Aloisius
In my decades of building and taking apart computers, I have never had static
discharge ruin a component. I can't remember the last time I was shocked by
static discharge, for that matter.

Who are these people who need to be grounded all the time, lest they build up
component-destroying levels of static?

~~~
sp332
It depends on the humidity where you live, and what kind of clothes you wear,
and whether you're on carpet or tile, or probably what you had for breakfast,
etc. I've had ESD fry a hard drive, a RAM module, and maybe an SSD (or maybe
it was bad firmware).

~~~
Aloisius
Ahhh. I've never lived very far from the ocean, so it's always pretty humid.
Plus, I've always had wood or concrete floors, which limit static build-up,
and wood or metal tables.

I suppose if you're in Arizona, static build-up is a much bigger problem.

~~~
sedev
I'm in Arizona right now. I cuss out the static electricity about every other
day, so, yes, around here I'd use an ESD wrist strap or something if I were
going to handle DIMMs.

------
ajdecon
We've been playing with a Kove xpd (<http://kove.com/xpress>) with 2 TB RAM
lately. 2 TB RAM and four Infiniband cards make for a very fast DB server...
:-D

------
russell
The obligatory comment from the old guy:

My first computer, an IBM 7094, had 32K 36-bit words, about 200K characters.
The memory cost about $1M and was the size of a refrigerator.

~~~
dredmorbius
I had two rocks. One was down and the other was not available at my security
clearance.

------
ck2
Those must be ECC prices.

A 16GB stick of DDR3 is "only" $100, which would make the non-ECC total $5400
(54 sticks).

Ah okay, ECC is $175 a stick, so 3 servers * 18 sticks * $175 = $9450.

~~~
bradleyland
Generally speaking, you don't want to run a server with 864 GB of RAM and the
entire universe cached in RAM without ECC.

~~~
silvestrov
Generally speaking, you don't want to run a server without ECC.

~~~
jaxn
Generally speaking, you don't want to run a server.

~~~
Nrsolis
Generally speaking, you don't want to run.

~~~
batista
A funny exchange? What is this, Reddit?

Get serious, all of you.

~~~
enneff
Repeating the same tired jokes over and over again is not the same as being
funny. Original humour is generally well-appreciated on HN.

------
iamleppert
I question the architecture and approach of this much caching. Because this is
memory, I assume it's for a read-only cache?

That said: most read-heavy services need at most 10% of their working set
cached in memory. The hardest problem in caching is figuring out what that set
is and how to control consistency, not how to store it.

So, having this much cache seems to imply that you think you're going to have
large amounts of read-intensive data to cache.

Or else you intend to cache your entire working set? Either way, you'll have a
single point of failure (either in a single server, single datacenter, or
single geographic area).

Any large operation can tell you that the problem becomes 90% network, and
things like a CDN become far more important than how fast and big your cache
is.

Perhaps a nice technical writeup on your architecture would silence the pundit
inside me?

------
ch
Hmm. I wonder if those are laid out on a grounded mat or just some random
office table, sitting on carpet?

------
jebblue
That would make a great Minecraft server. For real.

~~~
gaius
Not really - Minecraft is single-threaded.

~~~
wlll
Luckily we bought single threaded RAM ;)

------
brudgers
That picture reminded me of the first time I saw a fully populated 8meg board
- for about the same price.

------
hyuen
I wonder what kind of stuff could be done with 1TB of SSDs: much cheaper than
RAM, but also somewhat slower...

~~~
lallysingh
Well, it's 256 GB of RAM each for 13 servers. I do wonder if spending the
money for the second half of that RAM on SSDs instead would result in better
performance, as your cache could be larger for the same budget. I think it'd
really come down to your cache hit rate vs. cache size.

~~~
TylerE
Not for a memcached server. If it ain't RAM, memcached can't use it.

~~~
robotresearcher
Does memcached not use virtual memory?

~~~
danudey
Memcached is designed to be fast. Anything that would make it slower and
doesn't need to be there isn't there. For example: authentication (anyone who
can access the port can fetch data), indexing (you need to know the key),
deallocating memory (you configure memcached with an upper bound; it keeps
allocating memory as needed until it gets there), etc.

Virtual memory (I assume you actually mean 'does memcached not page data out
to disk') would make it much slower. Since memcached is just that - a memory
cache - if you're out of memory it just expires the least-recently-used data.
In your application, you fetch the key, and if that fails you fetch it from
the primary data store (or wherever else you can find it).
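
In code, that fallback is just the classic cache-aside pattern. Here's a
minimal sketch using the dalli memcached client; the host and the Post model
are made up for illustration:

    require 'dalli'

    # One client talking to the memcached pool (hostname is hypothetical).
    CACHE = Dalli::Client.new('memcached.internal:11211')

    def find_post(id)
      key  = "post/#{id}"
      post = CACHE.get(key)        # fast path: value is still in memory
      if post.nil?
        post = Post.find(id)       # miss: go to the primary data store
        CACHE.set(key, post, 300)  # re-cache for 5 min; LRU may evict sooner
      end
      post
    end

The cache never has to be "right": if the key is missing or was evicted, you
just pay the cost of one trip to the primary store and refill it.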

------
16s
It's common to see servers with 100+ GB of memory these days and I've seen
some programs use it all!

------
prolepunk
I wonder if they would run into fragmentation problems with this much memory,
and if so, how they are going to deal with that.

------
st3fan
I think this site could use some of that memory and caching love. Loading
articles or even /newest is sooo slow sometimes.

------
gaius
Is this supposed to be impressive? Show me what you can do with as _little_
resource as possible. Anyone can brute-force it.

~~~
makmanalp
Devil's advocate counter-point:

Is there any point in not brute-forcing it? While you twiddle bits, I can
throw money at hardware and get my product out on the market. Then, when times
are more stable, I can worry about optimizations.

Also, since I haven't done any twiddling yet, there is likely to be a lot of
low-hanging fruit. I was going to need that hardware anyway if I was to grow.
Now I can have a period of time where my savings allow me to slow down on
buying more hardware.

~~~
gaius
_I can throw money at hardware and get my product out on the market._

Up to a certain scale what you say is true - but it's not interesting at all.
As in, anyone can drive 26 miles, but running a marathon is still impressive.
Or anyone can order dinner in a restaurant, but not everyone can cook.

~~~
groby_b
Running a marathon is _impressive_, yes. But if I'm in the business of
delivering food (to stretch an already thin analogy ;), there's no point in
training for a marathon. The constraints of my business make driving around in
a car a better choice.

And unless I'm a professional chef, I'd rather not spend time on cooking
dinners when I could do something to improve my business.

The point of building something (in a commercial environment) is not to
impress, but to ship something.

~~~
rbranson
No delivery business would post a blog entry with a picture of their new Ford
Taurus delivery car as if it were something impressive. That's the difference.

~~~
wmf
If everyone else was saying that you should use an army of autonomous
quadcopters to deliver stuff, then it would be worth pointing out that a
simpler solution is available. 37S is challenging the cloud and scale-out
cargo cults.

------
st3fan
This is not really special anymore in 2012?

~~~
wmf
No, but a lot of "sysadmins" are being raised with no knowledge of hardware,
and some others like to cache knowledge for years and need a good shock to
bring them into the present.

------
NanoWar
Reminds me of the days when 1 MB of RAM cost 500 bucks. Used it to run
Windows 95 :) after installing from floppies.

------
tezza
What sort of servers are they going into?

~ 13 servers with 4 sticks (64GB) each?

~~~
masklinn
3 servers, 18 sticks (288GB) each according to the comments.

------
jonknee
And who says Rails doesn't scale?

~~~
wlll
Facebook: "…use more than 800 servers supplying over 28 terabytes of memory to
our users." (<http://www.facebook.com/note.php?note_id=39391378919>)

The scale of a memcached cluster doesn't really say anything about the scaling
abilities of their underlying stack.

~~~
jonknee
Comparing BaseCamp and Facebook is laughable.

That said, I was referring more to their one giant database and self-described
"russian-doll architecture of nested caching".

~~~
wlll
I'd bet that Facebook uses a fairly complex caching system too, though, so
your point that extensive _caching_ says anything about the ability of the
underlying stack to scale still doesn't really work for me.

------
strictfp
"...russian-doll architecture of nested caching that I’ll write up in detail
soon."

A.k.a. object orientation? (Maybe an explanation is due here, or do other
HNers grok this and agree?)

~~~
strictfp
Ok, ok so you don't get it. Fine, I'll write a blog post about it.

------
jrockway
Before I worked at Google, 864GB of RAM would have impressed me. Now I think I
have that much in my laptop...

~~~
viraptor
Yes, big companies put some things into a different perspective. Just today I
heard a sysadmin saying they lost 50GB or so storage per server due to some
partman issue. But that's ok... adds up only to ~ 10+ TB and only temporarily
(fix via lvm resizing possible).

------
ajack
It's great that companies post articles describing their architectures and
backend systems (and I'm certainly looking forward to your follow-up post
detailing your caching system), but why show a picture of commodity hardware
available to anyone with enough money? What's impressive is your skills, not
how much money is in your bank account.

~~~
Cushman
This is a picture of $22 million in cash: [http://www.common-sense-
politics.org/wp-content/uploads/2010...](http://www.common-sense-
politics.org/wp-content/uploads/2010/05/postie-media21.jpeg)

This is a picture of more than a ton of dried marijuana:
[http://calpotnews.com/wp-
content/uploads/2010/09/090410-Corn...](http://calpotnews.com/wp-
content/uploads/2010/09/090410-Corning.jpg)

A large amount of one thing in the same place makes for a pretty cool picture.
Don't take it so seriously.

Now, with that said: 864GB? Pssh. Let's see some terabytes.

