
Foursquare and MongoDB: What If? - roder
http://facility9.com/2010/10/26/foursquare-and-mongodb-what-if
======
luigi
This is "how I would make Twitter scale" all over again. Unless you're an
engineer working on that particular system, you can't credibly give advice on
how to improve it. Doing so is an insult to the profession.

~~~
megaman821
What a bull-shit statement. So Twitter engineers can't take any outside advice
because those people aren't working on Twitter's particular system. No one
else in the world could possibly know how to make large scale messaging
systems work.

I am not saying any of the arm-chair analysts are right but it is pretty
presumptuous to dismiss them outright.

~~~
cubes
While luigi's statement is a little extreme, I think the gist is right. Yes,
outsiders can offer advice on a system, but it's really difficult to offer
sound advice without specific details of the system. Perhaps outsider advice
shouldn't be dismissed outright, but it ought to be taken with a grain of salt.

~~~
megaman821
I very much agree that the devil is in the details, and the largest detail
here is how Foursquare goes about calculating badges. Without knowing that,
the article can't fully evaluate alternatives. That has less to do with the
author not being an engineer on Foursquare's team than with Foursquare not
detailing the calculation. Maybe on Riak map/reduce is fast enough and
Foursquare can shard by checkin, which would be much easier to balance.

Maybe they should just implement their own sharding based on user activity.
Maybe a few SSD drives on their Postgres server would solve the issue. They
threw the issue out there; if nothing else works, tell us why.

~~~
cubes
Some informative links, in case you haven't seen them already:

* [http://www.quora.com/What-caused-Foursquares-downtime-on-Oct...](http://www.quora.com/What-caused-Foursquares-downtime-on-October-4-2010)

* <http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/>

* [https://groups.google.com/group/mongodb-user/browse_thread/t...](https://groups.google.com/group/mongodb-user/browse_thread/thread/66752f49af68619?pli=1)

Someone on this thread suggested that Foursquare's performance problem is
related to the calculation of badges. I'm not sure where this idea came from.
I haven't seen any mention of the issue being related to badges in first or
secondhand sources.

That said, Mongo DB's map/reduce operation would not be a reasonable solution
at this time. Mongo DB's map/reduce performance is, at present, somewhat
lacking because it runs via the JavaScript engine, which is currently
single-threaded. I know there are plans to improve performance of Mongo DB's
JavaScript engine by switching to V8, but I don't know if V8 is multithreaded.

Often design decisions that look bad in hindsight get baked in early. By the
time you realize that an alternate design would yield better performance,
there may be too much data to migrate so you just have to live with it.

~~~
whakojacko
harryh mentioned it here: <http://news.ycombinator.com/item?id=1769909>
Computing the badges online (ie at each checkin time) requires having the
whole db in ram for acceptable performance.

------
terryjsmith
This is essentially a non-discussion depending on how they're using Mongo. I
suspect that they, as we are, might be making use of Mongo's ability to query
documents based on embedded objects, which means their only choices would have
been a traditional RDBMS with JOINs or MongoDB.

~~~
ericflo
That's just absolutely false. What makes you think that other database systems
can't be made to perform these kinds of queries?

~~~
terryjsmith
This is based on my own experience at our startup. We looked at a number of
NoSQL databases for our backend storage. While others may "be made to",
MongoDB was the only one that advertised it as a feature and was by far the
easiest to use for that feature.

------
irrelative
I'm not questioning the validity of the data point, but why would RDBMS have a
cost of 10 times per unit of storage? I haven't heard this reasoning before.
Anyone know where this comes from?

~~~
cdavid
Assuming RDBMS is to be understood as ACID, this may be because of the need
for more reliable hardware, and good IO hardware is comparatively expensive?

If you use things like mongodb and co, you are giving up on single-node
durability, for example, and you don't need IO that is as efficient (e.g.
reasonable fsync performance). Otherwise, I don't see why mysql would require
more expensive hardware than mongodb.

------
andrewjshults
Interesting writeup. The thing it seems not to address is that a different
sharding model might not have worked for foursquare's data access needs. In
this HN post, <http://news.ycombinator.com/item?id=1769982>, harryh alludes to
the fact that using user ID as the shard key was a design decision. Their
current design requires that a user's entire checkin history be loadable for
calculating badges, so sharding by user ID might have been necessary to avoid
the performance overhead of multi-server retrievals.

From reading the various write-ups, it seems like the real problems (given the
current load-all-data method for badge calculations) came from getting too
close to the max memory, and then from the fact that foursquare's checkins
weren't filling full pages, so migrating data didn't actually give nodes 1 & 2
any place to write new data to (in memory) until after compaction could be
performed.

~~~
megaman821
Why not shard by check-in id and map/reduce to get the badge calculation? When
sharding by user id, no solution is going to lead to a balanced cluster since
some users will always be more active than others. If map/reduce takes too
long, batch it and return the badge calculation later.
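
To make the balance argument concrete, here is a rough sketch in Python. It is
not Foursquare's or MongoDB's sharding code: the hash-modulo placement, the
`shard_for` and `shard_loads` helpers, and the made-up workload (one heavy
user plus a hundred light ones) are all assumptions for illustration.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Hypothetical placement rule: hash the shard key, take it modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def shard_loads(checkins, num_shards, by):
    """Count how many check-ins land on each shard when sharding on field `by`."""
    loads = Counter({shard: 0 for shard in range(num_shards)})
    for checkin in checkins:
        loads[shard_for(str(checkin[by]), num_shards)] += 1
    return [loads[shard] for shard in range(num_shards)]


# Made-up workload: one very active user (900 check-ins) and 100 users with one each.
checkins = []
next_id = 0
for user_id, count in [("heavy", 900)] + [(f"user{i}", 1) for i in range(100)]:
    for _ in range(count):
        checkins.append({"checkin_id": next_id, "user_id": user_id})
        next_id += 1

by_user = shard_loads(checkins, 4, "user_id")        # heavy user pins one shard
by_checkin = shard_loads(checkins, 4, "checkin_id")  # spreads roughly evenly
print("by user id:    ", by_user)
print("by check-in id:", by_checkin)
```

The trade-off is the one raised above: with check-in id as the key, one
user's history is spread over every shard, so the badge calculation has to
gather from all of them (map/reduce or batch) instead of reading one node.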

------
cubes
I find this sort of uninformed armchair analysis to be unfulfilling.

------
Qz
_This begs the question “How we could handle the constant write load?”_

Please stop doing that. <http://begthequestion.info/>

~~~
sausagefeet
Language is dynamic, get over it. Unless you want to stick to 'thee' and 'thy'
(which clearly is not the first form of English). IMO "Begs the question" ==
"Raise the question" makes much more sense than the 'correct' interpretation.

~~~
Qz
I suppose I should of expected that reply.

~~~
sausagefeet
I'm certainly not saying it to be a douche, it's just how language has
worked...forever. Fighting it is pretty useless IMO.

------
swah
Off-topic: IMHO, this page has a unique and beautiful design (reminds me of
academic papers)

~~~
pjscott
I disagree; it might seem pretty from a distance, but that's about the only
good thing about the design.

The font is almost unreadably tiny, the lines are too long, and the links are
in simple boldface, rather than underlined and/or in a different color. The
headings are _also_ represented by boldface text of the same size, which
actually stands out less than the link text. There's very little visual
distinction between the body of the article and the navigation junk
surrounding it, making for a visually confusing layout.

This is a prime example of why the Readability bookmarklet is great.

~~~
Qz
Yeah, that page lasted about 2.5 seconds before I Readabilitified it.

------
jcapote
"To be honest, Foursquare’s data load would be trivial in any relational
database." Pretty insulting statement, imo.

------
narrator
A little off-topic here, but it kind of puzzles me why MongoDB gets so much
attention while the other half of FourSquare's stack, Scala/Lift, which scales
just fine and has not given them problems at all, gets hardly any attention.

~~~
sausagefeet
I think you answered your own question...

------
bhiggins
Foursquare did add more MongoDB servers, the problem was that the 5% of
records that were migrated off of the full server were small, so the removal
didn't result in many memory pages being released, so it didn't help. This
article does not address how any of the alternatives address small records and
rebalancing with small records.
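
A toy model of that effect, in Python. This is a simplification and not how
MongoDB lays out memory: the fixed number of records per page, the "remove
every 20th record" pattern standing in for the 5% migration, and the
`pages_freed` helper are all assumptions for illustration.

```python
def pages_freed(num_records: int, records_per_page: int, remove_every: int) -> int:
    """Return how many pages become completely empty after removing
    every `remove_every`-th record from densely packed pages."""
    pages = {}
    for rec in range(num_records):
        pages.setdefault(rec // records_per_page, set()).add(rec)

    # Remove roughly 1/remove_every of the records, spread across the data set.
    for rec in range(0, num_records, remove_every):
        pages[rec // records_per_page].discard(rec)

    # Only a page with no live records left can be released.
    return sum(1 for live in pages.values() if not live)


# Small records: 16 per page. Removing 5% leaves a hole in many pages
# but empties none of them.
small = pages_freed(num_records=1600, records_per_page=16, remove_every=20)

# Page-sized records: one per page. The same 5% removal frees a page each time.
large = pages_freed(num_records=1600, records_per_page=1, remove_every=20)

print("pages freed with small records:", small)
print("pages freed with page-sized records:", large)
```

In this model the small-record case frees no pages until the survivors are
repacked, which is roughly the role compaction plays in the write-ups.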

