

Improving Running Components at Twitter - aditya
http://blog.evanweaver.com/articles/2009/03/13/qcon-presentation/

======
lr
So why are so few startups, if any, using LDAP? I would think LDAP would be a
perfect short and long-term solution for Twitter. They could put all of their
users in LDAP, and then all of the people they are following into a group
(each user gets a group). Now, when you want to know who someone is following,
that is one query (no joins like in a DB!), and then if you want to know who
is following them, again, it is one query.
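
To make the "one group per user" idea concrete, here is a minimal sketch in plain Python. It is not LDAP code: the `FollowDirectory` class and its method names are hypothetical, and the two dictionaries only stand in for a group entry per user plus an index on the membership attribute. In a real directory the reverse lookup would typically be a search on an indexed membership attribute rather than a second dictionary.

```python
from collections import defaultdict


class FollowDirectory:
    """Toy stand-in for a directory with one 'following' group per user."""

    def __init__(self):
        # user -> members of that user's group (the people they follow)
        self.following = defaultdict(set)
        # member -> owners of the groups that contain them (assumed index)
        self.member_index = defaultdict(set)

    def follow(self, user, target):
        """Add target to user's group and keep the reverse index in sync."""
        self.following[user].add(target)
        self.member_index[target].add(user)

    def who_does_user_follow(self, user):
        """One lookup: read the user's own group."""
        return set(self.following.get(user, ()))

    def who_follows_user(self, user):
        """One lookup: find every group that lists the user as a member."""
        return set(self.member_index.get(user, ()))


directory = FollowDirectory()
directory.follow("alice", "bob")
directory.follow("carol", "bob")
directory.follow("alice", "carol")

print(sorted(directory.who_does_user_follow("alice")))
print(sorted(directory.who_follows_user("bob")))
```

Both questions are answered by a single keyed lookup with no join, which is the property the comment is arguing for. Whether a directory server sustains that at Twitter's write rate is a separate question this sketch does not address.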

LDAP (like Sun's LDAP which is free) scales beautifully (it was designed for
this, just ask the telcos), and you can easily put all of this info in memory.
Sun's 6.3 version handles large groups like this very well. (I do not work for
Sun.) And if you want open source, they could use OpenDS or Fedora DS (but I
don't know how well Fedora would do with huge groups).

Looking at those slides makes my head spin in terms of what they have tried to
engineer with memcache, etc. It just seems like they are trying to fit square
pegs into round holes.

(This is the same comment I left on Evan's blog.)

~~~
iigs
At the size they're doing things, the algorithms, data structures, and
application-specific optimizations (assumptions you can make about the data)
are where the big challenge is. It would be relatively simple to export/offer
the data as an LDAP service, but the wire protocol is arguably an
uninteresting technical anecdote.

To see this in action, slap together a Sun or Oracle LDAP server, fill it
with 2-4x physical RAM worth of data, and benchmark non-trivial queries against
it. I believe you'd see better results than a naive SQL implementation, but it
would be quite undesirable compared to their real-world transaction rates.

~~~
lr
The whole point is that you have more RAM than data, not less. If you are
trying to do queries in LDAP and all of the data (entry db and index dbs) does
not fit in RAM, then forget about it. This is why you would design the system
to take growth into account. Modern LDAP servers -- in conjunction with LDAP
proxy servers -- are designed to do this very thing. Verizon has their 75
million wireless subscriber "database" in LDAP, not an RDBMS, and the entire
dataset is in memory (spread across dozens of servers).

------
antirez
Implementing a scalable Twitter is the "Hello World" application of Redis:
<http://code.google.com/p/redis/wiki/TwitterAlikeExample>

(Here "Hello World" does not mean the simplest example, but a non-obvious one
that exercises many of the features you need to understand the system.)
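
For readers who do not follow the link, the general shape of such a design can be sketched without Redis at all. The block below uses plain Python containers as stand-ins for Redis sets and lists. It assumes the common pattern of a follower set per user and a timeline list per user that is filled when a post is written; the function names are hypothetical and this is not the code from the linked wiki page.

```python
from collections import defaultdict, deque
from itertools import count

followers = defaultdict(set)     # user -> set of users following them
timelines = defaultdict(deque)   # user -> post ids, newest first
posts = {}                       # post id -> (author, text)
_next_id = count(1)              # stand-in for an atomic id counter


def follow(user, target):
    """Record that user follows target."""
    followers[target].add(user)


def post(author, text):
    """Store a post and push its id onto the author's and each follower's timeline."""
    post_id = next(_next_id)
    posts[post_id] = (author, text)
    for reader in followers[author] | {author}:
        timelines[reader].appendleft(post_id)
    return post_id


def timeline(user, limit=10):
    """Return up to `limit` of the newest posts visible to user."""
    return [posts[pid] for pid in list(timelines[user])[:limit]]


follow("bob", "alice")
post("alice", "hello")
post("alice", "second")

print(timeline("bob"))
```

The work happens at write time: one append per follower, so reading a timeline is a single list read rather than a join over the follow graph.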

