
MemcacheDB: A distributed key-value storage system designed for persistence - chaostheory
http://memcachedb.org/
======
mdasen
For those who might not know how this speedup is possible:

Traditional databases use data structures like b-tree indexes. These are
awesome! They look up data in log(n) time (n being the size of your dataset).
That means that as your data grows, the time to execute a single query doesn't
grow so much. So, what's the problem with databases? Well, the problem is that
as your data set grows, you're likely getting more queries as well. So, you
end up with m * log(n) where m is the number of queries that users are making.
Boo!

What about throwing hardware at it? I hear that's all the rage! The problem
with that is that it's _really, really_ difficult to split data across more
than a handful of machines. Replication is good for a few machines, but
eventually you hit a wall with writes since writes must happen on every server
in the cluster.

Or do they? There's a technique called sharding that you may have heard of.
Basically, it's putting some articles on server 1 and some articles on server
2. Then, when you need article 450, you need to figure out what server it's on
and get it. This is what MemcacheDB (and memcached) do. They hash the key you
give it, and that tells them which server it will be on without having to
actually ask each server whether it has it. That's a constant-time process, so it
scales infinitely (unlike those hosting companies that claim they scale
infinitely). All you have to do is keep throwing boxes at it as you grow.
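
The key-to-server mapping described above can be sketched with a simple modulo
hash (server names made up for illustration; real memcached clients typically
use consistent hashing so that adding a box remaps only a fraction of keys):

```python
import hashlib

# Hypothetical three-box cluster; these server names are illustrative.
servers = ['cache1:11211', 'cache2:11211', 'cache3:11211']

def server_for(key):
    # Hash the key and map it to a server in constant time --
    # no need to ask each box whether it holds the key.
    digest = hashlib.md5(key.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print(server_for('article:450'))
```

The same key always hashes to the same box, which is why the client can go
straight to the right server.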

Awesome! That's so much better than MySQL! Well, no. I mean, it is in some
ways, but you can't say something like "get me the latest article that someone
has commented on". That might be useful to your site, might not. All you can
do is lookup by the key.

It gets hard when you want to do something such as getting the comments for an
article. You can't say, get me the comments for article 450. You have to
actually store a list of the comment ids in the article, get the article, and
then request the comments based on that list in the article object. Worse,
let's say two people comment on an article at the same time. You'd better make
sure you start a transaction before reading the object, modify the list, and
save it. If you don't, you'll hit race conditions that can see a comment's
association lost (something you don't have to worry about with foreign-key
associations).
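
Memcached-style stores expose check-and-set (`gets`/`cas`) for exactly this
read-modify-write hazard. Here is a toy in-memory sketch of that pattern; the
class and method names are my own, mimicking the semantics rather than any
real client library:

```python
# Toy store mimicking memcached's gets/cas semantics, for illustration.
class ToyStore:
    def __init__(self):
        self.data = {}  # key -> (value, version)

    def gets(self, key):
        # Return the value together with its version token.
        return self.data.get(key, (None, 0))

    def cas(self, key, value, version):
        # Store only if nobody else has written since our read.
        if self.data.get(key, (None, 0))[1] != version:
            return False
        self.data[key] = (value, version + 1)
        return True

def add_comment(store, key, comment_id):
    while True:  # retry until our check-and-set wins
        article, version = store.gets(key)
        article = dict(article)  # copy before modifying
        article['comment_ids'] = article['comment_ids'] + [comment_id]
        if store.cas(key, article, version):
            return

store = ToyStore()
store.data['article:450'] = ({'comment_ids': [17]}, 1)
add_comment(store, 'article:450', 39)
```

Without the version check, two concurrent writers would both read `[17]` and
one appended comment id would silently vanish.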

Plus, beyond that trouble, you still can't say "get me the article with the
most comments".

To an extent, I'm wondering how this is much better than memcached. If you
still need/want an RDBMS for query capabilities, do you really need another
key-value store that saves to disk? memcached can always be reconstituted from the
RDBMS if a server fails, but I guess MemcacheDB forgoes that warm-up process
(similar to the Shop.com Cache) and that can be a bonus. However, it's a lot
slower than memcached due to the disk access.

It looks really interesting, but I'm not sure how useful it would be (I should
say, in my line of work). I won't give up queries and their site says
emphatically that it's not a cache. If you're doing something that needs a
simple key-value store, this is going to be an amazing product.

~~~
teej
"It gets hard when you want to do something such as getting the comments for
an article. You can't say, get me the comments for article 450. You have to
actually store a list of the comment ids in the article, get the article, and
then request the comments based on that list in the article object. Worse,
let's say two people comment on an article at the same time. You'd better make
sure you start a transaction before reading the object, modify the list, and
save it. If you don't, you'll hit race conditions that can see a comment's
association lost (something you don't have to worry about with foreign-key
associations)."

\----------

I'm glad you brought this up. I've been working on a project that "solves"
this issue. It's basically a distributed memory hash-table like memcached, but
the values are arrays that can be modified in a CRUD manner. I call it Alchemy
(<http://github.com/teej/alchemy/tree/master>)

So in your example you would have:

    alchemy['article_450_comment_ids'] = [17, 39, 158]

Where you could then look up your three article comments out of Memcache,
MySQL, or whatever else. At the moment, it's mostly just a proof of concept,
though I am using it in a small live app. I feel like a true distributed,
super-fast, relational, in-memory database is close, there are just a few more
pieces of the puzzle to work out. For me, Alchemy is one of those pieces.

~~~
mdasen
You can do it in an even easier way.

For example:

    article450 = {'title': 'This Awesome Thread!', 'text': 'Blargh!',
                  'comment_ids': [17, 18, 19]}

So, you can embed it right in the object just as easily as setting up another
object in the store just for the ids.

That helps with foreign-key type situations, but it doesn't help for a lot of
other things.

I guess my perspective on it is that the restrictions are fine for small apps
with few features, but for small (size of data) apps you don't need 20,000
operations per second. MySQL handles it easily. Likewise, if you're dealing
with a bigger app, you're going to want to be able to query data in more ways.
MemcacheDB is great for what it is: a key-value store. It's not a replacement
for a RDBMS.

In fact, you really can't create such a database as you propose. The issue is
that there are problems in computer science that grow at certain rates. For
example, comparison-based sorting is an n * log(n) problem. You simply can't do better than
that. That's how indexes work (very simplistically). If you limit yourself to
operations that you can do in constant time, you have hash tables. You can't
query a hash table except by key. Distributed hash tables are decently well
known systems, but they don't replace databases in most applications and you
can't make them more database like unless you eliminate the property that
makes them scale so well.
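
The limitation is easy to see in miniature: a hash table answers key lookups
in constant time, but any other question degrades to scanning every entry
(data below is made up for illustration):

```python
# Tiny illustrative dataset; keys are the only index a hash table has.
articles = {
    'article:1': {'title': 'A', 'comment_ids': [17, 39]},
    'article:2': {'title': 'B', 'comment_ids': [158]},
}

# Lookup by key: constant time.
a = articles['article:2']

# "Which article has the most comments?" -- must touch every value,
# so the cost grows linearly with the size of the table.
most = max(articles, key=lambda k: len(articles[k]['comment_ids']))
```

An RDBMS answers the second question quickly only because it maintains extra
index structures, which is precisely the machinery a plain key-value store
gives up.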

~~~
teej
"So, you can embed it right in the object just as easily as setting up another
object in the store just for the ids."

That doesn't solve any of the issues for which I built Alchemy. The point was
to create an array store that could be reliably modified without transactions.
I built it after having real-life issues with race conditions in Memcache
where transactions weren't an acceptable alternative.

In the end, I really would like to see cache solutions that can be easily
dropped in for the different bottleneck situations, but keeping the standard
RDBMS.

------
chops
This seems very cool.

I could see this going a step further to merge it with being an actual cache,
keeping much of the frequently used data in memory, while still saving the
data so that it's persistent. I'm for sure going to tinker around with this.

MemcacheQ (message queuing system) linked on that page seems like it will
solve a problem I had been thinking about recently for my system.

At first glance, this looks like some pretty useful stuff overall.

~~~
schtono
MemcacheQ seems especially interesting! Anyway, the 64KB-1 limit on message
body length is a bit frustrating...

------
strlen
I've used an earlier version of this - and unless things have changed since
Spring 2008 - this isn't usable with a high volume of traffic: I benchmarked
it by having multiple fork()'d processes load data from a "partitioned" MySQL
database - at one point, memcachedb just gave up and quit reading from the
UNIX domain socket.

A possible alternative is Tokyo Cabinet: Tokyo Cabinet is a DBM implementation
(BDB - on which memcachedb is based - is another) that memory maps the entire
hash bucket. Tokyo Cabinet also has an RPC layer built on top of it which
specifically has a memcached compatibility mode. I haven't yet tried, but I'd
be interested to see what results people have with it:

<http://tokyocabinet.sourceforge.net/index.html>
<http://tokyocabinet.sourceforge.net/tyrantdoc/>

------
ezmobius
I've played with this and I like the idea a lot, but I found that BDB still
sucks and gets corrupted too easily.

So instead I've been using Tokyo Tyrant. It supports the memcached protocol as
well as HTTP and native client protocols.

<http://tokyocabinet.sourceforge.net/tyrantdoc/>

Much better than memcachedb, and it serves the same purpose.

~~~
chadr
Do you have any more info on the corruption? I have found bdb to be very
reliable. I used it as the backend storage engine for Sun's LDAP server
product.

------
geuis
Does this support a REST command interface?

~~~
mdasen
No. It supports the memcached interface. Most languages have libraries for
this as memcached is decently popular.
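
For context, here is roughly what that memcached text protocol looks like on
the wire, sketched as raw command builders (the function names are my own,
for illustration; real client libraries do this for you):

```python
# Build memcached text-protocol requests, which MemcacheDB also speaks.
# Format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
def set_command(key, value, flags=0, exptime=0):
    data = value.encode()
    return (f'set {key} {flags} {exptime} {len(data)}\r\n'.encode()
            + data + b'\r\n')

# Format: get <key>\r\n
def get_command(key):
    return f'get {key}\r\n'.encode()

print(set_command('article:450', 'Blargh!'))
```

Because the protocol is this simple and this widespread, almost every language
already has a client that works against MemcacheDB unchanged.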

------
moonpolysoft
I've been working a lot on the performance aspects of Dynomite and I think it
is a better solution now than memcachedb. One big issue with memcachedb is
that it leaves replication management completely to the user. So if you want
to be able to seamlessly handle machine failure with a high traffic site, you
need to do a lot of work to set up replicas, do clever things with DNS, and
even then you won't get a guarantee of availability for that data. Dynomite
handles all of that transparently, plus it isn't tied to a single storage
engine, so you can use a different storage engine suited to the kind of load
you plan to generate.

The code is here: <http://github.com/cliffmoon/dynomite>

