
Evaluating key-value and document stores for short read data  - peter123
http://bcbio.wordpress.com/2009/05/10/evaluating-key-value-and-document-stores-for-short-read-data/
======
daeken
It surprises me that Tokyo Cabinet is so slow to load, as I've had great
success with it on my own, but I've not used Tyrant/pytyrant. I'd love to see
some information on where the bottlenecks are in the loading process, as it
seems to me that there's something funky going on that can likely be improved
pretty easily.

Edit: He notes in the comments that a switch to pytc (direct Tokyo Cabinet
bindings rather than going through Tyrant) would improve things, but that he
wanted to keep it distributed. I'd still like to see where the bottlenecks
are, though.

~~~
greendestiny
It was probably compressed, hence the size/load-time tradeoff.

~~~
chapmanb
Yes, that's right. For the test I ran the tyrant server with:

ttserver test.tcb#opts=ld#bnum=1000000#lcnum=10000

The d in opts specifies deflate encoding. The type of database may also make a
difference in loading times; in this case it was a B+ tree.
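
For a rough feel of that tradeoff outside Tokyo Cabinet, here is a minimal
Python sketch using the standard zlib module (deflate). The read data and the
helper names make_reads and deflate_stats are made up for illustration, and
this does not reproduce what ttserver does internally; it only shows that
deflate spends CPU time to shrink the stored bytes.

```python
import random
import time
import zlib


def make_reads(n, length=36, seed=0):
    """Generate hypothetical short reads as bytes (made-up data, fixed seed)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice("ACGT") for _ in range(length)).encode("ascii")
        for _ in range(n)
    ]


def deflate_stats(records):
    """Return (raw_bytes, compressed_bytes, seconds) for deflating all records."""
    raw = b"\n".join(records)
    start = time.perf_counter()
    packed = zlib.compress(raw)
    elapsed = time.perf_counter() - start
    # Round-trip check: decompression must give back the original bytes.
    assert zlib.decompress(packed) == raw
    return len(raw), len(packed), elapsed


raw_size, packed_size, seconds = deflate_stats(make_reads(10000))
print(raw_size, packed_size, round(seconds, 4))
```

The timing will vary by machine, so treat it as a ballpark rather than a
benchmark of any particular store.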

------
leej
I don't get why each datastore needs a Tyrant process running.
Anyway...

OTOH, he could do this in any RDBMS, because he does not have a large number of
simultaneous users.

~~~
chapmanb
There I was brainstorming ideas to replicate the hierarchical
database/collection/document management that MongoDB and CouchDB offer. One
solution I have used, and become disenchanted with, is combined keys with all
of this information munged together. It would be interesting to hear what
others do with their key/value stores.
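
To make the combined-key idea concrete, here is a minimal Python sketch. The
separator, the function names, and the plain dict standing in for the
key/value store are all assumptions for illustration, not anything from the
post. It also hints at why the approach gets unwieldy: every key has to be
built and parsed by convention, and no part may contain the separator.

```python
SEP = "/"  # assumed separator; parts must not contain it


def make_key(database, collection, doc_id):
    """Munge database, collection and document id into one flat key."""
    for part in (database, collection, doc_id):
        if SEP in part:
            raise ValueError("key part contains separator: %r" % part)
    return SEP.join((database, collection, doc_id))


def split_key(key):
    """Recover (database, collection, doc_id) from a combined key."""
    database, collection, doc_id = key.split(SEP)
    return database, collection, doc_id


def scan_collection(store, database, collection):
    """Return all entries of one collection by matching the key prefix.

    A dict is scanned linearly here; a store with ordered keys could
    answer the same query with a range scan instead.
    """
    prefix = make_key(database, collection, "")
    return {k: v for k, v in sorted(store.items()) if k.startswith(prefix)}


# Hypothetical data: a dict stands in for the key/value store.
store = {
    make_key("seqs", "reads", "r1"): "ACGT",
    make_key("seqs", "reads", "r2"): "GGCA",
    make_key("seqs", "quals", "r1"): "IIII",
}
print(list(scan_collection(store, "seqs", "reads")))
```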

