Cdb: a fast, reliable, simple package for creating, reading constant databases (cr.yp.to)
91 points by networked on May 10, 2014 | 41 comments



We know that a number of very high-traffic, high-volume startups in Silicon Valley have gone great distances with CDB... OpenDNS included. :-)


Unless you are just storing a couple entries that rarely change, you should really consider http://symas.com/mdb/ nowadays.


CDB is awesome for its use case - slow-changing, read-heavy workflows that are tolerant of stale data. One limitation of the original implementation is the use of 32-bit offsets for addressing, which limits the addressable size to 4 GB. There are 64-bit modifications, but I have not used them. Does anyone have an opinion on any of the 64-bit implementations?


An alternative to using 64-bit pointers would be to use something like the JVM's "compressed OOPs":

https://wikis.oracle.com/display/HotSpotInternals/Compressed...

Where the pointers are notionally 32+k bits long, but the bottom k bits are always zero, so they can be stored in 32 bits. This would mean that records always have to be aligned to a 2^k-byte boundary. If the data has some natural alignment anyway, this could be arranged so as to not involve wasted space, and even if it doesn't, for large keys and/or values, the amount of wasted space would be relatively small.
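To make the arithmetic concrete, here is a minimal sketch of that compression scheme (hypothetical function names; the real cdb format just stores plain 32-bit offsets):

```python
K = 3  # assume records are aligned to 2^3 = 8-byte boundaries

def compress(offset: int) -> int:
    """Store a (32+K)-bit, 2^K-aligned offset in 32 bits by dropping the K low zero bits."""
    assert offset % (1 << K) == 0, "offset must be 2^K-aligned"
    compressed = offset >> K
    assert compressed < (1 << 32), "offset exceeds the 32+K-bit addressable range"
    return compressed

def decompress(compressed: int) -> int:
    """Recover the full offset by shifting the K low zero bits back in."""
    return compressed << K
```

With K = 3 this extends the addressable range from 4 GB to 32 GB, at the cost of padding each record to an 8-byte boundary.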

Alternatively, if there were a way to reliably locate the start of a record when scanning through data, the records could be packed without padding, and the 2^k-aligned pointers could simply be approximate, pointing to somewhere in the 2^k bytes before the start of the record. Retrieval would involve following the pointer, then scanning forward to find the actual record. This would be a bit sketchy, but something a bit like this is done in ATM:

http://en.wikipedia.org/wiki/CRC-based_framing

CDB doesn't include CRCs. However, you could think up various schemes to identify records based on what you know about the record format, the key you're looking for, and its hash. It's probably not worth the effort!


If you're looking for a very straightforward 64-bit port, [I'm the author of] https://github.com/pcarrier/cdb64


Supposedly https://github.com/gstrauss/mcdb is quite good.


This. Glenn knows his stuff, and the testing used by mcdb is great.


See also tinycdb, a public domain reimplementation http://www.corpit.ru/mjt/tinycdb.html


The original code is also in the public domain, since 2009: http://cr.yp.to/distributors.html


CDB is one of my favorite data structures. When a student wants to learn about databases, I get them to implement cdb.

It's easy to implement and really demonstrates some good system engineering tradeoffs.


Any links to theory behind cdb?


Here's an article I came across that explains the structure: http://www.unixuser.org/~euske/doc/cdbinternals/

I think the theory is mostly that it's really simple and ends up requiring very few disk reads.
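The layout really is small enough that a toy reader and writer fit on one page. Below is a sketch of the published format (little-endian 32-bit fields; a fixed 2048-byte header of 256 (position, slot-count) pairs, then the records, then the hash tables); this is not DJB's code, and the choice of twice as many slots as entries per table is just the conventional load factor:

```python
import struct

def cdb_hash(key: bytes) -> int:
    # DJB's cdb hash: h = ((h << 5) + h) ^ c, starting from 5381
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h

def cdb_make(items):
    """Build an in-memory cdb image from (key, value) byte pairs."""
    header = bytearray(2048)             # 256 x (table_pos, num_slots)
    body = bytearray()
    pointers = [[] for _ in range(256)]  # per-table (hash, record_pos) lists
    for key, value in items:
        pos = 2048 + len(body)
        body += struct.pack("<LL", len(key), len(value)) + key + value
        h = cdb_hash(key)
        pointers[h & 255].append((h, pos))
    tables = bytearray()
    for i, entries in enumerate(pointers):
        nslots = 2 * len(entries)        # half-empty tables keep probe chains short
        table_pos = 2048 + len(body) + len(tables)
        struct.pack_into("<LL", header, i * 8, table_pos, nslots)
        slots = [(0, 0)] * nslots
        for h, pos in entries:
            j = (h >> 8) % nslots        # start slot, then linear probing
            while slots[j][1] != 0:
                j = (j + 1) % nslots
            slots[j] = (h, pos)
        for h, pos in slots:
            tables += struct.pack("<LL", h, pos)
    return bytes(header) + bytes(body) + bytes(tables)

def cdb_get(data, key):
    """Look up key; a hit typically costs one header read, one slot run, one record."""
    h = cdb_hash(key)
    table_pos, nslots = struct.unpack_from("<LL", data, (h & 255) * 8)
    if nslots == 0:
        return None
    j = (h >> 8) % nslots
    for _ in range(nslots):
        slot_hash, pos = struct.unpack_from("<LL", data, table_pos + j * 8)
        if pos == 0:
            return None                  # an empty slot ends the probe chain
        if slot_hash == h:
            klen, vlen = struct.unpack_from("<LL", data, pos)
            if data[pos + 8:pos + 8 + klen] == key:
                return data[pos + 8 + klen:pos + 8 + klen + vlen]
        j = (j + 1) % nslots
    return None
```

Note how a lookup touches at most three regions of the file, which is exactly why so few disk reads are needed.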


I'd be particularly interested to know why there are 256 hashtables, rather than one, or some other number. I don't think I've seen this hybrid trie-hashtable before.

I wonder if there's any mileage in using a perfect hash function to build a database like this. It seems suited to the operating model of being slow to build but fast to access.


I second that; if anyone knows why there are 256 hashtables rather than just 1, please speak up. My only guess is that it might be a way to prevent 32-bit int overflow in the C code.


I'm put in mind of all the things that take up the most space on a minimal Ubuntu installation: charmaps, mime types, tzdata, geoip, etc. Seems like all of these could do with being packed into a tight, read-only K/V store.


What advantages does this have over SQLite?


It's extremely fast and it's easy to work with for reading, but it's just a key-value store. Also, it's sort of weird to work with on the write side. You have to do atomic replaces of the read-only data files.


Regarding writes, most likely the expectation is that you'll update your entire data set periodically in a batch job, or export and cache data from another authoritative data store. I understood the atomic replace feature as referring to a convenient way to flip to a new version of the database after its production and distribution by a batch job. It sounds like the "cdbmake" tool is designed to facilitate this.

A lot of data doesn't change very often; and a lot of data can tolerate propagation delays on change. When you have an extremely high read volume against such data sets, it can be economical to cache the entire data set and distribute it periodically to the fleets of machines that require access, as opposed to servicing individual reads over a network. Provides lower cost and latency, while supporting a higher volume of reads and higher availability, at the expense of engineering cost and propagation delay.
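The flip itself is just an atomic rename. A minimal sketch of the rebuild-and-publish pattern (file names and the helper are hypothetical):

```python
import os

def publish_database(image: bytes, path: str) -> None:
    """Write the new database image to a temp file, fsync it, then atomically
    rename it over the old one. Readers that already have the old file open
    keep a consistent view until they reopen."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(image)
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes hit disk before the rename
    os.replace(tmp, path)      # atomic replacement; this is the "flip"
```

Because readers never see a half-written file, no locking is needed on the read path at all.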


I don't know how many people remember this, but this kind of "weird atomic replace of the read-only data file" was very common on Unix systems for a long time.

Take sendmail: people would have their email aliases in an /etc/aliases file, with a simple ASCII format, lines formatted as "<key>: <values...>". The "newaliases" command would then convert the content of the ASCII file /etc/aliases into the binary (originally Berkeley db) /etc/aliases.db.

As far as I know, CDB was initially developed for qmail, DJB's email daemon, for exactly that purpose.

Another example of this is the "Yellow Pages" (yp) / NIS directory services used to distribute usernames/userids/groups/... in a Unix cluster or network. ASCII files were converted into e.g. /var/yp/DOMAIN/passwd.byname, a .db file where keys are the usernames from /etc/passwd and values are the full ASCII lines from /etc/passwd; the same goes for passwd.byuid, group.byname, and so on. All for fast indexing, so that the network server didn't have to repeatedly parse /etc/passwd.


It can be reimplemented from scratch in a few hundred lines of code?

It's not really the same thing, though. CDB provides a single key -> value mapping in a single file, with updates by replacing the entire file; SQLite is a relational database with updates, locking, transactions, SQL, etc.


I think the appropriate comparison would be bdb, not SQLite, and from the site I can't tell.


cdb is immutable - when you want to change the contained data set you generally rebuild the file, flip a fs link to the new version, and release file descriptors referencing the old version in applications when it's time to move on. bdb updates in place, but is slower for reads.


FWIW, Common Lisp's Quicklisp implements and uses cdb: https://github.com/quicklisp/quicklisp-client/blob/master/qu...


How does this compare to LMDB (http://symas.com/mdb/)?


Weird side note: when using CDBs from Perl, do not use tie; it's painfully slow (I realized this the hard way).


Indeed the documentation notes this.

https://metacpan.org/pod/CDB_File#PERFORMANCE


I was handed code that used tie; it took me a while to realize what was killing my perf.


CDB isn't really a DBM.


I was handed code already written using tie


I'd like to see a more up-to-date CDB with the 4 GB limit removed. Writing could be implemented by using an additional smaller CDB or text file. It would be an interesting problem to find at what size a flat text file should be converted into a CDB to maximize overall speed.


There is also sdb, a simple string-based key-value database with support for arrays and JSON, based on CDB: https://github.com/radare/sdb


Can anyone tell me what prevents it from running on Windows?


Found out why (or maybe there is more) - it's using mmap(), which is not directly available on Windows (though there are MapViewOfFile and CreateFileMapping).
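For what it's worth, higher-level runtimes already wrap both APIs: Python's mmap module, for example, uses MapViewOfFile on Windows and mmap() elsewhere, so a portable reader can be sketched like this (hypothetical helper name):

```python
import mmap

def read_header(path: str) -> bytes:
    """Map the file and return the 2048-byte cdb header without
    reading the rest of the file into memory."""
    with open(path, "rb") as f:
        # length 0 maps the whole file; ACCESS_READ is portable across platforms
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:2048]
```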


No random limits: cdb can handle any database up to 4 gigabytes.

Am I the only one who finds it hilarious that the first thing after the "no random limits" heading is a random limitation?


When you have 32-bit pointers, 2^32 isn't a random limit.


That's cool. But I feel like I'll totally forget about it and lose reference to this (in case I have future interest).

Where's a github mirror?

A google search reveals some entries from the language implementations.

Go: https://github.com/jbarham/go-cdb
Java: https://github.com/malyn/sg-cdb
Haskell: https://github.com/adamsmasher/hs-cdb


Most open source development takes place outside of github. It is a very valley-centric thing.

You'll find a ton of life changing stuff on SourceForge and random FTP sites.


Notice cordite said a GitHub mirror, not the GitHub mirror. You don't need to have your main development happen on GitHub to have a mirror there; someone just needs to mirror from the main server.

See, for example, the Linux kernel.


To be fair, I assume he's only asking because starring something on GitHub is actually a fairly good way of keeping track of projects in the way he's talking about. I agree that it sounds a bit presumptuous to say "where's the GitHub mirror, mr. unpaid contributor to open source project", but on the flip side it's a fair point in terms of potentially capitalizing on exposure for a project. And it's also very true that you can sometimes find the best code in very unassuming places.


I'd be really interested to see statistics on this. I don't work in Silicon Valley, and almost all of the stuff people around me are publishing is on GitHub.


That's not entirely fair. Github is not a valley-centric thing, it is used by a large proportion of projects started since 2008 or so.



