Cdb: a fast, reliable, simple package for creating, reading constant databases (cr.yp.to)
91 points by networked on May 10, 2014 | 41 comments



We know that a number of very high-traffic, high-volume startups in Silicon Valley have gone great distances with CDB... OpenDNS included. :-)


Unless you are just storing a couple entries that rarely change, you should really consider http://symas.com/mdb/ nowadays.


CDB is awesome for its use case - slow-changing, read-heavy workflows that are tolerant of stale data. One limitation of the original implementation is the use of 32-bit offsets for addressing, which limits the addressable size to 4 GB. There are 64-bit modifications, but I have not used them. Does anyone have an opinion on any of the 64-bit implementations?


An alternative to using 64-bit pointers would be to use something like the JVM's "compressed OOPs":

https://wikis.oracle.com/display/HotSpotInternals/Compressed...

Where the pointers are notionally 32+k bits long, but the bottom k bits are always zero, so they can be stored in 32 bits. This would mean that records always have to be aligned to a 2^k-byte boundary. If the data has some natural alignment anyway, this could be arranged so as to not involve wasted space, and even if it doesn't, for large keys and/or values, the amount of wasted space would be relatively small.
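To make the arithmetic concrete, here is a minimal sketch of that compression scheme (hypothetical function names; the real cdb format just stores plain 32-bit offsets):

```python
K = 3  # assume records are aligned to 2^3 = 8-byte boundaries

def compress(offset: int) -> int:
    """Store a (32+K)-bit, 2^K-aligned offset in 32 bits by dropping the K low zero bits."""
    assert offset % (1 << K) == 0, "offset must be 2^K-aligned"
    compressed = offset >> K
    assert compressed < (1 << 32), "offset exceeds the 32+K-bit addressable range"
    return compressed

def decompress(compressed: int) -> int:
    """Recover the full offset by shifting the K low zero bits back in."""
    return compressed << K
```

With K = 3 this extends the addressable range from 4 GB to 32 GB, at the cost of padding each record to an 8-byte boundary.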

Alternatively, if there were a way to reliably locate the start of a record when scanning through data, the records could be packed without padding, and the 2^k-aligned pointers could simply be approximate, pointing to somewhere in the 2^k bytes before the start of the record. Retrieval would involve following the pointer, then scanning forward to find the actual record. This would be a bit sketchy, but something a bit like this is done in ATM:

http://en.wikipedia.org/wiki/CRC-based_framing

CDB doesn't include CRCs. However, you could think up various schemes to identify records based on what you know about the record format, the key you're looking for, and its hash. It's probably not worth the effort!


If you're looking for a very straightforward 64-bit port, [I'm the author of] https://github.com/pcarrier/cdb64


Supposedly https://github.com/gstrauss/mcdb is quite good.


This. Glenn knows his stuff, and the testing used by mcdb is great.


See also tinycdb, a public domain reimplementation http://www.corpit.ru/mjt/tinycdb.html


The original code is also in the public domain, since 2009: http://cr.yp.to/distributors.html


CDB is one of my favorite data structures. When a student wants to learn about databases, I get them to implement cdb.

It's easy to implement and really demonstrates some good system engineering tradeoffs.


Any links to theory behind cdb?


Here's an article I came across that explains the structure: http://www.unixuser.org/~euske/doc/cdbinternals/

I think the theory is mostly that it's really simple and ends up requiring very few disk reads.
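The layout really is small enough that a toy reader and writer fit on one page. Below is a sketch of the published format (little-endian 32-bit fields; a fixed 2048-byte header of 256 (position, slot-count) pairs, then the records, then the hash tables); this is not DJB's code, and the choice of twice as many slots as entries per table is just the conventional load factor:

```python
import struct

def cdb_hash(key: bytes) -> int:
    # DJB's cdb hash: h = ((h << 5) + h) ^ c, starting from 5381
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h

def cdb_make(items):
    """Build an in-memory cdb image from (key, value) byte pairs."""
    header = bytearray(2048)             # 256 x (table_pos, num_slots)
    body = bytearray()
    pointers = [[] for _ in range(256)]  # per-table (hash, record_pos) lists
    for key, value in items:
        pos = 2048 + len(body)
        body += struct.pack("<LL", len(key), len(value)) + key + value
        h = cdb_hash(key)
        pointers[h & 255].append((h, pos))
    tables = bytearray()
    for i, entries in enumerate(pointers):
        nslots = 2 * len(entries)        # half-empty tables keep probe chains short
        table_pos = 2048 + len(body) + len(tables)
        struct.pack_into("<LL", header, i * 8, table_pos, nslots)
        slots = [(0, 0)] * nslots
        for h, pos in entries:
            j = (h >> 8) % nslots        # start slot, then linear probing
            while slots[j][1] != 0:
                j = (j + 1) % nslots
            slots[j] = (h, pos)
        for h, pos in slots:
            tables += struct.pack("<LL", h, pos)
    return bytes(header) + bytes(body) + bytes(tables)

def cdb_get(data, key):
    """Look up key; a hit typically costs one header read, one slot run, one record."""
    h = cdb_hash(key)
    table_pos, nslots = struct.unpack_from("<LL", data, (h & 255) * 8)
    if nslots == 0:
        return None
    j = (h >> 8) % nslots
    for _ in range(nslots):
        slot_hash, pos = struct.unpack_from("<LL", data, table_pos + j * 8)
        if pos == 0:
            return None                  # an empty slot ends the probe chain
        if slot_hash == h:
            klen, vlen = struct.unpack_from("<LL", data, pos)
            if data[pos + 8:pos + 8 + klen] == key:
                return data[pos + 8 + klen:pos + 8 + klen + vlen]
        j = (j + 1) % nslots
    return None
```

Note how a lookup touches at most three regions of the file, which is exactly why so few disk reads are needed.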


I'd be particularly interested to know why there are 256 hashtables, rather than one, or some other number. I don't think I've seen this hybrid trie-hashtable before.

I wonder if there's any mileage in using a perfect hash function to build a database like this. It seems suited to the operating model of being slow to build but fast to access.


I second that; if anyone knows why there are 256 hashtables rather than just 1, please speak up. My only guess is that it might be a way to prevent 32-bit int overflow in the C code.


I'm put in mind of all the things that take up the most space on a minimal Ubuntu installation: charmaps, mime types, tzdata, geoip, etc. Seems like all of these could do with being packed into a tight, read-only K/V store.


What advantages does this have over SQLite?


It's extremely fast and it's easy to work with for reading, but it's just a key-value store. Also, it's sort of weird to work with on the write side. You have to do atomic replaces of the read-only data files.


Regarding writes, most likely the expectation is that you'll update your entire data set periodically in a batch job, or export and cache data from another authoritative data store. I understood the atomic replace feature as referring to a convenient way to flip to a new version of the database after its production and distribution by a batch job. It sounds like the "cdbmake" tool is designed to facilitate this.

A lot of data doesn't change very often; and a lot of data can tolerate propagation delays on change. When you have an extremely high read volume against such data sets, it can be economical to cache the entire data set and distribute it periodically to the fleets of machines that require access, as opposed to servicing individual reads over a network. Provides lower cost and latency, while supporting a higher volume of reads and higher availability, at the expense of engineering cost and propagation delay.
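The flip itself is just an atomic rename. A minimal sketch of the rebuild-and-publish pattern (file names and the helper are hypothetical):

```python
import os

def publish_database(image: bytes, path: str) -> None:
    """Write the new database image to a temp file, fsync it, then atomically
    rename it over the old one. Readers that already have the old file open
    keep a consistent view until they reopen."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(image)
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes hit disk before the rename
    os.replace(tmp, path)      # atomic replacement; this is the "flip"
```

Because readers never see a half-written file, no locking is needed on the read path at all.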


I don't know how many people remember this, but this kind of "weird atomic replace of the read-only data file" was very common on Unix systems for a long time.

Take sendmail: people would have their email aliases in an /etc/aliases file, with a simple ASCII format, lines formatted as "<key>: <values...>". The "newaliases" command would then convert the content of the ASCII file /etc/aliases into the binary (originally Berkeley db) /etc/aliases.db.

As far as I know, CDB was initially developed for qmail, DJB's email daemon, for exactly that purpose.

Another example of this is the "Yellow Pages" (yp) / NIS directory services used to distribute usernames/userids/groups/... in a Unix cluster or network. ASCII files were converted into e.g. /var/yp/DOMAIN/passwd.byname, a .db file where keys are the usernames from /etc/passwd and values are the full ASCII lines from /etc/passwd; the same goes for passwd.byuid, group.byname, and so on. All for fast indexing, so that the network server didn't have to repeatedly parse /etc/passwd.


It can be reimplemented from scratch in a few hundred lines of code?

It's not really the same thing, though. CDB provides a single key -> value mapping in a single file, with updates by replacing the entire file; SQLite is a relational database with updates, locking, transactions, SQL, etc.


I think the appropriate comparison would be bdb, not SQLite, and from the site I can't tell.


cdb is immutable - when you want to change the contained data set you generally rebuild the file, flip a fs link to the new version, and release file descriptors referencing the old version in applications when it's time to move on. bdb updates in place, but is slower for reads.


FWIW, Common Lisp's Quicklisp implements and uses cdb: https://github.com/quicklisp/quicklisp-client/blob/master/qu...


How does this compare to LMDB (http://symas.com/mdb/)?


Weird side note: when using CDBs from Perl, do not use tie; it's painfully slow (I realized this the hard way).


Indeed the documentation notes this.

https://metacpan.org/pod/CDB_File#PERFORMANCE


I was handed code that used tie; it took me a while to realize what was killing my perf.


CDB isn't really a DBM.


I was handed code already written using tie


I'd like to see a more up-to-date CDB with the 4 GB limit removed. Writing could be implemented by using an additional smaller CDB or text file. It would be an interesting problem to find at what size a flat text file should be converted into a CDB to maximize overall speed.


There is also sdb, a simple string-based key-value database with support for arrays and JSON, based on CDB: https://github.com/radare/sdb


Can anyone tell me what prevents it from running on Windows?


Found out why (or maybe there is more) - it's using mmap(), which is not directly available on Windows (though there are MapViewOfFile and CreateFileMapping).
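For what it's worth, higher-level runtimes already wrap both APIs: Python's mmap module, for example, uses MapViewOfFile on Windows and mmap() elsewhere, so a portable reader can be sketched like this (hypothetical helper name):

```python
import mmap

def read_header(path: str) -> bytes:
    """Map the file and return the 2048-byte cdb header without
    reading the rest of the file into memory."""
    with open(path, "rb") as f:
        # length 0 maps the whole file; ACCESS_READ is portable across platforms
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:2048]
```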


No random limits: cdb can handle any database up to 4 gigabytes.

Am I the only one who finds it hilarious that the first thing after the "no random limits" heading is a random limitation?


When you have 32-bit pointers, 2^32 isn't a random limit.


That's cool. But I feel like I'll totally forget about it and lose reference to this (in case I have future interest).

Where's a github mirror?

A google search reveals some entries from the language implementations.

Go: https://github.com/jbarham/go-cdb
Java: https://github.com/malyn/sg-cdb
Haskell: https://github.com/adamsmasher/hs-cdb


Most open source development takes place outside of github. It is a very valley-centric thing.

You'll find a ton of life changing stuff on SourceForge and random FTP sites.


Notice cordite said a GitHub mirror, not the GitHub mirror. You don't need to have your main development happen on GitHub to have a mirror there; someone just needs to mirror from the main server.

See, for example, the Linux kernel.


To be fair, I assume he's only asking because starring something on GitHub is actually a fairly good way of keeping track of projects in the way he's talking about. I agree that it sounds a bit presumptuous to say "where's the GitHub mirror, mr. unpaid contributor to open source project", but on the flip side it's a fair point in terms of potentially capitalizing on exposure for a project. And it's also very true that you can sometimes find the best code in very unassuming places.


I'd be really interested to see statistics on this. I don't work in Silicon Valley, and almost all of the stuff people around me are publishing is on GitHub.


That's not entirely fair. Github is not a valley-centric thing, it is used by a large proportion of projects started since 2008 or so.



