
Berkeley DB Architecture - NoSQL Before NoSQL Was Cool - pron
http://highscalability.com/blog/2012/2/20/berkeley-db-architecture-nosql-before-nosql-was-cool.html
======
antirez
Berkeley DB was not SQL, and according to my own classification of NoSQL
(that is, at least one of the two must be true: [1] A different data model
compared to SQL, [2] A different tradeoff in CAP compared to traditional DBs)
Berkeley DB should be classified as a NoSQL database.

However, I think it was not a NoSQL database for an important reason: it was
_only_ embedded in other programs, and never took the form of a networked
server. Not only did this raise the barrier to entry, it also meant Berkeley
DB never became an "object" in your infrastructure that you could use in
different ways, and in competition with other databases.

So the Berkeley DB creators did not start the NoSQL era long before it
actually began, because they missed the importance of what they were doing,
thinking that their DB was limited to just something you could bind to
programs not needing all the power of an SQL database. At least this is what
emerges from their choices. And the implication, in my opinion, is that they
were also considering relational databases as the only "real DBs".

So there was no real competition with traditional DBs, and it was not a NoSQL
DB.

EDIT: I understand this is a mostly personal point of view, and not an
objective critique, but I wanted to share it with HN nevertheless.

~~~
damian2000
And probably the reason it wasn't run as a networked server was to avoid
network traffic, which was obviously a lot slower (and a bottleneck) back
when it was created. Also, it would be trivial to expose its functionality
as a networked server, if that's what you needed.
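
To illustrate the "trivial to expose" point, here is a minimal sketch of an
HTTP front end over an embedded key-value store. It is only an illustration:
the `store` dict is a stand-in for the embedded library, and the names
`KVHandler`, `start_server` and `request` are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Stand-in for the embedded store; a real wrapper would call the library here.
store = {}
store_lock = threading.Lock()


class KVHandler(BaseHTTPRequestHandler):
    """GET /key reads a value, PUT /key stores the request body."""

    def do_GET(self):
        with store_lock:
            value = store.get(self.path)
        if value is None:
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(value)))
        self.end_headers()
        self.wfile.write(value)

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        with store_lock:
            store[self.path] = body
        self.send_response(204)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def start_server(port=0):
    """Serve on a background thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), KVHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request(port, method, key, body=None):
    """Tiny client helper: returns (status, response body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, key, body=body)
    resp = conn.getresponse()
    data = resp.read()
    conn.close()
    return resp.status, data
```

The hard parts of a real server (authentication, durability guarantees over
the wire, connection limits) are exactly what this sketch leaves out.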

------
luser001
I'd like to hear comparisons between Leveldb and Berkeley DB.

I'm using LDB in my most-recent project. I wrote a trivial HTTP wrapper around
it using Mongoose to make it a "server". :)

I like the snapshotting feature, the support for transactions, the built-in
support for accessing the db from multiple threads, and that it maintains keys
in sorted order (this is great for timestamped keys like I have).

I looked at BDB a while back, and IIRC it doesn't have snapshotting and sorted
keys. Unsure about threads.

I use the snapshotting feature to dump a backup copy of the database.
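
A rough sketch of why sorted keys and snapshots go well with timestamped
keys. This is not Leveldb's API: `SortedStore` is a toy in-memory model made
up for illustration, and its `snapshot()` simply copies the data, which a real
engine would avoid.

```python
import bisect


class SortedStore:
    """Toy ordered key-value store: keys are kept in sorted order."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]

    def snapshot(self):
        """Return a frozen copy that later writes do not affect."""
        copy = SortedStore()
        copy._keys = list(self._keys)
        copy._values = dict(self._values)
        return copy


db = SortedStore()
for ts in ("2012-02-20T10:00", "2012-02-19T09:00", "2012-02-21T08:00"):
    db.put(ts, "event at " + ts)

backup_view = db.snapshot()            # consistent view to dump as a backup
db.put("2012-02-20T12:00", "late event")

# Fixed-width timestamps sort lexicographically, so "one day" is a key range.
todays_keys = [k for k, _ in db.scan("2012-02-20", "2012-02-21")]
```

The backup can iterate over `backup_view` while writes continue on `db`.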

The BDB API seemed significantly more complex than Leveldb's. IIRC, the native
API is C.

I like C++ and like that Leveldb's native API is C++. It uses the standard
library string (which allows embedding null characters).

~~~
btbuilder
Berkeley DB has many features and many modes. This, IMHO, is part of the
reason it has a steep learning curve (and why others have had a bad experience
with it).

It is easy to programmatically configure it incorrectly, e.g. by not using
transactions when you should, failing to deal with deadlocks, or not cleaning
up the locks of failed threads or processes.
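
As a sketch of what "dealing with deadlocks" usually means in application
code: retry the whole transaction when the store picks it as the deadlock
victim. `DeadlockError` and `run_with_retry` are hypothetical names for this
illustration, not part of any BDB binding, and `txn_body` is assumed to begin,
commit and abort its own transaction.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical error raised when a transaction is the deadlock victim."""


def run_with_retry(txn_body, max_attempts=5, base_delay=0.001):
    """Call txn_body(); on DeadlockError back off and retry, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body()
        except DeadlockError:
            if attempt == max_attempts:
                raise
            # Randomized exponential backoff so competing threads
            # do not all retry at the same moment.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Code that skips this loop tends to work in single-threaded tests and then
fail under concurrent load.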

Some of the mature features it has are:

Page-level locking for concurrency via threading and multi-process. Snapshot
isolation for MVCC - you can get a high isolation level without read locks.
Nested transactions. B-tree indexes. Two-phase commit. You can also throw all
this out the window and use an in-memory database or non-durable

The newer versions even have replication.

BDB is a hard beast to tame properly, but is definitely fully featured, and
mature.

The Oracle paper on the BDB backed SQLite makes an interesting read.

------
tete
One simply has to mention Tokyo/Kyoto Cabinet/Tyrant.

<http://fallabs.com/>

With all the NoSQL hype Redis and MongoDB get, I think these are too often
overlooked. What's really nice is that you can use them in embedded (cabinet)
or server (tyrant) "mode". What's also great about them is that they provide a
lot of flexibility, since they are not just one database. It's also great that
they have official bindings that work. They have tons of features, and
sometimes an embedded database is really interesting when you don't want (or
need) to care about the protocol you send it over.

~~~
pak
I just found these two little jewels and have been using them to provide fast
access to a haystack of images (GB to TB size, each image about 1KB). I am
curious to hear what others think about moving from Tokyo to Kyoto--the
website says Kyoto is recommended, but there is a real dearth of information
about Kyoto (at least in English) out on the web, compared with the scattered
breadcrumbs that you can sort of piece together for information on Tokyo. Most
of those breadcrumbs are unfortunately silly benchmark articles instead of
real experiences.

Also, I already did have one Tokyo Cabinet Hashtable go corrupt on me--which
caused certain operations to lock up the CPU indefinitely. Hmm, that didn't
ease my conscience. However, it was completely recoverable with "tchmgr
optimize". It'd be nice to google for more war stories, but like I said...
breadcrumbs so far.

------
orp
I've used BDB a lot over the last 10 years, embedded inside a C++ server, and
I have to say I've been disappointed in its multi-threaded scalability.

Running a single threaded access pattern can easily get you 20K plus
reads/sec, but if you try to run more threads the throughput per thread just
goes down, up to a point where more threads actually slow you down.

Make sure to run extensive benchmarks if you consider using it in a
multi-threaded application.
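
A minimal sketch of the kind of harness I mean: total reads/sec as the thread
count grows. `measure_reads` and the lock-protected dict are stand-ins made up
for this example; swap in real reads against your own store. In Python the
interpreter lock also limits scaling, so treat the numbers as the shape of the
test, not a BDB result.

```python
import threading
import time


def measure_reads(read_fn, num_threads, reads_per_thread):
    """Run read_fn(i) from num_threads threads; return total reads per second."""
    barrier = threading.Barrier(num_threads + 1)

    def worker():
        barrier.wait()  # release all workers at the same time
        for i in range(reads_per_thread):
            read_fn(i)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    barrier.wait()
    start = time.perf_counter()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return num_threads * reads_per_thread / elapsed


if __name__ == "__main__":
    data = {i: i for i in range(1000)}
    lock = threading.Lock()

    def locked_read(i):
        # One global lock: a deliberately contended stand-in for a store.
        with lock:
            return data[i % 1000]

    for n in (1, 2, 4, 8):
        rate = measure_reads(locked_read, n, 20000)
        print(n, "threads:", round(rate), "reads/sec")
```

If the total stays flat or drops as threads are added, the per-thread
throughput is falling in the way described above.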

I've never tried running BDB in a multi-process architecture, so I have no
idea how it'd behave when used that way.

~~~
rbranson
BDB isn't really designed for scalable performance. It's for good performance
and support for concurrent, ACID transactions. Tens of thousands of reads per
second is reasonable for the vast majority of embedded applications, which
often live on the desktop or mobile devices.

------
mcbain
Coincidentally, Keith Bostic (co-creator of BDB) announced his new project in
the last few weeks: <http://wiredtiger.com/>

I don't know all that much about it, but given the background of those
involved, it should be worth keeping an eye on.

------
DanielRibeiro
Direct links <http://www.aosabook.org/en/bdb.html> ,
<http://news.ycombinator.com/item?id=3607914>

------
alatkins
Then there's Gelernter's Linda [1] and its tuple space model [2], the
granddaddy of non-relational data stores:

[1] <http://en.wikipedia.org/wiki/Linda_(coordination_language)>

[2] <http://en.wikipedia.org/wiki/Tuple_space>

------
jrydberg
Some sane design lessons in the original material:
<http://www.aosabook.org/en/bdb.html>

------
zandorg
Is there an alternative to Berkeley DB which has a hash database file on disk?
Rather than in RAM? The problem is I use BDB 1.86 on Windows, because it's
free, but the DB size is limited to 2GB. I want to use an alternative which is
just as fast and has pretty much the same interface.

~~~
emidln
This is designed to be very compatible with BDB:
<http://fallabs.com/tokyocabinet/> and is open source.

This builds on the ideas from BDB and Tokyo Cabinet with a newer codebase:
<http://fallabs.com/kyotocabinet/>

Both have network server implementations available (see links from their
pages).

------
Maro
Friends don't let friends use BerkeleyDB.

<http://www.google.com/search?hl=en&q=bdb+corruption>

~~~
cjensen
My first exposure to BerkeleyDB was when Subversion was brand new. There was a
Perl script which converted from CVS to SVN and used BerkeleyDB during the
conversion. I had to manually replace BerkeleyDB with a hand-written key/value
store in the filesystem to make the script work.
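
For the curious, a filesystem key/value store of that kind can be very small.
This is a sketch of the general idea in Python, not the code I actually wrote
back then; `FileKV` and its layout (one file per key, named by a hash of the
key) are assumptions for illustration.

```python
import hashlib
import os
import tempfile


class FileKV:
    """One file per key under a directory; file names are hashes of the key."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return os.path.join(self.root, digest)

    def put(self, key, value):
        # Write to a temp file, then rename over the target, so a reader
        # never sees a half-written value.
        fd, tmp = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as f:
            f.write(value)
        os.replace(tmp, self._path(key))

    def get(self, key, default=None):
        try:
            with open(self._path(key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return default
```

It is slow compared to a real database, but for a one-off conversion run
there is very little that can go wrong with it.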

I used Subversion since pre-1.0. The only problems I've ever had with
Subversion were caused by BerkeleyDB failing to be sufficiently robust. Since
Subversion eliminated BDB use, I've never had a problem with it.

BerkeleyDB is dead to me.

