
Sophia: A modern embeddable key-value database – v1.2.2 released - pmwkaa
http://sphia.org/index.html
======
NhanH
There seem to be a lot of embedded KV stores already, and SQLite is pretty
much the de facto embedded relational DB. Is there a good embedded graph DB
around? Specifically, an embedded property graph DB. HyperGraphDB is
embeddable, but it's not a property graph, and while neo4j has an embedded
version, I don't think it works for non-Java use cases.

~~~
RussianCow
There are a lot of platform-specific solutions (neo4j, networkx, Core Data,
etc.) but I'm not aware of a generalized solution. I would like to know this
too, because I'm often constrained to certain languages/platforms but would
like to use something like neo4j.

------
fasteo
Be sure to check out Tarantool[1]; it uses Sophia for on-disk databases.

[1] [http://www.tarantool.org](http://www.tarantool.org)

~~~
slapresta
Never heard about it before, it looks interesting.

    
    
      > Tarantool combines the network programming power of Node.JS with data persistence capabilities of Redis.
    

Is that sarcasm? I can't tell.

~~~
vbit
I don't think so.

Tarantool uses an async evented IO model, but uses Lua coroutines and not
JavaScript. There are no callbacks, just 'yield points'.

Also, the primary data store backend is an in-memory database with optional
'snapshotting' to disk. An alternative backend uses Sophia, so it's not 100% in
memory.
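
For anyone unfamiliar with the 'yield points' style, here is a minimal sketch
of the idea using Python generators as a stand-in for Lua coroutines. This is
not Tarantool's API; `worker` and `run` are names I made up for illustration.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative task: it runs until it reaches a yield point."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # yield point: hand control back to the scheduler

def run(tasks):
    """Round-robin scheduler: resume each task until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # resume the task up to its next yield point
            queue.append(task)  # not finished, so schedule it again
        except StopIteration:
            pass                # task finished, drop it

log = []
run([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Each task reads as straight-line code; the interleaving happens only where a
task explicitly yields, so no callbacks are needed.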

------
MichaelGG
Does anyone have recommendations on a constant DB optimized for sequential
integer keys? Running LZ4 over things is cool, but using delta encoding or
more clever schemes, you can work right on the compressed key data. (And even
more fun if the value is also just a restricted set of integers, like an
inverted index.)
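
To make the delta-encoding idea concrete, here is a rough sketch of what I
mean: store the gap between consecutive sorted keys as a variable-length
integer. The function names are mine and this is only an illustration, not
the format of any particular database.

```python
def encode_varint(n):
    """Encode a non-negative int, 7 bits per byte, high bit = 'more follows'."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_keys(keys):
    """Delta-encode an ascending list of non-negative ints as varint gaps."""
    out = bytearray()
    prev = 0
    for k in keys:
        if k < prev:
            raise ValueError("keys must be sorted ascending")
        out += encode_varint(k - prev)
        prev = k
    return bytes(out)

def decode_keys(data):
    """Inverse of encode_keys: rebuild the keys by summing the gaps."""
    keys, prev, delta, shift = [], 0, 0, 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7          # more bytes belong to this gap
        else:
            prev += delta       # gap complete: emit the next key
            keys.append(prev)
            delta, shift = 0, 0
    return keys

keys = [1000, 1001, 1002, 1005, 1300]
blob = encode_keys(keys)
assert decode_keys(blob) == keys
print(len(blob))  # 7 bytes, versus 40 for five fixed 8-byte keys
```

A sequential scan can stop decoding as soon as it reaches the key it wants,
which is the sense in which you work directly on the compressed data.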

~~~
hyc_symas
LMDB has optimized support for integer keys, as well as for sequentially
sorted data. [http://symas.com/mdb/doc/](http://symas.com/mdb/doc/)

------
gfodor
Comparisons to Kyoto Cabinet, LevelDB, and RocksDB (on features, maturity, and
performance) would be great if anyone has any to share.

~~~
donpdonp
Also add LMDB/BoltDB to that list. There seems to be a convergence around a
certain feature set for embeddable key/value stores: MVCC semantics and
ordered keys.
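
As a toy illustration of what those two features give you together (this is
my own sketch, not how any of the stores mentioned here are implemented): a
reader takes a snapshot version and does an ordered range scan that is not
disturbed by later writes.

```python
import bisect

class ToyStore:
    """Hypothetical in-memory store: sorted keys plus per-key version lists."""

    def __init__(self):
        self._keys = []      # all keys, kept sorted for range scans
        self._versions = {}  # key -> [(version, value), ...] ascending
        self._version = 0    # global write counter

    def put(self, key, value):
        self._version += 1
        if key not in self._versions:
            bisect.insort(self._keys, key)
            self._versions[key] = []
        self._versions[key].append((self._version, value))

    def snapshot(self):
        """Return a version number that readers can pin."""
        return self._version

    def get(self, key, snap):
        """Newest value written at or before the snapshot, else None."""
        for version, value in reversed(self._versions.get(key, [])):
            if version <= snap:
                return value
        return None

    def scan(self, start, end, snap):
        """Yield (key, value) for start <= key < end as of the snapshot."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            value = self.get(key, snap)
            if value is not None:
                yield key, value

store = ToyStore()
store.put("a", 1)
store.put("c", 3)
snap = store.snapshot()
store.put("b", 2)    # written after the snapshot
store.put("a", 10)   # overwrites "a" after the snapshot
print(list(store.scan("a", "z", snap)))              # [('a', 1), ('c', 3)]
print(list(store.scan("a", "z", store.snapshot())))  # [('a', 10), ('b', 2), ('c', 3)]
```

The toy treats a stored `None` as "missing" and never garbage-collects old
versions; real stores handle both.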

------
diydsp
Just to be clear, does embeddable have a specific meaning here? I'm a firmware
developer, often writing code for small CPUs and microcontrollers. Does this
apply? It seems like here "embeddable" means it can be compiled into an app,
as opposed to getting accessed through a server. Is that correct? Thank you.

~~~
ximus
_It seems like here "embeddable" means it can be compiled into an app, as
opposed to getting accessed through a server. Is that correct?_

Yes.

Whether or not it is a good fit for small CPUs and microcontrollers is another
question that I can't comment on.

~~~
hyc_symas
(Since LMDB has been mentioned in this thread: LMDB is embeddable in every
sense of the word. It can work with as little as 64KB of memory and is already
deployed in a number of MCU-based products. Unfortunately I don't have
permission to name names.)

~~~
justin66
Is LMDB running on any operating systems that do not offer mmap support?

~~~
hyc_symas
Not currently. That's kind of a fundamental component of LMDB's design.

~~~
justin66
Thanks. That makes your comment about MCU-based systems that much more
intriguing. :)

------
vezzy-fnord
Since we're throwing around names here, depending on your use case something
as simple as cdb can be amazing:
[http://cr.yp.to/cdb.html](http://cr.yp.to/cdb.html)

------
eternalban
@pmwkaa: [http://sphia.org/pv12.html](http://sphia.org/pv12.html) doesn't tell
us the scaling characteristics. The cited performance page covers a DB at a
steady state of 6.0M keys. How does it behave under dynamic load? Various
scenarios that help your potential users determine whether the software is a
good fit for their use case would be helpful.

Glanced at the code and the arch doc. Looks promising and shows careful
crafting. Well done!

------
johncmouser
Cool! I was looking for a simple key-value alternative to SQLite3 and was
going to use redislite[1]. But this is perfect; I think it has the potential
to replace SQLite3.

[1] [https://github.com/seppo0010/redislite](https://github.com/seppo0010/redislite)

~~~
otoburb
SQLite4 is being designed as a key-value alternative to SQLite3[1]. SQLite3
and SQLite4 are meant for different use-cases and are expected to co-exist.
Unfortunately, SQLite4 hasn't yet been released, but I wanted to let you know
that the SQLite developers are actively working on addressing the need for an
embedded key-value store with SQLite4.

[1]
[https://sqlite.org/src4/doc/trunk/www/design.wiki](https://sqlite.org/src4/doc/trunk/www/design.wiki)

------
aswanson
_BSD licensed and implemented as small C-written library with zero
dependencies._

What's not to like...

------
virmundi
So why not Berkeley DB? I couldn't find a comparison to the old standard on the
site (granted, I only gave it a cursory glance).

~~~
beagle3
Why Berkeley DB?

Berkeley DB is now AGPL3[0], which some projects have a problem with. (Of
course, you can buy a commercial license. Some projects have a problem with
that too.)

But the main reason, almost regardless of context, as to "why not Berkeley DB"
is LMDB[1]. It works way better than everything else in just about every
practical use that has more reads than writes.

The only downside, as far as I can tell, is that right now it relies on memory
mapping the entire database, so you're limited to ~1GB overall database size
on 32-bit systems. There is no practical limit on 64-bit systems. Also, I
recall Howard Chu (main LMDB developer) mentioning that in the near future
LMDB will gain the ability to manage memory manually, thus removing this
restriction as well (for a performance price if used this way).

[0] [http://www.oracle.com/technetwork/database/database-
technolo...](http://www.oracle.com/technetwork/database/database-
technologies/berkeleydb/downloads/oslicense-093458.html)

[1] [http://symas.com/mdb/](http://symas.com/mdb/)

~~~
hyc_symas
The feature for re-mapping on 32-bit systems is available here:
[https://gitorious.org/mdb/mdb/source/69d7cb8d44e04f02d8d0c92...](https://gitorious.org/mdb/mdb/source/69d7cb8d44e04f02d8d0c923ae71fbaaa9f42f3a)

~~~
beagle3
When will it be merged into the mainline? (Or is it already there?) Will it
eventually be a run-time option, or always a compile-time option?

The latest reference I can find is [http://www.openldap.org/lists/openldap-
devel/201410/msg00001...](http://www.openldap.org/lists/openldap-
devel/201410/msg00001.html). Were the problems solved?

Thanks for LMDB. It is awesome.

~~~
hyc_symas
It still needs heavier testing before going to mainline. I expect it will only
ever be a compile-time option. On 64-bit it's pointless, and 32-bit is going the
way of the dodo.

------
halayli
This looks very promising. The code is very clean and optimization is taken
into consideration.

------
raghavsethi
Also, what exactly is a database traversal? Is it a random read benchmark? If
so, what is the distribution - uniform, zipf, or something else?
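
The reason I ask is that the key distribution changes how much caching helps.
A quick sketch of the difference, using only Python's standard library (the
key count, read count, and exponent `s` are arbitrary numbers I picked, not
anything from the Sophia benchmark):

```python
import random
from collections import Counter

def uniform_keys(n_keys, n_reads, rng):
    """Every key is equally likely to be read."""
    return [rng.randrange(n_keys) for _ in range(n_reads)]

def zipf_keys(n_keys, n_reads, rng, s=1.0):
    """Key of rank r is read with probability proportional to 1 / r**s."""
    weights = [1.0 / (rank ** s) for rank in range(1, n_keys + 1)]
    return rng.choices(range(n_keys), weights=weights, k=n_reads)

rng = random.Random(42)  # fixed seed so runs are repeatable
u = Counter(uniform_keys(1000, 100_000, rng))
z = Counter(zipf_keys(1000, 100_000, rng))

# Share of all reads that hit the single hottest key in each workload.
print("uniform hottest key share:", max(u.values()) / 100_000)
print("zipf hottest key share:   ", max(z.values()) / 100_000)
```

Under the Zipf-like workload a handful of keys absorb a large share of the
reads, so a benchmark number means something quite different depending on
which distribution was used.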

------
skeoh
What's going on with that logo? It is completely illegible.

