
BuntDB – Fast, embeddable, in-memory key/value database for Go with geospatial - tidwall
https://github.com/tidwall/buntdb
======
jimktrains2
It's not really geospatial; you just have multidimensional indices and an
intersect operation. I'm not dissing you; indices like that can be extremely
useful and tricky to implement!

That said, why only 4D? 5D is very useful, but no one seems to support it :(
(x,y,z,time,value)
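
For illustration, the intersect operation on such an index boils down to an
overlap test in every dimension. Below is a minimal sketch for an arbitrary
number of dimensions. The `Box` type and `Intersects` function are
hypothetical names made up for this example, not BuntDB's API, and a real
index would use a tree rather than testing boxes one by one.

```go
package main

import "fmt"

// Box is a hypothetical axis-aligned box in N dimensions.
// Min[i] and Max[i] bound the box along dimension i.
type Box struct {
	Min, Max []float64
}

// Intersects reports whether a and b overlap in every dimension.
// Boxes that only touch at a boundary count as intersecting.
// Boxes with mismatched dimension counts never intersect.
func Intersects(a, b Box) bool {
	n := len(a.Min)
	if len(a.Max) != n || len(b.Min) != n || len(b.Max) != n {
		return false
	}
	for i := 0; i < n; i++ {
		// A gap along any single dimension rules out an overlap.
		if a.Max[i] < b.Min[i] || b.Max[i] < a.Min[i] {
			return false
		}
	}
	return true
}

func main() {
	// Five dimensions: (x, y, z, time, value).
	query := Box{
		Min: []float64{0, 0, 0, 100, 0.5},
		Max: []float64{10, 10, 10, 200, 1.0},
	}
	inside := Box{
		Min: []float64{5, 5, 5, 150, 0.7},
		Max: []float64{6, 6, 6, 160, 0.8},
	}
	tooLate := Box{
		Min: []float64{5, 5, 5, 300, 0.7},
		Max: []float64{6, 6, 6, 310, 0.8},
	}
	fmt.Println(Intersects(query, inside))  // overlaps in all five dimensions
	fmt.Println(Intersects(query, tooLate)) // misses on the time dimension
}
```

Nothing in the test depends on the number of dimensions, so the same loop
covers 4D, 5D, or more.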

~~~
tidwall
Thank you for the feedback. I like the idea of lifting the cap for number of
dimensions. I hardcoded a limit of 4 to handle the standard XYZM, but the
implementation could technically handle around 20.

------
mappu
Single-file and 1.2Kloc, clearly inspired by Bolt but adds indexes, which is
great.

I actually use Bolt in a project at the moment - what's the performance story
for general-purpose database use?

~~~
tidwall
Thanks for the kind words.

I really like Bolt. It's a wonderful library with a good API. I was inspired
by the simplicity of its transaction model.

Both Bolt and Bunt are ACID, and both persist data to disk. The biggest
difference between them is that Bolt reads and writes from disk, while Bunt
reads and writes from memory (and has an append-only file for durability).

Therefore the amount of data that Bolt can handle is limited by the size of
the disk, while Bunt is limited by the amount of RAM.

A general-purpose database user will likely see a bump in performance by
moving from Bolt to Bunt. What I'm seeing for my projects is about 2x on reads
and about 40x on writes. I wrote a Raft store implementation that is a
drop-in replacement for the Bolt version. Here's a comparison benchmark:
[https://github.com/tidwall/raft-boltdb#benchmarks](https://github.com/tidwall/raft-boltdb#benchmarks)

It really comes down to what you need. Lots of data, or lots of speed.

~~~
rakoo
> Both Bolt and Bunt are ACID, and both persist data to disk. The biggest
> difference between them is that Bolt reads and writes from disk, while Bunt
> reads and writes from memory (and has an append-only file for durability).

Just to be sure: does this mean that Bunt has a window of time where data is
only in RAM and is persisted later? I ask because the description made me
think that BuntDB was purely in-memory. Is there some upper limit on how much
time an object may be in memory but not persisted yet?

On another note, congrats on this project. I see that you changed the default
"Set" to use strings instead of bytes; this was a bit of a pain point when I
used BoltDB. Indexes should also be interesting.

~~~
tidwall
Bunt is a purely in-memory database, but it also persists to disk so that the
database can be reopened. It's a lot like Redis in this manner.

Basically, BuntDB requires that data be persisted prior to completing a
transaction. There is no window of time where there is data in memory and not
on disk. It's designed so that there is no way for data to exist in memory and
not be on disk.

I decided that strings were a better way to go because 1) the string is the
most common type in a key/value database, 2) strings take up less memory than
a byte slice, and 3) strings are just bytes anyhow so they can always be
converted using []byte(str).
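
As a rough illustration of point 2, a Go string header holds a pointer and a
length, while a slice header also holds a capacity. The small program below
prints both header sizes (16 and 24 bytes on typical 64-bit platforms) and
shows the round trip between the two types. `headerSizes` is just a helper
name for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSizes returns the size in bytes of a string header and of a
// []byte slice header on the current platform. The underlying byte data
// is not included in either number.
func headerSizes() (str, slice uintptr) {
	var s string
	var b []byte
	return unsafe.Sizeof(s), unsafe.Sizeof(b)
}

func main() {
	str, slice := headerSizes()
	fmt.Println("string header:", str, "bytes")
	fmt.Println("[]byte header:", slice, "bytes")

	// Converting in either direction copies the bytes.
	s := "hello"
	b := []byte(s)
	fmt.Println(string(b) == s)
}
```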

Thanks for the kind words and I hope you give it a try.

~~~
rakoo
So, what about this paragraph at the end:
[https://github.com/tidwall/buntdb#durability-and-fsync](https://github.com/tidwall/buntdb#durability-and-fsync)

In the default configuration, is there a 1 second window where data is not
fsynced to disk?

------
herge
I can understand why they only allow one read/write transaction at a time.

However, could they implement multiple concurrent read/write transactions by
having the transaction fail if it writes to any key modified by any other
concurrent transaction?

Like if writer X modifies a key at time t1, but writer Y opens a transaction
at time t0 and tries to modify the same key at time t2, Y is told their
transaction is invalid and should restart their operation from the beginning.
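
A minimal sketch of that scheme, assuming a global commit counter and a
per-key version, could look like the following. All names here (`DB`, `Tx`,
`ErrConflict`) are hypothetical and unrelated to BuntDB's API. The sketch
detects the conflict at commit time rather than at the moment of the write,
and it only checks write-write conflicts, not reads.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict tells the caller to restart the transaction.
var ErrConflict = errors.New("key modified by a concurrent transaction")

type entry struct {
	value   string
	version uint64 // commit counter value when the key was last written
}

// DB is a hypothetical store that allows concurrent write transactions.
type DB struct {
	mu      sync.Mutex
	data    map[string]entry
	commits uint64
}

// Tx buffers writes and remembers the commit counter at its start.
type Tx struct {
	db     *DB
	start  uint64
	writes map[string]string
}

func NewDB() *DB { return &DB{data: make(map[string]entry)} }

// Begin opens a transaction and records the current commit counter.
func (db *DB) Begin() *Tx {
	db.mu.Lock()
	defer db.mu.Unlock()
	return &Tx{db: db, start: db.commits, writes: make(map[string]string)}
}

// Set buffers a write; nothing is visible to others until Commit.
func (tx *Tx) Set(key, value string) { tx.writes[key] = value }

// Commit fails if any written key was committed by another transaction
// after this one began; otherwise it applies all buffered writes.
func (tx *Tx) Commit() error {
	tx.db.mu.Lock()
	defer tx.db.mu.Unlock()
	for k := range tx.writes {
		if e, ok := tx.db.data[k]; ok && e.version > tx.start {
			return ErrConflict
		}
	}
	tx.db.commits++
	for k, v := range tx.writes {
		tx.db.data[k] = entry{value: v, version: tx.db.commits}
	}
	return nil
}

func main() {
	db := NewDB()
	y := db.Begin() // Y opens at t0
	x := db.Begin()
	x.Set("k", "from X")
	fmt.Println("X:", x.Commit()) // X modifies the key at t1
	y.Set("k", "from Y")
	fmt.Println("Y:", y.Commit()) // Y tries the same key at t2
}
```

Even in this small form, the per-key versions and buffered write sets are
extra bookkeeping on every transaction.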

~~~
jhugg
Sometimes this is slower than serialization. In fact, when you’re doing KV-
CRUD work on in-memory data, it’s _often_ slower than serialization. Keeping
RW-sets is non-trivial overhead compared to the hardest typical part of KV-
CRUD, tree or hash lookups.

Many, many systems have more parallelism, but less throughput.

Now, if you want to prevent one transaction with a bad-actor blocking the
system, then RW-sets, timeouts and OCC/MVCC might be a good idea, it just
won’t be faster.

------
liotier
Is it common for this sort of database to not expose an interface over IP? It
seems to me that a local-only database would severely restrict the use cases,
but maybe I'm just ignorant of many local-only uses. Or should another program
handle the networking, with BuntDB as a backend?

~~~
danielheath
It's common for programs to embed a database engine. Informally, you're doing
this every time you write any structured data to a file.

Baking a database into your application _drastically_ simplifies
distribution/deployment and avoids network bottlenecks (at the cost of
restricting your choice of storage engine, making it harder to hire staff
experienced with your tech, etc).

