Why would someone remove a non-GC database engine with a database engine with GC...

shakna · on Sept 16, 2020

Upthread [0], you can find some notes and an explanation from one of the devs that they actively work around the GC, because the approach it uses just... Doesn't work for this kind of workload.

> This percentage can be configured by the GOGC environment variable or by calling debug.SetGCPercent. The default value is 100, which means that GC is triggered when the freshly allocated data is equal to the amount of live data at the end of the last collection period. This generally works well in practice, but the Pebble Block Cache is often configured to be 10s of gigabytes in size. Waiting for 10s of gigabytes of data to be allocated before triggering a GC results in very large Go heap sizes.

[0] https://news.ycombinator.com/item?id=24485931

mateus_amin · on Sept 15, 2020

I wonder the same thing. I have not programed a DB but..

I would think the worst thing to program in a gc'd language would be the cache. So, if you write that layer of the database separately in a non-gc'd you would avoid most of the headache.

EDIT: I read the parent comment as why program a db in a gc language. I think the new wave 1st gen db's are doing alot of novel things. Minus the cache I imagine the velocity improvements form the 'simpler' language makes sense. Key value stores are become well trod ground however.

tonetheman · on Sept 15, 2020

I have had long running Java programs and their memory usage if done correctly is a nice saw tooth. As long as you are not leaking you can run a well written Java server for long long periods of time (months) without a bounce.

If you are leaking not so much. :)

FridgeSeal · on Sept 15, 2020

I’d also be curious why they didn’t go with something like Foundation DB either.

petermattis · on Sept 15, 2020

Pebble and FoundationDB are apples and oranges. Pebble is per-node KV storage engine. FoundationDB is a distributed multi-modal database. Internally, FoundationDB uses a library like Pebble for the per-node data storage. I think at one point it used SQLite. I'm not up to date on what it currently use. I seem to recall FoundationDB was writing their own btree-based node-level storage engine to replace the usage of SQLite.

The equivalent of FoundationDB is present inside of CockroachDB: a distributed, replicated, transactional, KV layer. This is where a big chunk of CockroachDB value resides. This is where our use of Raft resides. Pebble lies underneath this.

ryanworl · on Sept 15, 2020

The current production storage engine is an old-ish version of the SQLite btree. A new btree engine is being written now and is available but I don’t know if it is being used in production anywhere.

RocksDB is shipping soon thanks to some work by members of the community.