Show HN: SlateDB – An embedded storage engine built on object storage (github.com/slatedb)
26 points by riccomini 88 days ago | 7 comments
SlateDB is an embedded storage engine built as a log-structured merge-tree. Unlike traditional LSM-tree storage engines, SlateDB writes data to object storage (S3, GCS, ABS, MinIO, Tigris, and so on). Leveraging object storage allows SlateDB to provide bottomless storage capacity, high durability, and easy replication. The trade-off is that object storage has higher latency and higher API costs than local disk.

To mitigate high write API costs (PUTs), SlateDB batches writes. Rather than writing every put() call to object storage, MemTables are flushed periodically to object storage as a sorted-string table (SST). The flush interval is configurable.
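The batching idea can be pictured roughly as follows. This is a toy illustration, not SlateDB's actual code: the `ToyMemTable`/`ToyDb` names, the dict standing in for an object store, and the flush-on-put timing check are all assumptions for the sketch.

```python
import time

class ToyMemTable:
    """Buffers writes in memory until the whole batch is flushed."""
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

class ToyDb:
    """Batches put() calls and flushes them as one 'SST' object per interval."""
    def __init__(self, object_store, flush_interval_s=0.1):
        self.object_store = object_store  # stand-in for S3/GCS/ABS
        self.flush_interval_s = flush_interval_s
        self.memtable = ToyMemTable()
        self.sst_id = 0
        self.last_flush = time.monotonic()

    def put(self, key, value):
        self.memtable.put(key, value)
        # A real engine would do this from a background task on a timer.
        if time.monotonic() - self.last_flush >= self.flush_interval_s:
            self.flush()

    def flush(self):
        if not self.memtable.data:
            return
        # One PUT for the whole batch instead of one PUT per put() call.
        sst = sorted(self.memtable.data.items())  # sorted-string table
        self.object_store[f"sst-{self.sst_id:06d}"] = sst
        self.sst_id += 1
        self.memtable = ToyMemTable()
        self.last_flush = time.monotonic()

store = {}
db = ToyDb(store, flush_interval_s=0.0)  # flush on every put, just for the demo
for i in range(3):
    db.put(f"key{i}", f"val{i}")
```

With a realistic interval (say, 100 ms), many put() calls would be amortized into a single PUT request.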

To mitigate write latency, SlateDB provides an async put method. Clients that prefer strong durability can await on put until the MemTable is flushed to object storage (trading latency for durability). Clients that prefer lower latency can simply ignore the future returned by put.
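One way to picture that contract is a put() that returns a future which resolves only when the batch is flushed. This is a sketch using asyncio, not SlateDB's actual API; `ToyAsyncDb` and its method names are invented for illustration.

```python
import asyncio

class ToyAsyncDb:
    """put() returns a future that resolves once the batch is 'flushed'."""
    def __init__(self):
        self.memtable = {}
        self.pending = []  # futures awaiting durability

    def put(self, key, value):
        self.memtable[key] = value
        fut = asyncio.get_running_loop().create_future()
        self.pending.append(fut)
        return fut  # caller may await (durable) or ignore (low latency)

    def flush(self):
        # Pretend the MemTable was written to object storage here.
        self.memtable = {}
        for fut in self.pending:
            fut.set_result(None)  # the awaited writes are now "durable"
        self.pending = []

async def main():
    db = ToyAsyncDb()
    fut = db.put("k", "v")
    # A low-latency client would simply drop `fut` here.
    asyncio.get_running_loop().call_soon(db.flush)  # simulated background flush
    await fut  # a durability-sensitive client blocks until the flush
    return "durable"

result = asyncio.run(main())
```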

To mitigate read latency and read API costs (GETs), SlateDB will use standard LSM-tree caching techniques: in-memory block caches, compression, bloom filters, and local SST disk caches.
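Of these, the bloom filter is the classic trick for skipping GETs entirely: it answers "definitely not in this SST" or "maybe in this SST", so the read path only fetches SSTs that might contain the key. A toy version (bit count and hash scheme chosen arbitrarily for the sketch):

```python
import hashlib

class ToyBloom:
    """Tiny bloom filter: 'definitely not present' or 'maybe present'."""
    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.bitset = 0

    def _positions(self, key):
        # Derive `hashes` bit positions per key from salted SHA-256 digests.
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.bitset |= 1 << p

    def might_contain(self, key):
        # False positives are possible; false negatives are not.
        return all(self.bitset >> p & 1 for p in self._positions(key))

bloom = ToyBloom()
for k in ("a", "b", "c"):
    bloom.add(k)

hit = bloom.might_contain("a")  # True: this SST must actually be checked
```

Storing one such filter per SST lets most point lookups for absent keys return without issuing any GET at all.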




It's a very very cool idea, but I'm still not clear on the main benefits.

Bottomless storage: yes, but couldn't you theoretically achieve this with plenty of cloud DB services? Amazon Aurora goes up to 128 TB, and once your DB gets to that size, it's likely that you can hire some dedicated engineers to handle more complicated setups.

High durability: yes, but couldn't this be achieved with a "normal" DB that has a read replica using object storage, rather than the entire DB using object storage?

Easy replication: arguably not easier than normal replication, depending on which cloud DB you're considering as an alternative.


Also wondering if this would become expensive very fast if it ends up using S3 with a large number of PUT calls


Tune the write flush frequency down (flush less often) and it'd be very cheap.
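For scale, a back-of-the-envelope check: at S3's standard request pricing of roughly $0.005 per 1,000 PUTs, even flushing once per second stays cheap (prices vary by region and change over time, so treat the figure as an assumption):

```python
# Back-of-the-envelope PUT cost at one flush per second.
# Assumed S3 standard pricing: ~$0.005 per 1,000 PUT requests.
put_price_per_1000 = 0.005
flushes_per_day = 24 * 60 * 60            # one flush (one PUT) per second
daily_cost = flushes_per_day / 1000 * put_price_per_1000
monthly_cost = daily_cost * 30            # ~$0.43/day, ~$13/month
```

GET costs are an order of magnitude cheaper per request, which is why the read-side mitigations focus on latency as much as price.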


If the benefits are not obvious to you, then you're not the target user, or you don't understand what kind of person needs this.

There's a class of folks who desperately need this. It's the KV equivalent to turbopuffer.


I've been working on something super similar, but some of the arch decisions here are curious considering the clear tradeoffs made.

For example, if you have a durability flush interval, what is the WAL for? L0 is the WAL now.


Great question! We started out with the design you described (WAL as L0). But we found that there's a tension between wanting L0 SSTs to be larger (and fewer) to reduce metadata size, and wanting WAL SSTs to be small and frequent (to reduce async/await latency).

Basically, we wanted WAL writes to complete on the order of milliseconds, but we wanted L0 SSTs to be larger, since they actually service reads.
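The decoupled cadences can be sketched like this: small WAL objects flushed on a short timer for durability, and a larger L0 SST cut only when the MemTable reaches a size threshold. The class, thresholds, and trigger logic here are invented for illustration, not taken from SlateDB.

```python
class ToyWritePath:
    """Decoupled cadences: small, frequent WAL objects for durability;
    large, infrequent L0 SSTs for reads (illustrative only)."""
    def __init__(self, l0_size_threshold=4):
        self.wal_objects = []   # flushed every few milliseconds in practice
        self.l0_ssts = []       # flushed only when the MemTable is big enough
        self.memtable = {}
        self.wal_buffer = []
        self.l0_size_threshold = l0_size_threshold

    def put(self, key, value):
        self.memtable[key] = value
        self.wal_buffer.append((key, value))

    def flush_wal(self):
        # Short interval: callers awaiting durability unblock quickly.
        if self.wal_buffer:
            self.wal_objects.append(list(self.wal_buffer))
            self.wal_buffer = []

    def maybe_flush_l0(self):
        # Size threshold: fewer, larger SSTs actually serve reads.
        if len(self.memtable) >= self.l0_size_threshold:
            self.l0_ssts.append(sorted(self.memtable.items()))
            self.memtable = {}

db = ToyWritePath()
for i in range(4):
    db.put(f"k{i}", i)
    db.flush_wal()        # four small WAL objects...
    db.maybe_flush_l0()   # ...but only one larger L0 SST
```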

The architecture page has more detail if you haven't found it yet:

https://slatedb.io/docs/architecture


Seems analogous to putting SeaweedFS in front of a cloud S3, then adding a database. We use (unrelated) Zenoh and Loki keeping state on S3, so it would be interesting to have a KV engine.



