Show HN: A tree DB to replace text based configuration when validation matters

atombender · on July 23, 2016

Trees are awesome, but their usefulness is bounded by the human-friendliness of the UI.

For example, a lot of apps use a hierarchical config where you can specify everything as YAML. Elasticsearch comes to mind. This works quite well because any property can be referred to as dotted keypaths, e.g. "discovery.zen.minimum_master_nodes".
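To make the keypath idea concrete, here is a minimal Python sketch of resolving a dotted keypath against a nested config. The helper name `get_by_keypath` and the value `2` are made up for illustration; only the key name comes from the example above, and this is not how Elasticsearch itself implements lookups.

```python
from functools import reduce


def get_by_keypath(config, keypath):
    """Walk a nested dict following a dotted keypath such as 'a.b.c'."""
    return reduce(lambda node, key: node[key], keypath.split("."), config)


# Hypothetical config, as it might look after loading a YAML file.
config = {"discovery": {"zen": {"minimum_master_nodes": 2}}}

value = get_by_keypath(config, "discovery.zen.minimum_master_nodes")
subtree = get_by_keypath(config, "discovery.zen")
```

A missing key raises `KeyError`, which is probably what you want for a config lookup.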

Kubernetes also uses YAML/JSON structures as a configuration interface. Kubernetes supports JSON Patch [1], a poor man's patching format for editing keys deep within a document, and it also supports JSONPath [2] for templating.
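For anyone who hasn't seen JSON Patch: a patch is a list of operations, each addressing a location with a slash-separated pointer. The sketch below hand-rolls only the `replace` operation in Python to show the shape of it. The function names and the sample document are hypothetical, and a real implementation would cover the other operations as well.

```python
import copy


def pointer_tokens(pointer):
    """Split a JSON Pointer into tokens, undoing the ~1 and ~0 escapes."""
    if pointer == "":
        return []
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer.split("/")[1:]]


def apply_replace(doc, path, value):
    """Return a copy of doc with the value at path replaced."""
    tokens = pointer_tokens(path)
    if not tokens:
        return value  # an empty pointer addresses the whole document
    result = copy.deepcopy(doc)
    node = result
    for token in tokens[:-1]:
        node = node[int(token)] if isinstance(node, list) else node[token]
    last = tokens[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        if last not in node:
            raise KeyError(path)  # "replace" requires the target to exist
        node[last] = value
    return result


def apply_patch(doc, patch):
    """Apply a list of operations; this sketch only understands 'replace'."""
    for op in patch:
        if op["op"] != "replace":
            raise NotImplementedError(op["op"])
        doc = apply_replace(doc, op["path"], op["value"])
    return doc


# Hypothetical document and patch.
deployment = {"metadata": {"name": "postgresql-master"}, "spec": {"replicas": 1}}
patch = [{"op": "replace", "path": "/spec/replicas", "value": 3}]
patched = apply_patch(deployment, patch)
```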

But what these don't do is treat the entire config as a big tree, and I think that's a good compromise. Kubernetes stores all its config in Etcd, but this config is considered an implementation detail — you're not supposed to edit it directly — and moreover, Etcd is like a file system, not a key/value tree; Kubernetes maps each path to a file, not a single value.

I like this approach, because individual objects can be validated separately by type. Kubernetes has a schema, which each object (YAML file) declares:

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: postgresql-master

This maps directly to the version of the API that changes need to go through in order to make the configuration persistent.
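Roughly, per-object validation by declared type could look like the Python sketch below. Everything here is an assumption for illustration: the `VALIDATORS` registry, the function names and the single rule being checked are not Kubernetes' actual validation code.

```python
def validate_deployment(obj):
    """Hypothetical rule set for one object type; returns a list of errors."""
    errors = []
    name = obj.get("metadata", {}).get("name")
    if not isinstance(name, str) or not name:
        errors.append("metadata.name must be a non-empty string")
    return errors


# Registry keyed by the type each object declares about itself.
VALIDATORS = {
    ("extensions/v1beta1", "Deployment"): validate_deployment,
}


def validate(obj):
    """Dispatch to the validator matching the object's apiVersion and kind."""
    key = (obj.get("apiVersion"), obj.get("kind"))
    validator = VALIDATORS.get(key)
    if validator is None:
        return ["unknown object type: %r" % (key,)]
    return validator(obj)


# The object from the YAML snippet above, as a parsed dict.
good = {
    "apiVersion": "extensions/v1beta1",
    "kind": "Deployment",
    "metadata": {"name": "postgresql-master"},
}
good_errors = validate(good)
bad_errors = validate({"apiVersion": "extensions/v1beta1", "kind": "Deployment"})
```

The point is that each object is checked on its own, so a bad file fails in isolation instead of invalidating one giant tree.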

If you try to make one huge key/value tree, you're more likely to end up with something like the Windows Registry, a huge ball of spaghetti.

[1] http://jsonpatch.com

[2] http://goessner.net/articles/JsonPath/

rixed · on July 23, 2016

Interestingly, when I first saw this project I thought it would be useful to me for exactly this kind of small configuration file describing jobs. I was not considering merging all config files together, though. Maybe calling this a DB is misleading, as it suggests having a single big file, which indeed I would not recommend.

atombender · on July 23, 2016

I don't know what your project is, but Etcd is excellent for this sort of thing if your project is distributed and you need resilience, and you can live with the external dependency on its daemons.

Etcd is also a hierarchical database, but you can use it like a robust distributed file system. A big benefit you get is liveness; you can respond dynamically to config changes by watching for change events.

rixed · on July 23, 2016

Wouldn't etcd require setting up several servers?

atombender · on July 23, 2016

daenney · on July 23, 2016

Though the documentation has plenty of introductory material and examples on how to use the code, it falls short on the "what can this help me solve" side of things. The examples in section 5 sort of get into it, but it would be nice to have that as part of the introduction to this project. Just from reading it, it's not entirely clear what I could or should use this for.

rixed · on July 23, 2016

I plead guilty for this. :) Indeed, this is a friend's project that wasn't intended to be publicly released. I was initially very sceptical about a "tree DB for configuration" as well, but I have been given a demo and eventually considered it interesting and worthwhile, therefore convinced him that he would gather interesting opinions on "Show HN". I'm sure a better version of the documentation (and the code) will be available soon.

thawkins · on July 23, 2016

What does this get you that etcd does not?

daenney · on July 23, 2016

This would be simpler than etcd in that it's not a distributed key/value store but a local-only thing. It's also not a key/value store but more of a graph database, and it allows you to validate the configuration against predefined schemas (in its own terms), something etcd doesn't do for you.

zepolen · on July 23, 2016

You can open, read, validate and write a small text file in an atomic action already - there is no reason to use anything more complicated.
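A minimal Python sketch of that pattern, assuming a JSON config file: read, change, validate, then write to a temporary file in the same directory and rename it over the original. The function name and its `mutate`/`validate` callbacks are hypothetical. Note that only the final rename is atomic, so concurrent writers would still need a lock.

```python
import json
import os
import tempfile


def update_config_atomically(path, mutate, validate):
    """Read a JSON config, change it, validate it, and replace the file."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)

    mutate(config)
    validate(config)  # expected to raise if the new config is invalid

    # Write to a temp file in the same directory so the rename stays on
    # one filesystem, then swap it into place in a single step.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(config, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return config
```

If validation fails, the original file is never touched; readers see either the old file or the new one, never a half-written mix.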

If your configuration is more than simple, then there are plenty of real dbs that implement everything explained in this article "properly".

This "tree db" supports concurrent reads but not concurrent writes, doesn't bother with the hard parts of concurrency, i.e. deletion of old data, and uses an ad hoc json/pointer bs. system to write/read data.

alexchamberlain · on July 23, 2016

This looks pretty awesome as a simple, embeddable graph database; not sure the configuration examples are the best use case to be honest.

mvitorino · on July 23, 2016

So, a bit like Windows Registry? Because that didn't turn out to be such a good idea.

networked · on July 23, 2016

I think this project is more directly comparable to Augeas (http://augeas.net/).

The most common criticism of the Windows Registry is that it is a per-user/per-machine database stored in an opaque format, thus making it harder than necessary to manipulate or version the data for individual applications. On the other hand, this project and Augeas focus on providing means to read and manipulate text-based configuration files that ensure the file contents' integrity is preserved. They do it by parsing the text into a tree first and serializing the tree afterwards. This lets you keep your individual configuration files as small as you wish and easy enough to version and compare.
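As a toy illustration of the parse, edit, serialize cycle, here is a Python sketch for a made-up `key.path = value` format. It is not how Augeas or this project work internally: the format and function names are assumptions, and this version drops comments and blank lines and does not handle a key that is both a leaf and a parent.

```python
def parse(text):
    """Parse 'a.b.c = value' lines into a nested dict (the tree)."""
    tree = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # this sketch drops comments and blank lines
        keypath, _, value = line.partition("=")
        *parents, leaf = keypath.strip().split(".")
        node = tree
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value.strip()
    return tree


def flatten(tree, prefix=""):
    """Turn the tree back into a list of 'a.b.c = value' lines."""
    lines = []
    for key, value in tree.items():
        path = prefix + key
        if isinstance(value, dict):
            lines.extend(flatten(value, path + "."))
        else:
            lines.append("%s = %s" % (path, value))
    return lines


def serialize(tree):
    return "\n".join(flatten(tree)) + "\n"


# Hypothetical config text: edit one value in the tree, write the rest back.
text = "server.host = localhost\nserver.port = 8080\n"
tree = parse(text)
tree["server"]["port"] = "9090"
updated = serialize(tree)
```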