

YawnDb: Time Series Database Written in Erlang - skazka16
http://kukuruku.co/hub/erlang/yawndb-time-series-database

======
ejp
Any benchmarks or anecdata about how this performs with millions of 'paths'?
What about data larger than memory?

Also, you mention difficulties with changing RRD settings, like time
intervals. How does this handle similar config migration?

I've recently been involved in an RRD replacement as well, for the same reasons
you cite in the article. Our technology choice was OpenTSDB. How does YAWNDB
stack up? (Granted, OpenTSDB is in another category of deployment complexity.)

~~~
lambdadmitry
Regarding benchmarks: here is a screenshot that I made during development:
[http://i.imgur.com/wEbn4.png](http://i.imgur.com/wEbn4.png) Things to note:
1) ~100k "paths"; 2) ~18.6k RPS; 3) very nice load distribution between CPUs
(actually, it will scale almost linearly because of its architecture); 4) CPU
isn't saturated at all (my laptop wasn't able to push enough load). I remember
observing a bit of non-linearity in CPU load (there is some sub-linear
overhead), so it should handle more than 3x that load.

------
azdle
This looks really interesting.

How are you writing the data to disk? I don't see exactly what you're doing in
the article. Are you just writing the data to files directly or is there some
sort of DB in there? From what I've seen, the storage is the most difficult
part.

~~~
oinksoft
Data appears to be persisted to disk via Basho's bitcask. You mentioned in
another comment that ETS is being used, which it is, but that's not on disk
(DETS is the disk-backed version); ETS is used here to communicate between
processes via public, named tables.

~~~
azdle
Thanks for correcting me. I'm still just dipping my toe into Erlang and forgot
that there were two different versions.

(And thanks to lambdadmitry for the additional insight.)

------
lambdadmitry
Hi there! Ask me anything, I was an original author of the DB before Pavel
took over its development after I left. Frankly, I was really surprised to see
the link here. Pavel did an excellent job describing our rationale in
English.

~~~
otterley
Why didn't you use Cassandra for storing time-series data? The technology is
already out there and used by pretty much everyone once they outgrow RRD.

~~~
lambdadmitry
It felt too heavyweight for that particular case. Moreover, that was 3 years
ago, and Cassandra wasn't a big thing then.

------
mtourne
I'm excited since I'm specifically in the market for a lightweight tsdb
written in Erlang. Also the documentation looks quite nice.

But, as with all the other possible contenders I've found so far, I am unable
to compile this project out of the box. Hopefully the maintainers of this one
will be commenting on my GitHub issues.

At this point I'm getting the feeling I will have to write my own half-baked
tsdb to answer only my immediate problem, instead of finding the magic
do-it-all, well-written, documented and extensible tsdb in Erlang. (I'm 2 weeks
into Erlang, so it will probably be very terrible.)

I've tried (and opened various issues on):

pulsedb: [https://github.com/pulsedb/pulsedb](https://github.com/pulsedb/pulsedb)

litetsdb: [https://github.com/dreyk/litetsdb](https://github.com/dreyk/litetsdb)

etsdb: [https://github.com/philipcristiano/etsdb](https://github.com/philipcristiano/etsdb)

~~~
lambdadmitry
Can you please describe what failed during compilation?

~~~
mtourne
Everything I have found is documented in their respective GitHub repositories.

This is my main issue
[https://github.com/band115/ecirca/issues/38](https://github.com/band115/ecirca/issues/38)

I can't find many docs on how to use rebar's {pre_hooks, [{compile, ...

$ERL_CFLAGS is set to the following (note the extra quotes):

-I"/usr/local/Cellar/erlang/17.3.4/lib/erlang/lib/erl_interface-3.7.19/include" -I"/usr/local/Cellar/erlang/17.3.4/lib/erlang/erts-6.2.1/include"

I don't know whether this issue lies with the Erlang distribution (Homebrew)
on my system, rebar, or ecirca.
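
A minimal sketch of why the extra quotes could matter. It assumes, without any confirmation from the thread, that some build step splits the variable on whitespace instead of handing it to a shell. The paths below are shortened placeholders, not the real ones, and the helper names are hypothetical. Python is used only for illustration.

```python
import shlex

# Placeholder value shaped like the $ERL_CFLAGS above; the paths are made up.
ERL_CFLAGS = '-I"/opt/erlang/lib/erl_interface/include" -I"/opt/erlang/erts/include"'


def naive_split(flags):
    """Split on whitespace only; quote characters stay inside each argument."""
    return flags.split()


def shell_split(flags):
    """Split using POSIX shell quoting rules; the quotes are consumed."""
    return shlex.split(flags)


naive = naive_split(ERL_CFLAGS)
shell = shell_split(ERL_CFLAGS)

for label, args in (("naive", naive), ("shell", shell)):
    print(label, args)
```

With whitespace splitting, each include path reaches the compiler with literal `"` characters in it, so the directory would not be found. With shell-style splitting the quotes are removed and the paths are usable. Which of the two actually happens in the rebar hook is not known from this thread.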

~~~
lambdadmitry
Big thanks for the report, I'll look into it in a few days.

------
rdtsc
Looks great. Thanks for sharing!

I like the architecture diagram and the explanation of how everything is put
together at a higher level.

------
CptMauli
I don't see how to record meta information like a quality indicator (important
for, let's say, timespans where no data could be recorded, e.g. during a
connection loss).

------
dozzie
1\. `make all' fetches dependencies. This is bad. It should only compile them.

2\. No canonical way of running the thing as a daemon. How am I supposed to
make a package out of it?

3\. No "reload config" command, at least I don't see it. Erlang supports such
a thing via the application's environment.

