
Tank – A very high performance distributed log service - olalonde
https://github.com/phaistos-networks/TANK
======
mfukar
GH is now a magic land where "high performance" means "no numbers to back this
up", "secure" means "we haven't actually been audited", and "awesome" means
something to the effect of "here, have this".

~~~
markpapadakis
Some benchmarks added:
[https://github.com/phaistos-networks/TANK/blob/master/README...](https://github.com/phaistos-networks/TANK/blob/master/README.md).
Will add consumer benchmarks later (our ops folks are on it). Apologies for not
backing the "high performance" claim with numbers in the first place.

------
biokoda
No performance numbers. No tests. Nothing about how it manages concurrency,
how it actually writes to disk, how it does replication.

It appears to be a server not a lib. Why would someone use this instead of
kafka?

~~~
markpapadakis
It is a service and a library (C++). You can always check the implementation,
but obviously that's the wrong answer to that question :)

The README states that replication is not implemented yet. I encourage you to
also check this issue
[https://github.com/phaistos-networks/TANK/issues/14](https://github.com/phaistos-networks/TANK/issues/14)
for some details on the I/O semantics, and the Wiki for answers to other
questions:
[https://github.com/phaistos-networks/TANK/wiki](https://github.com/phaistos-networks/TANK/wiki)

~~~
biokoda
So single threaded, blocking file io. Not very impressive or useful (as a
lib).

~~~
markpapadakis
Yes, single threaded, because the contention to the various files and the cost
of serialisation would likely negate the benefits of using multiple threads.

Network I/O is obviously asynchronous - you may want to check the codebase.
Disk I/O is synchronous, but:

- It uses sendfile() and readahead() to reduce or eliminate the likelihood of
blocking reads; please see
[https://github.com/phaistos-networks/TANK/issues/14#issuecom...](https://github.com/phaistos-networks/TANK/issues/14#issuecomment-232614206)

- AIO is either broken or only supported on XFS (depending on kernel release;
also, in the past, appending to a file on an XFS fs could block and degrade
performance). But not using DirectI/O so writes end up in memory and only get
flushed if it hits the commit count, or periodically - so, particularly for
local disks, HDDs or SSDs, this in practice never blocks for more than a few
ms when flushing, if at all.
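
To make the flush policy described above concrete, here is a minimal sketch in Python. It is not Tank's code (Tank is C++): the class name `AppendLog` and the parameters `commit_count` and `flush_interval` are hypothetical, chosen only to illustrate "buffered writes, fsync when a commit count is hit or periodically".

```python
import os
import time


class AppendLog:
    """Append-only log that fsyncs after N appends or T seconds.

    Hypothetical illustration of the policy discussed in the thread,
    not Tank's actual implementation.
    """

    def __init__(self, path, commit_count=3, flush_interval=1.0,
                 clock=time.monotonic):
        # No O_DIRECT: writes land in the kernel page cache first.
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.commit_count = commit_count
        self.flush_interval = flush_interval
        self.clock = clock
        self.pending = 0              # appends since the last fsync
        self.last_flush = clock()
        self.flushes = 0              # number of fsync calls, for inspection

    def append(self, payload: bytes) -> None:
        view = memoryview(payload)
        while view:                   # handle short writes
            written = os.write(self.fd, view)
            view = view[written:]
        self.pending += 1
        due = self.clock() - self.last_flush >= self.flush_interval
        if self.pending >= self.commit_count or due:
            self.flush()

    def flush(self) -> None:
        os.fsync(self.fd)             # the call that waits on the device
        self.pending = 0
        self.last_flush = self.clock()
        self.flushes += 1

    def close(self) -> None:
        if self.pending:
            self.flush()
        os.close(self.fd)
```

In this sketch the periodic check only runs when `append` is called; a real service would presumably also flush from a timer. How long `fsync` blocks depends on how much dirty data has accumulated, which is the point being argued in the replies.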

~~~
biokoda
> Yes, single threaded, because the contention to the various files and the
> cost of serialisation would likely negate the benefits of using multiple
> threads.

Multithreaded does not mean multiple files.

> But not using DirectI/O so writes end up in memory and only get flushed if
> it hits the commit count, or periodically - so, particularly for local
> disks, HDDs or SSDs, this in practice never blocks for more than a few ms
> when flushing, if at all.

Unless you are actually using it under a high throughput scenario which is why
you would use a lib like this. It will work great until you hit the actual
flush point, then possibly block for seconds even minutes.

If your performance needs are high, mandating XFS is not unreasonable.

~~~
krenoten
It's totally unreasonable to mandate a particular FS for modern general
purpose software.

Multiple files does mean multiple threads. Unless you manage to run this
without a modern operating system.

~~~
biokoda
> It's totally unreasonable to mandate a particular FS for modern general
> purpose software.

That is completely debatable. We are talking about "a high performance
distributed log service". Not a word processor.

> Multiple files does mean multiple threads.

Which is something entirely different from what I said...

------
Whitespace
I don't see any information other than a claim that it's "engineered for
optimal (very high) performance":
[https://github.com/phaistos-networks/TANK/wiki/Why-Tank-and-...](https://github.com/phaistos-networks/TANK/wiki/Why-Tank-and-Tank-vs-X)

------
jedisct1
No JVM required. Way easier to set up than Kafka. I'm sold :)

~~~
virmundi
Is headless install of the Oracle JVM that hard? Isn't it just a Puppet or a
Chef snippet?

------
markpapadakis
I think I (or maybe someone else) need to run some benchmarks and produce
some meaningful comparison metrics, as pointed out by commenters here and
elsewhere. I suppose I should have done that already. Apologies for the lack
of concrete numbers. There's some information about performance in the Wiki
and Issues though.

------
nope_42
I'm looking for something like Kafka with an at-least-once guarantee. I
believe this can be achieved with the Kafka Java client (not sure on that) but
librdkafka (C++ client) doesn't seem to support this guarantee. Performance is
secondary to messages not getting dropped in my use cases.

What kind of guarantees does tank make?

~~~
markpapadakis
The Tank Client will immediately publish to Tank (it doesn't buffer requests).
You get at-least-once semantics with Tank (exactly-once pretty much means
at-least-once, but with dupe detection).
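
As a rough illustration of "at least once plus duplicate detection", here is a small Python sketch. It assumes messages arrive in order with monotonically increasing sequence numbers; `DedupingConsumer` and `deliver` are hypothetical names, not part of any Tank client API.

```python
class DedupingConsumer:
    """Drops redelivered messages by remembering the last applied sequence number.

    Assumption: sequence numbers start at 1, only ever increase, and
    messages are delivered in order (redeliveries repeat old numbers).
    """

    def __init__(self, handler):
        self.handler = handler
        self.last_applied = 0      # highest sequence number already processed

    def deliver(self, seq, payload):
        if seq <= self.last_applied:
            return False           # duplicate from a retry: ignore it
        self.handler(payload)
        self.last_applied = seq    # advance only after the handler succeeded
        return True


# Usage: the second delivery of seq 2 is ignored.
seen = []
consumer = DedupingConsumer(seen.append)
for seq, payload in [(1, "a"), (2, "b"), (2, "b"), (3, "c")]:
    consumer.deliver(seq, payload)
```

For the effect to survive restarts, `last_applied` would have to be persisted together with the handler's side effects; this sketch keeps it in memory only.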

~~~
nope_42
So if I have a subscriber that simply publishes a transformed message onto
another topic, I can have a guarantee that if the publish fails it won't move on
to the next message in the subscription?

~~~
markpapadakis
Consumer applications (which interface with Tank brokers via a Tank
client library) are responsible for maintaining the sequence number they are
consuming from.

Suppose one such application is consuming every new event that's published
("tailing the log"). As soon as another application successfully publishes 1
or more new messages, the consumer will get them immediately. If the
application that attempted to publish failed to do so, or didn't get an ACK
for success, then you are guaranteed that no new message(s) were published (i.e.
no partial message content).

I am not sure if that answers your question, if not, can you please elaborate?

~~~
nope_42
I believe so. I suppose I'm asking for an abstraction that makes maintaining
the sequence number simple and fails safely in the presence of errors.

I'd basically like to be able to map messages from one topic to another with a
guarantee that none of those messages will be lost; even when some error
occurs (either a programming error, system downtime, or network partitions).
I'd prefer the application to stop producing messages than lose any of them.

It sounds like that is possible with Tank so I may end up giving it a try.
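
The topic-to-topic mapping described above can be sketched as follows. This is a hedged Python illustration under stated assumptions, not Tank's client API: `InMemoryTopic`, `relay`, and `save_checkpoint` are hypothetical stand-ins, and the only idea taken from the thread is that the consumer owns its sequence number and advances it only after a successful publish.

```python
class InMemoryTopic:
    """Toy stand-in for a topic; sequence numbers start at 1."""

    def __init__(self):
        self.messages = []

    def publish(self, payload):
        self.messages.append(payload)
        return len(self.messages)          # sequence number of the new message

    def read_from(self, seq):
        # All (sequence number, payload) pairs at or after `seq`.
        return [(i + 1, m) for i, m in enumerate(self.messages) if i + 1 >= seq]


def relay(source, sink_publish, transform, next_seq, save_checkpoint):
    """Copy transformed messages from `source` to a sink, at least once.

    `sink_publish` must raise on failure. The checkpoint is saved only
    after a successful publish, so a failure stops the relay without
    skipping the failed message.
    """
    for seq, payload in source.read_from(next_seq):
        sink_publish(transform(payload))   # may raise; checkpoint untouched
        next_seq = seq + 1
        save_checkpoint(next_seq)          # persist progress after success
    return next_seq
```

If the process dies between the publish and the checkpoint, the message is published again on restart. That duplicate is what makes this at-least-once rather than exactly-once; the application stops rather than losing a message.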

------
kinkdr
Looks like it is abandoned; the first commit was in June and the last in September.

~~~
markpapadakis
I am the core developer. It's not abandoned at all. There are updates that
haven't been pushed upstream, but no new features. I was going to support
replication via an external consensus service (etcd, consul, etc.) -- but it looks
like implementing Raft directly in Tank for interfacing with other cluster
nodes is a better idea, all things considered (no external deps.,
simplicity).

The reason this hasn't happened yet is that, other than lack of free time to
pursue it, we (at work) haven't really needed that feature yet. We run a few
instances and they are very idle and we can also mirror to other nodes (via
tank-cli).

~~~
SEJeff
Can you please please please use the etcd implementation[1] of raft and not
the normal go-raft or consul raft implementations? They've done some serious
business fault injection and integration testing with etcd as part of Google's
hosted kubernetes (GKE). There are still some lingering issues with consul at
scale that make me a bit gunshy. Mesosphere did some of this work themselves:
[https://mesosphere.com/blog/2015/10/26/etcd-mesos-kubernetes...](https://mesosphere.com/blog/2015/10/26/etcd-mesos-kubernetes/),
but I know that Google engineers have done tons of work on this as well.

[1]
[https://github.com/coreos/etcd/tree/master/contrib/raftexamp...](https://github.com/coreos/etcd/tree/master/contrib/raftexample)

~~~
markpapadakis
Thank you - yes, I was planning to base the implementation on etcd's. I
appreciate the heads up :)

