
Introducing zetcd - rtp
https://coreos.com/blog/introducing-zetcd
======
philips
A big use case of this that we are thinking about is enabling people to use
the etcd Operator[1], which makes it simple to run etcd clusters on
Kubernetes, to back their ZooKeeper applications.

The neat thing about the etcd Operator is you can define a cluster and the
etcd Operator takes care of normal operations by using the Kubernetes API.

    apiVersion: "etcd.coreos.com/v1beta1"
    kind: "Cluster"
    metadata:
      name: "example-etcd-cluster"
    spec:
      size: 5
      version: "3.1.8"

Pretty neat!

Anyways, the zetcd project is still super young, but we would love more folks
to try it out. As the post says, folks have already tried it with Kafka,
Mesos, and others.

[1] [https://coreos.com/blog/introducing-the-etcd-operator.html](https://coreos.com/blog/introducing-the-etcd-operator.html)

~~~
agibsonccc
Who are you targeting with this? The "hesitant" ZooKeeper folks that already
depend on etcd? Are you hoping to unseat CDH here? Pardon the naive question -
I never bump into k8s when selling to traditional enterprise Hadoop
customers.

I'd also never pick kubernetes for my "from scratch" cluster due to already
being reliant on the JVM stack. I actually like the idea of giving an IT
department that already understands zookeeper a mesos cluster with DC/OS.

That being said - k8s has a ton of momentum but it seems to be mainly with
startups or maybe niche teams (prove me wrong here?) outside of google. It
would be great to understand what you guys are looking at for things like
this. Right now it feels like k8s and a lot of the other startups in this
space like pachyderm are trying to compete with the hadoop ecosystem (which is
great! competition forces innovation which is good for the ecosystem as a
whole)

~~~
jaz46
Kubernetes is actually getting a solid amount of large-tech and early adopter
enterprise deployment. That's still pretty nascent, but it's picking up
quickly. Happy to discuss the adoption we're seeing in more detail offline.

The reason you don't bump into k8s while selling to Hadoop users is that
Hadoop isn't something you'd run on a container-based stack (at least not
right now and IMO it won't be). There are lots of Hadoop users who run
containers for their application infra (as opposed to data infra). Pachyderm's
whole pitch is that containerized data infra can be really powerful and that
enterprises will want to unify their stack to all be containerized and k8s is
THE answer for the orchestration layer.

P.S. Despite all my opinions above, I actually agree with your initial
question around who zetcd is actually targeting. I don't have a clear picture
of that.

~~~
bogomipz
>"Pachyderm's whole pitch is that containerized data infra can be really
powerful and that enterprises will want to unify their stack to all be
containerized and k8s is THE answer for the orchestration layer."

Doesn't Pachyderm predate K8s though? Is this a recent development? Have they
shifted focus then?

------
Randgalt
I did some quick testing with Apache Curator (note: I'm the main author of
Curator) and it looks like zetcd isn't implementing the create2 opcode and
several others. If the goal is to really be ZK compatible it has a long way to
go. I'm not sure who the target of this is. But, I'll keep following and try
to get the Curator tests to run when it's further along.

~~~
philips
Issue filed:
[https://github.com/coreos/zetcd/issues/49](https://github.com/coreos/zetcd/issues/49)

------
tyingq
Interesting. Maybe hashicorp should release an etcd compatible layer for
Consul. Or CockroachDB even. Both have one thing etcd does not...inbuilt WAN
support.

~~~
philips
etcd can cross WAN links with tuning for the expected latencies. Tuning
latencies is required to ensure the leader election algorithms know when to
trigger a failure[1].

What is your use case?

[1]
[https://coreos.com/etcd/docs/latest/tuning.html](https://coreos.com/etcd/docs/latest/tuning.html)
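
To make the tuning point concrete, here is a rough sketch in Go of how the two relevant settings might be derived from a measured round-trip time between members. It assumes the rules of thumb from the tuning guide linked above (heartbeat interval around the peer RTT, election timeout at least 10x the RTT, with a 50000 ms upper limit) and the etcd flags `--heartbeat-interval` and `--election-timeout`, both in milliseconds. The function name `suggestTuning` and the choice to never go below the 100 ms / 1000 ms defaults are assumptions made for illustration, not something etcd ships.

```go
package main

import "fmt"

// suggestTuning is a hypothetical helper: given the peer round-trip time in
// milliseconds, it returns candidate values for etcd's --heartbeat-interval
// and --election-timeout flags (both in milliseconds).
func suggestTuning(rttMs int) (heartbeatMs, electionMs int) {
	const (
		defaultHeartbeatMs = 100   // etcd default heartbeat interval
		defaultElectionMs  = 1000  // etcd default election timeout
		maxElectionMs      = 50000 // documented upper limit for the election timeout
	)

	// Heartbeat roughly equal to the RTT; never below the default (assumption).
	heartbeatMs = rttMs
	if heartbeatMs < defaultHeartbeatMs {
		heartbeatMs = defaultHeartbeatMs
	}

	// Election timeout at least 10x the RTT, clamped to [default, max].
	electionMs = 10 * rttMs
	if electionMs < defaultElectionMs {
		electionMs = defaultElectionMs
	}
	if electionMs > maxElectionMs {
		electionMs = maxElectionMs
	}
	return heartbeatMs, electionMs
}

func main() {
	// Example RTTs: same datacenter, cross-country, intercontinental.
	for _, rtt := range []int{10, 150, 300} {
		hb, el := suggestTuning(rtt)
		fmt.Printf("rtt=%dms: etcd --heartbeat-interval=%d --election-timeout=%d\n", rtt, hb, el)
	}
}
```

Every member of a cluster should be started with the same pair of values, which is why this amounts to tuning the whole cluster for its slowest link.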

~~~
tyingq
Yes, you can sort of detune the whole cluster. That's not quite the same as
Consul's and CockroachDB's specific WAN awareness. Those two take different
approaches, but do specifically understand and compensate.

>What is your use case?

Contract work, so use case varies. I agree that etcd is often the right
answer.

~~~
philips
Right, etcd's entire focus is on being a consensus database for distributed
systems needing coordination, locking, etc. So, eventually consistent WAN
replication hasn't really been a focus.

I do think this sort of cross-cluster key replication is useful and we offer
it as a userspace external tool called make-mirror[1].

[1]
[https://github.com/coreos/etcd/blob/master/etcdctl/README.md...](https://github.com/coreos/etcd/blob/master/etcdctl/README.md#utility-commands)

~~~
tyingq
Worth noting that CockroachDB isn't using an eventually consistent model. Yes,
it's not the same thing as etcd, but I can see some potential use case
overlap.

~~~
ideal0227
There is some overlap, but we should choose a solution wisely :P.

Here is a doc comparing etcd with other systems, including CockroachDB:
[https://github.com/coreos/etcd/blob/master/Documentation/lea...](https://github.com/coreos/etcd/blob/master/Documentation/learning/why.md)

I work on etcd.

------
callumjones
Great idea, we only run Zookeeper because of Kafka but everything else is in
Consul so the idea of this existing is pretty neat.

~~~
wink
Same here, Storm and Kafka - everything else is in consul - and 2 single
sources of truth.. well :)

------
ninjakeyboard
So how does zetcd handle the zookeeper session? It's in zetcd and then the
"ephemeral nodes" are removed from etcd if the zk client session expires? This
is handled by the proxy?

The major design compatibility item between the two seems to be that the
ZooKeeper protocol has session state and so supports ephemeral nodes. If a GC
pauses the JVM for a long time, the client session will expire. etcd has TTLs
instead (ZooKeeper always lacked TTLs as they weren't needed).

~~~
philips
A goroutine is spawned which runs the etcd client's KeepAlive method to tie a
session to a lease in the proxy. The code is here:
[https://github.com/coreos/zetcd/blob/d33e3b836a2a2de8a8ec077...](https://github.com/coreos/zetcd/blob/d33e3b836a2a2de8a8ec0771f5d249aa66d4bf8a/session.go#L48)

Disclaimer: The person who wrote this is AFK at the moment so I might be
completely wrong ;)
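
As a rough illustration of the session-to-lease idea (not the zetcd code linked above), here is a toy in-memory model in Go. It does not talk to etcd at all: `leaseStore` and its methods are hypothetical names, and time is a logical counter in seconds. The point it tries to show is only that keys attached to a lease behave like ephemeral nodes: they survive while something keeps refreshing the lease and disappear once the refreshes stop for longer than the TTL.

```go
package main

import "fmt"

// leaseStore is a toy stand-in for lease-backed keys. Hypothetical type,
// for illustration only; real code would use the etcd client instead.
type leaseStore struct {
	now    int            // logical clock, in seconds
	ttl    map[int]int    // lease ID -> TTL in seconds
	expiry map[int]int    // lease ID -> logical time at which it expires
	keys   map[string]int // key -> owning lease ID
	nextID int
}

func newLeaseStore() *leaseStore {
	return &leaseStore{
		ttl:    map[int]int{},
		expiry: map[int]int{},
		keys:   map[string]int{},
		nextID: 1,
	}
}

// grant creates a lease with the given TTL and returns its ID.
func (s *leaseStore) grant(ttl int) int {
	id := s.nextID
	s.nextID++
	s.ttl[id] = ttl
	s.expiry[id] = s.now + ttl
	return id
}

// put attaches a key to a lease, like an ephemeral node owned by a session.
func (s *leaseStore) put(key string, lease int) {
	s.keys[key] = lease
}

// keepAlive refreshes the lease; it reports false if the lease already expired.
func (s *leaseStore) keepAlive(lease int) bool {
	ttl, ok := s.ttl[lease]
	if !ok {
		return false
	}
	s.expiry[lease] = s.now + ttl
	return true
}

// advance moves the clock forward and drops expired leases with their keys.
func (s *leaseStore) advance(seconds int) {
	s.now += seconds
	for id, exp := range s.expiry {
		if exp <= s.now {
			delete(s.expiry, id)
			delete(s.ttl, id)
			for k, owner := range s.keys {
				if owner == id {
					delete(s.keys, k)
				}
			}
		}
	}
}

// has reports whether the key still exists.
func (s *leaseStore) has(key string) bool {
	_, ok := s.keys[key]
	return ok
}

func main() {
	s := newLeaseStore()
	session := s.grant(10) // one lease per client session
	s.put("/workers/w1", session)

	s.advance(6)
	s.keepAlive(session) // the proxy keeps refreshing while the session lives
	s.advance(6)
	fmt.Println("after keepalive:", s.has("/workers/w1"))

	s.advance(11) // refreshes stop, e.g. the client went away
	fmt.Println("after expiry:", s.has("/workers/w1"))
}
```

In the real proxy the refresh loop would be the goroutine running KeepAlive described above, and the expiry would be enforced by etcd rather than by a local clock.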

------
moderation
Another reason to use zetcd over ZooKeeper is security. I don't believe it is
possible yet to use TLS for ZooKeeper-to-ZooKeeper communication (clients can
connect to ZooKeeper using TLS). The Jira issue covering this feature is [0].
This will increasingly be a problem for those running ZooKeeper for Mesos,
Kafka, etc. from a security and risk point of view.

[0]
[https://issues.apache.org/jira/browse/ZOOKEEPER-236](https://issues.apache.org/jira/browse/ZOOKEEPER-236)

------
batbomb
A few years ago I tried to reimplement the server connections directly in
Zookeeper to be HTTP connections and a REST-like API and try to use long-
polling and some other things. I got some code written, but no prototype up
and I had to put it aside. This was initially before etcd was really going (I
think it was at version 0.2 when I started the work). I had always hoped
somebody else would try the same thing, but no such luck.

