Introducing zetcd (coreos.com)
264 points by rtp on May 19, 2017 | 34 comments

A big use case of this that we are thinking about is enabling people to use the etcd Operator[1], which makes it simple to run etcd clusters on Kubernetes, to back their ZooKeeper applications.

The neat thing about the etcd Operator is you can define a cluster and the etcd Operator takes care of normal operations by using the Kubernetes API.

  apiVersion: "etcd.coreos.com/v1beta1"
  kind: "Cluster"
  metadata:
    name: "example-etcd-cluster"
  spec:
    size: 5
    version: "3.1.8"
Pretty neat!
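
For anyone scripting this, here is a minimal sketch of the Cluster object above built in Python and printed as JSON, which kubectl accepts as well as YAML. The helper name `etcd_cluster_manifest` is made up for illustration, and the `metadata`/`spec` nesting is an assumption based on the etcd Operator's v1beta1 examples, not something checked against a live cluster.

```python
import json


def etcd_cluster_manifest(name, size, version):
    """Build the Cluster object as a plain dict.

    Hypothetical helper: the field layout is an assumption based on the
    etcd Operator's v1beta1 examples, not verified against a live API.
    """
    return {
        "apiVersion": "etcd.coreos.com/v1beta1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {"size": size, "version": version},
    }


manifest = etcd_cluster_manifest("example-etcd-cluster", 5, "3.1.8")
# The JSON output can be saved to a file and passed to `kubectl create -f`.
print(json.dumps(manifest, indent=2))
```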

Anyways, the zetcd project is still super young but we would love more folks to try it out. As the post says, folks have already tried it with Kafka, Mesos, and others.

[1] https://coreos.com/blog/introducing-the-etcd-operator.html

Who are you targeting with this? The "hesitant" zookeeper folks that already depend on etcd? Are you hoping to unseat CDH here? Pardon the naive question here - I never bump into k8s selling to traditional enterprise hadoop customers.

I'd also never pick kubernetes for my "from scratch" cluster due to already being reliant on the JVM stack. I actually like the idea of giving an IT department that already understands zookeeper a mesos cluster with DC/OS.

That being said - k8s has a ton of momentum but it seems to be mainly with startups or maybe niche teams (prove me wrong here?) outside of google. It would be great to understand what you guys are looking at for things like this. Right now it feels like k8s and a lot of the other startups in this space like pachyderm are trying to compete with the hadoop ecosystem (which is great! competition forces innovation which is good for the ecosystem as a whole)

Kubernetes is actually getting a solid amount of large-tech and early adopter enterprise deployment. That's still pretty nascent, but it's picking up quickly. Happy to discuss in more details offline the adoption we're seeing.

The reason you don't bump into k8s while selling to Hadoop users is that Hadoop isn't something you'd run on a container-based stack (at least not right now and IMO it won't be). There are lots of Hadoop users who run containers for their application infra (as opposed to data infra). Pachyderm's whole pitch is that containerized data infra can be really powerful and that enterprises will want to unify their stack to all be containerized and k8s is THE answer for the orchestration layer.

P.S. Despite all my opinions above, I actually agree with your initial question around who zetcd is actually targeting. I don't have a clear picture of that.

>"Pachyderm's whole pitch is that containerized data infra can be really powerful and that enterprises will want to unify their stack to all be containerized and k8s is THE answer for the orchestration layer."

Doesn't Pachyderm predate K8s though? Is this a recent development? Have they shifted focus then?

Sure! Feel free to reach out. I'm just commenting on a wider trend I'm seeing with parallels to the hadoop ecosystem popping up written in go that are container based. I agree you don't tend to run hadoop and co on containers. We tend to see the app side as well though. We do both microservices as well as hadoop infra.

Non-startup companies are adopting Kubernetes. You can see some of their stories on the Tectonic Summit website[1]: Ticketmaster, eBay, Concur, SAP, BNY Mellon, MLS, etc.

I will try to reply in the morning in depth on the other points.

[1] https://coreos.com/summit/

Appreciated! I'm wondering if these are just one-off teams though? We have "enterprise adoption" for our software but it doesn't mean company-wide. One thing hadoop has been able to do is actually get deployed at scale. You can have small teams within companies using k8s for their apps. Some other parts of these companies can be too conservative to actually deploy new tech. The "nascent" adoption usually means innovation labs and one-off deployments for certain teams.

What I'm trying to gauge here is k8s as an actual "company wide platform". I would love for it to be something I can depend on to be at an enterprise in a few years. It's great technology but still feels like it needs to be beaten up a bit yet.

I work on OpenShift (which is k8s with tenancy) and there's a good mix of "production apps", "dense development clusters", and "single app experimentation" out there. Like all things about the future, it's here, just not evenly distributed.

You'd be surprised how many services you interact with on a daily basis are running on k8s (whole or in part).

It's still early, and many of the adopters today in large companies just happened to be making modernization efforts of their app-dev / app-deploy pipelines and moved to k8s or OpenShift. That said, it's certainly not ubiquitous yet.

This sounds more palatable to me. I definitely know it has traction and I wouldn't be surprised to see it powering quite a few of the bigger services but it still feels like it's largely in the early adopter phase. This is in line with what I have seen. I know it's "out there" but it's not exactly "RHEL" yet ;).

So I can run a Kubernetes cluster on Mesos and have the Zookeeper for Mesos deployed on the Kubernetes cluster using the etcd operator and zetcd

Joking, of course.

The funny thing about the situation you describe is that there are real world examples of similar circular dependencies.

I recall GitHub having an issue like that where their build pipeline used Bower, which is hosted on GitHub. When shit hit the fan and a build broke the site, they couldn't build the "fix" as Bower didn't work.

My own experience working at CoreOS is that many of our projects exploit self-referentiality as it's a particularly useful property.

Off the top of my head:

- Quay.io, our registry service, is built and deployed by itself

- Clair, our static analysis tool for detecting security vulnerabilities, analyzes itself

- Tectonic, our enterprise Kubernetes distro, is "self-driving" and manages itself

- discovery.etcd.io, a service we run to make it easier to bootstrap new etcd quorums, is just a quorum of etcd nodes

I think you are missing the point. It's like running docker registry on kubernetes.

If for some reason the cluster goes down, bootstrapping it might be a bit difficult.

Yes that's the exact point I was trying to make. Things are fine until they're not, at which point it's surgery and tribal knowledge to fix them.

Of course you are joking, but keep in mind that _in the real world_ you would also operate the etcd that Kubernetes itself needs via the etcd Operator!

Oh, and here is a video explanation of the etcd Operator https://youtu.be/Uf7PiHXqmnw?t=11

I did some quick testing with Apache Curator (note: I'm the main author of Curator) and it looks like zetcd isn't implementing the create2 opcode and several others. If the goal is to really be ZK compatible it has a long way to go. I'm not sure who the target of this is. But, I'll keep following and try to get the Curator tests to run when it's further along.
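
For context on what an "opcode" is here: every ZooKeeper request carries a small header with a transaction id (xid) and an integer request type, and a proxy like zetcd has to recognize each type it receives. Below is a rough sketch of that framing, assuming the opcode numbers from ZooKeeper's `ZooDefs.OpCode` (create = 1, create2 = 15) and a 4-byte big-endian length prefix. `frame_request` and `read_opcode` are hypothetical helper names, and none of this is code from zetcd or Curator.

```python
import struct

# Opcode numbers as defined in ZooKeeper's ZooDefs.OpCode (subset only).
OPCODES = {"create": 1, "delete": 2, "exists": 3, "getData": 4, "create2": 15}


def frame_request(xid, opcode_name, body=b""):
    """Length-prefix a request: [len][xid][type][body], big-endian int32s."""
    header = struct.pack(">ii", xid, OPCODES[opcode_name])
    payload = header + body
    return struct.pack(">i", len(payload)) + payload


def read_opcode(frame):
    """Return (xid, type) from a framed request, as a server or proxy would."""
    (length,) = struct.unpack(">i", frame[:4])
    xid, op = struct.unpack(">ii", frame[4:12])
    assert length == len(frame) - 4  # sanity check on the length prefix
    return xid, op


frame = frame_request(xid=7, opcode_name="create2")
xid, op = read_opcode(frame)
print(xid, op)
```

A server that does not handle a given type value cannot serve that request, so each missing opcode is a compatibility gap that a client library's test suite would surface.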

Interesting. Maybe HashiCorp should release an etcd compatible layer for Consul. Or CockroachDB even. Both have one thing etcd does not...built-in WAN support.

etcd can cross WAN links with tuning for the expected latencies. Tuning latencies is required to ensure the leader election algorithms know when to trigger a failure[1].

What is your use case?

[1] https://coreos.com/etcd/docs/latest/tuning.html
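
As a rough illustration of the tuning mentioned above, here is a small sketch that turns a measured round-trip time into values for etcd's `--heartbeat-interval` and `--election-timeout` flags (both in milliseconds). The rule of thumb (heartbeat around the RTT, election timeout at least 10x the RTT, capped at 50000 ms) is taken from the tuning guide as I understand it. The function names and the choice to never go below etcd's defaults of 100 ms and 1000 ms are assumptions made for this example.

```python
def suggest_etcd_timeouts(rtt_ms):
    """Suggest (heartbeat_ms, election_ms) for a member-to-member RTT.

    Assumed rule of thumb: heartbeat roughly equal to the RTT, election
    timeout at least 10x that. Floors are etcd's defaults (100 / 1000 ms);
    the cap is the documented 50000 ms limit on the election timeout.
    """
    heartbeat_ms = max(100, int(rtt_ms))
    election_ms = min(50000, max(1000, 10 * heartbeat_ms))
    return heartbeat_ms, election_ms


def etcd_flags(rtt_ms):
    """Render the suggestion as etcd command-line flags."""
    heartbeat_ms, election_ms = suggest_etcd_timeouts(rtt_ms)
    return [
        f"--heartbeat-interval={heartbeat_ms}",
        f"--election-timeout={election_ms}",
    ]


# Example: members spread across regions with ~250 ms round trips.
print(" ".join(etcd_flags(250)))
```

Treat the output as a starting point only; real deployments should measure latency between members and adjust from there.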

Yes, you can sort of detune the whole cluster. That's not quite the same as Consul's and CockroachDB's specific WAN awareness. Those two take different approaches, but do specifically understand and compensate.

>What is your use case?

Contract work, so use case varies. I agree that etcd is often the right answer.

Right, etcd's entire focus is on being a consensus database for distributed systems needing coordination, locking, etc. So, eventually consistent WAN replication hasn't really been a focus.

I do think this sort of cross-cluster key replication is useful and we offer it as a userspace external tool called make-mirror[1].

[1] https://github.com/coreos/etcd/blob/master/etcdctl/README.md...

Worth noting that CockroachDB isn't using an eventually consistent model. Yes, it's not the same thing as etcd, but I can see some potential use case overlap.

There is some overlap, but we should choose a solution wisely :P.

Here is a doc [https://github.com/coreos/etcd/blob/master/Documentation/lea...] comparing etcd with other systems, including CockroachDB.

I work on etcd.

We might have just been talking about that

"We" as in Hashicorp?

Yeah, a few people internally would like to tackle this

Great idea, we only run Zookeeper because of Kafka but everything else is in Consul so the idea of this existing is pretty neat.

Same here, Storm and Kafka - everything else is in consul - and 2 single sources of truth.. well :)

So how does zetcd handle the zookeeper session? It's in zetcd and then the "ephemeral nodes" are removed from etcd if the zk client session expires? This is handled by the proxy?

This seems to be the major design compatibility item between the two: the zookeeper protocol has session state and so supports ephemeral nodes. If a GC pauses the JVM for a long time, the client session will expire. In etcd there is a ttl instead (zookeeper always lacked ttl as it wasn't needed).

A goroutine is spawned which runs the etcd client's KeepAlive method to tie a session to a lease in the proxy. The code is here: https://github.com/coreos/zetcd/blob/d33e3b836a2a2de8a8ec077...

Disclaimer: The person who wrote this is AFK at the moment so I might be completely wrong ;)
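
To make the session-to-lease idea concrete, here is a toy simulation. It does not use the real etcd client or any zetcd code: `FakeLeaseStore` and its methods are invented for illustration, and time is passed in by hand so the run is deterministic. It only assumes what is described above, namely that a session maps to a lease that must be refreshed periodically, and that ephemeral nodes are keys attached to that lease.

```python
class FakeLeaseStore:
    """Toy in-memory stand-in for a store with leases (hypothetical API)."""

    def __init__(self):
        self.deadlines = {}  # lease_id -> expiry time in seconds
        self.ttls = {}       # lease_id -> TTL in seconds
        self.keys = {}       # key -> (value, lease_id or None)
        self._next_id = 1

    def grant(self, ttl, now):
        """Create a lease that expires `ttl` seconds after `now`."""
        lease_id = self._next_id
        self._next_id += 1
        self.ttls[lease_id] = ttl
        self.deadlines[lease_id] = now + ttl
        return lease_id

    def keep_alive(self, lease_id, now):
        """Refresh the lease; returns False if it has already expired."""
        self.expire(now)
        if lease_id not in self.deadlines:
            return False
        self.deadlines[lease_id] = now + self.ttls[lease_id]
        return True

    def put(self, key, value, lease_id=None):
        self.keys[key] = (value, lease_id)

    def expire(self, now):
        """Drop expired leases and every key attached to them."""
        dead = {lid for lid, t in self.deadlines.items() if t <= now}
        for lid in dead:
            del self.deadlines[lid]
            del self.ttls[lid]
        self.keys = {k: v for k, v in self.keys.items() if v[1] not in dead}


store = FakeLeaseStore()
session = store.grant(ttl=10, now=0)          # ZooKeeper session -> lease
store.put("/workers/w1", "alive", session)    # ephemeral node -> leased key
store.put("/config", "v1")                    # persistent node -> no lease

alive_at_8 = store.keep_alive(session, now=8)    # client still pinging
alive_at_30 = store.keep_alive(session, now=30)  # long GC pause: too late
print(alive_at_8, alive_at_30, sorted(store.keys))
```

Once the refresh arrives after the deadline, the lease is gone and the leased key disappears with it, while the key without a lease stays. That is the same observable behavior as an expired ZooKeeper session dropping its ephemeral nodes.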

> zookeeper always lacked ttl

ZooKeeper 3.5.3 supports TTLs

Another reason to use zetcd over Zookeeper is security. I don't believe it is possible yet to use TLS for Zookeeper-to-Zookeeper communication (clients can connect to Zookeeper using TLS). The Jira covering this feature is [0]. This will increasingly be a problem for those running Zookeeper for Mesos, Kafka etc. from a security and risk point of view.

[0] https://issues.apache.org/jira/browse/ZOOKEEPER-236

A few years ago I tried to reimplement the server connections directly in Zookeeper to be HTTP connections and a REST-like API and try to use long-polling and some other things. I got some code written, but no prototype up and I had to put it aside. This was initially before etcd was really going (I think it was at version 0.2 when I started the work). I had always hoped somebody else would try the same thing, but no such luck.
