Serf: A decentralized solution for service discovery and orchestration (serfdom.io)
117 points by coffeejunk on Oct 23, 2013 | 44 comments

I'm jumping on a plane right now (a couple of hours) but I'd be happy to answer any questions related to Serf once I land. Just leave them here and I'll give it my best shot! We've dreamt of something like Serf for quite a while and I'm glad it is now a reality.

Some recommended URLs if you're curious what the point is:

"What is Serf?" http://www.serfdom.io/intro/index.html

"Use Cases" http://www.serfdom.io/intro/use-cases.html

Comparison to Other Software: http://www.serfdom.io/intro/vs-other-sw.html

For the CS nerds, the internals/protocols/papers behind Serf: http://www.serfdom.io/docs/internals/gossip.html

Also, I apologize that the site isn't very mobile friendly right now. Unfortunately, I can write a Lamport clock implementation, but CSS is just crazytown.

A meta question, if you will - I have often come across situations at work where I thought "if only we had that tool". Sometimes I have hacked something together; other times I've taken it further, tidied it up, and released it. But this seems to have a high level of polish

so ...

When did you realise the need for Serf?

Did you work on it as a main project at some point, or is it a side project?

When and how did you decide to commit to getting this done?

And the big one for me: tools are driven by a need, but often the need keeps coming while the time to build diminishes. What strategies did you use to keep the plates spinning while building Serf?

I think we can all mostly answer these questions - I just want to know how different your answers are from, say, mine, given that I haven't released two major OSS projects and you have.


Great questions. I'll answer each in turn.

I want to mention the "polish": I personally don't believe in releasing an open source project without polish. If it is missing docs, it's just not complete. If it is ugly, it is not complete. The technical aspects of Serf were done weeks ago. Getting the human side of things done took another few weeks (contracting designers and such).

> When did you realise the need for serf?

The need for something like Serf has existed since I started doing ops. Every time I hit something where I say to myself "why is this so hard/crappy", I write it down in my notebook for a future date. I then think on the idea for a while, and eventually, when I feel I have a significantly better solution than what is out there already, I build it.

I decided to start building Serf when @armon started throwing gossip protocol academic papers at me. I realized he had figured it out and this was clearly significantly better, so we started working on it.

> Did you work on it as a main project at some point or is it a side project?

To get it out the door we focused on it for some period of time. Now that it has shipped, it is still what I would consider a "main project", but time is split between various projects.

> When and how did you decide to commit to getting this done?

A few weeks ago. It took about a month to build. Building it is easy. Figuring out WHAT to build... took a long time. I have to say I've had "service orchestration/membership" in my notebook for years.

> What strategies did you use to keep the plates spinning while building Serf?

No good answer here, we just prioritize some things over others. Serf was our top priority this month.

Thank you - that "building was easy, compared to knowing what to build" put a lot into perspective. And reaching out to external people to build the polish is a surprise, but obvious in retrospect.

I am afraid that for such a helpful and clear answer, you get a mere 1 karma point from me - but thank you.

I'm still looking for a very simple system that would let me do a live redeploy with a blocking database schema migration in between.

For that I (think I)'d need a system that will:

* start off with X nodes live in the load balancer

* trigger redeploys on half (or so) of them

* when half of the nodes have been redeployed, block and perform the database migration, then trigger redeploys for the remaining servers

* after the migration completes, switch the load balancer over to the now-updated half of the servers

Is this something that you could orchestrate with Serf? Or am I looking in the wrong direction here?

You could build something like this on Serf, but it would take a little creativity. You might do something like this:

1) Send a "pre-deploy" event. Handler scripts use a random number generator to decide which group they are in ("flip a coin", basically).

2) Half the nodes should transition to the "left" state, do the deploy and rejoin the cluster.

3) Once this is done, trigger the migration.

4) Flip the LB to the nodes that did the Join/Leave (you can potentially distinguish them using different role names, or by tracking who left and joined)

5) Run the "post-deploy" event. The other half of the nodes should now deploy.

6) Update the LB to include everybody as the nodes leave/join

This is of course a rough sketch; it is certainly possible, if tricky, to build something like this.
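As a hedged sketch of what step 1's handler might look like: Serf exposes the event type and user-event name to handler scripts via the SERF_EVENT and SERF_USER_EVENT environment variables; everything else here (the "pre-deploy" event name, the deploy script path, the SEED_NODE variable) is an assumption for illustration, not a Serf convention.

```shell
#!/bin/bash
# Hypothetical handler for the "pre-deploy" user event from step 1 above.
# Serf invokes handlers with SERF_EVENT (event type) and, for user events,
# SERF_USER_EVENT (the event name) set in the environment.

if [ "$SERF_EVENT" = "user" ] && [ "$SERF_USER_EVENT" = "pre-deploy" ]; then
  # "Flip a coin" to pick a deploy group: 0 = first wave, 1 = second wave.
  group=$(( RANDOM % 2 ))

  if [ "$group" -eq 0 ]; then
    serf leave                 # transition to the "left" state (step 2)
    /opt/deploy/run.sh         # hypothetical deploy script
    serf join "$SEED_NODE"     # rejoin the cluster; SEED_NODE is assumed
  fi
fi
```

The coin flip means roughly, not exactly, half the nodes deploy in each wave; a deterministic split (e.g. hashing the node name) would give exact halves if that matters.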

The one thing that feels lacking to me, and maybe there's just something I don't get, is the ability to tag nodes with metadata. From the docs it seems like what you're expected to do is fire an event for a node to, e.g., declare itself a webserver, but this seems prone to failure in the long run. If I bring a new load balancer online, how does it find out what's already a webserver?

This might be a little unclear, but if you check the documentation for agent configuration (http://www.serfdom.io/docs/agent/options.html), there is an option to provide a role. The role is the extent of the metadata support currently.

Cool, I must have missed that. It would be nice, then, if you could also have a node be tagged with multiple roles (as well as add/remove them in a way that propagates).

Dynamic roles are a fairly hard problem, which we hope to solve by building a different tool on top of Serf in the near future. If you want multiple roles that are static, they can be simulated by providing a comma-separated value as the role (since that is just an opaque string value to Serf).
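A minimal sketch of the static workaround (the role names here are made up; the only Serf-specific piece is the agent's role option, which Serf treats as an opaque string):

```shell
# Start an agent whose single role string encodes two static "roles".
serf agent -role="web,cache"

# Since Serf never interprets the string, a handler script splits it itself
# to test membership; wrapping in commas avoids partial-name matches.
role="web,cache"
case ",$role," in
  *,web,*) echo "node is a webserver" ;;
esac
```

The comma-wrapping trick matters: a bare `*web*` pattern would also match a hypothetical "webdav" role, while `*,web,*` only matches the exact token.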

You need to do some repair on the website for iOS; the front page is white text on a white background.

Fixed. How does CSS work?

Since the main goal seems to be node discovery, can you compare why I might want to use this in addition to Salt? I've found salt's node discovery and targeting tools fast and powerful.

We have a comparison against Chef and Puppet, which may be relevant to Salt here: http://www.serfdom.io/intro/vs-chef-puppet.html. Not sure about Salt's search, but Serf is designed to run much more often than config management tools, and is able to handle topology changes in seconds instead of minutes or hours. Serf is also designed ground up to be fault tolerant, which is not usually a design goal of config management tools.

Looks like this might be loosely related to Mitchell Hashimoto [1], who makes Vagrant.

[1] https://twitter.com/mitchellh

Indeed, I'm one of the creators of the project. See here: http://www.serfdom.io/community.html :)

mitchellh 237 commits / 16,925 ++ / 9,149 --

from GitHub, so yeah

Both are HashiCorp projects.

The title needs to be improved. "A decentralized, highly available, fault tolerant solution..." for what? The title should include "for service discovery and orchestration".

Agreed, these tag lines come off as buzzword soup instead of informative. I would love a small quick scenario describing what serf can help prevent/enable.

You might find this useful: http://www.serfdom.io/intro/use-cases.html

Yes, that page helped me, but the front page should be the hook and it left me none the wiser.

It seems Serf and etcd are both trying to solve service discovery, but they are attacking it using different approaches. Which is pretty cool!

Serf looks to be eventually consistent and event driven. So you can figure out who is up and send events to members. This gives you a lot of utility for the use cases of propagating information to DNS, load balancers, etc.
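As a concrete sketch of that propagation path: for membership events, Serf pipes the affected members to the handler script on stdin, one per line, as name, address, and role. Everything else below (the "web" role, echoing instead of calling a real load-balancer API) is an assumption for illustration.

```shell
#!/bin/bash
# Hypothetical member-join handler that tells a load balancer about new
# web nodes. Serf writes one member per line to stdin: "name address role".
while read -r name addr role; do
  if [ "$role" = "web" ]; then
    echo "adding $name ($addr) to the load balancer"
    # ...call your load balancer's API here instead of echoing
  fi
done
```

A matching member-failed handler would remove the node, which is the whole DNS/LB use case in miniature.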

But you couldn't use Serf for something like master election or locks; you would need etcd or ZooKeeper for that. Serf and etcd aren't mutually exclusive systems in any sense; they're just solving different problems in different ways.

They have a nice write-up on the page here: http://www.serfdom.io/intro/vs-zookeeper.html

This would really benefit from a "how does this relate to ZooKeeper" section. I think this is an entirely new service, with different technical insides, trying to provide a higher-level solution to what people usually cobble together with ZK.

But I'd be interested in comments from someone knowledgeable.

Edit: I see this is addressed at http://www.serfdom.io/intro/vs-zookeeper.html but it would be nice to have something more "just the facts" rather than arguing that Serf is good.

In writing that section, we tried to provide "just the facts". If there is anything that seems wrong or misleading in any way, we'd like to know so that the page can be corrected. It is not our intention to say "Serf is good, ZooKeeper is bad". They are very different tools, and we are just trying to highlight the differences. In fact, we believe that the strongest use cases involve using those tools together.

I didn't mean to say that it sounded like a sales pitch. What I meant is that it's talking about relative strengths and weaknesses, whereas what I really need in order to understand Serf is more along the lines of how the API/model differs from ZK's notion of writing to or waiting on locations in the distributed space.

The package documentation for the `serf` library[1] looks really exciting. I've been wanting to make a distributed file synchronization tool, and perhaps this would be an excellent library to build it on.

Question: as a relative networking idiot, how does NAT traversal fit into all of this?

[1] - http://godoc.org/github.com/hashicorp/serf/serf

We designed the `serf` library to be easily embedded, so hopefully it can be of some use. Unfortunately, Serf does not currently make use of any sort of NAT traversal. We've open sourced the project hoping to get the community involved, and NAT traversal is something we'd gladly work with the community to get implemented.

The first building block, STUN, is implemented over here: https://github.com/ccding/go-stun

How does security work in a system like this. If this is used in a shared hosting system, can a user inject false messages into Serf with for example PHP?

Yes, they can. In the general case it's not an issue, because usually your nodes are inaccessible to the public; but if you're using a shared hosting environment, this is entirely possible.

We're addressing this in the next release by signing/encrypting gossiped messages. See the roadmap: http://www.serfdom.io/docs/roadmap.html

This is actually very cool and is something incredibly handy for managing failover/membership.

What I unfortunately don't understand is that there doesn't seem to be a library I can use to take advantage of this in my own application. If I have a program (in Go), am I expected to spin up my own Serf process and then communicate with it via socket? Is there an option to have Serf live inside my application?

The Serf executable is actually just a wrapper around the `serf` library. That library is designed to be embedded in Go applications. Documentation for the library is available here: http://godoc.org/github.com/hashicorp/serf/serf

What happens to your cluster when your network experiences intermittent packet loss and your random UDP messages get lost? Nodes just start going down and up randomly? (For those of you going "So what, that's normal": this is not a quality of an HA system.)

I'd highly recommend taking a look at this page: http://www.serfdom.io/docs/internals/gossip.html. One of the great attributes of the gossip protocol is it is very robust to intermittent network failures. Under minimal packet loss conditions (<5%), the rate of false positives should be very low. This is due to a few techniques, one of which is indirect probing, and another is a novel "suspicion" mechanism. In the case of a network partition, the parts of the cluster can run in isolation and will recover when the partition heals. If you are interested, the paper referenced there ("SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol"), is the foundation of Serf. In the paper you can find more details about the behavior of the cluster, false positive rates under packet loss, and partition handling.

tl;dr the system is in fact designed with network errors in mind, as opposed to handling them being an afterthought.

What you're saying is it's designed with the knowledge that it's going to cause false positives, and basically doesn't work well under anything more than minimal packet loss. I think this is probably an important factor to note in the description (and I still fail to see how this is considered highly available or fault tolerant, as described in Intro pages)

I think we are maybe just working with different definitions. High availability for Serf means that it can continue to handle changes in topology and deliver user events in the face of node failures and network problems. However, it is inevitable that there will be a degradation in its performance given network failures. If there are serious packet loss issues, Serf will mark a node as failed.

I'm not saying it "won't work well". It works as it is designed to. It will be available for operations, it will automatically heal when the partition recovers, and the state will be resynchronized with the "failed" nodes. The system will be in an eventually consistent state, which is expressly documented and is its normal mode of operation.

If you consider 5% packet loss "minimal", I'm not sure what applications you are running. TCP degrades at over 0.1% packet loss, and most UDP streaming protocols have serious degradation over 5%.

I'm still confused. You mention resynchronizing when the "partition" "recovers". First, can you clarify what a partition is? Second, can you define "recovery"? I'm not worried about performance degradation; I'm worried about nodes being marked down when they aren't down.

Please correct me if I'm wrong, but it sounds like this software only works reliably when you have two sets of nodes that suddenly can't communicate at all and are eventually reconnected. Sometimes that does happen on a real network, but often the cause of a failure is intermittent and undetermined for hours, days, or weeks. In that case, how would this program work? Would network nodes keep appearing and disappearing, triggering floods of handler scripts, loading boxes, and keeping services unavailable?

Yes, TCP performance does degrade under packet loss. It also continues to operate (at well over 50% loss) and automatically tunes itself to regain performance once degradation ends. And it does not present false positives.

It maintains its own state (ordered delivery), checks its own integrity, stands up to Byzantine events (hacking), and is supported by any platform or application. Unfortunately, due to its highly-available nature, it will eventually report a failure to an application if one exists. But if latency is more of a priority than reliability, UDP-based protocols are more useful.

If you're designing a distributed, decentralized, peer-to-peer network, that's cool! But I personally wouldn't use one to support highly-available network services (which is three out of the five suggested use cases for Serf)

They should add "from the folks who brought you Vagrant" to the top of the homepage.

Can we see Serf as a kind of Riak Core, but written in Go?

Riak Core provides a superset of the features of Serf. Riak Core uses gossip to manage membership, but it also provides quorums for coordination, and is based around the notion of a hash ring and virtual nodes. You could instead use Serf to build Riak Core-like technology on top.

Highly available?

Yes, the availability of the system is not tied to any given node(s). Any node (or group of nodes) can continue to operate in the face of failure.

This isn't much of a solution when it involves putting a binary on every host. Clearly, the best solution is a service framework at the platform level, not a separate, unmodifiable blob thrown on your hardware.
