
Postgres high-availability cluster with auto-failover and cluster recovery - tuyguntn
https://github.com/nanopack/yoke
======
angelbob
Can we just start posting "Jepsen or it didn't happen" on these?

Clustered relational DBs are really remarkably hard to get right. A shocking
percentage of projects and vendors don't even make a serious try (cf.
Galera).

This project may or may not have done a reasonable job (no clue), but the lack
of information about how they tested it is a bad sign.

~~~
antirez
Most (but not all) MySQL / PostgreSQL HA systems have semantics similar to
Redis Sentinel. They don't try to achieve any linearizable semantics; they
perform failure detection and fail over to some slave, with heuristics about
which slave to pick. All solutions of this kind will fail Jepsen tests, since
they are not designed to achieve the kind of strong properties Jepsen tests
for. The behavior of those systems is acceptable in many production
situations (it's up to the product requirements whether it is ok that a given
set of failure modes produces data loss and/or inconsistencies), but the
important bit is that the semantics are very well documented, so that users
know what tradeoffs they are making.

EDIT: Shameless plug: I believe it was a shame that Sentinel was initially
badly received because of the Jepsen test (fortunately, tons of users are now
using Sentinel with success), since for what it does it is very advanced
compared to * SQL failover solutions, so a port of Redis Sentinel to fail
over those systems could make sense IMHO. A few things Sentinel has that are
desirable in * SQL failover systems:

1\. It's distributed so Sentinel itself is not a single point of failure.

2\. When the partition heals Sentinel is able to have a coherent view of the
configuration.

3\. It is able to automatically reconfigure the old master and the other
slaves to replicate from the new master.

4\. It is able to work as a configuration provider for clients, with a well
defined handshake in order to trigger the reconfiguration of all the clients.
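For readers unfamiliar with the Sentinel model, point 1 rests on quorum-based
failure detection: no single monitor can trigger a failover on its own. A
rough sketch of that idea, with hypothetical types and names (this is not
Sentinel's actual protocol, just the shape of the decision):

```go
package main

import "fmt"

// vote records whether one independent monitor process currently
// considers the primary unreachable.
type vote struct {
	monitor string
	down    bool
}

// primaryObjectivelyDown reports whether at least `quorum` monitors
// agree the primary is down (Sentinel calls this state ODOWN). Only
// then does a failover start, so a single partitioned monitor cannot
// promote a replica by itself.
func primaryObjectivelyDown(votes []vote, quorum int) bool {
	n := 0
	for _, v := range votes {
		if v.down {
			n++
		}
	}
	return n >= quorum
}

func main() {
	votes := []vote{
		{"sentinel-a", true},
		{"sentinel-b", true},
		{"sentinel-c", false},
	}
	fmt.Println(primaryObjectivelyDown(votes, 2)) // true: two of three agree
}
```

Points 3 and 4 then follow from the same agreed-upon view: once a quorum has
elected a new master, the old master and the remaining slaves (and the
clients) are all reconfigured toward that single configuration.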

~~~
lucian1900
A large reason to use Postgres in the first place is because it's a good CP
system. Any clustered version is only worthwhile if it preserves consistency.

It's already possible to use more than one Postgres database while not
maintaining consistency, but that isn't desirable for most users.

~~~
antirez
> it's a good CP system

PostgreSQL is not clustered by default, so it is a CP system only _from the
point of view of the theory_. Most _proper_ CP systems have the ability to
provide a limited form of availability: they are available in the majority
partition (this is not the "A" of CAP, but it is a lot better than the whole
cluster becoming unavailable whenever the primary is). If you use PostgreSQL
with asynchronous replication and a best-effort failover solution to provide
some form of availability, you are doing something that makes sense, assuming
your product can tolerate the resulting weak consistency properties.

This is a great example of how CAP does not capture certain important
semantics of a system. Technically a single primary node is CP, but real
world CP systems have better availability than that, or they are useless (as
CP systems).
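Concretely, "available in the majority partition" usually means each node
derives a quorum from the cluster size and only serves writes while it can
see a majority of its peers. A trivial illustrative sketch (hypothetical
names, not any particular system's implementation):

```go
package main

import "fmt"

// quorum returns the majority threshold for a cluster of n nodes.
func quorum(n int) int { return n/2 + 1 }

// canServeWrites reports whether a node that can currently reach
// `reachable` nodes (counting itself) sits in the majority partition.
func canServeWrites(reachable, clusterSize int) bool {
	return reachable >= quorum(clusterSize)
}

func main() {
	// In a 5-node cluster split 3/2, only the 3-node side keeps serving.
	fmt.Println(canServeWrites(3, 5)) // true
	fmt.Println(canServeWrites(2, 5)) // false
}
```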

------
tylerflint
Hey everyone, I work for nanobox and on the yoke project. I just woke up and
noticed the thread. The interest is appreciated. There are a lot of great
questions here, and I'll try to do my best to answer them individually. Please
keep in mind that the nanopack cloud-initiative was just announced friday
([https://blog.nanobox.io/nanopack-a-new-vision-for-
automated-...](https://blog.nanobox.io/nanopack-a-new-vision-for-automated-
infrastructure/)), so we're working feverishly to get documentation,
tutorials, and demos available. Any help or contributions would be greatly
appreciated.

------
jpgvm
This is inferior to the Manatee design:
[https://github.com/joyent/manatee](https://github.com/joyent/manatee)

Funnily enough I ran into this yesterday when Mist was posted and I looked at
what other projects were under the umbrella.

~~~
tylerflint
Much of the architecture of yoke was inspired by manatee. In fact, we've spent
the last 2 years working with Joyent on smartos and illumos, so we were
heavily inspired by the great engineering behind manatee.

Manatee didn't work for our use cases, which should not be interpreted as
"manatee is bad". Here are a few of the reasons we forged a different
solution:

\- While node.js is a great platform for many things, we didn't want the
overhead of running the node.js/v8 runtime alongside postgres. 50MB might
seem negligible, but with thousands of postgres clusters for our clients
every MB adds up.

\- We didn't want the administrative overhead of managing a zookeeper cluster,
and instead built the cluster management semantics directly into the yoke
project.

\- This may be different now, but at the time manatee was very heavily
integrated into illumos/smartos. We really love smartos, but recognize that
not all clients are able to use this tech.

~~~
lobster_johnson
You guys seem to be creating a bunch of great little tools, but I wish you
would spend more time on documentation.

For example, in this case it's not at all clear how the cluster management
works, or even what the semantics _are_.

~~~
tylerflint
I couldn't agree more. We literally just introduced the nanopack cloud-
initiative on friday ([https://blog.nanobox.io/nanopack-a-new-vision-for-
automated-...](https://blog.nanobox.io/nanopack-a-new-vision-for-automated-
infrastructure/)). Bear with us, we'll get there!

------
secure
similar:

[https://spilo.readthedocs.org/en/latest/DESIGN/](https://spilo.readthedocs.org/en/latest/DESIGN/)

[https://github.com/sorintlab/stolon](https://github.com/sorintlab/stolon)

I wonder how they compare to yoke.

~~~
navls
If I had a penny for every HA postgres solution

~~~
Rafert
I'm no Postgres expert; why isn't something like this built into the product
itself?

~~~
Jweb_Guru
To discourage people from using something broken. You're talking about a
database that waited until 2011 to add a true SERIALIZABLE implementation,
because the existing solutions all sucked, and only did it then because of
new research (SSI). They have zero qualms about keeping features out of the
database, no matter how useful, if they can't be implemented well.

~~~
notpeter
Forget serializable, PostgreSQL 7.0 included functional ACID-safe transactions
5 years before MySQL but didn't include LEFT OUTER JOIN
([http://www.postgresql.org/docs/7.1/static/release-7-1.html](http://www.postgresql.org/docs/7.1/static/release-7-1.html))
. That was my first taste of "if you can't do it right, don't ship it."

------
abritishguy
I can guarantee that this cannot guarantee both of:

\- system stays online when any server goes down

\- after confirming a write it will always be returned in corresponding reads
following a failover

------
atmosx
How does this work?

By skimming the installation instructions I deduce that every time a 'write'
call hits the primary server, the primary server sends a message to the
'monitor' server, which sends a message to the primary or secondary server(s)
to run rsync?

~~~
acveilleux
It's really not documented well (and the generated docs are a joke), but from
what I can gather, it's bog-standard PG replication with a Go app to monitor
the replication and do automated failover.

The monitor node appears to be there to act as neutral witness node so that a
secondary is only promoted leader if it can reach the monitor _and_ the
monitor cannot reach the current leader. Hopefully, the leader also commits
suicide when it cannot reach the monitor.
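If that reading is right, the promotion rule is a two-condition check. A
sketch of the guessed-at behavior, with hypothetical names (this is not
yoke's actual code):

```go
package main

import "fmt"

// shouldPromote: a standby promotes itself only when it can still reach
// the witness (monitor) AND the witness has lost contact with the leader.
// Requiring both conditions prevents a split brain in which a merely
// isolated standby promotes while the leader is still serving writes.
func shouldPromote(standbySeesMonitor, monitorSeesLeader bool) bool {
	return standbySeesMonitor && !monitorSeesLeader
}

// leaderShouldDemote: the leader steps down when it loses the witness,
// since the standby may be promoting on the other side of the partition.
func leaderShouldDemote(leaderSeesMonitor bool) bool {
	return !leaderSeesMonitor
}

func main() {
	fmt.Println(shouldPromote(true, false))  // true: safe to promote
	fmt.Println(shouldPromote(false, false)) // false: standby is isolated
	fmt.Println(leaderShouldDemote(false))   // true: fence the old leader
}
```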

MS SQL server implements this pattern and calls it "Database Mirroring
Witness": [https://msdn.microsoft.com/en-
us/library/ms175191.aspx](https://msdn.microsoft.com/en-
us/library/ms175191.aspx)

------
lamby
How likely is this to be merged upstream? There's a glut of reasonable-to-
good third-party solutions which I'm always hesitant to commit to, as they
are just Some Startup's solution.

Not saying it doesn't work for them or that it couldn't work for me, but I do
feel it's justifiable to deduct a few "points" there.

~~~
lobster_johnson
Zero chance. For one, it's written in Go. Postgres is written in C and needs
to be portable to a lot more architectures than Go currently supports.

Secondly, this is one possible design. The Postgres team prefers tools like
these to complement the core so that people have a choice between different
solutions depending on their use case, network topologies, availability
tolerances, etc.

This is just a helper tool that can control Postgres and ship logs around. A
native Postgres solution could do it better.

------
tbarbugli
no docs no party

~~~
tolmark12
[https://godoc.org/github.com/nanopack/yoke](https://godoc.org/github.com/nanopack/yoke)

~~~
acveilleux
That is not documentation; that's a bunch of code comments and signatures
turned into a web page. You cannot tell from it how the system works, what
the failure modes are, or how to recover from them. Those are kind of useful
things in a failover management system.

