
Postgres-XL: Scalable Open Source PostgreSQL-based Database Cluster - icholy
http://www.postgres-xl.org/
======
NDizzle
It's a huge pet peeve of mine, but why do people present network maps of these
huge distributed things featuring ONE SINGLE load balancer in front of it
all!?

~~~
joshpadnick
Actually, I'm curious. What are best practices for setting up redundancy at
the load balancer level? Options I've seen are:

\- Hot standby that detects when the master is down and takes over as master
when needed

\- DNS-level solutions that distribute across multiple load balancers

But the DNS has a TTL that may not be honored by all ISPs, so how do you
truly eliminate the single point of failure at the load balancer?

~~~
kilburn
DNS-level solutions are oftentimes misunderstood, probably because there are a
couple of things that can be done at this level:

1\. "Geo-DNS" is about using an anycast network to direct users to their
nearest datacenter(s). This does _not_ aid High Availability at all.

2\. DNS Round Robin is about distributing the load between multiple IPs. As a
load balancing solution it is relatively poor, because you have no control
over the actual balancing and can end up receiving most users through a single
IP.

3\. DNS Failover solutions that replace the IP when the server goes down,
which is also a poor solution because of TTL and non-TTL browser caches.

4\. DNS Round Robin but for the High Availability, not for the balancing. This
is actually an interesting approach because most modern browsers automatically
switch to using (one of) the other record(s) when the IP they were using goes
down (sorry, I have no reference that clearly states which browsers do and
which ones don't exhibit this behavior). In fact, there are some sources
around [1] that seem to identify this approach as the only one to achieve
instant failover in the face of datacenter-wide outages.

[1] [http://serverfault.com/questions/69870/multiple-data-
centers...](http://serverfault.com/questions/69870/multiple-data-centers-and-
http-traffic-dns-round-robin-is-the-only-way-to-assur)

~~~
agwa
I have experience using approach #4 for global high availability, and the main
downside is that most network outages result in packet loss rather than
returning an immediate ICMP error, so browsers will hang for about 30 seconds
before timing out and trying the next A record.
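
The hang-then-retry behavior can be sketched roughly as follows. This is a minimal illustration in Python, not how any particular browser implements it; `connect_first_reachable`, the 3-second default timeout, and the injectable `connect` parameter are all assumptions made for the example.

```python
import socket


def connect_first_reachable(addresses, port, timeout=3.0,
                            connect=socket.create_connection):
    """Try each address in order and return (address, connection)
    for the first one that accepts a connection.

    A host that silently drops packets costs the full `timeout`
    before the next address is tried, so the worst-case delay is
    roughly timeout * number_of_dead_addresses.
    """
    errors = []
    for addr in addresses:
        try:
            conn = connect((addr, port), timeout=timeout)
            return addr, conn
        except OSError as exc:  # covers timeouts and refused connections
            errors.append((addr, exc))
    raise ConnectionError(f"all addresses failed: {errors}")
```

With a 30-second timeout and one dead record ahead of a live one, the user waits the full 30 seconds; a shorter client-side timeout shrinks that, but only if you control the client.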

My personal preference is to combine approaches #3 and #4: have all hosts in
the round robin with a lowish TTL, but automatically remove any host that goes
down.
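
A rough sketch of the "remove any host that goes down" part, in Python. The function name, the `is_up` health-check callback, and the fall-back-to-everything rule are my own assumptions for illustration; pushing the resulting list to a DNS provider is left out because it depends entirely on that provider's API.

```python
def records_to_publish(hosts, is_up, min_records=1):
    """Return the A records that should stay in the round robin.

    hosts:       all candidate IPs, in a stable order
    is_up:       hypothetical health-check callback, IP -> bool
    min_records: if fewer hosts than this pass the check, publish
                 the full list instead (assumed safeguard against a
                 broken monitor emptying the record set)
    """
    alive = [h for h in hosts if is_up(h)]
    if len(alive) < min_records:
        return list(hosts)
    return alive
```

Clients that cached the old record set still fall back on approach #4 until the TTL expires, which is why the two approaches complement each other.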

------
mrweasel
I understand that doing a database cluster is a hard problem and even harder
to make easy to use, but this seems really complicated to set up. I'm not a
fan of needing "transaction managers", "coordinators" and data nodes, I would
like these to all run on the same server.

Of course Postgres-XL may not exactly be designed for my use case, where I just
want to be able to have one node fail or take a server down to do upgrades.
Ideally it would be as simple as the built-in PostgreSQL replication has
become.

~~~
TkTech
RethinkDB is an interesting project which is exceptionally simple to cluster -
it's actually their "Hello World!" example.

[http://rethinkdb.com/](http://rethinkdb.com/)

~~~
ddorian43
RethinkDB doesn't even have single-node transactions, while Postgres-XL has
inter-node transactions

and sharding is currently primary-key only

------
hcarvalhoalves
The page is light on details, what is this exactly? A fork of Postgres, or
something you run on top to orchestrate clustering?

~~~
icholy
Postgres-XC + Postgres-R + StormDB

[http://www.theregister.co.uk/2013/10/10/translattice_stormdb...](http://www.theregister.co.uk/2013/10/10/translattice_stormdb_acquisition/)

~~~
twic
Postgres-R! I knew about Postgres-R, but i thought it was a research project
which had never really gone anywhere. I am highly intrigued to learn that it
is still in play. From what i remember, there were some terrifically clever
ideas about multi-master databases at the heart of it. I don't think there's
anything else quite like it in the open source world.

~~~
mason_s
Postgres-R bits have not made their way into Postgres-XL, but some parts of
Postgres-XC have. Some XC code is merged in to make future merging into
Postgres-XL easier, even parts that are not used.

Some parts of Postgres-R are in the TransLattice Elastic Database (TED), but
that is a proprietary closed source product, sorry. (I work for TransLattice,
who also open sourced Postgres-XL.)

~~~
twic
Ah, i see. It's a shame that no descendant of Postgres-R is open source, but
i'm still pleased to hear that there is one, even if it's proprietary. Thanks
for the information.

------
nusbit
Has anyone used this in production?

------
barosl
Oh, I had thought the link was about Postgres-XC. So it's a different project.

"The project includes architects and developers who previously worked on both
Postgres-XC and Stado, and Postgres-XL contains code from Postgres-XC. The
Postgres-XL project has its own philosophy and approach."

So should I see a brighter future for Postgres-XL rather than Postgres-XC? It
seems that the Postgres-XC repository is still receiving some commits, though
the rate is quite low.

~~~
techdragon
From what I can see, Postgres-XC is the original project but Postgres-XL is
sort of a 'stable fork'.

It's very murky and, to be honest, very frustrating. I just want a Postgres that
scales and doesn't cost me $$$ like FoundationDB does. The Postgres-XC team is
far too insular and doesn't seem to care about outside users. I hope Postgres-XL
can inspire them to push ahead, and then hopefully we can see a Postgres
9.4-based clustering solution for write scalability. (And no, Pgpool-II
doesn't count.)

~~~
barosl
I'm completely new to both Postgres-XC and Postgres-XL, and just started
looking at the documentation. It seems that the main structures of the two are
similar, except for some features implemented in Postgres-XL, like allowing
node-to-node data transfer.

It is natural, because they've just forked the project. But I think if there
are some _gotchas_ in the original project, Postgres-XL would also expose
similar problems. What have been your main concerns using Postgres-XC? Any
design issues?

~~~
mason_s
The planner and executor in Postgres-XL are quite different, and direct
node-to-node data transfer for MPP parallelism results in big performance gains.

Any bugs in XC in the planner and executor would not show up in XL (that is
not to say that we in Postgres-XL do not have any of our own bugs :-)).

Bugs related to GTM could show up in both, but in Postgres-XL additional
precautions have been taken to reduce the likelihood of problems.

------
hardwaresofton
The overview doesn't provide much actual info, but does "Global Transaction
Monitor" sound like "single point of failure" to anybody?

I wonder why someone can't just add etcd to postgres and create self-
sufficient distributed postgres nodes (I am oversimplifying, surely)

~~~
dragonwriter
> The overview doesn't provide much actual info, but does "Global Transaction
> Monitor" sound like "single point of failure" to anybody?

Since you can have more than one of them, no.

~~~
hardwaresofton
Right, but generally, why not just integrate that functionality with the
actual data nodes?

In the diagram they have 2, but where you have two, you can add more for more
reliability, right? Follow that thinking down the rabbit hole and you've
got distributed transaction monitoring across the data nodes (paxos/raft)...
Just wondering why that wasn't the goal.

------
philtar
For anyone who wants to help: I've been looking at a bunch of different ways
to achieve high-availability postgres. What approach would you all suggest for
a solo startup on a limited budget?

~~~
michaelbuckbee
I'm worried that this is going to sound flippant (not my intention), but use
Heroku. You're under significant time constraints with being a solo founder
and it would most likely be a win to just not have to worry about this one
technical aspect.

~~~
gaadd33
Why recommend Heroku for that as opposed to AWS RDS? The pricing difference
seems to be significant (Heroku "Ika" is $750/mo, db.m3.large multi-az is
$280/mo).

~~~
michaelbuckbee
Well, the main reason was that I didn't realize that AWS RDS was offering
Postgres now (looks like it is still in beta, but awesome news nonetheless).

------
yunong
The site is a little thin on details and internals. I would be very interested
to learn more about how they handle failovers whilst still guaranteeing
"strong consistency".

~~~
mason_s
Failovers are currently handled outside of Postgres-XL.

Consistency is achieved because all of the nodes use the same transaction ids
and snapshots (list of running transactions), via the Global Transaction
Manager. There is no need to worry about statements being executed in a
different order across the nodes.
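
As a toy model of that idea (not the actual GTM code or protocol): one central object hands out transaction ids and snapshots, and every node evaluates visibility against the same snapshot. The class and method names are invented for this sketch, and it ignores aborts, subtransactions and everything else a real implementation has to deal with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmax: int            # first transaction id not yet assigned
    running: frozenset   # ids still in progress when the snapshot was taken

    def sees(self, xid):
        """A transaction's writes are visible if it started before the
        snapshot and was no longer running when the snapshot was taken."""
        return xid < self.xmax and xid not in self.running


class ToyGTM:
    """Single source of transaction ids and snapshots for all nodes."""

    def __init__(self):
        self._next_xid = 1
        self._running = set()

    def begin(self):
        xid = self._next_xid
        self._next_xid += 1
        self._running.add(xid)
        return xid

    def commit(self, xid):
        self._running.discard(xid)

    def snapshot(self):
        return Snapshot(xmax=self._next_xid,
                        running=frozenset(self._running))
```

Because each datanode is handed the same `Snapshot`, `sees(xid)` gives the same answer everywhere, regardless of the order in which the statements physically arrive at each node.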

------
rdtsc
Looks good. I wish Aphyr would unleash his Knossos on it. That would be very
informative. In other words, I wonder how it handles network partitions...

------
polskibus
Hopefully we'll find out more on June 17th: [http://www.postgres-
xl.org/2014/06/learn-about-postgres-xl-i...](http://www.postgres-
xl.org/2014/06/learn-about-postgres-xl-in-nyc-on-june-17-2014/)

I can't wait!

------
based2
Does it support a kind of Oracle RAC TAF Select?

[http://www.dba-oracle.com/art_oramag_rac_taf.htm](http://www.dba-
oracle.com/art_oramag_rac_taf.htm)

------
fjordan
From the description it sounds as if this provides synchronous replication.
How does this work in a WAN environment?

~~~
mason_s
Well, "synchronous replication" is used in a couple of ways here.

Tables can be designated as replicated or distributed (sharded). Replicated
tables are typically fairly static. These are handled synchronously in the
cluster on every datanode where the table resides. Actually, it first applies
to a designated "primary" node, and upon completion, it will execute on the
other nodes. The reason for this is to reduce the chance for deadlocks; if it
succeeds on the primary, it has obtained all of the needed locks and we can be
sure we can then get the locks on the other nodes.
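
The primary-first ordering can be sketched like this in Python. It is a simplified illustration rather than Postgres-XL's actual code: `write_replicated` and the `apply` callback are hypothetical names, and real locking, two-phase commit and error recovery are left out.

```python
def write_replicated(statement, primary, others, apply):
    """Apply a write to a replicated table: primary node first,
    then every other node that holds a copy.

    apply(node, statement) is a hypothetical callback that runs the
    statement on one node and raises if it cannot get its locks.
    Returns the nodes in the order they were written.
    """
    # Every writer contends on the primary first, so two concurrent
    # writers meet on the same node in the same order instead of each
    # holding a lock the other needs on a different node.
    apply(primary, statement)
    applied = [primary]
    for node in others:
        apply(node, statement)
        applied.append(node)
    return applied
```

If the statement fails on the primary, the exception propagates before any other node has been touched.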

In addition, the term synchronous replication is also used in the sense of
PostgreSQL's own replication: in Postgres-XL, a datanode can have a standby
replica that is kept up to date synchronously. It is a warm, non-queryable
standby.

With regards to a WAN environment, Postgres-XL is not designed for that,
because of the latency and because it is not built to handle long network
outages. If there is enough interest, that could be an area to enhance in the
future, but consistency may be compromised.

