PGO: The Postgres Operator from Crunchy Data

mdasen · on June 26, 2022

https://www.crunchydata.com/developers/terms-of-use

Before using Crunchy Data, I'd read their terms of use.

"without an active Crunchy Data Support Subscription or other signed written agreement with Crunchy Data are not intended for... using the services provided under the Program (or any part of the services) for a production environment, production applications or with production data"

https://hub.docker.com/r/crunchydata/crunchy-postgres

If you look at their Docker Hub images, you'll see that they're provided under the terms of use of the Crunchy Data developer program which means you can't use them in production without an active subscription.

Maybe I'm reading it wrong, but if that's the case Crunchy Data should definitely change their terms of service.

https://www.percona.com/blog/2021/05/26/percona-distribution...

Percona certainly seems to think you can't use the Crunchy Data images in production saying, "CrunchyData container images are provided under Crunchy Data Developer Program, which means that without an active contract they could not be used for production."

PeterZaitsev · on June 26, 2022

CEO Percona Here.

It is more than "Percona thinks". We have a number of customers who started using the Crunchy Kubernetes Operator based solution thinking it was open source and were then contacted by the Crunchy sales team, who told them they needed a subscription to use it.

This was one of the reasons for them to move to the Percona Operator for PostgreSQL, which does not require any commercial relationship with Percona to use in practice and is completely open source.

https://www.percona.com/doc/kubernetes-operator-for-postgres...

gbartolini · on June 26, 2022

CloudNativePG Maintainer and VP of CloudNative at EDB here.

We decided to go even further with the CloudNativePG operator.

EDB, as the original creator, has decided to donate the intellectual property of the source code to the community, and has open sourced the existing operator under the Apache License 2.0 in order to apply for the CNCF Sandbox. The project includes not only the operator but also the PostgreSQL operand images, which can be customized (we provide details on how the images should be built).

We genuinely welcome other vendors to participate in the community and contribute to the project, including by offering professional services around it. Our multi-year commitment is to become a graduated CNCF project.

For more information: https://cloudnative-pg.io/

snuxoll · on June 26, 2022

I just noticed you guys are finally working on an operator for MySQL (not just XtraDB), I’ve been waiting for something like this forever after numerous false starts from Oracle, MariaDB, and some well intentioned community led projects that never really got off the ground. Kudos to your team!

PeterZaitsev · on June 26, 2022

Thank you!

Yes. We already have a development version available. Check it out and give us some feedback on what else needs to be done to make it production ready.

nwmcsween · on June 27, 2022

The Percona pgo fork is quite a few versions behind the Crunchy Data pgo, which has more declarative configuration. This is what ultimately made me reconsider Percona pgo.

988747 · on June 26, 2022

I think this is more of a disclaimer that they are not responsible for any damage to your production systems if you do not use their support. That's why the wording is "are not intended for..." and not "you are prohibited from..."

mdasen · on June 26, 2022

I'd also note that the agreement doesn't provide a license for non-development purposes. "Crunchy Data provides access to Crunchy Developer Software free of charge for development purposes", but there's nothing that says that they provide access for other purposes.

Basically, the license seems to be (and IANAL): we provide the software for development purposes...without a subscription, it's only intended for development purposes

They didn't say "we provide this software for your use...without a subscription, it's only intended for development purposes". They basically said: we provide this software for development purposes, it's only intended for development purposes.

If it's only a disclaimer, where's the grant of rights to use the software beyond development purposes?

Yes, if I were a lawyer defending a user, I'd definitely be arguing your point. However, I think Crunchy Data's lawyer would simply point out that there's literally no grant in the license for non-development purposes. Maybe a judge would take pity on you given that it seems hidden, has some ambiguity (though maybe it's not ambiguous to a lawyer), and because they allow you to spin up the operator basically without ever knowing these terms exist.

Given that there are many other PostgreSQL operators from companies like Percona (which I think has a great and long track record of supporting open source databases), EnterpriseDB, and Zalando, I don't see why I'd want to choose Crunchy Data.

You might be right and I think you would be right if only looking at the piece I quoted, but given that there's no general grant in the license that the "intended for" is merely a disclaimer for, it seems like the license grant is that they provide access for development purposes. IANAL and I'd rather work with software and companies where I don't have to be a lawyer. Crunchy Data could have said "We provide access to this software free of charge for any purpose. Without a subscription, it is unsupported and not intended for production use. We are not responsible for anything that happens if you use it in production." That's not what they said. They said that they provide it "free of charge for development purposes" with no grant for non-development purposes.

sbuttgereit · on June 26, 2022

I agree that the developer program ToS is written in a confusing manner and suggests that you can't use it at all for production without a license. I expect that to be intentional: if the impression you get is that you can't even use their software in production, it's much less likely that you will do so with the impression that there is any support, warranty, or endorsement of such use.

[EDIT: I made an error, I originally said that it was explicit about no license being granted for the software, but that was about things like no license for service marks and trade names.]

On the other hand, there is a license for postgres-operator and it looks like Apache 2.0:

https://github.com/CrunchyData/postgres-operator/blob/master...

For what's being distributed in their Docker container image, I imagine it depends on what they're actually distributing for that to matter. I expect that it's mostly other people's software (like PostgreSQL) and that the Docker page listing the container is just too much of a summary to say anything about licensing. I'd investigate that further to clarify license status prior to use, but expect it to not be legally constrained to non-production use only.

vbezhenar · on June 26, 2022

Another popular alternative: https://github.com/zalando/postgres-operator

New player from Enterprise DB: https://github.com/cloudnative-pg/cloudnative-pg

ahachete · on June 26, 2022

If building a list of alternatives, let me do a shameless plug for StackGres [1], the Postgres platform for Kubernetes with a fully featured Web Console, AMD64 and ARM64 support and more than a hundred available Postgres extensions [2]. Fully open source, no usage restrictions.

[1] https://stackgres.io/

[2] https://stackgres.io/extensions/

Disclaimer: founder of the project.

knewter · on June 26, 2022

Second. StackGres is phenomenal

bo0tzz · on June 26, 2022

+1 for cnpg - I couldn't get any of the other operators to work well for me, but cnpg has been a blast.

gbartolini · on June 26, 2022

Thank you!

loganloganlogan · on June 27, 2022

Another great resource is https://operatorhub.io/ which lets you search by certain parameters, as well as https://dok.community/landscape/.

gbartolini · on June 26, 2022

One clarification about CloudNativePG. It is not EDB's anymore (or EnterpriseDB if you prefer).

EDB is the original creator. The software is now entirely owned by a vendor neutral community, openly governed. We have applied for the CNCF Sandbox and are waiting for approval at this stage.

qeternity · on June 26, 2022

Worth noting that the Crunchy operator is based on Patroni, which is maintained by Zalando.

mdasen · on June 26, 2022

https://www.percona.com/doc/kubernetes-operator-for-postgres...

Percona also has a PG operator.

sylvainkalache · on June 26, 2022

In case some of you are interested in learning more about running databases on Kubernetes, or more generally stateful workloads, there is a community dedicated to that https://dok.community/

I am part of the community team. We have weekly live streams/blog posts about the topic and a Slack channel.

kmarc · on June 26, 2022

I used this in the past. Helped a lot with provisioning micro-pg-clusters.

But honestly, the project always felt like a one-man show: some (really great) dev had a working set of scripts and kubernetized / operatorized it, but the whole thing feels hacky as hell.

I'd still give it a try if I ever needed postgresql again, but I would also know that I need to implement (again) my set of scripts and hacks on top of it.

gbartolini · on June 26, 2022

Maintainer of CloudNativePG here.

Back then, we evaluated the Crunchy Operator's source code. Its being primarily imperative and using an external tool for failover were the two main reasons we decided to start a new project in 2019, one that was entirely declarative and based purely on the Kubernetes API server for cluster status. That project was released as open source last April under the name CloudNativePG, and hopefully it will enter the CNCF Sandbox soon (fingers crossed).
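
To illustrate the imperative versus declarative distinction drawn above, here is a minimal sketch of a declarative reconcile step. It is not taken from CloudNativePG or any other operator mentioned in this thread; `ClusterSpec`, `ClusterStatus`, and `reconcile` are hypothetical names, and the "actions" are plain strings rather than Kubernetes API calls. The idea is only that the user declares a desired state and the controller repeatedly computes what is needed to close the gap with the observed state.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state as declared by the user (hypothetical field)."""
    instances: int


@dataclass
class ClusterStatus:
    """Observed state as a controller might read it back (hypothetical field)."""
    ready_instances: int


def reconcile(spec: ClusterSpec, status: ClusterStatus) -> list[str]:
    """Return the actions that move the observed state toward the desired one."""
    diff = spec.instances - status.ready_instances
    if diff > 0:
        # Too few instances: create the missing ones, numbered upward.
        return [f"create replica {status.ready_instances + i + 1}" for i in range(diff)]
    if diff < 0:
        # Too many instances: delete the highest-numbered ones first.
        return [f"delete replica {status.ready_instances - i}" for i in range(-diff)]
    # Observed state already matches the declaration: nothing to do.
    return []
```

Calling `reconcile` again once the state has converged returns an empty list, which is the property that makes a declarative loop safe to run repeatedly.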

smartbit · on June 27, 2022

> the whole thing feels hacky as hell

Couldn’t agree with you more, especially if you’re used to the quality of strimzi.

kmarc · on June 27, 2022

Did we work on the same project? I was also responsible for integrating strimzi (redhat amq streams) and gosh that's a quality piece of software!

Johnnyq72 · on June 27, 2022

CloudNativePG community member here.

I think this is a super interesting space. What I mean by that is that we have Postgres, which is arguably one of the most successful publicly governed open source projects out there, and has been for many decades! That meets one of the most vibrant and transformative publicly governed projects as well, called Kubernetes (CNCF).

Why then consider a (proprietary) vendor governed (open source) project to bring these two technologies together? With CloudNativePG, you bring these two super strong communities together using these exact same governance principles, enabling everyone to benefit and contribute.

It is my conviction that this is going to be one of those elements that is going to contribute to the ongoing transformation of data management today.

Johnnyq72 · on June 27, 2022

For those who would be interested, I wrote a bit about it here: http://jk-consult.nl/a-new-north-star-has-risen/

noodlesUK · on June 26, 2022

How does this compare with something like kubegres?

https://www.kubegres.io/

hardwaresofton · on June 26, 2022

Have chatted with the guy who builds kubegres, it’s a really solid solution and is meant to be better integrated (CRDs) and more of the reliable parts of the pg stack.

It looks super solid. Unfortunately I can’t vouch for it in production yet since I still write my own resources but for me Zalando is #1 and Kubegres is either 2nd or 3rd.

https://www.reddit.com/r/PostgreSQL/comments/mqrsbn/kubegres...

smartbit · on June 27, 2022

Could you elaborate why Zalando is #1?

We abandoned Zalando after failing to set it up properly. Don’t remember the details, but do remember that the setup is leaning towards the internal infrastructure at Zalando.

hardwaresofton · on June 27, 2022

I personally preferred Zalando for the following reasons (this is from quite early on, when Zalando and Crunchy were the only options and differed more in approach):

- Focus on CRDs versus outside tools (crunchy used to insist on pgo and while it's not required, I don't really want another CLI to manage/use)

- Amount of open conversation, conference talks etc around Zalando's solution and why they've built their solution/how they've scaled it

- No specific need for pgadmin/badger/extra logging features (I might feel differently these days!)

- Customizable pods

These days both of these projects are very similar overall, and I haven't compared them recently.

Looking back at my notes I had this written down:

> Looks like Zalando is probably the winner (https://www.youtube.com/watch?v=WBdNVrffOSo)

The video is old and in Japanese but basically:

- scale in/out are similar

- RBAC control is similar

- Automatic version upgrades (only Zalando supported rolling update at the time)

- Zalando had PVC-driven volume resize vs Crunchy required cluster change

It's been a year, so they're probably even more similar now, but I'd love to hear what I'm missing if I'm wrong.

Some more resources:

https://blog.flant.com/comparing-kubernetes-operators-for-po...

https://www.youtube.com/watch?v=3TFXztwat_s

These days I might lean Crunchy -- but a lot of the things that would be gaps (like no built in postgres support) I would actually opt to solve myself with a different operator (ex. Prometheus Operator) or just adding a deployment myself (when needed). Taking another quick look today, the differences I see are:

- customizable tablespaces with Crunchy

- local backups with pgbackrest + crunchy (and in general wal-g is preferred to wal-e so zalando is a little behind there)

Neither of them has Citus though, which I think will be a huge game changer as a default-include for their pg images.

gbartolini · on June 26, 2022

As maintainer of the project, I suggest looking at CloudNativePG, which is production ready (cloudnative-pg.io).

hardwaresofton · on July 1, 2022

Super late, but just got a chance to take a look -- maybe worth a mention that it's started by you folks over at EDB!

Feels like all the Postgres enterprises are making kubernetes operators these days -- probably seeing it as a way to stay in the game/relevant.

Just like before, when I benefited from EDB's contributions to pg as a whole (and advancing the cause of pg), I look forward to benefiting again!

uberduper · on June 26, 2022

I cannot trust critical infrastructure components to 3rd party kubernetes operators. I don't like operators (I didn't write) in general because I find them too opinionated and obfuscated. It's unbelievable to me that someone would deploy a production datastore using one.

gbartolini · on June 26, 2022

Can you please elaborate about obfuscation?

Regarding being opinionated, I believe that is what we expect from an operator. An operator simulates what human DBAs, in this case, would do. I am a maintainer of CloudNativePG, and I have been running and supporting PostgreSQL in production for 15+ years, and have also created another piece of open source software for backups (Barman). In CloudNativePG we have basically translated our recipes into Go code and tests.

Many people believe that databases should not run in Kubernetes. I not only believe the opposite, I believe that running Postgres in Kubernetes represents the best way, potentially, to run Postgres out there.

uberduper · on June 26, 2022

What I've seen from teams that use operators is that nobody ends up understanding how to manage the situation when something goes awry.

I've run into way too many exotic edge cases with kubernetes to trust an operator to do the right thing with data I care about. Most especially when the operator is also managing the replication and replicas and their underlying storage.

I am pro running datastores and other stateful workloads on kubernetes. I've been running databases on kubernetes since petSets.

gbartolini · on June 27, 2022

I agree with your view point and concerns.

That is why we took the approach to reduce the number of components and integrate everything in Kubernetes, especially with logging (we directly log in JSON to standard output) and the usage of application containers, which enables us to cover the case of troubleshooting via the fencing mechanism (your pods are up, you can access storage, but Postgres is down, giving you the possibility to check even possible data corruption issues).

Also, the status is directly available in Kubernetes, so in our view it is easier for Kubernetes administrators.

Finally, the source code is open source and directly available for inspection - if you want to understand what is happening.
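
As a rough illustration of why JSON logs on standard output are convenient for troubleshooting, here is a minimal sketch that filters a stream of log lines by level. The `"level"` and `"msg"` field names are assumptions made for the example, not a documented CloudNativePG log schema, and `records_at_level` is a hypothetical helper.

```python
import json
from typing import Dict, Iterable, Iterator


def records_at_level(lines: Iterable[str], level: str = "error") -> Iterator[Dict]:
    """Yield parsed JSON log records whose (assumed) "level" field matches."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Tolerate blank or non-JSON lines mixed into the stream.
            continue
        if isinstance(record, dict) and record.get("level") == level:
            yield record
```

In practice the lines could come from `sys.stdin`, for example when piping the output of `kubectl logs` for a pod into a script that calls this function.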

Johnnyq72 · on June 27, 2022

So, instead of (re-)inventing the wheel, with something like CloudNativePG, I would encourage your team to look into this, perhaps help contribute to better documentation. I believe that would be the way to a) get your project to benefit from using an operator for Postgres (hence scaling / benefitting more and more quickly) and it would b) help a great bunch of other folks to also benefit from that shared knowledge.

loganloganlogan · on June 27, 2022

This survey (1) shows that the majority of orgs develop their own operators, perhaps related to what you say here. Do you think industry standards for operators would move things in the right direction?

(1) https://dok.community/wp-content/uploads/2021/10/DoK_Report_...