Make enterprise features open source (github.com/citusdata)
431 points by ahachete on June 16, 2022 | 69 comments

Really nice seeing this commit, which I pushed to GitHub this morning, on the frontpage of HN a few hours later :) It's awesome that other people are just as excited about open-sourcing as we are. But if you like this, definitely keep an eye on our blog[1] and/or twitter[2] for the upcoming "official" announcements. They will have a lot more info on this and all the other cool changes that are part of Citus 11.

[1]: https://www.citusdata.com/blog/

[2]: https://twitter.com/citusdata

I submitted the news, sorry if I spoiled the announcement! ;) But as soon as I received the news, and these are definitely good news, I wanted to share it with the HN crowd :)

BTW, is item #8 (or any other, for that matter) an alternative to having to use .pgpass? Because this is a notable itch towards automation of Citus, would be great to see this resolved in the open source version.

Congratulations for the commit!

Yes, pg_dist_authinfo is an easier to use alternative to .pgpass.

And no worries about spoiling the announcement a bit. Seeing this on the frontpage of HN made my day!
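For anyone wondering what the pg_dist_authinfo route mentioned above might look like in automation, here is a minimal sketch. It assumes the table has the columns nodeid, rolename and authinfo, and that authinfo holds space-separated key=value pairs such as password, sslcert and sslkey; check the Citus docs before relying on that. The helper name and the example values are hypothetical, and it only builds a parameterized statement, it does not connect to anything.

```python
# Sketch only: builds the SQL for one pg_dist_authinfo row.
# Assumptions (verify against the Citus docs): the columns are
# (nodeid, rolename, authinfo) and authinfo is a space-separated list of
# key=value pairs. The helper name and all values are hypothetical.

ALLOWED_KEYS = {"password", "sslcert", "sslkey"}


def authinfo_insert(nodeid, rolename, **params):
    """Return (sql, args) for a parameterized INSERT into pg_dist_authinfo."""
    unknown = set(params) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unsupported authinfo keys: {sorted(unknown)}")
    # Keys are sorted so the generated string is deterministic.
    # Values containing spaces would need quoting; not handled here.
    authinfo = " ".join(f"{key}={value}" for key, value in sorted(params.items()))
    sql = (
        "INSERT INTO pg_dist_authinfo (nodeid, rolename, authinfo) "
        "VALUES (%s, %s, %s)"
    )
    return sql, (nodeid, rolename, authinfo)


if __name__ == "__main__":
    sql, args = authinfo_insert(123, "jdoe", password="abc123")
    print(sql)
    print(args)
```

The returned pair is meant to be handed to whatever Postgres driver you already use (for example a cursor's execute method), so the secret stays out of the SQL text and out of a .pgpass file on disk.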

> as soon as I received the news, and these are definitely good news, I wanted to share it with the HN crowd

It's a natural impulse but somewhat counterintuitively


One way to think of it is as repetition-prevention since a big enough story is going to end up on the front page one way or the other.

I don't see this as an "announcement of an announcement", as there's concrete and detailed content about the announcement itself. You can read the commit message and even the source code.

Moreover, I'm not affiliated with Citus and I don't know if they are planning later on an official announcement or not.

I stand by the idea of sharing these good news at the earliest for everyone interested to know, look at the code, and plan ahead of time if they need/want to.

I didn't say it was an 'announcement of an announcement' just that this

> the idea of sharing these good news at the earliest

isn't really HN's remit, despite the 'news' in the name. The actual announcement is going to get posted as well and dupe-debated, etc. So there's no harm in waiting.

> I didn't say it was an 'announcement of an announcement' just that this

But it appears you didn't consider the following part of my comment, where I explained that a) I didn't have any way to know there would be a later announcement; and b) that there's enough interesting information with the current news that is worthwhile to many (definitely was for me), so there's no reason for holding it.

The important thing is that the idea that something has to show up on HN asap is not accurate. It's not HN's thing, and there's no harm in waiting, as you can see in moderator comments.

> I didn't have any way to know

I mean, they weren't going to keep it secret. Anyway, let's wrap up here.

This is the release announcement with more details on the open sourced things and all the other totally new features: https://www.citusdata.com/blog/2022/06/17/citus-11-goes-full...

(sadly the accompanying HN submission got marked as a duplicate from this submission)

+1 on Jelte's sentiment

Btw HN, if you share the sentiment about more open source, please feel free to star us on GitHub :)

> Don't solicit upvotes, comments, or submissions. Users should vote and comment when they run across something they personally find interesting—not for promotion.

Probably applies

It's less "cool" if you ask for it but no I don't read the HN guidelines as applying here.

Sweet jesus I can't wait for the blog post to drop.

Thank you for your work Citus Team.

Is it possible to easily configure Postgres/Citus so that your database is replicated to a managed offering, such that you start using the managed offering when your database load is sufficiently high, and otherwise use your self-hosted instance?

Being able to combine the cheap compute of something like Hetzner with the high availability of a managed offering would be nice.

Not right now. It's definitely something that's on our radar, but it'll take a while to get there.

Are there any high level thoughts on how you'd approach that, or is it mainly just a "we want that at some point" in the feature list right now?

Am I correct that such functionality effectively requires the ability to scale compute independently from storage?

You submit a request for more storage and an additional disk appears a day or two later.

Good product.

We were using it for a few years and though we transitioned to pure PG - that database is still named “citus” in honor of what citus did for us.

Glad to see the open sourcing - especially the backups (which is why we switched). That’s really important.

From the title, I thought this would be a call for them to make it open source, not them making the change. This is great!

Does anyone have any experience building with Citus they can share?

I've been reading about it for years and it's always sounded very impressive to me - the magic sauce that you can sprinkle on regular PostgreSQL to have it scale horizontally - but I haven't yet had the opportunity to try it.

I've yet to make it out of the lab. The failover solutions seem to be A-B rather than A-B-DR. That is, you can have a primary and a replica (probably in the same DC/AZ), but you can't do a primary, a replica, and another replica in another datacenter that can function on its own (in a DR capacity).

Citus uses physical replication for worker and coordinator nodes, and things start getting complicated when you're trying to monitor the status of all the replicas. With vanilla PostgreSQL, you don't have this issue. I'm guessing that they solve this in Azure with block-device primitives at the storage level or something along those lines? It's probably not insurmountable to do it yourself, but we're not yet at the grip-n-rip enterprise offering.

Making HA easier while self-hosting is definitely something that's high on our list of future improvements. I wouldn't be surprised if there will be some announcements regarding this in the coming year. One of the main issues (imo) is configuring HA for PG isn't easy, even when using regular PG. And this becomes even more complicated in the case of Citus, because you're effectively running multiple PG servers, all of which you need to configure HA for.

For my understanding about the multi-region DR capability you would want: Do you want to be able to write to the DR region all the time? Or do you want to be able to switch over to the DR region in case the main DC is down?

I'm not a user per se Simon, but I work on the Citus team and can share links to a few customer stories.

+ Implementation of the UK Coronavirus Dashboard running on Citus and Postgres, a post co-authored by the technical lead of the dashboard, Pouria Hadjibagheri: https://www.citusdata.com/blog/2021/12/11/uk-covid-19-dashbo...

+ Architecting petabyte scale analytics using Postgres and Citus, based on an interview with the user, Min Wei: https://www.citusdata.com/blog/2019/12/07/petabyte-scale-ana...

+ Video of a customer talk about scaling a SaaS application on Citus by Jonathan Denney at ConvertFlow: https://youtu.be/PzGNpaGeHE4

Same here, wanna see some production usage testimonials. Plus, has someone compared it to CockroachDB? I understand they are two different technologies sharing, in broad strokes, the same idea of scaling Postgres horizontally, but I would still love to see some sort of comparison and differences.

There's some listed at https://www.citusdata.com/customers

Same here, unfortunately I've had no chance to try it yet

I remember years ago, Citus offered people a free pair of socks as a bribe for signing up for their newsletter. I loved my database socks and gave them some free advertising during college.

I wonder if the decision to make all enterprise features open source was affected/influenced by the Neon release/announcement recently? The timing makes me wonder, or has this been planned for a very long time?


Either way, great news! I've been curious to try out Citus many times, but the fact that many of the best parts have been closed has held me back a bit. I think it's great to have more open source PostgreSQL extension options.

This is fantastic, and should bring some relief to those who want to self-host Citus outside of Azure. I wonder if TimescaleDB's recent license changes had anything to do with this.

I'm not sure what "recent license changes" you are referring to?

The Timescale License was introduced in late 2018, although we never _relicensed_ any of our Apache 2 code, we only created a space for _new_ capabilities to be developed under these terms.

There were a few changes to the license in Sept 2020, but those only liberalized a few clauses (including from feedback we received from the Hacker News community, including the "right to repair" and "right to improve"). At the time, we also moved any remaining "enterprise" features from an Enterprise/Paid license into the TSL/Community/Free license.


We haven't changed anything since, nor have we ever moved any Apache-2 code into the TSL.

(Timescale cofounder)

Time really flies by! I was referring to you guys moving all enterprise features into the TSL/Community license.

Off topic, but I just realized that I've been calling it "Citrus" for years.

I read it as that until I just saw this comment. I wonder if it’s pronounced site-us or sit-us (kite-us is probably out).

Pronounced as site-us (or sight-us, as in "eyesight")

Side-note, I had a bit of fun with the pronunciation & spelling challenges of working on Postgres and Citus here: https://twitter.com/clairegiordano/status/150378415161432064...

> 12. Support for `sslkey` and `sslcert`

There is a special place in hell for companies that consider basic encryption to be an "enterprise" feature in 2022.

This one surprised all of us on the Citus dev team, when we were reviewing what we were open sourcing. This "feature" was never intended to be Enterprise only and not having it in the open source repo was simply an honest mistake. If some open source user had asked for it I'm certain we would have also added it to the open source version. But so far no-one asked.

At least this mistake is resolved now, and no new ones like it should occur, since we will only commit to the open source repo now.

Thank you for clarifying the Citus position.

For my part, I should perhaps clarify (if it was not already obvious) that my comment was a generic comment intended towards all such companies and not specifically targeted at Citus.

For anyone else out there browsing hn and unaware of Citus...I will save you the search I just did.

Citus appears to be an extension to Postgres that allows easier distribution, replication, sharding, parallelism etc.

We did this with Caddy a couple years ago. Best decision the project ever made.

Want to share why?

If you follow the references from https://en.wikipedia.org/wiki/Caddy_(web_server)#Financial_b... You'll see a bunch of discussion covering the saga (including a lot of it on HN).

I for one am immensely happy that mholt and team made the decision to re-open source Caddy, and I wish them all the best in the future. It's truly a wonderful piece of software.

As an early backer of Caddy I don't think I ever touched it again after what felt to me like a bait and switch weeks(?) after my donation to an open source product.

Maybe it is time to reconsider now, because I really really really liked Caddy back then, which is probably why I donated enough to make me equally annoyed when I felt I had been had.

Caddy is amazing! My most favourite feature at the moment is on-demand TLS, which I'm using in my SaaS project (not launched yet) to allow customers to use a custom domain.

Great to hear! Hope it serves you well.

So grateful for Caddy!

Give Microsoft a hug here.

Impressive. Much respect.

Enterprise crippleware is obnoxious. So are restrictive and radioactive licenses like AGPL. What bugs me are those who speak out of both sides of their mouth about how it's not, with different flavors of rationalized BS.

Bug fixing priority, tech support, and integration consulting are the proper places to monetize, not features for-profit express turnpike or telling me what I can't do with code.

I don't see anything wrong with the AGPL. If $bigCorp wants to rent access to improved versions of my software, they should give me the changes.

AGPL doesn't require that $bigCorp give you the changes, only that they give their users the changes. You could get changes from existing users, but they might not be willing or able to help you. You could become a user, but that could be expensive or $bigCorp could just ban you from the service.

I really can't say whether that's correct or not. IANAL.

The licence in terms says you must give Corresponding Source to users "interacting with it remotely through a computer network". Does it suffice if I initiate a connection to a public $bigCorp server (and receive "access denied")? Or must I be able to successfully connect?

Maybe $bigCorp could put a proprietary authentication proxy in-between to avoid me "interacting with" the software. So, let me narrow my comment to address what you've said.

If $bigCorp wants to rent me access to improved versions of my software, they should give me the changes.

Your revised comment is definitely correct now.

Not sure about your questions since IANAL but I guess it depends on the details of how $bigCorp's server setup works. If they have some other software that controls access to the AGPLed software, then maybe they don't have to give non-customers source code. If the login form comes from the AGPLed software, but non-customers don't have a login, then maybe the non-customers do interact with the software in a limited way, so source code obligations may come into play.

Wait, isn't the license [1] on this repo AGPL?

While I would like to voice a general protest against the AGPL FUD, it's also somewhat strange to do it in the same breath as "Much respect" for Citus here.


[1] https://github.com/citusdata/citus/blob/master/LICENSE

> Bug fixing priority, tech support, and integration consulting are the proper places to monetize

Yeah ... the problem is, you can't charge a corporation millions of dollars for that off the bat, but you sure can charge the millions of dollars for software licensing. And they are happy to pay it. In fact, they want to pay it: it makes them feel more secure and they have a budget to use.

The easy way to do that is to have themes and branding. Call it something else. "Enterprise Edition". Then they can enjoy the nice, warm fuzzy feeling of different logos.

> Bug fixing priority, tech support, and integration consulting are the proper places to monetize

I know this is a widespread business model, but it's always seemed backwards to me.

Having my customers pay me more when I produce more bugs and lower usability? That sounds like a reward for doing a bad job.

While it does seem like a perverse incentive, there will always be more customers than you are able to provide service to anyway, since no matter how good your documentation and how perfect your install process, issues will arise at the boundaries between your software and the customer's other systems, so integration services are valuable. In complex systems there will be more bugs than you can fix, so prioritization is valuable. Support is valuable because of the customers who have less experience with the software than you.

Nothing about AGPL prevents anyone monetizing bug fixing, tech support and integration consulting. You can also do whatever with the code. You just need to provide it if you host the project as a service.

Has anyone benchmarked Citus's column store against other competing solutions, for example ClickHouse or the std row store in PostgreSQL?

It's a bit late, as I see it. But better late than never.

Hoping for more open source adoption, to the point that I could use it in production.

Interesting, what does that leave proprietary?

Apparently nothing :-). The commit says all enterprise features and the associated PR says that they will retire the internal enterprise repo.

Wow, so whole thing open-source? What can be lifted straight into PG proper? Licenses are compatible I hope.

Of course, PG maintainers will only want to lift what they can maintain and has a compatible release cadence. But it wouldn't be the first time that Citus' work has landed in PG core.

Citus itself is AGPL, iirc, and it uses the PostgreSQL extension API and isn't a fork. Besides being carried in the contribs extension tree and possibly being a part of project internal tests, I don't really see what having it developed there would necessarily bring. Are there any other benefits/drawbacks I've missed?

This is great news. I will definitely try to support a model where Citus gets paid (presumably consultancy) in the future, and it'll be my go-to upgrade from Postgres. I could probably use their sharding features fairly soon on the startup I'm building!

Vitess vs Citus: which is superior and why?
