Designing a SaaS Database for Scale with Postgres (citusdata.com)
241 points by mixmastamyk on Oct 6, 2016 | hide | past | favorite | 54 comments

We serve enterprise customers and use the 'separate database' strategy for these reasons:

1. Stronger data isolation. Technically schemas provide the same level of isolation, but they are much harder to explain or defend during compliance audits.

2. More secure backups. Client data can be encrypted with their own key and stored as per their needs.

3. More useful backups. The frequency and retention of backups can vary to meet SLA requirements (and costs). And if something goes wrong, we can recover that particular customer's data in isolation, without impacting the rest of the application/customers or worrying about merging it back in.

4. Secure data deletion. Many European customers demand that all their data is securely removed upon termination. This creates a massive problem with purging backup information if everything is in the same dump.

5. Independent load & performance. If one customer is particularly high-load or chatty, we can move them onto a separate server where they don't impact well behaved folks.

6. Easier horizontal scalability. Just move heavy loads to their own servers + read replicas.

7. Direct access to data. Specifically, we can use BI tools that do not understand schemas and even give direct access to clients' analysts.

8. Independent migration paths. Sometimes the customer's timetable does not allow them to upgrade to the newest version right now (e.g. they are in the middle of using the app for a key process this month). We can leave their account routed to the previous version in both the codebase and the data store.

Out of those, the key 3 reasons are: stronger data isolation, better backup strategy, and a more predictable compliance story. But that's enterprise: even if we're wildly successful, we'll have "thousands" of customers - never millions. And we can manage thousands of databases, so this architecture path is preferable to us within those boundaries.
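A database-per-tenant setup like this usually hinges on a small routing layer that maps an account to its own connection settings. A minimal sketch, with hypothetical names and an in-memory dict standing in for whatever central tenant directory you'd actually use:

```python
# Hypothetical tenant registry: in production this would live in a small
# central "directory" database or config service, not a dict.
TENANT_REGISTRY = {
    "acme": {"host": "db-7.internal", "dbname": "tenant_acme"},
    "globex": {"host": "db-2.internal", "dbname": "tenant_globex"},
}

def dsn_for_tenant(tenant_slug):
    """Resolve a tenant identifier to a libpq-style connection string."""
    try:
        cfg = TENANT_REGISTRY[tenant_slug]
    except KeyError:
        raise LookupError(f"unknown tenant: {tenant_slug}")
    return f"host={cfg['host']} dbname={cfg['dbname']}"

# A request handler would then do something like:
#   conn = psycopg2.connect(dsn_for_tenant(current_tenant))
```

The same lookup is where per-tenant version routing (reason 8) can hang off: the registry entry just grows a `version` field that picks the codebase.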

How do you handle lookups that would require cross-tenant queries? E.g. each customer has multiple devices connected to our system and we want to know their health (is it online, is it responding, etc.).

One way I've thought of currently is having an aggregate table in a "central" database that would be used to collect these kinds of statistics. It wouldn't be "real-time" but near real-time, depending on the frequency of updates. The downside is the overhead of maintaining a separate data source.

The other option was to have the software query one tenant at a time and take the performance/time hit. That's not really the best experience, and it probably violates the idea of data isolation.

I'm having trouble understanding what you mean by cross-tenant queries in this context. What you are describing is application monitoring, which would be completely independent of the DB holding customer data, and that is what your parent's post and the OP are describing. Application runtime/health status could trivially be handled in a separate stream.

It's not really the application runtime/health status; it's the status of individual assets for each tenant. So a scenario would be: as a technical support person, I would like to see which customer assets are currently offline across all tenants.

One way to do that would be to query each individual tenant database and then combine the information. But that would void the idea of security/isolation. The other downside is that it's slow.

But I think I've been looking at this from the wrong point of view: there should be a central health-check service that collects the status of these assets, and that service would be queried by the support person.

Generally I see it from the other direction as a technical support person: each tenant's host is responsible for reporting uptime to a central service (because it's SaaS), and the customer is aware of this (even if it's basic syslog forwarding). Because we are not reaching in and they are sending out, their customer data is still isolated.

Any time you are reaching into the customer datasets, that is where you have the possibility of a breach via that access, and you would have to make sure all of those events are audited and that the audit records are secure. If the tenant is instead sending monitoring data to an aggregation service, that data can be well-defined in its contents, and it's much easier to demonstrate that no sensitive tenant information is shared.

It's much easier and saner to set up a server to receive, configure each tenant's host to send exactly what is needed, and then analyze who didn't send their data at the regular intervals for more direct investigation; everyone can agree on what needs to be done based on that.
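The push model above reduces to a small aggregator: agents post heartbeats, and the support side only ever queries the aggregator, never the tenant databases. A minimal sketch (asset IDs and the in-memory store are illustrative; a real service would persist this):

```python
import time

# Hypothetical heartbeat store: asset id -> timestamp of last report.
# Agents on each tenant's host push to this; nothing reaches into
# tenant data, so isolation is preserved.
last_seen = {}

def record_heartbeat(asset_id, now=None):
    """Called by the receiving endpoint whenever an agent reports in."""
    last_seen[asset_id] = time.time() if now is None else now

def overdue_assets(interval_s, now=None):
    """Return assets that have not reported within interval_s seconds --
    the "who didn't send their data" check a support person would run."""
    now = time.time() if now is None else now
    return sorted(a for a, t in last_seen.items() if now - t > interval_s)
```

A support dashboard built on `overdue_assets` answers "which customer assets are offline across all tenants" without a single cross-tenant database query.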

Yeah, I agree with what you're saying; as I said, I was looking at the problem somewhat naively.

When you talk about each tenant's host, do you mean a dedicated application server for each tenant? The current setup we have is multiple tenant databases but only two application servers providing functionality to them, so there's a switching cost on the application servers because they can't maintain a connection pool to the databases (they also have to know the credentials for the databases).

It seems excessive, if you have 100 tenants, to have your infrastructure for serving the application replicated 100 times, but maybe we just haven't quite built our system correctly yet.

In the model I am describing, every tenant component would have an agent reporting the state of that tenant. A shared component/pool wouldn't report separately; I'm really only describing the tenant databases/tenant-specific files. So if you have a tenant database, there should be an agent that just reports on that tenant database.

As for separation of the application frontend, it really is a risk assessment and culture thing. If the team understands that never crossing streams of customer data is a critical requirement - one of the first viability questions in the design - then sharing really isn't too bad. If your team struggles with this or has recurring issues to this end, separate application servers make more sense. Generally, exposing one tenant's data to another customer is a quick way to have trust in a product evaporate. I've had this happen as a customer: I was able to show a bug where I was getting another customer's accounting information in a cloud app, which caused me to terminate the contract.

If you guys are minding the setup of the application servers well enough, there's nothing to say you can't; many people do it.

> 8. Independent migration paths.

In this case, how is the business logic (codewise) handled across different states of databases?

Not OP, but in our case, every customer (company, not user) gets its own process, running whatever version of the codebase they're on, much like if they were hosting it in their premises.

We don't support very old versions, mind you, except for security fixes.

Thank you, Andre. That's an interesting concept I haven't thought of.

> But that's enterprise: even if we're wildly successful, we'll have "thousands" of customers - never millions. And we can manage thousands of databases, so this architecture path is preferable to us within those boundaries.

Exactly. Unless your ACV is $100k+, this strategy isn't really sustainable and might materially impact your gross margins.

EDIT: Further, read the more technical limitations of this approach: https://news.ycombinator.com/item?id=12652021

> Exactly. Unless your ACV is $100k+, this strategy isn't really sustainable and might materially impact your gross margins.

Why? We have customers with ACVs under $10k (even after adjusting for cost of living/general price levels) with this model. We simply automated the whole process, so clients can register for our service and get a new database (created from a template) without any manual intervention.

Good luck with schema migrations and backups when you hit 1,000 customers (see Citus rep's comment).

I think they're talking about one-schema-per-tenant, not one-database-per-tenant like e1g.

Still, we'll never hit anything close to 1,000 customers per server; we distribute them over more servers instead. In fact, that's an advantage of this model over the single-database one: since every tenant is isolated, the model is perfectly horizontally scalable. You can simply fill up servers without worrying about massively distributed databases or datacenter-wide load balancing. Hell, we just migrated across hosting providers while barely breaking a sweat: we simply moved machine by machine, without ever worrying about latency between DCs and all that.

But if your schema changes, it means you have to run the migration against N databases (N being the number of customers you have). If you have to upgrade your database from PG 9.0 to PG 10.0, you have to do it N times. And so on. You'll have N backups that all have to be stored individually. Etc.
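The fan-out can at least be automated. A sketch of running one migration across N tenant databases, collecting failures instead of aborting (the `apply_migration` callable is a stand-in for whatever actually executes the DDL against one database):

```python
# Sketch of fanning one schema migration out to N tenant databases.
def migrate_all(tenant_dsns, apply_migration):
    """Run apply_migration(dsn) against every tenant. Failures are
    collected rather than aborting the loop, since one broken tenant
    shouldn't block the other N-1."""
    failed = []
    for dsn in tenant_dsns:
        try:
            apply_migration(dsn)
        except Exception as exc:
            failed.append((dsn, str(exc)))
    return failed
```

The returned failure list becomes the retry queue; the unavoidable cost is that migration time, backup count, and upgrade effort all still scale linearly with N.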

I second this strategy. It does cause a few more headaches, but overall clients prefer it to a single DB. We're in enterprise finance, so this comes up a lot and the answer is typically well received.

Almost all the points above are precisely why we went database-per-tenant. We also chose this method because our system was originally self-hosted (single DB, non-multi-tenant) for 4 years or so. Adding columns to each table to separate tenants was not ideal, and we didn't want to run the risk of 'forgetting a where clause' at any point.

For those wondering, all of our databases have the same schema. We use Visual Studio Database Projects to handle updates to keep it all in sync.

Having spent almost 5 years at a financial company, I get you. To others out there, make sure you charge clients accordingly if you have to do this.

"Captain Obvious" out!

Just curious: in your 5 years with financial companies, did you see them do things differently than most organisations? E.g. deadlines, SLAs, tech understanding?

I had a couple of colleagues working for the banking/financial sector and they were pretty high-strung most of the time. They were a bit more direct in regards to putting money on the table to get shit done.

I worked in a smaller, "high value investor" unit, within a parent mutual fund company.

It was very much not a tech company, although there were a few good people in the org. There was of course a big focus on buying supported packages, and off-shoring as much as possible. I understand buying "non core competency commodities", but the off-shoring was dubious.

The custom stuff in my unit was very much neglected, except for integration projects when changing vendors, which of course have to happen when you move some of your dependencies.

Anyway, uptime was VERY important, which also implies a fair amount of risk management/reduction. In fairness, buying external services does make costs much more predictable. However, if you are on the vendor side of this, you need to make sure you build in pricing to cover what will be a very demanding customer.

Offshoring fun & games: another team bought this expensive Java caching service thing from IBM. They did not bother to learn how to properly code the fetches so that the calls included "cache miss" logic to get missing items (d'oh!). Whenever they restarted the web site, they had to first spend N minutes running a custom program to regenerate the "cache" content. I suspect they didn't really understand the definition of "cache" :-)

(they of course did some other ignorant things which made this an issue, as well)

Separate clusters (not just databases) also give you the ability to create read-only replicas/hot standbys on a per-customer/application basis.

Same here (European company, hosting a business SaaS platform for companies of all sizes).

I frankly prefer this approach, since it completely bypasses the problems that come with sharding and distributed deployments, since every customer is completely isolated. I just wish there was more tooling available that supported our architecture.

Same here. Also working with enterprise customers, sometimes competitors (or at least tangentially). Thinking of business reasons, there is absolutely no way they would accept "sharing" a database. In everyday ops, it's reason 8: independent upgrades at different points in time.

Ditto, here. On top of the fact that our large Enterprise customers have their own data isolation/security requirements (each one their own unique requirements).

Salesforce has successfully deployed multi-tenant SaaS at huge scale, and they have all tenants sharing the same tables. Every row in every table has an Organization-ID column, and all queries include the Org ID to keep the data separated. They built their SaaS on top of Oracle RDBMS but are now using PostgreSQL in some not-yet-publicized way.

Google for Salesforce multitenant architecture and you will find several presentations/videos where they explain much of this architecture. IMHO this is required reading for anyone designing any sort of multitenant SaaS because you are less likely to make disastrous mistakes when you can compare your solution to a known successful solution, and explain where your solution differs and why.
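The shared-table approach boils down to one invariant: no query ever runs without the tenant's org ID in its predicate. A generic sketch of enforcing that at the query-builder layer (table and column names are illustrative, not Salesforce's actual schema; Postgres row-level security is another way to enforce the same invariant in the database itself):

```python
# Sketch: a tiny builder that makes it impossible to construct a
# tenant-scoped SELECT without the org_id predicate. The %(org_id)s
# placeholder is bound server-side by the DB driver, never interpolated.
def scoped_query(table, columns):
    """Build a SELECT that always filters by the tenant's org id."""
    cols = ", ".join(columns)
    return f"SELECT {cols} FROM {table} WHERE org_id = %(org_id)s"
```

The application then funnels every read through this helper (e.g. `cursor.execute(scoped_query("accounts", ["id", "name"]), {"org_id": org_id})`), which is the structural version of "never forget the where clause".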

> They built their SaaS on top of Oracle RDBMS but are now using PostgreSQL in some not yet publicized way.

How do you know that?

There have been published reports that they are trying to move to PostgreSQL specifically, and off of Oracle more generally. This is an example (with links to some others): http://fortune.com/2016/05/11/salesforce-two-software-worlds...

Interesting, but I am surprised by the quote:

"A source with knowledge of the effort said a database change could give Salesforce flexible technology that could be more easily used across many data centers."

I thought Oracle did pretty much everything, but cost a fortune.

Thanks for the link.

Good video describing their shared multi-tenant architecture: "Salesforce Multitenant Architecture: How We Do the Magic We Do"

The article mentions the approach of having one schema per tenant but sadly does not mention it again afterwards.

Are there serious issues with this approach I am not aware of?

Craig from Citus here. There are a number of issues with one schema per tenant; the biggest is that at larger scale Postgres will mostly just not work anymore. Things like pg_dump start to fall over, and while this has been improved some, there is still an upper limit somewhere between 1,000 and 10,000 tenants. Further, having to run schema migrations against all of them can be quite painful. There are tools that help on the schema migration front, but what we've seen is that, again, at scale things start to break. If you'll only ever have 100 customers, then schema-per-tenant can work for you.

Performance, management, and SLAs are some big issues with any "SQL as a service" project. How can you guarantee 1) disk utilization per tenant, 2) CPU utilization per tenant, and 3) transaction volume per tenant? Most simply, I don't believe you can, and we concluded it's simply not worth going down the path of a shared, multi-tenant database (PostgreSQL notwithstanding; any RDBMS would have the same issues).

Our conclusion was that the only way to get the required level of management per tenant, and to support a truly massive number of tenants, was to use an in-process database over HTTPS, i.e. SQLite and Apache. But SQLite has an image problem: it's everywhere, and nowhere. It's built with some fundamentally different decisions than other databases, and isn't traditionally used for web applications.

So that's the course we took, links in my profile for more info.

> use an in-process database over HTTPS, i.e. SQLite and Apache

Why not, but this introduces a whole bunch of new issues. SQLite only supports a single writer at a time. This is a problem if you have a lot of users on the same tenant. It's also a problem when you need to create an index, for example, which is not a background operation.

Indeed there are trade-offs, and high-write applications are a weak spot of this approach. But for low- to medium-write applications (i.e. most applications?), SQLite's WAL [1] option performs really well. We implement application-level caching via an X-Query-Cache header [2]; in that case, you're serving directly from Redis. This setup can scale really, really well.

[1] https://www.sqlite.org/wal.html

[2] https://www.lite-engine.com/docs.html#caching
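WAL mode is a one-pragma switch per database file, after which readers no longer block the writer. A runnable sketch with Python's stdlib sqlite3 (the temp file stands in for a tenant's database; WAL requires an on-disk file, as in-memory databases fall back to "memory" journaling):

```python
import os
import sqlite3
import tempfile

# A temp file stands in for one tenant's database file.
path = os.path.join(tempfile.mkdtemp(), "tenant.db")
conn = sqlite3.connect(path)

# Switch to write-ahead logging; the pragma returns the mode now in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items (name) VALUES (?)", ("widget",))
conn.commit()
rows = conn.execute("SELECT name FROM items").fetchall()
```

The mode is persistent: it's recorded in the database file, so subsequent connections get WAL without re-running the pragma.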

One issue we had: pg_dump performance was abysmal for dumping a single schema out of thousands.

We had to write a custom backup script to handle individual backups, taking advantage of the knowledge of our own database architecture (basically: we don't have to read the schema list and figure out relationships between them because we already know that).

If you read pg_dump's source code, it uses Postgres's COPY command when doing the actual backup, so it was easy enough to write our own custom exporter.
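The trick described above is that when you already know every tenant schema's table list, the exporter can skip catalog introspection entirely and just emit COPY statements. A sketch (the table list is hypothetical; a real exporter would stream each statement's output through `psql` or a driver's copy API):

```python
# Hypothetical fixed table list: every tenant schema has the same shape,
# so no catalog queries are needed to enumerate what to dump.
KNOWN_TABLES = ["users", "orders", "invoices"]

def copy_statements(schema):
    """Emit one COPY ... TO STDOUT statement per known table in the
    given tenant schema, in CSV format with a header row."""
    return [
        f"COPY {schema}.{table} TO STDOUT WITH (FORMAT csv, HEADER)"
        for table in KNOWN_TABLES
    ]
```

This is exactly the shortcut the commenter describes: the schema list and inter-table relationships are compiled into the tool instead of being re-discovered per dump.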

Also interested in this. I architected my system with a single multi-tenant DB. My thinking was that it simplified things initially (less admin/backups/pooling etc.), and it would be far easier to split into individual DBs later than to combine them.

We are a b2b product and we've picked up customers around the world. Now it seems like it might make sense to shard geographically. Is that common?

We shard geographically, with a single multi-tenant Postgres schema per geographical area.

You run into corner cases when a customer wants to operate in two distinct geographical areas, so you may have to maintain a central repository of tenants, and ultimately, under the hood, your tenant primary keys are not handled via local sequences.

I also have a database with a schema per tenant; in the tenant code I set the schema. I like it and would like to see more information as well. One issue, I think, is that it has neither full isolation nor easy sharding. As said by another commenter, I plan to distribute my schemas across multiple servers when necessary, like the sharding approach.

My personal, recent experience is that my current startup's business model would be poorly served as a result.

You see, I originally thought I was building a SaaS service, but it's actually turned out that my customers needed a two-sided network. A per-tenant schema would've been a painful impediment to making that paradigm shift.

Multi-tenancy at scale with an RDBMS is just hard to do. Once you get to a certain volume, you have to have different data stores for different needs. Only a subset of businesses require ACID properties, so you would use an RDBMS for those and, for the rest, choose the NoSQL stack with the least resistance to ramp up WRT team skill set.

You can get away without using NoSQL but you probably would end up spending time and money maintaining several DB clusters, migrations, handling replication lags etc.

> Only a subset of business require ACID properties so you would use RDBMS for those and for the rest choose a NoSQL stack

While ACID is an important and nice set of properties of RDBMSes, it's not the only reason to choose them; they are a lot more adaptable to changing query patterns and needs than less-structured, document-oriented datastores. There are sometimes performance and other reasons to choose a non-relational datastore for a particular load, but "don't need ACID guarantees" is, by itself, an insufficient reason for abandoning the RDBMS model for NoSQL (not needing ACID guarantees is, of course, a necessary condition for selecting a non-ACID store, but not a sufficient one).

ACID was just an example. Thanks for clarifying it though.

What are best practices for gracefully allowing end users to use a single login for multiple of your SaaS tenants when the tenants are in separate databases?

I'd say OAuth, like StackExchange does. Then you can run your own authentication and/or accept others like Google. In my experience (we run a db-per-tenant service, with OAuth for internal support users), it works fine.

tl;dr: denormalise a bit so that primary keys represent a full ancestor path to the "usual" primary key, making sharding and colocation of customers much simpler.

And use Postgres:-)

Seems like a nice article, but I kind of assumed that to be basic knowledge among DBAs?

How does this compare to having a separate Postgres schema per tenant? You'd need to make your app a bit more complicated (to set the right schema before every query), but you won't need to denormalise.

This is really a judgement call. If you are expecting tenant volumes in the thousands, a schema per tenant is slightly crazy. I would personally only think of a schema or database per tenant if I had either customisation issues (if most of the sale is in professional services) or, worse, data regulation issues (an Indian tenant, an EU tenant, and a US tenant would make the complex data transfer laws challenging to manage, and it might be easier to have geographically located servers).

But unless you have a few expensive clients, I cannot see much value in multi-tenanting via schema or database. It's hard enough keeping that in mind when using dev/UAT/prod schemas!

Schema per tenant from the application's point of view is much the same as database per tenant. The key difference is that sharing objects is easier if you have common procedures/data as you can keep objects that are not specific to each tenant in a separate schema (commonly the default schema, "public" in postgres or "dbo" in SQL Server).

Is the primary key in the hierarchical database a composite key of the full "traditional pk" lineage?

Or do you keep the actual PK on the adgroup table as the traditional key (i.e. "adgroupid") and then also put customerid on adgroup (even though you will never join adgroup directly to customer) so you can shard each table by customerid?

What is the advantage of using a composite PK instead of the traditional PK, so long as all ancestral IDs are also stored on the table (so sharding is colocated correctly)?

Yes, the primary key in the hierarchical database is a composite key of the full lineage.

The motivation for that is enforcing uniqueness. When the primary key (PK) contains the sharding key, the database can easily push down the PK constraint to the related shard and enforce the PK's uniqueness.
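Concretely, using the customer/adgroup example from the question above, the composite-PK layout leads with the sharding key so the uniqueness constraint can be enforced entirely on one shard. A sketch of the DDL (column names follow the thread's example; the exact hierarchy is illustrative):

```python
# DDL sketch for the "full ancestor path" composite primary key.
# Because customer_id leads the key, every row for a customer lives on
# one shard, and the PK's uniqueness check never crosses shards.
CREATE_ADGROUPS = """
CREATE TABLE adgroups (
    customer_id bigint NOT NULL,
    campaign_id bigint NOT NULL,
    adgroup_id  bigint NOT NULL,
    name        text,
    PRIMARY KEY (customer_id, campaign_id, adgroup_id)
);
"""
```

With only a standalone `adgroup_id` PK plus a denormalised `customer_id` column, rows would still colocate, but a globally unique `adgroup_id` could not be enforced by any single shard; the composite key sidesteps that.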

It might be among DBAs, but not so much among others who might be interested. You've got to see the "clever" tricks some people come up with to solve some basic DB problems.

It's not clear to me how they do it. If I understood, the idea is not to use a column to differentiate tenants (because of costly joins) but to use a "hierarchical database model". But how does that work? What does the table structure look like?

I use UUIDs as primary keys when I plan for scale and future sharding. It's a negligible performance hit, if any, and I can move entire datasets from one DB to another without worrying about references.
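The point of UUID keys here is that they are generated client-side with no coordination, so rows keep their identity when moved between databases, with no sequence collisions to reconcile. A minimal illustration with Python's stdlib:

```python
import uuid

# Random (version 4) UUIDs need no central sequence: any app server can
# mint one, and rows copied between tenant databases keep their identity.
def new_row_id():
    return uuid.uuid4()

a = new_row_id()
b = new_row_id()
```

In Postgres the matching column type is `uuid`, with `gen_random_uuid()` available as a server-side default if you'd rather not generate keys in the application.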
