
Ask HN: Has anybody shipped a web app at scale with 1 DB per account? - bradgessler
A common way of deploying a web application database at scale is to set up a MySQL or Postgres server, create one set of tables shared by all customers, add an account_id or owner_id field, and let the application code handle security. This makes it easier to run database migrations and upgrade code for every customer all at once.<p>I’m curious if anybody has taken the approach of provisioning one database per account? This means you’d have to run migrations per account and keep track of all the migration versions and statuses somewhere. Additionally, if an application has custom fields or columns, the differences would have to be tracked somehow and namespace collisions managed.<p>Has anybody done this? Particularly with Rails? What kind of tools or processes did you learn when you did it? Would you do it again? What are some interesting trade-offs between the two approaches?
======
sky_rw
My startup currently does just this 'at scale', which is for us ~150 b2b
customers with a total database footprint of ~500 GB. We are using Rails and
the Apartment gem to do multi-tenancy via unique databases per account with a
single master database holding some top-level tables.

This architecture decision is one of my biggest regrets, and we are currently
in the process of rebuilding into a single database model.

FWIW, this process has worked well for what it was originally intended to do.
Data security gets a nice db-level stopgap and we can keep customer data nicely
isolated. It's nice for extracting all data from a single customer if we have
extended debugging work or unique data modeling work. It saves a lot of
application layer logic and code. I'm sure for the most part it makes the
system slightly faster.

However as we have grown this has become a huge headache. It is blocking major
feature refactors and improvements. It restricts our data flexibility a lot.
Operationally there are some killers. Data migrations take a long time, and if
they fail you are left with multiple databases in different states and no
clear sense of where the break occurred.
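For illustration, a rough sketch of what "run the migrations once per tenant
and track which ones broke" ends up looking like with Postgres
schema-per-tenant (this is not Apartment's actual code; the Tenant model, its
schema_name column, and the Rails 5.2+ migration_context API are assumptions):

```ruby
# Run pending migrations against every tenant schema, recording failures so we
# at least know which tenants were left behind.
failed = []

Tenant.pluck(:schema_name).each do |schema|
  begin
    ActiveRecord::Base.connection.schema_search_path = "#{schema},public"
    ActiveRecord::Base.connection.migration_context.migrate
  rescue StandardError => e
    failed << [schema, e.message]
  ensure
    ActiveRecord::Base.connection.schema_search_path = "public"
  end
end

raise "Migrations failed for: #{failed.inspect}" if failed.any?
```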

Lastly, if you use the Apartment gem, you are at the mercy of a poorly
supported library that has deep ties into ActiveRecord. The company behind it
abandoned this approach as described here: [https://influitive.io/our-multi-
tenancy-journey-with-postgre...](https://influitive.io/our-multi-tenancy-
journey-with-postgres-schemas-and-apartment-6ecda151a21f)

Happy to expand on this if anybody is interested. It's currently a cause of
major frustration in my life.

~~~
ryanbrunner
Echoing this as well, I worked for Influitive and was one of the original
authors of apartment (sorry!)

There are a lot of headaches involved with the "tenant per schema" approach.
Certainly it was nice to never have to worry about the "customer is seeing
data from another customer" bug (a death knell if you're in enterprisish B2B
software), but it added so many problems:

\- Migrations become a very expensive and time-consuming process, and
potentially fraught with errors. Doing continuous-deployment style development
that involves database schema changes is close to impossible without putting a
LOT of effort into having super-safe migrations.

\- You'll run into weird edge cases due to the fact that you have an
absolutely massive schema (since every table you have is multiplied by your
number of tenants). We had to patch Rails to get around some column caching it
was doing.

\- Cloud DB hosting often doesn't play nice with this solution. We continually
saw weird performance issues on Heroku Postgres, particularly with backup /
restores (Heroku now has warnings against this approach in their docs)

\- It doesn't get you any closer to horizontal scalability, since connecting
to a different server is significantly different than connecting to another
schema.

\- It will probably push the need for a dedicated BI / DW environment earlier
than you would otherwise need it, due to the inability to analyze data cross-
schema.

I still think there's maybe an interesting approach using partitioning rather
than schemas that eliminates a lot of these problems, but apartment probably
isn't the library to do it (for starters, migrations would be entirely
different if partitioning is used over schemas)

~~~
sitkack
This leads me to believe that everything you mentioned is already subtly
broken; the new DB-per-account model just exposes it.

Is there something between the two solutions, or pieces that could be
modified, that collapses the problem?

What about going with DB account per app account and using views to limit
exposure to data. If user level views are applied before the business logic
has access, then the death knell scenario should be avoided.

Migrations seem hard regardless, they have to be automated anyway and have a
verified rollback/undo mechanism. Code should be multiversion aware if you are
doing live upgrades.

I am personally a fan of read-only mode while doing the last phase of an ETL
upgrade on the backend. Snapshot databases, run a bunch of Hive jobs and throw
it back down on a new set of DBs.

In the cases you outlined where it was dangerous or fragile, I think those
systems were already that way, and that fragility is a symptom of a different
problem. Fix that and lots of other stuff disappears.

~~~
ryanbrunner
I agree that migrations are painful at the best of times, but dealing with the
complexity of migrating a single database is far simpler than dealing with
migrating hundreds of schemas:

\- Migrations will first of all just take longer - you're multiplying the
number of schema changes by the number of tenants you have.

\- While in an ideal world migrations should be purely run within a
transaction, occasionally performance considerations mandate that you run
without DDL transactions - when some tenants fail and your migrations are in a
partially completed state for some of your tenants and not others, it can be
scary and painful.

\- In my experience, almost no one approaches data migrations in a way that is
purely backwards compatible 100% of the time without exception. You certainly
can, but there's a significant tax associated with this, and if you're in a
traditional scoped environment, you can often get away with the potential for
errors in the minuscule time that a schema change is operating (of course,
some schema changes aren't run in minuscule times, but those are the ones
you're more likely to plan for)

Going read only during migrations is an interesting approach, but there's real
business costs associated with that (particularly if your migration speed is
multiplied by running it across tenants).

I don't want to say that you should never isolate data on a schema level, but
I do think it's something that shouldn't be a standard tool to reach for. For
the vast majority of companies, the costs outweigh the benefits in my mind.

------
boulos
Disclosure: I work on Google Cloud.

tl;dr: Wait until you need it, but there are good reasons for it!

Since I didn’t see anyone mention it, the term I’ve seen a lot of people use
for this pattern is “multi single tenant”.

Part of the reason we have Tenant Projects [1] is precisely so you can do a
1:1 mapping of “Customer A can have different controls, settings, and blast
radii from Customer B”.

Many of our first-party services do this, but then again, many of the largest-
scale ones do “true multitenancy” instead. There’s no great way to offer a
scale-to-zero database like Datastore or BigQuery without “internalizing” the
multitenancy. But then it’s on you as the implementor to ensure isolation
(both security and performance).

In your scenario, if you could make each database small enough (or use a
database that handles the multitenancy for you) you gain a lot in terms of
“capability” for enterprises. Customer X wants a feature that you know how to
do but aren’t sure it should be for everyone? Their instance is separate.
Customer Y has a drastically different read pattern and should have a high-IO
instance? Great.

The easiest advice though is: wait until after you need it. A single
replicated pgsql or MySQL can go insanely far. Let your business needs push
you to the more complicated architecture, later. Prefer whatever gets you to a
success disaster. Just don’t forget backups, PITR, and actually testing that
:). (Another thing that’s harder with a shared database!).

Edit: Adding a newline and a tl;dr.

[1] [https://cloud.google.com/service-
infrastructure/docs/manage-...](https://cloud.google.com/service-
infrastructure/docs/manage-tenant-projects)

~~~
neeleshs
Somewhat tangential: If we use something like BigQuery, which handles
multitenancy well, there still doesn't appear to be a good way to expose it to
a customer directly (say for a BI tool). Like with a simple username/pwd.

Any pointers?

~~~
jpgvm
In GCP you can grant permissions to entities outside of your org, either to
users or to an external service account. This is how things like Cloud Build
and other services that require Google to have access to resources inside your
projects work.

~~~
neeleshs
Doesn't it still need the end users to have some kind of google account? At
the very least a google email account?

~~~
jpgvm
If you want to expose BQ directly yes.

------
gomox
If an "account" is an "enterprise" customer (SMB or large, anything with
multiple user accounts in it), then yes, I know at least a few successful
companies, and I would argue in a lot of scenarios, it's actually advantageous
over conventional multitenancy.

The biggest advantage is flexibility to handle customers requirements (e.g.
change management might have restrictions on versioning updates) and reduced
impact of any failures during upgrade processes. It's easier to roll out
upgrades progressively with proven conventional tools (git branches instead of
shoddy feature flags). Increased isolation is also great from a security
standpoint - you're not a where clause away from leaking customer data to
other customers.

I would go as far as saying this should be the default architecture for
enterprise applications. Cloud infrastructure has eliminated most of the
advantages of conventional multitenancy.

If an account is a single user then no.

PS: I have quite a lot of experience with this, so if you would like more
details just ask.

~~~
polote
What would be the best, different tables per customer or different db?

~~~
brianwawok
There’s overhead both ways. Multi-tenant is more efficient.

What is the most possible users you could have in 5 years? Can your choice be
supported in that technology?

~~~
gomox
Enterprise systems have little to no cross-customer functionality, and
customer isolation is paramount. I think you are assuming that they will have
a database per individual user, but the original question pointed to "tenants"
(i.e. companies using the system).

------
barrkel
Yes, for multi-tenancy. Database per tenant works alright if you have
enterprise customers - i.e. in the hundreds, not millions - and it does help
in security. With the right idioms in the codebase, it pretty much guarantees
you don't accidentally hand one tenant data belonging to a different tenant.

MySQL connections can be reused with database per tenant. Rack middleware
(apartment gem) helps with managing applying migrations across all databases,
and with the mechanics of configuring connections to use a tenant based on
Host header as requests come in.

For async jobs in Java, allocating a connection can idiomatically only be done
via a callback - e.g. `executeInTenantContext(Tenant tenant, Runnable block)`
- which ensures that all connections handed out have the right tenant
selected, and everything is reset to the default tenant (an empty database)
when exiting. Per-tenant jobs either iterate through all tenants, or have a
tenant parameter under which they get executed, and the rest of the code can
be more or less tenant unaware.

It gives you the "freedom" to move tenants to separate servers, or consolidate
them into single servers, if tenants are much larger or much smaller than
expected. In reality this is sufficiently painful that it's mostly illusory. We're
looking at Vitess to help scale out, and move away from a mix of multi-tenant
servers and single-tenant servers.
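For the Rails side of the same idiom, a minimal sketch (the Tenant model, its
database_name column, and the job body are illustrative, not real code from
our app):

```ruby
# All tenant-scoped work goes through this block; the connection is always
# reset to an empty default database on the way out, so a pooled connection
# can never leak the previous tenant.
DEFAULT_DB = "empty_default".freeze

def execute_in_tenant_context(tenant)
  conn = ActiveRecord::Base.connection
  conn.execute("USE #{conn.quote_table_name(tenant.database_name)}")
  yield
ensure
  conn.execute("USE #{conn.quote_table_name(DEFAULT_DB)}")
end

# A per-tenant async job just iterates tenants and runs inside the block.
Tenant.find_each do |tenant|
  execute_in_tenant_context(tenant) do
    Invoice.where(overdue: true).find_each(&:remind!)
  end
end
```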

~~~
thayne
It also makes sharding/ horizontal scalability a non-issue, at least until you
get a big enough customer.

~~~
krainboltgreene
But like, every database is easy to shard/horizontally scale until you get a
big enough customer. Why do the extra leg work?

~~~
gomox
If you have 500 medium customers and you need to add sharding to a multitenant
production behemoth, it's not trivial.

------
thomascgalvin
The inability to reuse database connections would be a huge performance hit.

In a traditional webapp backend, you have a pool of connections to the
database. User01 hits your service, and grabs a connection off the pool.
User02 does the same, and so on. These connections get put back in the pool
for reuse once a user is done with them.

In your design, every time a user hits your service, a new connection,
specific to that user, will have to be made. This will incur network traffic
and the overhead of logging in to the DBMS.

If you're thinking about using something like SQLite, you will hit a hard wall
when the OS isn't able to open any more file descriptors, as well.

Like you said, DB administration will be a huge pain in the ass. Rather than
having Flyway or Liquibase or whatever run a migration on one database,
you'll have to run it on thousands of databases. There will be significant
downtime when your databases are not in a consistent state with one another.
There will also be bloat from the migration tool's record keeping, which will
be duplicated for every user's database rather than existing just once.

A lot of the tools a database gives you for free will also need to be
implemented in application logic, instead. For example, you might want to run
a query that says "show me every user using more than 1GB of storage," but
under your schema, you'll have to log into every user's database individually,
determine storage used, and add it to an in-memory list.
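To make that concrete, a sketch of the kind of fan-out you end up writing
(MySQL flavor; the UserDatabase model holding each per-user database name is
hypothetical):

```ruby
# Sum table + index size per user database via information_schema, then filter
# in application code, because no single query can see across the databases.
conn = ActiveRecord::Base.connection

usage = UserDatabase.pluck(:name).map do |db|
  bytes = conn.select_value(<<~SQL).to_i
    SELECT COALESCE(SUM(data_length + index_length), 0)
    FROM information_schema.tables
    WHERE table_schema = #{conn.quote(db)}
  SQL
  [db, bytes]
end

over_1gb = usage.select { |_db, bytes| bytes > 1.gigabyte }
```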

If you ever want to allow users to collaborate, you will end up replicating
the owner_id field type metadata anyway, and the entire benefit of this schema
will evaporate.

Most frameworks are not set up to handle this style of database access,
either. I don't use Rails, but Spring Boot would fight you every step of the
way if you tried to do this.

~~~
toast0
> In your design, every time a user hits your service, a new connection,
> specific to that user, will have to be made. This will incur network traffic
> and the overhead of logging in to the DBMS.

If connecting to your DB is significantly increasing your page times, you've
got seriously fast pages. Even back when I was working with a MySQL database
regularly in 2010, connect + login was 5 ms at maximum (and I think it was
much less, I just don't remember that far).

> If you're thinking about using something like SQLite, you will hit a hard
> wall when the OS isn't able to open any more file descriptors, as well.

You could always keep open only the most recently used million
databases. It's pretty easy to tune the FD limit; FreeBSD lets you set it to
one FD per 16 KB of RAM without modifying the kernel, but it's a challenge to
be so memory and CPU efficient that that's the limit you hit.

All that said, it really depends on the application. If this is consumer
facing, and each account is an end user, one database per user is probably
excessive overhead; one database (or sharded, if you've got the users) or one
(flatish) file per user makes a lot more sense.

If it's business facing, then one database per account could make sense. You
would have isolation between customers and could put big customers on
dedicated boxes and let the customer drive upgrades and migrations etc. Just
please please please consider that corporations merge and divide all the time,
don't be like G Suite and not offer a way to merge and divide accounts to
reflect their corporate ownership.

~~~
llimllib
> connect + login was 5 ms at maximum (and I think it was much less, I just
> don't remember that far).

I try to get all my endpoints under 10ms! 5ms per call would be huge.

(Obviously I can't succeed all the time, but 5ms is big numbers imo)

~~~
toast0
Ok, you and I can be friends. A lot of people are using 'lightweight'
frameworks where hello world is 30 ms, and then they call slow services and
run slow queries, etc.

If your target is 10 ms, then you probably should worry about db connection
time.

------
redmattred
In the past I worked at a company that managed thousands of individual MSSQL
databases for individual customers due to data security concerns. Effectively
what happened is the schema became locked in place since running migrations
across so many databases became hard to manage.

I currently work at a company where customers have similar concerns around
data privacy, but we've been able to continue using a single multitenant DB
instance by using PostgreSQL's row level security capabilities where rows in a
table are only accessible by a given client's database user:

[https://www.postgresql.org/docs/9.5/ddl-
rowsecurity.html](https://www.postgresql.org/docs/9.5/ddl-rowsecurity.html)

We customized both ActiveRecord and Hibernate to accommodate this requirement.
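A minimal sketch of that kind of policy (the table, the policy name, and a
tenants.db_role column mapping each client's database role are illustrative,
not the actual schema):

```ruby
class AddRowLevelSecurityToInvoices < ActiveRecord::Migration[5.2]
  def up
    execute <<~SQL
      ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
      -- Each client connects as its own database role; rows are visible only
      -- when the row's tenant matches the tenant that owns that role.
      CREATE POLICY tenant_isolation ON invoices
        USING (tenant_id = (SELECT id FROM tenants WHERE db_role = current_user));
    SQL
  end

  def down
    execute <<~SQL
      DROP POLICY tenant_isolation ON invoices;
      ALTER TABLE invoices DISABLE ROW LEVEL SECURITY;
    SQL
  end
end
```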

~~~
onefuncman
Also have thousands of MSSQL databases, but with significant investment in
tooling it really is transparent from a feature development standpoint, and
our feature toggles are dirt simple.

Another comment suggested that doing queries across so many databases is
challenging, and it's just not: we have both ad-hoc query capabilities across
the fleet and a traditional data warehouse...

Now, trying to make additional infrastructure changes is challenging, but the
architecture is robust enough to solve all the immediate business needs.

------
patio11
I am aware of at least one company which does this from my consulting days,
and would caution you that whatever you gain in perceived security benefits
from making sure that tenants can't interact with each other's data, you'll
give back many times over in engineering complexity, operational issues, and
substantial pain to resolve ~trivial questions.

I also tend to think that the security benefit is more theatre than reality.
If an adversary compromises an employee laptop or gets RCE on the web tier
(etc, etc), they'll get all the databases regardless of whose account (if any)
they started with.

(The way I generally deal with this in a cross-tenant application is to ban,
in Rails parlance, Model.find(...) unless the model is whitelisted (non-
customer-specific). All access to customer-specific data is through
@current_account.models.find(...) or
Model.dangerously_find_across_accounts(...) for e.g. internal admin
dashboards. One can audit new uses of dangerously_ methods, restrict them to
particular parts of the codebase via testing or metaprogramming magic, etc.)
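A rough sketch of that convention (everything beyond the method names
mentioned above is illustrative):

```ruby
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  # Models that hold no customer-specific data and may be looked up globally.
  WHITELISTED = %w[Plan Country].freeze

  class << self
    def find(*args)
      unless WHITELISTED.include?(name)
        raise "Use current_account.#{table_name}.find or dangerously_find_across_accounts"
      end
      super
    end

    # Explicit, greppable escape hatch for e.g. internal admin dashboards.
    def dangerously_find_across_accounts(*args)
      unscoped.find(*args)
    end
  end
end

# Normal application code goes through the account association instead:
#   @current_account.invoices.find(params[:id])
```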

~~~
djd20
This is true if your application is running on shared servers - however if
you have fully isolated application and database deploys then you really do
benefit from a security and scalability perspective - and by being able to run
closer to your clients. I'd also say that it works better when you have
hundreds, rather than thousands, of clients, most probably larger
organisations at this point.

------
abhishektwr
For Postgres you can use and scale one schema per customer (B2B). Even then,
depending on the instance size you will be able to accommodate 2000-5000
customers at max on a Postgres database instance. We have scaled one schema
per customer model quite well so far ([https://axioms.io/product/multi-
tenant/](https://axioms.io/product/multi-tenant/)).

That said, there are some interesting challenges with this model like schema
migration and DB backups etc. some of which can be easily overcome by smartly
using workers and queuing. We run migration per schema using a queue to track
progress and handle failures. We also avoid migrations by using Postgres JSON
fields as much as possible. For instance, creating two placeholder fields in
every table like metadata and data. To validate data in JSON fields we use
JSONSchema extensively and it works really well.
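Sketched in Rails terms (the model, its jsonb columns, and the schema are made
up; this uses the json-schema gem):

```ruby
require "json-schema"

# The table has two jsonb placeholder columns, :data and :metadata, instead of
# dedicated columns, so adding a field doesn't require a schema migration.
class Device < ApplicationRecord
  DATA_SCHEMA = {
    "type" => "object",
    "required" => ["serial"],
    "properties" => {
      "serial"   => { "type" => "string" },
      "capacity" => { "type" => "integer" }
    }
  }.freeze

  validate :data_matches_schema

  private

  def data_matches_schema
    JSON::Validator.fully_validate(DATA_SCHEMA, data).each do |message|
      errors.add(:data, message)
    end
  end
end
```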

You probably also need to consider application caching scenarios. Even if you
manage to do one database per customer, running a Redis instance per customer
will be a challenge. You could probably run Redis as a Docker container for
each customer.

~~~
Keats
there's a typo in the title of that page: Mulit-tenant

~~~
abhishektwr
Thank you. Saved me.

------
geofft
Slack does (or did, as of a few years ago) something like this - they shard
teams onto a fleet of MySQL servers. See
[https://www.infoq.com/presentations/slack-
infrastructure/](https://www.infoq.com/presentations/slack-infrastructure/)
starting around the 10-minute mark and [https://www.infoq.com/podcasts/slack-
keith-adams/](https://www.infoq.com/podcasts/slack-keith-adams/) starting
around 7m15.

If I'm understanding this right, every team gets its own database, which is
active-active replicated across two MySQL servers. There's also a separate
mapping from teams to MySQL servers (which is itself a separate pair of MySQL
servers), and of those two servers, one is primary for each team, distributed
roughly evenly. Each MySQL server instance in the fleet is hosting many Slack
workspaces, and they can scale that up/down as needed (i.e., presumably the
tiny Slack I created years ago and forgot about is hosted on the same pair of
MySQL servers as lots of other Slack instances in the same state and my
employer's active Slack instance with thousands of people is on a less-crowded
pair of MySQL servers).

One user-visible effect is that it's possible to have outages that affect some
fraction of workspaces. I also suspect this plays into things like rolling out
changes over the course of many months - they don't need to do a database
migration for everyone at once.

I don't think they use this for security - while yes, a buggy query can only
affect one workspace at once, it doesn't sound to me like they do sharding at
the PHP layer, it sounds like they're running a fleet of generic PHP instances
that have the ability (and authorization) to talk to any workspace's database,
not that they're running separate web/PHP instances per customer. But it
definitely sounds like they rely on this for scalability/performance.

~~~
dmlittle
I'm not sure if they were using it at the time Keith gave his presentation but
today Slack is using Vitess[1] to handle MySQL clustering. There are a few
KubeCon talks about it if you're more interested. If I remember correctly
they're on the tail-end of the Vitess migration and should finish sometime
this year.

[1] [https://vitess.io/](https://vitess.io/)

------
swlkr
I worked for a company that did this, we had hundreds of database instances,
one per customer (which was then used by each of those customers' employees).

It worked out pretty well. The only downside was that analytics/cross customer
stats were kind of a pain.

The customers all seemed to like that their data was separate from everyone
else's. This never happened, but if one database was compromised, everyone
else's would have been fine.

If I were starting a B2B SaaS today where no customers shared data (each
customer = a whole other company) I would use this approach.

~~~
Polylactic_acid
It has been an actual requirement from our customers that they don't share an
instance or database with other customers. It also seriously limits the scope
of bugs in permissions checks. Sometimes I will find a bit of code that should
be doing a permissions check but isn't, which would be a much bigger problem if
it was shared with other companies.

------
eigilsagafos
Early stage B2B startup ShiftX here. We are successfully doing this with
FaunaDB. In Fauna databases are as lightweight as tables and are nested in a
hierarchy. This enables us to do management in the “root” database and keep
all customer data separated in child databases. So when a user signs in to our
app he gets a session that is tied to the specific tenant database. This model
would also allow us to do completely separate database deployments for
customers with special requirements.

------
troydavis
I’ve seen this done for performance. At one point, it was more common than you
might think. These days, most data stores have a way to indicate how to store
the data on disk, or one is using SSDs so random access/seeks are less
expensive than they were with spinning disks.

As one example, New Relic had a table per (hour, customer) pair for a long
time. From [http://highscalability.com/blog/2011/7/18/new-relic-
architec...](http://highscalability.com/blog/2011/7/18/new-relic-architecture-
collecting-20-billion-metrics-a-day.html) (2011):

> Within each server we have individual tables per customer to keep the
> customer data close together on disk and to keep the total number of rows
> per table down.

In a situation like that, all queries and table operations are customer- and
time-specific anyway. At the time, dropping an entire table took less I/O than
deleting specific rows in a multi-customer table (for MyISAM tables, this may
still be true: [https://mariadb.com/kb/en/big-
deletes/](https://mariadb.com/kb/en/big-deletes/)). Also, there was no risk
from locking the table.

[https://www.slideshare.net/newrelic/how-to-build-a-saas-
app-...](https://www.slideshare.net/newrelic/how-to-build-a-saas-app-with-
twitterlike-throughput-on-just-9-servers) has a bit more. I think Lew Cirne
gave that presentation in 2011 but I can’t find a video of it.

In the example you gave, if the goal is to support customer-defined fields, I
don't think most people would map the customer-defined fields directly to SQL
columns. Consider something like Postgres hstore (with indices as needed) or
the many similar implementations in other data stores.

------
scosman
I’ve managed a system with millions of users and tens of billions of rows, and
I always dreamed of DB per user. Generally, ~1% of users were active at a
given time, but a lot of resources were used for the 99% who were offline (eg,
indexes in memory where 99% of the data wouldn’t be needed). Learned a few
tricks. If this is the problem you're trying to solve, some tips below.

Start by optimizing your indexes. Ensure customer_id is in every index and is
the first item. This allows the query to immediately filter to only rows for
this customer. Ensure all queries include a customer_id (should be doing this
anyway in multi-tenant environment). Even single row lookups by primary key
can be really sped up this way; once the index becomes bigger than memory it
has to be paged in/out. However with this approach the entire sub-tree of the
index for “hot” users can remain in memory without paging, increasing cache
hit rate, speeding up queries, and reducing IO.

The above is generally enough. However, some consumer apps have a ton of data
per user and relatively low revenue per user. In this case there’s one more
big trick to keep DB costs down: cluster the whole table by customer_id. Like
the index problem, the row data can be inefficient. If your disk layout
randomly places rows into pages, chances are there’s only one row for a given
client on a given page. If you need to evaluate 1000 rows to do a query,
you’ll have to read close to 1000 pages, and IO is slow/expensive. You’ll use
a ton of memory caching pages where very few of the rows are for hot/active
users. Note: this problem only really matters if your rows are small and you
can fit many rows per page. To fix, cluster the table by customer_id. On
MySQL+InnoDB this is easy ([https://dev.mysql.com/doc/refman/5.7/en/innodb-
index-types.h...](https://dev.mysql.com/doc/refman/5.7/en/innodb-index-
types.html)). On Postgres there’s a CLUSTER command, but it’s one-time and
locking, and it will take forever; MySQL is the easier solution (and I really prefer
Postgres). Lots of the NoSQL DBs allow you to control the physical layout.
Once set up, your cache hit rates will go way up, total IO will go way down,
and you'll see a pretty good speedup.

As always, benchmark a lot for your use case before implementing (but don't
forget to benchmark the hot/cold use cases).
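A sketch of what that looks like as DDL on MySQL/InnoDB (table and column
names are made up): InnoDB clusters rows by primary key, so a (customer_id,
id) primary key keeps each customer's rows physically together, and secondary
indexes lead with customer_id:

```ruby
class CreateEvents < ActiveRecord::Migration[5.2]
  def up
    execute <<~SQL
      CREATE TABLE events (
        customer_id BIGINT NOT NULL,
        id BIGINT NOT NULL AUTO_INCREMENT,
        kind VARCHAR(64),
        created_at DATETIME(6),
        -- InnoDB clusters on the primary key: one customer's rows stay together.
        PRIMARY KEY (customer_id, id),
        -- Secondary indexes also lead with customer_id so hot customers'
        -- index subtrees stay in memory.
        KEY idx_events_customer_created (customer_id, created_at),
        -- AUTO_INCREMENT must lead at least one index on InnoDB.
        UNIQUE KEY idx_events_id (id)
      ) ENGINE=InnoDB;
    SQL
  end

  def down
    drop_table :events
  end
end
```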

PS — I remember a post a few days ago about a company running a distributed
SQLite (many DBs). Looked really interesting but can’t find the link. For
smaller scales than millions of users, look into partitions.

~~~
ec109685
It was Expensify: [https://blog.expensify.com/2018/01/08/scaling-sqlite-
to-4m-q...](https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-qps-on-
a-single-server/amp/)

------
mailbag
I've maintained an enterprise saas product for ~1500 customers that used this
strategy. Cross account analytics were definitely a problem, but the gaping
SQL injection vulnerabilities left by the contractors that built the initial
product were less of a concern.

Snapshotting / restoring entire accounts to a previous state was easy, and
debugging data issues was also much easier when you could spin up an entire
account's DB from a certain point in time locally.

We also could run multiple versions of the product on different schema
versions. Useful when certain customers only wanted their "software" updated
once every 6 months.

------
squiggleblaz
We do that where I am. I think it's been in place for about twenty years -
certainly more than a decade. We're on MySQL/PHP without persistent
connections. There have been many questionable architectural decisions in the
codebase, but this isn't one of them. It seems quite natural that separate
data should be separated and it regularly comes up as a question from
potential clients.

Each client's db is a little different due to some more questionable decisions
(e.g. different features can result in different databases turning on, and
most tables will have variable names). But it's not really any harder to write
code that says "`select identifier from features_enabled` -> derive table name
-> `alter (derived table name) add column baz`" (actually, nowadays we have an
inhouse database migrator that does that for us, we just say "the table
abstractly known as `foo` should have columns bar, baz, quux" and the migrator
says "for this client, they have the feature uses_foo enabled with the
identifiers A and B, so A_foo and B_foo should both have the columns baz, but
do not; I will add it after `bar`".)

Perhaps it has discouraged us from seeking to use persistent connections. But
that is not the biggest inefficiency in our system at this stage.

If I was designing the application from scratch, I think I would want a static
schema, but database-per-tenant seems fine.

------
ericol
Not quite there... but almost.

At some point in the life of my company's product, we were hitting a real
bottleneck in the DB.

The solution was to "split" the data across different DBs, each holding
several clients. So you could say we have the worst of both worlds (as the
code still has to deal with the "security" part).

It is even more complicated, as some tables exist only in the "main" db, and we
fetch data from there constantly.

So far (probably 10+ years since we implemented this), we haven't had any
real issues because of it, and we have developed small in-house tools to keep
things tidy (like schema updates or data inspection).

Server is MySQL.

------
hardwaresofton
Schemas[0] are the scalable way to do this, not databases, at least in
Postgres.

If you're going to go this route you might also want to consider creating a
role-per-user and taking advantage of the role-based security features[1].

That said, this is _not_ how people usually handle multi-tenancy, for good
reason: the complexity often outweighs the security benefit. There are good
articles on it; here's one by CitusData[2] (pre-acquisition).
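A minimal provisioning sketch along those lines (assumes the tenant name has
already been validated as a safe identifier; table-level grants and RLS
policies omitted):

```ruby
# Create one schema plus one login role per tenant, and default that role's
# search_path to its own schema.
def provision_tenant!(name, password)
  conn  = ActiveRecord::Base.connection
  ident = conn.quote_table_name(name)
  conn.execute(<<~SQL)
    CREATE SCHEMA #{ident};
    CREATE ROLE #{ident} LOGIN PASSWORD #{conn.quote(password)};
    GRANT USAGE ON SCHEMA #{ident} TO #{ident};
    ALTER ROLE #{ident} SET search_path = #{ident};
  SQL
end
```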

[0]: [https://www.postgresql.org/docs/current/ddl-
schemas.html](https://www.postgresql.org/docs/current/ddl-schemas.html)

[1]: [https://www.postgresql.org/docs/current/ddl-
rowsecurity.html](https://www.postgresql.org/docs/current/ddl-
rowsecurity.html)

[2]: [https://www.citusdata.com/blog/2016/10/03/designing-your-
saa...](https://www.citusdata.com/blog/2016/10/03/designing-your-saas-
database-for-high-scalability/)

~~~
krainboltgreene
Schemas should be approached with extreme caution. They still suffer from the
same issues as multiple databases, but now you also have to juggle the search
path, and when I used schemas our database backups were insane. This is
because two same-name tables with same-name indexes can't be compressed
together, as I understand it.

I would also be very careful about using roles this way as it makes connection
pool formulas a lot more magical and less deterministic.

------
conductr
I’ve done this. But the service was suited for it in a couple ways;

1\. Each tenant typically only has <10 users, never >20\. And load is
irregular, maybe only ever dealing with 2-3 connections simultaneously. Maybe
<1000 queries per hour max. No concerns with connection bloat/inefficiency.

2\. Tenants create and archive a large number of rows on some tables.
Mutable, but in practice it generally doesn’t change much. But a >100M row
count is not unusual after a couple of years on the service. Not big data by
any means, limited
fields with smallish data types, but...

I didn’t want to deal with sharding a single database. Also, given the row
count would be billions or trillions at some point, the indexing and
performance tuning were beyond what I wanted to manage. Also, this was at a
time before most cloud services/CDNs, and I could easily deploy close to my
client's office if needed.
It worked well and I didn’t really have to hire a DBM or try to become one.

Should be noted, this was a >$1000/month service so I had some decent
infrastructure budget to work with.

------
kiloreven
We're a small company (~50 customers) delivering SaaS using
Django/Postgres/uWSGI for a niche B2B market where privacy and data
confidentiality is paramount.

Currently we deploy one DB + unique uWSGI instances for each customer. This
has some drawbacks which have made us look a bit into multi-tenancy as well.
Everything is served on dedicated hardware, using common codebase, and each
customer is served on a unique sub-domain.

The two primary drawbacks of running unique instances for each customer are
ease of deployment and utilization of resources.

When a new customer is deployed we need to set up the database, run
migrations, set up DNS, deploy the application, deploy the task runner, and
configure the HTTP vhost. Most of this is painfully manual right now,
but we're looking into automating at least parts of the deployment.

In the future, we aim to offer an online solution for signup and onboarding,
where (potential) customers can trigger the provisioning of a new instance,
even for a limited demo. If we were doing multi-tenancy that would just
require a new row in the database + some seed data, which would make the
deployment process exceptionally simpler.

The other issue is the utilization of resources. Running a few instances of
the application with a big worker pool would be much easier to scale than
running 50+ instances with their own isolated worker pool.

We're considering maybe going for a hybrid multi-tenant architecture, where
each customer has their own isolated DB, but with a DB router in the
application. That would give us a compromise between security (isolated
databases - SQL queries don't cross into another customer's data) and
utilization (shared workers across customers). But this would add another
level of complexity and new challenges for deployment.

Does anyone have a similar case?

~~~
Fiahil
We're deploying a unique Kubernetes cluster per client with their own
application, database, task runner and, yes, DNS. Unlike your situation, most
of it is completely automated ( :) ) on Azure and AWS using terraform, good
old bash scripts and a custom go CLI we maintain.

Each client is billed for their own resource usage and we can have version
disparities between clusters.

On the downsides, maintenance, upgrades and deployments take more time, but we
are thinking about potential solutions for managing a fleet of k8s clusters.

This approach makes a lot of sense for B2B customers, and I would add that
it's better to separate everything down to the infrastructure level, rather
than stopping at the database schema. I would probably do it again in a
similar situation !

------
Mooty
Actually it might have some simple benefits: clients are willing to pay to
have a separate database and a separate server, isolated from everything
else, for security purposes. All of our clients at
[https://www.bluelearning.fr](https://www.bluelearning.fr) have their own
separated DB when they choose it. So far it has been a huge hit, as most of
them paid more just for this.

------
mariocesar
The now-closed Ubuntu One FileSync service (a Dropbox-like service) had a 1
database per user approach. And they were actually using it on top of SQLite I
think. The project is open source now. And it is based on U1DB
[https://launchpad.net/u1db](https://launchpad.net/u1db), but I know it didn't
get a lot of traction.

------
SkyMarshal
In Postgres, Schemas are intended to solve this problem - enable user account
isolation without creating multiple databases:

[https://www.postgresql.org/docs/current/ddl-
schemas.html](https://www.postgresql.org/docs/current/ddl-schemas.html)

Given how popular Postgres is, I’m sure there are lots of teams using this
architecture.

------
a-wu
CloudKit actually does this.

[https://www.vldb.org/pvldb/vol11/p540-shraer.pdf](https://www.vldb.org/pvldb/vol11/p540-shraer.pdf)

> within each container CloudKit uniquely divides the data space into many
> private databases that manage user-scoped data

> Each private database belongs to exactly one container and one user

------
oldmanhorton
Azure has a product built specifically for this, so it must be at least
vaguely common. The rationale given in docs is: "A common application pattern
is to provision a single database for each customer. But different customers
often have varying and unpredictable usage patterns, and it's difficult to
predict the resource requirements of each individual database user."

[https://docs.microsoft.com/en-us/azure/sql-database/sql-
data...](https://docs.microsoft.com/en-us/azure/sql-database/sql-database-
elastic-pool)

~~~
vyrotek
We used this successfully at my company many years ago when it was first
released. Even automated updates through DevOps. Have there been any recent
big changes that have improved it? Been looking at it again.

------
_hl_
This will be very inefficient due to the way DBMSs commonly lay out data in
pages. And if you want to do any kind of aggregate queries (e.g. analytics)
you're probably in for some royal pain.

If you want to do this for security, why not layer the DB behind some system
that requires and verifies the user's access tokens for each request?

The only situation where such a setup might make sense is when you actually
need per-user migrations to cater to specific customers' needs, but then
you'll make it very hard to work with all customers' data through a generic
interface.

~~~
gomox
Most enterprise systems either don't require or deliberately forbid mixing
different customers' data in aggregate queries. So it's a bad use case to
optimize for.

------
crazygringo
I guess the question is, why do you want to?

The only real reason you mention is security, but to me this sounds like the
worst tool for the job. Badly written queries that accidentally return other
users' data and make it into production aren't usually a common problem. If
you have unique reasons to think it might be for you, then traditional
testing + production checks at a separate level (e.g. when data is sent to a
view, double-check only permitted user ID's) would probably be your answer.

If you're running any kind of "traditional" webapp (millions of users,
relatively comparable amounts of data per user) then separate databases per
user sounds like crazytown.

If you have massive individual users who you think will be using storage/CPU
that is a significant percentage of a commodity database server's capacity
(e.g. 1 to 20 users per server), who need the performance of having all their
data on the same server, but also whose storage/CPU requirements may vary
widely and unpredictably (and possibly require performance guarantees), then
yes this seems like it _could_ be an option to "shard". Also, if there are
very special configurations per-user that require this flexibility, e.g.
stored on a server in a particular country, with an overall different
encryption level, a different version of client software, etc.

But unless you're dealing with a very unique situation like that, it's hard to
imagine why you'd go with it instead of just traditional sharding techniques.

------
paulmendoza
I have used this architecture at 2 companies and it is by far the best for B2B
scenarios where there could be large amounts of data for a single customer.

It is great for data isolation, scaling data across servers, and easily
deleting customers when they leave.

The only tricky part is schema migrations. Just make sure you apply migration
scripts to databases in an automated way. We use a tool called DbUp. Do not
try to use something like a schema compare tool for releases.

I have managed more than 1500 databases and it is very simple.

------
wim
We do this for our wiki/intranet SaaS app. In our case we don't use it for all
data in the app though, but rather for a specific feature. One of the blocks
customers can add to their wiki pages are form fields so they can build custom
forms. The app then automatically generates a corresponding database in the
backend to store records for their form, using a separate SQLite db.

Our original idea was to allow customers access to their form db directly so
they could even run custom SQL queries on it and so on. In the end we actually
never used that part (users just view and filter their records in the UI), so
unless we still need this in the future, one could argue it's.. just slightly
overengineered. It works well otherwise though, and migrations aren't really a
problem for us, because they're done dynamically by the app (when people make
changes to their form). The scaling is also no issue as one db will always be
small enough to fit on one server, so we can "shard" easily by file. So I
think the only obvious lesson here was make sure you really really need it
(and if migrations/scaling would have meant a lot of overhead, I don't think
we would have even considered it).

------
amanzi
WordPress Multisite gives each blog a set of tables within a single database,
with each set of tables getting the standard WordPress prefix ("wp_") followed
by the blog ID and another underscore before the table name. Then with the
hyperdb plugin you can create rules that let you shard the tables into
different databases based on your requirements. That seems like a good model
that gives you the best of both worlds.

~~~
gomox
My hunch is that they did this because that was a reasonable way to operate in
the old shared hosting days (where you got a single application and a single
database server). A lot of people would also run multiple "blogs" from one
account.

This requires application level code to manage the different sets of tables
and the corresponding access permissions. Unless your use case requires this
flexibility, I think one database per tenant is a better default architecture.

------
codecurve
We set off building a new product using this approach a few years ago and
wrote about our motivations here: [https://medium.com/hackernoon/exploring-
single-tenant-archit...](https://medium.com/hackernoon/exploring-single-
tenant-architectures-57c64e99eece)

I can't remember all of the reasons that we ditched the idea, but it quickly
became clear that we would be writing a lot of the core tooling ourselves, and
we kept running into problems with AWS around dynamically provisioning the
load balancers and certs for N boxes.

I wouldn't dream of trying to manage different schemas across all the
databases either, that sounds like a serious headache that could only be
mitigated by having separate codebases and separate development teams managing
them.

If a customer needs custom fields, I would make them a first-class citizen,
add the required `fields_definitions / field_values` tables to your database,
and let customers manage those fields themselves.
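A rough sketch of those tables in a shared, multi-tenant schema (the two table
names come from the paragraph above; the columns are illustrative):

```ruby
class CreateCustomFields < ActiveRecord::Migration[5.2]
  def change
    create_table :fields_definitions do |t|
      t.bigint :account_id, null: false
      t.string :name, null: false
      t.string :field_type, null: false, default: "text"
      t.timestamps
    end
    add_index :fields_definitions, [:account_id, :name], unique: true

    create_table :field_values do |t|
      t.bigint :account_id, null: false
      t.bigint :fields_definition_id, null: false
      t.references :record, polymorphic: true, null: false # the row being extended
      t.text :value
      t.timestamps
    end
    add_index :field_values, [:account_id, :fields_definition_id],
              name: "idx_field_values_account_definition"
  end
end
```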

I'm glad we ended up defaulting to a multi-tenant architecture even though we
lost some of the proposed benefits (isolation, independent scaling etc).

------
xmodem
I have a bit of experience with this. A SaaS company I used to work for did
this while I was there, primarily due to our legacy architecture (not
originally being a SaaS company).

We already had experience writing DB migrations that were reliable, and we had
a pretty solid test suite of weird edge cases that caught most failures before
we deployed them. Still, some problems would inevitably fall through the
cracks. We had in-house tools that would take a DB snapshot before upgrading
each customer, and our platform provided the functionality to leave a customer
on an old version of our app while we investigated. We also had tools to do
progressive rollouts if we suspected a change was risky.

Even with the best tooling in the world I would strongly advise against this
approach. Cost is one huge factor - the cheapest RDS instance is about
$12/month, so you have to charge more than that to break even (if you're using
AWS- we weren't at the time). But the biggest problems come from keeping track
of scaling for hundreds or thousands of small databases, and paying
performance overhead costs thousands of times.

~~~
gomox
You don't need an RDS per customer. In MySQL jargon, an RDS instance is like a
MySQL "server". You just need a "database" per customer.

On the other hand, if a tenant pays you less than $12 a month or any amount
where the hosting costs are significant in the grand scheme, then I agree that
this is likely not a good architecture.

~~~
sk5t
If you make one database per tenant, then some issues to consider might
nevertheless include maintaining many versions of the schema, separate backup
scripts (other than backing up the entire cluster/instance), busting query
cache, busting connection pooling, baking into the app layer (uncommon) logic
to get a connection to the proper database, some complexity to collect usage-
based info for billing or anomaly detection; others?

~~~
gomox
Many versions of the schema: indeed, that's one of the main tradeoffs. I
consider this to be a feature, not a bug, in an enterprise application
context. The correct version of the schema that corresponds to a given version
of the code is unique, which dramatically simplifies debugging and migration.

Separate backup scripts: only if you want, you can always backup the whole
database server/instance at once (i.e. an RDS snapshot). On the other hand,
you can do more fine grained backup controls if you want to (and with
enterprise customers, you likely will). Need short data retention for an EU
based healthcare provider? Good luck purging that data out of your
consolidated cluster-wide backups.

Busting query cache: I don't think this setup is detrimental to query cache
functionality, unless you have many queries shared between tenants, which you
generally don't in this type of setup (i.e. frequent queries like "SELECT *
FROM post_categories WHERE tenant_id = XX" is not a cache entry that can be
shared between tenants anyway).

Busting connection pooling: not necessarily, but if you want separate DB
credentials for each tenant, yes. If that is not an acceptable tradeoff, you
can reuse connections, at the expense of lessened data isolation between
tenants.

Connection to the proper database: I would argue there's nothing uncommon
about connecting to a specific database, but I'm not sure I see your point.
You just issue a "use XX" statement before you start querying, or connect with
different credentials if not reusing connections.

Collection of data across tenants: yes, cross tenant analytics gain overhead
and complexity, but those are typically not part of the application code
anyway because most of the time you need to manage separate clusters anyway
(region specific, fault tolerance, dedicated, etc).

Anomaly detection at the database level becomes more complex, but (a) in my
opinion, it's the kind of custom problem that you will custom solve anyway (so
no huge cost savings) and (b) a lot of it happens at higher layers (New Relic,
etc) which are not impacted by this architecture choice.

~~~
sk5t
I should have included the context that my notes pertain somewhat more to
Postgres than other database systems; "USE" may work for MSSQL (and others)
where the database is treated as more of a namespace concept, whereas Postgres
requires a new connection per database. If you've got 1,000 databases on a
Postgres cluster and are limited to 1,000 connections, there's going to be a
tradeoff in connection setup latency vs. connection concurrency vs. efficient
resource use. Or, spend time writing a dynamically weighted set of pools? Oof.

A parameterized query like "SELECT * FROM post_categories WHERE tenant_id = ?"
would very much be cacheable by the query planner. Stuff like "does tenant_id
have an index? is it worth using the index based on its histogram? are those
index pages already in memory?" etc.

~~~
gomox
I can see how that can be a problem. I run MySQL but other commenters said
that Postgres has "schemas" which appear to be more similar to MySQL's
"databases".

In the same way I thought you meant the query result cache, not the query plan
cache. In my experience query planning hasn't been a significant bottleneck,
but that's got to be workload dependent.

------
alberth
Virtual Private Databases.

What a lot of enterprise SaaS vendors do is have one single database for all
customer data (single tenant). They then use features like Virtual Private
Database to hide customer A's data from customer B. So if customer A does a
“select *”, they only see their own data and not all of the other customers'
data. This creates faux multi-tenancy, all done using a single db account.

~~~
tedeh
This sounds very much like Row Level Security, but I've never heard the term
"Virtual Private Database" to describe it.

~~~
alberth
It’s the same. Sometimes it's also referred to as Label Security. Conceptually,
they're all accomplishing the same goal.

------
Thristle
What we do at my current job is a server per multiple accounts: each server
holds 500-1000 "normal sized" customers, and the huge or intensive customers
get their own server with another 10-50 customers. We are currently moving
from EC2 + MySQL 5.7 to RDS, mainly for ease of managing.

However, we don't use a tenant id in all tables to differentiate customers; we
use a (confusingly named) DB name of prefix + tenant id for programmatically
making the connection.

We have a single server + db for shared data of tenants, like product-wide
statistics, user/tenant data and mappings, and such things. The tenant table
just has a column for the name of the DB server for that tenant, and that's
pretty much it. Migrations are handled by an internal tool that executes the
migrations on each tenant DB, and 99% of the time everything works just fine
if you are careful about what kind of migration you do and how you write your
code.

Some pitfalls concern column type changes + the read replicas going out of
sync, but that was a single incident that only hurt the replica.
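A rough sketch of that routing in Rails terms (the tenants table columns and
the credential handling are illustrative):

```ruby
# Abstract class whose connection gets re-pointed at the tenant's server/DB.
class TenantRecord < ActiveRecord::Base
  self.abstract_class = true
end

class TenantConnection
  # The shared tenants table maps a tenant to its DB server and database name.
  def self.with(tenant_id)
    tenant = Tenant.find(tenant_id)
    TenantRecord.establish_connection(
      adapter:  "mysql2",
      host:     tenant.db_host,
      database: tenant.db_name,
      username: ENV.fetch("TENANT_DB_USER"),
      password: ENV.fetch("TENANT_DB_PASSWORD")
    )
    yield TenantRecord.connection
  ensure
    TenantRecord.remove_connection
  end
end

# Example:
#   TenantConnection.with(42) { |c| c.select_value("SELECT COUNT(*) FROM orders") }
```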

------
gregjor
Seems impractical and slow at scale to manage even a few hundred separate
databases. You lose all the advantages of the relational model: asking simple
questions like “Which customers ordered more than $100 last month” requires
more application code. You might as well store the customer info in separate
files on disk, each with a different possible format and version.

~~~
vyrotek
Those queries are definitely convenient early on, but eventually you shouldn't
be making them against that system and should instead aggregate the data into
a warehouse.

Technically, there are DBs that let you do cross shard queries. See Azure
Elastic DB.

------
mozey
When using a single db I'd highly recommend adding `account_id` to every
single table that contains data for multiple accounts. It's much easier to
check every query contains `account_id`, as opposed to checking multiple
foreign keys etc. Depending on the db, you can then also easily export all
data for a specific account using filters on the dump tool.

------
otar
In my case this worked out pretty well. Other than data separation and ease of
scaling the database per customer (they might have different behavior of
read/write operations), the other benefit was that we could place a customer's
database in any jurisdiction, which for some enterprise customers appeared to
be an important point, regulation-wise...

------
harel
I considered it a few years ago for a project. My options were one db per
account, one schema per account, or good old fashioned multitenant. The first
two options were dropped after realising what a maintenance nightmare they can
be. In my scenario I would have thousands and thousands of accounts, so a
migration becomes a crazy notion you'll be trying to avoid. The same applies
to multi-schema - there is a limit to how many schemas you can run reasonably
before performance is affected.

What I ended up doing is going for a multi tenant architecture, but I built it
in such a way that the data can move from one db to another. All account
data lives in a separate 'master db', and the rest of the data in another. All
ids are UUIDs. I built database routing into the application so each "account"
can have its data on a different database, and if needed I can move one
account's data from a loaded db to a more available one.

------
mildavw
The apartment gem enables multi-tenant Rails apps using the Postgres schemas
approach described by others here.

It’s slightly clunky in that the public, shared schema tables, say, the one
that holds the list of tenants, exist in every schema; they're just empty.
I rolled my own based on apartment that has one shared public schema, and a
schema for each tenant. Works well.
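A minimal sketch of the request-time switching that goes with this (Rack
middleware keyed off the Host header; the Tenant columns are illustrative):

```ruby
class TenantSwitcher
  def initialize(app)
    @app = app
  end

  # Pick the tenant's schema from the subdomain, always falling back to the
  # shared public schema afterwards.
  def call(env)
    subdomain = Rack::Request.new(env).host.split(".").first
    tenant = Tenant.find_by!(subdomain: subdomain)
    ActiveRecord::Base.connection.schema_search_path = "#{tenant.schema_name},public"
    @app.call(env)
  ensure
    ActiveRecord::Base.connection.schema_search_path = "public"
  end
end
```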

------
jacquesm
As usual, it depends. If you have a few large customers and a large number of
smaller ones I'd isolate the large ones in their own DBs and put the rest in a
single DB. If any of the smaller ones become dominant then you move them out
as well.

If all your customers are individuals or small accounts then I'd put them all
in a single DB, but I'd still build in the option to redirect to another DB if
the software has applicability to enterprise level customers.

Note that it is perfectly OK to upsell the capability to isolate your
customers' data from each other; segregation options could be offered at
multiple levels (all in one bucket, different DB, different cluster), each
with its own price point. Some customers will simply demand this and move
elsewhere if you don't offer it (and they are typically quite insensitive to
price as long as the right boxes get checked for their auditors).

------
zupa-hu
I did something like this early on for a website builder using MySQL but it
very quickly hit bottlenecks.

I also wanted one-filesystem-per-customer support and full isolation as in
Docker, so I ended up writing a new OS to support all of this natively. It
runs in production on boomla.com, a hosted website builder that is also a
programmable platform.

WARNING: it gets even weirder, databases can be nested, a database entry is
also a file and a directory at the same time, or a nested database if you
want. There is no way you understood this sentence. :) For example, you can store
files in files. As in, style.css/background.png, etc.. Every change also
creates a new filesystem snapshot, which may also be nested. All right, I'll stop
here.

This is not as much an advice for OP (to write a new OS) but more like a plug
for the intellectually curious to explore a parallel universe. :)

------
wheelerwj
Seems pretty odd. The closest example I can think of would be maybe
salesforce? Which basically, as far as I can tell, launches a whole new
instance of the application (hosted by Heroku?) for each client. I'm not
100% sure about this, but I think this is how it works.

~~~
snuxoll
Not at all how Salesforce works, they take a lot of pride in their multi-
tenant setup (for better or worse). Every org on a given instance shares the
same application servers and Oracle cluster.

If I were to make a Salesforce competitor that’s one thing I would do
differently, with tools like Kubernetes it’s a lot easier to just give every
customer their own instances. Yes, it can take up more resources - but I
cannot imagine the security nightmare involved with letting multiple customers
execute code (even if it’s theoretically sandboxed) in the same process, plus
the headache that is their database schema.

~~~
nitwit005
Salesforce has a lot of trial or developer orgs, which would be quite
expensive if you tried to host them all as some sort of VM at a cloud hosting
company.

I assume most of them end up as a few rows in some set of DBs, which is far
cheaper.

~~~
snuxoll
Salesforce charges an arm and a leg for extra sandbox instances anyway, and
at that scale you aren’t paying Amazon for compute either.

A full-copy sandbox past the included one for Unlimited Edition orgs is
something like 30% of your annual spend.

~~~
nitwit005
You can fill out a form and cause a free org to exist on their main site, so
hundreds or thousands probably get created per day. It looks like they expire
in 30 days, so if you spun up an app and DB for each it would get expensive
quickly.

------
shakna
There aren't a lot of benefits to doing it. If you have frequent migrations,
then it probably isn't something you ever want to do.

For a site I run, I have one large shared read-only database everyone can
access, and then one database per user.

The per-user DB isn't the most performant way of doing things, but it made it
easier to:

\+ Encrypt an entire user's data at rest using a key I can't reverse engineer.
(The user's DB can only be accessed by the user whilst they're logged in.)

\+ Securely delete a user's data once they delete their account. (A backup of
their account is maintained for sixty days... But I can't decrypt it during
that time. I can restore the account by request, but they still have to login
to access it).

There are other, better, ways of doing the above.

~~~
programmarchy
Can you elaborate on how you achieved encryption at rest with a key you can’t
access? I’m assuming the key is sent in an authorization header, then lives in
memory for the duration of the session, but wondering what your tool chain
looks like.

~~~
shakna
Pretty much.

The encoded key is sent, once, in a header, and then stored in a secure
session cookie that has a reasonable timeout, is user-revokable, and is
encrypted in memory server-side unless it is being accessed.

(Setting up session cookies that only decrypt when being accessed sort of
required reinventing the wheel, as that's apparently not something anyone
usually goes to the effort of, and it sends you down some optimisation paths
around timing that will have you pulling out your hair.)

User-revokable session cookies are simple enough - each user gets their own
session cookie key, and they can roll that over from a settings page.

Worth noting: This is a great way to decimate your server performance, because
most websites aren't constantly handling decryption.

The prototype was written for Flask [0], then rewritten for Bottle [1] when it
was clear I wasn't using 90% of the Flask stack, and monkeypatching most of
what I was using. Nowadays it's a strange mix of Hug [2] and Bottle.

But there's nothing there that's unique to Python or even the framework. It's
easily doable in just about any language. I made three prototypes when I was
coming up with this batty idea, the Flask prototype, one for vibe.d (D), and
one for Go. I settled on Python for no particular reason. They all had similar
performance, because encryption became the bottleneck.

[0]
[https://flask.palletsprojects.com/en/1.1.x/](https://flask.palletsprojects.com/en/1.1.x/)

[1] [http://bottlepy.org/](http://bottlepy.org/)

[2] [https://hugapi.github.io/hug/](https://hugapi.github.io/hug/)
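
The stack above is Python, but the core trick is language-agnostic: the
per-user key only ever exists in the request/session, and the server uses it
transiently to decrypt that user's data. A minimal Ruby sketch of the decrypt
step, with all names invented:

    require "openssl"

    # user_key arrives from the client (header -> session) and is never
    # persisted server-side; iv and auth_tag are stored alongside the
    # user's encrypted data.
    def decrypt_user_blob(ciphertext, iv, auth_tag, user_key)
      cipher = OpenSSL::Cipher.new("aes-256-gcm")
      cipher.decrypt
      cipher.key      = user_key   # 32-byte key
      cipher.iv       = iv
      cipher.auth_tag = auth_tag
      cipher.update(ciphertext) + cipher.final
    end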

~~~
programmarchy
Thanks for sharing, that’s an interesting approach. Does seem very hard to
scale. Do they just set their key from a settings page and then off to the
races? i.e. no login credentials?

~~~
shakna
Certificate file. Generated on registration and handed over as a download and
shredded server-side. Not as trust-fulfilling as a user supplying one, but
less of a learning curve. (Still need validation on it either way, which can
be painful).

Which, of course, means "forgot my password" doesn't work.

------
withinboredom
Haha, oh, for a second there I thought you meant tables. But I'll leave the
below anyway...

How about dozens per account? :) I didn’t ship this, but I work for Automattic
and WordPress.com is basically a highly modified WordPress MU. This means every
time you spin up a site (free or otherwise) a bunch of tables are generated
just for that site. There are at least hundreds of millions of tables. Migrating
schema changes isn’t something I personally deal with, but it’s all
meticulously maintained. It’s nothing special on the surface.

You can look up how WordPress MU maintains schema versions and migrations and
get an idea of how it works if you’re really curious. If you don’t have
homogeneous migrations, it might get pretty dicey, so I’d recommend not doing
that.

------
brigandish
This talk may be helpful[1]. It's given by Jeremy Evans, the maintainer of
Sequel, and it's about how he's made Roda (a Rack-based web framework) more
secure than your average web framework by using database security and
database features that are all too often overlooked by app developers. You
could possibly use Roda for the authentication phase of a Rails app (among
other things) but the insights will be helpful regardless.

[1]
[https://www.youtube.com/watch?v=z3HZZHXXo3I](https://www.youtube.com/watch?v=z3HZZHXXo3I)

------
mdoms
Jira Cloud and Confluence use a DB per user architecture at reasonable, but
not outrageous, scale. I can't share numbers because I am an ex-employee, but
their cloud figures are high enough. This architecture requires significant
tooling and I don't recommend it. It will cause you all kinds of headaches with
regards to reporting and aggregating data. You will spend a small fortune on
vendor tools to solve these problems. And worst of all despite your best
efforts you WILL end up with "snowflake" tenants whose schemas have drifted
just enough to cause you MAJOR headaches.

------
JamesBarney
We weren't able to find any silver bullets. We used an ORM to split
applications for non-enterprise clients, and then just installed a duplicate
application+db in the cloud for enterprise clients and made sure to charge
them for it (30k+ annual subscription fee).

So to upgrade the application we'd upgrade the "everyone" application and then
do the same with each enterprise client. We only had a few.

I'd recommend trying to avoid it if you can, and if you can't, charging the
appropriate amount of money, which is an arm and a leg.

------
edoceo
I have something similar. One PG database per tenant (640). Getting the
current DSN is part of the auth process (central auth DB); connections go
through PgBouncer.

Schema migrations are kind of a pain; we roll out changes gradually, so on
auth there is a blue/green decision.

Custom fields in EAV data-tables or jsonb data.

Backups are great, small(er) and easier to work with/restore.

Easier to move client data between PG nodes. Each DB is faster than one large
one. E.g. an inventory table is only your 1M records, not everyone's 600M
records, so even sequential scan queries are pretty fast.
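
On the custom-fields point above: a jsonb column plus a GIN index is a common
way to do that in Rails/Postgres. A small sketch with made-up table and field
names:

    class AddCustomFieldsToInventoryItems < ActiveRecord::Migration[6.0]
      def change
        add_column :inventory_items, :custom_fields, :jsonb, null: false, default: {}
        add_index  :inventory_items, :custom_fields, using: :gin
      end
    end

    class InventoryItem < ApplicationRecord
      # Expose a couple of well-known custom fields as ordinary attributes.
      store_accessor :custom_fields, :warehouse_bin, :reorder_point
    end

    # Query on a custom field without a dedicated column:
    InventoryItem.where("custom_fields ->> 'warehouse_bin' = ?", "A-12")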

------
andrea_sdl
We did it many years (11) ago for a B2C product, around 200K users (free, 5K
paid). We had both a common Postgres where we stored generic info and a
per-customer database in Derby (the product was in Java).

At the time we weren't that experienced, so we also implemented migrations
ourselves (whereas today I would have chosen Flyway or something similar).

Customer data was text based so we didn't have that many custom fields, but at
the same time we had many migrations over the years.

So, would we still do it? Yes and no. It was amazing because it let us isolate
each user. With filesystem backups we could restore each user's data on demand
without having to reset the whole database, and with little to no systems
knowledge.

At the same time it restricted our choices when improving the product. If we
had to do a whole refactor of the project we always thought "How do I migrate
all the users together?" We had to think about versioning, upgrades at
different times and so on.

From a tech standpoint it wasn't much of a problem, but it burdened the
project a little bit. Bigger refactorings are harder and you need much more
depth when you think about big new features.

I still think it was a nice solution, but this solution came about also because
we didn't know a thing about partitioning in databases and we weren't experts
in clustering. All in all it allowed us to create a product sustaining a decent
customer base with the limited knowledge we had at the time.

With today's technology I'd test whether we could do it other ways.

------
joshuanapoli
Database per tenant makes a lot of sense for enterprise applications where a
hard partition is desired. This does let you run different versions of your
service for different customers... but managing the replicated infrastructure
will be much easier if you can keep them all on the same version. Enterprise
customers may want to control when upgrades are released to them... I think
that it's much easier to continuously upgrade the server but hold back
features via flags.

------
winrid
One company in SF using MeteorJS deploys a whole container with DB and
everything per customer.

I think their primary reasoning is that Meteor doesn't scale super easily, so
it was easier to just "shard" the whole stack per customer.

Personally, I think it's a lot of work. Whether the tradeoffs are worthwhile
depends on what you're doing.

I see this as an optimization. Build everything so you can deploy once for all
your customers. If you need to shard by customer later, it's just an
infrastructure problem.

------
formition
I did this with MS SQL and .NET Core for members.org.au. They share a common
app infrastructure which is scalable, and the database connection string
changes based on the URL.

~~~
vyrotek
How did you roll out DB schema changes to everyone? I've used Azure Elastic DB
to do something similar through DevOps in the past.

------
DecoPerson
I worked for one of the biggest boarding school software companies. The only
option was full-service, but clients could choose between hosted by us or
hosted by them. We didn’t just do 1 database per school, we did entirely
separate hardware/VMs per school. Some regions have very strict data
regulations and the school’s compliance advisors tended to be overly cautious;
they interpreted the regulations and translated them to even stricter
requirements. These requirements were often impossible to satisfy. (How can
the emergency roll call app both work offline AND comply with “no student PII
saved to non-approved storage devices”? Does swap memory count as saving to a
storage device?? Is RAM a “storage device”??? Can 7 red lines be parallel!?!?)

Shared DB instances would have been completely off the table. Thankfully, most
boarding schools have massive IT budgets, so cost minimization was not as
important as adding additional features that justified more spend. Also the
market was quite green when I was there. Strangely, the software seemed to
market itself; the number of out-of-the-blue demo requests was very high, so
first impressions and conversion to paying clients were the primary focus.

~~~
michaelt
_> How can the emergency roll call app both work offline AND comply with “no
student PII saved to non-approved storage devices”?_

By having its offline data store approved as a storage device?

------
planxty
I worked for a company that did this, and our scale was quite large. It took a
lot of work to get AWS to give us more and more databases on RDS. We had some
unique challenges with scaling databases to appropriately meet the needs of
each account. Specifically, it was difficult to automatically right-size a DB
instance to the amount of data and performance a given customer would need. On
the other hand, we did have the flexibility to manually bump an account's
database to a much larger node size if we needed to help someone who was
running into performance issues.

I think the biggest problems had to do with migrations and backups. We
maintained multiple distinct versions of the application, and each had a
unique DB schema, so there was frequent drift in the actual schemas across
accounts. This was painful both from a maintenance POV, and for doing things
like change data capture or ETLs into the data warehouse for data
science/analysis.

Another big problem was dealing with backup/restore situations.

I suspect this decision was made early in the company's history because it was
easier than figuring out how to scale an application originally designed to be
an on-prem solution to become something that could be sold as a SaaS product.

Anyway, I think choosing a solution that nets your business fewer, larger
database nodes will probably avoid a lot of maintenance hurdles. If you can
think ahead and design your application to support things like feature flags,
so customers can gradually opt in to new versions without breaking backwards
compatibility in your codebase, that is probably the better choice. But
consider the safety and security requirements of your product, because there
may be reasons you still want to isolate each tenant's data in its own logical
database.

~~~
planxty
Forgot to mention: we were not a Rails shop, but I think the kind of code or
framework being used isn't the most important challenge here.

------
tobyhede
Years ago I worked for a startup that provided CMS and ecommerce software for
small business. Each of our 3000+ customers had their own MySQL database.

We had a long tail of customers with negligible usage and would run several
thousand MySQL databases on a single server. As customers scaled we could
migrate the database to balance capacity. We could also optionally offer
"premium" and "enterprise" services that guaranteed isolation and higher
durability.

Scaling was never a real issue, but the nature of our clients was steady
incremental growth. I don't think we ever had a case of real "overnight
success" where a shared host customer suddenly melted the infrastructure for
everyone.

However, managing and migrating the databases could be a real issue. We had a
few ways of handling it, but often would need to handle it in the code, `if
schemaVersion == 1 else`. Over time this added up and required discipline to
ensure migration, deprecation and cleanup. As a startup, we mostly didn't
have that discipline and we did have a fair bit of drift in versions and old
code lying around.

------
superdex
What’s ‘scale’?

We do it, but everyone gets the same schema and same app (.NET/IIS/SQL Server),
and when we update the schema, we apply it to all dbs via in-house scripts. It
provides security warm fuzzies to our clients that their data is separated
from other clients. If you want to try and version the app/schema for
different accounts, that's where your headaches are, regardless of db model.
------
madmax108
B2B CRM space startup. We have somewhat of a middle-ground approach. Our level
of isolation for customers is at the schema level.

What this means is each customer has her own schema. Now, large customers want
to be single tenant, so they have a single schema on the entire DB. Smaller
(SMB) customers are a bit more price conscious so they can choose to be
multitenant, i.e. multiple schemas on the same DB.

Managing this is pushed out to a separate metadata manager component, which is
just a DB that maps each customer to the DB/schema they reside on. Connection
pooling is at the DB level (so if you are multitenant then you may have lower
perf because some other customer in the DB is hogging the connections)... But
this has not happened to us yet.

Large customers are more conscious in terms of data so want things like
disk-level encryption with their own keys etc., which we can provide since we
are encrypting the whole DB for them (KMS is the fave here).

We are not really at large scale yet, so dunno what the major gotchas will be
once we scale, but this approach has served us well so far.
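
A stripped-down sketch of what the metadata-manager lookup plus a schema
switch can look like on the Rails/Postgres side. Table, column and schema
names are invented, and the per-host connection switching for single-tenant
customers is omitted:

    # One row per customer in the metadata DB: which host and schema
    # their data lives on.
    class TenantMapping < ApplicationRecord
      # columns: customer_id, db_host, schema_name
    end

    def with_customer_schema(customer_id)
      mapping  = TenantMapping.find_by!(customer_id: customer_id)
      conn     = ActiveRecord::Base.connection
      previous = conn.schema_search_path
      conn.schema_search_path = "#{conn.quote_table_name(mapping.schema_name)}, public"
      yield
    ensure
      conn.schema_search_path = previous if previous
    end

    with_customer_schema(42) { Order.count }  # resolves inside that customer's schema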

------
Nican
Stack Overflow's DBA just posted about this:
[https://twitter.com/tarynpivots/status/1260680179195629568](https://twitter.com/tarynpivots/status/1260680179195629568)

They have 564,295 tables in one SQL Server. Apparently this is for "Stack
Overflow For Teams".

------
skissane
One model I have seen used successfully is a hybrid model in which the product
is designed to be multi-tenant, but then it is deployed in a mix of single
tenant and multi-tenant instances. If you have a big mix of customer sizes
(small businesses through to large enterprises), single-tenant instances for
the large enterprise customers give them maximum flexibility, while multi-
tenant for the small business customers (and even individual teams/departments
within a large enterprise) keeps it cost-effective at the low end. (One
complexity you can have is when a customer starts small but grows big –
sometimes you might start out with just a small team at a large enterprise and
then grow the account to enterprise scale – it can become necessary to design
a mechanism to migrate a tenant from a multi-tenant instance into their own
single-tenant instance.)

------
taylorcooney
There are definitely downsides to scaling out thousands of tenants - I've been
told Heroku supports this, and at a glance I found this doc that says it may
cause issues, [https://devcenter.heroku.com/articles/heroku-
postgresql#mult...](https://devcenter.heroku.com/articles/heroku-
postgresql#multiple-schemas) but it really doesn't change whether you're on
Heroku or not. At the end of the day it's just about your application
structure, how much data you have, how many tables you have etc. Unfortunately
the Apartment gem even has these problems, and even its creators have
expressed some concern ([https://zeph.co/multitenancy-without-subdomains-
rails-5-acts...](https://zeph.co/multitenancy-without-subdomains-rails-5-acts-
as-tenant#why-acts_as_tenant)) about scalability with multiple schemas.

The acts_as_tenant gem might be what you’re looking for:

> This gem was born out of our own need for a fail-safe and out-of-the-way
> manner to add multi-tenancy to our Rails app through a shared database
> strategy, that integrates (near) seamless with Rails.

My recommended configuration to achieve this is to simply add a `tenant_id`
column (or `customer_id` column, etc) on every object that belongs to a
tenant, and to backfill your existing data to have this column set correctly.
When a new account signs up, not a lot happens under the hood; you can create
a row in the main table with the new account, do some initial provisioning for
billing and such, and not much else. Being a multi-tenant platform, you want to
keep the cost of signing up new accounts really low. The easiest way to run a
typical SQL query in a distributed system without restrictions is to always
access data scoped by the tenant. You can specify both the tenant_id and an
object’s own ID for queries in your controller, so the coordinator can locate
your data quickly. The tenant_id should always be included, even when you can
locate an object using its own object_id.
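
A minimal acts_as_tenant sketch of that shared-database setup (model names
are placeholders):

    # app/models/project.rb
    class Project < ApplicationRecord
      acts_as_tenant :account   # adds account_id scoping to every query
    end

    # app/controllers/application_controller.rb
    class ApplicationController < ActionController::Base
      set_current_tenant_through_filter
      before_action :set_tenant

      private

      def set_tenant
        set_current_tenant(current_user.account)
      end
    end

    # With a tenant set, queries are scoped automatically:
    Project.all                 # ... WHERE projects.account_id = <tenant id>
    Project.find(params[:id])   # raises RecordNotFound for another tenant's row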

------
carapace
It's not really the same thing, but the question reminds me that we almost had
SQLite in the browser[1], but Mozilla spiked it in favor of IndexedDB[2] (yet
"Firefox saves Web storage objects in a SQLite file"[3] so I dunno what to
conclude from all that. SQLite is good enough for FF devs but not users?)

Anyway, if you have a web app you already have a DB-per-user, FWIW.

[1]
[https://en.wikipedia.org/wiki/Web_SQL_Database](https://en.wikipedia.org/wiki/Web_SQL_Database)

[2]
[https://en.wikipedia.org/wiki/Indexed_Database_API](https://en.wikipedia.org/wiki/Indexed_Database_API)

[3]
[https://en.wikipedia.org/wiki/Web_storage](https://en.wikipedia.org/wiki/Web_storage)

------
peterwwillis
At scale? It's not only usually unnecessary, it's a bad application<->data
model, and will lead to fractured code, deploy processes, and bit rot.
Depending on design.

Remember that a database is supposed to be shared. It's designed that way for
performance reasons. If your big issue is that you're constantly running
Flyway migrations and it's unmanageable, go schemaless, or refactor
where/when/how you place your
data. Rethink the architecture/design, not the tool. If it's a microservice-
based application, remember that you are supposed to have independent state
per service, so there shouldn't be one giant database anyway.

But for like 12 customers, sure, go for it. It's very common for "whale"
accounts to get dedicated infra/databases while the rest get a common sharded
layer.

------
rpg3
Seems like a ton of extra work with no real upside. For one, if your
migrations fail to complete across all databases for whatever reason, then you
could hit a point where you have databases with differing schemas.

Additionally, like someone else pointed out, trying to run any reporting
across multiple customers will become difficult code-wise and less performant.

Realistically, if you are handling the sort of scale that would require more
interesting scaling solutions for typical db software, you are most certainly
making enough money to implement better approaches.

FWIW, I worked for a company that was handling a few hundred thousand
customers with millions of order records on a relatively small AWS RDS server.
Set up a database cluster and you're rolling for a while.

------
LeonB
I believe that FogBugz used this approach, back in the day (with a SQL Server
backend).

The reasoning was that customers' data couldn't ever leak into each other, and
moving a customer to a different server was easier. I vaguely recall Joel
Spolsky speaking or writing about it.

------
tomaszwro
I worked on a project with PostgreSQL schemas per tenant (almost like
databases). Also worked on another one with "normal" account_id field
separation.

I documented how they compare in a blogpost:
[https://blog.arkency.com/comparison-of-approaches-to-
multite...](https://blog.arkency.com/comparison-of-approaches-to-multitenancy-
in-rails-apps/) \- funnily enough, it was sitting unpublished for some time;
stumbling on your post made me finally publish it.

Looking forward to going through the comments on this question and seeing what
others have experienced.

I have another draft in the making, about some of the pitfalls of the
PostgreSQL schemas approach; it should be released soon.

------
rudyrigot
Yes, we did it at Kenna Security. About 300 paying customers, but over 1000
with trials, and overall about 6B vulnerabilities being tracked (the largest
table in aggregate). Some of the tables were business intelligence data
accessible to all customers, so they were on a “master” DB that all could
access; and some of the tables were fully multi-tenant data, so each customer
had their MySQL DB for it.

The motivation was that we were on RDS’s highest instance and growing, with
jobs mutating the data taking a less and less excusable amount of time.

The initial setup used just the Octopus gem and a bunch of Ruby magic. That
got really complicated really fast (Ruby is not meant to do systems
programming stuff, and Octopus turned out to be very poorly maintained), and
the project turned into a crazy rabbit hole with tons of debt we never could
quite fix later. Over time, we replaced as many Ruby bits as we could with
lower-level stuff, leveraging ProxySQL where we could; the architecture should
have been as low-level as possible from the get-go... I think Rails 6's
multi-DB mode was going to eventually help out too.

One fun piece of debt: after we had migrated all our major clients to their
own shards, we started to work in parallel on making sure new clients would
get their own shard too. We meant to just create the new shard on signup, but
that’s when we found out, when you modify Octopus’s in-memory config of DBs,
it replaces that config with a bulldozer, and interrupts all DB connections in
flight. So, if you were doing stuff right when someone else signs up, your
stuff would fail. We solved this by pre-allocating shards manually every month
or so, triggering a manual blue-green deploy at the end of the process to
gracefully refresh the config. It was tedious but worked great.

And of course, since it was a bunch of Active Record hacks, there were a number
of data-related features we couldn't build because of the challenging
architecture, and it was a constant effort to just keep it going through the
constant bottlenecks we kept hitting. Ha, scale.

Did we regret doing it? No, we needed to solve that scale problem one way or
another. But it was definitely not solved the best way. It’s not an easy
problem to solve.

------
enigma20
This question reminds me of a legacy system which I've seen in the past :D
:D :D

In summary, it worked in the following way:

There was a table client(id, name).

And then dozens of other tables. I don't remember the exact structure, so I
will just use some sample names: \- order_X \- order_item_X \- customer_X \-
newsletter_X

"X" being the ID from the client table mentioned earlier.

Now imagine dozens of "template" tables becoming hundreds once you start adding
new clients. And then in the code, that beautiful logic to fetch data for a
given client :D

And to make things worse, the sets of tables didn't all have the same DB
schema. So imagine those conditions building selects depending on the client
ID :D

------
tilolebo
I have no experience in that field, but the isolation brought by a 1 customer
<-> 1 DB relationship sounds VERY appealing to me. I'm talking about schemas,
not dedicated database servers.

Creating dedicated database app users with the proper permissions pretty much
guarantees that you'll never serve data to the wrong customer.

It also probably makes bug and performance troubleshooting much easier.

The biggest downside is probably the maintenance overhead, but I suppose
automation can mitigate it quite well.

It may make internal reporting a bit harder, as you can't get all the data
from a single SQL query; you'd have to handle this in the application layer.
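
For the "dedicated database app users with the proper permissions" idea, the
Postgres side can be as small as one role per tenant schema. A rough sketch,
with the role, schema and password as placeholders; the app then connects as
that role, so even a buggy query can't reach another tenant's schema:

    class CreateAcmeTenantRole < ActiveRecord::Migration[6.0]
      def up
        execute <<~SQL
          CREATE ROLE tenant_acme LOGIN PASSWORD 'change-me';
          GRANT USAGE ON SCHEMA tenant_acme TO tenant_acme;
          GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA tenant_acme TO tenant_acme;
          ALTER DEFAULT PRIVILEGES IN SCHEMA tenant_acme
            GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO tenant_acme;
        SQL
      end

      def down
        execute "DROP OWNED BY tenant_acme; DROP ROLE tenant_acme;"
      end
    end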

------
fergie
My memory might be a bit shaky on this, but I am pretty sure that Facebook was
running "one solr instance per user" in the early days and that (Apache)
Cassandra was developed from this idea.

Generally any large social network will need to follow the 1 database per
account strategy to some extent because of the tricky many-to-many
relationship that groups/users have with other groups/users- this creates a
cartesian product that is too large to traverse in reasonable time with one
large database.

This of course leads to a world of pain, and shouldn't be attempted unless
there is really no other way to make it work.

~~~
sah2ed
> _Generally any large social network will need to follow the 1 database per
> account strategy to some extent because of the tricky many-to-many
> relationship that groups /users have with other groups/users- this creates a
> cartesian product that is too large to traverse in reasonable time with one
> large database._

Once cartesian products start to reach certain sizes, a few trade-offs can be
made.

The compute cost of traversing complex entity relationships in a relational
database can be mitigated by using denormalization—allowing lots of duplicates
since storage is cheap.

Another alternative is to switch to a network database so entity relationships
can be traversed using graph semantics.

------
hootbootscoot
This sounds like a job for CouchDB with PouchDB in-client. You'll need to
configure replication per-db, but this can be scripted via cURL or wget, as
CouchDB has a REST API.

That's precisely what this system was designed for.
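
Since CouchDB is all HTTP, the per-user provisioning really is just a couple
of requests. A rough Ruby sketch; host, credentials and database names are
made up:

    require "net/http"
    require "json"
    require "uri"

    COUCH = URI("http://admin:secret@localhost:5984")

    def couch(verb, path, body = nil)
      Net::HTTP.start(COUCH.host, COUCH.port) do |http|
        req = Net::HTTP.const_get(verb).new(path)
        req.basic_auth(COUCH.user, COUCH.password)
        req["Content-Type"] = "application/json"
        req.body = body.to_json if body
        http.request(req)
      end
    end

    # One database per user, created at signup:
    couch(:Put, "/userdb-1234")

    # Continuous replication so a PouchDB client or a backup node stays in sync:
    couch(:Post, "/_replicate", {
      "source"     => "http://localhost:5984/userdb-1234",
      "target"     => "https://backup.example.com/userdb-1234",
      "continuous" => true
    })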

------
CodeWriter23
A client of mine used some SaaS from a vendor. The vendor sold you an
“Instance“ that basically is an EC2 instance. Each instance has a self-
contained app and self-contained database. Migration and app update are
handled a single client/single instance at a time. Standard backup and restore
tools can be used. It’s a more expensive approach but the software was
specialized and expensive too. An upside of this model is that it lends itself
well to clients who require on-premise deployment. And it seems for heavy-use
clients, you can scale up their instance and database as needed.

------
lemiffe
We use a database per account; it is necessary for some ISO (and other)
certifications to have single-tenant DBs.

Of course this requires a bunch of extra tooling, like upgrade scripts that
don't ALTER tables directly but rather lock-copy-delete-rename, etc.

There are many tools out there which help out with this, and whatever we
couldn't find we built ourselves. Tools like JOOQ can update code entities
based on the database, so a database-first approach is what we used, but you
can go either way.

The benefit of this approach is ultimately security and fewer multi-tenant
catastrophes leaking data between customers, etc.

~~~
chrisacky
Can you explain why the ALTER approach isn't feasible? If you are locking
anyway, is it not the same thing?

~~~
lemiffe
My mistake, I didn't check before posting.

The use case was: Keep the table online and don't bring down the Galera
cluster (which happened when running an ALTER on a table with millions of
rows).

We went for pt-online-schema-change (from Percona), which copies the table,
alters the new copy, keeps the two in sync, and then swaps them in. All
automated, which is pretty sweet.

One of the answers on here has more info:

[https://stackoverflow.com/questions/463677/alter-table-
witho...](https://stackoverflow.com/questions/463677/alter-table-without-
locking-the-table)

------
theshrike79
We did this at a company a long, long time ago; each customer had their own
Access database backing an ASP website. Some larger migrations were a pain,
but all upgrades were billed to the customers, so it didn't affect anything.

If you can bill the extra computing and devops work to your customers, I'd
go with separate environments altogether. You can do this easily with AWS.

On the plus side you can roll out changes gradually, upgrade the environments
one user at a time.

Also if Customer X pays you to make a custom feature for them, you can sell
the same to all other users if it's generic enough.

------
tobyhede
If you are using Rails, have a look at using PostgreSQL schemas. Single
"physical" database but the schemas give you distinct logical databases.
Perfect for multi-tenant situations.

~~~
nstart
I think a lot of companies that provide hosting services for open source
software follow this model. They'd probably be the best example for the OP to
refer to in their search for relevant designs. I think Ghost follows this
model even when you self-host, so a single machine can host multiple blogs and
each blog creates its own randomly named schema.

------
zerkten
This is a common approach outside of the SaaS space. I'd worry less about
Rails and tools, and more about the outcomes you need. If you have a smaller
number of high value customers (big enterprises or regulated industries), or
offer customers custom add-ons then it can be advantageous to give each
customer their own database. Most of the HN audience will definitely not need
this.

In some industries you'll also have to fight with lawyers about being allowed
to use a database shared between customers because their standard terms will
start with this separation. This approach is helpful when you have to keep
data inside the EU for customers based there. If you want to get creative, you
can also use the approach to game SLAs by using it as the basis to split
customers into "pods" and even if some of these are down you may not have a
100% outage and have to pay customers back.

This design imposes challenges with speed of development and maintenance. If
you don't know your requirements (think: almost any SaaS startup in the
consumer or enterprise space trying to find a niche), then following this
approach is likely to add overhead, which is inadvisable. The companies that
can use this approach are going after an area they already know, and are
prepared to go much more slowly than what most startup developers are used to.

Using row-level security or schemas is recommended for most SaaS/startup
scenarios since you don't have N databases to update and keep in sync with
every change. If you want to do any kind of split then you might consider a
US/EU split, if your customers need to keep data in the EU, but it's best to
consider this at the app-level since caches and other data stores start to
become as important as your database when you have customers that need this.

Consideration should be given to URL design. When you put everything under
yourapp.com/customername it can become hard to split it later. Using URLs like
yourapp.com/invoice/kfsdj28jj42, where "kfsdj28jj42" has the index of the
database (or set of web servers, databases, and caches) encoded in it, becomes
easier to route. Using customer.yourapp.com is a more natural design since it uses
DNS, but the former feels more popular, possibly because it can be handled
more easily in frameworks and doesn't need DNS setup in developer
environments.

------
oblib
IBM's "Cloudant" is not a web app per se, it's db as a service, but the
service is a db per user.

You might want to check out the CouchDB docs, and maybe take a look at their
mailing list for both users and developers. Their dev team can provide the
answers you're looking for as far as CouchDB goes.

It's my understanding that scaling up or down is a key feature of CouchDB.
It's designed to make it easy to create clusters of them working together. But
I really do not know much about that myself.

------
spicyramen
My app was a call center using Twilio. I did initially have a db for
prototyping, then for demos, then we started growing and needed security and
multi-tenancy support. The pace of new customers grew so fast that it just
made sense to spin up new db instances (Postgres). We later started having
issues updating schemas, adding features, upgrading, and troubleshooting. We
needed to redesign the schema, backend and frontend, but it was worth it in
the end and saved us a lot of time.

------
PeterStuer
The original design at tasksinabox.com was one DB per tenant. This became
untenable due mainly to cost after the business gained traction. After
experimenting with a single DB we settled on a sharded approach. This allows
us to have different parameterization as needed in terms of performance,
release group, availability, location, etc. A shard can host thousands of
customers or just a single one. I think this is the architectural sweet spot
for typical B2B SaaS operations.

------
janstice
No, but a previous workplace did fake it with a Postgres-compatible front end
(in Node.js) which pointed to a per-customer Postgres schema with no data, but
with views/etc. pointing to a multi-tenant schema with the actual data. Between
the views and the fake front end we could isolate customer data and provide
hierarchical data access, allowing our customers to point pg-compatible tools
at their data.

I suspect there was negative ROI, and the DBAs avoided me.

------
dillondoyle
We have a separate Google project -> BigQuery for each of our clients. Their
data is big to us, tiny by big-org standards (~a billion rows/cycle depending
on ads). It's political - and BQ views don't work well with Metabase and some
permissions things. There's a master Google project that only a few people can
access, and then as data comes in it's duped to a client Google project with a
different IAM account.

------
hackettma
I would take a look at this article for a good primer on multi-tenancy
patterns as it relates to Rails. I have not used the apartment gem but there
are numerous tutorials on how to set this up.

[https://rubygarage.org/blog/three-database-architectures-
for...](https://rubygarage.org/blog/three-database-architectures-for-a-multi-
tenant-rails-based-saas-app)

------
abraxas
I frequently toy with the idea of creating a platform like this using
Raspberry Pi or some other SBC where every customer gets not only their own
database but their own app server and everything else on a small piece of
hardware. Due to heavily localized data and application code you can likely
get away with the puny hardware and the cost per account is very sensible even
with a pretty high spec SBC.

~~~
jcims
Seems like you could start with containers to test the model out then move to
hardware if the economics worked out. A couple SaaS applications I use right
now seem like they are operating multi-tenant on raspberry pi.

------
paxys
You don't have to pick one or the other. We have a standard horizontally
sharded database setup where large customers get their own servers while
smaller ones are colocated.

One thing we do strictly enforce is that the schema of the entire database
must be consistent, so no one-off changes or migrations per customer. Database
partitioning is completely opaque from the application's perspective.

------
ransom1538
Yes!! In hosted forum software this is the norm. If you want to create an
account you create an entire database for this user. It isn't that bad!
Basically when a user creates an account you run a setup.sql that creates the
db schema. Devops is pretty complex but is possible. E.g. adding a column
would be a script.

Scaling is super easy since you can move a db to another host.
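
In practice the signup-time provisioning is roughly the following; the names
are invented and credential/error handling is omitted:

    # Called when a new forum/account is created.
    def provision_forum!(slug)
      db_name = "forum_#{slug}"   # slug assumed validated/sanitized upstream

      ActiveRecord::Base.connection.execute("CREATE DATABASE `#{db_name}`")

      # setup.sql holds the canonical schema for a fresh forum.
      system("mysql #{db_name} < db/setup.sql") ||
        raise("schema load failed for #{db_name}")
    end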

------
qaq
We did this for 2 large projects I worked on. Works really well for env. where
you can get a lot of data per customer. We had customers with up to 3-4 TB
databases so any other option would either be crazy expensive to run and or to
develop for. You need to invest a bit of time into nice tooling for this but
in a grand scheme of things it's pretty easy to do.

------
econcon
We've a huge app, we use managed document database from major PaaS and we've
our own mysql which syncs to Document database. Problem with document database
is that it can't run complex queries but advantage is that data is safe in the
hand of the major cloud operator.

We've never lost a single piece of data since we started using it.

------
lurker213
I like having one DB and defining RLS rules (in postgresql) for the majority
of my tables. I also have all 'business entity' tables relate back to one
table called records, which has a tenant_id on it and timestamps on it. This
way I can keep FK constraints without doing the silly polymorphic rails model
stuff.
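
A sketch of what that RLS setup can look like in a migration, assuming a
tenant_id column and a per-connection setting (names are placeholders; with a
connection pooler you would prefer SET LOCAL inside a transaction):

    class EnableRowLevelSecurityOnRecords < ActiveRecord::Migration[6.0]
      def up
        execute <<~SQL
          ALTER TABLE records ENABLE ROW LEVEL SECURITY;
          ALTER TABLE records FORCE ROW LEVEL SECURITY;

          -- Rows are only visible when tenant_id matches the connection's setting.
          CREATE POLICY tenant_isolation ON records
            USING (tenant_id = current_setting('app.tenant_id')::bigint);
        SQL
      end

      def down
        execute <<~SQL
          DROP POLICY tenant_isolation ON records;
          ALTER TABLE records DISABLE ROW LEVEL SECURITY;
        SQL
      end
    end

    # Per request, before touching tenant data:
    ActiveRecord::Base.connection.execute("SET app.tenant_id = '#{current_tenant_id.to_i}'")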

------
thrownaway954
i did this a loooong time ago with coldfusion and the saas i wrote. in the
beginning each customer had their own database and instance of the
application. at the time i had 25+ customers and doing an update to the
application or the database took the entire weekend. over the course of 2
months i rewrote everything into a multi-tenant app with a single database and
never looked back.

i think my thinking at the time was that it would be easier from a security
perspective since everyone had their own data, and also speed-wise since
multiple smaller databases could fit into memory better.

security wise, especially with an orm, it's not a huge concern; as long as you
use the top level model as an entry point, it will isolate everyone 95% of the
time.

as for database sizes... we live in a world now where you can pretty much
scale in an instant by throwing more virtual resources at anything, so i
wouldn't worry.

------
oftenwrong
What advantages do you envision for the db-per-account approach?

Depending on that answer, you may be interested in using row-level security:
[https://www.postgresql.org/docs/current/ddl-
rowsecurity.html](https://www.postgresql.org/docs/current/ddl-
rowsecurity.html)

~~~
jbverschoor
Better separation

Easier restores if needed

~~~
majewsky
> Easier restores if needed

Depends on the reason why you need a restore. If something botches many
databases at once (because the filer holding them dies or whatever), you might
be looking at a fun time restoring hundreds or thousands of databases with a
playbook that was meant for one or maybe ten databases and thus isn't
sufficiently automated.

Not saying that these kinds of errors are likely. But you cannot just make
these assertions without the context of your actual threat model. Same for the
"better separation" part. How much separation you need depends on what you're
protecting, and from what.

------
simonw
This is pretty much how WordPress.com works - or used to work, I don't know if
they changed this.

Each account gets its own set of database tables (with a per-account table
prefix) which are located in the same database. Upgrades can then take place
on an account-by-account basis. They run many, many separate MySQL databases.

------
yomly
This [0] is probably not at all your use-case but it sprung to mind in any
case.

A fun read for sure

[0][https://www.colinsteele.org/post/27929539434/60000-growth-
in...](https://www.colinsteele.org/post/27929539434/60000-growth-in-7-months-
using-clojure-and-aws)

------
gigatexal
At my previous company we did. Every unique user had their own SQLite db. This
sharding by user worked well for us.

------
jokull
At QuizUp, a mobile gaming startup, we did not have 1 per user but 1024
databases to be able to shard and scale. We needed pgbouncer to decrease
connection counts and overhead. There was some extra pain, but manageable.
Postgres allows you to have this logical separation on one database instance.
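
The usual way to make a fixed shard count like that work is a deterministic
hash from the user id to one of the N logical databases. A tiny sketch; the
hashing scheme and naming are invented, not QuizUp's:

    require "zlib"

    SHARD_COUNT = 1024

    # The same user always maps to the same logical database.
    def shard_for(user_id)
      Zlib.crc32(user_id.to_s) % SHARD_COUNT
    end

    def database_name_for(user_id)
      format("app_shard_%04d", shard_for(user_id))   # e.g. "app_shard_0317"
    end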

------
ruben88
Oracle is doing this for their cloud software, I think. The benefit is that
they can migrate your environment when you are ready; this way they can ensure
someone's environment always keeps working. The downside is that there is a lot
of admin and a lot of things that can go wrong.

------
Shank
I'm not sure if Roam technically uses separate databases, but it certainly
calls each user's environment a "database." They're on Firebase (Cloud
Firestore?) though, so it might just be a way of naming things and not a true
db-per-user model.

~~~
abhishektwr
Firebase has this interesting feature called "namespaces". If you are building
a multi-tenant app using namespaces, it will probably give you the desired
results. So I guess you can call each user's environment a database if you are
using namespaces.

------
niciliketo
I would be interested to know what is driving you to consider this approach?

For example, I imagine one database per account would make it easier to
provide your customers with a flexible reporting solution.

Most of the reporting solutions available do not have good support for row
level security.

------
sergiotapia
Already some great answers. Some color: A lot of B2B contracts require this
sort of "isolation". So if you read 1 database per account and think that's
crazy, it's not that rare. Now you know! I certainly didn't 2 years ago.

------
gtsteve
My multi-tenant web app does this but I don't know if you'd call 100 unique
users a day "at scale".

I believe it will be helpful if it's necessary to separate customers into
"pods" as we grow.

The main advantage I feel we get however is that it was quite easy to write a
wrapper around mysqldump to retrieve data for development purposes.

I worked at a company that stored all customer data in a single database. The
performance was comparable but the agility was poor. Firstly, you had to
download all customer data to get a copy to debug a problem. This was a
security concern I had, and eventually we had to build a serialisation format
to retrieve the "slice" of data we needed. This tool needed frequent updating
as new tables were added.

You might argue that we should just try to imagine the bug and recreate it but
we have some pretty complicated data structures which can make investigations
very hard.

------
kfk
I am working on a similar use case. I was going to go the Nomad route for
this. I have no idea if this is a good idea or not but I wonder if using an
orchestrator and then saving state in separate volumes will do the trick.

------
hartator
Yes, I did that with PunBB with [http://forumcrea.com](http://forumcrea.com).
It was mostly a shortcut to transform a one-tenant app into a multi-tenant one
without too much trouble.

------
donpark
There is an alternative solution: an account-specific set of tables. Just add
the account id to table names, like 'id_12345_posts' and 'id_12345_invoices'.
To scale, shard them by account ID.
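
A rough illustration of how that can look with ActiveRecord: build a
throwaway model class whose table_name carries the account prefix (the naming
convention is the one described above; in a real app you'd cache these
classes):

    # Returns an ActiveRecord class bound to that account's posts table,
    # e.g. id_12345_posts.
    def posts_model_for(account_id)
      Class.new(ApplicationRecord) do
        self.table_name = "id_#{Integer(account_id)}_posts"
      end
    end

    posts = posts_model_for(12_345)
    posts.where(published: true).count   # SELECT ... FROM id_12345_posts ...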

------
crmrc114
I work in healthcare; our applications commonly do per-org databases for legal
reasons. I have never personally seen anything at scale that is per
account/user, so this is an interesting read.

------
zknz
Worked at a large B2B SaaS from near the beginning. You want a hybrid: a shared
DB, but with the ability to move to a 'shard/pod' architecture where you
separate out your customers/users into different dbs / app servers as you scale.

We did it about 3 years in, when DB became a scale challenge. Eventually
you'll also get to the point where you want to be able to rebalance and
migrate data between each shard.

All of this is nothing you should be trying to solve too early; I struggle to
think of any real benefits of a single DB per user, unless you are separating
out all architecture - including app servers - and that might only be relevant
for large enterprise customers? Selling in that market is hard.

------
andyfowler
Nutshell does this! We have 5,000+ MySQL databases for customers and trials.
Each is fully isolated into their own database, as well as their own Solr
"core."

We've done this from day one, so I can't really speak to the downsides of
_not_ doing it. The peace of mind that comes from some very hard walls
preventing customer data from leaking is worth a few headaches.

A few takeaways:

\- Older MySQL versions struggled to quickly create 100+ tables when a new
trial was provisioned (on the order of a minute to create the DB + tables). We
wanted this to happen in seconds, so we took to preprovisioning empty
databases. This hasn't been necessary in newer versions of MySQL.

\- Thousands of DBs x 100s of tables x `innodb_file_per_table` does cause a
bit of FS overhead and takes some tuning, especially around
`table_open_cache`. It's not insurmountable, but does require attention.

\- We use discrete MySQL credentials per-customer to reduce the blast radius
of a potential SQL injection. Others in this thread mentioned problems with
connection pooling. We've never experienced trouble here. We do 10-20k
requests / minute.

\- This setup doesn't seem to play well with AWS RDS. We did some real-world
testing on Aurora, and saw lousy performance when we got into the hundreds /
thousands of DBs. We'd observe slow memory leaks and eventual restarts. We run
our own MySQL servers on EC2.

\- We don't split ALBs / ASGs / application servers per customer. It's only
the MySQL / Solr layer which is multi-tenant. Memcache and worker queues are
shared.

\- We do a DB migration every few weeks. Like a single-tenant app would, we
execute the migration under application code that can handle either version of
the schema. Each database has a table like ActiveRecord's migrations, to track
all deltas. We have tooling to roll out a delta across all customer instances
and monitor results (a rough sketch of the idea follows this list).

\- A fun bug to periodically track down is when one customer has an odd
collection of data which changes cardinality in such a way that different
indexes are used in a difficult query. In this case, we're comparing `EXPLAIN`
output from a known-good database against a poorly-performing database.

\- This is managed by a pretty lightweight homegrown coordination application
("Drops"), which tracks customers / usernames, and maps them to resources like
database & Solr.

\- All of this makes it really easy to backup, archive, or snapshot a single
customer's data for local development.
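
A very rough sketch of that "roll a delta across every customer database"
tooling. The per-customer connection config, the schema_deltas table and all
names here are assumptions, not Nutshell's actual Drops internals:

    # A separate abstract class so tenant connections don't clobber the main pool.
    class TenantBase < ActiveRecord::Base
      self.abstract_class = true
    end

    DELTA_NAME = "2020_05_add_invoice_index"
    DELTA_SQL  = File.read("deltas/#{DELTA_NAME}.sql")

    Customer.find_each do |customer|
      TenantBase.establish_connection(customer.database_config)  # host/db/user per customer
      conn = TenantBase.connection

      # Skip customers that already have this delta recorded.
      next if conn.select_value("SELECT 1 FROM schema_deltas WHERE name = #{conn.quote(DELTA_NAME)}")

      conn.execute(DELTA_SQL)
      conn.execute("INSERT INTO schema_deltas (name, applied_at) VALUES (#{conn.quote(DELTA_NAME)}, NOW())")
    rescue => e
      warn "#{customer.id}: #{e.message}"   # record the failure, keep rolling
    end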

------
apapli
I recall ServiceNow uses a single-tenant model, not multitenant like most other
SaaS apps. I suspect (but am no authority on the subject) this means a
dedicated database per customer.

------
beobab
I learned to always keep one database containing every custom schema change,
and run my database unit tests (tSQLt is my favourite) against that one
database.

------
timwis
If you’re worried about an application bug giving access to other customers’
data, perhaps row-level security at the database level would help?

------
nwatson
Where I work we're about to move from a single DB across all tenants to a
separation of sorts, due to scaling and customer demands. Very large
enterprise customers will get their own DB as a "group of one", and "groups"
of smaller customers will share a DB. Certain groups will get more up-to-date
software with more software version churn, likely a higher number of issues.
Other groups will get only rock-solid older versions with back-ported bug
fixes ... both kinds of groups will then see benefits along a feature-to-
stability curve. Tenants who pay will get test tenants and a chance for those
to be in a "group" that's ahead, software-version-wise, of their normal formal
tenant.

We do not generally want to fork the product for different versions or schemas
or special features -- the goal instead is to roll upgrades through different
groups so we have more time to react to issues. We still want one single
software version and data storage layout lineage. This matches the
Salesforce.com model, so we won't need to deal with lots of different data
migration histories, custom fields, etc. (I'm curious to see how long we stick
with that). (I realize SFDC is all about custom objects, fields, UIs, etc. ...
but their underlying software is same for all tenants. We also have some
measure of customization, but within the same underlying DB layout that's the
same across all tenants.)

The backend the tenants use is written largely in Java / Spring with managed-RDBMS
and other data-storage technologies from one of the big cloud vendors.
Orchestration is into a Kubernetes/ISTIO environment provisioned from raw
cloud-compute, not a managed service. The coordinator between the managed
storage-and-other services, Kubernetes/ISTIO, the Docker backend-software
registries, the secrets-managers, etc., is a custom Django REST Framework
(DRF) server app that lets DevOps provision "groups", attaching to them fixed
data resources (that don't usually change from deployment-to-deployment) as
well as periodically revised/upgraded software resources (i.e., Docker
containers with backend software).

The DRF server app's main job is to let DevOps define the next-desired-state
aka "deployment" for a "group" (upgrading one or more of the backend servers
... changing the provisioning parameters for a fixed resource ... etc.), and
then kick off a transition to that desired state. Each such "deployment"
reviews and validates once again all resource availability, credentials,
secrets, etc. ... stopping along the way as appropriate for human
verifications. Each step is done within a Django transaction, leading from
"old deployment" to "new deployment". Any failure in any step (after an
appropriate number of retries) leads to a rollback to the previous deployment
state. There's only one low-level step whose failure would lead to an
undetermined "emergency" state getting stuck "between deployments", and that's
very unlikely to fail since by that point all elements needed for the crucial
"switch" in upgraded software have been touched multiple times such that
failure at that point is really unlikely. There's a fairly straightforward
recovery from that state as well, after human intervention.

We chose this custom method because there are so many elements in so many
different infrastructures to verify and tie together that wrapping all the
operations in transaction-mediated Python made sense, plus the Python APIs for
all infrastructure elements are very good, and mostly involve
sending/receiving/inspecting JSON or JSON-like data. There's plenty of
logging, and plenty of side-data stored as JSON blobs in DB records for proper
diagnosis and accounting when things go wrong. Groups can have their
software upgraded without impact to other groups in the system. Another
advantage is that as the "architecture" or "shape" of data and software
resources attached to a "group" changes (changes to how configuration is done;
introduction of a new backend service; introduction of a new datastore), the
DRF server app can seamlessly transition the group from the old to the new
shape (after software revision to make the DRF server app aware of what those
changes are).

The DRF server app itself is easy to upgrade, and breaking changes can be
resolved by an entire parallel deployment of the DRF server app and all the
"groups" using the same per-group backend datastores .. the new deployment
listens on a "future" form of all tenant URLs. At switchover time the pre-
existing DRF server app's tenant URLs get switched to an "past" form, the new
DRF server app's groups tenant URLs get switched.

In any case, these are some of the advantages of the approach. The main
takeaways so far have been:

    
    
      - there was major commitment to building this infrastructure, it hasn't been easy
    
      - controlled definition of "groups" and upgrades to "groups" are very important, we want to avoid downtime
    
      - Kubernetes and ISTIO are great platforms for hosting these apps -- the topology of what a "group" and its tenants look like is a bit complicated but the infrastructure works well
    
      - giving things a unique-enough name is crucial ... as a result we're able to deploy multiple such constellations of fake-groups-of-tenants in development/test environments, each constellation managed by a DRF server
    
      - the DRF will host an ever-growing set of services related to monitoring and servicing the "groups" -- mostly it can be a single-source-of-data with links to appropriate consoles in Kibana, Grafana, cloud-provider infrastructure, etc.,
    

We're still early in the use but so far so good.

------
tyingq
Not an app per se, but old school shared hosting does this. Might be worth $5
for a month of poking around to see what they do.

------
danmoz
I can confirm that BigCommerce does this.... or at least they did 9 years ago
when I interviewed there!

------
kimi
We do this with Docker, on a few thousand customers across 4 datacenters.

------
neeleshs
If you are using postgres, schemas are a good way of doing it.

------
amadeuspagel
Shopify works like that and uses rails.

~~~
byroot
No. That's inaccurate.

Shopify is classic multitenant, but sharded.

------
liveoneggs
[https://www.actordb.com/](https://www.actordb.com/) check this out

------
zbentley
A big healthcare company I worked for did this. It worked extremely well,
though it wasn't without its drawbacks. They adopted the database-per-tenant
pattern in the early '00s, and I truly think it was one of the major things
that allowed them to scale to a large number of (increasingly larger in
data/access patterns) clients. It also made regulatory compliance a bit easier
(everyone's data is pretty firewalled off at the database-access-credentials
layer) I think, but that wasn't really my department.

We ended up in the "thousands to tens of thousands" of clients range, with
thousands of tables per client and a pretty hairy schema.

Each customer had their own schema on one of a few dozen giant database
servers. The company pushed this idea out to other parts of their
infrastructure: separate webserver/message broker/cache tiers existed for each
underlying database server, so outages or brownouts in one component couldn't
affect other customers' data that much.

Schema migrations, interestingly, weren't much of a problem. The practice was
nailed down early of "everyone gets migrated during a release, no snowflakes".
That, plus some pretty paranoid tooling and an acceptable-downtime (well, on
paper it wasn't acceptable, but everyone kinda understood that it was) in
seconds-to-minutes during a release made migrations roughly as traumatic as
migrations anywhere I've worked (which is to say, "somewhat"), but not too
much more. It did take a lot of work to get the tooling right across multiple
schemas in the same database server though. Investment in tooling--up to and
including dedicated teams working on a single tool without a break-up date or
firm mandate other than "make this not suck and keep the lights on"--is
critical here, as in most areas.
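
As a hedged sketch of what "migrate every tenant schema during the release"
can look like (my illustration, not this company's tooling), assuming
Postgres-style schemas named tenant_* that each contain a schema_migrations
table:

    require "pg"
    
    # One entry per release: a version label and the DDL to apply.
    MIGRATIONS = [
      ["202401_add_notes", "ALTER TABLE orders ADD COLUMN notes text"],
    ]
    
    conn = PG.connect(dbname: "app_production")
    tenants = conn.exec(
      "SELECT nspname FROM pg_namespace WHERE nspname LIKE 'tenant_%' ORDER BY nspname"
    ).map { |row| row["nspname"] }
    
    tenants.each do |schema|
      begin
        conn.exec("SET search_path TO #{conn.quote_ident(schema)}, public")
        applied = conn.exec("SELECT version FROM schema_migrations").map { |r| r["version"] }
    
        MIGRATIONS.each do |version, ddl|
          next if applied.include?(version)
          conn.transaction do
            conn.exec(ddl)
            conn.exec_params("INSERT INTO schema_migrations (version) VALUES ($1)", [version])
          end
        end
      rescue PG::Error => e
        # A failure at least tells you exactly which schema is stuck in which state.
        warn "#{schema}: #{e.message}"
      end
    end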

Things that were hard:

\- Connection management. Others on this thread have pointed that out.
Connection pooling and long-lived queue workers were essential, and the
web/request tier couldn't "scale out" too far without hitting connection
limits. Scheduled jobs (and this company _loved_ cron jobs, thousands of
distinct invocations per tenant) were a problem in the connection-management
department. Carefully written tooling around webserver connection reuse, cron-
job execution harnesses (they didn't really run as cron jobs, they got shipped
to a worker already running the code with warm/spare database connections and
run there--all highly custom), and asynchronous jobs was needed. That occupied
a team or three for a while.
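
A hedged sketch of the "warm, shared connections" idea, using the
connection_pool gem; the pool size, schema naming, and the cleanup query are
invented for illustration:

    require "pg"
    require "connection_pool"
    
    # A small, long-lived pool shared by every job on this worker, instead of
    # each cron-style job opening (and leaking) its own connection.
    DB_POOL = ConnectionPool.new(size: 10, timeout: 5) do
      PG.connect(dbname: "app_production")
    end
    
    def run_tenant_job(tenant)
      DB_POOL.with do |conn|
        conn.exec("SET search_path TO #{conn.quote_ident("tenant_#{tenant}")}, public")
        begin
          conn.exec("DELETE FROM sessions WHERE expires_at < now()")
        ensure
          conn.exec("SET search_path TO public") # hand the connection back clean
        end
      end
    end
    
    run_tenant_job("acme")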

\- The "whale" problem. When an individual tenant got big enough to start
crowding out others on the same database server, it caused performance
problems. We eventually worked on a migration tool that moved a client's
entire footprint (and remember, this isn't just databases, but
webs/caches/queue worker hosts/etc.) onto another shard. Building this tool
was a lot of work, but when it was done it worked surprisingly well. My advice
in this area: build a good manually-initiated/semi-supervised migration
system. Leverage underlying database technology (binlog-based replication).
Don't hesitate to get very dirty and custom with e.g. replication logfile
formats, and don't assume that $off_the_shelf_data_replicator isn't gonna
collapse when you want to do online per-schema replication in massive parallel
from the same database (not even if that tool cost you millions of dollars).
Do _NOT_ succumb to the allure of "we can automate the bin-packing and it'll
constantly rearrange clients' datasets for optimal resource usage!" Manual is
just fine for data migrations that big. Worst case, part of someone's job is
to initiate/supervise them.

\- SPOFs sucked. Some datasets weren't per-tenant at all; sometimes client
companies merged together or split up; some data arrived intended for a tenant
but wasn't tagged with that tenant's ID, so it would have to go into some
separate database before it found a home. These systems were, bar none, the
biggest liabilities, causes of production issues, and hardest things to code
around in the entire company. You'd think that having to write application
code for thousands of logical databases across all the per-tenant schemas
would suck, but in reality it wasn't too hard. It was making sure your code
didn't accidentally talk to a SPOF that was the problem. My advice here:
microservices do not help with this problem. HTTP, gRPC, or raw database wire
protocol: if you have dependencies on a "tragedy of the commons"-type used-by-
everyone server sneaking into your nicely sliced up per-tenant architecture,
those callsites are going to be the cause of your sleepless nights. Get good
visibility into where they occur. Favor "push" into per-tenant models over
per-tenant code doing a blocking "pull". Even if the push approach causes
massive additional complexity and work. The costs of pull are too great.

\- Some database specific shit (even on polished hosted offerings from AWS, or
big-budget Oracle installs) will start acting really squirrely when you're
talking to thousands of identical schemas on the same database server (and
thus tens or hundreds of thousands of identical tables with different data).
If you double down on this route, be prepared to have a few really, really
good database folks on staff. I don't mean "help me fix my slow giant
reporting query" people, I mean "familiar with the internals" folks. Example:
query plans can be cached based on query _text_, globally, across an entire
database server. Different schemas have _super_ different clients, and thus
super different data distribution among 100s of GBs of data. The plan that
gets cached for query X against client A is the product of running
heuristics/histograms/etc. across client A's data. That plan might perform
pathologically when query X runs against client B (on the same database
server)'s data, and finding out how/why is really annoying. Solution: bust the
cache by a) happening to know that SQL comments aren't stripped from query
texts before the text is used as a plan-cache key and b) prepending each tenant's
identifier to each query at the database-driver level to prevent cache
pollution. Result: you have traded a spooky query performance issue for a
query-plan-cache-size issue; now your queries are all predictably slow because
all your tenants' different queries are thrashing the plan cache. Tradeoffs
abound.
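
To make the comment-tagging trick concrete, here is a hedged sketch (mine,
not the poster's driver code). Whether it helps depends on how your database
keys its plan cache, and it assumes tenant identifiers are simple tokens that
can't break out of the comment:

    require "pg"
    
    # Thin wrapper that tags every statement with the tenant at the driver
    # level, so identical SQL text issued for different tenants no longer
    # shares a plan-cache key.
    class TenantTaggedConnection
      def initialize(conn, tenant)
        @conn = conn
        @tenant = tenant
      end
    
      def exec(sql, params = [])
        tagged = "/* tenant:#{@tenant} */ #{sql}"
        params.empty? ? @conn.exec(tagged) : @conn.exec_params(tagged, params)
      end
    end
    
    raw  = PG.connect(dbname: "app_production")
    conn = TenantTaggedConnection.new(raw, "client_a")
    conn.exec("SELECT count(*) FROM orders WHERE total_cents > $1", [10_000])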

------
zaroth
I’ve done this at several companies. Each enterprise account (in my case, each
site) gets their own database. IMO it works extremely well.

You will need a way to detect schema version and bulk apply (and optionally
rollback) schema updates. A ‘Schema’ table in each Site database with rows
inserted/deleted after each update/rollback is sufficient.

A separate ‘Hosting’ database keeps track of all the sites and knows about
each schema package, which is a version number, a function which can detect if
the change was applied, and the SQL code to apply the schema change. Don’t
ever store any site specific information other than the name/ID of the site in
the Hosting database - because it could get out of sync when you restore a
site backup, or if you have to restore a Hosting backup.
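
One way that bookkeeping could look, sketched in Ruby against Postgres purely
for illustration (table, package, and column names are invented, and the
package list is a constant here for brevity rather than rows in the Hosting
database): each package carries a version, a way to detect whether it was
applied, and the SQL to apply it, and applying or rolling back inserts or
deletes a row in the per-site version table.

    require "pg"
    
    SchemaPackage = Struct.new(:version, :detect_sql, :apply_sql)
    
    PACKAGES = [
      SchemaPackage.new(
        "0042_add_invoice_notes",
        "SELECT 1 FROM information_schema.columns " \
          "WHERE table_name = 'invoices' AND column_name = 'notes'",
        "ALTER TABLE invoices ADD COLUMN notes text"
      ),
    ]
    
    # Bring one site's database up to date, recording each applied version.
    def upgrade_site(conn)
      PACKAGES.each do |pkg|
        next if conn.exec(pkg.detect_sql).ntuples > 0 # already applied here
        conn.transaction do
          conn.exec(pkg.apply_sql)
          conn.exec_params(
            "INSERT INTO schema_versions (version, applied_at) VALUES ($1, now())",
            [pkg.version]
          )
        end
      end
    end
    
    # A rollback deletes the row again, mirroring rows inserted/deleted per
    # update/rollback.
    def record_rollback(conn, version)
      conn.exec_params("DELETE FROM schema_versions WHERE version = $1", [version])
    end
    
    upgrade_site(PG.connect(dbname: "site_acme"))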

Ideally you would want to make schema changes always backward compatible, as
in an old version of the code can always run fine against a newer schema. So,
e.g. new columns are always nullable, as are new parameters to stored
procedures. This has been a very useful property a number of times during
deployments when you can switch the app binaries around without worrying about
schema rollbacks.
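
For example (an illustrative snippet, not from the comment, in Postgres
syntax via the pg gem), an additive nullable column is the kind of change an
older binary can run against unchanged:

    require "pg"
    
    conn = PG.connect(dbname: "site_acme")
    
    # Safe: old binaries that INSERT without this column keep working.
    conn.exec("ALTER TABLE invoices ADD COLUMN po_number text")
    
    # Not safe without a default and backfill: old INSERTs would start failing.
    # conn.exec("ALTER TABLE invoices ADD COLUMN po_number text NOT NULL")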

You’ll of course need to script the database setup/creation process, so you
can click a button to bring up a new site/customer/tenant. As much as possible
don’t ever touch the database by hand, and if you follow this rule from the
start you will stay in a sane happy place without much overhead at all.

I’ve done this with up to a 4-figure number of databases and it’s served me just
fine. There were many times that certain customers would get ahead in the
schema and then later everyone would catch up as new code rolled out.

I think it would be a heck of a lot scarier doing DB operations if it was all
a single database. For example, you’ll have a new customer who is using a new
feature which you are beta testing with them. Easy to have just their database
ahead of the mainline, and make any fixes there as you go, and then deploy the
final GA schema worldwide.

The only cardinal rule I always followed was that a single binary had to work
for all customers. I would not want to cross the line into customer-specific
code branches at practically any cost. There were certainly feature flags that
were only enabled for single customers, but ultimately every site could run on
the same binaries and indeed the same app farm.

It’s particularly useful to be able to back up/restore easily on a per-customer
basis, and to be able to pull just one customer DB into dev to reproduce an
issue - without needing to pull everything over.

Not with Rails but with SQL Server and C#/ASP.NET. In this case it’s easy to
set things up so that the domain name maps to a database connection string at
a very low level of the code. Everything above would have no concept of what
site it was operating on. You never had to worry about writing any kind of
code to isolate sites except for one thing — mixing the domain name into the
session token so that a malicious user couldn’t try to reuse a session from
another domain. Because of course it’s all the same set of app servers on the
front-end.
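
A hedged sketch of both ideas, in Ruby rather than the commenter's
C#/ASP.NET: look up the tenant database from the request's Host header, and
fold the domain into the signed session token so a token minted for one site
fails validation on another. All names here are invented.

    require "pg"
    require "openssl"
    
    SESSION_SECRET = ENV.fetch("SESSION_SECRET")
    
    # The Hosting database knows only the site's domain/ID and where its data lives.
    def connection_for(host)
      hosting = PG.connect(dbname: "hosting")
      row = hosting.exec_params("SELECT dbname FROM sites WHERE domain = $1", [host]).first
      raise "unknown site: #{host}" unless row
      PG.connect(dbname: row["dbname"])
    ensure
      hosting&.close
    end
    
    # Mix the domain into the token so it can't be replayed against another site.
    def session_token(user_id, host)
      payload = "#{user_id}:#{host}"
      "#{payload}:#{OpenSSL::HMAC.hexdigest("SHA256", SESSION_SECRET, payload)}"
    end
    
    def valid_session?(token, host)
      user_id, token_host, sig = token.split(":", 3)
      expected = OpenSSL::HMAC.hexdigest("SHA256", SESSION_SECRET, "#{user_id}:#{token_host}")
      # Use a constant-time comparison in real code.
      token_host == host && sig == expected
    end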

