
Scale Out Multi-Tenant Apps Based on Ruby on Rails - joeyespo
https://www.citusdata.com/blog/2017/01/05/easily-scale-out-multi-tenant-apps/
======
brandur
One word of warning about choosing a partitioning scheme: if you select
something like customer (as recommended in the article's example), you might
get to a point down the road where you realize that some customers are much,
_much_ bigger than other customers. This can put you in an awkward place when
a single tenant approaches the bounds of its shard, because you may have
little choice but to figure out how to further subpartition that tenant's data.

That said, per-customer/user partitioning might still be your best bet given
the advantages that it conveys around isolation. If I understand correctly,
Citus can guarantee atomicity, consistency, and isolation (ACI*) within a
transaction localized to any single partition which is a _huge_ benefit for
building apps that are more tolerant to failure and problematic edge cases
with very little effort on your part (compare this to Mongo, which in its
latest versions is just starting to give you the "consistency" part and
nothing else).

Anyway, nice work from the Citus team!

~~~
craigkerstiens
We actually have something on the roadmap to help address the case of a very
large customer. In future versions of Citus you'll be able to isolate a single
very large tenant to their own shard so their resource consumption won't
compete as directly with other smaller users, and then if needed you could
scale them out to their own physical node with no changes to your app.

~~~
ddorian43
You'll need multi-shard tables for big clients, along with the ability to
split shards as they grow.

------
tomschlick
If you don't have to share data between users, a multi-database setup is the
way to go. I wrote a post about how I do it a little over a year ago...

[https://tomschlick.com/2015/11/29/lessons-from-building-
mult...](https://tomschlick.com/2015/11/29/lessons-from-building-multi-tenant-
apps-part-1-databases)

~~~
raarts
What is your experience with doing schema upgrades across many databases? Is
it usually possible to break up schema changes in small pieces, so that it
works across multiple versions of your software?

~~~
tomschlick
Yeah we try to minimize the changes that would require a lock. For those cases
we have a flag where we can turn off each tenant individually for maintenance.

For long-running migrations, or ones touching many records, we shove those
into a queue so they can be done in the background in parallel.
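
As a rough illustration of the queue idea above, here is a minimal plain-Ruby
sketch. The `migrate_all` helper, the tenant names, and the worker count are
all hypothetical; a real setup would more likely use a job system and a real
migration step instead of the block passed in here.

```ruby
# Hypothetical sketch: fan per-tenant migrations out to a small thread pool.
# The block passed to migrate_all stands in for the real migration step.
def migrate_all(tenants, workers: 4, &migration)
  pending = Queue.new
  tenants.each { |tenant| pending << tenant }
  results = Queue.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        tenant = begin
          pending.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end

        begin
          migration.call(tenant)
          results << [tenant, :ok]
        rescue StandardError => e
          # Record the failure and keep going with the other tenants.
          results << [tenant, e]
        end
      end
    end
  end

  threads.each(&:join)
  Array.new(results.size) { results.pop }.to_h
end

report = migrate_all(%w[acme globex initech], workers: 2) do |tenant|
  raise "lock timeout" if tenant == "globex" # simulated failure
end
```

Each tenant ends up in `report` with either `:ok` or the exception it raised,
so one failing tenant doesn't block the rest.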

------
dayjah
An important aspect of the infosec philosophy my employer pushes is "blast
radius". If you assume you will be compromised then you want to ensure as
little data as possible ends up in that compromise.

Coupled with the opportunity for things like mass assignment vulnerabilities
(i <3 the strong params pattern on rails) I am a little perturbed by the
notion of my data being housed next to other customers. If that customer is a
more attractive target, and a compromise is found, now I'm just along for the
ride.

That all said, I'm not well apprised of what problem Citus Data is solving --
so maybe I've just read this wrong?

------
lunaru
As someone who's seen too many gems get abandoned, I'm hoping that this one
sticks! Fortunately, it seems ultimately aligned with the underlying business
so that's usually a good indicator.

Another problem with these model<->database gems is that they're impossible to
mix and match. In particular, if you use something like octopus
([https://github.com/thiagopradi/octopus](https://github.com/thiagopradi/octopus))
for selecting replica databases, then you're out of luck when trying to match
it with this gem, at least at first glance.

------
lfittl
Thanks for sharing - author here.

I've been working on this library for 2-3 months now, if somebody gives it a
try let me know how it goes :)

I'd be happy to answer any questions as well.

~~~
jaxn
Does this have easy support for some kind of sub-tenant setup?

Following your example, let's say that most customers have multiple
departments and that most tables have both a customer_id and a department_id.

Oh, and this looks solid. Nice work :)

~~~
lfittl
That's a good question - there is nothing explicit for it, yet.

You could probably do it nonetheless with Rails' own has_many/belongs_to
relationships for that sub-tenant, but the automatic adding of department_id
into queries wouldn't happen.

Depending on what storage you end up using, that might matter or not. For
example with Citus, you'd typically take the top-level tenant (customer_id) as
a shard key, and just have all the departments of that tenant on the same node
- so it would work fine.
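
To illustrate why sharding on the top-level tenant keeps departments together,
here is a small sketch. It assumes a simple hash-modulo placement with CRC32
and a made-up shard count; this is only an illustration and not how Citus
itself assigns rows to shards.

```ruby
require "zlib"

SHARD_COUNT = 32 # assumed value, for illustration only

# Hypothetical placement function: hash only the tenant key (customer_id).
def shard_for(customer_id, shard_count = SHARD_COUNT)
  Zlib.crc32(customer_id.to_s) % shard_count
end

rows = [
  { customer_id: 42, department_id: 1 },
  { customer_id: 42, department_id: 2 },
  { customer_id: 7,  department_id: 1 },
]

# department_id never enters the hash, so every department of a customer
# is placed on the same shard as the customer's other rows.
placement = rows.group_by { |row| shard_for(row[:customer_id]) }
```

Because only `customer_id` is hashed, a join between a customer's departments
and their other rows never has to cross shards in this model.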

------
sbayona573
What's the benefit of this approach vs doing scoped queries? i.e.
current_user.pages.find(params[:id])

~~~
smileysteve
My experience is that it's easy for a developer (new or not) to forget to
scope some query. So, by using a gem like this or tenancy, your infrastructure
prevents the tragic mistake of missing it.

A middle ground solution might extend active record to warn you whenever a
query doesn't have current_user / current_customer.
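
A rough sketch of that middle ground, using a hypothetical in-memory
`GuardedTable` class rather than ActiveRecord itself (the class name, the
`tenant_key` option, and the warning text are all made up for illustration):

```ruby
# Hypothetical stand-in for a model; it only demonstrates the warning idea.
class GuardedTable
  attr_reader :warnings

  def initialize(rows, tenant_key: :customer_id)
    @rows = rows
    @tenant_key = tenant_key
    @warnings = []
  end

  # Runs the query either way, but records a warning when the
  # conditions do not mention the tenant key.
  def where(conditions)
    unless conditions.key?(@tenant_key)
      @warnings << "unscoped query: #{conditions.inspect}"
    end
    @rows.select { |row| conditions.all? { |key, value| row[key] == value } }
  end
end

pages = GuardedTable.new([
  { id: 1, customer_id: 1 },
  { id: 2, customer_id: 2 },
])

pages.where(id: 1)                  # records a warning
pages.where(id: 1, customer_id: 1)  # scoped, no warning
```

This only flags the mistake after the fact; a gem that adds the tenant
condition automatically prevents the unscoped query from running at all.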

------
smileysteve
I've used the gem 'tenancy' which extends ActiveRecord to set a where
statement for the tenant on every query of every object that belongs to the
tenant.

~~~
ehsanu1
What about objects that belong to objects that belong to the tenant? Or
further levels thereof?

------
33degrees
I see in the readme that the gem is based on acts_as_tenant, which I currently
use; what has been changed/added? I'm wondering if it's worth the switch.

~~~
lfittl
There are a few things, the most important ones:

    
    
      - Including the tenant_id in modification queries as well (e.g. UPDATE/DELETE)
      - Support for specifying the current tenant as an ID (through `MultiTenant.with_id(123) { ... }`)
      - Helpers for multi-column primary keys in your migrations
      - Specific integration with Citus as a distributed multi-tenant data store (it'll just work without modifications)
    

I've considered contributing these back to acts_as_tenant, but it seemed
easier to make a clean cut, since the shared code ended up being only about
50 lines - mostly to set a default scope.
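
For readers unfamiliar with the pattern, here is a minimal plain-Ruby sketch
of a block-scoped current tenant combined with a tenant condition on a
modification query. `TenantContext` and `delete_sql` are hypothetical names
for illustration; this is not the gem's actual implementation.

```ruby
# Hypothetical sketch of a block-scoped "current tenant".
module TenantContext
  def self.current_id
    Thread.current[:tenant_id] # Thread.current[] storage is fiber-local in Ruby
  end

  # Sets the tenant for the duration of the block and restores the
  # previous value afterwards, even if the block raises.
  def self.with_id(id)
    previous = current_id
    Thread.current[:tenant_id] = id
    yield
  ensure
    Thread.current[:tenant_id] = previous
  end
end

# Builds a parameterized DELETE that always carries the tenant condition.
def delete_sql(table, id)
  tenant_id = TenantContext.current_id
  raise "no tenant set" if tenant_id.nil?

  ["DELETE FROM #{table} WHERE id = $1 AND tenant_id = $2", [id, tenant_id]]
end

sql, binds = TenantContext.with_id(123) { delete_sql("pages", 5) }
```

Having the tenant column in the `WHERE` clause of writes is what lets a
distributed store route the statement to a single shard.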

------
mperham
How is this different from existing multitenancy gems like `apartment`?

~~~
lfittl
Let me start by saying that apartment is great - but it takes a different
approach.

The most important difference is that apartment shards by database or schema,
whereas activerecord-multi-tenant puts all tenants' data into the same tables,
and then relies on a PostgreSQL extension like Citus to do the distribution.

There are trade-offs to both, but part of the reason we prefer the latter
approach is that schema changes become a lot easier, and you can do shard-
splitting based on a shard count, as opposed to just by tenant, effectively
grouping smaller tenants together on a node.

I'll also give a shout-out to acts_as_tenant, which our library is based on,
although we've deviated a bit since then.

------
hsh
Can't the same be achieved with a has_many :through association?

~~~
lfittl
It's a bit more complex than that, unfortunately.

Since Rails doesn't include something like a tenant_id in its queries by
default, and doesn't support composite primary keys either, you'd have to
always hand-write your SQL.

Feel free to take a look at the source - it effectively ends up being a
default scope with some additional glue code.

