That said, per-customer/user partitioning might still be your best bet given the isolation advantages it brings. If I understand correctly, Citus can guarantee atomicity, consistency, and isolation (ACI*) within a transaction localized to any single partition, which is a _huge_ benefit for building apps that are more tolerant of failure and problematic edge cases with very little effort on your part (compare this to Mongo, which in its latest versions is just starting to give you the "consistency" part and nothing else).
Anyway, nice work from the Citus team!
For long-running or many-record migrations, we shove them into a queue so they can be done in the background in parallel.
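To make the queue idea concrete, here's a minimal sketch in plain Ruby. It is not what we actually run: the batch sizes, the worker count, and the "migration" step (just recording the id) are placeholders, and a real Rails app would more likely hand these batches to a background job system than to raw threads.

```ruby
# Hypothetical work: record ids 1..10, split into batches of 3.
batches = (1..10).each_slice(3).to_a

# Thread-safe queue holding the batches still waiting to be migrated.
queue = Queue.new
batches.each { |batch| queue << batch }

# Thread-safe collection of ids that have been "migrated".
migrated = Queue.new

# Three workers drain the queue in parallel until it is empty.
workers = Array.new(3) do
  Thread.new do
    loop do
      batch = begin
        queue.pop(true) # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break
      end
      # Placeholder for the real per-record migration work.
      batch.each { |id| migrated << id }
    end
  end
end

workers.each(&:join)
puts "migrated #{migrated.size} records"
```

Because each batch is independent, adding workers is just a matter of changing the array size.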
Coupled with the opportunity for things like mass assignment vulnerabilities (I <3 the strong params pattern in Rails), I am a little perturbed by the notion of my data being housed next to other customers' data. If one of those customers is a more attractive target, and a compromise is found, now I'm just along for the ride.
That all said, I'm not well apprised of what problem Citus Data is solving -- so maybe I've just read this wrong?
I've been working on this library for 2-3 months now. If somebody gives it a try, let me know how it goes :)
I'd be happy to answer any questions as well.
Following your example, let's say that most customers have multiple departments and that most tables have both a customer_id and a department_id.
Oh, and this looks solid. Nice work :)
You could probably do it nonetheless with Rails' own has_many/belongs_to relationships for that sub-tenant, but the automatic adding of department_id into queries wouldn't happen.
Depending on what storage you end up using, that might matter or not. For example with Citus, you'd typically take the top-level tenant (customer_id) as a shard key, and just have all the departments of that tenant on the same node - so it would work fine.
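A rough illustration of why sharding on the top-level tenant keeps departments together, in plain Ruby. This is a simplified sketch, not how Citus actually places shards: the node names are made up, and hashing the key modulo the node count is an assumption chosen for brevity.

```ruby
require "zlib"

# Hypothetical worker nodes; the names are invented for this example.
NODES = %w[worker-1 worker-2 worker-3].freeze

# Pick a node from the top-level tenant key (customer_id) only.
# department_id is deliberately ignored, so every department of a
# customer ends up on the same node.
def node_for(customer_id)
  NODES[Zlib.crc32(customer_id.to_s) % NODES.size]
end

rows = [
  { customer_id: 42, department_id: 1 },
  { customer_id: 42, department_id: 2 },
  { customer_id: 7,  department_id: 1 }
]

rows.each do |row|
  puts "customer #{row[:customer_id]} / department #{row[:department_id]} " \
       "-> #{node_for(row[:customer_id])}"
end
```

Both rows for customer 42 resolve to the same node regardless of department, which is why queries scoped to one customer never need to cross nodes.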
That is a problem in systems that expect the tenant_id to always be in the query, so you can locate which node the query needs to go to.
Since Rails isn't used to including something like a tenant_id in its queries, and doesn't support composite primary keys either, you'd have to always hand-write your SQL.
Feel free to take a look at the source - it effectively ends up being a default scope with some additional glue code.
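For anyone who hasn't seen the default-scope idea before, here's a toy version in plain Ruby. To be clear, this is not the library's code or API, and it doesn't use ActiveRecord: `TenantScopedTable`, `current_tenant_id`, and the `page_views` table are hypothetical names, and values are limited to integers to keep the SQL building trivial.

```ruby
# Toy stand-in for a model: every query gets the tenant filter added
# automatically, which is roughly what a default scope does.
class TenantScopedTable
  class << self
    # The tenant that all queries are currently scoped to.
    attr_accessor :current_tenant_id
  end

  def initialize(table_name)
    @table_name = table_name
  end

  # Builds a SQL string with customer_id always present and always first.
  # A caller-supplied customer_id is dropped so the scope can't be bypassed.
  def where(conditions = {})
    tenant_id = self.class.current_tenant_id
    raise ArgumentError, "no tenant set" if tenant_id.nil?

    extra = conditions.reject { |column, _| column == :customer_id }
    scoped = { customer_id: tenant_id }.merge(extra)
    # Integer() rejects anything that isn't an integer value.
    clauses = scoped.map { |column, value| "#{column} = #{Integer(value)}" }
    "SELECT * FROM #{@table_name} WHERE #{clauses.join(' AND ')}"
  end
end

TenantScopedTable.current_tenant_id = 42
puts TenantScopedTable.new("page_views").where(department_id: 3)
# prints: SELECT * FROM page_views WHERE customer_id = 42 AND department_id = 3
```

The department filter is still written by hand here, which matches the point above: only the top-level tenant gets added automatically.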
