Remember you can always just store "where" the data is. You could create a simple map [user_id, inbox_cluster_id, ...] that records which cluster hosts each user_id. Then moving users to other clusters, and most other maintenance, becomes smooth sailing: copy the data, then update the map entry.
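To make the idea concrete, here is a minimal sketch of such a lookup map in Python. Everything in it is hypothetical: the `ShardDirectory` class, its method names, and the `copy_data` callback are made up for illustration, and a real system would keep the map in a replicated store rather than an in-process dict.

```python
class ShardDirectory:
    """Hypothetical in-memory map of user_id -> inbox_cluster_id."""

    def __init__(self):
        self._cluster_of = {}

    def assign(self, user_id, cluster_id):
        # Record which cluster hosts this user's inbox.
        self._cluster_of[user_id] = cluster_id

    def lookup(self, user_id):
        # One lookup tells the caller where the data lives.
        return self._cluster_of[user_id]

    def move(self, user_id, new_cluster_id, copy_data):
        # copy_data is an assumed callback that copies the user's data
        # from the old cluster to the new one before the map is switched.
        old_cluster_id = self._cluster_of[user_id]
        copy_data(user_id, old_cluster_id, new_cluster_id)
        self._cluster_of[user_id] = new_cluster_id
        return old_cluster_id


directory = ShardDirectory()
directory.assign(42, "inbox-a")

copies = []
directory.move(42, "inbox-b", lambda uid, src, dst: copies.append((uid, src, dst)))
print(directory.lookup(42))
```

The point of the sketch is that a move touches only one map entry, so no other user's placement changes.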
At scale, the map itself can become too big for a single node. You can resolve this with more levels of indirection (look up in a meta index which index holds the location of the data), but every extra lookup impacts performance.
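A rough sketch of what one extra level of indirection could look like, again in Python. The names (`TwoLevelDirectory`, `dir-0`, the modulo bucketing) are assumptions made for illustration only; the `hops` counter stands in for what would be network round trips in a real deployment.

```python
from dataclasses import dataclass, field


@dataclass
class TwoLevelDirectory:
    """Hypothetical meta index -> directory node -> cluster lookup."""

    num_buckets: int = 4
    meta_index: dict = field(default_factory=dict)       # bucket -> directory node name
    directory_nodes: dict = field(default_factory=dict)  # node name -> {user_id: cluster_id}
    hops: int = 0                                         # lookups performed so far

    def __post_init__(self):
        # Assumed layout: one directory node per bucket.
        for bucket in range(self.num_buckets):
            node = f"dir-{bucket}"
            self.meta_index[bucket] = node
            self.directory_nodes[node] = {}

    def assign(self, user_id, cluster_id):
        node = self.meta_index[user_id % self.num_buckets]
        self.directory_nodes[node][user_id] = cluster_id

    def lookup(self, user_id):
        # Hop 1: ask the meta index which directory node holds this user.
        node = self.meta_index[user_id % self.num_buckets]
        self.hops += 1
        # Hop 2: ask that directory node which cluster holds the data.
        cluster_id = self.directory_nodes[node][user_id]
        self.hops += 1
        return cluster_id


two_level = TwoLevelDirectory()
two_level.assign(10, "inbox-a")
print(two_level.lookup(10), two_level.hops)
```

Each resolved user costs two lookups here instead of one, which is the performance hit mentioned above.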
But you're right - it's a totally valid option and works more often than it doesn't (you can index a LOT of data with a few GB).
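As a back-of-envelope check on the "few GB" claim, the following Python snippet counts how many entries fit in a given memory budget. The numbers are assumptions, not from the thread: 16 bytes per entry (an 8-byte user_id plus an 8-byte cluster id) and a 4 GiB budget. Real hash tables add overhead, so treat the result as an upper bound.

```python
# Assumed sizes, for illustration only.
BYTES_PER_ENTRY = 8 + 8          # 8-byte user_id + 8-byte inbox_cluster_id
MEMORY_BUDGET = 4 * 2**30        # 4 GiB

max_entries = MEMORY_BUDGET // BYTES_PER_ENTRY
print(max_entries)  # 268435456, i.e. roughly 268 million users
```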