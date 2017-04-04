If your distributed system lets you shard by hash or range on a key, for example user_id, then you can compute an exact count(distinct user_id) across the whole system without reshuffling the data: every row for a given user lives on the same node, so each node counts its own distinct users and the per-node counts simply add up.
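
To make that concrete, here is a minimal Python sketch of the idea, not any particular database's implementation. The names `shard_for`, `partition_events` and `global_distinct_users` are hypothetical, the CRC32-modulo hash is just one possible sharding function, and the in-memory lists stand in for real nodes.

```python
import zlib
from collections import defaultdict


def shard_for(user_id: int, num_shards: int) -> int:
    # Deterministic hash, so a given user_id always lands on the same shard.
    # (Assumption: CRC32 modulo shard count; real systems use their own scheme.)
    return zlib.crc32(str(user_id).encode("utf-8")) % num_shards


def partition_events(events, num_shards):
    # Route each (user_id, payload) row to the shard that owns its user_id.
    shards = defaultdict(list)
    for user_id, payload in events:
        shards[shard_for(user_id, num_shards)].append((user_id, payload))
    return shards


def local_distinct_users(rows):
    # What a single node can compute by looking only at its own rows.
    return len({user_id for user_id, _ in rows})


def global_distinct_users(shards):
    # No user appears on two shards, so the local counts are over disjoint
    # sets and can be summed without moving any rows between nodes.
    return sum(local_distinct_users(rows) for rows in shards.values())


events = [
    (1, "click"), (2, "view"), (1, "view"), (3, "click"),
    (2, "click"), (3, "view"), (4, "click"),
]
shards = partition_events(events, num_shards=3)
total = global_distinct_users(shards)
```

The sum is only exact because the shard key and the column under `distinct` are the same. If the rows were sharded on some other column, one user could show up on several nodes and the sum of local counts would overcount.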
