I recently read Crypto.com's original "On-Chain Market Sizing" publication, and I have a question about the methodology.
To determine the number of wallet-holders on the Bitcoin blockchain, they count the number of deposit addresses connected to various exchanges:
Most exchanges use an architecture called “deposit sweeping” to
handle crypto deposit inflows. By counting the deposit addresses via
on-chain data, we can approximate the number of users for that
However, I don't think every exchange assigns a single deposit address to each user. In the past, when I deposited funds at some exchanges, I was given a different address each time. Wouldn't this mean that the method is counting deposits rather than users? Since each user may make several deposits at a single exchange, this could inflate the projected user count by anywhere from 2 to 100+ times, depending on the average number of deposits per user per exchange.
The methodology makes no attempt to correct for this, and goes on to inflate the value further by dividing it by the 'deposit rate', which is calculated in part from one of Crypto.com's surveys:
By assuming that the number of users on an exchange approximately
equals to the number of exchange addresses, we can state our first
formula [Formula 1]
However, on-chain addresses will only appear once people have
performed a deposit / withdrawal action, to account for the fact that
some people have never deposited into / withdrawn from the
To estimate the true number of exchange users, we need to further
scale up the number in formula (1): it should be divided by the ratio
that people have deposited into an exchange.
We calculated the deposit ratio by blended metrics from our internal
data, survey data and industry benchmark.
The assumption that "the number of users on an exchange approximately equals to the number of exchange addresses" seems very strong to me: it doesn't align with my own experience of using exchanges, and it could skew the results by an order of magnitude!
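To make the concern concrete, here is a minimal Python sketch of the two-step estimate as I understand it from the quoted passages: count deposit addresses, then divide by the deposit ratio. The function name, the `addresses_per_user` correction, and all the numbers are my own hypothetical illustrations, not anything taken from the report.

```python
def estimate_users(deposit_addresses, deposit_rate, addresses_per_user=1.0):
    """Rough user estimate from on-chain deposit addresses.

    deposit_addresses:  number of exchange deposit addresses seen on-chain
    deposit_rate:       fraction of users who have ever deposited (0 < rate <= 1)
    addresses_per_user: average deposit addresses per depositing user.
                        The report's assumption corresponds to 1.0; this
                        parameter is a hypothetical correction, not part
                        of the published methodology.
    """
    if not 0 < deposit_rate <= 1:
        raise ValueError("deposit_rate must be in (0, 1]")
    if addresses_per_user < 1:
        raise ValueError("addresses_per_user must be at least 1")

    # Step 1: convert addresses to depositing users.
    depositing_users = deposit_addresses / addresses_per_user
    # Step 2: scale up for users who never deposited.
    return depositing_users / deposit_rate


# Made-up inputs, for illustration only.
addresses = 1_000_000
rate = 0.5

one_address_each = estimate_users(addresses, rate)       # report's assumption
ten_addresses_each = estimate_users(addresses, rate, 10)  # fresh address per deposit

print(one_address_each)                       # 2000000.0
print(ten_addresses_each)                     # 200000.0
print(one_address_each / ten_addresses_each)  # 10.0
```

With these made-up inputs, the overstatement equals the average number of addresses per user, and dividing by the deposit ratio carries that factor straight through to the final figure rather than correcting it.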
Can someone with a better understanding of the different exchanges included in this report, and the 'deposit sweeping' methods and wallet structures used by these exchanges, provide their take on this assumption and the potential for error here?