
Generating Random Numbers from a Specific Distribution by Inverting the CDF - Atrix256
https://blog.demofox.org/2017/08/05/generating-random-numbers-from-a-specific-distribution-by-inverting-the-cdf/
======
pash
Statisticians call this _inverse transform sampling_ [0]. The dual result,
that Y = F(X) is uniformly distributed for F the cdf of any random variable X,
is called the _probability integral transform_ [1].

0\.
[https://en.wikipedia.org/wiki/Inverse_transform_sampling](https://en.wikipedia.org/wiki/Inverse_transform_sampling)

1\.
[https://en.wikipedia.org/wiki/Probability_integral_transform](https://en.wikipedia.org/wiki/Probability_integral_transform)

~~~
nerdponx
Related fun fact: the distribution of F(X) is uniform on the support of F. Not
hard to prove, either.

------
todd8
For those needing more detail on generating numbers from a specific
distribution there is Knuth's _The Art of Computer Programming_ , Vol. 2
Seminumerical Algorithms, see section 3.4 _Other types of random quantities_.
Special treatment is given for many distributions: normal, exponential, gamma,
chi-squared, f-distribution, binomial, Poisson, along with discussion of
general methods. Although there have been more recent advances in the
generation of high-quality, high-speed generation of uniformly distributed
random numbers, Knuth covers a whole range of issues related to pseudorandom
number generation.

------
gautamdivgi
I used inverse transform sampling extensively in my thesis. I used it for
fitting distributions to data (new methods)and similarity analysis of CDF's.
In the software engineering field in am frequently appalled by how often basic
probability techniques are ignored in favor of just the "mean" and "std.
deviation".

~~~
d-sc
I'm curious if you could elaborate on this more. What other probably
techniques would you recommend knowing?

~~~
tempay
I'd recommend knowing about bootstrapping[1], there must be simpler articles
but the basic idea is to randomly sample with replacement to generate new
datasets which can then be used to test the variation of properties of the
original dataset.

[1]
[https://en.wikipedia.org/wiki/Bootstrapping_(statistics)](https://en.wikipedia.org/wiki/Bootstrapping_\(statistics\))

------
petters
I'm sorry, but this post is not a very good explanation. Here's why we should
invert the CDF:

> The issue is that if we flip x and y’s in a PDF, there would be multiple y
> values corresponding to the same x. This isn’t true in a CDF.

Surely there must be a better reason to invert the CDF than that it's
possible?

Any textbook will explain this much, much better.

~~~
Atrix256
Have another way to explain it that you think is more appropriate?

------
procedural_love
This is awesome! I wanted to do this for a procedural generation algorithm
during a game jam but didn't have the time to figure it out.

