
The Central Limit Theorem and Sampling - luminousmen
https://luminousmen.com/post/data-science-central-limit-theorem
======
clircle
> According to the central limit theorem, the average value of the data sample
> will be closer to the average value of the whole population and will be
> approximately normal, as the sample size increases.

This is a pretty loose statement of the central limit theorem. Sample averages
converge to the population average by the strong law of large numbers (almost
surely, under mild conditions). The central limit theorem is instead a
statement about the fluctuations of the sample average around the population
mean: multiply the difference (sample avg. - pop mean) by sqrt(sample size),
and as the sample size goes to infinity this converges in distribution to a
normal (under somewhat stronger conditions, e.g. finite variance).
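A minimal NumPy sketch of the distinction (my own illustration, not from the
article): using an exponential population, the sample mean settles on the
population mean (LLN), while the sqrt(n)-scaled deviations keep a spread of
about one population standard deviation (CLT).

```python
import numpy as np

# The exponential distribution with rate 1 is skewed, with mean 1 and
# variance 1 -- a convenient non-normal population for the demo.
rng = np.random.default_rng(0)
pop_mean, pop_std = 1.0, 1.0

n = 10_000       # size of each sample
trials = 5_000   # number of independent sample means

# Draw `trials` samples of size n and take each sample's mean.
samples = rng.exponential(scale=1.0, size=(trials, n))
sample_means = samples.mean(axis=1)

# Law of large numbers: the sample means cluster tightly around pop_mean.
print(abs(sample_means.mean() - pop_mean))  # small

# CLT: sqrt(n) * (sample mean - pop mean) / pop_std is approximately N(0, 1),
# so its empirical standard deviation should sit near 1.
z = np.sqrt(n) * (sample_means - pop_mean) / pop_std
print(z.std())  # close to 1
```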

------
Rainymood
This is 1st year undergraduate statistics and has nothing to do with data
science. The graphics are cute though. I'm sorry for being so blunt.

------
deepsun
> this is true regardless of the distribution of population

Nope. Not true for all distributions. For example, stock prices are often
modeled with heavy-tailed distributions to which the classical central limit
theorem does not apply.

The distribution must have a second moment, in other words, finite variance.
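A quick counterexample sketch (my own, not from the thread): the standard
Cauchy distribution has no finite mean or variance, and the mean of n Cauchy
draws is itself standard Cauchy, so averaging never concentrates the way the
CLT would predict.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 10_000, 2_000

# Means of n standard Cauchy draws, repeated `trials` times.
cauchy_means = rng.standard_cauchy(size=(trials, n)).mean(axis=1)
# Single standard Cauchy draws for comparison.
single_draws = rng.standard_cauchy(size=trials)

# If a normal limit held, the spread of the means would shrink like
# 1/sqrt(n); instead it matches a single draw. Compare interquartile
# ranges, which are robust to the wild outliers Cauchy samples produce.
# (The standard Cauchy IQR is exactly 2: quartiles at -1 and +1.)
def iqr(x):
    q75, q25 = np.percentile(x, [75, 25])
    return q75 - q25

print(iqr(cauchy_means), iqr(single_draws))  # both near 2
```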

------
gran_colombia
It is easy to find this material. This link adds nothing to the vast number of
probability tutorials already out there.

~~~
heavenlyblue
They are giving sampling examples as if I, as a data scientist, would be
sampling an actual human population.

Give me real examples of data sampling in the wild - how did you obtain this
and that dataset? How did you clean up data x? How did you infer that the API
provider’s server was misconfiguring the parameter X and therefore 10% of our
cashflow was attributed to Wednesday last week instead of today?

This feels more like wikihow.

