
Sets and Probability - stopachka
https://stopa.io/post/243
======
Sharlin
Probability theory with sets actually generalizes well to infinite and even
uncountable cardinalities—rather than _counting_ events you just have to
switch to the more general notion of the _measure_ of a (sub)set [1].
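
As a rough sketch of what that switch looks like in symbols (the notation $\Omega$ for the sample space and $\mu$ for the measure is assumed here, not taken from the article):

```latex
% Finite sample space with equally likely outcomes: probability by counting
P(A) = \frac{|A|}{|\Omega|}, \qquad A \subseteq \Omega

% General case: counting is replaced by a measure \mu normalized to total mass 1
P(A) = \mu(A), \qquad \mu(\Omega) = 1

% Example: uniform distribution on [0,1], with \mu the Lebesgue measure (length)
P([a,b]) = b - a, \qquad 0 \le a \le b \le 1
```

In the last line every single point has probability zero, so counting outcomes no longer works, while measuring lengths still does.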

[1]
[https://en.wikipedia.org/wiki/Measure_(mathematics)](https://en.wikipedia.org/wiki/Measure_\(mathematics\))

~~~
benrbray
For anyone looking to learn measure theory properly (i.e. with proofs), I highly
recommend "Introduction to Measure Theory" [1] by Terence Tao (it has an
excellent collection of exercises -- they're crucial to the reading!).

I think it's OK to skim the first few chapters about constructing Lebesgue
measure from Jordan measure, as it makes the whole topic appear more difficult
than it needs to be. From Chapter 4 (abstract measure spaces) onwards, the
book is full of great insight.

A good one on probability & stochastic processes is "Measure Theory and
Probability Theory" [2] by Athreya & Lahiri.

[1] [https://terrytao.files.wordpress.com/2011/01/measure-
book1.p...](https://terrytao.files.wordpress.com/2011/01/measure-book1.pdf)
[2]
[https://www.springer.com/gp/book/9780387329031](https://www.springer.com/gp/book/9780387329031)

~~~
t_serpico
How useful do you think measure theory is from a practical perspective? I seem
to have gotten by more or less without it during my career (which is pretty
stats heavy).

~~~
benrbray
Learning measure theory has eliminated a lot of blind spots for me when it
comes to stats / probability / ML. I find that I get "stuck" much less
frequently when reading papers, and I'm better able to fill in the gaps when
authors are too sparse with the mathematical details. Before, I would always
wonder, "how on earth did they come up with this idea?!", but the more math I
learn, the more I'm able to recognize that certain ideas are inspired by XYZ
branch of math.

It's also been useful for understanding the "novelty" papers like "Neural
Differential Equations" and so on.

I was lucky enough to take a good measure theory class in college, where it
was fine to spend an entire day working through a proof from a textbook. Now
that I'm working, though, it would be harder to justify that time commitment.
So it really depends on your priorities.

------
yboris
One of my favorite books _Epistemology and Psychology of Human Judgment_ by
Bishop & Trout [0] argues that humans are bad at making judgments, and what we
want is reliable methods for arriving at truth.

The authors of the book provide several heuristics that yield better outcomes
than typical strategies people use.

With respect to probabilities, since humans, on average, suck at it, they
recommend a "frequentist" approach. A great illustration is one about a 99%
accurate medical test that claims a patient is positive. The probability the
patient has the disease is _not_ 99% (over 60% of Harvard doctors get this
question wrong). The authors show that once you reframe the question with a
sample population (10,000 people, for example) and the relative prevalence of
the disease (and notice the false negatives), the problem becomes almost
trivial to calculate.
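
A minimal sketch of that reframing, counting people instead of multiplying probabilities. The comment does not give the prevalence, so the 1% figure below is an assumption, as is reading "99% accurate" as 99% sensitivity and 99% specificity; the function name is made up for illustration.

```python
from fractions import Fraction


def positive_predictive_value(population, prevalence, sensitivity, specificity):
    """Count people in each group and return (true_pos, false_pos, P(sick | positive))."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity             # sick people the test flags
    false_positives = healthy * (1 - specificity)   # healthy people the test flags anyway
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


# Assumed numbers: 10,000 people, 1% prevalence, 99% sensitivity and specificity.
tp, fp, ppv = positive_predictive_value(
    population=10_000,
    prevalence=Fraction(1, 100),
    sensitivity=Fraction(99, 100),
    specificity=Fraction(99, 100),
)
print(tp, fp, ppv)  # 99 99 1/2
```

Under these assumed numbers, 99 of the 100 sick people test positive, but so do 99 of the 9,900 healthy people, so a positive result means a 50% chance of disease, not 99%.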

[0] [https://www.amazon.com/Epistemology-Psychology-Judgment-
Mich...](https://www.amazon.com/Epistemology-Psychology-Judgment-Michael-
Bishop/dp/0195162307)

~~~
fredgrott
Note, folks: the book can also be found on archive.org.

------
canjobear
The set-based intuitions are a nice guide, but knowing how to turn the crank
and churn out the calculations is also crucial for the cases where intuitive
thinking doesn't scale.

~~~
Tarsul
yeah, I also liked this guide. Would love more articles about probabilities
(but i always loved this topic :)), there's only one exception: the "3 door
problem" (Monty Hall problem). In my opinion the general misconception about
this problem does not stem from a misunderstanding of probabilities but a
misconception about the moderator (who plays by his own rules [as a bad actor]
- which is often not really discussed although it changes the whole game).
Well, I would read an article about it by this author anyway ;)

~~~
Pyramus
Not sure what you mean by "bad actor" - the host/moderator is not an
adversary; his rules are fixed from the outset (see e.g. Standard Assumptions
in [1]).

These rules imply that new information is revealed, and based on the new
information it is advantageous to switch.

The trickiness of the Monty Hall Problem lies in identifying that indeed new
information has been revealed, which seems counter-intuitive.
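
One way to check this is to enumerate every case exactly. This is a sketch under the standard assumptions: the car is placed uniformly at random, the host always opens a goat door other than the player's pick, and he chooses uniformly when two such doors are available. The function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product

DOORS = range(3)


def switch_win_probability():
    """Exact probability that switching wins under the standard host rules."""
    win = Fraction(0)
    for car, pick in product(DOORS, repeat=2):
        weight = Fraction(1, 9)  # each (car, pick) pair is equally likely
        # The host may only open a door that is neither the pick nor the car.
        options = [d for d in DOORS if d != pick and d != car]
        for opened in options:
            w = weight / len(options)  # uniform choice among allowed doors
            switched = next(d for d in DOORS if d != pick and d != opened)
            if switched == car:
                win += w
    return win


print(switch_win_probability())  # 2/3
```

Switching loses only in the three cases where the first pick was already the car, which is why it wins with probability 2/3 rather than 1/2.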

[1]
[https://en.wikipedia.org/wiki/Monty_Hall_problem](https://en.wikipedia.org/wiki/Monty_Hall_problem)

~~~
im3w1l
Good on wikipedia for having them, but in my experience those "standard
assumptions" are often left out when stating the problem.

~~~
Pyramus
Not really in my experience; I've never come across the problem without it
being stated explicitly how the moderator acts - it would be a different problem.

What I think is true is that people underestimate the role of the host (i.e.
the rules according to which he behaves).

~~~
vertere
The first statement of the problem in the wikipedia page (from "Ask Marilyn")
does not state it explicitly (though it does say the host knows what's behind
the doors). There is a reason wikipedia refers to them as "assumptions".

Or from [0]: "The standard annunciation of the MH problem, does not make
explicit what I am assuming here: namely, that Monty will always open a door
which does not contain a prize."
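
To see how much that assumption matters, here is a sketch of a different, hypothetical host: one who opens one of the two unpicked doors uniformly at random, without regard to where the prize is. We then condition on the games where he happens to reveal a goat. The function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product

DOORS = range(3)


def switch_win_given_goat_revealed():
    """P(switching wins | a randomly opened door happened to show a goat)."""
    goat_shown = Fraction(0)
    switch_wins = Fraction(0)
    for car, pick in product(DOORS, repeat=2):
        for opened in (d for d in DOORS if d != pick):
            # 1/9 for the (car, pick) pair, 1/2 for the host's random door.
            w = Fraction(1, 9) * Fraction(1, 2)
            if opened == car:
                continue  # host revealed the car; not the situation we condition on
            goat_shown += w
            switched = next(d for d in DOORS if d not in (pick, opened))
            if switched == car:
                switch_wins += w
    return switch_wins / goat_shown


print(switch_win_given_goat_revealed())  # 1/2
```

With this host, switching wins only half the time, so the usual 2/3 answer really does depend on the host always opening a door without the prize.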

[0]:
[https://www.montyhallproblem.com/#F2](https://www.montyhallproblem.com/#F2)

~~~
Pyramus
While you are technically correct, it's implicitly stated ("game show") and
correctly understood by the majority of readers. Take it from Marilyn vos
Savant herself:

"Virtually all of my critics understood the intended scenario. I personally
read nearly three thousand letters (out of the many additional thousands that
arrived) and found nearly every one insisting simply that because two options
remained (or an equivalent error), the chances were even. Very few raised
questions about ambiguity, and the letters actually published in the column
were not among those few." [0]

The crux of the problem is the counter-intuitive nature of probabilities and
information, possible choices and choices that could have been. It's a
_difficult_ problem, purported to have tripped up even Paul Erdős.

[0]
[https://en.wikipedia.org/wiki/Monty_Hall_problem#Other_host_...](https://en.wikipedia.org/wiki/Monty_Hall_problem#Other_host_behaviors)

------
zokier
I'm a bit confused: isn't counting desirable outcomes vs possible outcomes the
standard way of teaching probability? How is nCr/nPr stuff explained if not in
the context of counting outcomes?

I'm also confused about where set theory comes in here.

~~~
senthil_rajasek
To begin, the sample space of events is a set.

Simple cases like the probability of A or of A's complement all rely on basic
set theory.
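
A small sketch of what that looks like with ordinary Python sets. The two-dice sample space and the event names are made up for illustration, and all outcomes are assumed equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered results of rolling two fair dice (36 outcomes).
omega = set(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event (a subset of omega) when outcomes are equally likely."""
    return Fraction(len(event & omega), len(omega))


at_least_one_six = {o for o in omega if 6 in o}
no_six = omega - at_least_one_six          # complement as set difference
sum_is_seven = {o for o in omega if sum(o) == 7}

print(prob(at_least_one_six))                 # 11/36
print(prob(no_six))                           # 25/36
print(prob(at_least_one_six | sum_is_seven))  # 5/12
```

Complement, union, and intersection of events are just the corresponding set operations, and the familiar rules such as P(A or B) = P(A) + P(B) - P(A and B) fall out of counting elements.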

The rabbit hole is sample spaces that are really really large.

What happens to the probability distribution of desired events in those cases?

------
vertere
This method only works because the boxes have the same number of balls. To
calculate probabilities by counting outcomes you have to start with outcomes
that all have equal probabilities.

~~~
wizzwizz4
It's a special case of a general principle that works regardless of the
weighting. Instead of integer counting, do real-valued counting with weights
that sum to unity.
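
A sketch of the difference, using a hypothetical setup that is not the article's: two boxes with different numbers of balls, where a box is picked uniformly at random and then a ball is picked uniformly from it.

```python
from fractions import Fraction

# Hypothetical boxes with unequal sizes (not the example from the article).
boxes = {
    "A": ["red", "blue"],
    "B": ["red", "blue", "blue", "blue"],
}

# Naive integer counting: treats all six balls as equally likely outcomes.
naive = Fraction(
    sum(balls.count("red") for balls in boxes.values()),
    sum(len(balls) for balls in boxes.values()),
)

# Weighted counting: each ball gets weight P(its box) * P(ball | box).
weights = [
    (color, Fraction(1, len(boxes)) * Fraction(1, len(balls)))
    for balls in boxes.values()
    for color in balls
]
total = sum(w for _, w in weights)                       # weights sum to unity
weighted = sum(w for color, w in weights if color == "red")

print(naive, total, weighted)  # 1/3 1 3/8
```

Integer counting gives 1/3, but the balls in the smaller box are each twice as likely to be drawn, so the correct answer is 3/8. The two methods agree only when every outcome carries the same weight.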

~~~
vertere
Sure you _can_ generalize it by weighting the outcomes by their probabilities
(not mentioned in the article). But how are you going to calculate those
probabilities any more easily than you can solve the posed problem?

The more "intuitive" way of generalizing it (integer counting) does not work
in general. I think the example is likely to be misleading to anyone who
doesn't already understand this stuff.

As suggested by the footnote, the reason the given example can be solved
elegantly is the symmetry.

------
petters
Ironically, this is the sort of probability theory Nassim Taleb hates. :-)

~~~
atdixon
Exactly. This is the brand of neutered, ludic probability on which The Black
Swan is a holistic dump. Very ironic indeed that it leads with Taleb as the
inspiration.

------
somethingsome
(From the name of OP I deduce he may read this.)

You should read the first and second chapters of 'Probability Theory: the Logic
of Science' by Jaynes. It offers a derivation of probability theory using only
Logic and Boolean algebra as foundations. I found it delightful, I hope you
will also enjoy it :)

It's somewhat more intensive than Taleb, but the derivations are really
beautiful.

~~~
stopachka
Will do, thank you! : }

------
BeetleB
I don't recall how I learned it in high school, but for my undergrad
statistics & probability course, sets were very much the basis for everything.
I'm surprised that this isn't the norm ...?

------
RgueNkeScientst
This was so satisfying to read.

~~~
stopachka
Thank you for the kind words :)

------
sAbakumoff
References to textbooks that OP read would be a nice addition to the essay

~~~
stopachka
My favorite was this textbook:

[https://www.amazon.com/gp/product/1292025042/ref=ox_sc_act_t...](https://www.amazon.com/gp/product/1292025042/ref=ox_sc_act_title_3?smid=AMMEOJ0MXANX1&psc=1)

This one was more dense, but very good in the beginning if you take the time:
[https://www.amazon.com/gp/product/0070484686/ref=ppx_od_dt_b...](https://www.amazon.com/gp/product/0070484686/ref=ppx_od_dt_b_asin_title_s00?ie=UTF8&psc=1)

---

Will update post later, but wanted to share with you now : } -- thanks for
taking the time to read!

