With calculus and linear algebra, your gut feel is about right on average. You can quickly get a feel for trajectories, acceleration, and distances (derivatives and integrals), areas, volumes, amounts, etc. But with probability, your gut feel will always fool you.
In the end, you see a handful of math bloggers bemoaning the lack of education in probability and the nonsense being discussed by journalists and politicians. And it hardly matters whether it's an election or a pandemic. The lack of understanding of uncertainty and the false belief that one can reason about these without looking at the numbers too closely is dangerous.
Sorry about the rant. But...
Dear creator of seeing-theory.brown.edu,
if there is one thing you could change about the project to make it different and infinitely more useful: please start the first chapter with the goat problem, then go through a couple of examples from chapter 10 of Thinking, Fast and Slow, then discuss information (maybe with a simplified version of Mendel's pea experiment), discuss distributions, and leave expectations and variances for much, much later.
Here's what works for me:
- the switching strategy always gives you the opposite of your initial choice
- the initial probabilities are 2/3 goat and 1/3 car so by switching you get 2/3 car and 1/3 goat
After he opens 999,998 doors, he has given you quite a bit of information. There is a 1/1,000,000 chance, though, that he has given you no information (you picked the correct door).
But you're right that thinking about it in terms of partitions also makes sense. You try to pick a partition of size 1 that contains the prize, while Monty picks the partition of size 999,999. If you agree with his partition and it has the prize, you get it.
To muddy the waters further, it's not always understood that in the 1,000,000-door case, 999,998 other doors are opened (as evidenced by discussion elsewhere in these threads). Sometimes people think it's still just one door. I suspect this is because the original problem is usually stated as "...Monty Hall then opens one of the doors you didn't pick" and because people suggesting the 1,000,000-door version often just say "...what if there were one million doors?"
Get a piece of paper. Draw all possible outcomes, 9 in total (car behind door 1 and you pick door 1, car behind door 1 and you pick door 2, ...). 3 of the 9 result in success.
Now draw the outcomes again but switch every time. 6 out of 9 outcomes are a success.
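That tally is easy to check by enumeration; a minimal sketch in Python (door labels are arbitrary):

```python
from itertools import product

# All 9 equally likely (car_door, first_pick) cases
outcomes = list(product(range(3), repeat=2))

stay_wins = sum(car == pick for car, pick in outcomes)
switch_wins = sum(car != pick for car, pick in outcomes)  # switching flips the result

print(stay_wins, switch_wins)  # 3 and 6 of the 9 outcomes
```

The key simplification: because Monty always removes a goat, switching wins exactly when the first pick was wrong.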
(He also opens one of those two doors to reveal a goat, but you already knew that one of them had a goat so that doesn't give you any additional information.)
It SHOULD be clear, because you have two givens: 1) Monty never reveals the car. 2) He opens all the doors except 1.
How is this a given, exactly? In the original problem he only opens 1 other door. That also happens to be all doors except 1, but from just the 3-door problem that seems more coincidental than a fundamental part of the question.
Obviously, you and I know it is, but the person grappling with the Monty Hall problem is right in not being convinced of that just because someone says it is!
Monty Hall is asking you a simple question: should you switch? So in my example of 1,000,000 doors, whether he opens 999,998 doors or just 1 door, you will always have worse odds of winning if you don't switch to another door. Removing 999,998 doors just takes the proposition to an extreme.
Another way to engage one's intuition with the 999,998 example is to imagine the game being played 3 times in a row. What are the odds that not switching will help you? Not switching is disregarding everything Monty Hall is doing: the car is either behind your door or it is not, and you don't switch. If that is how you play the game, your chance of choosing right when not switching is 1/1,000,000 each game, or 1/10^18 for it to happen 3 times in a row. Now consider what Monty is doing. He's removing every door but two: yours and one other. If the odds of winning are 1/1,000,000 if you don't switch, what are the odds of doing _the opposite_? Since there are only two options, the probability of winning if you switch is 1 - 1/1,000,000 = 999,999/1,000,000, as the probabilities of all possible events have to add up to 1.
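The single-game claim is also easy to simulate; a quick sketch, assuming the standard rules where Monty knowingly opens 999,998 goat doors:

```python
import random

DOORS, TRIALS = 1_000_000, 100_000

# With all the goat doors gone, switching wins exactly when the
# initial random pick missed the randomly placed car.
switch_wins = sum(
    random.randrange(DOORS) != random.randrange(DOORS) for _ in range(TRIALS)
)
print(switch_wins / TRIALS)  # very close to 999,999/1,000,000
```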
The "999,998 doors removed" example is an attempt at making the dichotomy between "stay" and "switch" more extreme, so that you feel it in your gut rather than trying to mentally account for the moving pieces.
I'm always interested in improving my ability to explain these kinds of phenomena, and I appreciate you pointing out why the dots don't get connected for some people with this example.
And the rationale for opening 1 other door in the million-door example is that in both examples the host opens 1 other door. The normal Monty Hall problem is usually formulated such that the host opens 1 other door, not that he opens all other doors. As you noted, the two formulations are equivalent with 3 doors, but with more than 3 doors they're not. I just don't see why it's "intuitive" that if the number of doors is increased, the natural extension of the game is that the host opens all other doors that don't have the prize. In fact, I'd argue the opposite.
Imagine an actual Monty Hall game with 4 doors. The contestant picks a door, and the host might open (a) 1 other door with a goat or (b) 2 other doors with goats. Both are valid, reasonable, but different extensions of the game. In both versions the best strategy for the contestant is to switch, because in both versions the host is giving her extra information. But in version (b) he's giving her much more information than in version (a). Of course it's much easier to intuit in version (b) that switching is better, but it's not clear to me why version (b) rather than (a) is the natural 4-door analog of the 3-door Monty Hall game. If a 4-door version were played in real life, it's far more likely IMO that version (a) would be played: in (a) the host gives a little bit of extra info about where the prize is, without giving away the solution. And in this case you need a much better model to see why this is the case, instead of relying on intuition and analogy.
A usually unstated assumption in most formulations is that the host must open another door with a goat if the contestant initially chooses a wrong door. In an actual TV show the host would likely have discretion over whether he opens another door at all, to increase suspense and not become predictable in repeated games. In that case the problem becomes much more difficult, as you need to model the strategy of the host. All of this, and pretty much every other solution and explanation on pretty much every forum, is already extensively documented on Wikipedia (https://en.wikipedia.org/wiki/Monty_Hall_problem#Other_host_...).
If Monty doesn't know where the car is, then if 999,998 doors were opened showing goats, leaving two doors, the odds that the car is behind your door versus the remaining door are 1:1 ... this defies many people's intuition.
The difference between the two cases is that, if Monty knows where the car is, then his opening 999,998 doors with goats behind them is exactly what we expect, whereas if he doesn't know where the car is, then his opening 999,998 doors with goats behind them is an extraordinarily unlikely event. But if that does happen despite being extraordinarily unlikely, then there's still a 50% chance that the car is behind your door.
1) I will probably lose when Monty opens a car door.
2) If I don't, I am really gambling between whether I made a 1-in-a-million pick or Monty did (in the choice of which door to leave shut), which obviously has even odds.
Interestingly, by compressing this problem back down to the 3-door version, it makes it pretty obvious why that's the case (and aligns with people's intuition about the original problem). Also interesting that in this case, even if the intuition is wrong (that 'obviously' they must have picked the car), the outcome (sticking with the chosen door) is an optimal strategy.
The really hard part is the modelling, where you transform the problem into a mathematical statement and vice versa. It's very easy to misinterpret both the problem in terms of mathematics and the mathematical result in terms of the problem. All the wrong answers to brain teasers like the Monty Hall problem, the Tuesday boy problem, etc. are right answers to the wrong question.
Unfortunately, in education we do not seem to want to discuss the modelling part on equal terms with the theory. We seem to be okay with solving the entire problem, or with solving just the theoretical part with no regard to the application, but expressing just the mathematical problem to be solved is never appreciated. In a calculus setting, this could be deriving that the answer to some physical problem depends on the solution of some partial differential equation -- even if you do not have the tools to solve it outright.
My guess is that it's just easier to teach theory with clear cut answers. Modelling the real world is ambiguous and hard.
The Monty Hall problem is more of a curiosity than a fundamental principle!
(Was a TA in undergrad engineering probability for 2 years, saw my share of learners.)
Convincing as many people as possible that statistical intuition is not something we are born with should be the key priority of any probability and statistics class.
Monty Hall was one example. The birthday problem and the base rate fallacy are two more. The results seem obvious, but most people get these wrong.
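For the birthday problem the exact computation is short; a sketch in Python:

```python
import math

def p_shared_birthday(n: int) -> float:
    """Chance that at least two of n people share a birthday
    (365 equally likely days, ignoring leap years)."""
    p_all_distinct = math.prod((365 - k) / 365 for k in range(n))
    return 1 - p_all_distinct

print(round(p_shared_birthday(23), 3))  # ≈ 0.507: past 50% with only 23 people
```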
With a couple of papers or books by Kahneman and Tversky in hand we can generate an almost infinite list of simple statistics/probability questions, which most people get wrong.
Let people make some mistakes, before dumping the theory on them.
Monty Hall is not a good example, unless it is explicitly stated that Monty knows where the car is and that he deliberately opens a door with a goat. Just look at the discussions in the comments here.
Again, strong disagree. Probability has been understood at a quantitative level since Laplace (1812). Modern measure-theoretic probability dates from Kolmogorov's foundational work (1933). All these years later, we really know this stuff.
Specifically: A lot of general-purpose, powerful tools have been developed. Distribution theory, the strong law of large numbers, the CLT, maximum likelihood, L2 theory for estimation.
Depending on your goals, these or related tools are capable of addressing a wide range of problems. The priority of the first few courses should be to impart mastery of a selection of these general-purpose tools, so that students know how to analyze problems probabilistically. This is where intuition comes from.
Gotcha problems like Monty Hall are not getting you to this goal!
One could argue that MHP can motivate the notion of conditioning, but I think fundamentally the MHP is verbal legerdemain. That is, you state the problem such that the conditioning is implicit in the actions, and people don't notice it. Recall that the questioner obtains "victory" when, after presenting the problem, the answerer is confused and gives the wrong answer. I don't like that approach as a teaching tool.
I'm also skeptical of the Birthday Problem and the Kahneman-Tversky surprises. I see value in these surprising conundrums (the Birthday Problem is in volume 1 of Feller, so it has a pedigree) only to the extent that they motivate the utility of general-purpose analytic tools. They are an appetizer, not the main dish.
Which indicates it is roughly as hard as partial differential equations and the theory of relativity, and just a tiny bit easier than some of quantum mechanics.
This is pretty unintuitive for a subject that mostly relies on multiplication and addition.
The dozen or so posts discussing the intuition of the Monty Hall Problem are a case in point.
> They are an appetizer, not the main dish.
This is certainly true.
That's a tautology.
Plenty of studies, such as the work by Kahneman and Tversky, show that humans by default have incorrect statistical intuitions. These faulty intuitions are hard to overcome, even by a considerable amount of training.
> The Monty Hall problem is more of a curiosity than a fundamental principle!
It's quite straightforward conditional probability. That so many people, including trained mathematicians, get it wrong is quite illustrative. And it's not unique ... the coins and drawers problem is similar, and one can craft many others. MH is not a mere curiosity, it's simply well known.
No it is not, unless it is explicitly stated that Monty knows where the car is and that he deliberately opens a door with a goat. Just look at the discussions in the comments here.
That has been part of the explicit problem ever since it was first presented back in 1975.
"Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, 'Do you want to pick door No. 2?' Is it to your advantage to switch your choice?"
- always open a door
- always open a door with a goat
- and not open a door at random
Often discussion of the solution reveals that this is not clear.
No, if both you and Monty pick a door at random, there’s 1/3 chance of a car behind each door. If Monty’s door reveals a goat, it’s 50/50 for the remaining two doors.
It’s mandatory to specify Monty’s procedure precisely.
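That 50/50 claim for an ignorant Monty can be checked by simulation, discarding the rounds where he accidentally reveals the car; a sketch in Python:

```python
import random

stay_wins = kept = 0
while kept < 100_000:
    car, pick = random.randrange(3), random.randrange(3)
    monty = random.choice([d for d in range(3) if d != pick])  # Monty picks blindly
    if monty == car:
        continue  # he revealed the car; this round is thrown out
    kept += 1
    stay_wins += (pick == car)

print(stay_wins / kept)  # ≈ 0.5 once we condition on a goat being shown
```

Remove the `continue` (i.e., let Monty always reveal a goat on purpose) and the same tally drops to ≈ 1/3 for staying.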
I won't respond further.
Agreed, but my point is that the Monty puzzle is a bad example to use educationally if one is not careful.
My point is that the goal of the course should be to understand principles, not to teach people that their existing intuition is faulty. Who cares about their prior condition of ignorance?
For more, see my reply nearby.
Why does multiplication coupled with some sort of integral calculus work the way it does? We multiply to get moments of a distribution, we multiply to convolve, we multiply to get the work done on an object. I suppose the answer is multiplication allows us to scale some function f(x) with some function g(x). But I guess I want something deeper and I feel like I'm missing it.
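One way to see the common pattern: a pmf attaches a weight to each value, and multiplication pairs weights with values (moments) or weights with weights (convolution). A small sketch with two dice, using exact fractions:

```python
from fractions import Fraction

die = {v: Fraction(1, 6) for v in range(1, 7)}  # pmf of one fair die

# Convolution: multiply the probabilities of each pair of faces,
# then add up all the ways of reaching each total.
two_dice = {}
for a, pa in die.items():
    for b, pb in die.items():
        two_dice[a + b] = two_dice.get(a + b, 0) + pa * pb

# First moment: scale each value by its weight, then sum.
mean = sum(v * p for v, p in two_dice.items())
print(mean)  # 7
```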
You are wrong about people being good at finding correlations.
I have rarely met people who can hold a sufficiently large sample in memory to compute any significant correlation. And guessing correlations from charts exposes you to a number of optical illusions that will fool the brain into seeing things that don't exist.
There may be a propensity to make more type I errors and see correlations between random things, such as 5G and COVID, but I haven't seen any research on that.
Yes, this is especially true when first learning; however, one can still develop intuition so that it serves as an invaluable motivator and guide through difficult problems.
It is known in academic circles as Monty Hall, and when it pops up in popular media it is also referred to as Monty Hall.
... It became famous as a question from a reader's letter quoted in Marilyn vos Savant's "Ask Marilyn" column in Parade magazine in 1990 (vos Savant 1990a):
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
For one, by doing the simulation part directly it's easier to see the "under repeated sampling..." logic inherent in frequentist procedures. Additionally, it's possible to do simulation-based procedures where traditional methods break down (think: permutation tests).
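As an illustration, a bare-bones permutation test for a difference in means (the group sizes and data here are made up):

```python
import random

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel the pooled data at random
        pa, pb = pooled[:len(a)], pooled[len(a):]
        if abs(sum(pa) / len(pa) - sum(pb) / len(pb)) >= observed:
            extreme += 1
    return extreme / n_perm

# With 3 points per group, the p-value can't go below 2/20 = 0.1
print(permutation_test([1, 2, 3], [8, 9, 10]))
```

No distributional assumptions, no reference tables: the "repeated sampling" logic is right there in the loop.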
Seeing Theory's interactivity is very interesting. I think if there is one canonical example to tie it all together, it would be something akin to "estimate the likelihood of an extremely rare event". Say you're a top astrophysicist at NASA and you have to brief the President on the improbability, not impossibility, of an extinction-level asteroid event, and you must justify how those beliefs are informed by, and change with, data. It ties everything together: physically based world models, event spaces, conditional probabilities, Monte Carlo sampling, and entropy estimation. And it would be really fun to boot!
Imagine how much their thought process would change if they intimately understood how scientific modelling works.
Generative models map well to programming concepts. Mixtures are quite similar to composition, and hierarchical models can be understood as inheritance. Lots of classical models like HMMs, LDA, etc. are quite similar to those presented in the GoF book, in the sense that they combine composition and inheritance in some particularly interesting manner.
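A toy sketch of that composition view (the class names are mine, not a standard API): a mixture holds component distributions and delegates sampling to one of them.

```python
import random

class Normal:
    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma
    def sample(self, rng):
        return rng.gauss(self.mu, self.sigma)

class Mixture:
    """Composes component distributions, like object composition."""
    def __init__(self, weights, components):
        self.weights, self.components = weights, components
    def sample(self, rng):
        chosen = rng.choices(self.components, weights=self.weights)[0]
        return chosen.sample(rng)

rng = random.Random(0)
model = Mixture([0.5, 0.5], [Normal(-3, 1), Normal(3, 1)])
draws = [model.sample(rng) for _ in range(1_000)]  # a bimodal sample
```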
What textbook(s?) would you recommend for a thorough self-learning of statistics? I’m looking for both intuition _and_ mathematical rigor — not all proofs, but not all fluff either.
I’m a bioinformatics student and I will have a semester of combined probability/stats some time this year, but I think that won’t be enough to support me given my preference for DS-based bioinformatics jobs.
I’m reading Feller right now for the probability stuff, but I’m unsure about statistics. I don’t even know what the relation between probability and statistics is — most similar questions I found online (i.e. “How to learn stats?”) are answered with a “Read this probability book and you’re good”.
My background is computer science and I had a similar experience. Just a caveat: I'm not arguing that we should stop teaching theory, quite the contrary: most of the time we err on the side of fluff. In particular, the fact that many reputable institutions are cutting formal logic, computability theory, etc. from their CS curricula is an absolute disgrace. Intuition is hard to teach (it's easy to fall into the 'monads are burritos' trap) and it's something you have to work out for yourself if you want to develop it.
My point is just that lack of intuition/operative knowledge will lead to your theoretical knowledge of the field being less in-depth and generally less helpful to you.
I honestly don't think it really matters what book you are studying as an introduction to a subject, as usually introductory courses are teaching well-established theory that everyone knows/agrees on.
If you have no prior knowledge, a decent starting point is this: https://www.amazon.com/gp/product/1981369198/
the author's website has similar content: https://www.statlect.com
Bioinformatics at my uni is just the typical CS minus some hardware stuff + molecular biology minus some chemistry stuff; in other words, I'm pretty close to CS, too. And I share your opinion — at least for me, I don't believe things until I see them proven.
> My point is just that lack of intuition/operative knowledge will lead to your theoretical knowledge of the field being less in-depth and generally less helpful to you.
In addition, building an intuition can help make the understanding come faster. To give you an example, I can stare at a proof for half a day and _then_ finally get it, but one clever diagram or a descriptive commentary can save me hours of pushing through the dense text — without cutting down on the rigour (as the proof is still there). Unfortunately, it seems that maths textbooks mostly come only with the former, or the latter, but not both.
> I honestly don't think it really matters what book you are studying as an introduction to a subject
I agree that it probably doesn't matter from the content POV (i.e. the basic definitions and theorems will be there), but it could matter if we take the intuition into account.
For example, in real analysis, there's baby Rudin, but there's also all sorts of books that include all (or most of) the content, but supply it with better commentary and/or illustrations to drive the point home quicker. And I'd say that's a pretty established field, too; probably more so than stats, in fact.
What is the deal with this? Why isn't stats commonly taught in school when it is by far one of the most prevalent disciplines? And why, on the rare occasion when it is taught, is it so abysmal? Statistics forms the basis for all of science, for god's sake. I've since developed a patchwork understanding of statistics on my own from various resources I've found the time to consume. For the record, I grew up in the US.
> I’m reading Feller right now for the probability stuff, but I’m unsure about statistics.
Probability is the study of mathematical objects, and nobody is totally sure whether any of them exist even approximately. Is anything in the universe random? The question is open, and likely to remain so eternally. Lots of things look similar to a random variable if viewed from the right perspective, but most of them aren't actually random. That's not really a problem for the mathematicians; they feel no special need to study only things that exist.
Statistics is roughly the study of how to deal with actual results. If you do a census, those results exist. Statisticians then need to make decisions about how to think about their results, and usually fall back on models rooted in probability. Technically speaking, "a statistic" is "any quantity computed from values in a sample". 
Basically, statistics is probability + data.
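A coin flip makes the direction of each field concrete; a minimal sketch (the numbers are arbitrary):

```python
from math import comb

# Probability: given the model (p = 0.5), how likely is the data?
p = 0.5
prob_7_heads = comb(10, 7) * p**7 * (1 - p)**3  # chance of exactly 7 heads in 10 flips

# Statistics: given the data (7 heads in 10 flips), what is p?
p_hat = 7 / 10  # maximum-likelihood estimate

print(prob_7_heads, p_hat)  # ≈ 0.117 and 0.7
```

Same binomial model in both cases; only the direction of the question changes.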
Are these names of actual books (Google doesn't help) or merely the themes of the stats textbooks you benefited from the most?
Here is one on the odds ratio, for example: https://www.bmj.com/content/bmj/320/7247/1468.1.full.pdf
That's a great question, and I think the lines are more than a little blurry.
My attempt at an answer would be:
Probability: Given a set of dice and coins and an order for rolling and throwing them, what is the chance of a specific outcome?
Statistics: Given a set of outcomes, what dice were rolled?
So if you want to know if smoking kills, you tally up medical history, and use statistics to see if there is a relationship between smoking and dying.
If you want to know the probability of smoking killing you, you look at the risk each cigarette brings to the table and tally it up using probability theory.
More elegantly phrased examples can be found on Cross Validated: https://stats.stackexchange.com/questions/665/whats-the-diff...
That said, there's a chance I just read it late enough in my career to be more ready for its content.
I love the premise: "if you know how to program, you can use that skill to learn other topics."
Perhaps someone here can speak to their experience with some of these books?
Maybe it is too basic for you, but It is focused on the intuition part and I can recommend it!
Sadly the project is no longer actively developed, but if you haven't seen it yet, you should definitely check it out for inspiration: https://github.com/metacademy
A nice trick I heard once for solving this visually in your head:
Think of rolling two dice as a square: X and Y are each one die, giving you a 36-square board. Getting at least one six is just the upper border: the 6s along the top and the 6s along the right (the 6-6 square overlaps). So 11 out of the 36 squares.
And here is the board:
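The same count, written out; a quick sketch in Python:

```python
from itertools import product

board = list(product(range(1, 7), repeat=2))        # the 36 (X, Y) squares
at_least_one_six = [sq for sq in board if 6 in sq]  # top row plus right column

print(len(at_least_one_six), "out of", len(board))  # 11 out of 36
```

Equivalently, by complement: the inner 5x5 block has no six, so 36 - 25 = 11.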
In studying probability, I found that accounting for the "overlap" as you described it was more tedious in more complicated problems than just always calculating the joint probabilities and inverting them.
Is it because there are many more amateur statistics textbooks in existence, or published attempts at one (so more chance for a runaway success to be picked up)?
Or is it because people in the statistics-textbook industry don't feel this frustration and/or don't dare to take any risks?
Numbers do not exist outside of human cultures.