Hacker News

If you had a gambling game that was simply "heads or tails, even money", you would expect the total over a large number of trials to be 0. But once you observe exactly one trial, the expected value becomes +1 or -1 unit. We know this is always going to happen one way or the other. Why, then, does the bell curve of "expected value" for this game not have two peaks, at 1 and -1? Why does it peak at 0 instead?

I know the thing I'm asking about is wrong; I just want to know how I can derive that for myself.




The intuitive explanation is that the effect of a single sample on the average diminishes as you take more samples. So, hand-waving a bit, let's assume it's true that over a large number of trials you would expect the average to converge to 0. You just tossed a coin and got heads, so you're at +1. The average of (1 + 0*n)/(n+1) still goes to 0 as n grows bigger and bigger.
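A quick simulation makes the hand-waving concrete (a hypothetical sketch; the loop sizes are arbitrary). Fix the first flip at +1 and average in n more fair flips: the running mean shrinks toward 0 as n grows.

```python
import random

random.seed(0)

# You just got heads: lock in +1, then average in n more fair +1/-1 flips.
for n in [10, 1_000, 100_000]:
    total = 1 + sum(random.choice([1, -1]) for _ in range(n))
    print(n, total / (n + 1))  # the running average drifts toward 0
```

The single +1 contributes only 1/(n+1) to the average, so its influence vanishes even though it never goes away.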

That skips over the distinction between "average" and "probability distribution", but those nuances are probably better left for a proof of the central limit theorem.


There are a couple of confusions/ambiguities here.

The Law of Large Numbers is about the average, so it's not relevant here (an average of +1 would mean you got heads every single time, which is extremely unlikely for large n).

If you are looking at the sum, then the value depends on whether the number of trials (n) is even or odd. If n is odd, you would indeed get two peaks at 1 and -1, and you would never get exactly 0. If n is even, you would get a peak at 0 and you would never get exactly 1 or -1.
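You can verify the parity effect exactly (a small sketch; function name is mine). With k heads out of n flips the sum is 2k - n, which occurs with probability C(n, k) / 2^n:

```python
from math import comb

# Exact distribution of the sum of n fair +1/-1 flips:
# with k heads the sum is 2k - n, with probability C(n, k) / 2**n.
def sum_pmf(n):
    return {2 * k - n: comb(n, k) / 2 ** n for k in range(n + 1)}

pmf_even = sum_pmf(10)
print(max(pmf_even, key=pmf_even.get))  # 0: single peak; odd sums impossible

pmf_odd = sum_pmf(11)
print(0 in pmf_odd)                     # False: the sum can't be exactly 0
print(pmf_odd[1] == pmf_odd[-1])        # True: twin peaks at +1 and -1
```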

The expected value (aka average) is a number, not a distribution. The expected value for the sum is 0 even when n is odd and you can't get exactly 0 -- that's just how the expected value works (in the same way that the "expected value" for the number of children in a family can be 2.5 even though a family can't have half a child). If you look at the probability density function for a single trial, then it does have two peaks at 1 and -1 (and is zero everywhere else).

The curve you refer to might be the normal approximation (https://en.wikipedia.org/wiki/Binomial_distribution#Normal_a...). It's true that the normal approximation for the distribution of the sum in your gambling game has a peak at 0 even when n is odd and the sum can't be exactly 0. That's because the normal approximation is a continuous approximation and it doesn't capture the discrete nature of the underlying distribution.
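To see that smoothing in action, compare the normal approximation to the exact probabilities for an odd n (a sketch under the standard approximation, mean 0 and variance n):

```python
from math import comb, exp, pi, sqrt

n = 11  # odd: the sum can never be exactly 0

# Normal approximation to the sum: mean 0, variance n.
normal_pdf = lambda s: exp(-s * s / (2 * n)) / sqrt(2 * pi * n)

# The continuous approximation still peaks at s = 0 ...
print(normal_pdf(0) > normal_pdf(1))  # True

# ... but the exact discrete distribution puts zero mass there.
p = lambda s: comb(n, (s + n) // 2) / 2 ** n if (s + n) % 2 == 0 else 0.0
print(p(0), p(1), p(-1))  # 0.0, then two equal positive masses
```

The smooth curve averages over the even/odd comb structure, so its single peak at 0 is an artifact of the continuous approximation, not a statement that 0 is attainable.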


"The expected value of a random variable with a finite number of outcomes is a weighted average of all possible outcomes." -- https://en.wikipedia.org/wiki/Expected_value


That makes sense, I was always thinking of it as "Given an infinite number of trials..."


That would be the frequentist interpretation. A Bayesian would say that probability is to be interpreted as a degree of belief that an outcome will occur. Neither is really right or wrong; it depends on what you're modeling. If we're analyzing some kind of heavily repeated task (e.g., a sordid night of us glued to the blackjack table where we play a lot of hands, or data transmission over a noisy cable), a frequentist interpretation might make more sense. However, if you're talking about the probability of a candidate winning an election, you could take a Bayesian view where the probability asserts a confidence in an outcome. A radical frequentist would take umbrage at an event that only happens once. I suppose, though, depending on your election rules and model (e.g., a direct democracy), you could interpret the election winner in a frequentist manner: the probability of winning is the rate at which people vote for the candidate. For a more complicated system I'm not sure the frequentist view is as easily justified.

To answer your question more directly, though: the expected value is just another name for the average or mean of a random variable. In this case, the variable is your profit. Assume we're betting a dollar per toss on coin flips and I win if it's heads (everyone knows heads always wins, right?). The expected value is (probability of heads) * 1 + (probability of tails) * (-1). If the coin is fair, the probabilities are the same, so the expected value is zero.
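In code that's a one-liner (a trivial sketch; the function name is mine):

```python
# Expected profit of a $1 bet on a single flip, from the heads bettor's side:
# p * (+1) + (1 - p) * (-1).
def expected_profit(p_heads):
    return p_heads * 1 + (1 - p_heads) * (-1)

print(expected_profit(0.5))   # 0.0: a fair coin is a fair bet
print(expected_profit(0.55))  # positive: a biased coin favours the heads bettor
```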

Aside: sequences of random variables that are "fair bets" are called martingales, and they are incredibly useful. It's a fair bet because, given all prior knowledge of the value of the variable thus far, the expected value of the next value you witness is the current value of the variable. You could imagine looking at a history of stock values: the sequence is a martingale (and thus a fair bet) if, given that history, your expected profit from investing is 0.
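The coin-flip running total is itself a martingale, and you can check the defining property empirically (a sketch; the conditioned-on total of 3 is an arbitrary hypothetical value):

```python
import random

random.seed(1)

# Martingale property for the +1/-1 game: conditional on the current
# total s, the expected next total is s itself.
trials = 200_000
s = 3  # pretend we're currently up $3
mean_next = sum(s + random.choice([1, -1]) for _ in range(trials)) / trials
print(mean_next)  # close to 3
```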


Whether/when it's better to think in terms of "X has a 37% chance of happening in a single trial" vs "If you ran a lot of trials, X would happen in 37% of them" is kind of a fraught topic that I can't say much about, but you might find this interesting: https://en.wikipedia.org/wiki/Probability_interpretations


You're on the right track. The only thing you're missing is that mixing (averaging) two bell-curve distributions that are offset by a little does not necessarily give a bimodal distribution. The mixture is only bimodal if the two unimodal distributions you are combining are placed far enough away from each other.

See this https://stats.stackexchange.com/questions/416204/why-is-a-mi...
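You can check this numerically (a sketch; for an equal-weight mixture of two normals with common standard deviation sigma, the known threshold is that the means must be more than 2*sigma apart for two peaks to appear):

```python
from math import exp, pi, sqrt

def mixture_pdf(x, d, sigma=1.0):
    # Equal mixture of N(-d, sigma^2) and N(+d, sigma^2).
    g = lambda m: exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))
    return 0.5 * g(-d) + 0.5 * g(d)

def n_modes(d):
    # Count local maxima of the mixture density on a fine grid.
    xs = [i / 100 for i in range(-600, 601)]
    ys = [mixture_pdf(x, d) for x in xs]
    return sum(1 for i in range(1, len(ys) - 1)
               if ys[i] > ys[i - 1] and ys[i] > ys[i + 1])

print(n_modes(0.5))  # 1: peaks closer than sigma merge into one bump
print(n_modes(2.0))  # 2: far enough apart to stay bimodal
```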



