Answer is impossible to know without knowing n. If there are n=0 red balls proba...

simiones · on Jan 31, 2024

There are 9900 possibilities for the state of the urn after the first ball is picked:

1. n = 1, you can pick any of 99 green balls
2. n = 2, you can pick any of 98 green balls and a red ball
3. ...
100. n = 100, you can pick any of 99 remaining red balls

If you count the situations in which you will pick a red ball next, you'll see that there are more of them than situations in which the next ball is green.

Intuitively, given that you picked a red ball, you should expect there to be more red balls in the urn. And the effect of having more red balls than green ones outweighs the effect of removing just one ball.
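To make the count concrete: the urn states listed above are not equally likely once you know the first ball was red, since an urn with n red balls offers n ways to draw red first. A minimal sketch, assuming n is uniform on 1..100 as in the list, counts ordered (first red ball, second ball) pairs so that this weighting is explicit. The function and variable names are illustrative, not from the thread.

```python
from fractions import Fraction

TOTAL_BALLS = 100  # assumption: 100 balls per urn, n of them red


def count_ordered_pairs(total=TOTAL_BALLS):
    """Count ordered (first, second) draws whose first ball is red,
    summed over every urn with n = 1..total red balls."""
    red_then_red = 0
    red_then_green = 0
    for n in range(1, total + 1):
        # n choices for the red first ball, then n - 1 reds remain
        red_then_red += n * (n - 1)
        # n choices for the red first ball, then total - n greens remain
        red_then_green += n * (total - n)
    return red_then_red, red_then_green


rr, rg = count_ordered_pairs()
p_next_red = Fraction(rr, rr + rg)
print(rr, rg, p_next_red)
```

Under these assumptions the ratio works out to exactly 2/3. Including the n = 0 urn changes nothing, because it contributes no pairs with a red first ball.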

pvaldes · on Jan 31, 2024

> Intuitively, given that you picked a red ball, you should expect there to be more red balls in the urn.

This is what a math person would say, and I have seen it many times in science, but it is not the correct answer. The correct answer would be: "my sample is too small to carry as much information as I'm claiming it has".

Intuitively, if I walk down the street and see a woman, I can't say anything about the proportion of men and women in the area, absolutely nothing, except: "number of women > 0". This is my result. It is dull and not publishable, but it is also the only thing I can infer about the population, because it is under-sampled.

If the experiment is to take a ball from a bag, see its color, and think that I can calculate the number of red balls with only this info, my goals are not realistic.

I need to keep sampling. (Not repeating the experiment, but sampling the same population: running ten times an experiment that provides a tiny amount of info will not generate new info about the first population out of thin air, because the populations are independent.) You need to increase the size of your sample from the same population.

Lots of the early ecology models that were flawless from a mathematical point of view were basically useless when applied to real life, for exactly this reason.

simiones · on Jan 31, 2024

Sure, the actual difference in probability is minute, so it essentially has 0 predictive power.

That is, statistics also tells us that P(n>50 | first ball is red) ~= P(n>50) ~= P(n<50) ~= P(n<50 | first ball is red).

Of course, the problem statement is not realistic in the slightest, because it also gives you too much other (critical) information: you are told that n has a uniform probability distribution. In reality, you never know the probability distribution a priori (even when analyzing a die or coin, you can't be a priori certain it is fair). And the conclusion in this problem, even weak as it is, depends critically on knowing the probability distribution. Not only would the conclusion be different if n were not uniformly distributed between 1 and 100, but you can't even do a similar analysis over all possible probability distributions.

j7ake · on Jan 31, 2024

You're almost there with your intuition.

Now you just need to simulate drawing two balls without replacement for each of your 101 urns (from the first urn with n=0 to the last urn with n=100).

Then you condition on the first ball being red, and count how often the next ball is also red.

You will find that you are more likely to get red than green.
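The simulation described above could look roughly like this. It is a sketch under stated assumptions: 100 balls per urn, n drawn uniformly from 0..100, and balls represented as indices where an index below n counts as red. The `simulate` function, its parameters, and the fixed seed are all illustrative choices.

```python
import random


def simulate(trials=200_000, total=100, seed=0):
    """Estimate P(second ball red | first ball red) by Monte Carlo."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    first_red = 0
    both_red = 0
    for _ in range(trials):
        n = rng.randint(0, total)  # pick one of the 101 urns uniformly
        # Draw two distinct balls without replacement; index < n means red.
        first, second = rng.sample(range(total), 2)
        if first < n:
            first_red += 1
            if second < n:
                both_red += 1
    # Keep only the trials whose first ball was red (the conditioning step).
    return both_red / first_red


estimate = simulate()
print(f"P(second red | first red) ~ {estimate:.3f}")
```

With this many trials the estimate should land close to 2/3, so red is noticeably more likely than green for the second draw.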