
A Counterintuitive Probability Game - EvanWard97
https://www.evanward.org/posts/a-counterintuitive-probability-game
======
dsukhin
This is a super fascinating observation. At first it does seem very
counterintuitive, but after a little examination the resulting probabilities
make a lot of sense.

For the sake of easy math, pick C, your threshold, to be fixed at 0, which is
halfway between -inf and inf.

Now there are 4 scenarios. The numbers player 1 chooses, A and B, can be both
above or both below C with probability 1/4 each. In both those cases, you have
a 50/50 chance of being correct on which number is larger depending on which
number you chose to see.

Then, the other 1/2 of the time, one number will be above C and the other
below it. In those cases, the selected strategy will have you always choose
the correct highest number.

Altogether you are right 1/2 + 2 * 1/4 * 1/2 = 3/4 = 75% of the time. Simple
probability, but totally counterintuitive from the outset.

Now what if C != 0, or the numbers are not being selected uniformly by your
opponent? You can replace the 1/2's with (p) and (1-p) in the right spots
without making any distributional assumptions, and it seems (without doing
the math out fully) that things cancel nicely and show that you always have a
strictly greater than 1/2 chance of guessing right no matter what. Exercise
left for the reader :P
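
For anyone who wants to check the arithmetic, here's a quick simulation sketch
(assuming uniform draws on a finite range and the fixed threshold C = 0):

```python
import random

random.seed(0)

def play_once():
    # Player 1 writes down two uniform random numbers.
    a = random.uniform(-1000, 1000)
    b = random.uniform(-1000, 1000)
    # A coin flip decides which one Player 2 gets to see.
    seen, hidden = (a, b) if random.random() < 0.5 else (b, a)
    # Threshold strategy with C = 0: guess "seen is larger" iff seen > 0.
    return (seen > 0) == (seen > hidden)

trials = 100_000
rate = sum(play_once() for _ in range(trials)) / trials
print(rate)  # close to 0.75
```

The win rate lands near 1/2 + 2 * 1/4 * 1/2 = 0.75, as computed above.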

~~~
esrauch
I think it can be more intuitively explained without getting into math: if you
know someone wrote down two random numbers between 1 and 10. You look at one
and have to guess if the other one is higher or lower.

Say you flip over a 2: what do you think is the right guess?

~~~
ChristianBundy
Of course, but I'm having trouble understanding why this works on an infinite
number line. Say you flip over a 2,387,290,723,013,172,348,238,987: what do
you think is the right guess?

~~~
tobbykop
It's the largest because it's positive, so 75% of the time I'm right.

And if we are only looking at positive numbers on an infinite axis, I'd need
to borrow your generator for a test to approximate its midpoint (which
obviously it doesn't have), which I then weigh against the original number
with the logic: if it's larger than what I got, so is your other number; if
it's smaller, so is the other number. Now, since the chance of my number
being the middle of the three drawn is 1/3 (the only case where I lose), that
gives me a 66% chance of winning that bet.
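
The 2/3 figure can be checked under the assumption that all three numbers are
drawn iid from the same distribution (uniform here, purely for illustration):

```python
import random

random.seed(1)

def play_once():
    # a, b are Player 1's numbers; c is the borrowed-generator draw.
    a, b, c = random.random(), random.random(), random.random()
    seen, hidden = (a, b) if random.random() < 0.5 else (b, a)
    # If c lands between a and b, the comparison is always right;
    # otherwise it is a coin flip. P(c is the middle of three iid draws) = 1/3.
    return (seen > c) == (seen > hidden)

trials = 100_000
rate = sum(play_once() for _ in range(trials)) / trials
print(rate)  # close to 1/3 + (2/3) * (1/2) = 2/3
```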

------
greenbay20
Note that the paper claims that no matter how player A chooses the two numbers
to write down, you can always guarantee >50% chance in guessing which one is
the higher one after only looking at one of the numbers. The reason this seems
counter-intuitive is that the only information you're given is one of the
numbers. You are not told anything about how the first player picked her
numbers. Now, the reason why OP's code does not seem counter-intuitive to most
is that it shows a quite different result. The OP assumed a specific
distribution (which is a strong assumption, one the paper does not make) and
found a strategy that yields > 50%.

~~~
cortesoft
Does this hold true if Player 1 is adversarial?

If I was the Player 1 in this case, and was trying to prevent Player 2 from
guessing with a greater than 50% probability, I would use this method:

Pick a random 100 digit number N (using a random number generator to pick 100
digits in order).

Flip a coin. Heads, the second number is N - 1. Tails, it is N + 1.

I would imagine you could slightly beat 50%, but the percentage would approach
50% as you added more digits to your initial number. It would probably be
close to unmeasurable at 100 digits (i.e. you couldn't tell if you were doing
better than 50% or not).
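
A sketch of this adversarial setup (the normal distribution for Player 2's
threshold is an assumption, standing in for any fixed full-support strategy):

```python
import random

random.seed(2)

def play_once():
    # Adversarial Player 1: a huge random N, with the second number N +/- 1.
    n = random.randint(1, 10**100)
    other = n - 1 if random.random() < 0.5 else n + 1
    seen, hidden = (n, other) if random.random() < 0.5 else (other, n)
    # Player 2's threshold, drawn from a fixed full-support distribution.
    c = random.gauss(0, 1)
    return (seen > c) == (seen > hidden)

trials = 100_000
rate = sum(play_once() for _ in range(trials)) / trials
print(rate)  # statistically indistinguishable from 0.5 at this sample size
```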

~~~
nimih
The result holds even if Player 1 knows Player 2's strategy and attempts to
foil it, although as you correctly observe, the paper only claims >50%, and
Player 1 can minimize that margin by picking the two numbers to be right next
to each other (and trying to choose them such that they occur in a low-density
area of Player 2's distribution f(t) ).

~~~
cortesoft
Yeah, as I was working it out in my head, I was realizing that it would
approach 50% but never hit it, since no matter the distribution you choose,
the higher number will always be greater on average than the smaller (by
definition). You can approach an equal average as you choose a larger and
larger range, but never hit it.... which I guess is what the paper is proving.

Practically, though, you could get close enough to 50% that it would be
immeasurable by sampling.

------
nabdab
I'm really struggling to see how this is counterintuitive. You pick one of
the numbers, and if it's positive you say "this is likely the largest," and
if negative, "this is likely the smallest." Since there's a 50/50 chance for
each sign, and all positive numbers are larger than all negative numbers,
that already gives you the minimum 1/2 probability they are asking for. But
you get an improvement because, for you to lose, the other number must not
only match the sign but also be larger in magnitude, and that happens in half
the cases where the two share a sign, which gives you an extra 1/4. You end
up being correct 75% of the time. The thing with a random number in between
is just an odd way of reducing your probability of winning by adding
randomness to the split; I'm really confused about what that is supposed to
show.

It feels like the “puzzlement” you are supposed to feel that you can beat 50%
comes from people ignoring the fact that you can look at the number before
making your call.

~~~
com2kid
It is confusing (and I am still confused) because my understanding is:

1. Person A picks 2 random #s.
2. Person B picks a random split #.
3. Person B looks at one of the random #s. Based on its relationship to the
arbitrary split # he chose, he then decides if the other random # is larger
or smaller than the one he has in hand.

I am confused as to how choosing a random split # has any impact on the
probability of the other paper in hand. I'd normally think that each random #
choice is an independent event and that my "split #" has no impact on the
system as a whole.

Like, you choose the numbers -500 and 250.

I choose to split at 1000. Or 200. Or ten billion.

Obviously the math works, but why does my picking another random # make a
difference in the system as a whole? The relationship of all the #s to each
other is still presumably completely random.

Would this work if I am instead handed two numbers, and I have to guess if a
third # is greater than the first number, and I make a choice based on the
relationship of the 1st number to the 2nd number? Since the 2nd number is
random, it shouldn't matter who provides it, right?

So stating it that way,

"Here is 200, 750, and some third number, is the third number higher or lower
than 200?"

Does that still work?

~~~
MauranKilom
The important part in the original is that every choice of splitting point
has nonzero probability. This means you will _eventually_ land on a splitting
number between the two, where you then have a 100% chance of being right
(instead of the 50% chance you get all the other times), pushing your average
above 50%.

If the splitting point is given to you (by an adversary, not an actual random
process), this nonzero probability for any given number is not necessarily
fulfilled (at least you didn't indicate so).
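
This can be made concrete with a sketch that reuses the fixed pair -500 and
250 from the question above and assumes a normal splitting distribution (any
everywhere-nonzero density would do):

```python
import random

random.seed(3)

a, b = -500, 250          # any two distinct fixed numbers
trials = 100_000
wins = between = 0
for _ in range(trials):
    c = random.gauss(0, 500)   # splitting point: nonzero density everywhere
    if a < c < b:
        between += 1           # c landed between the two numbers
    seen, hidden = (a, b) if random.random() < 0.5 else (b, a)
    if (seen > c) == (seen > hidden):
        wins += 1
# The win rate matches 1/2 * (1 + P(a < c < b)): it exceeds 1/2 exactly
# because "c between a and b" has nonzero probability.
print(wins / trials, 0.5 * (1 + between / trials))
```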

------
gentaro
I may be missing something here, but I'm not sure this is quite right.

The paper attached describes a scenario where the player A chooses a number
between -infinite to infinite. The simulation uses ranges -1000000 to 1000000,
it's not a surprise that the strategy works in this case.

I think if the ranges were truly -infinite to infinite, this strategy would
fall apart. No matter where you set C, there are infinite values above and
below it.

~~~
dsukhin
If you have a range (-inf, inf) and you split it at some random C, even though
there are still infinite values on either side, the probability you choose
another number on the left side of C is (p) and on the right side is (1-p)
regardless of how you choose the next number. That's what makes these
probabilities finite and makes the math work out.

~~~
OskarS
But... but... you _can't_ "pick a number uniformly distributed in (-inf,
inf)". Like, that's not a thing that is possible [0], so saying anything about
the probabilities once you've done so is not reasonable. You can pick a random
number in infinite range non-uniformly (e.g. to pick a number in the range
[1,inf), pick a number X in (0,1] and use 1/X), but then the probabilities are
no longer so neat and tidy.

The game only makes sense if you attach limits to the numbers the other person
is allowed to pick, and then the only "split" that works is in the middle of
the range. At which point the game is sort-of "obviously true".

[0]
[https://math.stackexchange.com/a/14169/641751](https://math.stackexchange.com/a/14169/641751)
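
The 1/X construction can be sketched directly:

```python
import random

random.seed(4)

def heavy_tailed():
    # X uniform in (0, 1], so 1/X covers all of [1, inf) with nonzero
    # density, but large values become increasingly rare.
    x = random.random()
    while x == 0.0:            # random.random() is in [0, 1); skip 0
        x = random.random()
    return 1.0 / x

samples = [heavy_tailed() for _ in range(100_000)]
# Every sample is >= 1, and P(1/X > 100) = P(X < 1/100) = 1%.
print(min(samples), sum(s > 100 for s in samples))
```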

~~~
MauranKilom
It is not necessary for the distribution to be uniform, it just has to be
nonzero everywhere.

~~~
kgwgk
If the distribution of A and B is "uniform over the whole real line" (an
improper distribution), the probability that a C drawn from a proper
distribution lands between A and B is zero.

If the distribution of A and B is also a proper distribution, the probability
that C is between them is positive. But then the distribution of A and B has
a middle point such that 50% of the numbers are above and 50% below, and the
problem is trivial.

------
laegooose
Strictly speaking, the _probability_ that Player 2 made the correct choice is
not defined unless the random distribution used by Player 1 is defined.

Consider a paradox: Player 1 picks a random number X, writes it on a slip of
paper, and 2X on another slip, and puts them randomly in front of Player 2 as
"left" and "right". Player 2 wins amount of money written on a slip he chose.

Let's say he is about to pick the left one (but hasn't seen it yet), and that
the left one has the number Y. The right one then has either Y/2 or 2Y, with
50/50 probability. Which means the right one is more profitable to pick,
because it holds 1.25Y on average!

~~~
dfelix
Player 1 is not constrained to choose numbers randomly from a distribution.
Player 1's choices are considered to be arbitrary. Player 2's probability of
winning can then be calculated from Player 1's choices and Player 2's
privately chosen density function.

------
bko
From the paper, the only conditions of the first player (the person picking
the initial two random numbers) is that they are distinct. They don't have to
be random and (as the implementation has) bound by the same range as the
second player. If you use a different bound range for the two players, you get
about 50/50 odds.

The part I changed is scaling down the first player's range by a factor of
100 (with R = 1000000 as in the article's code):

import random
R = 1000000

a = random.randint(-R//100, R//100)

b = random.randint(-R//100, R//100)
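
Running the full game with that mismatch (a sketch, using R = 1000000 as in
the article's simulation) shows the edge all but vanish:

```python
import random

random.seed(5)

R = 1000000
trials = 100_000
wins = 0
for _ in range(trials):
    # Player 1's numbers come from a range 100x narrower than Player 2's C.
    a = random.randint(-R // 100, R // 100)
    b = random.randint(-R // 100, R // 100)
    c = random.randint(-R, R)
    seen, hidden = (a, b) if random.random() < 0.5 else (b, a)
    if (seen > c) == (seen > hidden):
        wins += 1
rate = wins / trials
print(rate)  # barely above 1/2: C rarely lands between a and b now
```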

------
markisus
I think I drew a picture that could explain the effect. In the picture, player
2's lower number is A, and their higher number is B.

[https://imgur.com/a/DKCtH81/](https://imgur.com/a/DKCtH81/)

Red is the region where you lose, and Green is the region where you win.

~~~
kgwgk
Now imagine zooming out to show the whole real line and it’s clear why the
“guaranteed win” doesn’t ever happen :-)

~~~
markisus
And it can get even worse because your adversary can choose A and B as close
together as they want.

------
lonelappde
The "solution" relies on the fact that positive numbers arbitrarily close to 0
are still technically positive. If you assume that the range of numbers is
finite, the solution is much more intuitive. If you assume that the range of
numbers is actually infinite, and don't use a misleading Python program, a
win ratio of "0.5 + unknowably arbitrarily small epsilon" is the least
impressive "strictly greater than 0.5" you can imagine.

You can't win real money gambling in this game if you actually have infinite
range of numbers.

~~~
dTal
The improvement in guessing accuracy over 50% is a measure of the overlap of
the density functions used by the two players (specifically, I believe it is
the area under both curves). The result notes that if Player 2 uses a density
function that is nonzero on all of (-inf, +inf), they can guarantee at least
epsilon of overlap.

However, it's interesting to note that, practically speaking, if you played
this game in a pub, Player 1's density function would likely be _highly_
predictable, especially if they came up with the numbers mentally. As long as
Player 2 chooses a reasonably similar density function, they can likely do
significantly better than 50%. It would take some patter, and you might lose
a friend, but I could see it being possible to win "real" money this way.

------
zxcmx
This still makes no sense to me. Picking a random number C adds no
information, so I can't see how it can influence the accuracy of your
decision, just from a naive information-theoretic point of view.

If it's possible to get some epsilon of advantage by choosing a C once,
then... can you do it _more_? Can you combine or average more of them? Is 1
optimal? Why?

I feel like there's some shenanigans here regarding assumptions about the
distribution of A, B and C, but I can't really put my finger on it.

------
KwisatzHaderack
Very interesting, thanks for sharing.

If C is chosen from the exact same distribution as A and B, I wonder if the
strategy works even better than if C were any other random distribution. My
intuition says yes.

~~~
ryandrake
The code linked in the article demonstrates this.

------
kgwgk
> the benefit of using the entire Real number line as the possible range for C
> is to ensure that you at least sometimes

This may be a probability zero event unless you make some additional
assumptions about how the numbers are selected. The whole real line is quite
big and the probability of two finite intervals overlapping is null.

> choose a value in between the ranges the other player is selecting numbers
> from

~~~
lonelappde
It's not probability 0, it's a number greater than 0, but it's impossible to
prove a lower bound greater than 0.

So, 0 in the real (ha!) world.

~~~
kgwgk
[https://rationalwiki.org/wiki/Probability#Zero_probability](https://rationalwiki.org/wiki/Probability#Zero_probability)

------
deckar01
If you have ever played a card game where you have to guess if the next card
is going to be higher or lower than another card, this should not be a
surprising result. You should always assume the next card will be the average
of all possible cards. What makes these kinds of games fun is having to
recalculate the average card in your head based on the cards that you can see.

~~~
lonelappde
This is a different problem. Cards are finite. The OP problem is infinite.

~~~
deckar01
The set of real numbers used in the simulation is finite, but it doesn't
prevent the result from being approximately the same. Using a small number of
cards is no different. The best solution should be obvious, but was not
presented in the cited paper, and was glossed over in this example. The more
interesting result to me is that there is no difference between choosing the
larger of two finite sets and choosing the more probable of two infinite sets.

I think the whole pick a random number and modulate the choice based on it is
rather trivial. Knowing that you are starting with a 75% success rate with the
optimal strategy, you should be able to produce a strategy with an arbitrary
success rate P in [0.25, 0.75] with one very simple extra rule: with
probability X, choose the non-optimal option. Then P = 0.75 - 0.5*X. If you
want to construct a 66% strategy, choose sub-optimally 16.6% of the time
(X = 0.166).
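
That rule is easy to check by simulation (a sketch, using the C = 0 threshold
strategy on uniform draws as the 75% baseline):

```python
import random

random.seed(6)

def play_once(x):
    a, b = random.uniform(-1, 1), random.uniform(-1, 1)
    seen, hidden = (a, b) if random.random() < 0.5 else (b, a)
    guess_larger = seen > 0         # baseline threshold strategy: ~75%
    if random.random() < x:         # with probability x, defect on purpose
        guess_larger = not guess_larger
    return guess_larger == (seen > hidden)

trials = 100_000
rate = sum(play_once(0.166) for _ in range(trials)) / trials
print(rate)  # close to 0.75 - 0.5 * 0.166 = 0.667
```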

------
tomas789
I’m struggling with choosing a random number t from (-inf, inf) where P(t) >
0.

~~~
szemet
Any distribution with these properties works; e.g. just use the standard
normal distribution. It is available in every statistics library, and it is
easy to generate from uniform random reals, for example:

[https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform](https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform)

Wikipedia also has an easy-to-understand summary in the two envelopes problem
article:

[https://en.wikipedia.org/wiki/Two_envelopes_problem#Randomiz...](https://en.wikipedia.org/wiki/Two_envelopes_problem#Randomized_solutions)

There it explains the idea with numbers larger than 0 and uses the
exponential distribution (even easier to generate from a uniform random
variable).
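
For the exponential case, inverse-transform sampling really is a one-liner (a
sketch):

```python
import math
import random

random.seed(7)

def exponential(rate=1.0):
    # Inverse-transform sampling: if U ~ Uniform(0, 1),
    # then -ln(1 - U) / rate is Exponential(rate).
    return -math.log(1.0 - random.random()) / rate

samples = [exponential() for _ in range(100_000)]
print(sum(samples) / len(samples))  # sample mean near 1/rate = 1
```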

~~~
kangnkodos
Almost every number chosen randomly between negative infinity and infinity
would have so many digits that it would not fit on any computer currently in
existence. It would not fit on a piece of paper. In the original problem, a
number is written down on one piece of paper. That limits the possible numbers
involved considerably.

If instead the game were played between two supernatural beings with access
to an infinite piece of paper and infinite time to write the numbers down,
that would be a very different game.

This demonstrates that a sequence of coin flips which could be carried out in
the lifetime of a human, or by a computer while the universe exists, converted
to a number between 0 and 1, and then mapped to the whole number line is not
random between negative infinity and positive infinity. Because of the small
number of coin flips, you will never end up with any of the really big
numbers. And the really big numbers is where it's at.

~~~
szemet
Generally, mathematical theorems involving real numbers do not care about
computability (unless you are a finitist:
[https://en.m.wikipedia.org/wiki/Finitism](https://en.m.wikipedia.org/wiki/Finitism)
) ;)

On the other hand, my intuition is: flipping a coin (generating the next
digit of a real number between 0 and 1) and then transforming it into e.g. an
exponential random variable should, with probability one, let you decide the
ordering after only finitely many coin tosses...

~~~
lonelappde
Mathematicians don't deserve the right to amaze us with counterintuitive
phenomena that can only exist in abstract theory outside the physical
universe. All that shows is that their mathematical structures aren't faithful
to the physical reality they pretend to model.

~~~
rhencke
A Turing machine runs on a theoretically infinitely long tape in both
directions. That doesn't make the concept of a Turing machine useless.

------
godelski
From reading the comments, it seems the counter-intuitive part is that a bound
like (-inf, inf) is different than a bound like [0,10].

------
Ragib_Zaman
Great problem OP.

1 - As some people have already clarified for other commenters, it indeed
makes no difference how Player 1 picks their numbers. They can pick them from
some distribution of their own, or in an adversarial manner. The probability
of winning by following the strategy in the paper is still strictly greater
than 1/2.

2 - In fact, even if Player 1 can read Player 2's mind and knows their
strategy and even the exact distribution they will sample from (but can't see
into the future to see the sample from the distribution), the probability is
still strictly greater than 1/2.

3 - Since it isn't actually included in the paper or any of the comments, for
the sake of completeness I'll write down the computation.

Let P(E) denote the probability of an event E, and W be the event that Player
2 wins by following the strategy suggested in the paper. Let a, b be the
smaller, larger number respectively. A is the event that Player 2 picked a, B
is the event that Player 2 picked b. Then summing over disjoint events,

P(W) = P(A and W) + P(B and W) = P(W | A)P(A) + P(W | B)P(B)

We have P(A) = P(B) = 1/2. Now let x be the result of Player 2 sampling from
their distribution. Given that they picked A, they win if and only if a <= x,
so P(W | A) = P(a <= x). Given that they picked B, they win if and only if x <
b, so P(W | B) = P(x < b). Therefore,

P(W) = (1/2) [P(a <= x) + P(x < b)] = (1/2) [1 + P( x in [a,b) )]
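
This formula, together with point 2, can be illustrated numerically; the pair
4.0 and 4.001 below is a hypothetical adversarial choice sitting deep in the
low-density tail of a standard-normal sampling distribution:

```python
import random

random.seed(8)

# Player 2 has announced: "I sample x from a standard normal."  An
# adversarial Player 1 picks two adjacent numbers deep in the tail,
# where P(x in [a, b)) is tiny.
a, b = 4.0, 4.001
trials = 1_000_000
wins = 0
for _ in range(trials):
    x = random.gauss(0, 1)
    seen, hidden = (a, b) if random.random() < 0.5 else (b, a)
    # Declare "seen is the larger" iff seen > x.
    if (seen > x) == (seen > hidden):
        wins += 1
rate = wins / trials
print(rate)  # still > 1/2 in expectation, but the edge is only ~1e-7
```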

4 - If I were to show this problem to someone else, I may try to emphasize the
potentially adversarial nature of Player 1 and the odds seemingly being
stacked against Player 2 by phrasing it like this (although this may be _over_
exaggerated):

Player 1 gets to write down any two distinct real numbers on two pieces of
paper, and then flips a coin. Player 2 gets to see the number on the left if
the coin lands on Heads, and the number on the right if the coin lands on
Tails. After seeing the number, Player 2 must declare whether they are seeing
the smaller or the larger number. If Player 2 guesses correctly, they win $1
from Player 1. Otherwise, Player 2 pays $1 to Player 1.

Further, now knowing the rules of the game, both players can choose any
particular strategy for playing the game. However, whatever Player 2 chooses
as their strategy, they must inform Player 1 of their strategy and not
deviate from it when the game is played. Player 1 is allowed to adjust their
strategy after hearing Player 2's strategy. Would you prefer to be Player 1
or Player 2?

I think, worded like that, many people's first guess might even be Player 1.
Then their next answer would be that it doesn't matter, and then they're in
for a treat when they see that they should choose to be Player 2!

~~~
kgwgk
> Then their next answer would be that it doesn't matter, and then they're in
> for a treat when they see that they should choose to be Player 2!

But Player 1 can make Player 2's edge as small as he wishes, so it really
doesn't matter.

