
Some Useful Probability Facts for Systems Programming - yarapavan
https://theartofmachinery.com/2020/01/27/systems_programming_probability.html
======
unoti
I want to share a trick with you all.

> Because the monster battles are statistically independent, we can make a
> rough guess that about 1 in 9 players still won’t have a special item after
> 20 battles.

From a game design perspective, for some situations this can be a real problem
that leads to players getting discouraged and quitting, which leads to loss of
revenue. In the games I've made, I've often used a special trick for
overcoming this problem. Rather than using plain random numbers, I
conceptually use a shuffled deck of cards, where each player receives a
different random seed for the shuffle and I maintain state on how many "cards"
they've drawn so far. I put one "win" in the deck and some number of non-wins.
This way it's still random, but if people keep at it they'll eventually get
the thing that they want. With plain raw probabilities, you'll always have a
certain number of customers who keep paying to try and keep failing. The
random shuffle is a far better way to approach this from a customer service
perspective.

This makes it easier to reason about from a game design and customer service
perspective. I can say that no player will ever have to go through more than X
attempts. I say it's a customer service issue because in many of my games the
players feel like there's real money on the line, and so it can literally lead
to customer service tickets, wailing, and gnashing of teeth.
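
A minimal sketch of the mechanic in Python (the deck size and per-player seed
handling are illustrative, not from any particular game):

    import random

    class PrizeDeck:
        """A 'shuffled deck' drop mechanic: exactly one win per deck, so a
        player is guaranteed the item within deck_size attempts."""

        def __init__(self, player_seed, deck_size=20):
            deck = ["win"] + ["miss"] * (deck_size - 1)
            random.Random(player_seed).shuffle(deck)  # per-player shuffle
            self.deck = deck
            self.draws = 0  # persist this counter per player

        def draw(self):
            card = self.deck[self.draws % len(self.deck)]
            self.draws += 1  # reshuffling on exhaustion is a design choice
            return card

    # Worst case: a player wins on their 20th battle, never later.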

~~~
kazagistar
An additional consideration is that predictability can lead to unintentional
gaming of mechanics. For example, there are WoW addons that keep track of your
bonus damage RNG trinkets to tell you when your RNG is hot and when it's cold,
so you can optimize when you use some big finisher.

~~~
VRay
This got really bad in World of Warcraft: Legion, when people were deleting
and re-creating their characters to try and get better loot.

The game had these "Legendary" items of incredible power you could attain that
were insanely rare, but had a sort of "pity timer" where you'd get a higher
and higher chance of obtaining one based on how long it'd been since your last
item.

The problem is that the legendary items were almost inconceivably rare drops
through pure random luck, but the escalating chance was really predictable.
You were all but guaranteed to never get a legendary item through pure chance
alone, but you were basically guaranteed to get one every two weeks if you did
all the time-limited content available. (Dungeon raids you can do once per
day/once per week, daily chore missions, etc).

For the first few months of the system's availability, after 4 legendary items
you'd no longer get the "pity timer" bonus to your loot rolls on legendary
items, so your character was effectively capped at 4 legendary items.

Some of the available items could increase your character's combat
effectiveness by a relatively huge margin compared to others, and it only took
about a week to power-level a new character and get it one or two legendary
items, so for a while people were abandoning their max level, legendary-geared
characters with 100+ hours invested into them to try again.

~~~
henrikschroder
Speaking of WoW, I got hit by the statistics when they introduced an
achievement for the in-game valentine's holiday.

Each player got a bag from which you could take a random piece of heart candy
once an hour. There were a fixed number of heart candies, the bag worked for a
week, and you needed one of each to get the achievement.

One random draw an hour for at most a week is a limited number of attempts, so
what are the odds of never getting a certain heart piece? About one
in a million or so. How many players did WoW have at that time? Over ten
million. So you're pretty much guaranteed that even if you never missed an
opportunity to draw a piece of candy, some players would never get one of
each, and never get the achievement... Statistics is a bitch.

(They implemented pity timers for every similar achievement afterwards, as far
as I know.)
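
For anyone who wants to play with the numbers, here's the coupon-collector
calculation (the candy count and draw limit below are assumptions for
illustration; the real event's numbers may have differed):

    from math import comb

    def p_missing_some_type(k, m):
        """P(at least one of k equally likely candy types never appears
        in m draws), by inclusion-exclusion."""
        return sum((-1) ** (i + 1) * comb(k, i) * (1 - i / k) ** m
                   for i in range(1, k + 1))

    # Illustrative numbers only: 8 candy types, 100 draws.
    p = p_missing_some_type(k=8, m=100)
    print(p)                  # ~1.3e-05 per player
    print(p * 10_000_000)     # expected unlucky players among 10M: ~130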

~~~
thaumasiotes
> what are the odds of never getting a certain heart piece? About one in a
> million or so. How many players did WoW have at that time? Over ten million.
> So you're pretty much guaranteed that even if you never missed an
> opportunity to draw a piece of candy, some players would never get one of
> each, and never get the achievement...

This extends the time to get the achievement from one year to "for 0.0001% of
players, two years".

That will annoy those, um, ten people, but "getting the achievement next year"
is pretty different from "never getting the achievement".

~~~
henrikschroder
Yeah, because the achievement was part of the meta-achievement What A Long
Strange Trip It's Been, and if you were one of those ten people, like me, it
was pretty damn annoying. (There were actually quite a lot more of us, since
very few people drew the maximum number of candies from the bag, so the
realistic odds of never getting it despite your best efforts were maybe one in
100k or something?)

Anyway, they fixed it retroactively for the first year somehow, so that was
nice.

~~~
thaumasiotes
> Yeah, because the achievement was part of the meta-achievement What A Long
> Strange Trip It's Been

This severely _weakens_ the case that it was a problem that needed to be
fixed; if you're getting the Valentine's achievement because you want What A
Long, Strange Trip It's Been, the extra waiting time you suffer from not
getting Valentine's the first time around is lowered. (Because you would have
had to spend a lot of that time waiting anyway, for the other calendar-based
achievements that you didn't already have.)

------
fyp
Birthday paradox is way too important to omit from this list!

For example if you're uploading stuff to a bucket, you can compute its hash
first to figure out if a duplicate already exists and if so, skip the upload.

Why can you do this? What if it was just a hash collision? Shouldn't you still
compare the contents to _really_ make sure they are the same?

Turns out if your hash function is N bits you will need to have 2^(N/2) items
before you see two hashed to the same thing by chance. If you choose a 256 bit
cryptographic hash function like SHA256, that's 2^128. This probability is so
low you have a higher chance of encountering a cosmic ray bit flip!

[https://en.wikipedia.org/wiki/Content-addressable_storage](https://en.wikipedia.org/wiki/Content-addressable_storage)

[https://en.wikipedia.org/wiki/Birthday_problem](https://en.wikipedia.org/wiki/Birthday_problem)
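
A quick way to sanity-check this with the standard birthday approximation,
p ~= 1 - exp(-k(k-1) / (2 * 2^N)) for k items (a sketch, not tied to any
particular storage system):

    from math import expm1

    def p_collision(k, n_bits):
        """Birthday approximation: P(some two of k items share an n-bit hash)."""
        return -expm1(-k * (k - 1) / (2 * 2 ** n_bits))

    print(p_collision(10 ** 12, 256))  # ~4.3e-54: a trillion objects, still negligible
    print(p_collision(2 ** 128, 256))  # ~0.39: 2^(N/2) items is the tipping point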

~~~
ebg13
> _Turns out if your hash function is N bits you will need to have 2^(N/2)
> items before you see two hashed to the same thing by chance._

No no no. That's not how probability works. You will see that many on average,
but you don't " _need to_ " anything before you can see a collision. You could
get collisions on the next 100 files. It's just unlikely. The bitspace bounds
the denominator of the match probability for each file independently; it does
not count files. Unlikely events happen all the time _somewhere_.

~~~
fyp
The precise statement is that you need to have O(2^(N/2)) items before the
_probability_ of finding a hash collision is greater than 50% (or whatever
nontrivial percentage). English is hard and I think whoever cares about the
details can look it up themselves.

~~~
ebg13
> _I think whoever cares about the details can look it up themselves_

Your statement was an incredibly common and often-repeated misconception about
probability and circumstance. Someone looking it up for themselves, as you
suggest, is more likely than not to encounter dozens of wrong statements on
the subject before they ever find a right one. I think it behooves us to not
add more misinformation to the pile.

------
kwantam
A missed opportunity: the balls-and-bins discussion didn't go on to discuss
the power of two choices.

In short, if you're tossing N balls randomly into N bins, you should expect
the most heavily loaded bin to get about log N / log log N of the balls.

If instead you "toss" by choosing two random bins and then putting the ball in
the one with fewer balls, you should instead expect the most loaded bin to get
only log(log N) balls.

In practice, it's often not so hard to modify the first strategy into the
second to get dramatically more uniform load distribution among bins.

[http://www.eecs.harvard.edu/%7Emichaelm/postscripts/handbook...](http://www.eecs.harvard.edu/%7Emichaelm/postscripts/handbook2001.pdf)

(You might ask, if two choices are good, are three better? Yes, but only
slightly. You get an exponential improvement going from 1 to 2; after that,
testing more bins improves by constant factors.)
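
A small simulation of the two strategies (a sketch; the bin count here is
arbitrary):

    import random

    def max_load(n, choices):
        """Throw n balls into n bins; each ball probes `choices` random bins
        and lands in the least loaded one."""
        bins = [0] * n
        for _ in range(n):
            candidates = [random.randrange(n) for _ in range(choices)]
            best = min(candidates, key=lambda b: bins[b])
            bins[best] += 1
        return max(bins)

    n = 100_000
    print("one choice: ", max_load(n, 1))   # typically 8 or so
    print("two choices:", max_load(n, 2))   # typically 4 or so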

~~~
amalcon
> _You might ask, if two choices are good, are three better? Yes, but only
> slightly. You get an exponential improvement going from 1 to 2; after that,
> testing more bins improves by constant factors._

The intuitive explanation for this is that the largest benefit comes from
avoiding the most full bin. Two choices is always sufficient to do that.

------
nneonneo
Another one to add: if N is big enough, even statistically improbable events
like single bit flips in memory become likely or even nigh-certain. For
example, if you set up a bunch of fake domains that differ from real domains
in a single bit, you might get upwards of a dozen hits _per day_ from clients
clearly trying to reach the real service - this is called Bitsquatting:
[http://dinaburg.org/bitsquatting.html](http://dinaburg.org/bitsquatting.html)

For example, setting up the server "mic2osoft.com" (a one-bit-flip error from
microsoft.com) yielded this request:

    msgr.dlservice.mic2osoft.com 213.178.224.xxx "GET /download/A/6/1/A616CCD4-B0CA-4A3D-B975-3EDB38081B38/ar/wlsetup-cvr.exe HTTP/1.1" 404 268 "Microsoft BITS/6.6"

This is a machine trying to download some kind of update package _from the
wrong server_ because somewhere in its memory the "r" from "microsoft.com" got
flipped to a 2 (0x72 -> 0x32).
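
Enumerating the candidate domains is straightforward; a sketch (registering
them, as in the linked post, is the real work):

    import string

    VALID = set(string.ascii_lowercase + string.digits + "-")

    def bitsquats(domain):
        """One-bit-flip variants of `domain` that are still valid hostnames."""
        variants = set()
        for i, ch in enumerate(domain):
            if ch == ".":
                continue  # leave the label separator alone
            for bit in range(8):
                flipped = chr(ord(ch) ^ (1 << bit))
                if flipped in VALID:
                    variants.add(domain[:i] + flipped + domain[i + 1:])
        return sorted(variants)

    # Includes 'mic2osoft.com': 'r' (0x72) with bit 6 flipped is '2' (0x32).
    print(bitsquats("microsoft.com"))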

~~~
Chris2048
Is it possible someone typed that in manually?

------
air7
Worth mentioning is the Waiting Time Paradox:

"When waiting for a bus that comes on average every 10 minutes, your average
waiting time will be 10 minutes." And worse "when the average span between
arrivals is N minutes, the average span experienced by riders is 2N minutes."

These realizations about Poisson distributions have a lot of real-world
implications such as packet traffic, call center calls, etc.

\- [https://jakevdp.github.io/blog/2018/09/13/waiting-time-paradox/](https://jakevdp.github.io/blog/2018/09/13/waiting-time-paradox/)
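
The length-biasing behind this is easy to see in a simulation (a sketch
mirroring the linked post's setup):

    import random

    random.seed(1)
    # Poisson bus arrivals: exponential gaps with a 10-minute mean.
    gaps = [random.expovariate(1 / 10) for _ in range(100_000)]
    print(sum(gaps) / len(gaps))               # ~10: the average gap

    # A rider arriving at a uniformly random time lands in a gap with
    # probability proportional to its length (length-biased sampling):
    rider_gaps = random.choices(gaps, weights=gaps, k=100_000)
    print(sum(rider_gaps) / len(rider_gaps))   # ~20: the gap riders see
    # Average wait is half the experienced gap: ~10 minutes, not 5.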

~~~
billforsternz
For some reason this is the post that took me back to an EE numeric analysis
and stats practical exercise nearly 40 years ago. A friend encounters a
miserable looking me standing on the street with a clipboard and a stopwatch.
Friend: "What the hell are you doing?". Me: "Testing the hypothesis that
passing traffic is Poisson distributed".

------
graycat
My favorite of such applications of probability is the Poisson stochastic
arrival (get _arrivals_ at discrete points in time) process. There are two
amazing, powerful, useful, non-obvious biggie points:

(1) Axiomatic Derivation

As in

Erhan Çinlar, 'Introduction to Stochastic Processes', ISBN 0-13-498089-1.

an arrival process with stationary (distribution not changing over time)
independent (of all past history of the process) increments (arrivals) is
necessarily a Poisson process. So there is an arrival _rate_ , and times
between arrivals are independent, identically distributed exponential random
variables.

Often in practice one can check these assumptions well enough just intuitively.

Then as in Çinlar can quickly derive lots of nice, useful results.

(2) The Renewal Theorem.

As in

William Feller, 'An Introduction to Probability Theory and Its Applications,
Second Edition, Volume II', ISBN 0-471-25709-5,

roughly, with meager assumptions and approximately, if the arrivals are from
many different independent sources, not necessarily Poisson, then the
resulting process, that is, the sum from the many processes, is Poisson.

E.g., using (2), between 1 and 2 PM the arrivals at a busy Web site, coming
from lots of independent Web users, will look Poisson, and then from (1) one
can say a lot for sizing the server farm, looking for DDoS attacks, and
spotting security, performance, network, and system management anomalies,
e.g., via statistical hypothesis tests.

Similarly for packets on a busy communications network, server failures in a
server farm, etc.
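
A quick illustration of (2), with uniform (decidedly non-exponential)
interarrival times standing in for "not necessarily Poisson" sources:

    import random

    random.seed(1)
    merged = []
    for _ in range(200):                    # 200 independent sources
        t = random.uniform(0, 100)          # staggered starting phases
        while t < 10_000:
            merged.append(t)
            t += random.uniform(50, 150)    # uniform gaps, mean 100
    merged.sort()

    gaps = [b - a for a, b in zip(merged, merged[1:])]
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    # For an exponential distribution the standard deviation equals the
    # mean; the merged stream comes out close, as Poisson predicts.
    print(mean, var ** 0.5)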

~~~
stuxnet79
Indeed, Poisson processes are surprisingly common in networking.

------
pkilgore
We can all use excellent reminders like this about how random outcomes can be
distributed really unevenly.

For example, if you attribute even _some_ success to randomness (a reasonable
assumption, I would think), then at least some fraction N of the successful
people you see in life are just lucky! If 1000 people flip a fair coin 10
times, there's a really good chance (60%+) someone gets 10 heads in a row!
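
The arithmetic behind that figure:

    p_one = (1 / 2) ** 10              # one person flips 10 heads in a row
    p_any = 1 - (1 - p_one) ** 1000    # at least one success among 1000 people
    print(p_any)                       # ~0.62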

Optimize your existence as you see fit with that information! Generate more N,
spend more time with your family, try to move the needle on the amount of
randomness that contributes to your personal definition of success, etc.

~~~
hans1729
> Optimize your existence as you see fit with that information!

That sounds very intriguing, yet I don't really have an aha!-moment. Care to
give an example or two from your personal experience?

~~~
pkilgore
Sure.

I quit a job that put me in the top 5% of salaries to do something I loved
that gave me more time outside work, because I realized the woman I share my
life with was my "ten heads in a row" and not the job that depressed me.

Another example is that rather than sinking my time into one project, one
hobby, one organization, I tend to jump around a lot. That isn't to say I am
constantly in that state, I've just learned that eventually the coin will flip
heads and I'll be working with great people on something really interesting
that is worth my time to go deep. "Worth" here, not necessarily being
financial. It might be educational, or a cause I'm passionate about. Or fun.
This optimizes for N.

~~~
pkilgore
Also moving from law to programming, I changed the effect of randomness on my
success.

In law, I could kick ass, and still lose a case, because of many things that
are likely to be depressing to list in public.

In programming, if I kick ass, it deterministically leads to something that
works! That _is good_. Sure there is politics and popularity and all those
normal human problems too, but those same problems existed in law, so I'm not
losing anything there.

------
aequitas
However, million-to-one chances crop up nine times out of ten.
[https://wiki.lspace.org/mediawiki/Million-to-one_chance](https://wiki.lspace.org/mediawiki/Million-to-one_chance)

~~~
exdsq
My favourite quote from the series. He was such a great author.

------
ishi
Thank you for posting this, the explanations are intuitive and the real-world
examples really help one think about the implications of these probability
facts.

------
thanatropism
An easier almost-proof of the N trials with 1/N chance bit uses the
approximation log(1+x) ~= x for small x.

See: let's find Q such that Prob(no successes in M trials) ~= Q. This is:

(1 - 1/N)^M ~= Q

therefore

M log(1 - 1/N) ~= log Q

Using the approximation,

-M/N ~= log Q

that is

exp(-M/N) ~= Q

M = N yields the result: Q ~= 1/e, about 37%.

Now, if we slightly change the problem so Q is a probability threshold such
that Prob(no successes in M trials) > Q, we get an exact statement: since
x > log(1+x) exactly, -M/N > M log(1 - 1/N) > log Q.
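
A quick numerical check of the approximation:

    from math import exp

    N = M = 100
    print((1 - 1 / N) ** M)   # 0.3660...: exact P(no successes)
    print(exp(-M / N))        # 0.3678...: the 1/e approximation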

------
leto_ii
Not exactly related to the content of the article, but I was also reminded
of the German tank problem:
[https://en.wikipedia.org/wiki/German_tank_problem](https://en.wikipedia.org/wiki/German_tank_problem)

------
jedberg
This was a great and readable explanation!

Also it was a great example as to why you should never use “random” as your
load balancing algorithm unless you plan to always have 1/3 extra capacity.

Or conversely why you should always have 1/3 extra capacity if you must use
random.
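
A sketch of where the 1/3 comes from (assuming N simultaneous requests spread
uniformly at random over N equal servers):

    import random

    n = 10_000                  # servers, and simultaneous requests
    load = [0] * n
    for _ in range(n):
        load[random.randrange(n)] += 1

    print(load.count(0) / n)    # ~0.37: about 1/e of servers sit idle while
                                # others take 2+ requests, hence extra capacity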

~~~
skizm
That condition is only met if you send N requests to N routers. If you send
1,000,000*N requests to N routers, they will almost always be evenly
distributed.

~~~
jedberg
But then you’re under capacity. The assumption is that it takes N servers to
service N requests simultaneously.

------
Anon84
Along the same lines, some of you might be interested in the slides for my
Probability webinar:

[https://drive.google.com/file/d/1qz4wAmwiKadshhrStxcz-8atb0S...](https://drive.google.com/file/d/1qz4wAmwiKadshhrStxcz-8atb0SFcfQ6/view)

where I try to go from the very basics to some useful applications (like
Bayes' theorem and A/B testing).

You can also subscribe to my newsletter:
[https://data4sci.com/newsletter](https://data4sci.com/newsletter) where I
also announce future webinars, live tutorials and trainings, etc.

------
dmos62
All these examples (in the article and in the comments here) shatter my mind.
I don't find them intuitive, and even after reading the explanations I feel
like there's something I'm not getting. I'd run a marathon to become
conversant in probability theory.

~~~
aok1425
Probability is incredibly non-intuitive! If you're willing to run a marathon
to become more conversant in it, spend the ~4-5 hours to run a marathon on
this course:
[https://projects.iq.harvard.edu/stat110/home](https://projects.iq.harvard.edu/stat110/home)

~~~
dmos62
Thanks for the link!

------
semitext
In a lot of computer games that use probability, developers will actually use
a pseudo-random distribution or other means to deceive players about what
their actual chances are, because otherwise the game will seem "unfair" to a
segment of players.

~~~
WilliamEdward
some examples of what you mean?

~~~
patrickmcnamara
Dota 2 definitely does this for a lot of its RNG. For example, your chance of
critically hitting increases whenever you don't critically hit.

[https://dota2.gamepedia.com/Random_distribution](https://dota2.gamepedia.com/Random_distribution)
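
The mechanic the wiki describes, roughly: the proc chance starts low and
climbs by a constant C after each miss, resetting on a success. A sketch (the
C value here is arbitrary; the real constants are tabulated on the wiki):

    import random

    def attempts_until_proc(c):
        """Pseudo-random distribution: P(success on attempt n) = n * c,
        resetting after each success. Returns attempts until one success."""
        n = 1
        while random.random() >= n * c:
            n += 1
        return n

    trials = [attempts_until_proc(0.05) for _ in range(100_000)]
    print(sum(trials) / len(trials))   # ~5.3 attempts per proc (~19% effective)
    print(max(trials))                 # worst case 20: n * c reaches 1, so
                                       # streaks of misses are hard-capped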

