

The Leap-Day Birthday Problem: A Bayesian Twist - chimeracoder

I'm curious to see what the data fiends on HN can come up with for this!

Here is an interesting twist on the canonical 'birthday problem'[1] in
probability:
http://twitter.theinfo.org/174910990750728193#id174912339269787648

A more precise form (since I now have >140 chars): 'What is the prior (for a
group of size N) for having exactly two collision pairs of birthdays, with one
on 2/29?'.

I spent just a few minutes on this when first thinking about it last summer,
but depending on how much information you're willing to incorporate, you could
create a fairly complex model for this simple twist.

[1] http://en.wikipedia.org/wiki/Birthday_problem
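
For anyone who wants a baseline to compare a model against, here is a rough
Monte Carlo sketch in Python. It rests on assumptions that are not part of the
question as posed: birthdays are drawn independently, Feb 29 is a quarter as
likely as any other date (1 day in 1461), and 'exactly two collision pairs'
is read as 'exactly two dates are shared, each by exactly two people, and one
of those dates is Feb 29'. The names `is_hit` and `estimate` are made up for
illustration.

```python
import random
from collections import Counter

DAYS = 366                 # dates are encoded 0..365 (hypothetical encoding)
LEAP = 365                 # index 365 stands for Feb 29
WEIGHTS = [4] * 365 + [1]  # assumption: Feb 29 occurs once per 1461 days

def is_hit(birthdays):
    """True if exactly two dates are shared, each by exactly two people,
    and one of the shared dates is Feb 29."""
    counts = Counter(birthdays)
    shared = [day for day, c in counts.items() if c >= 2]
    return (len(shared) == 2
            and all(counts[day] == 2 for day in shared)
            and LEAP in shared)

def estimate(n, trials=100_000, seed=0):
    """Monte Carlo estimate of the probability for a group of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = rng.choices(range(DAYS), weights=WEIGHTS, k=n)
        hits += is_hit(sample)
    return hits / trials

if __name__ == "__main__":
    for n in (30, 100, 1000):
        print(n, estimate(n, trials=20_000))
```

Swapping `WEIGHTS` for empirical per-day birth frequencies would be one way to
drop the uniformity assumption.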
======
pfarrell
We can ignore 100/400 year problem, for now, because our sample population
(working age) were born before 2000.

You're looking for a specific birthday which occurs (assuming random
distribution of birthdays) 1/1461 of the time, so I think you've got
n(1/(1461^y)), where n is the number of employees and y is the number of
people who could have Feb 29. That should give you the percentage likelihood.

1k employees => .4% chance

10k employees => 4% chance

Unless they're twins :)
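
A rough estimate like this can be checked against the exact binomial
calculation. The sketch below assumes independent birthdays (so no twins) and
a 1/1461 chance of Feb 29 for each person, and computes the probability that
at least two of n people were born on Feb 29. The function name is made up for
illustration, and the result is worth comparing against the rough figures
above.

```python
P_LEAP_DAY = 1 / 1461  # assumption: one Feb 29 per 4-year cycle of 1461 days

def prob_at_least_two_leap_day(n, p=P_LEAP_DAY):
    """P(at least two of n independent people were born on Feb 29)."""
    none = (1 - p) ** n                       # nobody born on Feb 29
    exactly_one = n * p * (1 - p) ** (n - 1)  # exactly one person born on Feb 29
    return 1 - none - exactly_one

if __name__ == "__main__":
    for n in (30, 1000, 10000):
        print(n, prob_at_least_two_leap_day(n))
```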

Wondering what you'd get from Bayesian. Maybe you wouldn't have to assume
random distribution of birthdays? Like, maybe mothers really want to (or don't
want to) have leap babies, so that slightly influences the likelihood of a Feb
29 birthday?

~~~
chimeracoder
> We can ignore 100/400 year problem, for now, because our sample population
> (working age) were born before 2000.

Actually, it's that they're born after 1900, because 2000 was a leap year, but
you get the idea.

> Wondering what you'd get from Bayesian. Maybe you wouldn't have to assume
> random distribution of birthdays? Like, maybe mothers really want to (or
> don't want to) have leap babies, so that slightly influences the likelihood
> of a Feb 29 birthday?

Exactly - it's rather well-known that the distribution of birthdays isn't
uniform (even for the other 365 days). It's only been ~10 minutes, so I'll
wait a bit (in case any curious person wants to tackle this) before explaining
some of the ways I was thinking that one could incorporate that information
into the model.

