
How Common Is Your Birthday? (2012) - helloworld
http://thedailyviz.com/2016/09/17/how-common-is-your-birthday-dailyviz/
======
ioltas
It would be interesting to have data like that per country. For example,
in Japan, I would suspect that the top spots are in April and May. Since the
fiscal year begins on the 2nd of April, couples prefer to put their kid into
daycare in the 0-year-old class (0歳児) with the kid as old as possible. So if
a kid is born on the 2nd of April, he will be able to enter the 0-year class
when he is 11 months and 30 days old, while if he is born in March, the kid
could only enter daycare in the 1-year class (1歳児), which is way harder to
join because there are fewer new places in 1-year classes than in 0-year
classes. Note that a kid can only go to daycare if older than 6 weeks (I may
not recall details correctly here), so it is possible for kids born in
February to join the 0-year class, but most of their classmates would likely
be closer to turning 1 once they all join the daycare, which makes for a huge
difference at this age.

------
blowski
In the UK, babies are more likely to be born on a Thursday as hospitals do
what they can to prevent babies being born at the weekend when there are fewer
doctors around. No actual data, but that’s what the midwife told us.

~~~
diminish
On this visual you can see that very few babies are born on December 24,
December 25, and July 4th, which fall on or around US public holidays.

What's confusing the visual are planned birth days.

------
moscovium
Heavy concentration 9 months after the coldest times of the year...baby it's
cold outside.

Personally, there was a pretty significant blizzard ~40 weeks before I was
born. Snowstorm babies unite.

~~~
ekianjo
Would be nice to see if Australia has something similar but at other periods
of the year?

~~~
dhoulb
I’d love to know that too. I assume some part of the September bump is
“festive cheer”, which would still show in Australia. But it’d be fun to know
how much was each.

~~~
ekianjo
Is there any Australian government data available anywhere? It's worth
checking.

------
dhoulb
It’s obvious people are able to exert some control over this, I guess
through induction (or just holding it in??).

Avoiding holidays was the obvious trend, but there were two others I saw that
made me chuckle:

1\. People avoiding February 29, so they don’t give their kid that 1-in-4
birthday pain.

2\. The spike on February 14! People want their kid born on Valentine’s Day
(cute, but not really romantic...)

~~~
antithesis
I find it curious that there's a spike on February 14, but not one on November
14. There's somewhat of an increase starting at November 14, but it's not
clear if that's a consequence of Valentine's Day. It's as if planning a birth
on the holiday motivates reproduction more than the holiday itself.

~~~
shalmanese
Pregnancies aren't exactly 9 months. They're closer to 40 weeks which is
November 7th which does have a moderate hump compared to the surrounding week.
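
A quick check of the date arithmetic, as a sketch rather than anything the chart's author published. The 40-week figure is conventionally counted from the last menstrual period, which puts conception-to-birth at roughly 38 weeks (266 days). The year 2013 below is an arbitrary non-leap year picked for illustration.

```python
from datetime import date, timedelta

# Assumption: 266 days (38 weeks) from conception to birth, and
# 280 days (40 weeks) when counting from the last menstrual period.
CONCEPTION_TO_BIRTH = timedelta(days=266)
LMP_TO_BIRTH = timedelta(days=280)

valentines = date(2013, 2, 14)  # arbitrary non-leap year

print(valentines + CONCEPTION_TO_BIRTH)  # 2013-11-07
print(valentines + LMP_TO_BIRTH)         # 2013-11-21

# Working backward from a birth date to an estimated conception date.
print(date(2013, 10, 2) - CONCEPTION_TO_BIRTH)  # 2013-01-09
```

So November 7th corresponds to 38 weeks after February 14, not 40. The last line matches the sample popup quoted further down the thread (10/2 with a conception date around 1/9), which suggests the chart uses the 266-day convention.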

------
c3534l
As a side note, my stats professor mentioned that power outages have been
linked to clusters of births 9 months later.

------
Zenst
I find it surprising that you are more likely to be born on the 29th of
February than the 25th of December.

~~~
jcranmer
You aren't. If you drill into the data, the value for each day is AVG(#births
on 1994-MM-dd, 1995-MM-dd, etc., 2014-MM-dd). For 02-29, the cell for, say,
2001-02-29 doesn't exist, so it's merely AVG(1996-02-29, 2000-02-29,
2004-02-29, 2008-02-29, 2012-02-29). (This is conjecture as to how the
researcher computed the numbers from the raw data tables, but the wording of
the summary suggests this is the algorithm.)
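
To make the conjectured averaging concrete, here is a minimal sketch. The function name `average_by_month_day` and the flat 10,000-births-per-day data are made up for illustration; nothing here comes from the actual dataset.

```python
from collections import defaultdict
from datetime import date

def average_by_month_day(daily_births):
    """Map (month, day) -> (mean births, number of years that date exists)."""
    buckets = defaultdict(list)
    for day, births in daily_births.items():
        buckets[(day.month, day.day)].append(births)
    return {md: (sum(v) / len(v), len(v)) for md, v in buckets.items()}

# Made-up data: a flat 10,000 births on every day from 1994 through 2014.
first = date(1994, 1, 1).toordinal()
last = date(2014, 12, 31).toordinal()
daily_births = {date.fromordinal(o): 10_000 for o in range(first, last + 1)}

averages = average_by_month_day(daily_births)
print(averages[(2, 29)])   # (10000.0, 5)
print(averages[(12, 25)])  # (10000.0, 21)
```

With this kind of averaging, February 29 is divided by 5 leap years rather than 21 calendar years, so its per-occurrence average looks like any ordinary day even though far fewer people share it.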

~~~
ekianjo
Why not take a frequentist approach and compare each date against the
population of all birthdays collected? It would be much more accurate, since
02-29 would then appear to be very much an exception.
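
One reading of this suggestion is to compute each date's share of all births in the period instead of a per-occurrence average. A rough sketch under that assumption, again with made-up flat data and a hypothetical helper `share_of_all_births`:

```python
from collections import defaultdict
from datetime import date

def share_of_all_births(daily_births):
    """Map (month, day) -> fraction of all births in the period."""
    totals = defaultdict(int)
    for day, births in daily_births.items():
        totals[(day.month, day.day)] += births
    grand_total = sum(totals.values())
    return {md: t / grand_total for md, t in totals.items()}

# Made-up data: a flat 10,000 births on every day from 1994 through 2014.
first = date(1994, 1, 1).toordinal()
last = date(2014, 12, 31).toordinal()
daily_births = {date.fromordinal(o): 10_000 for o in range(first, last + 1)}

shares = share_of_all_births(daily_births)
ratio = shares[(2, 29)] / shares[(12, 25)]
print(round(ratio, 3))  # 0.238
```

Under this measure February 29 gets about 5/21 of the weight of an ordinary date in a 21-year window, which is the "exception" the comment expects to see.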

------
kgwgk
A couple of relevant links:

[http://andrewgelman.com/2016/05/18/birthday-analysis-friday-...](http://andrewgelman.com/2016/05/18/birthday-analysis-friday-the-13th-update/)

[http://andrewgelman.com/2017/01/17/laurie-davies-time-series...](http://andrewgelman.com/2017/01/17/laurie-davies-time-series-decomposition-birthday-data/)

------
__MatrixMan__
I had always thought that April/May was prime birthday season. So many of my
family and friends have birthdays in that time of year. I was very surprised
to find that it was not a particularly likely time to have a birthday.

I wonder if this error is common in other people. And if so, if there are
social-network dynamics around birthdays that cause the bias.

------
bogdan_r
There are 50% fewer births on Sundays:
[https://docs.google.com/a/nimblex.org/spreadsheets/d/1WhpO_C...](https://docs.google.com/a/nimblex.org/spreadsheets/d/1WhpO_C4i-apoBpl3MVYlk1D6Fx0yl_Jh5ZgnNk0FB4A/edit?usp=sharing)

------
amelius
I find it more interesting to correlate birthday frequency with
characteristics like career or health, e.g.:

[https://www.bustle.com/articles/93018-can-your-birthday-pred...](https://www.bustle.com/articles/93018-can-your-birthday-predict-your-career-path-maybe-according)

------
ggggtez
It's been mentioned in other articles, but the ability to induce birth allows
mothers to avoid birthdays on holidays. They can plan and, with assistance,
schedule a delivery a day earlier, for example.

------
wmil
Any idea what causes the late November dead zone? People seem to be inducing
to avoid a Christmas birth. But what happens in the last week of November?

------
spookyuser
Does anyone have a link to the interactive version? I can't see it anywhere on
the page.

~~~
PhantomGremlin
Are you on mobile? You must not have a "larger" screen?

If you mouse over the individual squares of the image that follows the words:

    
    
       How Popular Is Your Birthday?
       Two decades of American birthdays,
       averaged by month and day.
    

you will get popups showing more details for each particular date.

A sample popup text says:

    
    
       this date,
       10/2, had
       11,572
       births on
       average. It
       ranks 77th.
       The
       conception
       date* is
       around 1/9.

~~~
spookyuser
Oh I was on desktop. Maybe my monitor's resolution was too low :/

------
lurcio
Strange that September 31st sees so few births, given that the month is the
most popular.

~~~
zeitg3ist
September has 30 days...

~~~
ak39
I work a lot with visualisation. It becomes interesting when you release a
chart to end users only to get feedback such as lurcio's.

At first I would simply correct and "educate" the end user on how to interpret
the charts. Recently though I have begun to "register" such types of questions
and misunderstandings. I now actively determine WHY the end user made such a
seemingly simple error in usage. And more often than not, I've concluded, the
fault lies in the presentation of the chart rather than in the user's
understanding of it. It is not easy to create data visualisations that do not
mislead!

In this case, we as users are introduced to a gradual shading of a pink/maroon
palette of calendar dates. The shading goes from dark (almost black) to light
pink (almost white). So both white and black squares are tacitly presented as
features. The mind extrapolates this. Using white for an invalid date is bound
to create confusion for a user not paying attention. And charts are always sold
to end users as "you relax, we've done the thinking, just consume this easy
picture". As users, our guard is down by invitation!

I'd make all the invalid dates a colour from a completely different palette or
strikingly different colour (eg. gray, yellow or green) to stand out clearly,
indicating immediately that you are looking at invalid areas of the chart.
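
A minimal sketch of that idea, assuming a month-by-day grid like the one in the article. The function `cell_colour`, the gray sentinel, and the two RGB endpoints are illustrative choices, not the article's actual palette.

```python
import calendar

INVALID_COLOUR = "gray"   # deliberately outside the pink/maroon ramp
LIGHT = (255, 230, 235)   # assumed light-pink end of the ramp
DARK = (60, 0, 20)        # assumed dark-maroon end of the ramp

def cell_colour(month, day, value, lo, hi):
    """Return a colour for one grid cell; impossible dates get a distinct colour."""
    # Year 2000 is a leap year, so February 29 counts as a valid date here.
    days_in_month = calendar.monthrange(2000, month)[1]
    if day > days_in_month:
        return INVALID_COLOUR
    # Linear interpolation from LIGHT (fewest births) to DARK (most births).
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    r, g, b = (round(a + t * (z - a)) for a, z in zip(LIGHT, DARK))
    return f"#{r:02x}{g:02x}{b:02x}"

print(cell_colour(9, 31, None, 0, 1))  # gray
print(cell_colour(9, 30, 0, 0, 1))     # #ffe6eb
print(cell_colour(9, 30, 1, 0, 1))     # #3c0014
```

Because the validity check runs before any value is mapped onto the ramp, an impossible date such as September 31 can never be mistaken for a very low birth count.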

------
kqr
This is partly in response to the comment mikeash made, but also a question
I've been carrying around for a long time. Mikeash wrote:

> 9/11 has a small but noticeable dip compared to the surrounding dates. I
> wonder if that's people avoiding it or if it's just coincidence

I also see something else: there is a light streak going down the 13th of
every month, as if people were having fewer babies that day.

If I want to figure out whether this is an actual pattern or just a
coincidence, how would I go about that? I mean, as I understand the general
process it's a two-parter:

1\. Come up with a model of the situation: say, uniform probability of giving
birth any given day.

2\. Calculate the probability of getting your actual results based on the
model. If these are lower than some threshold you pick, say, 5% or 1%, then
you discard your model because you have found something that breaks your
assumptions.
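
The two steps above can be sketched with a small simulation. This is one possible approach, not the only one: it uses a chi-square statistic as the measure of "how far from uniform" and estimates the p-value by Monte Carlo. The function names and the toy counts are made up.

```python
import random
from collections import Counter

def chi_square(counts, expected):
    """Sum of squared deviations from the expected count, scaled by it."""
    return sum((c - expected) ** 2 / expected for c in counts)

def monte_carlo_p_value(observed, n_sims=200, seed=0):
    """Estimate P(statistic >= observed statistic) under a uniform-day model."""
    rng = random.Random(seed)
    n_days = len(observed)
    total = sum(observed)
    expected = total / n_days
    observed_stat = chi_square(observed, expected)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Step 1: the model. Every birth lands on a uniformly random day.
        draws = Counter(rng.choices(range(n_days), k=total))
        simulated = [draws.get(day, 0) for day in range(n_days)]
        # Step 2: how often does the model look at least this uneven?
        if chi_square(simulated, expected) >= observed_stat:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_sims + 1)

perfectly_flat = [10] * 365
everything_on_one_day = [3650] + [0] * 364
print(monte_carlo_p_value(perfectly_flat))         # 1.0
print(monte_carlo_p_value(everything_on_one_day))  # 1/201, about 0.005
```

Keep in mind that a strictly uniform model is a weak baseline for real birth data, since the thread already points out weekday and holiday effects, so rejecting it says little about any one particular date.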

But there's lots of nuance here. Does my model even make sense? Isn't it
more useful to define the number of child births on any given day to be
normally distributed with mu = average over the year? Are the two equivalent
because something something central limit theorem?

Is there a simple and convenient way to do the calculations for the most
common models? And say I'm interested only in the streak of the 13th... do I
need to consider all other days as well before I discard my model? My
intuition says that the streak of the 13th is easier to prove/disprove than
the dip of 9/11, because it's 12 data points (are there?) compared to just one
-- but I also feel like I'm way off base here...
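
For the 13th specifically, one casual option is a sign-flip permutation test on 12 paired differences: births on the 13th minus the average of the 12th and 14th of the same month. The numbers below are invented for illustration, and `sign_flip_p_value` is a hypothetical helper, so treat this as a sketch of the technique only.

```python
import random

def sign_flip_p_value(differences, n_sims=10_000, seed=0):
    """One-sided test: is the total difference more negative than chance gives?

    Under the null hypothesis the 13th is no different from its neighbours,
    so each difference is equally likely to be positive or negative.
    """
    rng = random.Random(seed)
    observed = sum(differences)
    hits = 0
    for _ in range(n_sims):
        flipped = sum(d if rng.random() < 0.5 else -d for d in differences)
        if flipped <= observed:
            hits += 1
    return (hits + 1) / (n_sims + 1)

# Invented differences, one per month: 13th minus mean of 12th and 14th.
differences = [-310, -120, -450, -80, -200, -150,
               -270, -90, -330, -60, -180, -240]

p = sign_flip_p_value(differences)
print(p < 0.01)  # True
```

With all 12 differences negative, the exact one-sided p-value is 1/4096, so the intuition that 12 data points carry more weight than a single date holds up. This sketch does not control for the day of the week, which the Friday-the-13th analysis linked elsewhere in the thread treats as important.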

I'm at a bit of a loss here. I haven't studied a lot of statistics because
I've been busy with other things but this is the kind of thing I'd love love
love to learn.

I do remember a bit of the Think Bayes book, and I could plug each month into
a computer program updating the birth probabilities of each day based on
that... but how do I know if I have enough data to get a day-by-day resolving
power? Maybe I should look at weekly averages instead?

I mean, I understand this is a whole field of research and not something you
learn overnight, but is there not some subset of "countertop hypothesis
testing" that you can apply casually over the kitchen table to get you at
least somewhere?

TL;DR: I think I see patterns in this data and I want to statistically reject
my hypotheses. How do I approach that?

Edit: I guess the reason my question differs slightly from high school stats
is because I don't get to design an experiment that will give easily
analysable data. And I'm afraid of creatively interpreting the existing data
to make it easier to deal with, because I think that will lead me to incorrect
conclusions.

~~~
Fuzzwah
I'm confident that this fact plays a role in the dips:

> The rate of caesarean section births in the U.S. was 32.7 percent in 2013

With so many c-sections being performed there is more selection of birthdates.

~~~
dhoulb
Really useful context, thanks!

------
dang
What "might surprise" me is an article dated 2016 when it comes from 2012:
[https://news.ycombinator.com/item?id=3968433](https://news.ycombinator.com/item?id=3968433).
Except now can we trust the 2012?

~~~
hellofunk
Yet the data in the article goes to 2014!

~~~
dang
Perhaps they wanted the average of the dates to be accurate.

