
Every person with a Wikipedia article in a frequency graph by birth year - frostmatthew
http://www.nefariousplots.com/figures/3
======
prawn
"I have no good explanation for the trough centered around 1700..."

Never studied history, but some googling suggests that the Early Modern Period
concluded just after that trough and the Age of Revolutions kicked off. French
Revolution started in 1789. Industrial revolution in the UK started around
that same time.

[http://en.wikipedia.org/wiki/Early_modern_period](http://en.wikipedia.org/wiki/Early_modern_period)

Looking for active or dull periods by _event_ would have to be adjusted by
30-50 years to allow for those born to achieve notoriety, right? 1678 seems to
be the deepest part of the trough suggesting that the early 1700s were dull?

Wikipedia makes any century seem particularly exciting...

[http://en.wikipedia.org/wiki/18th_century](http://en.wikipedia.org/wiki/18th_century)

Anyone more familiar with that period?

~~~
bencollier49
How about all the plagues? Those would have killed off plenty of people.

~~~
prawn
I think they generally occurred at other times?

~~~
bencollier49
[http://en.wikipedia.org/wiki/Great_Plague_of_London](http://en.wikipedia.org/wiki/Great_Plague_of_London)

~~~
prawn
I had seen that, but it also seemed to blend into a range of others including
the earlier and more devastating Black Death. The Great Plague of London
killed 100k, Black Death 75-200 million.

------
Yardlink
Notable people born before 1700 tended to be born in years that are multiples
of 10. I wonder if a lot of dates are less accurate than they're thought to
be.
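
One rough way to check this, sketched in Python: compute the share of births that land on years divisible by 10. The function name and the sample counts are made up for illustration. Over a window of whole decades, accurate dates should put roughly 10% of births on such years, so a much larger share would point to rounding.

```python
def round_year_share(counts, start, end):
    """Share of births in [start, end] that fall on years divisible by 10."""
    in_window = {y: c for y, c in counts.items() if start <= y <= end}
    total = sum(in_window.values())
    on_tens = sum(c for y, c in in_window.items() if y % 10 == 0)
    return on_tens / total if total else 0.0

# Placeholder counts, not taken from the article's data.
sample = {1650: 3, 1651: 1, 1652: 1}
share = round_year_share(sample, 1650, 1659)
```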

~~~
ejr
That's part of it:

    
    
      "When exact birth year was not tagged, the
      individual was prorated over the appropriate time period."

Which makes sense as you still get a relatively accurate portrait of the
scheme. I wonder how large a role the loss of records has played in erasing a
large chunk of our history.
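
The article doesn't say how the prorating was implemented, but a minimal sketch might look like this, assuming each person is reduced to an inclusive range of possible birth years (`prorate` and the sample ranges are hypothetical):

```python
from collections import defaultdict

def prorate(people):
    """Spread each person evenly over their possible birth years.

    `people` is a list of (first_year, last_year) inclusive ranges;
    an exactly known year is the range (y, y).
    """
    counts = defaultdict(float)
    for first, last in people:
        span = last - first + 1
        for year in range(first, last + 1):
            counts[year] += 1.0 / span
    return dict(counts)

# Hypothetical input: one exact birth year, one known only to the decade.
counts = prorate([(1675, 1675), (1670, 1679)])
```

Each person still adds up to exactly one birth in total, which is why per-year counts can come out fractional.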

~~~
kevinwang
"Judging from the spikes every decade prior to about 1800, it seems that
Wikipedia generously applies birth year categories when birth decades would be
more appropriate. Nonetheless, the birth categories turns out to be far better
curated than the machine-readable PersonData field."

Looks like it's a Wikipedia problem.

~~~
Yardlink
Ah right there in the article :)

------
gojomo
In honor of the 1,000,000+ once-upon-a-time notables who'll likely never be
known to Wikipedia:

[http://en.wikipedia.org/wiki/User:Mike_Linksvayer/Article_of...](http://en.wikipedia.org/wiki/User:Mike_Linksvayer/Article_of_the_Unknown_Notable)

------
brudgers
The trend isn't surprising. 15 minutes of fame is easier to achieve than 300
years. It's also less work to research Justin Timberlake than Justinian
because one need not even open a book.

~~~
danohuiginn
I am surprised; I would have expected the over-weighting of recent births to
be far larger. The graph shows 1920s births and 1970s births as about equally
likely to be notable. That doesn't fit with my experience that every minor
30-something writer or actor has their own page.

------
mkx
Isn't history also just much more documented after the advent of the internet
and Wikipedia itself?

~~~
th3iedkid
I doubt that's true of Wikipedia, because it's only supposed to present
material that is already documented elsewhere; it cannot serve as a new source
itself.

As for the Internet, I can't disprove it either.

------
bambax
I guess the birth dates of people on Wikipedia are normalized in order to be
indexable?

I have a (stupid) theory that I would like to test but I don't know how to go
about it. The theory is that many (most) people who go on to achieve great
things are orphaned (or estranged) from their father at a young age (before
they're 6).

A first step to test it would be to calculate the correlation between the age
of the person when their father died, and some measure of their "greatness"
(length of article? number of sources?)

But I don't think the information about the year of death of each parent is
normalized; any idea how to go about this?
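
For the correlation step itself, a plain Pearson coefficient would be one starting point. This is only a sketch: the two lists are placeholder numbers, not real data, and article length is just one of the candidate "greatness" measures mentioned above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Placeholder values for illustration only (not real measurements).
age_at_fathers_death = [3, 10, 25, 40]
article_length = [9000, 7000, 4000, 2000]
r = pearson(age_at_fathers_death, article_length)
```

A value near 0 would mean no linear relationship; either way it would say nothing about causation.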

~~~
danohuiginn
Start with freebase.

But you're going to have a serious problem with missing data. Date of parent's
death will only be there (in structured form) if the parent is themselves
notable enough to have a wikipedia article. Otherwise, it'll only be in the
text.

So I think you'll only get a meaningful result if you manually clean the data.
Choose a 'neutral' source of notable people (e.g. Who's Who). Hire somebody
via odesk or mechanical turk to find and record age of father's death. Compare
with averages across the entire population.

Basic manual research is now cheap; take advantage of it!

------
krzrak
Most important part of the linked article: "The bar for notability or even
remembrance is simply much lower for recent history."

~~~
_delirium
An interesting intersection of those is that the bar for notability is
_really_ low for ancient history, _if_ any information is available. Minor
people about whom some documentation has survived are interesting to
historians, simply because the number of ancient Greeks who we know by name
and about whom any information has survived is small, so essentially all of
them are of interest to historians. Same with texts: every ancient surviving
poem is at least somewhat notable and has some study of it (even surviving
poem _fragments_ ), which is definitely not the case with every surviving poem
from 1995.

~~~
rooneel
True that, but if you were writing poems in, say, ancient Greece, you likely
weren't Little Billy writing love ballads to Susan from across the street.
That someone was writing poems in itself makes them more notable than their
compadres.

~~~
x1798DE
There's this woman, Allia Potestas [1], who is notable just because we happen
to have her epitaph, and the epitaph happens to mention details of some sort
of polyandrous relationship configuration that she had. I doubt she would be
notable if she lived today.

Additionally, I'm pretty sure that poetry in Ancient Greece was if anything
_more_ commonplace than it is today, and not necessarily some high action of
the literati. I think we're used to a world where the marginal cost of
reproducing creative works is very low, so everyone can hear the same songs
and read the same books as everyone else - before the era of easy
reproduction, I think you had a lot more "local talent" generating their own
little poems, songs, plays, etc.

1\.
[https://en.wikipedia.org/wiki/Allia_Potestas](https://en.wikipedia.org/wiki/Allia_Potestas)

------
csl
It would be interesting to plot the PageRank score of each person's page as
well, showing their significance. I would expect a long tailed graph (as we
filter only the very significant persons the farther back we go in time; less
notable ones gradually pass into oblivion).
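
As a sketch of the scoring step, here is a textbook power-iteration PageRank over a toy link graph. The graph, the damping value of 0.85 and the iteration count are illustrative assumptions; a real run would need the link structure extracted from a Wikipedia dump.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank over every page.
                share = damping * rank[p] / n
                for q in pages:
                    new[q] += share
        rank = new
    return rank

# Hypothetical three-page link graph.
ranks = pagerank({"A": ["B"], "B": ["C"], "C": ["B"]})
```

Plotting the resulting score against birth year would be the next step.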

~~~
aquadrop
I think an even better metric would be the number of translations into other
languages for each person's page (maybe in combination with PageRank). I've
noticed it correlates pretty well with "notability" (not just for people), and
I've even been meaning to do it myself for some time :)

------
wudf
I would love to see how this graph has changed between wikipedia's launch and
today.

------
valdiorn
Overlay this with the population of Earth, and suddenly it's a lot more
obvious:

[http://en.wikipedia.org/wiki/World_population](http://en.wikipedia.org/wiki/World_population)

~~~
peteretep
Hrm, at first I upvoted this, but then I saw it was normalized by total births
anyway.
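
For reference, that normalization is just a per-year division. A sketch with hypothetical names and placeholder numbers:

```python
def notable_fraction(notable_by_year, births_by_year):
    """Return {year: notable births / total births}.

    Years with a missing or zero total-birth estimate are skipped.
    """
    return {
        year: notable_by_year[year] / births_by_year[year]
        for year in notable_by_year
        if births_by_year.get(year)
    }

# Placeholder numbers, not real estimates.
fractions = notable_fraction({1900: 5.0, 1901: 2.0}, {1900: 1000.0})
```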

------
sytelus
I haven't understood why Wikipedia insists on notability requirements? Why not
let every human have their own Wikipedia page? After all we are no longer
limited by number of pages that can be printed.

~~~
ordinary
The notability requirement is not there to keep out the plebs, it's there to
keep the information on Wikipedia accurate. In effect, it sets a lower bound on
the reliability of primary sources, improving the quality of the project as a
whole.

~~~
Tomte
I think that's right, but in another way Wikipedia does keep the plebes out:

When you defend a reverted change on the talk page, all too often you get
something along the lines of "I don't care about this discussion, it was an IP
anyway. Stay away!".

Wikipedians often see "IPs" as vandals and don't even pretend to judge on the
merits.

Together with some "ownership" feeling that many Wikipedia users have
regarding "their" pages, it makes for a very unproductive and frustrating
experience for a casual would-be supporter.

Unless you're willing to really commit to Wikipedia, learn all the jargon and
be a part of the power games, Wikipedia is effectively read-only.

As an "IP" you're mostly restricted to fixing typos.

That's perfectly fine, but a stark contrast to how the project is painting
itself.

~~~
aragot
Most IRL people still don't know that wikipedia is editable (starting with my
flatmate, art school age 24). Granted: They just miss the point entirely and
assume it's another Linux-or-Facebook free thing. So IP is very efficient for
demoing: it makes sparkles in their eyes. Then the change is reverted; serious
participation requires an account.

When I'm working from a corporate computer, I keep my IP and don't log in.
That way, it leaves the trail open if anyone wants to trace a bias back to my
company.

------
contingencies
This is interesting to me mostly because I have been a Wikipedia admin since
2003 and have also been writing an ancient history book since 2010, whose scope
begins to cease exactly where Wikipedia picks up.

------
joefreeman
The trough and then peak between 1970-1990 is interesting. Any insights? Are
people more likely to have an article written about them in their 20s than
30s?

------
KhalilK
Any idea how such results were collected? Isn't it resource-demanding to crawl
over Wikipedia?

~~~
danohuiginn
It's the first link under Sources

May 2, 2014 English Wikipedia Archive
[http://dumps.wikimedia.org/enwiki/20140502/enwiki-20140502-p...](http://dumps.wikimedia.org/enwiki/20140502/enwiki-20140502-pages-articles-multistream.xml.bz2)
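
No crawling is needed once you have the dump: it can be streamed from disk. A rough sketch of how one might tally birth categories follows. This is not the article author's code; it assumes the categories appear directly in the wikitext as `[[Category:NNNN births]]`, and the function names are made up.

```python
import bz2
import re
import xml.etree.ElementTree as ET

# Matches e.g. [[Category:1678 births]] or [[Category:1678 births|Sortkey]].
# BC-era categories are not handled by this pattern.
BIRTH_CATEGORY = re.compile(r"\[\[Category:(\d{1,4}) births(?:\|[^\]]*)?\]\]")

def birth_years(wikitext):
    """Return birth years named in '[[Category:NNNN births]]' tags."""
    return [int(y) for y in BIRTH_CATEGORY.findall(wikitext)]

def iter_page_texts(stream):
    """Yield the wikitext of each page in a MediaWiki XML export stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace, if any
        if tag == "text" and elem.text:
            yield elem.text
        elif tag == "page":
            elem.clear()  # release the finished page's contents

def count_births(path):
    """Tally birth-year categories across a .xml.bz2 dump file."""
    counts = {}
    with bz2.open(path, "rb") as stream:
        for text in iter_page_texts(stream):
            for year in birth_years(text):
                counts[year] = counts.get(year, 0) + 1
    return counts
```

Calling `count_births` on the downloaded file would give a raw per-year count in a single pass, without unpacking the archive first.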

------
seanflyon
Anyone else notice that Wikipedia articles are more likely to show even
numbered birth years?

------
x4m
I just can't get it: why are there only fractional Notable Birth counts?

~~~
hollerith
Because the y axis represents notable births as a fraction of all births.

~~~
x4m
When I point with a cursor on a graph line, there are numbers with a caption
"notable births". And then "notable births fraction". So, I suppose, first one
is count...

