

Tracking Flu Outbreaks with Wikipedia - lauradhamilton
http://www.additiveanalytics.com/solutions/flu_tracker

======
lauradhamilton
I wrote a script to import Wikipedia traffic data for the Influenza page and
graph it. (Updates daily)

Graph:
[http://www.additiveanalytics.com/solutions/flu_tracker](http://www.additiveanalytics.com/solutions/flu_tracker)

Inspired by this study published last week:
[http://www.ploscompbiol.org/article/info:doi/10.1371/journal...](http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003581#abstract1)

The researchers found that Wikipedia page view data provided better real-time
reporting on influenza outbreaks than the CDC's data (which has a typical lag
of 1-2 weeks) and Google Flu Trends (which had trouble with the 2012-2013 flu
season and the 2009 H1N1 panic).

Graph was done with d3.js.
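
For anyone who wants to try something similar, here is a minimal sketch of the import step. It is not the script behind the graph. It assumes the Wikimedia Pageviews REST API as the data source (which, as far as I know, only covers mid-2015 onward), and the helper names `build_url`, `fetch`, `daily_views`, and `weekly_totals` are made up for illustration. Weekly totals are included only because flu surveillance is usually reported by week.

```python
import json
from collections import defaultdict
from datetime import datetime
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: the Wikimedia Pageviews REST API, not necessarily the
# source the original script used.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_url(article, start, end, project="en.wikipedia"):
    """Build a daily per-article pageviews URL; start/end are YYYYMMDD."""
    title = quote(article, safe="")
    return f"{API_ROOT}/{project}/all-access/all-agents/{title}/daily/{start}/{end}"


def fetch(url, user_agent="flu-tracker-sketch/0.1 (hypothetical contact)"):
    """Download the raw JSON text. Wikimedia asks clients to send a User-Agent."""
    with urlopen(Request(url, headers={"User-Agent": user_agent})) as resp:
        return resp.read().decode("utf-8")


def daily_views(payload):
    """Turn the JSON response into a date-sorted list of (date, views)."""
    items = json.loads(payload)["items"]
    rows = [
        (datetime.strptime(item["timestamp"][:8], "%Y%m%d").date(), item["views"])
        for item in items
    ]
    return sorted(rows)


def weekly_totals(rows):
    """Sum daily views into ISO (year, week) buckets."""
    totals = defaultdict(int)
    for day, views in rows:
        iso = day.isocalendar()
        totals[(iso[0], iso[1])] += views
    return dict(sorted(totals.items()))
```

Usage would look like `weekly_totals(daily_views(fetch(build_url("Influenza", "20160101", "20160131"))))`, with the result handed to whatever charting layer you prefer.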

------
Fomite
I was very recently at a conference where some folks presented their work on
using Wikipedia for disease tracking - it's considerably more nuanced than
just "better than CDC". Large variation by area, disease, etc.

Disease surveillance hasn't been well served in the past decade by breathless
attempts to replace existing systems with a "Big Data" solution - syndromic
surveillance, then Google Flu Trends, then Twitter, now this.

Combinations, people! Different data streams tell you different things.

~~~
lauradhamilton
Yep, great point. The Wikipedia data does seem to be more real-time than the
CDC data, which has a typical lag of 7-14 days. Obviously it has its
downsides, including lack of localization, basically zero clinical
information, no demographic info, etc.

Perhaps one day we will see a real-time API for anonymized influenza reports
from electronic medical record data? That could then be combined with
Wikipedia data and perhaps other social media data.

