Wednesday, October 19, 2011

Scientific Uses for Twitter: Estimating Flu Outbreaks Using Tweets

Three researchers from Iowa released a study after the 2010 H1N1 outbreak demonstrating that they could statistically estimate the regional pathway of flu and H1N1 outbreaks using Twitter data. Much like Google Flu Trends, which uses search data to identify regions with flu activity, the researchers used keywords found in tweets to estimate where H1N1 would show up. They chose Twitter for two reasons: 1) they felt that tweets provided more context and sentiment than search data (such as "went to my doctor today with possible Flu symptoms"), and 2) tweets, when overlaid with geo-tracking, offered an opportunity to proactively estimate where the flu would hit next. Using machine-learning methods, they created mapping models that used keywords like "flu, h1n1, vaccine, shot, shortage" to follow disease outbreaks, and then compared this information to CDC findings. Using a similar methodology, they were then able to estimate on a regional level where the disease would appear, often 1-2 weeks ahead of CDC projections. The study was published in May 2011 and is available online. The chart below shows how their findings using Twitter ultimately matched the CDC's reported cases.

* Chart from: Alessio Signorini, Alberto Maria Segre, Philip M. Polgreen. The Use of Twitter to Track Levels of Disease Activity and Public Concern in the U.S. during the Influenza A H1N1 Pandemic (Via
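To make the keyword idea described above a little more concrete, here is a minimal Python sketch. It is not the researchers' model: it simply counts what share of tweets per week and region mention one of the keywords quoted in this post, and then measures how closely a series of those shares tracks a reference series with a plain Pearson correlation. The function names, the whole-word matching rule, and the sample tweets are all assumptions made up for illustration.

```python
import math
import re
from collections import defaultdict

# Keywords quoted in the post; matching on whole words is an assumption.
FLU_KEYWORDS = {"flu", "h1n1", "vaccine", "shot", "shortage"}


def mentions_flu(text):
    """Return True if the tweet contains any flu keyword as a whole word."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return bool(words & FLU_KEYWORDS)


def weekly_flu_share(tweets):
    """Map (week, region) to the fraction of tweets that mention a keyword.

    `tweets` is an iterable of (week, region, text) tuples.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for week, region, text in tweets:
        key = (week, region)
        totals[key] += 1
        if mentions_flu(text):
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up sample tweets, for illustration only (not data from the study).
sample_tweets = [
    (1, "Midwest", "I think I have the flu"),
    (1, "Midwest", "Nice weather today"),
    (2, "Midwest", "H1N1 vaccine shortage at the clinic"),
    (2, "Midwest", "Got my flu shot this morning"),
]
shares = weekly_flu_share(sample_tweets)
```

With real data, the weekly shares for a region could be lined up against that region's officially reported figures and passed to `pearson` to see how closely the two series move together. A raw keyword share is a much cruder signal than a trained model, so this only illustrates the general shape of the approach.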

As we move into flu season, it will be interesting to see how Twitter activity trends. Will it provide the same insights as in 2010, and will researchers be able to use it year over year? The H1N1 scare clearly set a high bar for Twitter volume; it also occurred at a time when Twitter was growing steadily (Twitter estimated 100 million new accounts in 2010) and people were jumping on the bandwagon. A year later, according to comScore, Twitter's growth hasn't slowed, so it's possible that we will see another year of Twitter disease tracking. According to these three researchers, other groups around the world are currently working on similar statistical analyses and projections for the 2011-2012 season.
