Data’s Big Impact on the World

In January, SIIA announced its commitment to Data-Driven Innovation as a top policy priority in 2013.  As part of this initiative, its white paper, “Data-Driven Innovation: A Guide for Policymakers: Understanding and Enabling the Economic and Social Value of Data,” will be released soon.

Now an article that reinforces the points made in the white paper has appeared in the May/June issue of Foreign Affairs magazine: “The Rise of Big Data: How It’s Changing the Way We Think About the World,” by Kenneth Cukier and Viktor Mayer-Schoenberger.  The most important point of both the SIIA white paper and the Foreign Affairs article is this:

New data collection and analytical techniques allow the use of massive amounts of data to help businesses and governments make everyday things work better.

The Foreign Affairs article describes three main changes in how we approach data.  First, we need to collect and use large amounts of data.  Second, we need to accept data that is messy or unorganized instead of using only data that is clean and organized.  Third, we need to place less emphasis on causation and look instead at correlation.  In short, we should ask “what?” and not “why?”

Some of the challenges to data-driven innovation stem from people applying the mindset that worked in the past, when we could not make use of large amounts of data.  One example from the article: for a long time, people tried unsuccessfully to make computers “learn how to do something.”  With more data and greater analytical capabilities, we instead give computers a massive amount of data and let them use it to estimate the probability of something happening.  In the past, analysts were usually limited to smaller amounts of data, so the inputs had to be precise and accurate.  Now individual data points do not have to be as precise or accurate, because the sheer volume of data can compensate for the gaps.

The case study in the article that perhaps best shows what big data can be used for is about Google and how it was able to use search records to track outbreaks of the flu.  In 2009 Google “took the 50 million most commonly searched terms between 2003 and 2008 and compared them against historical influenza data from the Centers for Disease Control and Prevention.”  By running all of this data through algorithms, Google came up with a list of 45 terms that had a strong correlation with the CDC’s data on the flu.  The biggest difference between the two approaches is that Google was concerned only with how those terms related to flu activity and what that meant, not with why people were getting sick, which is the question the CDC was asking.  By approaching the data in this way, Google was able to come up with an answer in close to real time instead of several weeks, which in the case of a pandemic is crucial to saving lives.
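To make the idea of asking “what?” rather than “why?” concrete, here is a minimal sketch of ranking search terms by how closely their weekly volume tracks reported flu cases.  It is not Google’s actual method, which the article does not spell out.  The function names, the search terms, and every number below are made up for illustration, and plain Pearson correlation is assumed as the measure of a “strong correlation.”

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no signal
    return cov / sqrt(var_x * var_y)


def top_correlated_terms(term_counts, reference, k):
    """Return the k (term, score) pairs that best track the reference series."""
    scores = {
        term: pearson(counts, reference)
        for term, counts in term_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical weekly figures, invented for this example only.
reported_cases = [10, 12, 30, 55, 40, 15]
weekly_searches = {
    "flu symptoms": [100, 130, 310, 540, 420, 160],
    "fever remedy": [80, 85, 150, 260, 210, 90],
    "movie times": [500, 480, 510, 495, 505, 490],
}

ranked = top_correlated_terms(weekly_searches, reported_cases, k=2)
for term, score in ranked:
    print(f"{term}: {score:.2f}")
```

Nothing in the sketch tries to explain why a term rises and falls with flu cases.  It only measures that it does, which is the shift from causation to correlation that the article describes.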

The article also highlights that one of the biggest potential concerns associated with big data is its ability to create “Big Brother.”  To be sure, there are some risks associated with data-driven innovation, but the article appropriately concludes that there is no such thing as bad data.  Rather, the potential of massive amounts of data to improve the way we live far outweighs the potential concerns.

Denys Emmert is the Public Policy intern at SIIA. He has a degree in marketing and political science from Florida State University.