Big data is a term often met with one of two emotions: great suspicion or hyperbolic optimism. Neither sentiment captures the reality of what big data has to offer. Few other topics are debated with such frequency both inside and outside the Beltway. Numerous policy luncheons are convened with panelists ranging from devout data analytics worshipers to data atheists. As Stephen Pratt of Wired put it:
“We are at a crossroads in the fundamental role and use of these data. Are we creating an Orwellian dystopia or a Jetsonian utopia?”
This is the central question: what is the nature of big data? Privacy advocates, some policymakers, parents and others are of the Orwellian persuasion. To this group, the word ‘data’ sparks fears of privacy invasion, heightened government control, commercial manipulation, and discrimination.
Even the White House has joined the discussion, with the President tasking the Administration with a 90-day review that culminated in the report “Big Data: Seizing Opportunities, Preserving Values.” The review found that virtually all companies are good actors and that, of the numerous harms listed, all but two were hypothetical. The Center for Data Innovation’s Daniel Castro and Travis Korte offer a good analysis of those harms in their article “A Catalog of Every ‘Harm’ in the White House Big Data Report.”
In the Jetsonian camp are people like Chris Anderson, who exaggerates the power of data in his 2008 article, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete.” Anderson claims that with the advent of big data and the systems for processing it, there is no need to search for causal relationships between two correlated variables; “correlation is enough.”
The problem with Anderson’s position is that it stops halfway. It is true that big data, via statistical algorithms, illuminates millions of patterns that would be impossible to identify otherwise. And while finding those patterns or correlations is important, your stats teacher was right: correlation does not equal causation. Take, for example, the BuzzFeed post “The 10 Most Bizarre Correlations.” One of the less colorful relationships BuzzFeed found, although no less absurd, is the link between the decline in market share for Internet Explorer and the drop in the national murder rate. In Anderson’s world there is no room to question the validity of such correlations, because there is no need to understand what is actually going on behind statistically significant relationships.
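To make the point concrete, here is a minimal Python sketch of how two unrelated quantities can appear tightly linked simply because both happen to decline over the same period. The two series are invented for illustration (they are not real browser or crime figures), and the names `series_a`, `series_b`, `pearson` and `first_differences` are hypothetical. Comparing year-over-year changes instead of raw levels is one simple check among many, not a full causal analysis.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def first_differences(xs):
    """Year-over-year changes, which strip out a shared linear trend."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]

# ASSUMPTION: synthetic data. Each series is a downward trend plus its own
# independent noise, so neither one influences the other in any way.
rng = random.Random(0)
years = range(20)
series_a = [100 - 4.0 * t + rng.gauss(0, 3) for t in years]
series_b = [50 - 1.5 * t + rng.gauss(0, 2) for t in years]

# Correlation of the raw levels: dominated by the shared downward trend.
raw_r = pearson(series_a, series_b)

# Correlation of the yearly changes: the trend is gone, only noise remains.
diff_r = pearson(first_differences(series_a), first_differences(series_b))

print(f"correlation of levels:  {raw_r:.2f}")
print(f"correlation of changes: {diff_r:.2f}")
```

In a run of this sketch, the correlation of the raw levels should come out close to 1, while the correlation of the yearly changes should be much weaker. The strong "relationship" comes from nothing more than both series trending the same way over time, which is exactly the kind of pattern a purely correlational reading would take at face value.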
Clearly, big data has not made the scientific method obsolete; in many ways it reinforces its necessity. Sifting through millions of spurious correlations requires careful analysis grounded in an understanding of the specific facts of a scientific field, along with the construction of tested and validated models. The Clayton Christensen Institute for Disruptive Innovation, a nonprofit, nonpartisan think tank, illustrates why in a blog post titled “Big Data: The end of theory in healthcare?”
In reality, as with most debates, the truth lies somewhere in the middle, between Orwell and Jetson.
Data-driven innovation has the ability to capture, commingle, store, verify and analyze relevant data, and then integrate the results into established processes to derive innovative practical outcomes. But the power of big data lies not in the accumulated data points themselves. Data itself is nothing but a collection of information. Rather, the power of data-driven innovation lies in our hands. To truly utilize large quantities of data, two things are absolutely necessary:
- An adequate system for integrating, managing and analyzing the data.
- A data scientist with expertise in the field under study, who can ask the right questions and interpret the results.
Data analytics has the power to reveal what works, what is missing and what can be done better in a way that would not be possible otherwise. Big data, in and of itself, is not the solution to all of our problems. Data analytics is a tool that must be wielded by people, and when it is leveraged appropriately, much good can be accomplished.
Sabrina Eyob is the Public Policy Coordinator at SIIA. Follow the Policy team on Twitter @SIIAPolicy.