A recent article in the New York Times raises the question: do algorithms discriminate? The question is legitimate, but the emphasis is wrong. Instead of thinking of data analytics as a problem, we need to welcome the new opportunities for improved decisionmaking that it enables. And we need to cooperate to identify and address any disparate impacts that data-driven decisionmaking might have.
Algorithms are computer programs that are used to help people make better decisions in business, government, education, and medicine. Using the vast amounts of data generated by online and offline activity and employing the latest methods of statistical analysis, these computer programs enable more complete and accurate assessments of people’s interests and background. A list of the good things that can be done with these improved data analytic techniques would fill books: detecting early warning signs of infection that can literally save lives, balancing electric utility load to make efficient use of our energy grid, preventing credit card fraud in real time, and identifying students at risk of failure so that they can get the help they need to succeed in school.
But there is no new and dangerous tendency for these data analytic techniques to be used for discriminatory purposes. There are anecdotes in which these computer programs are used in ways that appear to have disproportionate adverse impacts on groups protected on the basis of characteristics such as race, gender, religious belief, and ethnic origin. In these cases, however, the data and analytics do not use membership in these groups as a criterion to classify people or to make decisions about them. The disparate impacts are entirely accidental.
Nevertheless, it makes sense to ask the question: what should be done about disparate impacts derived from the use of data analytics if they do arise? The good news is that this issue is not new. For generations, civil rights laws have protected the dignity and equality of citizens by providing a way to assess disparate impacts. These laws establish that when there is a disparate impact, companies must be prepared to show that the decision criterion serves a legitimate business purpose, unrelated to a discriminatory intent. They must also be able to show that this criterion does the job better than other available criteria. This mode of assessment was just upheld by the Supreme Court in Texas Department of Housing and Community Affairs v. Inclusive Communities Project.
Applying this mode of assessment to the new techniques might take some thought, and shared efforts to work out how to do so are called for. But the general approach is clear.
It is crucial to emphasize that disparate impact results are no more likely to occur now than they were before the advent of big data analytics. Indeed, the greater precision of decision criteria derived from these techniques is likely to produce results that are fairer than the less accurate indicators of the past.
Still, it is appropriate to ask: what precautions should businesses take so that their use of data analytics does not have disparate impacts? How can these tools be made “fair by design”? Businesses already go to considerable lengths to ensure that their use of analytics is accurate and non-discriminatory. FICO, for instance, rigorously tests its credit scoring models to make sure that they meet fair lending standards. But it makes sense to have further conversations about what else can be done. Making progress in this area is a responsibility shared by business, public interest groups, civil rights organizations, academics, and government agencies. We need to engage in a respectful dialogue with each other and to search together for realistic, practical ways to move forward.