Friday, April 1, 2011

Ethical Dilemma

I encountered an ethical dilemma last year in my internship. I was working in Data Analytics, and the client (a national retailer) asked my team to analyze its shrinkage (theft) data from its many stores.

After analyzing the company's data in conjunction with US Census data, we came to a number of politically incorrect conclusions. Specifically, we concluded that the percentage of certain ethnic groups in the population of each store's zip code had a statistically significant relationship with the store's theft rate. That is, high percentages of certain ethnic groups were strongly correlated with high theft in the company's stores.

This conclusion presented an ethical dilemma because we had two competing values we wanted to uphold. The first value was accuracy; that is, presenting the information we found in a clear, actionable, accurate way. This value alone would lead us to present the data exactly as we found it by telling the company that stores in areas with large percentages of these ethnic groups were at risk of high theft. The second value was political correctness; that is, we didn't want to present the information in a way that could offend the company's board or bring accusations of discrimination or profiling against the company. This value alone would lead us to deliberately leave our politically incorrect findings out of our report completely.

Although my internship ended before this ethical dilemma was fully resolved, I proposed a solution that I thought balanced these two competing values. My solution was to present the data in another context. The percentages of all of these ethnic groups were strongly and positively correlated with population density (the higher the population density, the higher the percentages of these ethnic groups). Because the two variables were so well correlated, reporting only the population density finding, and leaving out the specific ethnic groups, captured most of the predictive power of the ethnic group percentages, which allowed us to uphold our value of accuracy. Also, since this report would not mention specific ethnic groups, acting on it could not reasonably be labelled as discrimination or profiling. Thus, this solution helped to preserve, as much as possible, both of our competing values.
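To make the statistical point concrete, here is a minimal sketch of why a strongly correlated stand-in variable keeps most of the predictive power of the variable it replaces. Everything in it is an assumption for illustration: the data is synthetic (it is not the client's data), and the names `density`, `omitted`, `theft`, `simulate`, and the noise levels are hypothetical. It relies on the fact that, for a single-predictor linear fit, R-squared equals the squared Pearson correlation.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=500, seed=0):
    """Generate synthetic store-level data (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    # Stand-in variable that would be reported (e.g. population density, rescaled to 0..1).
    density = [rng.random() for _ in range(n)]
    # Variable left out of the report; built to track density closely.
    omitted = [d + rng.gauss(0, 0.05) for d in density]
    # Outcome, driven by the omitted variable plus unrelated noise.
    theft = [2 * o + rng.gauss(0, 0.3) for o in omitted]
    return density, omitted, theft


density, omitted, theft = simulate()

# How closely the reported variable tracks the omitted one.
r_proxy = pearson(density, omitted)
# R-squared of a one-predictor linear fit is the squared correlation.
r2_omitted = pearson(omitted, theft) ** 2
r2_density = pearson(density, theft) ** 2

print(f"correlation(density, omitted) = {r_proxy:.3f}")
print(f"R^2 using omitted variable    = {r2_omitted:.3f}")
print(f"R^2 using density only        = {r2_density:.3f}")
```

In this sketch, the closer `r_proxy` is to 1, the smaller the gap between the two R-squared values; lowering the correlation (for example by raising the 0.05 noise term) widens it.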