Followup on Big Data and Civil Rights

A few weeks ago, I wrote a post about Big Data and Civil Rights, which seems to have hit a nerve. We cross-posted it at O'Reilly Radar, and then folks like Boingboing picked it up.

I haven't had this kind of response to a post before. (Well, I've had responses, such as the comments on this piece for GigaOm five years ago, but they weren't nearly as thoughtful.)

Some of the best responses have really added to the conversation. Here's a list of those I suggest for further reading and discussion:

Nobody notices offers they don't get

On Oxford's Practical Ethics blog, Anders Sandberg argues that transparency and reciprocal knowledge about how data is being used will be essential. Anders captured the core of my concerns in a single paragraph, saying what I wanted to far better than I could:

...nobody notices offers they do not get. And if these absent opportunities start following certain social patterns (for example not offering them to certain races, genders or sexual preferences) they can have a deep civil rights effect.

To me, this is a key issue, and it responds eloquently to some of the comments on the original post. Harry Chamberlain commented,

However, what would you say to the criticism that you are seeing lions in the darkness? In other words, the risk of abuse certainly exists, but until we see a clear case of Big Data enabling and fueling discrimination, how do we know there is a real threat worth fighting?

I think that this is precisely the point: you can't see the lions in the darkness, because you're not aware of the ways in which you're being disadvantaged. If whites get an offer of 20% off but minorities don't, minorities end up paying 25% more than everyone who got the offer. That's effectively a price hike on minorities, but it's just marketing, so apparently it's okay.
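The discount arithmetic is easy to check. Here is a minimal sketch, assuming a hypothetical list price of 100 and a helper name (effective_premium) invented for illustration. Measured against list price, the group that gets the offer pays 20% less; measured against what that group actually pays, everyone else pays 25% more.

```python
def effective_premium(list_price: float, discount: float) -> float:
    """Fraction by which the full price exceeds the discounted price.

    list_price and discount are illustrative inputs, not figures from any
    real retailer.
    """
    discounted_price = list_price * (1 - discount)
    return (list_price - discounted_price) / discounted_price


# Hypothetical example: a 20% offer on an item listed at 100.
premium = effective_premium(100.0, 0.20)
print(f"People who never saw the offer pay {premium:.0%} more")
```

The same gap looks smaller or larger depending on which price you treat as the baseline, which is part of why it is easy to wave away as "just marketing."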

The problem with black box algorithms

Another issue is that many of the algorithms that make these decisions just look for correlation, without regard to causality. Orbitz was offering more expensive hotels to Mac users not because some merchandiser thought Apple fanboys had spare cash, but because their algorithm found a pattern.
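To make the "pattern without a reason" idea concrete, here is a toy sketch. It is not Orbitz's system: the booking data is made up, and the function names and the 170 threshold are assumptions chosen purely for illustration. The rule below never asks why one segment spends more; it only notices that it does, and acts on that.

```python
from collections import defaultdict
from statistics import mean

# Made-up booking log: (operating system, nightly rate paid).
BOOKINGS = [
    ("mac", 180), ("mac", 210), ("mac", 195),
    ("windows", 140), ("windows", 150), ("windows", 160),
]


def average_spend_by_segment(rows):
    """Group past bookings by segment and average what each segment paid."""
    rates_by_segment = defaultdict(list)
    for segment, rate in rows:
        rates_by_segment[segment].append(rate)
    return {segment: mean(rates) for segment, rates in rates_by_segment.items()}


def pick_hotel_tier(segment, averages, threshold=170):
    """Show pricier hotels to any segment whose past average spend is high.

    This is pure correlation: the rule has no notion of cause, and the
    segment could just as easily be a proxy for race, gender, or zip code.
    """
    return "premium" if averages[segment] >= threshold else "standard"


averages = average_spend_by_segment(BOOKINGS)
for segment in ("mac", "windows"):
    print(segment, pick_hotel_tier(segment, averages))
```

Nothing in the rule is malicious, and nobody wrote "charge Mac users more." That is exactly what makes the outcome hard to spot and hard to explain.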

Weirdly, Wall Street banks aren't allowed to use these kinds of "evolved" black box algorithms because they can't explain them to regulators, who have to be able to vet and qualify them. But private traders have no such restrictions. This is one of the reasons for economic volatility: more efficient private traders take advantage of the inefficiencies in the big banks' human-driven trading.

Context is everything

Mary Ludloff of Patternbuilders asks, "when does someone else's problem become ours?" Mary is a presenter at Strata, and an expert on digital privacy. She has a very pragmatic take on things. One point Mary makes is that all this analysis is about prediction—we're taking a ton of data and making a prediction about you.

The issue with data, particularly personal data, is this: context is everything. And if you are not able to personally question me, you are guessing the context.

If we (mistakenly) predict something, and act on it, we may have wronged someone. Mary makes clear that this is thoughtcrime—arresting someone because their behavior looked like that of a terrorist, or pedophile, or thief. Firing someone because their email patterns suggested they weren't going to make their sales quota. That's the injustice.

This is actually about negative rights.

Rights considered negative rights may include civil and political rights such as freedom of speech, private property, freedom from violent crime, freedom of worship, habeas corpus, a fair trial, freedom from slavery.

Most philosophers agree that negative rights outweigh positive ones. (For example, I have a right to fresh air more than you have a right to smoke around me.) So our negative right (to be left unaffected by your predictions) outweighs your positive one. As analytics comes closer and closer to predicting actual behavior, we need to remember the lesson of negative rights.

Big Data is the new printing press

Lori Witzel compares the advent of Big Data to the creation of the printing press, pointing out—somewhat optimistically—that once books were plentiful, it was hard to control the spread of information. She has a good point—we're looking at things from this side of the Big Data singularity.

And as the cost of Big Data and Big Data Analytics drops, I predict we’ll see a similar dispersion of technology, and similar destabilizations to societies where these technologies are deployed.

There's a chance that we'll democratize access to information so much that it'll be the corporations, not the consumers, that are forced to change.

While you slept last night

TIBCO's Chris Taylor, standing in for Kashmir Hill at Forbes, paints a dystopian picture of video-as-data, and just how much tracking we'll face in the future.

This makes laughable the idea of an implanted chip as the way to monitor a population. We’ve implanted that chip in our phones, and in video, and in nearly every way we interact with the world. Even paranoids are right sometimes.

I had a great, and wide-ranging, chat with Chris late last week, and we're sure to spend more time on this in the future.

The veil of ignorance

The idea for the original post came from a conversation I had with some civil rights activists in Atlanta a few months ago, who hadn't thought about the subject. They (or their parents) walked with MLK. But to them Big Data was "just tech." That bothered me, because unless we think of these issues in the context of society and philosophy, bad things will happen to good people.

Perhaps the best tool for thinking about these ethical issues is the Veil of Ignorance. It's a philosophical exercise for deciding social issues that goes like this:

  1. Imagine you don't know where you will be in the society you're creating. You could be a criminal, a monarch, a merchant, a pauper, an invalid.

  2. Now design the best society you can.

Simple, right? When we're looking at legislation for Big Data, this is a good place to start. We should set privacy, transparency, and use policies without knowing whether we're ruling or oppressed, straight or gay, rich or poor.