How to regulate an algorithm

Should an AI have to explain itself? Or will peering inside the Black Box of an algorithm undermine it?

Algorithms are already a part of our everyday lives. They set our credit scores, our insurance premiums, and our welfare payments. They influence which of our friends’ lives we hear about, and even what facts inform our understanding of society. For some, mathematical functions built into our watches, scales, and dieting apps even tell us when to sleep, what to eat, or how best to exercise.

It’s only going to increase. The US economic recovery has been lopsided, with many of yesterday’s repetitive, manufacturing-centric jobs being replaced by software. When citizens can use an app, why would they stand in line at the DMV? When customers can bank online, why visit a branch? Every one of these transactions will rely on an algorithm, rather than a human with a job description.

A new kind of auditor

So who will ensure these algorithms are working as intended? I imagine a new kind of auditor, tasked with checking that algorithms are functioning properly. They’ll ask questions like:

  • Is the algorithm having the desired effects on health, or fiscal policy, or risk? Pokemon Go significantly increased the number of steps walked each day by its players; if the goal of the app had been to get citizens moving, a regulator would be tweaking game mechanics to maximize healthy outcomes.

  • Is the algorithm violating laws, or exhibiting immoral behaviour? Orbitz’ algorithms suggested higher-priced offerings to visitors using Mac products, having noticed that this attribute was a predictor of spending more on the site.

  • Is the algorithm building up a dangerous amount of risk by hiding complexity? The 2008 banking collapse came from bundling collateralized debt, hiding risk inside Byzantine financial instruments and exposing investors to a cascade failure of home loan repayment.

  • Is the algorithm over-fitting to something that doesn’t work well in practice? The US Military wanted to train a computer vision algorithm to recognize tanks, so it showed the code hundreds of overhead pictures with and without tanks. The algorithm failed to work in real-world tests; further investigation showed that the photos of tanks were taken on a cloudy day, and those without tanks taken on a sunny one. The algorithm had learned to distinguish sunny from cloudy weather, not the presence or lack of tanks.
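
The tank story can be reproduced in miniature. The sketch below is a toy, not the military's system: the two-number "photos", the helper names, and the one-rule learner are all assumptions made up for illustration. Every training photo with a tank is cloudy and every one without is sunny, so the learner latches onto brightness, then does no better than a coin flip once weather and tanks are decoupled.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def make_photo(has_tank, cloudy):
    """Toy 'photo' reduced to two numbers: (brightness, tank_signal)."""
    brightness = random.gauss(0.3 if cloudy else 0.8, 0.05)
    # The real tank signal is deliberately weak and noisy.
    tank_signal = random.gauss(0.6 if has_tank else 0.4, 0.2)
    return (brightness, tank_signal)

def predict(model, photo):
    """Apply a one-rule model: (feature index, threshold, direction)."""
    feature, threshold, sign = model
    return sign * (photo[feature] - threshold) >= 0

def accuracy(model, photos, labels):
    hits = sum(predict(model, p) == lb for p, lb in zip(photos, labels))
    return hits / len(labels)

def fit_stump(photos, labels):
    """Try every single-feature rule and keep the best one on the training set."""
    best_model, best_acc = None, -1.0
    for feature in range(2):
        for threshold in sorted(p[feature] for p in photos):
            for sign in (1, -1):
                acc = accuracy((feature, threshold, sign), photos, labels)
                if acc > best_acc:
                    best_model, best_acc = (feature, threshold, sign), acc
    return best_model

# Training set: tanks only on cloudy days, no tanks only on sunny days.
train_labels = [True] * 100 + [False] * 100
train_photos = [make_photo(has_tank, cloudy=has_tank) for has_tank in train_labels]

# Field test: weather and tanks are independent (50 photos per combination).
test_labels, test_photos = [], []
for has_tank in (True, False):
    for cloudy in (True, False):
        for _ in range(50):
            test_labels.append(has_tank)
            test_photos.append(make_photo(has_tank, cloudy))

model = fit_stump(train_photos, train_labels)
train_acc = accuracy(model, train_photos, train_labels)
test_acc = accuracy(model, test_photos, test_labels)
print("feature chosen:", ["brightness", "tank_signal"][model[0]])
print("train accuracy:", train_acc, "field accuracy:", test_acc)
```

An auditor who only saw the training accuracy would sign off on this model. The gap between the two numbers is what gives the problem away.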

Most auditors today don’t spend their time in this sort of forensic deep-dive policy-making. Today’s auditor tries to verify that a process is complying with regulations and accepted practices. To do this, the auditor looks at the steps being used to manage that process — accounting, invoicing, payments, and so on — and samples enough data going through the system to ensure that the data is accurate and the process works as described.
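
That sampling step can be sketched in a few lines. This is a minimal illustration under assumed names: the invoice records, their fields, and the `audit_sample` helper are hypothetical, and amounts are kept in whole cents to avoid rounding noise.

```python
import random

def audit_sample(invoices, sample_size, rng):
    """Recompute totals for a random sample; return the ids that do not match."""
    sample = rng.sample(invoices, min(sample_size, len(invoices)))
    return sorted(
        inv["id"]
        for inv in sample
        if sum(inv["line_items_cents"]) != inv["recorded_total_cents"]
    )

# Twenty hypothetical invoices, all consistent...
invoices = [
    {"id": i, "line_items_cents": [1000, 250], "recorded_total_cents": 1250}
    for i in range(1, 21)
]
# ...except one deliberately wrong record (id 7).
invoices[6]["recorded_total_cents"] = 1520

full_check = audit_sample(invoices, len(invoices), random.Random(1))
spot_check = audit_sample(invoices, 5, random.Random(1))
print("full check found:", full_check)
print("spot check found:", spot_check)
```

Checking every record always finds the bad invoice. A spot check of five may or may not, which is why the sample has to be large enough for the process being audited.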

Looking inside the black box

We’re moving towards an even more algorithm-driven future, and with that move, a new role for auditors. Now, they need to peer inside the “black box” of an algorithm to understand how it works. When done properly, this gives us good controls: The Nevada Gaming Commission has the right to inspect the math within a gambling machine. When the black box remains closed, or actively tries to hide its behavior, it leads to bad outcomes: Volkswagen intentionally misled emissions regulators by having its cars “cheat” when being tested.

There’s good precedent for requiring transparency, so this might, at first blush, seem like the best policy for regulating an AI. It’s certainly what many people are calling for when regulating things like algorithmic news feeds—because small changes can have big, wide-ranging impacts.

But transparency has some downsides, too. Logging undermines algorithmic competitiveness, and algorithms are often complex and tangled up in customer data and trade secrets.

A watched algorithm is a constrained algorithm

In the banking industry, public investment houses must let the Securities and Exchange Commission review their algorithms. In return, they’re allowed to invest customers’ money. Private funds that rely on algorithmic “black boxes,” on the other hand, outperform these public banks. They’re able to seek alpha where banks can’t, evolving their own, often inexplicable, models. They’re opaque, and they do better. You just don’t hear much about them because their designers, operators, and investors have no need for publicity.

In other words, logging and transparency make the algorithms easier to regulate, and less competitive.

An algorithm is a system

The black box is seldom a few lines of code. Often, it’s a complex, distributed system that will be hard to examine, and examining it may even violate privacy or intellectual property.

Take Tesla Motors’ self-driving software, which the company pushed to drivers in a software update. Every moment a Tesla drives, it collects data that informs all other Teslas, a hive mind of automated driving that gets better each day. But can regulators understand the decisions those cars make? At what point does the public good mandate the inspection of those algorithms, overriding Tesla’s right to proprietary technology?

Whether someone qualifies for welfare, or gets into a particular school, or pays higher taxes, or receives an experimental medical treatment, might once have been decided by a simple mathematical formula. But many of these formulas are being replaced by something new: machine learning.

Machine learning is code that looks for hidden patterns in data and learns to classify and group according to rules it gleans from within the data. It creates algorithms, but those algorithms are often impossibly complex, and opaque to a human. The code that identifies a face in a picture isn’t written by a human. Rather, it’s generated by a computer that has been trained on thousands of examples.

This kind of machine-generated algorithm can have its problems. Without the right guidance, it can become a racist, sexist bigot. Using the wrong corpus to train it can inadvertently make it a misogynist. Letting it set service areas based on current demand may break the law, under-serving certain minority groups because it isn’t following constitutional laws or social policies. Machine learning needs good parenting.

And that means we have some important decisions to make as a society.

How should we govern AI?

As we make algorithms that can improve themselves (stumbling first steps on the road to artificial intelligence), how should we regulate them? Should we require them to tell us their every step, effectively owning up to the fact that Mac owners are willing to pay more? Or should we let the algorithms run unfettered?

Nara Logics’ Jana Eggers—one of our speakers at Pandemon.io this February—suggests that a good approach is to have algorithms explain themselves. After all, humans are terrible at tracking their actions, but software has no choice but to do so. Each time a machine learning algorithm generates a conclusion, it should explain why it did so. Then auditors and regulators can query the justifications to see if they’re allowed.

On the surface, this seems like a good idea: Just turn on logging, and you’ll have a detailed record of why an algorithm chose a particular course of action, or classified something a certain way.
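
A self-explaining algorithm might look something like the sketch below. It is an illustration under stated assumptions, not Eggers’ or Orbitz’s actual method: the weights, the feature names, and the `recommend_tier` and `audit` helpers are invented, and a hand-written linear score is far easier to explain than a learned model would be.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    outcome: str
    reasons: list = field(default_factory=list)  # (feature, contribution) pairs

# Hypothetical weights for a toy pricing model; not taken from any real system.
WEIGHTS = {"past_spend": 0.5, "uses_mac": 0.3, "is_weekend": 0.1}

def recommend_tier(visitor):
    """Score a visitor and record which features drove the score."""
    contributions = [
        (name, weight * visitor.get(name, 0.0)) for name, weight in WEIGHTS.items()
    ]
    score = sum(value for _, value in contributions)
    outcome = "premium" if score >= 0.5 else "standard"
    # Keep only the features that actually contributed, largest first.
    reasons = sorted((c for c in contributions if c[1] != 0), key=lambda c: -c[1])
    return Decision(outcome, reasons)

def audit(decisions, prohibited):
    """Return the decisions whose stated reasons include a prohibited feature."""
    return [d for d in decisions if any(name in prohibited for name, _ in d.reasons)]

visitors = [
    {"past_spend": 0.6, "uses_mac": 1.0},
    {"past_spend": 1.0},
    {"past_spend": 0.2, "is_weekend": 1.0},
]
decisions = [recommend_tier(v) for v in visitors]
flagged = audit(decisions, prohibited={"uses_mac"})

for d in flagged:
    print(d.outcome, d.reasons)
```

The auditor never reads the pricing code here. They only query the recorded reasons, which is the appeal of the approach, and it works only as long as the reasons the algorithm reports are the ones it actually used.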

A recent conversation I had with a highly placed White House technology staffer suggested that there’s another side to the argument. And from the discussion we had, it’s clear that the US government has thought long and hard about this question: Do we cripple our AI by forcing it to explain itself?

A properly harnessed AI could be a massive advantage to a country. Depending on how much you buy into the idea of the singularity, a country that can harness the AI to tasks that favour its national objectives could gain efficient resource allocation, perfect military strategy, effective cyber-defenses, seductive public policy, and plenty more advantages (both wonderful and dystopian).

Imagine that two competing countries are each rushing to build the first nation-state level artificial intelligence. Assume that this isn’t a truly sentient computer, but rather a specially-focused AI designed to outsmart an enemy army. Whoever gets there first will win future fights, because such a set of algorithms will immediately begin improving itself and learn from every battle faster than humans can.

As any software engineer knows, turning on logging slows things down. When you run code in debug mode so that you can understand how it’s working and where it’s broken, you consume more computing resources. You make the software less powerful. And as a result, in our battle between two nations, the one that makes the AI explain itself loses the race.
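
The overhead is easy to measure on a small scale. The sketch below is only an illustration: `running_total` is a made-up workload, and the timings it prints will differ from machine to machine. What it shows is that the traced version does strictly more work per step to reach the same answer.

```python
import timeit

def running_total(values, trace=None):
    """Add up values; if a trace list is given, explain every step."""
    total = 0
    for i, v in enumerate(values):
        total += v
        if trace is not None:
            # Building and storing each explanation is the extra cost.
            trace.append(f"step {i}: added {v}, total is now {total}")
    return total

values = list(range(10_000))

quiet_result = running_total(values)
trace = []
traced_result = running_total(values, trace)

quiet_seconds = timeit.timeit(lambda: running_total(values), number=50)
traced_seconds = timeit.timeit(lambda: running_total(values, []), number=50)

print(f"same answer: {quiet_result == traced_result}")
print(f"quiet:  {quiet_seconds:.4f}s")
print(f"traced: {traced_seconds:.4f}s, with {len(trace)} explanations per run")
```

Both runs produce the same total. The difference is the time and memory spent on ten thousand explanations that nobody may ever read.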

There’s a tension between transparent regulation of the algorithms that rule our futures (having them explain themselves to us so we can guide and hone them) and the speed with which an unfettered algorithm can evolve, adapt, and outpace its rivals.

Is he who hesitates to unleash an AI without guidance lost?

There’s no simple answer here. It’s more like parenting than computer science: Giving your kid some freedom, and a fundamental moral framework, and then randomly checking in to see that the kid isn’t a jerk. But simply asking companies to share their algorithms won’t give us the controls and changes we’re hoping to see.