Old laws, new problems
At Strata in New York a couple of weeks ago, we had a lively discussion of data privacy with some members of the media. Tim O'Reilly and the New Yorker's Bob Mankoff were talking about who owns data and how we can control the flow of private information, a chat that would have made a great debate in its own right.
Part of the conversation revolved around the new laws we'll need to create to manage a world full of data. But I realized that we don't need new laws; rather, we need to apply existing ones in new ways. I think adapting existing laws is far more effective than drafting new ones.
(It should also be noted that I live in Canada, and my idea of government overreach may not be normal in the US or elsewhere.)
Clouds are like airplanes
Years ago, I gave a talk in Washington, DC that compared the regulation of cloud computing to an existing set of laws governing air travel. Clouds were like airline companies:
All countries probably had one, possibly government-backed
Most countries had other, specialized ones (an air force, NetJets, medevac helicopters)
Big, competitive economies probably had several (United, American, Southwest, JetBlue)
Some were domestic and some were international
There might be alliances for portability across countries (Star Alliance, etc.)
Airlines were regulated for safety and prevented from colluding, but free to run their businesses
Allowing things into and out of airplanes was done through a secure perimeter run by the country of origin
Transportation of goods and people required some multi-factor authentication
Exchanges (airports) existed to efficiently move large amounts of cargo and many passengers from one airline to another
There are plenty more parallels. A couple of years later, a US government official told me that this analogy had become one of the ways that Congress thought about cloud computing. Not because it was particularly good—but because there was an existing framework to hang things from.
So here are three existing legal concepts we should adapt to the world of data.
Eminent domain
Some data might be so valuable that it belongs in the public domain. This debate is already happening around patenting DNA, but it could as easily apply to geosensing data that shows who's polluting, or housing price information that might warn of a financial collapse.
We already have laws, known as eminent domain, that say when the government can acquire a house whether the owner likes it or not, for example to build a highway on the location. This is the legislative equivalent of "it's for the greater good." These laws could be applied to private data sets needed for undertakings of public policy.
Creative Commons and sidechains
When I upload a picture to Flickr, I can define how it may be used according to a Creative Commons license. I could say nobody can use it; or that it may only be used noncommercially; or that it requires attribution.
Data is the same. When Common Crawl publishes an open copy of the web's data, it also provides terms of use. Each time data is published, it needs an associated license explaining how it can be shared, commercialized, and modified.
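To make that concrete, here is a minimal sketch of a license record published next to a data set. Everything in it is hypothetical: the class names, the fields, and the example data set are illustrations only, not Creative Commons' or Common Crawl's actual terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLicense:
    """Hypothetical license terms covering the three questions above."""
    share: bool                  # may the data be redistributed at all?
    commercial_use: bool         # may it be used commercially?
    modify: bool                 # may derived versions be made?
    attribution_required: bool = False


@dataclass(frozen=True)
class PublishedDataset:
    """A data set paired with its license (illustrative, not a real API)."""
    name: str
    license: DataLicense


def permits(dataset, *, commercial=False, modified=False):
    """Return True if the intended use is allowed by the data set's license."""
    lic = dataset.license
    if not lic.share:
        return False
    if commercial and not lic.commercial_use:
        return False
    if modified and not lic.modify:
        return False
    return True


# An invented example: shareable and modifiable, but noncommercial only.
crawl = PublishedDataset(
    name="example-web-crawl",
    license=DataLicense(share=True, commercial_use=False, modify=True,
                        attribution_required=True),
)
print(permits(crawl, modified=True))
print(permits(crawl, commercial=True))
```

Note that the license here travels alongside the data set as a separate record; nothing in the data itself carries or enforces it.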
The problem, of course, is that content is not metadata. The thing being shared is separate from the terms of use—my picture isn't associated with the license. When we've tried to tie the content to the terms of use, we've created the horrible monstrosity of DRM, which has generally failed miserably, made things difficult, and stifled innovation, because we can't foresee how something will be used.
There may, however, be a happy medium on the horizon. Cryptocurrency, and more specifically sidechains that allow us to peg other things to bitcoin, might finally provide a scalable, decentralized way to intertwine terms of use with content without breaking creativity.
Algorithmic regulation
As laws get more complex and society gets more instrumented, we're seeing the rise of algorithms to enforce laws. A simple, almost trivial, example might be that of a traffic camera—the algorithm for sending a ticket is based on the observed speed of a car.
Reporting on data is often done by exception. That means that when something is within normal boundaries, nothing happens—but if it changes dramatically, someone is notified. A single use of data might not be significant; a sudden rise in data usage is probably something I want to know about.
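Reporting by exception is easy to sketch in code. The example below is one possible approach, under stated assumptions: the function name, the seven-reading window, the three-standard-deviation threshold, and the sample figures are all invented for illustration, and a real system would tune them.

```python
from statistics import mean, stdev


def exceptions(usage, window=7, threshold=3.0):
    """Return the indices of readings far above the recent norm.

    A reading is flagged when it exceeds the mean of the previous
    `window` readings by more than `threshold` standard deviations.
    """
    flagged = []
    for i in range(window, len(usage)):
        recent = usage[i - window:i]
        mu = mean(recent)
        # Assumption: floor the spread at 1.0 so that a very flat
        # baseline does not turn tiny wobbles into alerts.
        sigma = max(stdev(recent), 1.0)
        if usage[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged


# Invented daily counts of how often some data set was accessed.
daily_requests = [10, 12, 11, 9, 10, 11, 12, 10, 11, 95, 12]
print(exceptions(daily_requests))  # prints [9]
```

Ordinary days produce no output at all; only the sudden jump to 95 on day 9 is reported, which is the "notify someone" case.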
Getting regulators on board
There's little doubt that wide-ranging technology which touches all of us sometimes requires new legislation. But if we can build atop existing concepts that regulators already understand—using old laws in new ways rather than trying to create new ones from whole cloth—I suspect we'll be far more successful.