Data Ownership, 12-28
What does the term mean?
We own almost nothing that we produce on digital platforms, and the tech companies mean to keep it that way. I don’t own my Twitter. I don’t own my Google account. Google could destroy my account, burn years of my work, lock me out of my phone, and make it nearly impossible to do my job, and it could do all this for just about any reason or for no reason at all. There are very few laws protecting consumers from these practices.
Data ownership seems like a tricky concept to me.
Who owns my DNA? You can say that since it is my body, I must own it. But if you think of my DNA as represented in data, things get murky. I do not have the ability to represent my DNA as data—I need a lab to do that for me. I do not have the know-how to draw any inferences from this data representation—I need an expert to do that for me.
Maybe I have, or should have, rights to control the use of the data representation of my DNA. Perhaps those rights are what we mean by ownership. The right that I think that I want is the right to decide who gets to connect my DNA representation with my personal identity. If you want to build a database to correlate DNA data representations of many people with their disease profiles and other characteristics, while keeping my identity anonymous, you don’t need my permission. But if you want to do anything with my DNA data representation that involves describing it as Arnold Kling’s DNA, then you do need my permission.
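To make that distinction concrete, here is a minimal sketch in Python of a database that keeps the DNA records apart from the names and refuses to connect the two without permission. Everything in it (the `DnaRegistry` class, its method names, the marker and trait fields) is hypothetical and invented for illustration; it shows the shape of the rule, not how any real lab stores data.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class DnaRegistry:
    """Hypothetical store that separates DNA records from identities."""
    research: dict = field(default_factory=dict)    # pseudonym -> markers and traits
    identities: dict = field(default_factory=dict)  # pseudonym -> real name
    consented: set = field(default_factory=set)     # pseudonyms that may be named

    def enroll(self, name, markers, traits):
        """Store a record under a random pseudonym and return that pseudonym."""
        pid = secrets.token_hex(8)
        self.research[pid] = {"markers": markers, "traits": traits}
        self.identities[pid] = name
        return pid

    def grant_consent(self, pid):
        """Record that this person allows their name to be attached."""
        self.consented.add(pid)

    def research_view(self):
        """Records for correlation studies: no names, no permission needed."""
        return [dict(record) for record in self.research.values()]

    def identify(self, pid):
        """Connect a record to a name, but only with permission."""
        if pid not in self.consented:
            raise PermissionError("no consent to link this record to an identity")
        return self.identities[pid]
```

A researcher calling `research_view()` gets markers and traits with no names attached, while `identify()` raises `PermissionError` until `grant_consent()` has been called for that record. The sketch assumes that stripping the name is enough to keep a record anonymous, which is a strong assumption.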
Who owns the email in my Gmail account? If Google decides that it no longer wishes to store my email, either because it does not like me personally or because it wants to end the email storage service (the way it ended blogsearch), I don’t believe I have a contractual right to anything. My ownership of archived emails, many of which are very important to me (or I would not have had Google archive them), is totally dependent on Google being nice about it. Perhaps this is not a reasonable state of affairs, and instead email archiving ought to be a service that is provided contractually in ways that protect consumers more formally.
Who owns my data on Facebook? Recently, I heard someone argue that because Facebook uses its vast collection of data to better target advertising, Facebook should be paying individuals for providing that data. If I were Facebook, I would reply, “Oh yeah? What about your taking advantage of the entertainment that Facebook provides to you? How about paying for that?”
In the world of tangible goods, there are many instances in which people and businesses do not pay one another in an itemized way. Go have lunch at a food court in a mall. Do you get charged for napkins? Does the fast-food outlet pay you to return your tray and dump your trash? If you are not upset by those transactions, why should you get upset over Facebook not paying for your data or you not paying for your use of Facebook?
In the digital world, the low cost of creating and distributing intangible goods gives rise to more situations where there is not an intuitively obvious way to assign rights and obligations. And yet if you try to define “ownership,” that is what it comes down to: a set of rights and obligations.
I would speculate that it is futile to try to come up with a general set of rules that specify rights and obligations with respect to data. Instead, the specific applications and contexts will give rise to different bundles of rights and obligations. I doubt that we will find the term “data ownership” to be very useful. Instead, we will have to look at various transactions and try to identify the rights and obligations that we think are appropriate for each type of transaction.
As a database engineer, it's fairly intuitive to me what is and is not Joe Blog's data. But that just means I use the genitive case when describing the relation; it doesn't mean Joe has property rights. Joe's credit card number very much "belongs" to him informally, but the card company has more actual rights regarding it. Ownership is a bad model extrapolated out of linguistic usage.
But privacy matters, so anyone who has Joe's credit card number (or home address or what-not) should risk penalties for mishandling it. And the risk should make them not want to hold such data casually.
Email and social-posting archives, etc., are different, but again ownership of the data is not the point. These services are definitely contractual in character. Given users' reliance interest, governments (preferably courts) should declare that Gmail, Facebook, etc. are bound by implicit contracts. That would be a Hayekian articulation of the implicit understanding that grew up through practical interactions.
Keep in mind that 'anonymous data' is _very_ hard to create and maintain, and effectively impossible in many cases. As Gwern writes, "everything is correlated". See the saga of the "Netflix Prize" for a great example of this.
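Here is a toy sketch, in Python, of why stripping names is not enough. It joins an "anonymous" table to a public, named one on a few shared attributes and reports every record that matches exactly one person. The rows, field names, and the `reidentify` function are all made up for illustration. The Netflix Prize re-identification worked along broadly similar lines, matching ratings against public IMDb reviews, though with fuzzier matching than this.

```python
from collections import defaultdict


def reidentify(anonymous_rows, public_rows, keys):
    """Link anonymous ids to names when the shared attributes match uniquely."""
    anon_by_key = defaultdict(list)
    for row in anonymous_rows:
        anon_by_key[tuple(row[k] for k in keys)].append(row["id"])

    names_by_key = defaultdict(list)
    for row in public_rows:
        names_by_key[tuple(row[k] for k in keys)].append(row["name"])

    # A link is only trusted when exactly one anonymous record and exactly
    # one named person share the same combination of attributes.
    return {
        ids[0]: names_by_key[key][0]
        for key, ids in anon_by_key.items()
        if len(ids) == 1 and len(names_by_key.get(key, [])) == 1
    }


# Hypothetical data: names removed, but zip, birth year, and sex kept.
anonymous = [
    {"id": "u1", "zip": "20901", "birth_year": 1954, "sex": "M", "diagnosis": "A"},
    {"id": "u2", "zip": "20902", "birth_year": 1980, "sex": "F", "diagnosis": "B"},
    {"id": "u3", "zip": "20902", "birth_year": 1980, "sex": "F", "diagnosis": "C"},
    {"id": "u4", "zip": "20903", "birth_year": 1971, "sex": "M", "diagnosis": "D"},
]

# Hypothetical public list (say, a voter roll) that carries names.
public = [
    {"name": "Bob Example", "zip": "20901", "birth_year": 1954, "sex": "M"},
    {"name": "Alice Example", "zip": "20902", "birth_year": 1980, "sex": "F"},
]

links = reidentify(anonymous, public, keys=("zip", "birth_year", "sex"))
```

With this data, `links` names only `u1`: `u2` and `u3` are shielded because they share the same attributes, and `u4` has no counterpart in the public list. The more columns the "anonymous" table keeps, the fewer records get that kind of cover.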
It also seems weird to give people ownership of data that is _almost entirely_ the same as MANY other similar 'datasets'. I think some crimes have been solved not by having the alleged perpetrator's DNA, but by having the DNA of one of their relatives. Given the extreme 'entanglement'/correlation of DNA data, any 'ownership' scheme would likely be either useless or extremely baroque in its complexity (and thus probably useless too).