Do you “own” personal data?

I’ve been meaning to re-write this as a blog post for ages, and a Twitter exchange with the excellent @mediamocracy has finally nudged me into doing so.

Incidentally, I say “re-write” because, despite rumours to the contrary, the Internet is not as indelible as people might have you believe. I used to have a blog at, but all you will get there now is a 404 from Oracle. I don’t know what line of reasoning they followed to delete some Sun blogs while leaving many others up, but there you go.

“Ah, but what about the Wayback Machine?”, I hear you say… What indeed? The thing about Wayback is that it only captures the page pointed to by a URL at the time of its crawl. Sun’s blogs, like many, worked as a push-down stack, so any posts that got pushed off the bottom of the page between one Wayback crawl and the next were not captured. In short, Wayback will replay some of my posts for anyone wanting to dig into them, but not the one about data “ownership”.

That being the case, I’ll base this post on a related comment I made later, on an IETF thread about privacy and geo-location.

In brief, my underlying argument is this:

You’ve probably all seen privacy threads where an aggrieved data subject says “All I want is to be given back *my* data”… The implicit assumption is that, in some way, I ‘own’ my [sic] personal data. Unfortunately, not far down the line that leads to all kinds of unwanted consequences, and therefore we’re better off not starting out with a model based on concepts of ‘ownership’ if at all possible.

For instance, as Bob Blakley pithily put it, “You can’t control the stories other people tell about you”. There’s lots of personal data about you over which you have no control, let alone ‘ownership’, because it’s generated by other people. The only time you get control over it is, for instance, if the information is libellous. Even then, you don’t get ‘ownership’ of the data, but you get the opportunity to exercise certain rights pertaining to it. [The Google “de-indexing” ruling of 2014 is a classic example of this principle.]

Similarly, a model based on a concept of ‘ownership’ doesn’t work well for informational resources that can be ‘stolen’ from you, yet still leave you in possession of the data. Think of copyrighted digital media… you own the CD of Beethoven’s 5th, but there are rights to do with the original work (or the performance) that you don’t enjoy.

Legally – at least in the UK and US, and I believe elsewhere, too – there are distinctions between the treatment of “personal property” (or personalty) and “real property” (or realty). My own belief is that we’re better off treating personal data as if it were realty than as if it were personalty. This is especially true of the legal remedies when something is stolen from you: what has to happen in the case of realty offers a better model than the legal remedies for theft of personalty.

I know this is a rather terse and dense statement of the issue – there are doubtless points here that could be unpacked in far greater detail – but suffice to say, I think an approach based on assumptions of ‘rights’ over data has fewer problems than one based on assumptions of ‘ownership’. Think of personal data such as location/tracking/behavioural data: it makes little sense to claim that I ‘own’ the data collected about my path through a shopping mall, but it makes a lot of sense to claim that I have certain rights relating to it.

[Update]: since I initially wrote this, I have actually tended to take a tougher line still. In my view, not only do I have rights relating to data about me; I also, I believe, have rights relating to data that affect me. I sometimes express this as “PII should be re-defined from ‘Personally Identifiable Information’ to ‘Privacy-Impacting Information’”. This might reflect more accurately the reality of today’s personal data ecosystem, in which you are affected not only by personally identifiable data, but also:

  • by inferences drawn from that data
  • by personally identifiable data about other people thought to be similar to you
  • by aggregations of metadata
  • by aggregated data about the behaviour of others.

In short, trying to protect your own privacy and self-determination by focussing solely on data over which you think you have “ownership” is likely to prove ineffective, and will fail to address a significant proportion of the real privacy risk.