My Relationship with Metadata: It’s Complicated!
Ever since the Snowden revelations broke, there has been a lot of interest in metadata, with a lot of ink (or should that be bytes?) devoted to defining exactly what it is, where it can be gathered from, who is capable of gathering it (and how), and most importantly of all, whether it is even important enough to warrant all the discussion. Official statements of “We’re only collecting metadata” have attempted to downplay the significance and privacy implications of the metadata collection. Organizations like the EFF have tried to counter that with simple-to-understand examples that show how a conclusion could be drawn by having access to just the metadata and not the data (the content).
Debunking the Myth of “It’s Just Metadata”. With Data
And today, I read the easiest-to-understand account of just how much can be gleaned from metadata. A group of researchers were given access to “the same type of metadata that intelligence agencies would collect, including phone and email header information” for just one person, Ton Siedsma, for the period of just one week. They gathered this metadata by installing a data-collecting app on his phone. Here’s what they were able to do with it:
- They were able to create a detailed profile of Ton, complete with information about his job, lifestyle, relationships, interests and more. It’s terrifying how detailed it is. J. Edgar Hoover would have been proud.
- They were able to ascertain that Ton has “a good information position within (his employer)”, which is a significant piece of information from an intelligence gathering and hacking perspective.
- Given what they were able to determine of his interests, the researchers were able to figure out Ton’s password and gain access to his Twitter, Google and Amazon accounts, again a significant thing from an intelligence perspective.
As the article points out, the intelligence agencies have access to a lot more metadata (in volume and over time), and much more sophisticated ways to analyze said metadata. So you can see why all the privacy advocates are raising alarms about this.
All this, and we haven’t even touched on all the other organizations that are able to gather this metadata, and whose business models are dependent on selling data and user dossiers to advertisers and other data brokers.
And Yet, I Can Haz Metadata?
With all this, you’d think that I, with all the privacy-related advocacy that I do on Twitter, would hate metadata. But the fact is that it’s a complicated relationship. In looking at the future of security, I’ve talked recently about how we can have good security that does not negatively impact usability. But that model relies on doing more work in the background using environmental, transactional and behavioral information, aka metadata. Bob Blakley long ago talked about the move from Authentication to Recognition, which relies on continuous data gathering through different sensors to help in identifying the person or device interacting with the service. Most multi-factor authentication and risk-analysis services are already there, and going deeper.
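To make that dependence on metadata concrete, here is a minimal sketch of the kind of background risk check I have in mind. Everything in it is an assumption made up for illustration: the `LoginContext` fields, the weights, the thresholds and the `decide` outcomes do not come from any real product or from Bob’s work, and a real service would use far richer signals and models.

```python
from dataclasses import dataclass


@dataclass
class LoginContext:
    """Metadata observed around one login attempt (all fields are hypothetical)."""
    known_device: bool         # environmental: device seen before for this user
    usual_country: bool        # environmental: request comes from a usual location
    hour_of_day: int           # behavioral: local hour of the attempt (0-23)
    typical_hours: range       # behavioral: hours when this user normally logs in
    transaction_amount: float  # transactional: value of the requested action
    typical_max_amount: float  # transactional: largest amount in the user's history


def risk_score(ctx: LoginContext) -> int:
    """Sum hand-picked, illustrative weights for every signal that looks unusual."""
    score = 0
    if not ctx.known_device:
        score += 40
    if not ctx.usual_country:
        score += 30
    if ctx.hour_of_day not in ctx.typical_hours:
        score += 10
    if ctx.transaction_amount > ctx.typical_max_amount:
        score += 20
    return score


def decide(score: int) -> str:
    """Map a score to an action using assumed thresholds."""
    if score >= 60:
        return "deny"
    if score >= 30:
        return "step_up"  # e.g. ask for a second factor
    return "allow"


# A familiar situation sails through without bothering the user.
familiar = LoginContext(True, True, 10, range(8, 19), 25.0, 200.0)
print(decide(risk_score(familiar)))

# The same request from an unknown device triggers a step-up challenge.
new_device = LoginContext(False, True, 10, range(8, 19), 25.0, 200.0)
print(decide(risk_score(new_device)))
```

Notice that every input to `risk_score` is metadata about the user. The usability win in the familiar case only exists because the service has been quietly collecting device, location, timing and transaction history all along.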
All of this means that the security frameworks enterprises rely on will need to be able to gather and have access to all this metadata. This was much easier in the days of employer-issued laptops and phones. BYOD and IoT completely change the landscape by creating new concerns regarding the what, when and how of metadata gathering by enterprises. Commercial entities also have the need to make their offerings more secure, which is to the benefit of their customers. But how does that mesh with the resulting need to gather metadata about those customers, a need that would ordinarily get a viscerally negative reaction if disclosed? The individual me is constantly having vigorous debates on this topic with the security practitioner me, leading to many amused (and some alarmed) glances from my fellow subway riders. At my core, I’m driven by the belief that we can find a way to balance the metadata gathering necessary to support the security models we’re advocating while giving individuals the necessary controls to manage and preserve their privacy in an informed way.
One thing is clear. Because one person’s metadata is another person’s data, enterprises need to start dealing with the collection, disclosure, usage and protection requirements of this PII (yes, I just classified Metadata as PII. Let the flame wars begin). As do laws. And engineers. It is likely going to get hidden inside those interminable ToS documents nobody ever reads. And employment contracts.
It’s going to be interesting for a while. And complicated.