You might remember this news from a while back. In order to improve the predictive ability of its movie recommendation system, Netflix cranked up a $1 million contest open to researchers and businesses: build us a better mousetrap and the money is yours.
Besides the money, Netflix offered up its subscriber data, which included subscribers’ viewing choices and ratings but didn’t include names. Netflix believed it had protected the identities of its subscribers this way. With no names, it would be impossible to identify any single person in a crowd the size of Netflix’s subscriber base.
That was the theory, anyway. Two computer scientists at the University of Texas at Austin, Arvind Narayanan and Vitaly Shmatikov, rolled out a paper in early 2008 showing that you could in fact identify individual subscribers from Netflix’s data. With privacy compromised, the Federal Trade Commission (FTC) stepped in with questions about subscriber privacy, and, with Netflix still planning to go ahead with a contest, subscribers filed a class-action lawsuit.
Netflix blinked, announcing that it would withdraw its contest to settle the lawsuit. Netflix says it will continue to refine how it handles recommendations, in cooperation with the research community, and will be more attentive to the privacy of its subscribers in the process.
There’s a bigger lesson to be learned here. We tend to be blasé about the data trail we leave while on the Internet. After all, our identities, we are told, are removed prior to any data sharing or mining. Turns out, no matter how big the crowd we are in, our individual identities aren’t necessarily protected. Something to think about.
Image Credit: Netflix