Wednesday, April 09, 2008

DP law and search engines

There is a truly remarkable amount happening right now on what one might very loosely call the "Web 2.0" privacy front. On top of the UK Byron report and the Ofcom report dealt with in the last two posts to this blog, we also now have the EC Article 29 working party opinion on data protection issues related to search engines.

Very roughly, this report takes the long-expected, but not uncontroversial (especially if you're Google) stance that IP addresses are (mostly) personal data. This follows the view taken previously by the Art 29 WP in its WP 136 that "… unless the Internet Service Provider is in a position to distinguish with absolute certainty that the data correspond to users that cannot be identified, it will have to treat all IP information as personal data, to be on the safe side". Basically, even dynamic IP addresses can be connected to particular users given the cooperation of log-keeping ISPs. As such, potentially all IP addresses must be viewed as "personal data".

It also argues that:

- the Data Retention Directive (2006/24/EC) does not apply to search engine providers. This is because Article 2 sub c of the Framework Directive (2002/21/EC), which contains some of the general definitions for the regulatory framework over "electronic communications services", explicitly excludes services providing or exercising editorial control over content. Notably, search engines filter out illegal content, provide safe search, and respect no-robots text tags on sites, all functions search engines should continue to exercise.

Search engine providers must thus delete or irreversibly anonymise personal data once the data no longer serve the specified and legitimate purpose for which they were collected, and must be able at all times to justify both their retention of data and the lifetime of the cookies they deploy. The DRD is NOT an excuse to retain data for longer (as Google have previously claimed). The WP recommended retention for no more than 6 months. Similarly, if search engine providers use cookies, their lifetime should be no longer than demonstrably necessary.

- the DPD does, however, clearly apply to search engines which deposit cookies on the machines of EU resident users, even if the search engine is based economically or physically outside the EU, e.g. in the USA. European data protection law also applies to search engines in specific situations, for example if they offer a caching service or specialise in building profiles of individuals based in the EU.

- on DP law, search engines generally fail to say exactly for what purposes they gather the personal data of users. If the data are used for purposes users might not reasonably have anticipated, e.g. building profiles of users for advertisers, the search industry may be breaking DP law.

The WP also considered the new so-called "people search engines" such as PIPL and Rapleaf, which draw on data from a wide range of sites, often including blogs and SNSs as well as the general Web, to form indexed profiles of individuals. Such profiling may both reveal unexpected data and throw up misleading correlations, and some of these sites have already drawn adverse comment. The WP emphasised that these sites "must have a legitimate ground for processing, such as consent, and meet all other requirements of the Data Protection Directive, such as the obligation to guarantee the quality of data and fairness of processing."

Pangloss is pleased to see this issue addressed: it provides a compulsory legal basis for what is emerging as good industry practice, namely (a) email the data subject whose profile is published and (b) allow them to remove, correct or make private the data published. Of course we still need to make sites not based in the EU take notice of EU law. Ultimately, what we desperately need is a technical fix, namely better multiple identity control. Roll on the research into distributed identity management.


inel said...

Thanks for covering this topic. I'm glad to hear the working party is looking into the 'people search engines' and that:

"The WP emphasised that these sites 'must have a legitimate ground for processing, such as consent, and meet all other requirements of the Data Protection Directive, such as the obligation to guarantee the quality of data and fairness of processing.'"

One problem is that Rapleaf approach data privacy from the opposite end of the pitch from individuals like myself, and Rapleaf's sequence of actions is in reverse order to reputable ways of doing business.

For example, Rapleaf start by collecting all data on me, without my consent. After collection, Rapleaf then allow me to opt out of their service: I presume this means they hold and continue to collect my data, but just do not have my permission to share it with strangers.

As a human being, rather than a company chasing a business opportunity, I approach data privacy from the opposite direction.

It is not right for any company to build, as Rapleaf does, a so-called 'reputation management' business based on its ability to systematically collect, index, sort, prioritise and package all data it deep-mines on me without my explicit permission upfront.

pangloss said...

I'm glad you agree as that's exactly how I feel too!
I have a particular bugbear right now with a certain "people" search engine which will only allow me to withdraw private information if I sign up for one of the 4 big webmail clients - which needless to say do not include UK googlemail, the only one of these I ever use. I *strongly* do not think I should have to set up a new spam honeypot account just to remove personal data already processed without my consent...