In my post on the GDPR status of ads I wrote that I expected that
European data protection agencies would more likely than not rule
that collecting personal information for ad fraud detection requires
consent:

"Is it within the legitimate interests of sites to collect user data
for ad fraud detection? The ad industry has historically thought that
it was. For example, the IAB's TCFv2, the standard protocol consent
popups use to talk to ad networks, categorizes ad fraud detection
under 'Special Purpose 1', with users having 'No right-to-object to
processing under legitimate interests'. On the other hand, based on
points 52 and 53 of the recent Microsoft ruling I would predict that
French regulators would rule that since users do not visit sites to
see ads, sites cannot claim that they have a legitimate interest in
using personal data to attempt to determine whether their ads are
being viewed by real people.

"This is not settled; among other things the Microsoft ruling was
primarily considering ePrivacy which is stricter on some points. But I
think it's more likely than not that when we get clarity from the
regulators it will turn out that the kind of detailed tracking of user
behavior necessary for effective detection of ad fraud is not
considered to be within a publisher's legitimate interests."

There was some informed pushback on this, from Hugo Roy and Michael
Kleber: a privacy lawyer and a privacy engineer. This has definitely
pushed me in the direction of
thinking I've misunderstood the situation and it's more likely that
the conventional ad industry interpretation is correct. Which would
be a good thing in my book: as I said at the end of my post I do think
ad fraud detection should be permitted without user opt-in.

But I did want to write more about why I held, and to some extent
still hold, that view. As Hugo referenced, fraud detection is
specifically called out in the GDPR as a legitimate interest:

"The processing of personal data strictly necessary for the purposes
of preventing fraud also constitutes a legitimate interest of the data
controller concerned." (Recital 47)

The main way I could see a decision preventing the continued operation
of economically effective ad fraud detection is that a court might
rule that the status quo involves collecting too much. Ad fraud
detection looks something like collecting every signal you can and
then looking for patterns that distinguish people from bots. How do
you know if a signal will be useful? Start logging it and feed it
into the analysis. A company might have trouble convincing a court
that they really need all this data, especially when there's
collection they can't justify in terms of current utility. But if
collection is limited to what you can show is immediately useful, with
no way to learn from real traffic which new signals are worth logging,
then more and more bots will get around detection.

A secondary way I could see a decision like this happening is if a
data protection agency decided that, while a publisher has a
legitimate interest in preventing itself from being defrauded by the
user, it doesn't have a legitimate interest in (delegating) collecting
data to demonstrate that it is not defrauding its advertisers. Yes,
there is a sense in which the data collection is for the purpose of
preventing fraud, but it's essentially the publisher preventing
itself from committing fraud. A court could allow intruding on an
individual user's privacy to determine whether that user is defrauding
the publisher, but not to determine whether there's fraud happening
between the publisher and the advertiser, which the user has nothing
to do with.

This is speculation: can we use a prediction market to get a better
estimate? The one I made earlier on the GitHub Copilot litigation
seems to be going well so far, so here's another market:

Before the comments from Roy and Kleber I would have put this at ~65%;
now I'm at ~45%. But if you think I'm wrong, take my (play) money!