A fascinating process has been underway since around 2010: web browsers are improving user privacy, including in how online ads are displayed. The changes are ongoing, and I can say that in some of them I participated directly. Lastly, can we eventually say goodbye to the “cookie consent popup notices”?
One such interesting proposal is Privacy Sandbox, now deployed in Chrome, the web browser with the largest user base. It is meant to reconcile fire and water: the ability to display tailored ads without the user tracking based on heavy personal data processing that was common in the 2000s and 2010s. That is the crucial significance of this change. Notably, the Privacy Sandbox proposals include a measure to display tailored ads without resorting to invasive user tracking: the Protected Audience API (PAA).
Previously, I authored many privacy assessments of web technologies. This time I conducted an advanced data protection analysis of the Protected Audience API. That is the content of my LL.M. dissertation (at the University of Edinburgh) on the future evolution of the advertising technology ecosystem.
Data protection and competition aspects of the Protected Audience API
I assessed Privacy Sandbox’s Protected Audience API (PAA) through the lens of EU data protection law, the most stringent standard of its kind, considering the principles of data protection and Data Protection by Design and by Default. The overall conclusion is that it is possible to use this system in ways compliant with EU data protection law, even in ways where the GDPR consent prompt may not be needed: when no user personal data is processed.
While I approach this problem from the angle of data protection law, elements of competition law are also crucial (see below; I consider the full picture).
I imagine that this work may serve as a useful reference for researchers, experts, industry, lawyers, and regulators (data protection, competition), as well as bodies like the US Federal Trade Commission or the European Commission. The assessment is thorough, well informed, and conducted with awareness of the public and policy debates and the applicable regulations.
The rest of this note is based on my dissertation.
What’s the Protected Audience API (PAA)?
PAA enables the display of contextual or “curated” content on the web in privacy-advancing ways (that’s the goal). The content in question may be advertisements. How is it privacy-advancing (or enhancing)? The tailoring and display can happen in ways superior to today’s targeted advertising with respect to privacy, thanks to a special technical design.
PAA differs greatly from the targeting and ad-display mechanisms in use yesterday and today, which were, and are, based on pervasive user tracking: heavy user data processing, display, and reporting that reveal user interests and allow mining information such as the user’s gender, demographic traits, interests, and so on. In other words, they enable user profiling. In such systems, the heavy processing happens on the server side, invisible to the user and to external, open audits. How to change such a system? By moving the bulk of the processing closer to the user.
PAA works on the principle of processing data on the user’s terminal (or in “trusted” execution environments, but my dissertation focuses on on-device processing, if only due to a strict length limit). A high-level overview of how it works:
- The user browses a website with some PAA-enabled scripts.
- In response to what the user is browsing (a product, or a website about something specific, like white wines) or doing (adding or removing something from the shopping cart), scripts may execute calls that place the user in “interest groups” (for example “likes cats”, “eats cheesecake”, “purchased wine-Savoie”). This information never leaves the user’s device (i.e. the web browser). This is the main contrast with today’s ad tech.
- Some time later, when the user browses another website, scripts on that site may run an ad auction using input information, including the aforementioned “interest groups”. Some information may be supplied by the AdTech providers, but the crucial input is the interest group (e.g. “likes cheesecake”).
- The auction is executed on the user’s device. During the auction, the provided bidding logic is computed and evaluated, and the winner’s ad is to be displayed. All of this happens in an isolated frame: no information escapes this sealed computing environment, unlike today. The ads to be displayed were likely fetched some time earlier, also in ways unlinkable to the user; the same goes for the bidding algorithms.
- The ad is displayed, and after a while aggregated statistics about ad displays may be sent to the ad providers.
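The on-device flow above can be sketched with a toy simulation. Note this is plain JavaScript, not the actual browser API: in Chrome the corresponding entry points are `navigator.joinAdInterestGroup()` and `navigator.runAdAuction()`, and the buyer’s `generateBid()` and seller’s `scoreAd()` logic run in sandboxed worklets. All names, URLs, and bid values below are illustrative.

```javascript
// Toy simulation of the Protected Audience flow. Everything stays in
// local structures, mirroring how the browser keeps interest groups
// on-device. Not the real API; names and values are illustrative.

// Step 1: pages the user visits ask the "browser" to join interest groups.
const interestGroups = []; // lives only in the browser, never sent out
function joinAdInterestGroup(group) {
  interestGroups.push(group);
}

joinAdInterestGroup({
  owner: "https://cheesecake-ads.example",
  name: "eats-cheesecake",
  ads: [{ renderUrl: "https://cheesecake-ads.example/ad1", baseBid: 1.2 }],
});
joinAdInterestGroup({
  owner: "https://wine-ads.example",
  name: "purchased-wine-savoie",
  ads: [{ renderUrl: "https://wine-ads.example/ad7", baseBid: 0.8 }],
});

// Step 2: later, on a publisher page, an auction runs entirely on-device.
// Each buyer's bidding logic sees only its own group; the seller's
// scoring logic picks the winner.
function generateBid(group) {
  const ad = group.ads[0];
  return { bid: ad.baseBid, renderUrl: ad.renderUrl };
}
function scoreAd(bidResult) {
  return bidResult.bid; // trivial seller logic: highest bid wins
}
function runAdAuction() {
  let winner = null;
  for (const group of interestGroups) {
    const bidResult = generateBid(group);
    if (winner === null || scoreAd(bidResult) > scoreAd(winner)) {
      winner = bidResult;
    }
  }
  // Only the winning ad's render URL leaves the auction step;
  // the interest-group list itself is never revealed to any party.
  return winner === null ? null : winner.renderUrl;
}

console.log(runAdAuction()); // the cheesecake ad wins (bid 1.2 > 0.8)
```

In the real system, the isolation is enforced by the browser (the auction runs in a sealed environment and the group list is unreadable to scripts), which is precisely what the data protection analysis below relies on.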
I admit that this system is creative and very different from what we have today. It turns established concepts on their head and demands a change in the mental model of how web ads work. It seems to draw on, or is at least comparable to, previous academic research in privacy-preserving advertising. As such, it stands on the shoulders of giants, in this case, well privacy-validated models. There are some tradeoffs, still.
However, the main message of my analysis is that this system can be deployed to operate on non-personal data. In such a situation, the GDPR might not even apply. To be extra careful, let’s talk about the principles. The system meets principles like purpose limitation; data minimisation; accuracy; storage limitation; integrity and confidentiality; accountability; lawfulness, fairness, and transparency. What does it mean if no personal data is processed?
No need for consent?
This system could rely on two legal bases: consent, or legitimate interests. To unlock the use of legitimate interests (which does not require consent), a data protection balancing test must be performed, weighing the interests of the processor (like a website) against those of the user (their privacy). I perform such a test. The result: “PAA does not work on the principle of monitoring user activity: (1) the ad targeting happens on the device (or in isolation), (2) no personal data is acquired to be brokered subsequently, (3) there are no means for ‘further processing’.”
This is an important consideration because the balancing test must consider the impacts on the data subject, taking into account aspects like the power imbalance between the user and the controller. The details of processing might tip the balance in favour of the controller, especially if advanced techniques are deployed, such as: technical and organisational measures; extensive use of anonymisation techniques; aggregation of data; privacy-enhancing technologies. These are included in PAA’s design.
If the information were not disproportionately processed, or not processed at all, the impact on the data subject would be low, minimal, or even none. That looks impressive, even more so when compared with previous systems.
Data processing might not be happening at all
In this system, users need not be singled out. Consequently, the GDPR need not apply:
“The list of interest group data is not supposed to leave the user’s web browser. Accessing the list of interest groups that the user is a member of is impossible. No external party can access it (unlike with cookies, it is not possible to read this information). Since no party, such as a buyer, or advertiser, can access a full list of interest groups, the possibility of linking such information to identifiable persons may be unreasonable. This would be achieved through the use of technical design (i.e. of the web browser), along organisational ones (if only for the involved servers, when in use). The execution of auctions purely on the device may fulfil the test of non-identifiability by “any person”, making the anonymity guarantees strong. The on-device computation part seems to be particularly strong: the processed information cannot be used as an online identifier like cookies. As for the content display part, micro-targeting precautions lend credence to the conclusion that the aim of such design is not to enable the reaching of specific persons.
When such access is impossible technically, it would then be unreasonable to consider it as identifiable. Furthermore, in line with Breyer, any foreseen accesses could be made in ways segregated organisationally. Therefore, I conclude that the list of interest groups is not identifiable information in the sense of GDPR”.
Therefore, it may be possible to use this system in ways that do not process personal data. Or at least, it could be possible; it all depends on the deployment considerations. In this (actually, in any…) system, information (even if non-personal) is still processed.
Consent reloaded, ePrivacy
To be on the safe and complete side, I devote a lot of discussion to aspects of consent. The reason is that the ePrivacy Directive/Regulation may still call for consent. How come? It concerns “information”, not “personal data”.
“When accessing the user terminal, active consent is required, the rationale being the protection of the user “from interference with his or her private sphere, regardless of whether or not that interference involves personal data”. Even when no data is “accessed by advertising network providers when data subjects visit a partner website”, like in PAA, ePrivacy necessitates consent. The execution of operations causing the user browser to join or leave an interest group modifies information in the user terminal, and may suffice to meet the thresholds.”
Does this mean that PAA both does not need consent and needs consent at the same time? Yes, in a sense. However, the GDPR uses consent for other purposes.
As for the ePrivacy consent (which may be needed!)... let’s face it: it makes little sense to require it for privacy-friendly systems. This law (ePrivacy) therefore seems maladapted to privacy-preserving, privacy-enhancing technologies: it requires consent even where the offered level of privacy is very high. That is why the ePrivacy framework should be amended to unlock the full potential of privacy technologies, to the user’s benefit. This is not currently the case because the policy debate is deadlocked on requiring consent even when privacy standards are undeniably high. This is the big elephant in the room: the ePrivacy Directive (and the later Regulation proposal) is not adapted to today’s world. Provisions for privacy in electronic communications are needed, but they must be well crafted. This is currently not the case.
During the tests and the early deployment, some technical and organisational guarantees are relaxed. For example:
- The isolation in ad auctions is not as tight as it must be in the final product.
- Ad reporting is event-based, not aggregate (which could technically enable user monitoring or tracking until the transition to aggregate reporting).
- Some server infrastructure is provided by the bidders/buyers, which may also facilitate forms of user tracking until the system fully transitions to its final stage.
Such aspects introduce risks. However, my analysis concerns the assumed final state. In the meantime, analysts and deployers must navigate this maze. Good luck with that (and should you need help: reach out).
Some parts of the system (like the auction execution) may also be performed not on the device (in the user’s browser) but in isolated infrastructure (trusted execution environments), for “performance” reasons. If these measures are implemented well, the main conclusions of this work should still stand.
Competition aspects of technology have become important lately, and there are no technical standards for them. I raise this in my dissertation to a smaller degree. A separate, dedicated assessment is here.
It is important to highlight that the Protected Audience API has a superior privacy standing compared with third-party cookie tracking, but also compared with some other custom-made proposals, such as tracking based on deterministic identifiers (like e-mail addresses and similar personal data) or probabilistic ones (partly including fingerprinting). Placing the Protected Audience API “next to” those other, custom solutions is inappropriate, both from the point of view of data protection and likely also due to competition issues (such technical approaches differ greatly, and it is clear that the deployers treat user privacy differently…). Another “alternative” is to simply display contextual ads, or to block any tracking technically; when a web browser vendor has a significant market share, however, this too is subject to competition concerns (reminder: those analyses are grounded in law and regulation).
- The future of AdTech with generative AI may produce user-tailored, AI-generated ads, microtargeted to a particular user. New standards, including laws, will likely need to be developed, not to mention design and deployment considerations. The challenges are relevant right now. I’m looking forward to such work (want to engage me? Contact me at firstname.lastname@example.org)
- The ePrivacy Directive/Regulation must be amended. This law is today incompatible with privacy-advancing technologies: it does not motivate their development or deployment. On the contrary, it may even inhibit the development of privacy technologies, functioning as a legal disincentive to improve privacy. Adapting privacy regulations to incentivise technical improvement should be the desired change. So: “Among the takeaways of this dissertation is therefore the need to align the legal landscape to support approaches that handle user data with appropriate care, and specifically aim not to process personal data. Provisions balancing the consequences of ePrivacy Directive’s article 5(3), or exemptions — like in the case of the Directive, when cookies are strictly necessary — should be considered. The changes should not lower the protections.” Everyone knows that ePrivacy is outdated: the DPAs (EDPB), the EU Commission, everyone. However, perhaps they aren’t aware of just how outdated it is.
To quote my work: “EU Data Protection law with respect to web user monitoring is partly motivated by user tracking, a substantial problem of the 2000s and 2010s. This is evident due to applicable laws explicitly referencing cookies or similar approaches. The use of such capabilities is being reduced or phased out (in the optimistic scenario). As such, developing laws to account for the realities of technology is appropriate. The GDPR is fit for purpose, but the ePrivacy Directive should be adjusted to bring it in line with reality”.
And the last paragraph: “Considering the current mass spread of consent notices on websites in Europe, it is justified to ask if it is reasonable when solutions with improved data protection qualities are in place”.
It is possible to construct privacy-improving technologies that respect the law. My LL.M. dissertation considers technology, standardisation, data protection law, and competition law. That’s a lot of content! I managed to find the right balance to fit it all. The result is the first analysis of the Protected Audience API on scientific and legal grounds. I find it interesting how it relates to my initial Computer Science privacy work (PhD) on real-time bidding, and my later privacy analyses and work. That is to say, I know this system from various angles.
In the end, let me add that the system is fragile. To function, the design, implementation, and deployments must be carefully calibrated, with crucial protection responsibility placed on the web browser. After the final release, any proposed changes must be considered extremely carefully. This necessitates care.
Find the full content of my LL.M. dissertation here.
Lukasz Olejnik (email@example.com)