W3C

– DRAFT –
Improving Web Advertising BG

01 February 2022

Attendees

Present
alextcone, AramZS, aschlosser, blassey, bmay, cbstarr, cpn, dinesh-pubmatic, dmarti, FredBastello, GarrettJohnson, jdelhommeau, jeff_burkett_Gannett, jrosewell, Karen, kleber, kris_chapman, l_pilot, lionel_basdevant, mjv, nics, npd, wbaker, weiler
Regrets
-
Chair
Wendy Seltzer
Scribe
Karen, Karen Myers

Meeting minutes

Introductions and Agenda Curation

Wendy: We'll give people moment more to join us and then get started

<l_pilot> present+

Wendy: Our agenda today has introductions and agenda curation
… Federated Credential Management and Federated Identity CG
… Topics API
… sounds like a pretty full agenda

Aram: I had two agenda plus items and sent in email as well

Wendy: Thank you

<jrosewell> Question regarding process.

Wendy: Great to get new agenda requests
… Let's start with introductions; anyone new to the call who would like to introduce

Punhon Chan: Mediavine

Jen Schutte: I'm with P&G and filling in for Ryan on paternity leave

Sam Shapiro Kline introduces

Beri Lee: I'm from Google on the Product side

Matt Lieu: Hi, I'm with Washington Post engineering team

Wendy: terrific to see so many folks joining us

<kleber> Woo hoo, in-person conference!!

Wendy: [apologies for poor Internet connection, at an in-person conference]

James: I have a couple of questions
… Are we covered by the W3C's anti-trust guidelines?

Wendy: yes, they apply to all W3C meetings
… participants conduct themselves in compliance with the law

<AramZS> I'm jealous!

James: thank you for clarifying
… looking at the agenda
… The API question raises competition questions

<npd> thanks Wendy for chairing from outside!

Wendy: I will ask you to stop raising legal points; if you have technical issues, those are welcome

James: it is seeking to have a discussion about market

<AramZS> (To be clear I don't work for Google and I have an interest in talking about the Topics API)

James: and the anti-trust guidelines prevent that
… I think it should be taken out of the agenda until it is safe

Wendy: The proposal refers to technical means of information gathering
… it is a proposal asking whether participants in the ecosystem would want to adopt this
… it is not referring to individual players in the market
… I think that at this stage...

James: I would place web browsers to make decisions about data processing and restrict the data to the rest of the ecosystem

Wendy: And that would be based on their technical positioning in the ecosystem
… but not on who they are as a company
… I am going to ask you to take this conversation elsewhere
… It is disruptive to our proceedings

James: We do have to seek legal counsel
… so my only option is to noisily withdraw from the meeting
… I will remain to listen

Wendy: Thank you and I remind and encourage everyone
… as we have been in our meetings
… that we are focused on the technical discussions, business needs and not making product or market plans

Karen: no one on the queue

Federated Credential Management and the FedID CG

Wendy: We want to hear more on Federated Credential Management

Kris: Thanks, Wendy. I'll share my screen

Kris: I'm Kris Chapman, I work on privacy in Salesforce Marketing Cloud

Kris: and I'm a co-chair of the Privacy CG
… Tim Cappalli at MS as well
… I'd like to give an overview of identity and talk about the Federated Credentials Management API

Kris: Identity Federation, when we talk about federated ID, a lot of terms pop up
… you can hear OAuth2, SAML,
… authentication v authorization, all sorts of terms
… at its base, it's the linking of a user's digital id across different servers
… single sign on is a flavor of federated Identity
… we mean the services or sites being accessed are owned by different companies providing it
… security domains, or systems
… single sign on being owned by same org
… why are we talking about it here and why does it matter for Web Adv?

<blassey> https://docs.google.com/presentation/d/1Qn27aoqNsjPHsZ6_SMUcWSP8YIdmGH2KLdIcb_9g6kI/edit?usp=sharing

Kris: advertising is backbone and federated ID is user backbone

<blassey> Hopefully that link works for everyone

Kris: users don't have to id themselves and log into everything
… the overlap is that they both use the same web primitives
… use cookies, re-directs, etc., same tech
… when we make privacy changes for adv we are going to impact federated id as well

[try sharing now]

[slides appear]

Kris: I invited Sam and Beri to help
… and correct if I make mistakes and to answer questions
… Sam, can you go two more slides forward?
… thank you
… Slide[] is example of ID flow
… my example shows NPR
… get presented with login screen
… and get option to login with account info for that site
… or links to various id providers like Google, Apple, FB
… those links are the federated id flow
… if a user clicks on link, a popup shows
… and checks to see if the user is logged in and, if not, the user will be asked to log in
… and once logged in to id provider
… provider will provide link and ask if ok to share
… and it's redirected after user says yes
… Another common use case
… is when you are using a portal
… and you have access to different tools
… that is often a federated id use case as well
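
[Illustration, not shown in the meeting: a minimal Python sketch of the redirect-based login flow described above. The class and function names, the example origins, and the query parameter names are hypothetical. Real deployments use protocols such as OAuth2 or SAML and typically return a code or token rather than a raw email address.]

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlparse


@dataclass
class IdentityProvider:
    """Hypothetical identity provider (IdP) that knows logged-in users by session cookie."""
    origin: str
    sessions: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def authorize(self, request_url: str, session_cookie: Optional[str], consent: bool) -> str:
        """Return the URL the browser is redirected to after the IdP popup."""
        params = parse_qs(urlparse(request_url).query)
        redirect_uri = params["redirect_uri"][0]
        state = params["state"][0]
        user = self.sessions.get(session_cookie) if session_cookie else None
        if user is None:
            # Not logged in at the IdP: ask the user to log in first.
            return f"{self.origin}/login?{urlencode({'next': request_url})}"
        if not consent:
            # The user declined to share account details with the site.
            return f"{redirect_uri}?{urlencode({'error': 'access_denied', 'state': state})}"
        # The user agreed: redirect back with a global identifier (here, an email).
        return f"{redirect_uri}?{urlencode({'email': user['email'], 'state': state})}"


def start_login(rp_origin: str, idp: IdentityProvider) -> Tuple[str, str]:
    """Build the 'log in with IdP' link that the relying party shows on its login screen."""
    state = secrets.token_urlsafe(8)  # lets the site match the response to its request
    query = urlencode({"redirect_uri": f"{rp_origin}/callback", "state": state})
    return f"{idp.origin}/authorize?{query}", state


idp = IdentityProvider("https://idp.example", {"cookie123": {"email": "user@example.com"}})
link, state = start_login("https://news.example", idp)
print(idp.authorize(link, "cookie123", consent=True))
```

In this sketch the user agent only follows redirects, which matches the point that the browser is passively involved in today's flows.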

[next slide- cannot see numbers]
… Behind the scenes
… [chart/workflow]
… won't go through all the details
… point of slide is to show two things
… all those little popups are different areas where we could use web primitives
… similar to tracking and advertising
… and highlight that this flow and communications is all going back to the user agent
… but user agent is passively involved

[next slide - Some of the Challenges Involved]
… as I said, it uses the same tech as advertising for collecting data from users
… so this will break identity federation
… we have to come up with a way not to break federated id without using cookies
… outside of advertisers tracking info about data
… federated id also allows companies to collect data about users as well
… part of flow and can collect which sites people go to
… sites being accessed, get info back from id provider about the user
… id, name, phone number
… info that can go back to site that the user is not aware is being shared, or how much detail
… final thing that is a privacy concern
… in id federation, it's done using global identifiers
… an email, phone number
… so it's shared back with global identifier
… so relying parties or sites can join up with other sites that have that same global identifier
… so there are privacy concerns with id federation
… this has been around since start of web

<npd> that particular flow didn't seem to rely on third-party cookies, did it? if you redirect with url parameters to the identity provider, and then are redirected back, there doesn't seem to be an embedded iframe that needs access to those cookies

… a) there are a lot of protocols for using federated ID
… a lot of technical implementations
… it's ubiquitous and not done a single way
… and it's not a single institution or group
… in adv it's brands, advertisers, adtech, users
… with federated ID, it's higher ed to give students access to resources
… to researchers to share online
… federal institutions will use it to share with consumers
… financial services, health institutions, employers use it to give access to their employees
… many B2B and B2C use cases
… many use cases
… whatever is a solution has to be straightforward
… on behalf of relying parties
… has to be simple so it will be timely
… need a robust solution that is easy to implement

<wseltzer> slides online

[next slide: Federated Credential Management API] Google's proposed solution
… you may have heard of it as WebID, but there was a naming conflict
… also known as WebCM
… focused on loss of third party cookies
… that is most pressing change
… this proposal might evolve to deal with other privacy changes
… but right now we are focused on the third-party cookie loss and those scenarios
… Google is trying to make it as simple as possible
… the onus is more on the id providers and less on the relying parties
… to make it a change that is more feasible overall
… Google submitted it to Federated ID CG
… we are working on this
… focusing on different use cases, what works and what doesn't work
… When Google looks at this, there are three alternatives
… there is the Permission-Oriented one
… could take more active role and flag for users a potential tracking risk
… give users a prompt and ask users for their permission, highlight potential risks
… second option is the mediation-oriented option
… idea that user agent would take a more active role
… it would start taking over some of the presentation from the id providers
… have a directed id that is site specific
… it would stay in the middle, increase privacy by trying to manage the interactions so it's more privacy friendly
… final option is delegation-oriented
… where user agents take the most active role
… they take over the identity provider
… id provider issues the tokens, but browser takes over providing the tokens
… basically, Sam had great analogy
… like going to RMV for driver's license
… they issue it, but don't provide it to others when you need to show your id
… the one where user agents manage the most is the most privacy preserving
… For now, Google has picked the mediation-oriented flow
… permission-oriented one doesn't tackle a lot of the privacy concerns for now
… areas in green is where user agent would take on some of the roles
… on right side, it would steer users
… using a directed identifier instead of a global identifier
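
[Illustration, not shown in the meeting: one common way to build a directed (site-specific) identifier is a keyed hash over the relying party and the user's global identifier. This is a sketch under stated assumptions, not the FedCM design; the function name, the key, and the example origins are hypothetical.]

```python
import hashlib
import hmac


def directed_identifier(secret_key: bytes, global_id: str, relying_party: str) -> str:
    """Derive a per-site identifier; the key is assumed to be held only by the mediator."""
    message = f"{relying_party}|{global_id}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


key = b"demo-key-held-by-the-mediator"  # hypothetical secret, for illustration only
news_id = directed_identifier(key, "user@example.com", "https://news.example")
shop_id = directed_identifier(key, "user@example.com", "https://shop.example")

# The same user gets a stable identifier per site, but the two sites see different
# values, so they cannot join their records on it the way they could on an email.
print(news_id != shop_id)
```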

[next slide use cases Google has considered]
… this is a pretty complete list, but call to this group is to see if there are unique use cases
… we want to find those sooner versus later
… and make sure we know about them; ask people to speak up
… things like sign-in, sign-out

[next slide: current status]
… FedID is being discussed at FedID CG
… let people know this conversation is going on and invite folks to participate
… where we're at
… in discussions
… one of first ones is what should the user agent's role be?
… they aren't involved
… all these privacy changes make the browsers more involved in the exchange
… where to draw that line is up for debate
… some of id providers are also browser providers
… there is a question there
… you might stop id providers from tracking, but browser will have more access to data, so what do you do there?
… we are trying to figure out the best answer
… Google and Microsoft have been participating
… we would love to get more of the browser vendors participating
… it would be helpful to know what they may support
… get more participation from user agents, and anyone who has concerns here as well is always welcome
… FedCM is one answer
… there are of course a lot of privacy proposals, and may be used for different aspects and use cases
… looking at which ones fall best for which use cases

[last slide: For more info]
… I will send this out as a PDF after the call
… so you can find out more about identity in general and FedCM
… if you want to join the FedID CG, you are welcome

Wendy: Thanks a ton for that overview
… and thank you for your patience with the technical challenges
… I want to reiterate about the standards process

<npd> (I have a few questions, but don't seem especially advertising related, so I'll follow-up elsewhere)

Wendy: Community Groups look at different proposals
… and help us to figure out places where there is common interest

<jrosewell> Karen - could you please add this link to the minutes to make it clear to the reader of the minutes the guidelines that applied to this meeting that the Chair confirmed. https://www.w3.org/Consortium/Legal/2017/antitrust-guidance - thanks.

Wendy: in standards across the industry, across the ecosystem for improvement to the web platform
… while we heard names of companies making proposals
… for making a web standard
… it needs the input and reviews from all participants
… we appreciate those who are building pieces of the product chain sharing their ideas
… to help us figure out collectively how to improve federated identity management to the web

Aram: thanks for that Wendy
… appreciate your moderation here
… for the FedID group; I know this has come up in privacy proposals in past
… idea of authentication unlocking certain...
… is that being incorporated into these proposals?

Kris: It is happening

<AramZS> *the idea of authentication unlocking particular browser privileges in regards to privacy is what I meant, for the notes

Kris: I would not say it is being incorporated
… some distinction between authentication and authorization
… looking at these and think about what to do
… Sam, do you want to add a comment from Google's perspective?

Sam: yes, we are trying not to break things, very busy
… I do personally believe that the browser knowing you are authenticated to a page
… can unlock certain abilities for our browser
… it's a meaningful signal a browser can take

<AramZS> I see Sam Goto confirming that FedID is indeed where that discussion would live.

Sam: I do believe it is a primitive foundational API for the web to make people's lives better

Aram: Thank you

SteveSilvers, Neustar: quick question
… maybe same question was asked
… have you considered privacy signaling along with privacy federation
… in addition to log-in you could use your privacy preferences in the same place

<npd> what kind of privacy preferences would the user want to communicate here?

Kris: we have not considered piggybacking a privacy signal
… whether user has ok'd use of data signal or not
… we have looked from perspective of right now when you look at these activities
… the user agents cannot tell if tracking for adv or trying to log in
… if we create a channel or mechanisms to have browsers id that activity and take more steps to ensure user privacy
… there is an interesting opportunity
… once you set this up as a comms channel
… you can also add in privacy protections for users as well
… Sam, invite you or Beri to chime in
… don't hear anyone talking
… ok, he's good to go
… any other questions?

Wendy: No one else on queue

Achim: Federated identity not just about identity but also authorization to resources

Wendy: Achim had his hand up in webex

Wendy: If you can join irc channel
… we manage our queuing there

Topics API

Wendy: Achim, feel free to ask question on email or follow up
… now give time to Josh Karlin to share information on topics API

<aschlosser> a privacy signal would be another resource attached to an identity

Wendy: proposal for a role of a generic web browser in interest-based advertising
… comes as an idea to see if the Topics API is interesting for building web standards around it
… thank you, Josh

Josh: There is a brief window of time, so get straight to it

<wseltzer> https://github.com/jkarlin/topics

Josh: this is about interest-based advertising
… the next version of FLoC
… we trialed that
… most important results were feedback from the ecosystem
… want to specifically thank Mozilla
… I'd like to talk about what that feedback was, how we addressed it, and how it resulted in our new proposal
… talk about what has changed in light of that feedback as it encouraged us to make these changes
… First thing we heard is that a lot of sites with ads were unhappy they were automatically included
… we did this to give adtech a better feel for performance
… and to have enough info available to the API to understand how it will behave for utility
… we understand that was problematic
… so we won't do that in future
… we also heard that the cohorts were difficult to understand
… cohorts means there is an extra step to translate what it means
… for people who use Chrome, we cannot tell them what they are saying
… Given these perspectives, we think it makes more sense for browser to determine the topics
… like performing arts rather than a specific number
… users can recognize what is being said, to each topic embedded for sensitivity
… size of Topic taxonomy can be shorter
… focus on adv topics specifically, so less fingerprinting surface
… consumers can opt in or out of specific Topics
… there are other proposals regarding Topics
… Ad Topic Hints and Eyeo has Crumbs proposal...give credit where due
… we decided that our refined proposal needs a new name
… hope this makes the new proposal clearer
… where do topics come from, who determines them, etc.
… Describe what we have in mind for the initial taxonomy to put together safely and fairly and for training
… ideally start with an external taxonomy
… cannot make an ML model
… have looked at IAB's taxonomy
… map topics in vertical 4
… Google's sensitivity muster
… we are sourcing the topics externally
… with about 350 topics
… as we say in explainer, we would like for this to be run externally
… don't know when that may be
… whatever taxonomy Chrome uses will be publicly available
… third issue is that FLoC added new fingerprinting surface
… something being said about user
… find an adv that is more representative
… we want to add as little fingerprinting as possible
… leverage topics
… topics target topics advertisers care about
… from 16 bits for Cohorts to 8-12 bits
… can add noise
… ensure each topic selected by certain number users
… provides certain amount of deniability
… each caller say 5% of the time receives
… improvement over FLoC
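
[Illustration, not shown in the meeting: the bit counts above can be checked with a base-2 logarithm. Treating a returned value as one uniform choice from the taxonomy is a simplifying assumption; the function name is hypothetical.]

```python
import math


def bits_needed(distinct_values: int) -> float:
    """Upper bound on the information (in bits) carried by one value from a set this size."""
    return math.log2(distinct_values)


cohort_bits = bits_needed(2 ** 16)  # a 16-bit cohort ID space
topic_bits = bits_needed(350)       # a taxonomy of about 350 topics

print(f"cohort: {cohort_bits:.2f} bits, single topic: {topic_bits:.2f} bits")
```

A single topic from a list of about 350 carries roughly 8.45 bits at most, which sits inside the 8-12 bit range mentioned above.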

Josh: if we can reduce amount of info Topic reveals

<npd> I actually think it makes a bigger difference for fingerprinting that the values can be different by origin rather than the size of the surface

Josh: if site returns cars, another performing arts, site may not know it's same user

<wseltzer> https://github.com/jkarlin/topics#meeting-the-privacy-goals

Josh: this slows down cross-site linkage
… trying to condense info quickly, will slow down
… Topics API takes advantage of idea by showing only one of the top topics for each week
… a single site cannot learn them all
… no matter how many callers onsite, each get same topic
… we have dramatically reduced rate of cross-site information flow compared to FLoC
… Sensitive information
… third party cookies can track anything about a user; precise URLs, etc.
… Topics API is restricted to a human-curated list
… can change over time
… can be statistically correlated
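
[Illustration, not shown in the meeting: a rough Python sketch of picking one of the week's top topics per site. All names are hypothetical, the choice of five top topics follows the question raised later in the call, and the noise branch assumes the 5% remark refers to occasionally returning a random topic from the taxonomy. It is not Chrome's implementation.]

```python
import hashlib
import random
from collections import Counter
from typing import List, Sequence


def top_topics(visited_topics: Sequence[str], k: int = 5) -> List[str]:
    """Return the k most frequent topics among the sites visited during one week."""
    return [topic for topic, _ in Counter(visited_topics).most_common(k)]


def topic_for_site(top: Sequence[str], taxonomy: Sequence[str], user_secret: str,
                   site: str, week: int, noise_rate: float = 0.05) -> str:
    """Pick one topic for a (site, week) pair; every caller on that site gets the same one."""
    seed = hashlib.sha256(f"{user_secret}|{site}|{week}".encode("utf-8")).digest()
    rng = random.Random(seed)  # seeded per site and week, so repeated calls agree
    if rng.random() < noise_rate:
        return rng.choice(list(taxonomy))  # assumed noise: a random topic
    return rng.choice(list(top))


taxonomy = ["cars", "performing arts", "travel", "cooking", "fitness", "books", "pets"]
week_top = top_topics(["cars", "cars", "travel", "performing arts", "cars", "travel"])
print(topic_for_site(week_top, taxonomy, "browser-local-secret", "news.example", week=5))
```

Because different sites can receive different topics for the same user, two sites comparing notes have less to match on than with a single global value.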

<GarrettJohnson> I attempted to explain Topics API in a short thread https://twitter.com/garjoh_canuck/status/1486438310335721474

Josh: when comparing two, Topics seems like a clear improvement over cookies
… Last major change we made
… is that
… giving less information, say Topics, for more people
… any caller out there, is not a net improvement for privacy
… we need to state that it's a step forward for privacy
… and best way to do is to compare to third party cookies
… and show that Topics is not sharing more information
… what comparisons
… cookies are caller specific and not global
… Adtech A can only learn about sites from where their tag appears
… FLoC was global
… key difference to fix
… first idea to make it site specific
… the more reach the caller has, the wider range of topics
… this approach has privacy flaw
… if various callers combine topics, they can quickly identify users across sites
… so we came up with a hybrid type
… calculate use of topics over a week
… but topics returned to callers if they observe the user on a site related to that topic
… sites are not learning more info
… I think I will leave it there since we are at 12:00pm
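
[Illustration, not shown in the meeting: a small Python sketch of the hybrid idea, where topics are computed for the user as a whole but a caller only receives topics it has itself observed. The function names and data layout are hypothetical.]

```python
from typing import Dict, Iterable, List, Set


def record_observation(observations: Dict[str, Set[str]], caller: str,
                       site_topics: Iterable[str]) -> None:
    """Note that `caller` was present on a site carrying these topics."""
    observations.setdefault(caller, set()).update(site_topics)


def topics_visible_to_caller(weekly_topics: Iterable[str],
                             observations: Dict[str, Set[str]], caller: str) -> List[str]:
    """Return only the weekly topics this caller has already observed for the user."""
    seen = observations.get(caller, set())
    return [topic for topic in weekly_topics if topic in seen]


observations: Dict[str, Set[str]] = {}
record_observation(observations, "adtech-a.example", ["cars"])
record_observation(observations, "adtech-b.example", ["cars", "travel"])

weekly = ["cars", "travel", "performing arts"]
print(topics_visible_to_caller(weekly, observations, "adtech-a.example"))
```

A caller with wider reach sees more of the user's topics, and a caller that never observed the user on a related site learns nothing new.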

Wendy: Thank you very much, Josh

<AramZS> Can you come back for questions next meeting?

Wendy: we have a couple more minutes

<jrosewell> Karen – please can you include the following to ensure the request I raised at the beginning of the meeting is recorded for the official minutes of the meeting.

Wendy: we have people on the queue

<jrosewell> Information concerning the proposal for Topics API via a github web page was provided prior to the meeting in the agenda.

<jrosewell> The Topics API proposal seeks to restrict competition in the B2B market for Interest-based advertising (IBA) between those that operate web browsers and those that do not.

Wendy: and I'd like to invite you back for a meeting in two weeks

Josh: yes, that makes sense

<jrosewell> The web browser operators will decide how data is processed and only the output of that processing passed to B2B IBA market participants.

<jrosewell> The W3C’s antitrust guidelines explicitly prevent such a discussion (see sentences 6 and 7). As such I requested the Topics API agenda item is ruled out of scope and removed from the agenda.

Josh: on the queue is Brian, Moshe, and Angelina

<jrosewell> Chair did not wish to discuss this, did not seek to understand how discussing Topics restricts competition, and explicitly requested that I cease verifying conformance with W3C rules and laws.

<jrosewell> The antitrust guidelines suggest participants seek legal counsel (sentence 8). I was left with no choice other than to noisily withdraw from the meeting and remained as an observer.

Josh: I invite you to rejoin for future discussion after these questions

Brian: I was going to suggest we take this topic up in the next meeting

Angelina: yes, I will, too

Moshe: I don't know if I need an answer right away
… food for thought

<AramZS> +1 to folks interested in talking about this next meeting

Moshe: we would or Google picks top five topics
… any topics that over index as number six
… will there be a bias towards high volume topics, but not toward the next higher topics

Josh: great question; we encourage experimenters to help us out here
… we may need to weight topics, not by value, but by frequency
… so it's not always stuck in sixth position

Wendy: We will come back to this
… as the GitHub page notes, there are many open questions
… we look forward to discussing at our next meeting
… and again apologies for novel technical challenges
… we appreciate both of your presentations and the discussions
… we will next meet on 15 February

<wseltzer> [adjourned]

Wendy: and thanks, Aram, for pointing me to additional agenda items in Github

Minutes manually created (not a transcript), formatted by scribe.perl version 185 (Thu Dec 2 18:51:55 2021 UTC).

Maybe present: Achim, Angelina, Aram, Brian, James, Josh, Kris, Moshe, Sam, SamWeiler, Wendy