W3C

– DRAFT –
Improving Web Advertising BG

23 March 2021

Attendees

Present
apireno_groupm, AramZS, arnaud_blanchard, arnoldrw, blassey, bmay, Brendan_IAB_eyeo, dialtone, dialtone_, dmarti, ErikAnderson, GarrettJohnson, gendler, imeyers, jdelhommeau, joelstach, jrosewell, Karen, kleber, kris_chapman, pedro_alvarado, pl_mrcy, seanbedford, shigeki, wbaker, wseltzer
Regrets
-
Chair
Wendy Seltzer
Scribe
Karen, Karen Myers

Meeting minutes

Wendy: welcome
… As we get started let's look at the agenda
… Valentino will present on second price auctions
… and some questions raised there for use cases and proposals
… FLoC testing and privacy proposals, particularly looking at international context
… Dashboard highlights
… and AOB, pointing to upcoming meetings
… there is an ongoing poll to schedule a one-off FLoC meeting
… There was a heated contest going on between next Monday and Thursday slots

<wseltzer> https://doodle.com/poll/y8fsp9nnahz6udqw?utm_source=poll&utm_medium=link

Wendy: So if you want to participate in the FLoC conversation, please fill out this poll

<wseltzer> https://github.com/WICG/privacy-preserving-ads/issues/3

Wendy: I will choose the time with the greatest availability this Thursday (24 March) and get notifications out
… Other notification to give a heads-up
… There will be a PARAKEET meeting in the WICG call
… under the Web Incubator CG
… participants urged to join and participate in those calls; W3C community groups are free to join
… so please do so if of interest to you
… WICG meeting is tomorrow at 11am EDT

Agenda-curation, introductions

Wendy: Any introductions?
… Anyone new to the call today?
… who would like to introduce yourself?
… Any agenda items to add for today or for future meetings?

Second Price Auctions

Wendy: Reminder that "q+" in irc is best way to get my attention as we go through Q&A
… Valentino, I think you have something to present?

Valentino: yes, I will share my screen
… Second price auction is a bad title
… Better title is talking about real time spend feedback
… Briefly talk about the problem and then the proposal
… Real time spend is a key aspect of how ad ecosystem works today
… because it solves a variety of issues
… such as troubleshooting and monitoring
… self-serve buying campaigns from DSPs
… and they may set it up incorrectly
… due to set of parameters
… or choose parameters that wildly overspend the budget in the system
… not optimal
… spend at rate that customer asks you to
… every adtech platform supports some form of this
… Last Wed, 17 March in FLEDGE call we had long call about cost resolution
… multiple parties publish exchange
… and agree upon amount of money owed
… and what does current spec allow for this to happen smoothly
… and other features such as page-level frequency capping
… frequent in re-targeting use case
… If you have built real-time systems
… without a RT feedback
… the health of the system, spending money
… it ends up like being a self-driving car
… and piloting it with a video feed that is one hour delayed
… very conservative operation, never accelerating
… and will impact the top-level spend in the market
… We believe second price auction could unlock real time feedback
… One of primary problems was that bid price could be used to inject a user identifier and used to track
… in particular which sites they visit
… using second price, auction would return a lower price
… and should allow price to be returned to ad server
… so publisher and SSP can choose between first and second price auction
… then a report is triggered to transmit bid to the winner
… probably the winner, SSP and publisher should receive this price so they can do cost resolution
… does not go through aggregate reporting
… We want publishers and SSPs to select which auction they want
… Publishers have long list of reasons why to go with first-price auction
… this is a buyer perspective; I'm not super knowledgeable on this
… but typically buyers never bid their true value
… could be higher than inventory
… Bid shading is employed; uses ML to capture as much info as you can
… to lower price so percentage probability remains high
… in first price auctions I can use same price
… and know which sites visited
… even in current specs this can be done
… so far we are doing aggregate
… mathematically, true value
… pay value of second one
… maximizes views
… and allows everybody to bid as high as possible
… and it maximizes yields
… no need for bid shading
… from our perspective it may allow this real time feedback to come through
… This is a simplified change set of the FLEDGE spec
… where the seller during the auction can specify an auction type value
… and set to second price
… auction is not super simple
… have to make sure buyer is not competing against himself
… avoid having data leakage
… at that point notification of buyer and seller of the ultimate price
… Among things I mentioned
… what is in it for publishers
… Better performance
… going from first to second yields better performance
… whether lower CPM or CPA
… leads to more publishers
… real time feed can solve those four features I talked about earlier
… there is going to be better pacing so bid density will be higher; more competition, higher demand will drive up prices
… publishers will have better knowledge of their own inventory prices
… allows them to set private auctions, if there is a spec for minimums or floors
… which are harder to set when working through first prices
… and the cost resolution
… Looked at possible areas of attack
… Collusion
… Multiple buyers
… harder to pull off; requires coordination with two buyers
… browser would keep a history of the same two bids that repeat
… and stop choosing winners from those two buyers
… Could be some leakage in the winning
… but not different from current proposals
… in aggregate form, some leaking here
… Lastly, there may not be a second price available within the auction
… in practice we see bids get shaded
… typically should be possible to have a second price or a lowering of the bid
… which the browser could do on its own
… if it determines the auction is not anonymous enough
… could have noise by lowering the auction an arbitrary amount
… and attacker could set up a web site
… more frequently the buyer
… not reveal much history
… on the web site you set up
… not much of a leak expected
… could be a lot of other smaller modifications to this system such that there are delays
… or other features browser could add to increase anonymity
… I have some data if we want to see it later
… But generally speaking, the proposal is simple; allow ability to choose auction type
… auction leads to function and sends to three parties of the auction
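
The clearing rule described in this presentation can be sketched as follows. This is a minimal illustration, not the FLEDGE API: the `Bid` class, `run_auction`, the `auction_type` strings, and the fallback to the floor when no competing bid exists are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Bid:
    buyer: str    # hypothetical buyer name
    price: float  # bid in dollars (illustrative unit)


def run_auction(
    bids: List[Bid],
    auction_type: str = "second_price",
    floor: float = 0.0,
) -> Optional[Tuple[Bid, float]]:
    """Return (winning bid, clearing price), or None if no bid meets the floor."""
    eligible = sorted(
        (b for b in bids if b.price >= floor),
        key=lambda b: b.price,
        reverse=True,
    )
    if not eligible:
        return None
    winner = eligible[0]
    if auction_type == "first_price":
        return winner, winner.price
    # Second price: use the highest bid from a *different* buyer, so a buyer
    # never competes against itself. Assumption: with no competitor, the
    # winner pays the floor.
    runner_up = next(
        (b.price for b in eligible[1:] if b.buyer != winner.buyer),
        floor,
    )
    return winner, runner_up


bids = [Bid("buyer_a", 3.00), Bid("buyer_a", 2.50), Bid("buyer_b", 2.00)]
second = run_auction(bids, "second_price")
first = run_auction(bids, "first_price")
```

With these bids the second-price rule charges buyer_a $2.00 (the other buyer's bid, not its own $2.50), while the first-price rule charges $3.00.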

Wendy: I queued up to ask if you can share your slides for us
… and is there a repository or issue?

Valentino: yes, I will share the slides; I will be creating an issue or long-form document and share it later

GangWang: thanks...we have been talking about real time in different contexts
… could you comment about how much real time the industry really needs
… talking minutes, half hours, hours, days?

Valentino: it tends to be necessary across the board for a variety of reasons
… Easier one is case of targeting attributes; look-alike like what FLoC does
… in our case [NextRoll] they set budget too low
… in less than five minutes you may have spent the entire budget
… a company that wants to display ads to people interested in product
… they think it's people who browse CNN
… and a $10 budget and in three seconds they run out of money
… they did not realize this would happen
… In some countries
… we had to refund two cents to the customer
… because we had overspent their money at the end of the month
… Different but depends upon the variety of use cases
… in our case, we are talking about seconds
… Maybe 30 seconds or a minute are good enough
… It's definitely not days or hours
… seconds to minutes at best
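
The effect of feedback latency on overspend can be illustrated with a toy simulation. It assumes a constant spend rate and a bidder that stops only once its most recent feedback shows the budget is exhausted; `simulate_overspend` and all of its parameters are hypothetical, not a description of any real pacing system.

```python
def simulate_overspend(
    budget: float,
    spend_per_second: float,
    feedback_interval_s: int,
    duration_s: int,
) -> float:
    """Return how far spend exceeds budget when feedback arrives only
    every `feedback_interval_s` seconds."""
    if feedback_interval_s < 1:
        raise ValueError("feedback_interval_s must be at least 1")
    spent = 0.0
    known_spent = 0.0  # what the bidder believes has been spent
    for t in range(duration_s):
        if t % feedback_interval_s == 0:
            known_spent = spent  # spend feedback reaches the bidder
        if known_spent < budget:
            spent += spend_per_second  # bidder still thinks budget remains
    return max(0.0, spent - budget)


five_minute = simulate_overspend(10.0, 1.0, 300, 600)
thirty_second = simulate_overspend(10.0, 1.0, 30, 600)
every_second = simulate_overspend(10.0, 1.0, 1, 600)
```

Under these assumptions a $10 budget spending $1 per second overspends by $290 with five-minute feedback and by $20 with 30-second feedback; with per-second feedback it stops on budget.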

[slide]
… This is an experiment we ran
… an effective bid rate experiment on transitioning our pacing from daily to hourly to real time
… this is adjusted every five minutes
… effective bid rate went from 50%
… adjusting pacing every day
… to instead be median of 100%
… so we always had budget available when we adjusted our budget in real time
… so there is significant information

GangWang: on this slide
… when you adjusted every five minutes
… how much is the increment?

Valentino: we don't always increment; bid is tied to win rate and we monitor spend in real time
… we decrease sensitivity to risk of our ML
… at which point the system will be lower
… variation could be really small
… thousandth of a cent

GangWang: no doubt that real time adjust is important
… wanting to understand benefit of every five minutes vs. every 30 seconds

Valentino: we can do experiments with different customers and customer size
… and look at what latency can be sustained

Michael_Kleber: Thank you for the presentation, Valentino
… I want to say two things; one about real time information; one on first and second price
… on subject of real time
… from privacy sandbox POV
… difference between first and second price doesn't matter to privacy threat model
… people may be colluding with each other
… info associated to each event is exact price from first price highest bidder
… doesn't change event level info
… in the long term
… get past event level reporting stage, start of FLEDGE experiment
… I still think we need some aggregation
… let me make clear what type of aggregation we need and how it relates to differential privacy
… goal of diff privacy...ok to get how much budget is left
… great aggregated question
… as long as info is blurred, has enough noise
… to hide the contribution of any individual person
… amount of noise added to diff privacy answer is about same size as like three impressions
… if you need a real time update for each indiv impression
… I don't see how that is compatible with diff privacy approach
… maybe fuzz exact amount of bid to make more plausible
… but not leak indiv person's ID
… but if talking about how much value a campaign has left, every 30 seconds
… and expect answer, not enough has changed, so keep using last answer, that is perfectly fine
… from the privacy sandbox philosophy

Valentino: As you said, you will need to add noise

Michael: yes, that's right
… we need a central DP way to know what budget is remaining
… so you cannot keep querying the budget
… no, there hasn't been enough to make a substantial change
… and then find new answer; have noise associated with it, for usual DP reasons
… and not track indiv involved
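
The aggregate budget query discussed here could look roughly like the sketch below: the remaining budget is released with Laplace noise, and repeated queries return the previous answer until spend has changed substantially. The class name, its parameters, and the change threshold are assumptions for illustration, and this simplified gating is not a vetted differential privacy mechanism.

```python
import random


class NoisyBudgetOracle:
    """Toy central-DP style oracle for remaining campaign budget."""

    def __init__(self, budget, epsilon, max_impression_cost, min_change, seed=None):
        if epsilon <= 0 or max_impression_cost <= 0:
            raise ValueError("epsilon and max_impression_cost must be positive")
        self._remaining = budget
        # Laplace scale = sensitivity / epsilon, where sensitivity is taken
        # to be the most one impression can cost (an assumption).
        self._scale = max_impression_cost / epsilon
        self._min_change = min_change
        self._rng = random.Random(seed)
        self._last_true = None
        self._last_answer = None

    def record_spend(self, amount):
        self._remaining -= amount

    def _laplace(self):
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        rate = 1.0 / self._scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def query(self):
        unchanged = (
            self._last_true is not None
            and abs(self._remaining - self._last_true) < self._min_change
        )
        if unchanged:
            # Not enough has changed: keep using the last answer.
            return self._last_answer
        self._last_true = self._remaining
        self._last_answer = self._remaining + self._laplace()
        return self._last_answer


oracle = NoisyBudgetOracle(
    budget=100.0, epsilon=1.0, max_impression_cost=0.01, min_change=1.0, seed=0
)
first = oracle.query()
oracle.record_spend(0.005)
second = oracle.query()   # same as first: change is below the threshold
oracle.record_spend(5.0)
third = oracle.query()    # fresh noisy answer near 94.995
```

Reusing the cached answer is what keeps a caller from averaging away the noise by polling repeatedly.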

Valentino: you could do small aggregates back to third parties
… not solved is the cost resolution

Michael: I don't understand

Valentino: Be push or pull based

Michael: either could work
… sending alert when my budget runs out is a fine way for it to work
… we could engineer system with the diff privacy approach

Valentino: but doesn't add other features like cost resolution; you are only party with the report
… people won't trust system they cannot independently audit

Michael: at least two parties would get cost info
… maybe even three, separate from buyer and seller
… buyer and seller could keep own books, and possibly another party to resolve disputes
… And take a moment to address first and second price
… diff question from how real time can be
… Be clear with Chrome's position
… we don't take a position on first or second price auction
… our goal is to build platform powerful enough to run either type
… up to seller to choose their mechanics
… up to buyer to choose where to participate
… personally, I like second price auctions for same reasons Valentino talked about
… Chrome as platform needs to support both
… so sellers can run auctions of either types

Valentino: How can it be @ three increments
… you could be sending us random numbers, and we would not know if you are lying to us
… in a first price we can set vector; in second price it's much harder; don't see how they are equivalent in the back end
… there is going to be more info in first price notification
… I receive it back; could bid $3 back
… get back list of sites in the report
… look at every $3
… and see aggregate results that tell me what group has done
… on second price, unless at point of collusion, there is no way to inject info into the auction
… using bid price, because you can discount at random

Michael: collusion is an important part of answer as you say
… if I have a user ID
… that is known at time I am added to an IG
… and all ad tech users agree it will be encoded in @
… look at millionths of a cent and look at my user ID
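
The bid-encoding concern in this exchange can be made concrete with a toy example. Prices are integers in millionths of a cent, and the helper names and the 424242 identifier are made up for illustration; nothing here reflects a real bidder or the FLEDGE API.

```python
UNITS_PER_CENT = 1_000_000  # price unit: one millionth of a cent


def encode_bid(base_cents: int, user_id: int) -> int:
    """Hide a user ID in the sub-cent digits of a bid."""
    if not 0 <= user_id < UNITS_PER_CENT:
        raise ValueError("user_id does not fit below one cent")
    return base_cents * UNITS_PER_CENT + user_id


def decode_id(price: int) -> int:
    """Read back whatever sits in the sub-cent digits of a reported price."""
    return price % UNITS_PER_CENT


def reported_price(winning_bid: int, runner_up_bid: int, auction_type: str) -> int:
    """Price revealed in the event-level report for each auction type."""
    return winning_bid if auction_type == "first_price" else runner_up_bid


winner = encode_bid(300, 424242)          # $3.00 bid carrying an ID
honest_runner_up = 250 * UNITS_PER_CENT   # $2.50 bid, no ID
colluding_runner_up = encode_bid(250, 424242)

leak_first = decode_id(reported_price(winner, honest_runner_up, "first_price"))
leak_second = decode_id(reported_price(winner, honest_runner_up, "second_price"))
leak_collusion = decode_id(
    reported_price(winner, colluding_runner_up, "second_price")
)
```

In this toy setup the first-price report exposes the ID, the second-price report with an honest runner-up does not, and a colluding runner-up restores the leak, which is why noise is argued to be needed in either auction type.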

<jrosewell> Why does "Privacy Sandbox" have to assume everybody is colluding? Doesn't that ignore laws and contracts.

Michael: how to deal is to add noise
… why we still need that sort of protection whether first price or second price

Valentino: I think we should continue offline

Mehul: you are getting second price; if adding noise would it matter?

Valentino: I don't think noise at single impression
… when charged price CPM
… adding noise to single impression is one thousandth of CPM

Marcal: about sampling
… we think it's more a strategic decision; maybe done at beginning
… a lot of people there
… if small kind of ok
… with site we have
… very good it hasn't had problems since today
… something that should be done at the beginning; to use sampling
… care about these insights
… how many users have done it, v comparing behaviors
… this would be interesting
… in short- to mid- term without cookies
… we have to think about how to sample
… talk about FLoCs
… maybe @ will accelerate that
… we are discussing that

Valentino: got it

<jrosewell> I've heard two issues; 1) need for support for 2nd price auction; and 2) accurate feedback frequency must be between 30 and 1 second.

Wendy: I see some comment in the Webex

<jrosewell> If the frequency of "polling" is too frequent the result will become increasingly inaccurate. Will advertisers really spend money on a solution that risks them not being in control of their budget?

Wendy: We try to use irc for comments

<jrosewell> Do we have a template for the expected legal contract between the parties? It would be helpful to get lawyers to review the expected legal contract before investing in expensive experimentation.

Wendy: if you would like to raise your voice, please speak up

<jrosewell> In regard to 2nd price auction who will decide the auction rules?

Mark: I guess this goes back into concept of correlation
… and ratio for fitness scoring
… is there reason why we cannot put back into cohort?

Marcal: we have to be consistent with the different regions

Mark: issue raised around win notification and signalling to allow for banker type services and bidders to adjust
… what is going on while still doing privacy
… could be adjusted with cohort
… and still aggregate and update fitness systems and not have delays
… don't know user transactions per se

<kleber> @Mehul: Adding noise to each bid value is certainly feasible also — that's "local DP". But after adding up n noised bids, you end up with an estimate that differs from the true value by O(sqrt(n))
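
The O(sqrt(n)) figure follows from a standard variance argument, sketched here under the assumption that each bid b_i receives independent zero-mean noise η_i with variance σ² (symbols introduced for this note only):

```latex
\begin{align*}
\tilde{b}_i &= b_i + \eta_i, \qquad \mathbb{E}[\eta_i] = 0, \quad \operatorname{Var}(\eta_i) = \sigma^2 \\
\sum_{i=1}^{n} \tilde{b}_i - \sum_{i=1}^{n} b_i &= \sum_{i=1}^{n} \eta_i \\
\operatorname{Var}\Bigl(\sum_{i=1}^{n} \eta_i\Bigr) &= n\sigma^2
\quad\Longrightarrow\quad
\text{typical error} \approx \sigma\sqrt{n}
\end{align*}
```

By contrast, adding one noise term to the already-summed total keeps the error at about σ regardless of n.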

Valentino: I think that is what Michael K was talking about
… push/pull
… becomes solid enough to receive real time comms
… still has to be compatible with value of campaign and repeat querying
… dictates the latency
… will be more challenging to tune if that is the only option
… Did I understand your question?

Mark: yes, I think so

Wendy: We are getting to levels of details that would be helpful to address on repo

FLoC testing and privacy considerations

Wendy: Julien, you wanted to raise a question on privacy

Julien: not privacy sandbox, but get POV as we move into origin trial for FLoC
… are people going to run tests in EMEA
… and what considerations for using FLoC to provide more context
… why I am asking this question

[shows slides]
… give overview of eprivacy definition
… it is a directive that applies to all countries in Europe
… Looking at French example
… eprivacy regulates anything that is reading or writing information on user's desktop
… So FLoC is writing...would be considered regulated and needs consent
… GDPR applies to all European countries same way
… personal data is information relating to....[reads from slide]
… and IP address from user
… so if you have user ID and FLoC ID, that would be considered personal data
… so implications for using FLoC in that context
… I am not looking for an answer to this question today
… knowing that those questions exist
… the French DPA
… have pushed a new rec that goes live in one week
… they will have automatic check to make sure web sites are respectful
… seems ill-advised to start testing with FLoC today unless you are ready to address these eprivacy questions
… Is anyone intending to test FLoC in European country
… or is testing going to be done in US, outside EMEA
… where there might not be those considerations
… if anyone plans to do testing in Europe with FLoC origin trials, how are they approaching the eprivacy part of this?

Wendy: we are not providing legal advice or specific product plans in this forum
… but if people find it helpful to provide ideas on how to respond on common issues, then please go ahead

James: There are plenty of lawyers on the call
… yourself included, Wendy
… So I would appreciate discussing...
… to Julien's question, we are not going to be testing FLoC
… another question, who is data processor and data controller; it's a machine
… GDPR and @ are not well suited to be doing this with machines
… another consideration to think about
… and get these issues resolved
… which involve discipline of law as well as commercial

Michael: As Wendy said, I am not going to touch on legal advice
… everybody is welcome to have opinions
… I am not a lawyer so my opinions on legal matters don't matter

Michael: from Chrome origin trial POV
… wish it were on already, we are still working on it
… when we turn it on, it will only be activated in some countries
… not all the countries in the world
… I don't have the list of the countries right now
… It will not include any European Economic Area countries
… of the sort that this question was about

Julien: Thank you, Michael
… to clarify, the origin trial will not include FLoC in the European countries
… so anyone in those countries should not try to do that?

Michael: origin trials are turned on for traffic for some percentage of browsers
… Chrome standard is one half of one percent; it's a small number in terms of ML
… we're in negotiation to see how to make data available for a larger percentage of users to experiment with ML
… telling you this because any time you are going to run an origin trial
… not for everyone
… say if it exists in this browser, if not, wait for API to be launched
… that is background for all origin trials

<jrosewell> * robin an identifier created for the purposes of enabling communication (i.e. IP address, SIM, IMEI, etc) by a machine is different to an ID associated with an interest group as defined by FLoC.

Michael: for countries in Europe, we will not be turning on origin trial for users in EEA countries
… some countries, but zero traction in others
… not site in middle of visiting for example

Julien: user location

<jrosewell> * robin other identifiers created by a web site in Europe will have a data controller (or processor) associated with them. It will be the entity that operates the web site.

Julien: from IP

Michael: yes, Chrome knows what country you are in based on IP, geo

Julien: Europeans are out of origin trial for now
… do you know when some portion will take place?
… or have to mind down the way

Michael: We will figure out in the future
… since I told you this will start, but I'm often wrong about timelines

Julien: That is very useful

Aram: that is useful; two questions
… Does this mean FLoC will or may not run at all in EEA once it hits level of production?
… Does this mean FLEDGE also will not be available in EEA, and does TD not go into production?

Michael: too early to answer any of the forward-looking questions that Aram is asking

Wendy: thanks for raising the question
… our goal is to work towards standards that will be worldwide
… we hope that as experiments continue, that we can work towards proposals with global reach
… any further questions or comments?

Brian: I know you said you are still working things out with FLoC
… any ballpark for when the origin trials will get started?

Michael: I think we'll turn it on in couple of weeks; but that is not useful info until it happens

Wendy: So we will see it when we see it; assume there will be an announcement

Brian: can we expect an explainer?
… you said you will put together an explainer for the origin trials and wondered if that will also be in a couple of weeks?

Michael: yes, we will have info about origin trial when it is ready

Wendy: we are at end of our time
… we can return to the dashboard to see if other issues are gaining steam

<wseltzer> https://w3c.github.io/web-advertising/dashboard/

Wendy: we certainly have had a number of different proposals for consideration
… we may want to bring back some of those
… or bring back conversations from other side meetings here
… for today, we are at the end of our time
… we will adjourn now and see you next week

<wseltzer> [adjourned]

Wendy: thanks for all the discussion

Minutes manually created (not a transcript), formatted by scribe.perl version 127 (Wed Dec 30 17:39:58 2020 UTC).

Maybe present: Aram, Brian, GangWang, James, Julien, Marcal, Mark, Mehul, Michael, Michael_Kleber, Valentino, Wendy