Difference between revisions of "FLoC"

From Bitnami MediaWiki
Line 1: Line 1:
The goal of Google's Federated Learning of Cohorts (FLoC) is to reduce organizations' ability to access [[Audience|audience]] membership across different publishers.<ref>https://https://github.com/WICG/floc</ref>
+
== Overview ==
  
Google's FLoC proposal collects and processes web client behavior across various web sites to assign each web client to a [[cohort]] cluster. A cohort identifier, which Google calls “flock,” is a short enough string (e.g., “43A7”) that it cannot – even with other data – be used to uniquely identify a particular device. On each request the web client will send the cohort ID Google has assigned to this web client using the "Sec-CH-Flock" header.
+
The goal of Google's Federated Learning of Cohorts (FLoC) is to reduce organizations' ability to access accurate [[Audience|audience]] membership across different publishers.<ref>https://https://github.com/WICG/floc</ref>
  
The minimum number of people in a cohort is likely in the thousands.<ref>https://iabtechlab.com/blog/explaining-the-privacy-sandbox-explainers</ref> To ensure that a minimum number of web clients is in each cohort, the web client identifier is sent to a Google controlled server to enable distinct counting.  
+
Google's FLoC proposal collects and processes web client behavior across various web sites to assign each web client to a single [[cohort]] cluster.<ref>https://privacysandbox.com/proposals/floc</ref> A cohort identifier, which Google calls “flock,” is a short enough string (e.g., “43A7”) that it cannot – even with other data – be used to uniquely identify a particular device. During early trials, on each request the web client sent the cohort ID Google had assigned to it using the "Sec-CH-Flock" header. During current trials, FLoC membership and the version of the algorithm used to assign membership are returned by the FLoC API.<ref>https://web.dev/floc</ref>
  
Marketers could measure which cohorts interact with their content, but would not know distinct reach or frequency. Marketers would also presumably be prohibited from associating particular outcomes on their web property with the cohort associated with the user's web client.  
+
The minimum number of people in a cohort is likely in the thousands.<ref>https://iabtechlab.com/blog/explaining-the-privacy-sandbox-explainers</ref> The original FLoC trial used an 8-bit cohort audience ID, which meant each browser would be assigned to only one of 256 possible audience segments. Google has since expanded this to a 50-bit cohort audience ID, under which each browser can be assigned to only one of 33,872 audience segments.<ref>https://www.eff.org/deeplinks/2021/03/google-testing-its-controversial-new-ad-targeting-tech-millions-browsers-heres</ref>
 +
 
 +
To ensure that a minimum number of web clients is in each cohort, the web client identifier is sent to a Google controlled server to enable distinct counting.  
 +
 
 +
Google will reassign each browser to a FLoC, potentially the same one, once every seven days.<ref>https://privacysandbox.com/proposals/floc</ref>
 +
 
 +
Google began its FLoC trial on 0.5% of Chrome users in certain countries in March 2021.<ref>https://www.thedrum.com/news/2021/01/25/what-the-fk-floc-google-opens-up-post-cookie-roadmap</ref>
  
 
Google published a research paper that compared the accuracy of assigning browser identifiers to cohorts.<ref>https://github.com/google/ads-privacy/blob/master/proposals/FLoC/FLOC-Whitepaper-Google.pdf</ref> Google's findings were that it was able to achieve 70% accuracy in building cohort clusters relative to random assignment, which is still well below the segmentation accuracy of the current industry standard of using cookies for audience creation.  
 
Google published a research paper that compared the accuracy of assigning browser identifiers to cohorts.<ref>https://github.com/google/ads-privacy/blob/master/proposals/FLoC/FLOC-Whitepaper-Google.pdf</ref> Google's findings were that it was able to achieve 70% accuracy in building cohort clusters relative to random assignment, which is still well below the segmentation accuracy of the current industry standard of using cookies for audience creation.  
 +
 +
Google later disclosed to the W3C that people had "interpreted" its widely publicized statement that "marketers should expect to see 95% [effectiveness] when compared to cookie-based advertising"<ref>https://blog.google/products/ads-commerce/2021-01-privacy-sandbox/</ref> incorrectly, given that Google's experiment relied on its standard DSP, which continued to rely on cookies, frequency capping and real-time optimization to achieve these results.<ref>https://www.w3.org/2021/02/02-web-adv-minutes.html</ref> The experiment also did not exclude retargeting, given that other Privacy Sandbox proposals are supposed to enable this audience data to improve marketers' outcomes.<ref>https://www.w3.org/2021/02/02-web-adv-minutes.html</ref>
  
 
== Impact ==
 
== Impact ==
 
By removing audience segmentation, marketers would be unable to perform retargeting and frequency capping.  
 
By removing audience segmentation, marketers would be unable to perform retargeting and frequency capping.  
  
Another potential impact of removing this information is a degraded end user experience.  
+
Marketers may be able to measure which cohorts interact with their content, but would not know distinct reach or frequency. Marketers would also presumably be prohibited from associating particular outcomes on their web property with the cohort associated with the user's web client.
 +
 
 +
Because FLoC membership is reassigned once every seven days, marketers are further inhibited from learning which of the FLoCs they expose to advertising later visit their web properties: for the many products and services with purchase cycles longer than seven days, reassignment may change which FLoCs correlate most strongly with the marketer's own web property.
 +
 
 +
Another potential impact of removing accurate audience segment information is a degraded end user experience.  
  
 
Another impact of cohorts is that they do not support smaller organizations (advertisers or publishers) as their audiences may be too small to meet the minimum threshold set by the cohort.
 
Another impact of cohorts is that they do not support smaller organizations (advertisers or publishers) as their audiences may be too small to meet the minimum threshold set by the cohort.
Line 19: Line 31:
  
 
The lack of any incentive for publishers to transfer audience building solely to Google has also been criticized. <ref>https://github.com/WICG/floc/issues/45</ref>
 
The lack of any incentive for publishers to transfer audience building solely to Google has also been criticized. <ref>https://github.com/WICG/floc/issues/45</ref>
 +
 +
== Regulator Perspectives ==
 +
The UK CMA noted (5.39-5.41) that should Google impair the accuracy of audience segments with its FLoC proposal, this would likely impair the "attractiveness of the open display market," for three reasons. Google's proposal would:
 +
 +
# reduce competitive differentiation and lead to a "homogenization of ad inventory and ad tech services,"
 +
# "reduce the ability of rivals to provide a value proposition" and
 +
# give Google's ad buying solutions, informed by its extensive owned and operated properties, an advantage over rivals even if it migrated its own ad solutions to rely only on FLoC audiences<ref>https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/992975/Notice_of_intention_to_accept_binding_commitments_offered_by_Google_publication.pdf</ref>
 +
 +
Note: Google has made no commitment that it will rely only on FLoC audiences to monetize its own web properties. 
 +
 +
== Perspectives of Trade Body and Advocacy Groups ==
 +
Electronic Frontier Foundation (EFF) has criticized FLoC as being a "terrible idea."<ref>https://www.eff.org/deeplinks/2021/03/googles-floc-terrible-idea</ref>
 +
 +
Because Google relies on unsupervised learning, EFF criticizes Google's design on the grounds that some FLoCs are likely to correspond to sensitive categories, such as members of a protected class whose web behaviors are similar. Google analyzed the synced browsing history of its Chrome users and validated this concern: many FLoCs did express browsing activity that could be used to ascribe a sensitive category label to that FLoC.<ref>https://docs.google.com/a/google.com/viewer?a=v&pid=sites&srcid=Y2hyb21pdW0ub3JnfGRldnxneDo1Mzg4MjYzOWI2MzU2NDgw</ref> Google suggested that by sending such browsing activity to its servers, it could analyze whether individuals who have been assigned to a given FLoC interact more frequently with sensitive category data, so that Google can suppress these FLoCs from being used for advertising.<ref>https://docs.google.com/a/google.com/viewer?a=v&pid=sites&srcid=Y2hyb21pdW0ub3JnfGRldnxneDo1Mzg4MjYzOWI2MzU2NDgw</ref>
 +
 +
EFF also criticizes Google's design given there is little notice or choice provided to people for Google's use of the consumer browser software to generate a business-to-business ads product.<ref>https://www.eff.org/deeplinks/2021/03/google-testing-its-controversial-new-ad-targeting-tech-millions-browsers-heres</ref>
  
 
== Open Questions ==
 
== Open Questions ==
* How frequently does a user's cohort membership change?
+
* What is the value impairment associated with shifting from marketer-defined audiences to Google-defined cohorts, even if frequency capping and real-time optimization were still available?
* What is the value impairment associated with shifting from marketer-defined audiences to Google-defined cohorts?  
+
* How frequently would the algorithms that assign FLoC audiences change?
 +
* What oversight would exist to review whether changes to the FLoC audience creation algorithm are better for users, publishers, marketers and the supply chain vendors they rely upon to operate their businesses?
  
 
== References ==
 
== References ==

Revision as of 00:27, 13 June 2021

Overview

The goal of Google's Federated Learning of Cohorts (FLoC) is to reduce organizations' ability to access accurate audience membership across different publishers.[1]

Google's FLoC proposal collects and processes web client behavior across various web sites to assign each web client to a single cohort cluster.[2] A cohort identifier, which Google calls “flock,” is a short enough string (e.g., “43A7”) that it cannot – even with other data – be used to uniquely identify a particular device. During early trials, on each request the web client sent the cohort ID Google had assigned to it using the "Sec-CH-Flock" header. During current trials, FLoC membership and the version of the algorithm used to assign membership are returned by the FLoC API.[3]

The minimum number of people in a cohort is likely in the thousands.[4] The original FLoC trial used an 8-bit cohort audience ID, which meant each browser would be assigned to only one of 256 possible audience segments. Google has since expanded this to a 50-bit cohort audience ID, under which each browser can be assigned to only one of 33,872 audience segments.[5]
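
The relationship between identifier width and segment count quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it makes no claim about how Chrome actually derives cohort IDs.

```python
def max_segments(bits: int) -> int:
    """Number of distinct values an identifier of the given bit width can take."""
    return 2 ** bits


def min_bits_for(segments: int) -> int:
    """Smallest bit width able to give each of `segments` cohorts a distinct ID."""
    return max(1, (segments - 1).bit_length())


print(max_segments(8))             # 256, matching the original trial
print(min_bits_for(33_872))        # 16
print(33_872 <= max_segments(50))  # True
```

An 8-bit ID yields exactly 256 segments. The 33,872 segments cited for the later trial would fit in 16 bits, so that segment count is far below the ceiling a 50-bit ID would allow.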

To ensure that a minimum number of web clients is in each cohort, the web client identifier is sent to a Google controlled server to enable distinct counting.
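
As a rough illustration of what distinct counting against a minimum cohort size could look like, the sketch below counts unique client identifiers per cohort and keeps only cohorts that meet a threshold. The function name, the data layout and the tiny threshold are assumptions made for the example, not a description of Google's server.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def cohorts_meeting_minimum(
    reports: Iterable[Tuple[str, str]], min_clients: int
) -> Set[str]:
    """Return cohort IDs reported by at least `min_clients` distinct clients.

    `reports` holds (client_id, cohort_id) pairs; repeated reports from the
    same client count once.
    """
    clients_by_cohort: Dict[str, Set[str]] = defaultdict(set)
    for client_id, cohort_id in reports:
        clients_by_cohort[cohort_id].add(client_id)
    return {
        cohort_id
        for cohort_id, clients in clients_by_cohort.items()
        if len(clients) >= min_clients
    }


# Hypothetical reports; a realistic threshold would be in the thousands.
reports = [("a", "43A7"), ("a", "43A7"), ("b", "43A7"), ("c", "9F01")]
print(cohorts_meeting_minimum(reports, min_clients=2))  # {'43A7'}
```

Counting distinct clients rather than raw reports is the reason a client identifier has to reach the counting server at all: without it, one busy browser could make a small cohort look large.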

Google will reassign each browser to a FLoC, potentially the same one, once every seven days.[6]

Google began its FLoC trial on 0.5% of Chrome users in certain countries in March 2021.[7]

Google published a research paper that compared the accuracy of assigning browser identifiers to cohorts.[8] Google's findings were that it was able to achieve 70% accuracy in building cohort clusters relative to random assignment, which is still well below the segmentation accuracy of the current industry standard of using cookies for audience creation.

Google later disclosed to the W3C that people had "interpreted" its widely publicized statement that "marketers should expect to see 95% [effectiveness] when compared to cookie-based advertising"[9] incorrectly, given that Google's experiment relied on its standard DSP, which continued to rely on cookies, frequency capping and real-time optimization to achieve these results.[10] The experiment also did not exclude retargeting, given that other Privacy Sandbox proposals are supposed to enable this audience data to improve marketers' outcomes.[11]

Impact

By removing audience segmentation, marketers would be unable to perform retargeting and frequency capping.

Marketers may be able to measure which cohorts interact with their content, but would not know distinct reach or frequency. Marketers would also presumably be prohibited from associating particular outcomes on their web property with the cohort associated with the user's web client.

Because FLoC membership is reassigned once every seven days, marketers are further inhibited from learning which of the FLoCs they expose to advertising later visit their web properties: for the many products and services with purchase cycles longer than seven days, reassignment may change which FLoCs correlate most strongly with the marketer's own web property.
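
To make the purchase-cycle point concrete, the sketch below counts how many reassignments fall between an ad exposure and a later visit. It assumes, purely for illustration, that the exposure happens immediately after a reassignment, so the figures are lower bounds; the function name and sample cycle lengths are hypothetical.

```python
REASSIGNMENT_INTERVAL_DAYS = 7  # the seven-day reassignment described above


def reassignments_during(purchase_cycle_days: int) -> int:
    """Count cohort reassignments between ad exposure and purchase,
    assuming exposure occurs right after a reassignment."""
    return purchase_cycle_days // REASSIGNMENT_INTERVAL_DAYS


for cycle_days in (3, 7, 30, 90):
    print(cycle_days, reassignments_during(cycle_days))
```

Under that assumption a 30-day purchase cycle spans four reassignments, and each one may move the browser to a different FLoC than the one that was exposed.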

Another potential impact of removing accurate audience segment information is a degraded end user experience.

Another impact of cohorts is that they do not support smaller organizations (advertisers or publishers) as their audiences may be too small to meet the minimum threshold set by the cohort.

The philosophy behind FLoC's unsupervised clustering has also been criticized as revealing sensitive information, such as which people frequent sites related to protected health conditions or other protected classes of information. "The web currently has more than 1.2 billion sites (including parked domains). It is impractical for even a large browser developer to test for which patterns of usage of which sites are inadvertently revealing sensitive information about a user." [12]

The lack of any incentive for publishers to transfer audience building solely to Google has also been criticized. [13]

Regulator Perspectives

The UK CMA noted (5.39-5.41) that should Google impair the accuracy of audience segments with its FLoC proposal, this would likely impair the "attractiveness of the open display market," for three reasons. Google's proposal would:

  1. reduce competitive differentiation and lead to a "homogenization of ad inventory and ad tech services,"
  2. "reduce the ability of rivals to provide a value proposition" and
  3. give Google's ad buying solutions, informed by its extensive owned and operated properties, an advantage over rivals even if it migrated its own ad solutions to rely only on FLoC audiences[14]

Note: Google has made no commitment that it will rely only on FLoC audiences to monetize its own web properties.

Perspectives of Trade Body and Advocacy Groups

Electronic Frontier Foundation (EFF) has criticized FLoC as being a "terrible idea."[15]

Because Google relies on unsupervised learning, EFF criticizes Google's design on the grounds that some FLoCs are likely to correspond to sensitive categories, such as members of a protected class whose web behaviors are similar. Google analyzed the synced browsing history of its Chrome users and validated this concern: many FLoCs did express browsing activity that could be used to ascribe a sensitive category label to that FLoC.[16] Google suggested that by sending such browsing activity to its servers, it could analyze whether individuals who have been assigned to a given FLoC interact more frequently with sensitive category data, so that Google can suppress these FLoCs from being used for advertising.[17]
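
The suppression idea described above can be sketched as a simple comparison of each cohort's sensitive-visit rate against the overall population rate. This is a hypothetical rule written for illustration; the function name, the rates and the threshold are assumptions, and the source does not say this is the test Google applies.

```python
from typing import Dict, Set


def cohorts_to_suppress(
    sensitive_rate_by_cohort: Dict[str, float],
    population_rate: float,
    max_excess: float,
) -> Set[str]:
    """Return cohorts whose sensitive-visit rate exceeds the population rate
    by more than `max_excess` (all rates are fractions between 0 and 1)."""
    return {
        cohort_id
        for cohort_id, rate in sensitive_rate_by_cohort.items()
        if rate - population_rate > max_excess
    }


# Hypothetical per-cohort rates of visits to sensitive-category sites.
rates = {"43A7": 0.02, "9F01": 0.15, "1B2C": 0.05}
print(sorted(cohorts_to_suppress(rates, population_rate=0.03, max_excess=0.05)))
# ['9F01']
```

Any rule of this kind needs per-cohort browsing data to be available to whoever runs the comparison, which is why the approach depends on browsing activity being sent to Google's servers.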

EFF also criticizes Google's design given there is little notice or choice provided to people for Google's use of the consumer browser software to generate a business-to-business ads product.[18]

Open Questions

  • What is the value impairment associated with shifting from marketer-defined audiences to Google-defined cohorts, even if frequency capping and real-time optimization were still available?
  • How frequently would the algorithms that assign FLoC audiences change?
  • What oversight would exist to review whether changes to the FLoC audience creation algorithm are better for users, publishers, marketers and the supply chain vendors they rely upon to operate their businesses?

References