Scaup
The goal of Google's Scaup is to provide marketers with look-alike modeling capabilities without giving them access to event data.[1]
Google's Scaup proposal relies on Multi-Party Computation (MPC): the web client sends an encrypted version of its personal information to two trusted servers, which build models and store the model outputs for later application. Each of the MPC servers must store historical data to be used as input to the modeling. However, under the current design, the MPC servers are not given complete information.
Under MPC, so long as one of these servers is trusted to be accountable, it is guaranteed that any failure by the other server to be "honest" in its calculations can be detected. Here, "honest" means that the result is mathematically accurate.
The web client periodically queries the trusted server to learn whether it belongs to a look-alike model. To receive the correct audience attribute membership, the web client must send the same information that was used to train the model. Thus the web client sends its history rather than an identifier.
Under this proposal, the marketer is allowed to push the machine learning model to the trusted server. The trusted server instructs the browser to store the events and features that serve as inputs to the machine learning model.
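The idea that two servers can each hold data without either holding complete information can be illustrated with additive secret sharing, a common MPC building block. The sketch below is only an illustration under stated assumptions: the source does not specify Scaup's actual encoding, and the modulus, function names, and sample event counts are all hypothetical.

```python
import secrets

# Assumed parameter: a large prime modulus for the shares (not taken from the proposal).
PRIME = 2**61 - 1


def split(value: int) -> tuple[int, int]:
    """Split a value into two additive shares modulo PRIME.

    Each share on its own is a uniformly random number, so a single
    server learns nothing about the value from the share it holds.
    """
    share_a = secrets.randbelow(PRIME)
    share_b = (value - share_a) % PRIME
    return share_a, share_b


def reconstruct(share_a: int, share_b: int) -> int:
    """Recombine two shares; this requires both servers' data."""
    return (share_a + share_b) % PRIME


# Hypothetical per-feature event counts held by the web client.
history = [3, 0, 7]
shares = [split(v) for v in history]
to_server_a = [a for a, _ in shares]  # what the first server would store
to_server_b = [b for _, b in shares]  # what the second server would store

recovered = [reconstruct(a, b) for a, b in zip(to_server_a, to_server_b)]
```

In this sketch, the original history can be recovered only by combining what both servers hold, which is one way the property of incomplete information at each server could be achieved.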
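As a rough illustration of how a membership query could work without either server seeing the client's full history, the sketch below scores a simple linear model over additively shared features. Everything here is an assumption made for illustration: the source does not say what kind of model marketers push, how features are encoded, or which party applies the threshold, and the names share_features, server_partial_score, and client_membership are hypothetical.

```python
import secrets

# Assumed parameter: a large prime modulus for the shares (not taken from the proposal).
PRIME = 2**61 - 1


def share_features(features):
    """Client side: split each integer feature into two additive shares."""
    shares_a = [secrets.randbelow(PRIME) for _ in features]
    shares_b = [(f - a) % PRIME for f, a in zip(features, shares_a)]
    return shares_a, shares_b


def server_partial_score(weights, feature_shares):
    """Server side: apply the marketer's (hypothetical) linear model to one set of shares."""
    return sum(w * s for w, s in zip(weights, feature_shares)) % PRIME


def client_membership(partial_a, partial_b, threshold):
    """Combine the two partial scores and compare against a threshold."""
    total = (partial_a + partial_b) % PRIME
    if total > PRIME // 2:  # map the upper half of the range back to negative scores
        total -= PRIME
    return total >= threshold


# Hypothetical integer model weights and threshold pushed by a marketer.
weights = [2, -1, 3]
threshold = 20


def query(features):
    shares_a, shares_b = share_features(features)
    partial_a = server_partial_score(weights, shares_a)
    partial_b = server_partial_score(weights, shares_b)
    return client_membership(partial_a, partial_b, threshold)


in_audience = query([3, 0, 7])   # score 2*3 - 0 + 3*7 = 27
not_in_audience = query([1, 5, 0])  # score 2*1 - 5 + 0 = -3
```

The sketch works because a linear score distributes over the shares: the two partial scores sum to the score of the unshared features modulo the prime. A non-linear model would need a more involved protocol.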
Scaup still relies on Turtledove auctions that keep audience-based buying separate from contextual buying.
Impact
Prospecting is a critical activity for marketers. Prospecting chooses audiences that are believed to be more likely to become new customers. Because the proposal centralizes control over this important function, marketers will have less choice among the vendors they can use.
Another potential impact is a diminished user experience, given the time delay before the look-alike modeling process assigns a given web client eligibility for a new audience attribute.
Scalability is limited because web clients cannot process or store the information needed to support the multiple vendors that marketers rely upon to generate audiences for each marketer brand. The bandwidth and battery costs of audience creation on portable devices are related issues.
See Turtledove impacts for auction-related issues.
Open Questions
- Given that each marketer would like to generate different look-alike models for different products, how much data must be sent to and stored on the web client?
- What is the process for determining which organizations' servers are trusted?