Sushi
Adobe’s Suggested and User-Specified Hierarchical Interests (Sushi) is centered around organizations assigning a set of hierarchical topics of interest that can later be used in selecting relevant ads.[1] Restricting topics to a fixed hierarchy is meant to give users greater transparency about which audience interests are used to match content. Sushi relies on pre-defined rules to rank which attributes are made available to decisioning logic.
Adobe’s proposal shares with Google’s the goal of preventing marketers from engaging a particular audience in a particular context.
Overview
The Sushi proposal separates the assignment of topics from the later inference logic to guess what a user might be interested in. For example, a user may visit multiple sites and be assigned to four separate interest topics from the taxonomy:
- sports→baseball→pros→Giants
- sports→football→pros→49ers
- sports→football→college→Stanford
- sports→football→college→Berkeley
“In this case sports has been suggested four times, football three times and college football twice ("pros" isn't counted as occurring twice, because its two occurrences are in different subtrees of the hierarchy).”[2] The Sushi proposal suggests the web client should control the logic that relies on the recency and frequency of these topics to determine which nodes of the taxonomy are eligible for content matching.
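The counting rule in the quoted example can be illustrated with a short sketch. It assumes each suggested topic is stored as a path from the taxonomy root and keys every node by its full path, so that "pros" under baseball and "pros" under football stay distinct. The function name `count_suggestions` and the tuple representation are illustrative assumptions, not part of the proposal.

```python
from collections import Counter

def count_suggestions(paths):
    """Count how often each taxonomy node is suggested.

    Each node is keyed by its full path from the root, so nodes that share
    a label but sit in different subtrees are counted separately.
    """
    counts = Counter()
    for path in paths:
        for depth in range(1, len(path) + 1):
            counts[tuple(path[:depth])] += 1
    return counts

suggested = [
    ("sports", "baseball", "pros", "Giants"),
    ("sports", "football", "pros", "49ers"),
    ("sports", "football", "college", "Stanford"),
    ("sports", "football", "college", "Berkeley"),
]

counts = count_suggestions(suggested)
print(counts[("sports",)])                        # 4
print(counts[("sports", "football")])             # 3
print(counts[("sports", "football", "college")])  # 2
print(counts[("sports", "football", "pros")])     # 1
```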
Sushi suggests the look-back window of prior activity ought to be “at least 30 days (unless cleared by the user)”.[3] Sushi suggests that for seasonal topics, such as sports leagues or holidays, the interest should persist until the next season, rather than having a cold start to relearn such an interest each season.
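One possible reading of this retention rule is a per-topic expiry check. The sketch below assumes a 30-day default window and a hypothetical `next_season_start` date attached to seasonal topics. It also assumes that a seasonal topic is kept for one further window after the season begins so that it can be refreshed. None of these field names or expiry details come from the proposal.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

LOOK_BACK = timedelta(days=30)  # assumption: the proposal's minimum window

@dataclass
class TopicRecord:
    path: tuple
    last_seen: date
    # Hypothetical field: when the next season starts, for seasonal topics.
    next_season_start: Optional[date] = None

def is_retained(record: TopicRecord, today: date) -> bool:
    """Return True if the topic should still be kept on the given date."""
    if today - record.last_seen <= LOOK_BACK:
        return True
    if record.next_season_start is not None:
        # Assumed rule: keep seasonal topics through the start of the next
        # season plus one look-back window, avoiding a cold start.
        return today <= record.next_season_start + LOOK_BACK
    return False
```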
To limit use of the user’s local storage for this B2B use case, Sushi suggests a budgeting method that caps the number of suggestions each organization can contribute. Thus, if one organization submits “five topics, while another only suggests one, the single topic might get five times as much weight as any of the five.” The Sushi proposal suggests that this weighting ought to also act as a hint about how strong the organization believes the user interest to be, such as “assigning a weight of 50% to the primary topic, 20% to a secondary topic and 10% to each of the remaining topics.”[4] Note: this proposal does not specify how an organization contributing only one suggested topic may limit the weighting if it is not as confident in the strength of the predicted interest (e.g., assuming a 20% vs 80% affinity).
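The budgeting idea can be sketched as a simple normalization, assuming each organization has a total budget of 1.0 that is split across its suggestions in proportion to its hints. The function name `normalize_budget` and the budget value are illustrative assumptions.

```python
def normalize_budget(hints):
    """Scale one organization's per-topic hints so they sum to a budget of 1.0.

    `hints` maps a topic to a non-negative relative weight.
    """
    total = sum(hints.values())
    if total <= 0:
        raise ValueError("at least one hint must be positive")
    return {topic: weight / total for topic, weight in hints.items()}

# Equal hints: five topics from one organization, one topic from another.
org_a = normalize_budget({f"topic{i}": 1 for i in range(1, 6)})
org_b = normalize_budget({"topic6": 1})
ratio = org_b["topic6"] / org_a["topic1"]  # about 5: the single topic outweighs each of the five

# Hinted weights from the proposal's example: 50%, 20%, then 10% each.
org_c = normalize_budget(
    {"primary": 50, "secondary": 20, "third": 10, "fourth": 10, "fifth": 10}
)
```

Under this normalization a lone suggestion always receives the full budget, whatever the organization's confidence, which is the gap the note in the preceding paragraph describes.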
To handle the disclosure of sensitive proprietary information for each organization, Sushi states “suggested interests should remain exclusive to the advertiser until the same topic is suggested by a sufficient number of different advertisers and/or publishers.”[5]
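A minimal sketch of this exclusivity rule follows, assuming a topic stays private to the organizations that suggested it until a threshold number of distinct organizations have suggested it. The proposal does not fix a number, so `MIN_ORGS` is hypothetical, as are the class and method names.

```python
from collections import defaultdict

MIN_ORGS = 3  # hypothetical threshold; the proposal does not specify one

class SuggestionLedger:
    """Tracks which organizations suggested which topics for one user."""

    def __init__(self, min_orgs=MIN_ORGS):
        self.min_orgs = min_orgs
        self._orgs_by_topic = defaultdict(set)

    def suggest(self, org, topic):
        self._orgs_by_topic[topic].add(org)

    def visible_to(self, org):
        """Topics this organization may use: its own suggestions, plus any
        topic suggested by at least `min_orgs` distinct organizations."""
        return {
            topic
            for topic, orgs in self._orgs_by_topic.items()
            if org in orgs or len(orgs) >= self.min_orgs
        }
```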
Sushi suggests that, in addition to generating audience interests, a new icon should be displayed in the URL/bookmarks area of the web client when this information is used to match content. When clicked, this icon should show:
- Whether Sushi was used in matching content on the page
- The top-most topics inferred to be of interest to the user, regardless of the current page
- The current contextual topic of the page, regardless of whether it was used to match the content on the page
- Any topics sent to the content decisioning system to match content on the page
Moreover, Sushi suggests that the user should be able to see which URLs assigned which interest topics. Thus, if the user was assigned the topic “sports→football,” the user should see the history of all sites that were used as inputs into the behavioral classification process. If the user disagrees with the inference, then the user should be able to object, which would then block further assignment in the future.
Sushi suggests that the user should be able to manually self-assign interest topics from the same taxonomy. Such self-assigned topics should be given higher weighting by the web client's behavioral classification logic. The assignment of or objection to an interest at a parent node should be inherited by all child nodes in the taxonomy. Sushi suggests that users could optionally specify a different expiration of interest topics on a per-node basis.
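The inheritance rule can be sketched as a prefix test on taxonomy paths. The sketch assumes that an objection takes precedence over a self-assignment when both apply, and it uses a hypothetical multiplier, `SELF_ASSIGNED_BOOST`, for self-assigned topics. The proposal specifies neither.

```python
SELF_ASSIGNED_BOOST = 2.0  # hypothetical multiplier for self-assigned topics

def covered_by(path, prefixes):
    """True if `path` equals, or is a descendant of, any path in `prefixes`."""
    return any(path[:len(prefix)] == prefix for prefix in prefixes)

def effective_weight(path, base_weight, self_assigned, blocked):
    """Apply user objections and self-assignments, inherited from parents."""
    if covered_by(path, blocked):
        return 0.0  # assumption: an objection always wins
    if covered_by(path, self_assigned):
        return base_weight * SELF_ASSIGNED_BOOST
    return base_weight

blocked = {("sports", "football")}
self_assigned = {("travel",)}
```

With these sets, `("sports", "football", "college")` is blocked through its parent, while `("sports",)` itself is unaffected because objections flow down the hierarchy, not up.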
When signaling the interest topics to content matching systems, Sushi suggests that a random subset of all interest topics be sent, along with an indication of which were self-assigned or assigned by the current page. A trusted server would keep track of which combinations are so rare that this array might be used as a unique identifier by the content decisioning system. Sushi suggests that weightings of interest be used to help determine which interest topics are sent, such that higher interests are sent more frequently than lower interests.
Sushi states that the IAB taxonomy[6] is not detailed enough to support the interest classification required for this proposal. As one approach, each organization might submit custom nodes that are more granular than the standard taxonomy.
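The selection step might look like the following sketch, which draws a weighted random subset without replacement and then applies a rarity check of the kind a trusted server could perform. The function names, the `min_users` threshold and the shape of `combination_counts` are all assumptions made for illustration.

```python
import random

def sample_topics(weights, k, rng=None):
    """Pick up to k distinct topics, favoring those with higher weights."""
    rng = rng or random.Random()
    remaining = {topic: w for topic, w in weights.items() if w > 0}
    chosen = []
    while remaining and len(chosen) < k:
        topics = list(remaining)
        pick = rng.choices(topics, weights=[remaining[t] for t in topics], k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # sample without replacement
    return chosen

def is_safe_to_send(topics, combination_counts, min_users):
    """Hypothetical trusted-server check: reject combinations shared by too
    few users, since a rare combination could act as an identifier."""
    return combination_counts.get(frozenset(topics), 0) >= min_users
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible for testing; a real client would presumably use fresh randomness on each request.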
Impact
Similar to Fenced Frames and Turtledove/Fledge, Sushi states that content matching systems should not be able to learn which ads were correlated to which conversion events. This will impair marketer effectiveness and negatively impact publisher revenues.
Not all interest attributes are equally valuable to all marketers. Restricting the list of available attributes, or ranking them on criteria independent of publisher monetization, will further impair publisher revenues.
Sushi's proposed weighting mechanism confuses marketers' estimated probability of accuracy with value. As mentioned above, if a marketer can guess only a single attribute (interest "topic") at 20% vs 80% accuracy, this 4x difference in accuracy has no direct relationship to the weighting that results from another marketer submitting four times as many topics.
The Sushi proposal requires storage of all domains linked to the generation of each attribute, which will use more client storage than is needed at present. This forces people to absorb the cost of B2B storage and processing, which may require them to update older devices to continue to access ad-funded sites.
Open Questions
- How will any interests scale if each organization wants its topic information to remain proprietary?
- As there are millions of organizations, if each assigns custom nodes, what is the impact on local storage or on the interest classification process?
- How do different organizations that would prefer different rule sets for assigning interest classifications provide feedback to a centralized behavioral interest classification system (e.g., a minimum of 4 events rather than 3 to be eligible, or a 7-day rather than 30-day look-back window for recency)?
- How will machine learning inside the web client improve its classification of interest inferences across different browsers signed into a single domain by the same individual to ensure they are improving publisher monetization and marketer ROAS?
- How will the machine learning of organizations that assign interest topics learn to improve the hints they assign to each topic (e.g., that the first topic should receive 50% rather than 20%)?
- How will organizations learn which interests a user has blocked, so they stop wasting their limited budget on trying to assign those interests to those users?
- Should self-assigned interests expire at the same rate as system-assigned topics by default?
See Also
References
- [1] https://github.com/privacycg/proposals/issues/27
- [2] https://github.com/privacycg/proposals/issues/27
- [3] https://github.com/privacycg/proposals/issues/27
- [4] https://github.com/privacycg/proposals/issues/27
- [5] https://github.com/privacycg/proposals/issues/27
- [6] https://www.iab.com/guidelines/content-taxonomy/