On the Robustness of Epoch-Greedy in Multi-Agent Contextual Bandit Mechanisms

TOP Literature Database On the Robustness of Epoch-Greedy in Multi-Agent Contextual Bandit Mechanisms

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2307.07675

PDF

https://arxiv.org/pdf/2307.07675

Paper Information

Author: Yinglun Xu;Bhuvesh Kumar;Jacob Abernethy
Published: 7-15-2023
Affiliation: University of Illinois at Urbana-Champaign
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Robustness Evaluation Algorithm Design Efficient Resolution of Learning Tasks

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Efficient learning in multi-armed bandit mechanisms such as pay-per-click (PPC) auctions typically involves three challenges: 1) inducing truthful bidding behavior (incentives), 2) using personalization in the users (context), and 3) circumventing manipulations in click patterns (corruptions). Each of these challenges has been studied orthogonally in the literature; incentives have been addressed by a line of work on truthful multi-armed bandit mechanisms, context has been extensively tackled by contextual bandit algorithms, while corruptions have been discussed via a recent line of work on bandits with adversarial corruptions. Since these challenges co-exist, it is important to understand the robustness of each of these approaches in addressing the other challenges, provide algorithms that can handle all simultaneously, and highlight inherent limitations in this combination. In this work, we show that the most prominent contextual bandit algorithm, $\epsilon$-greedy can be extended to handle the challenges introduced by strategic arms in the contextual multi-arm bandit mechanism setting. We further show that $\epsilon$-greedy is inherently robust to adversarial data corruption attacks and achieves performance that degrades linearly with the amount of corruption.