A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits


arxiv

AI Security Portal bot

Information in this literature database is collected automatically.

Source

https://arxiv.org/abs/2202.01850

PDF

https://arxiv.org/pdf/2202.01850

Bibliographic Information

Authors: Ilija Bogunovic; Zihan Li; Andreas Krause; Jonathan Scarlett
Published: 2022-02-04
Updated: 2022-03-29
Affiliation: ETH Zurich
Country of affiliation: Switzerland
Conference: Conference on Neural Information Processing Systems (NeurIPS)

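AI-Estimated Labels

Robustness evaluation; Algorithm design; Convergence analysis

* These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".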

Abstract

We consider the sequential optimization of an unknown, continuous, and expensive to evaluate reward function, from noisy and adversarially corrupted observed rewards. When the corruption attacks are subject to a suitable budget $C$ and the function lives in a Reproducing Kernel Hilbert Space (RKHS), the problem can be posed as corrupted Gaussian process (GP) bandit optimization. We propose a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants. Our algorithm, Robust GP Phased Elimination (RGP-PE), successfully balances robustness to corruptions with exploration and exploitation such that its performance degrades minimally in the presence (or absence) of adversarial corruptions. When $T$ is the number of samples and $\gamma_T$ is the maximal information gain, the corruption-dependent term in our regret bound is $O(C \gamma_T^{3/2})$, which is significantly tighter than the existing $O(C \sqrt{T \gamma_T})$ for several commonly-considered kernels. We perform the first empirical study of robustness in the corrupted GP bandit setting, and show that our algorithm is robust against a variety of adversarial attacks.
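
To give a rough feel for the two corruption-dependent terms quoted in the abstract, the sketch below compares $C \gamma_T^{3/2}$ with $C \sqrt{T \gamma_T}$ numerically. It is illustrative only and not taken from the paper: it assumes the commonly cited growth rate $\gamma_T \sim (\log T)^{d+1}$ for the squared exponential kernel in dimension $d$, drops all constants hidden by the $O(\cdot)$ notation, and uses hypothetical function names.

```python
import math


def gamma_se(T: int, d: int = 1) -> float:
    """Assumed information-gain growth for the SE kernel: (log T)^(d+1).

    Constants are dropped, so only the trend in T is meaningful.
    """
    return math.log(T) ** (d + 1)


def corruption_term_new(C: float, T: int, d: int = 1) -> float:
    """Corruption-dependent term C * gamma_T^(3/2) from the abstract."""
    return C * gamma_se(T, d) ** 1.5


def corruption_term_prior(C: float, T: int, d: int = 1) -> float:
    """Earlier corruption-dependent term C * sqrt(T * gamma_T)."""
    return C * math.sqrt(T * gamma_se(T, d))


def ratio(T: int, d: int = 1, C: float = 1.0) -> float:
    """New term divided by prior term; algebraically gamma_T / sqrt(T)."""
    return corruption_term_new(C, T, d) / corruption_term_prior(C, T, d)


if __name__ == "__main__":
    # Print how the ratio changes with the horizon T (d = 1, constants ignored).
    for T in (10**2, 10**4, 10**6, 10**8):
        print(f"T = {T:>9d}   new/prior = {ratio(T):.4f}")
```

Under these assumptions the ratio simplifies to $\gamma_T / \sqrt{T}$ and does not depend on $C$. It can exceed 1 for small $T$ and shrinks as $T$ grows, which matches the abstract's claim only in the asymptotic sense, since the true constants are not modeled here.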