A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits

Authors: Ilija Bogunovic, Zihan Li, Andreas Krause, Jonathan Scarlett | Published: 2022-02-03 | Updated: 2022-03-28

2022.02.032025.04.03

Authors: Ilija Bogunovic, Zihan Li, Andreas Krause, Jonathan Scarlett
Published: 2022-02-03 | Updated: 2022-03-28

Source: https://arxiv.org/abs/2202.01850

PDF: https://arxiv.org/pdf/2202.01850

AIにより推定されたラベル

ロバスト性評価アルゴリズム設計収束分析

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

We consider the sequential optimization of an unknown, continuous, and expensive to evaluate reward function, from noisy and adversarially corrupted observed rewards. When the corruption attacks are subject to a suitable budget C and the function lives in a Reproducing Kernel Hilbert Space (RKHS), the problem can be posed as corrupted Gaussian process (GP) bandit optimization. We propose a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants. Our algorithm, Robust GP Phased Elimination (RGP-PE), successfully balances robustness to corruptions with exploration and exploitation such that its performance degrades minimally in the presence (or absence) of adversarial corruptions. When T is the number of samples and γ_T is the maximal information gain, the corruption-dependent term in our regret bound is O(Cγ_T^3/2), which is significantly tighter than the existing $O(C \sqrt{T \gamma_T})$ for several commonly-considered kernels. We perform the first empirical study of robustness in the corrupted GP bandit setting, and show that our algorithm is robust against a variety of adversarial attacks.