Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2106.04335

PDF

https://arxiv.org/pdf/2106.04335

Paper Information

Author: Bing-Jing Hsieh;Ping-Chun Hsieh;Xi Liu
Published: 2021-06-08
Affiliation: Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
Country: Taiwan
Conference: Conference on Neural Information Processing Systems (NeurIPS)

Labels Estimated by AI

Optimization Methods; Reinforcement Learning; Machine Learning

These labels were automatically added by AI and may be inaccurate.

Abstract

Bayesian optimization (BO) conventionally relies on handcrafted acquisition functions (AFs) to sequentially determine the sample points. However, it has been widely observed in practice that the best-performing AF in terms of regret can vary significantly across different types of black-box functions. It remains a challenge to design a single AF that attains the best performance over a wide variety of black-box functions. This paper tackles this challenge from the perspective of reinforced few-shot AF learning (FSAF). Specifically, we first connect the notion of AFs with Q-functions and view a deep Q-network (DQN) as a surrogate differentiable AF. While combining DQN with an existing few-shot learning method is a natural idea, we identify that such a direct combination does not perform well due to severe overfitting, which is particularly critical in BO because BO needs a versatile sampling policy. To address this, we present a Bayesian variant of DQN with the following three features: (i) It learns a distribution of Q-networks as AFs based on the Kullback-Leibler regularization framework. This inherently provides the uncertainty required in sampling for BO and mitigates overfitting. (ii) For the prior of the Bayesian DQN, we propose to use a demo policy induced by an off-the-shelf AF for better training stability. (iii) On the meta-level, we leverage the meta-loss of Bayesian model-agnostic meta-learning, which serves as a natural companion to the proposed FSAF. Moreover, with the proper design of the Q-networks, FSAF is general-purpose in that it is agnostic to the dimension and the cardinality of the input domain. Through extensive experiments, we demonstrate that FSAF achieves comparable or better regret than the state-of-the-art benchmarks on a wide variety of synthetic and real-world test functions.
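The idea of treating an acquisition function as a Q-function can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a zero-mean Gaussian-process surrogate with an RBF kernel and uses closed-form expected improvement as a stand-in for the learned Q-network; the names `rbf_kernel`, `gp_posterior`, `expected_improvement`, and `select_next`, along with the choice of per-point features (posterior mean, posterior standard deviation, best observation so far), are hypothetical and chosen only for illustration.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-6):
    """Zero-mean GP posterior mean and standard deviation at the candidates."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_oc = rbf_kernel(x_obs, x_cand)
    solved = np.linalg.solve(k_oo, k_oc)              # K^{-1} k(X, x*), shape (n, m)
    mean = solved.T @ y_obs
    var = 1.0 - np.einsum("ij,ij->j", k_oc, solved)   # prior variance is 1 for RBF
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Closed-form EI for maximization; a stand-in for a learned Q-network."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def select_next(x_obs, y_obs, x_cand, q_fn=expected_improvement):
    """Greedy policy: score every candidate independently, pick the argmax."""
    mean, std = gp_posterior(x_obs, y_obs, x_cand)
    scores = q_fn(mean, std, y_obs.max())
    return int(np.argmax(scores)), scores

# Toy usage on a 2-D quadratic (an assumed test function, not from the paper).
rng = np.random.default_rng(0)
objective = lambda x: -np.sum((x - 0.3) ** 2, axis=1)
x_obs = rng.random((4, 2))
y_obs = objective(x_obs)
x_cand = rng.random((50, 2))
idx, scores = select_next(x_obs, y_obs, x_cand)
```

Because the scorer sees only per-candidate features rather than raw coordinates, the same `q_fn` applies unchanged to inputs of any dimension and to candidate sets of any size, which is one way to read the abstract's claim of being agnostic to the dimension and cardinality of the input domain. Swapping `q_fn` for a trained network would leave `select_next` untouched.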