強化学習最適化

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Authors: Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn | Published: 2023-05-29 | Updated: 2024-07-29
アライメント
報酬メカニズム設計
強化学習最適化

RRHF: Rank Responses to Align Language Models with Human Feedback without tears

Authors: Zheng Yuan, Hongyi Yuan, Chuanqi Tan, Wei Wang, Songfang Huang, Fei Huang | Published: 2023-04-11 | Updated: 2023-10-07
アライメント
報酬メカニズム設計
強化学習最適化

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Authors: Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, Nicholas Joseph, Saurav Kadavath, Jackson Kernion, Tom Conerly, Sheer El-Showk, Nelson Elhage, Zac Hatfield-Dodds, Danny Hernandez, Tristan Hume, Scott Johnston, Shauna Kravec, Liane Lovitt, Neel Nanda, Catherine Olsson, Dario Amodei, Tom Brown, Jack Clark, Sam McCandlish, Chris Olah, Ben Mann, Jared Kaplan | Published: 2022-04-12
アライメント
強化学習最適化
性能評価

Zeno: Distributed Stochastic Gradient Descent with Suspicion-based Fault-tolerance

Authors: Cong Xie, Oluwasanmi Koyejo, Indranil Gupta | Published: 2018-05-25 | Updated: 2019-05-18
強化学習最適化
損失関数
線形モデル

A Study on Overfitting in Deep Reinforcement Learning

Authors: Chiyuan Zhang, Oriol Vinyals, Remi Munos, Samy Bengio | Published: 2018-04-18 | Updated: 2018-04-20
トレーニング手法
一般化性能
強化学習最適化