Reward Mechanism Design

Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security

Authors: Muzhi Dai, Shixuan Liu, Zhiyuan Zhao, Junyu Gao, Hao Sun, Xuelong Li | Published: 2025-07-29

Reward Mechanism Design

Reinforcement Learning Optimization

Defense Methods

2025.07.29

Literature Database

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Authors: Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn | Published: 2023-05-29 | Updated: 2024-07-29

Alignment

Reward Mechanism Design

Reinforcement Learning Optimization

2023.05.29 2025.04.03

Literature Database

RRHF: Rank Responses to Align Language Models with Human Feedback without tears

Authors: Zheng Yuan, Hongyi Yuan, Chuanqi Tan, Wei Wang, Songfang Huang, Fei Huang | Published: 2023-04-11 | Updated: 2023-10-07

Alignment

Reward Mechanism Design

Reinforcement Learning Optimization

2023.04.11 2025.04.03

Literature Database

IronForge: An Open, Secure, Fair, Decentralized Federated Learning

Authors: Guangsheng Yu, Xu Wang, Caijun Sun, Qin Wang, Ping Yu, Wei Ni, Ren Ping Liu, Xiwei Xu | Published: 2023-01-07

Privacy-Preserving Techniques

Prompt Injection

Reward Mechanism Design

2023.01.07 2025.04.03

Literature Database

Ares: A System-Oriented Wargame Framework for Adversarial ML

Authors: Farhan Ahmed, Pratik Vaishnavi, Kevin Eykholt, Amir Rahmati | Published: 2022-10-24

Poisoning

Reward Mechanism Design

Evaluation Methods

2022.10.24 2025.04.03

Literature Database

Blockchain and Machine Learning for Fraud Detection: A Privacy-Preserving and Adaptive Incentive Based Approach

Authors: Tahmid Hasan Pranto, Kazi Tamzid Akhter Md Hasib, Tahsinur Rahman, AKM Bahalul Haque, A. K. M. Najmul Islam, Rashedur M. Rahman | Published: 2022-10-23

Integration of Blockchain and FL

Fraudulent Transactions

Reward Mechanism Design

2022.10.23 2025.04.03

Literature Database

Reinforcement Learning for Hardware Security: Opportunities, Developments, and Challenges

Authors: Satwik Patnaik, Vasudev Gohil, Hao Guo, Jeyavijayan Rajendran | Published: 2022-08-29

Reward Mechanism Design

Optimization Problems

Machine Learning Techniques

2022.08.29 2025.04.03

Literature Database

Understanding the Limits of Poisoning Attacks in Episodic Reinforcement Learning

Authors: Anshuka Rangi, Haifeng Xu, Long Tran-Thanh, Massimo Franceschetti | Published: 2022-08-29

Cyberattacks

Reward Mechanism Design

Optimization Problems

2022.08.29 2025.04.03

Literature Database

Dual-Mandate Patrols: Multi-Armed Bandits for Green Security

Authors: Lily Xu, Elizabeth Bondi, Fei Fang, Andrew Perrault, Kai Wang, Milind Tambe | Published: 2020-09-14 | Updated: 2024-04-26

Reward Mechanism Design

Performance Evaluation Metrics

Selection and Evaluation of Optimization Algorithms

2020.09.14 2025.04.03

Literature Database

Enhanced Adversarial Strategically-Timed Attacks against Deep Reinforcement Learning

Authors: Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Yi Ouyang, I-Te Danny Hung, Chin-Hui Lee, Xiaoli Ma | Published: 2020-02-20

Reward Mechanism Design

Vulnerability Prediction

Defense Methods

2020.02.20 2025.04.03

Literature Database