bi-GRPO: Bidirectional Optimization for Jailbreak Backdoor Injection on LLMs | Authors: Wence Ji, Jiancan Wu, Aiying Li, Shuyi Zhang, Junkang Wu, An Zhang, Xiang Wang, Xiangnan He | Published: 2025-09-24 | Tags: Disabling Safety Mechanisms of LLM, Prompt Injection, Generative Model | 2025.09.24 2025.09.26 | Literature Database
Exploring the Secondary Risks of Large Language Models | Authors: Jiawei Chen, Zhengwei Fang, Xiao Yang, Chao Yu, Zhaoxia Yin, Hang Su | Published: 2025-06-14 | Updated: 2025-09-25 | Tags: Indirect Prompt Injection, Prompt leaking, Generative Model | 2025.06.14 2025.09.27
GIFDL: Generated Image Fluctuation Distortion Learning for Enhancing Steganographic Security | Authors: Xiangkun Wang, Kejiang Chen, Yuang Qi, Ruiheng Liu, Weiming Zhang, Nenghai Yu | Published: 2025-04-21 | Tags: Adversarial Learning, Generative Model, Watermarking Technology | 2025.04.21 2025.05.27
Tempest: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search | Authors: Andy Zhou, Ron Arel | Published: 2025-03-13 | Updated: 2025-05-21 | Tags: Disabling Safety Mechanisms of LLM, Attack Method, Generative Model | 2025.03.13 2025.05.27
Mark Your LLM: Detecting the Misuse of Open-Source Large Language Models via Watermarking | Authors: Yijie Xu, Aiwei Liu, Xuming Hu, Lijie Wen, Hui Xiong | Published: 2025-03-06 | Updated: 2025-03-15 | Tags: Digital Watermarking for Generative AI, Generative Model, Watermark Removal Technology | 2025.03.06 2025.05.27
Cost-Effective Hallucination Detection for LLMs | Authors: Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang | Published: 2024-07-31 | Updated: 2024-08-09 | Tags: Hallucination, Detection of Hallucinations, Generative Model | 2024.07.31 2025.05.27
SecretGen: Privacy Recovery on Pre-Trained Models via Distribution Discrimination | Authors: Zhuowen Yuan, Fan Wu, Yunhui Long, Chaowei Xiao, Bo Li | Published: 2022-07-25 | Tags: Privacy Classification, Privacy Leakage, Generative Model | 2022.07.25 2025.05.28
Generative Models for Security: Attacks, Defenses, and Opportunities | Authors: Luke A. Bauer, Vincent Bindschaedler | Published: 2021-07-21 | Updated: 2021-07-29 | Tags: Poisoning, Attack Method, Generative Model | 2021.07.21 2025.05.28
PassFlow: Guessing Passwords with Generative Flows | Authors: Giulio Pagnotta, Dorjan Hitaj, Fabio De Gaspari, Luigi V. Mancini | Published: 2021-05-13 | Updated: 2021-12-14 | Tags: Password Guessing, Performance Evaluation, Generative Model | 2021.05.13 2025.05.28
Improving Query Efficiency of Black-box Adversarial Attack | Authors: Yang Bai, Yuyuan Zeng, Yong Jiang, Yisen Wang, Shu-Tao Xia, Weiwei Guo | Published: 2020-09-24 | Updated: 2020-09-25 | Tags: Performance Evaluation, Selection and Evaluation of Optimization Algorithms, Generative Model | 2020.09.24 2025.05.28