LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing | Authors: Hongxiang Zhang, Yuyang Rong, Yifeng He, Hao Chen | Published: 2024-06-11 | Updated: 2024-06-13 | Tags: LLM Performance Evaluation, Fuzzing, Prompt Injection
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection | Authors: Shenao Yan, Shen Wang, Yue Duan, Hanbin Hong, Kiho Lee, Doowon Kim, Yuan Hong | Published: 2024-06-10 | Tags: LLM Security, Backdoor Attack, Prompt Injection
SecureNet: A Comparative Study of DeBERTa and Large Language Models for Phishing Detection | Authors: Sakshi Mahendru, Tejul Pandit | Published: 2024-06-10 | Tags: LLM Performance Evaluation, Phishing Detection, Prompt Injection
Safety Alignment Should Be Made More Than Just a Few Tokens Deep | Authors: Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu, Xiao Ma, Subhrajit Roy, Ahmad Beirami, Prateek Mittal, Peter Henderson | Published: 2024-06-10 | Tags: LLM Security, Prompt Injection, Safety Alignment
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States | Authors: Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Yongbin Li | Published: 2024-06-09 | Updated: 2024-06-13 | Tags: LLM Security, Prompt Injection, Compliance with Ethical Guidelines
Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs | Authors: Fan Liu, Zhao Xu, Hao Liu | Published: 2024-06-07 | Tags: LLM Security, Prompt Injection, Adversarial Training
GENIE: Watermarking Graph Neural Networks for Link Prediction | Authors: Venkata Sai Pranav Bachina, Ankit Gangwal, Aaryan Ajay Sharma, Charu Sharma | Published: 2024-06-07 | Updated: 2025-01-12 | Tags: Watermarking, Prompt Injection, Watermark Robustness
AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens | Authors: Lin Lu, Hai Yan, Zenghui Yuan, Jiawen Shi, Wenqi Wei, Pin-Yu Chen, Pan Zhou | Published: 2024-06-06 | Tags: LLM Performance Evaluation, Prompt Injection, Defense Method
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents | Authors: Yifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian | Published: 2024-06-05 | Tags: LLM Security, Backdoor Attack, Prompt Injection
Safeguarding Large Language Models: A Survey | Authors: Yi Dong, Ronghui Mu, Yanghao Zhang, Siqi Sun, Tianle Zhang, Changshun Wu, Gaojie Jin, Yi Qi, Jinwei Hu, Jie Meng, Saddek Bensalem, Xiaowei Huang | Published: 2024-06-03 | Tags: LLM Security, Guardrail Method, Prompt Injection