Literature Database

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Authors: Nathaniel Li, Ziwen Han, Ian Steneker, Willow Primack, Riley Goodside, Hugh Zhang, Zifan Wang, Cristina Menghini, Summer Yue | Published: 2024-08-27 | Updated: 2024-09-04

Prompt Injection

User Education

Attack Method

2024.08.27 2025.05.27

Literature Database

Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection

Authors: Niklas Risse, Jing Liu, Marcel Böhme | Published: 2024-08-23 | Updated: 2025-04-23

Security Analysis

Vulnerability Management

Evaluation Method

2024.08.23 2025.05.27

Literature Database

Obfuscated Memory Malware Detection

Authors: Sharmila S P, Aruna Tiwari, Narendra S Chaudhari | Published: 2024-08-23

Cybersecurity

Malware Classification

Model Performance Evaluation

2024.08.23 2025.05.27

Literature Database

Is Generative AI the Next Tactical Cyber Weapon For Threat Actors? Unforeseen Implications of AI Generated Cyber Attacks

Authors: Yusuf Usman, Aadesh Upadhyay, Prashnna Gyawali, Robin Chataut | Published: 2024-08-23

Cybersecurity

Prompt Injection

Attack Method

2024.08.23 2025.05.27

Literature Database

LLM-PBE: Assessing Data Privacy in Large Language Models

Authors: Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, Junyi Hou, Xavier Yin, Zhun Wang, Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song | Published: 2024-08-23 | Updated: 2024-09-06

LLM Security

Privacy Protection Method

Prompt Injection

2024.08.23 2025.05.27

Literature Database

Efficient Detection of Toxic Prompts in Large Language Models

Authors: Yi Liu, Junzhe Yu, Huijia Sun, Ling Shi, Gelei Deng, Yuqi Chen, Yang Liu | Published: 2024-08-21 | Updated: 2024-09-14

Content Moderation

Prompt Injection

Model Performance Evaluation

2024.08.21 2025.05.27

Literature Database

Private Counting of Distinct Elements in the Turnstile Model and Extensions

Authors: Monika Henzinger, A. R. Sricharan, Teresa Anna Steiner | Published: 2024-08-21

Algorithm

Privacy Protection Method

2024.08.21 2025.05.27

Literature Database

Large Language Models are Good Attackers: Efficient and Stealthy Textual Backdoor Attacks

Authors: Ziqiang Li, Yueqi Zeng, Pengfei Xia, Lei Liu, Zhangjie Fu, Bin Li | Published: 2024-08-21

Backdoor Attack

Poisoning

2024.08.21 2025.05.27

Literature Database

EEG-Defender: Defending against Jailbreak through Early Exit Generation of Large Language Models

Authors: Chongwen Zhao, Zhihao Dou, Kaizhu Huang | Published: 2024-08-21

LLM Security

Prompt Injection

Defense Method

2024.08.21 2025.05.27

Literature Database

Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles

Authors: Zhilong Wang, Haizhou Wang, Nanqing Luo, Lan Zhang, Xiaoyan Sun, Yebo Cao, Peng Liu | Published: 2024-08-20 | Updated: 2025-02-07

Prompt Injection

Large Language Model

Attack Scenario Analysis

2024.08.20 2025.05.27

Literature Database