Large Language Model

Test-Time Immunization: A Universal Defense Framework Against Jailbreaks for (Multimodal) Large Language Models

Authors: Yongcan Yu, Yanbo Wang, Ran He, Jian Liang | Published: 2025-05-28
LLM Security
Prompt Injection
Large Language Model

Deconstructing Obfuscation: A four-dimensional framework for evaluating Large Language Models' assembly code deobfuscation capabilities

Authors: Anton Tkachenko, Dmitrij Suskevic, Benjamin Adolphi | Published: 2025-05-26
Model Evaluation Methods
Large Language Model
Watermarking Technology

What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs

Authors: Sangyeop Kim, Yohan Lee, Yongwoo Song, Kimin Lee | Published: 2025-05-26
Prompt Injection
Model Performance Evaluation
Large Language Model

Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval

Authors: Taiye Chen, Zeming Wei, Ang Li, Yisen Wang | Published: 2025-05-21
RAG
Large Language Model
Defense Mechanism

sudoLLM: On Multi-role Alignment of Language Models

Authors: Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain | Published: 2025-05-20
Alignment
Prompt Injection
Large Language Model

Dark LLMs: The Growing Threat of Unaligned AI Models

Authors: Michael Fire, Yitzhak Elbazis, Adi Wasenstein, Lior Rokach | Published: 2025-05-15
Disabling Safety Mechanisms of LLM
Prompt Injection
Large Language Model

Analysing Safety Risks in LLMs Fine-Tuned with Pseudo-Malicious Cyber Security Data

Authors: Adel ElZemity, Budi Arief, Shujun Li | Published: 2025-05-15
LLM Security
Prompt Injection
Large Language Model

Towards a standardized methodology and dataset for evaluating LLM-based digital forensic timeline analysis

Authors: Hudan Studiawan, Frank Breitinger, Mark Scanlon | Published: 2025-05-06
LLM Performance Evaluation
Large Language Model
Evaluation Method

SAGE: A Generic Framework for LLM Safety Evaluation

Authors: Madhur Jindal, Hari Shrawgi, Parag Agrawal, Sandipan Dandapat | Published: 2025-04-28
User Identification System
Large Language Model
Trade-Off Between Safety and Usability

Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate

Authors: Senmao Qi, Yifei Zou, Peng Li, Ziyi Lin, Xiuzhen Cheng, Dongxiao Yu | Published: 2025-04-23
Indirect Prompt Injection
Multi-Round Dialogue
Large Language Model