Prompt Injection

DeepSight: An All-in-One LM Safety Toolkit

Authors: Bo Zhang, Jiaxuan Guo, Lijun Li, Dongrui Liu, Sujin Chen, Guanxu Chen, Zhijie Zheng, Qihao Lin, Lewen Yan, Chen Qian, Yijin Zhou, Yuyao Wu, Shaoxiong Guo, Tianyi Du, Jingyi Yang, Xuhao Hu, Ziqi Miao, Xiaoya Lu, Jing Shao, Xia Hu | Published: 2026-02-12
Prompt Injection
Large Language Model
Evaluation Method

Differentially Private and Communication Efficient Large Language Model Split Inference via Stochastic Quantization and Soft Prompt

Authors: Yujie Gu, Richeng Jin, Xiaoyu Ji, Yier Jin, Wenyuan Xu | Published: 2026-02-12
Privacy Assurance
Prompt Injection
Prompt Leaking

Jailbreaking Leaves a Trace: Understanding and Detecting Jailbreak Attacks from Internal Representations of Large Language Models

Authors: Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis | Published: 2026-02-12
Prompt Injection
Experimental Validation
Evaluation Method

LLM-FS: Zero-Shot Feature Selection for Effective and Interpretable Malware Detection

Authors: Naveen Gill, Ajvad Haneef K, Madhu Kumar S D | Published: 2026-02-10
Prompt Injection
Model Selection Method
Evaluation Metrics

Stop Testing Attacks, Start Diagnosing Defenses: The Four-Checkpoint Framework Reveals Where LLM Safety Breaks

Authors: Hayfa Dhabhi, Kashyap Thimmaraju | Published: 2026-02-10
Indirect Prompt Injection
Prompt Injection
Vulnerability Analysis

CIC-Trap4Phish: A Unified Multi-Format Dataset for Phishing and Quishing Attachment Detection

Authors: Fatemeh Nejati, Mahdi Rabbani, Mansur Mirani, Gunjan Piya, Igor Opushnyev, Ali A. Ghorbani, Sajjad Dadkhah | Published: 2026-02-09
Phishing Detection
Prompt Injection
Feature Engineering

Large Language Lobotomy: Jailbreaking Mixture-of-Experts via Expert Silencing

Authors: Jona te Lintelo, Lichao Wu, Stjepan Picek | Published: 2026-02-09
Prompt Injection
Large Language Model
Safety Analysis

Sparse Models, Sparse Safety: Unsafe Routes in Mixture-of-Experts LLMs

Authors: Yukun Jiang, Hai Huang, Mingjie Li, Yage Zhang, Michael Backes, Yang Zhang | Published: 2026-02-09
Sparsity Defense
Prompt Injection
Safety Analysis

Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-based Phishing Detection

Authors: Takashi Koide, Hiroki Nakano, Daiki Chiba | Published: 2026-02-05
Indirect Prompt Injection
Phishing Detection Method
Prompt Injection

How Few-shot Demonstrations Affect Prompt-based Defenses Against LLM Jailbreak Attacks

Authors: Yanshu Wang, Shuaishuai Yang, Jingjing He, Tong Yang | Published: 2026-02-04
LLM Performance Evaluation
Prompt Injection
Large Language Model