AI Security Portal bot

Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search

Authors: Xun Huang, Simeng Qin, Xiaoshuang Jia, Ranjie Duan, Huanqian Yan, Zhitao Zeng, Fei Yang, Yang Liu, Xiaojun Jia | Published: 2026-02-26
Prompt Injection
Large Language Model
Jailbreak Method

AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification

Authors: Tian Zhang, Yiwei Xu, Juan Wang, Keyan Guo, Xiaoyang Xu, Bowen Xiao, Quanlong Guan, Jinlin Fan, Jiawei Liu, Zhiquan Liu, Hongxin Hu | Published: 2026-02-26
Indirect Prompt Injection
Counterfactual Explanation
Data Management System

IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation

Authors: Yanpei Guo, Wenjie Qu, Linyu Wu, Shengfang Zhai, Lionel Z. Wang, Ming Xu, Yue Liu, Binhang Yuan, Dawn Song, Jiaheng Zhang | Published: 2026-02-26
LLM Performance Evaluation
Model evaluation methods
Audit Method

Layer-Targeted Multilingual Knowledge Erasure in Large Language Models

Authors: Taoran Li, Varun Chandrasekaran, Zhiyuan Yu | Published: 2026-02-26
Alignment
Machine learning
Machine Learning Method

APFuzz: Towards Automatic Greybox Protocol Fuzzing

Authors: Yu Wang, Yang Xiang, Chandra Thapa, Hajime Suzuki | Published: 2026-02-25
Protocol Fuzzing
Prompt Injection
Research Methodology

Private and Robust Contribution Evaluation in Federated Learning

Authors: Delio Jaramillo Velez, Gergely Biczok, Alexandre Graell i Amat, Johan Ostman, Balazs Pejo | Published: 2026-02-25
Privacy Assessment
Contribution Evaluation Method
Federated Learning

Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection

Authors: Zheng Gao, Xiaoyu Li, Zhicheng Bao, Xiaoyan Feng, Jiaojiao Jiang | Published: 2026-02-25
Watermarking
Text Generation Method
Machine Learning Technology

The LLMbda Calculus: AI Agents, Conversations, and Information Flow

Authors: Zac Garby, Andrew D. Gordon, David Sands | Published: 2026-02-23
Indirect Prompt Injection
Security Analysis Method
Data Flow Analysis

Can You Tell It’s AI? Human Perception of Synthetic Voices in Vishing Scenarios

Authors: Zoha Hayat Bhatti, Bakhtawar Ahtisham, Seemal Tausif, Niklas George, Nida ul Habib Bajwa, Mobin Javed | Published: 2026-02-23
Phishing
Cognitive Bias
Voice Data Processing System

RobPI: Robust Private Inference against Malicious Client

Authors: Jiaqi Xue, Mengxin Zheng, Qian Lou | Published: 2026-02-23
Model Extraction Attack
Adversarial Learning
Defense Mechanism