Tempest: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search

Authors: Andy Zhou, Ron Arel | Published: 2025-03-13 | Updated: 2025-05-21
Disabling Safety Mechanisms of LLM
Attack Method
Generative Model

CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection

Authors: Richard A. Dubniczky, Krisztofer Zoltán Horvát, Tamás Bisztray, Mohamed Amine Ferrag, Lucas C. Cordeiro, Norbert Tihanyi | Published: 2025-03-12 | Updated: 2025-03-31
Security Metric
Prompt Leaking
Vulnerability Mitigation Technique

Adv-CPG: A Customized Portrait Generation Framework with Facial Adversarial Attacks

Authors: Junying Wang, Hongyuan Zhang, Yuan Yuan | Published: 2025-03-11
Privacy Protection
Adversarial Example
Face Recognition System

Split-n-Chain: Privacy-Preserving Multi-Node Split Learning with Blockchain-Based Auditability

Authors: Mukesh Sahani, Binanda Sengupta | Published: 2025-03-10 | Updated: 2025-04-15
Performance Evaluation
Privacy Protection Method
Distributed Learning

Queueing, Predictions, and LLMs: Challenges and Open Problems

Authors: Michael Mitzenmacher, Rana Shahout | Published: 2025-03-10
LLM Performance Evaluation
Scheduling Method
Prediction-Based Scheduling

How Well Can Differential Privacy Be Audited in One Run?

Authors: Amit Keinan, Moshe Shenfeld, Katrina Ligett | Published: 2025-03-10 | Updated: 2025-05-26
Privacy Issues
Auditing Method
Watermark Design

Probabilistic Modeling of Jailbreak on Multimodal LLMs: From Quantification to Application

Authors: Wenzhuo Xu, Zhipeng Wei, Xiongtao Sun, Zonghao Ying, Deyue Zhang, Dongdong Yang, Xiangzheng Zhang, Quanchen Zou | Published: 2025-03-10 | Updated: 2025-07-31
Prompt Injection
Large Language Model
Robustness of Watermarking Techniques

Secure On-Device Video OOD Detection Without Backpropagation

Authors: Shawn Li, Peilin Cai, Yuxiao Zhou, Zhiyu Ni, Renjie Liang, You Qin, Yi Nian, Zhengzhong Tu, Xiyang Hu, Yue Zhao | Published: 2025-03-08 | Updated: 2025-03-17
Privacy Protection Method
Framework
Deep Learning

Nearly Optimal Differentially Private ReLU Regression

Authors: Meng Ding, Mingxi Lei, Shaowei Wang, Tianhang Zheng, Di Wang, Jinhui Xu | Published: 2025-03-08 | Updated: 2025-06-10
Privacy Protection Mechanism
Convergence Property
Differential Privacy

ToxicSQL: Migrating SQL Injection Threats into Text-to-SQL Models via Backdoor Attack

Authors: Meiyu Lin, Haichuan Zhang, Jiale Lao, Renyuan Li, Yuanchun Zhou, Carl Yang, Yang Cao, Mingjie Tang | Published: 2025-03-07 | Updated: 2025-04-03
Backdoor Detection
Backdoor Attack
Model Performance Evaluation