TPIA: Towards Target-specific Prompt Injection Attack against Code-oriented Large Language Models

Authors: Yuchen Yang, Hongwei Yao, Bingrun Yang, Yiling He, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren, Chun Chen | Published: 2024-07-12 | Updated: 2025-01-16

Refusing Safe Prompts for Multi-modal Large Language Models

Authors: Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong | Published: 2024-07-12 | Updated: 2024-09-05

How to beat a Bayesian adversary

Authors: Zihan Ding, Kexin Jin, Jonas Latz, Chenguang Liu | Published: 2024-07-11

Model-agnostic clean-label backdoor mitigation in cybersecurity environments

Authors: Giorgio Severi, Simona Boboila, John Holodnak, Kendra Kratkiewicz, Rauf Izmailov, Michael J. De Lucia, Alina Oprea | Published: 2024-07-11 | Updated: 2025-05-05

Explainable Differential Privacy-Hyperdimensional Computing for Balancing Privacy and Transparency in Additive Manufacturing Monitoring

Authors: Fardin Jalil Piran, Prathyush P. Poduval, Hamza Errahmouni Barkam, Mohsen Imani, Farhad Imani | Published: 2024-07-09 | Updated: 2025-03-17

Approximating Two-Layer ReLU Networks for Hidden State Analysis in Differential Privacy

Authors: Antti Koskela | Published: 2024-07-05 | Updated: 2024-10-11

A Geometric Framework for Adversarial Vulnerability in Machine Learning

Authors: Brian Bell | Published: 2024-07-03

From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks

Authors: Zhexin Zhang, Junxiao Yang, Yida Lu, Pei Ke, Shiyao Cui, Chujie Zheng, Hongning Wang, Minlie Huang | Published: 2024-07-03 | Updated: 2025-05-20

MALT Powers Up Adversarial Attacks

Authors: Odelia Melamed, Gilad Yehudai, Adi Shamir | Published: 2024-07-02

Attack-Aware Noise Calibration for Differential Privacy

Authors: Bogdan Kulynych, Juan Felipe Gomez, Georgios Kaissis, Flavio du Pin Calmon, Carmela Troncoso | Published: 2024-07-02 | Updated: 2024-11-07