TPIA: Towards Target-specific Prompt Injection Attack against Code-oriented Large Language Models

Authors: Yuchen Yang, Hongwei Yao, Bingrun Yang, Yiling He, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren, Chun Chen | Published: 2024-07-12 | Updated: 2025-01-16

Refusing Safe Prompts for Multi-modal Large Language Models

Authors: Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong | Published: 2024-07-12 | Updated: 2024-09-05

ProxyGPT: Enabling User Anonymity in LLM Chatbots via (Un)Trustworthy Volunteer Proxies

Authors: Dzung Pham, Jade Sheffey, Chau Minh Pham, Amir Houmansadr | Published: 2024-07-11 | Updated: 2025-06-11

How to beat a Bayesian adversary

Authors: Zihan Ding, Kexin Jin, Jonas Latz, Chenguang Liu | Published: 2024-07-11

Model-agnostic clean-label backdoor mitigation in cybersecurity environments

Authors: Giorgio Severi, Simona Boboila, John Holodnak, Kendra Kratkiewicz, Rauf Izmailov, Michael J. De Lucia, Alina Oprea | Published: 2024-07-11 | Updated: 2025-05-05

Explainable Differential Privacy-Hyperdimensional Computing for Balancing Privacy and Transparency in Additive Manufacturing Monitoring

Authors: Fardin Jalil Piran, Prathyush P. Poduval, Hamza Errahmouni Barkam, Mohsen Imani, Farhad Imani | Published: 2024-07-09 | Updated: 2025-03-17

Approximating Two-Layer ReLU Networks for Hidden State Analysis in Differential Privacy

Authors: Antti Koskela | Published: 2024-07-05 | Updated: 2024-10-11

A Geometric Framework for Adversarial Vulnerability in Machine Learning

Authors: Brian Bell | Published: 2024-07-03

Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows

Authors: Adrian Pekar, Richard Jozsa | Published: 2024-07-03 | Updated: 2025-06-30

From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks

Authors: Zhexin Zhang, Junxiao Yang, Yida Lu, Pei Ke, Shiyao Cui, Chujie Zheng, Hongning Wang, Minlie Huang | Published: 2024-07-03 | Updated: 2025-05-20