No Free Lunch with Guardrails

Authors: Divyanshu Kumar, Nitin Aravind Birur, Tanay Baswa, Sahil Agarwal, Prashanth Harshangi | Published: 2025-04-01 | Updated: 2025-04-03

Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms

Authors: Shuoming Zhang, Jiacheng Zhao, Ruiyuan Xu, Xiaobing Feng, Huimin Cui | Published: 2025-03-31

Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems

Authors: Shiyi Yang, Zhibo Hu, Chen Wang, Tong Yu, Xiwei Xu, Liming Zhu, Lina Yao | Published: 2025-03-31

ObfusQate: Unveiling the First Quantum Program Obfuscation Framework

Authors: Nilhil Bartake, See Toh Zi Jie, Carmen Wong Jiawen, Michael Kasper, Vivek Balachandran | Published: 2025-03-31

THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models

Authors: Yujin Huang, Zhi Zhang, Qingchuan Zhao, Xingliang Yuan, Chunyang Chen | Published: 2025-03-31

Detecting Functional Bugs in Smart Contracts through LLM-Powered and Bug-Oriented Composite Analysis

Authors: Binbin Zhao, Xingshuang Lin, Yuan Tian, Saman Zonouz, Na Ruan, Jiliang Li, Raheem Beyah, Shouling Ji | Published: 2025-03-31

Intelligent IoT Attack Detection Design via ODLLM with Feature Ranking-based Knowledge Base

Authors: Satvik Verma, Qun Wang, E. Wes Bethel | Published: 2025-03-27

Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing

Authors: Johan Wahréus, Ahmed Hussain, Panos Papadimitratos | Published: 2025-03-27

Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning

Authors: Robert Chew, Matthew R. Williams, Elan A. Segarra, Alexander J. Preiss, Amanda Konet, Terrance D. Savitsky | Published: 2025-03-27

Tricking Retrievers with Influential Tokens: An Efficient Black-Box Corpus Poisoning Attack

Authors: Cheng Wang, Yiwei Wang, Yujun Cai, Bryan Hooi | Published: 2025-03-27