Cascade: Composing Software-Hardware Attack Gadgets for Adversarial Threat Amplification in Compound AI Systems
Abstract
Rapid progress in generative AI has given rise to compound AI systems: pipelines composed of multiple large language models (LLMs), software tools, and database systems. Compound AI systems are built on a layered traditional software stack running on distributed hardware infrastructure. Many of their diverse software components are vulnerable to traditional security flaws documented in the Common Vulnerabilities and Exposures (CVE) database, while the underlying distributed hardware remains exposed to timing attacks, bit-flip faults, and power-based side channels. Current research focuses on LLM-specific risks such as model extraction, training data leakage, and unsafe generation, overlooking the impact of traditional system vulnerabilities. This work investigates how traditional software and hardware vulnerabilities can complement LLM-specific algorithmic attacks to compromise the integrity of a compound AI pipeline. We demonstrate two novel attacks that combine system-level vulnerabilities with algorithmic weaknesses: (1) exploiting a software code injection flaw together with a Rowhammer attack on the guardrail to inject an unaltered jailbreak prompt into an LLM, resulting in an AI safety violation; and (2) manipulating a knowledge database to redirect an LLM agent into transmitting sensitive user data to a malicious application, thus breaching confidentiality. These attacks highlight the need to address traditional vulnerabilities. We systematize the attack primitives and analyze their composition by grouping vulnerabilities by objective and mapping them to distinct stages of an attack lifecycle. This approach enables rigorous red-teaming and lays the groundwork for future defense strategies.