NVBleed: Covert and Side-Channel Attacks on NVIDIA Multi-GPU Interconnect

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2503.17847

PDF

https://arxiv.org/pdf/2503.17847

Paper Information

Authors: Yicheng Zhang, Ravan Nazaraliyev, Sankha Baran Dutta, Andres Marquez, Kevin Barker, Nael Abu-Ghazaleh
Published: 3-23-2025
Affiliation: University of California, Riverside
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Side-Channel Attack, Attack Method, Cloud Computing

These labels were automatically added by AI and may be inaccurate.

Abstract

Multi-GPU systems are becoming increasingly important in high-performance computing (HPC) and cloud infrastructure, providing acceleration for data-intensive applications, including machine learning workloads. These systems consist of multiple GPUs interconnected through high-speed networking links such as NVIDIA's NVLink. In this work, we explore whether the interconnect on such systems can offer a novel source of leakage, enabling new forms of covert and side-channel attacks. Specifically, we reverse engineer the operations of NVLink and identify two primary sources of leakage: timing variations due to contention and accessible performance counters that disclose communication patterns. The leakage is visible remotely and even across VM instances in the cloud, enabling potentially dangerous attacks. Building on these observations, we develop two types of covert-channel attacks across two GPUs, achieving a bandwidth of over 70 Kbps with an error rate of 4.78% for the contention channel. We develop two end-to-end cross-GPU side-channel attacks: application fingerprinting (including 18 high-performance computing and deep learning applications) and 3D graphics character identification within Blender, a multi-GPU rendering application. These attacks are highly effective, achieving F1 scores of up to 97.78% and 91.56%, respectively. We also discover that leakage surprisingly occurs across Virtual Machines on the Google Cloud Platform (GCP) and demonstrate a side-channel attack on Blender, achieving F1 scores exceeding 88%. We also explore potential defenses such as managing access to counters and reducing the resolution of the clock to mitigate the two sources of leakage.
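
As an illustration of how a contention-based timing channel of the kind described in the abstract might be decoded, the sketch below maps latency samples to bits and computes a bit error rate. It is not the paper's implementation: the function names, the midpoint threshold rule, and the latency values are all hypothetical, and no GPU or NVLink access is involved.

```python
def decode_bits(latencies_ns, threshold_ns=None):
    """Map each latency sample to a bit: slow (contended) -> 1, fast -> 0."""
    if threshold_ns is None:
        # Assumed calibration rule: midpoint between fastest and slowest sample.
        threshold_ns = (min(latencies_ns) + max(latencies_ns)) / 2
    return [1 if t > threshold_ns else 0 for t in latencies_ns]


def bit_error_rate(sent, received):
    """Fraction of positions where the received bit differs from the sent bit."""
    if len(sent) != len(received):
        raise ValueError("sent and received must have the same length")
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)


# Synthetic example (made-up numbers): the sender creates contention for a 1
# and stays idle for a 0; the fourth sample is a 1 that arrived with low latency.
sent = [1, 0, 1, 1, 0, 0, 1, 0]
latencies = [910, 420, 880, 450, 430, 410, 930, 440]
received = decode_bits(latencies)
ber = bit_error_rate(sent, received)
print(received, ber)
```

In this toy run one of the eight bits is misread. The 4.78% error rate quoted in the abstract is the same kind of ratio, measured on the authors' real channel.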

External Datasets

OpenMM benchmarks

MNIST dataset

Blender Studio open movies