Abstract
Large language models (LLMs) are widely deployed, but their growing compute
demands expose them to inference cost attacks that maximize output length. We
reveal that prior attacks are fundamentally self-targeting because they rely on
crafted inputs, so the added cost accrues to the attacker's own queries and
scales poorly in practice. In this work, we introduce the first bit-flip
inference cost attack that directly modifies model weights to induce persistent
overhead for all users of a compromised LLM. Such attacks are stealthy yet
realistic in practice: for instance, in shared MLaaS environments, co-located
tenants can exploit hardware-level faults (e.g., Rowhammer) to flip memory bits
storing model parameters. We instantiate this attack paradigm with BitHydra,
which (1) minimizes a loss that suppresses the end-of-sequence token (i.e.,
EOS) and (2) employs an efficient yet effective critical-bit search focused on
the EOS embedding vector, sharply reducing the search space while preserving
benign-looking outputs. We evaluate across 11 LLMs (1.5B-14B) under int8 and
float16, demonstrating that our method efficiently achieves scalable cost
inflation with only a few bit flips, while remaining effective even against
potential defenses.
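To make the two ingredients concrete, here is a minimal, hypothetical sketch (not the paper's implementation): an EOS-suppression loss that sums the EOS token's probability over decoding steps (the attacker drives this toward zero so generation rarely terminates), and a single-bit flip applied to a float16 parameter, as a Rowhammer-style fault would cause. Function names, the toy vocabulary, and the logits are all illustrative assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over one decoding step's logits.
    e = np.exp(x - x.max())
    return e / e.sum()

def eos_suppression_loss(logits, eos_id):
    # Sum of the EOS token's probability across decoding steps.
    # Minimizing this makes the model less likely to emit EOS,
    # inflating output length (and thus inference cost).
    return float(sum(softmax(step)[eos_id] for step in logits))

def flip_bit_fp16(value, bit):
    # Flip one bit of a float16 parameter via its raw uint16 encoding,
    # mimicking a hardware fault on a stored weight.
    raw = np.array(value, dtype=np.float16).view(np.uint16)
    flipped = (raw ^ np.uint16(1 << bit)).view(np.float16)
    return float(flipped)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))   # toy example: 4 decoding steps, vocab of 8
print(eos_suppression_loss(logits, eos_id=2))

# A single flip of the top exponent bit turns 1.0 (0x3C00) into +inf (0x7C00),
# showing how drastically one bit can perturb a weight.
print(flip_bit_fp16(1.0, 14))
```

The float16 example illustrates why a small, well-chosen set of critical bits suffices: flipping a high exponent bit changes a parameter by orders of magnitude, so the search can focus on a few such positions within the EOS embedding rather than the full weight space.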