BitHydra: Towards Bit-flip Inference Cost Attack against Large Language Models

arXiv

The information in this literature database is collected automatically.

Source

https://arxiv.org/abs/2505.16670

PDF

https://arxiv.org/pdf/2505.16670

Bibliographic Information

Authors: Xiaobei Yan, Yiming Li, Hao Wang, Han Qiu, Tianwei Zhang
Published: 2025-05-22
Updated: 2025-09-29
Affiliation: Nanyang Technological University
Country: Singapore
Venue: Computing Research Repository (CoRR)

AI-Estimated Labels

Prompt injection, LLM security, Text generation methods

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".

Abstract

Large language models (LLMs) are widely deployed, but their growing compute demands expose them to inference cost attacks that maximize output length. We reveal that prior attacks are fundamentally self-targeting because they rely on crafted inputs, so the added cost accrues to the attacker's own queries and scales poorly in practice. In this work, we introduce the first bit-flip inference cost attack that directly modifies model weights to induce persistent overhead for all users of a compromised LLM. Such attacks are stealthy yet realistic in practice: for instance, in shared MLaaS environments, co-located tenants can exploit hardware-level faults (e.g., Rowhammer) to flip memory bits storing model parameters. We instantiate this attack paradigm with BitHydra, which (1) minimizes a loss that suppresses the end-of-sequence token (i.e., EOS) and (2) employs an efficient yet effective critical-bit search focused on the EOS embedding vector, sharply reducing the search space while preserving benign-looking outputs. We evaluate across 11 LLMs (1.5B-14B) under int8 and float16, demonstrating that our method efficiently achieves scalable cost inflation with only a few bit flips, while remaining effective even against potential defenses.
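
The abstract refers to flipping bits in weights stored as int8 or float16, and to suppressing the EOS token. The sketch below is only an illustration of why one flipped bit can change a stored value so much in those two formats, and of how a lower EOS logit lowers the probability of ending a sequence. It is not BitHydra's loss or its critical-bit search, and it touches no real model or memory. The function names `flip_bit_int8`, `flip_bit_float16`, and `eos_probability` are hypothetical, chosen for this example.

```python
import math
import struct


def flip_bit_int8(value: int, bit: int) -> int:
    """Flip one bit of a two's-complement int8 (bit 0 = least significant, bit 7 = sign)."""
    if not -128 <= value <= 127:
        raise ValueError("value must fit in int8")
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in 0..7")
    raw = (value & 0xFF) ^ (1 << bit)       # view as unsigned byte, then toggle the bit
    return raw - 256 if raw >= 128 else raw  # reinterpret the byte as signed


def flip_bit_float16(value: float, bit: int) -> float:
    """Flip one bit of an IEEE 754 half-precision value (bit 15 = sign, bits 14..10 = exponent)."""
    if not 0 <= bit <= 15:
        raise ValueError("bit must be in 0..15")
    (raw,) = struct.unpack("<H", struct.pack("<e", value))  # float16 -> 16-bit pattern
    (flipped,) = struct.unpack("<e", struct.pack("<H", raw ^ (1 << bit)))
    return flipped


def eos_probability(logits: list[float], eos_index: int) -> float:
    """Softmax probability assigned to the EOS position for a list of logits."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[eos_index] / sum(exps)


if __name__ == "__main__":
    # Toggling the sign bit of the int8 value 3 gives -125.
    print(flip_bit_int8(3, 7))
    # Toggling the top exponent bit of the float16 value 0.5 gives 32768.0.
    print(flip_bit_float16(0.5, 14))
    # A lower EOS logit (index 0 here, an assumption) yields a lower EOS probability.
    print(eos_probability([2.0, 1.0, 0.0], 0) > eos_probability([-2.0, 1.0, 0.0], 0))
```

In both formats the effect depends heavily on which bit is toggled: low-order bits barely move the value, while the sign bit or a high exponent bit changes it drastically. This is consistent with the abstract's claim that a few well-chosen flips can suffice, although how BitHydra selects those bits is not shown here.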