On Benchmarking Code LLMs for Android Malware Analysis

ISSTA Companion

Source

https://arxiv.org/abs/2504.00694

PDF

https://arxiv.org/pdf/2504.00694

Bibliographic Information

Authors: Yiling He, Hongyu She, Xingzhi Qian, Xinran Zheng, Zhuo Chen, Zhan Qin, Lorenzo Cavallaro
Publication date: 2025-4-25
Affiliation: University College London
Country of affiliation: United Kingdom
Conference: ISSTA Companion

AI-Estimated Labels

Malware detection methods, LLM performance evaluation, Research methodology

Abstract

Large Language Models (LLMs) have demonstrated strong capabilities in various code intelligence tasks. However, their effectiveness for Android malware analysis remains underexplored. Decompiled Android malware code presents unique challenges for analysis, due to the malicious logic being buried within a large number of functions and the frequent lack of meaningful function names. This paper presents CAMA, a benchmarking framework designed to systematically evaluate the effectiveness of Code LLMs in Android malware analysis. CAMA specifies structured model outputs to support key malware analysis tasks, including malicious function identification and malware purpose summarization. Built on these, it integrates three domain-specific evaluation metrics (consistency, fidelity, and semantic relevance), enabling rigorous stability and effectiveness assessment and cross-model comparison. We construct a benchmark dataset of 118 Android malware samples from 13 families collected in recent years, encompassing over 7.5 million distinct functions, and use CAMA to evaluate four popular open-source Code LLMs. Our experiments provide insights into how Code LLMs interpret decompiled code and quantify the sensitivity to function renaming, highlighting both their potential and current limitations in malware analysis.