Abstract
Despite the continued research and progress in building secure systems,
Android applications continue to be ridden with vulnerabilities, necessitating
effective detection methods. Current strategies based on static and dynamic
analysis tools have limitations, such as an overwhelming number of false
positives and a limited scope of analysis, that make either difficult to adopt.
In recent years, machine-learning-based approaches have been extensively
explored for vulnerability detection, but their real-world applicability is
constrained by data requirements and feature engineering challenges. Large
Language Models (LLMs), with their vast number of parameters, have shown tremendous
potential in understanding semantics in human as well as programming languages.
We investigate the efficacy of LLMs for detecting vulnerabilities in the context
of Android security. We focus on building an AI-driven workflow to assist
developers in identifying and rectifying vulnerabilities. Our experiments show
that LLMs exceed our expectations in finding issues within applications,
correctly flagging insecure apps in 91.67% of cases in the Ghera benchmark. We
use insights from our experiments to build a robust and actionable
vulnerability detection system and demonstrate its effectiveness. Our
experiments also shed light on how various simple configurations can
affect the True Positive (TP) and False Positive (FP) rates.