Abstract
Mobile apps often embed authentication secrets, such as API keys, tokens, and
client IDs, to integrate with cloud services. However, developers frequently
hardcode these credentials into Android apps, exposing them to extraction
through reverse engineering. Once secrets are compromised, adversaries can exploit them
to access sensitive data, manipulate resources, or abuse APIs, resulting in
significant security and financial risks. Existing detection approaches, such
as regex-based analysis, static analysis, and machine learning, are effective
for identifying known patterns but are fundamentally limited: they require
prior knowledge of credential structures, API signatures, or training data.
In this paper, we propose SecretLoc, an LLM-based approach for detecting
hardcoded secrets in Android apps. SecretLoc goes beyond pattern matching; it
leverages contextual and structural cues to identify secrets without relying on
predefined patterns or labeled training sets. Using a benchmark dataset from
the literature, we demonstrate that SecretLoc detects secrets missed by
regex-based, static-analysis, and ML-based methods, including previously unseen
types of secrets. In total, we discovered 4828 secrets that existing approaches
had not detected, including more than 10 "new" types of secrets, such as OpenAI
API keys, GitHub access tokens, RSA private keys, and JSON Web Tokens (JWTs).
We further extend our analysis to newly crawled apps from Google Play, where
we uncovered and responsibly disclosed additional hardcoded secrets. Across a
set of 5000 apps, we detected secrets in 2124 apps (42.5%); developers confirmed
and remediated several of these findings after we contacted them. Our
results reveal a dual-use risk: if analysts can uncover these secrets with
LLMs, so can attackers. This underscores the urgent need for proactive secret
management and stronger mitigation practices across the mobile ecosystem.