Abstract
The increasing reliance on machine learning (ML) in computer security,
particularly for malware classification, has driven significant advancements.
However, the replicability and reproducibility of these results are often
overlooked, leading to challenges in verifying research findings. This paper
highlights critical pitfalls that undermine the validity of ML research in
Android malware detection, focusing on dataset and methodological issues. We
comprehensively analyze Android malware detection using two datasets and assess
offline and continual learning settings with six widely used ML models. Our
study reveals that when properly tuned, simpler baseline methods can often
outperform more complex models. To address reproducibility challenges, we
propose solutions for improving datasets and methodological practices, enabling
fairer model comparisons. Additionally, we open-source our code to facilitate
malware analysis, making it extensible to new models and datasets. Our paper
aims to support future research in Android malware detection and other security
domains, enhancing the reliability and reproducibility of published results.