Optimized Deep Learning Models for Malware Detection under Concept Drift

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2308.10821

PDF

https://arxiv.org/pdf/2308.10821

Paper Information

Author: William Maillet; Benjamin Marais
Published: 8-22-2023
Updated: 8-1-2024
Affiliation: Orange Innovation
Country: France
Conference

Labels Estimated by AI

Performance Evaluation; Deep Learning Method; Optimization Methods

These labels were automatically added by AI and may be inaccurate.

Abstract

Despite the promising results of machine learning models in malicious file detection, they face the problem of concept drift because malware constantly evolves. This leads to declining performance over time, as the data distribution of new files differs from that of the training data, requiring frequent model updates. In this work, we propose a model-agnostic protocol to improve a baseline neural network against drift. We show the importance of feature reduction and of training with the most recent validation set possible, and we propose a loss function named Drift-Resilient Binary Cross-Entropy, an improvement on the classical Binary Cross-Entropy that is more effective against drift. We train our model on the EMBER dataset, published in 2018, and evaluate it on a dataset of recent malicious files collected between 2020 and 2023. Our improved model shows promising results, detecting 15.2% more malware than a baseline model.

External Datasets

EMBER

BODMAS

MalwareBazaar