Abstract
One of the pivotal security threats to embedded computing systems is
malicious software, a.k.a. malware. Owing to its efficiency and efficacy, Machine
Learning (ML) has been widely adopted for malware detection in recent times.
Despite being efficient, existing techniques require a tremendous number of
benign and malware samples to train an effective malware detector. This
requirement also limits the detection of emerging malware, for which too few
samples are available for effective training. To address these concerns, we
introduce a code-aware data generation
technique that generates multiple mutated samples of malware that devices have
seen only in limited numbers. Loss minimization ensures that the generated
samples closely mimic the limitedly seen malware and mitigates impractical
samples. The generated malware is then incorporated into the training set to
build a model that can efficiently detect emerging malware despite limited
exposure. The experimental results demonstrate that the proposed
technique achieves an accuracy of 90% in detecting limitedly seen malware,
which is approximately 3x more than the accuracy attained by state-of-the-art
techniques.