Software vulnerabilities (SVs) have become a common and serious concern
due to the ubiquity of computer software. Many machine learning-based
approaches have been proposed to solve the software vulnerability detection
(SVD) problem. However, two significant issues remain open for SVD:
i) automatically learning representations to improve the predictive
performance of SVD, and ii) tackling the scarcity of labeled vulnerability
datasets, which conventionally require laborious labeling effort by experts. In this
paper, we propose a novel end-to-end approach to tackle these two crucial
issues. We first exploit automatic representation learning with deep domain
adaptation for software vulnerability detection. We then propose a novel
cross-domain kernel classifier leveraging the max-margin principle to
significantly improve the transfer learning process for software vulnerabilities
from labeled projects to unlabeled ones. The experimental results on
real-world software datasets show the superiority of our proposed method over
state-of-the-art baselines. In short, our method improves the F1-measure,
the most important measure in SVD, by 1.83% to 6.25% over the
second-best method on the datasets used. Our source code
samples are publicly available at https://github.com/vannguyennd/dam2p
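
For readers unfamiliar with the F1-measure reported above, the following is a minimal sketch of how it is computed for binary labels. It is not taken from the released code: the function name `f1_measure`, the label convention (1 = vulnerable, 0 = non-vulnerable), and the toy labels are all assumptions made for illustration.

```python
def f1_measure(y_true, y_pred):
    """Return the F1-measure for binary labels.

    Assumed convention (illustrative only): 1 = vulnerable, 0 = non-vulnerable.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    if tp == 0:
        # No correct positive prediction: precision or recall is zero
        # (or undefined), so F1 is reported as 0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for this example.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
score = f1_measure(y_true, y_pred)
print(f"F1 = {score:.4f}")
```

Because F1 ignores true negatives, it is less flattering than accuracy when one class is rare, which is one plausible reason it is emphasized for SVD.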