The proliferation of software vulnerabilities poses a significant challenge
for security databases and analysts tasked with their timely identification,
classification, and remediation. With the National Vulnerability Database (NVD)
reporting an ever-increasing number of vulnerabilities, traditional manual
analysis has become untenably time-consuming and error-prone. This paper
introduces VulnScopper, an approach that uses multi-modal
representation learning, combining Knowledge Graphs (KG) and Natural Language
Processing (NLP), to automate and enhance the analysis of software
vulnerabilities. Leveraging ULTRA, a knowledge graph foundation model, combined
with a Large Language Model (LLM), VulnScopper effectively handles unseen
entities, overcoming the limitations of previous KG approaches. We evaluate
VulnScopper on two major security datasets, the NVD and the Red Hat CVE
database. Our method significantly improves link-prediction accuracy
between Common Vulnerabilities and Exposures (CVEs), Common Weakness
Enumerations (CWEs), and Common Platform Enumerations (CPEs). Our results show
that VulnScopper outperforms existing methods, achieving up to 78% Hits@10
accuracy in linking CVEs to CPEs and CWEs and an 11.7% improvement
over large language models in predicting CWE labels on the Red Hat
database. In the NVD, only 6.37% of linked CPEs are published within the
first 30 days; many of them relate to critical and high-risk
vulnerabilities that, according to multiple compliance frameworks (such as
CISA and PCI), should be remediated within 15 to 30 days. Our model can uncover
new products linked to vulnerabilities, reducing remediation time and improving
vulnerability management. We analyzed several CVEs from 2023 to showcase this
ability.
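
For readers unfamiliar with the Hits@10 metric reported above, the following is a minimal sketch of how a Hits@k score can be computed for link prediction. It is an illustration only, not VulnScopper's evaluation code: the function name `hits_at_k`, the dictionary layout, and the toy identifiers are assumptions made for this example, and the sketch omits the filtered-ranking step commonly used in knowledge graph benchmarks.

```python
from typing import Dict, List, Set


def hits_at_k(ranked: Dict[str, List[str]],
              truth: Dict[str, Set[str]],
              k: int = 10) -> float:
    """Fraction of (query, true target) pairs whose target is in the top-k.

    ranked: for each query entity (e.g. a CVE), candidate targets
            (e.g. CWEs or CPEs) ordered from most to least likely.
    truth:  for each query entity, the set of correct targets.
    """
    hits = 0
    total = 0
    for query, targets in truth.items():
        # Keep only the k highest-ranked candidates for this query.
        top_k = set(ranked.get(query, [])[:k])
        for target in targets:
            total += 1
            if target in top_k:
                hits += 1
    return hits / total if total else 0.0


# Toy, hypothetical identifiers (not real NVD or Red Hat records).
ranked = {
    "cve-a": ["cwe-79", "cwe-89", "cwe-20"],
    "cve-b": ["cwe-22", "cwe-78"],
}
truth = {
    "cve-a": {"cwe-89"},
    "cve-b": {"cwe-119"},
}
score = hits_at_k(ranked, truth, k=2)
```

In this toy case one of the two true links appears among the top two candidates, so the score is 0.5. Under this definition, a Hits@10 of 78% means that 78% of the true links are ranked among the ten highest-scoring candidates.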