Large Language Models (LLMs) are increasingly used for cybersecurity threat
analysis, but their deployment in security-sensitive environments raises trust
and safety concerns. With over 21,000 vulnerabilities disclosed in 2025, manual
analysis is infeasible, making scalable and verifiable AI support critical.
Emerging vulnerabilities are particularly challenging for LLMs because they
are disclosed after the models' training cut-off dates. While
Retrieval-Augmented Generation (RAG) can inject up-to-date context to
alleviate this limitation, it remains unclear how much LLMs rely on retrieved
evidence versus their internal knowledge, and whether the retrieved
information is meaningful or even correct. This uncertainty could mislead
security analysts, lead to mis-prioritized patches, and increase security
risk. Therefore, this work proposes LLM Embedding-based Attribution (LEA) to
analyze responses generated for vulnerability exploitation analysis. More
specifically, LEA quantifies the relative contribution of internal knowledge
versus retrieved content in the
generated responses. We evaluate LEA on 500 critical vulnerabilities disclosed
between 2016 and 2025, across three RAG settings -- valid, generic, and
incorrect -- using three state-of-the-art LLMs. Our results show that LEA
distinguishes non-retrieval, generic-retrieval, and valid-retrieval scenarios
with over 95% accuracy on the larger models. Finally,
we demonstrate the limitations posed by incorrect retrieval of vulnerability
information, and we caution the cybersecurity community against blind
reliance on LLMs and RAG for vulnerability analysis. LEA provides security
analysts with a metric to audit RAG-enhanced workflows, supporting more
transparent and trustworthy deployment of AI in cybersecurity threat
analysis.
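
As a purely illustrative sketch (not the paper's LEA metric), the snippet
below shows one simple way an embedding-based reliance score could contrast a
RAG-augmented answer against a no-retrieval baseline; the `embed`,
`cosine`, and `retrieval_reliance` names, the toy hashed bag-of-words
embedding, and the example strings are all assumptions introduced here for
illustration.

```python
# Hypothetical illustration only: a toy embedding-based reliance score, not
# the paper's LEA metric. All helper names are assumptions for this sketch.
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy hashed bag-of-words embedding; swap in a real sentence-embedding
    model in practice. (Python's string hash is salted per run, so scores
    vary slightly between runs.)"""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity with a small epsilon to avoid division by zero."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieval_reliance(rag_answer: str, no_rag_answer: str,
                       retrieved_context: str) -> float:
    """Score in [0, 1]: values near 1 suggest the RAG answer tracks the
    retrieved context more closely than the model's no-retrieval answer;
    values near 0 suggest reliance on internal knowledge instead."""
    e = embed(rag_answer)
    sim_context = cosine(e, embed(retrieved_context))
    sim_internal = cosine(e, embed(no_rag_answer))
    return sim_context / (sim_context + sim_internal + 1e-12)

if __name__ == "__main__":
    # Made-up placeholder strings, not a real CVE analysis.
    context = "CVE-XXXX-YYYY is exploited via a crafted HTTP header causing a buffer overflow."
    rag = "The flaw is exploited by sending a crafted HTTP header that overflows a buffer."
    baseline = "Exploitation details are not publicly documented for this CVE."
    print(f"retrieval reliance: {retrieval_reliance(rag, baseline, context):.2f}")
```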