Large language models (LLMs) have achieved remarkable progress in code
understanding tasks. However, they demonstrate limited performance in
vulnerability detection and struggle to distinguish vulnerable code from
patched code. We argue that LLMs lack understanding of security specifications
-- the expectations about how code should behave to remain safe. When code
behavior differs from these expectations, it becomes a potential vulnerability.
However, such knowledge is rarely explicit in training data, leaving models
unable to reason about security flaws. We propose VulInstruct, a
specification-guided approach that systematically extracts security
specifications from historical vulnerabilities to detect new ones. VulInstruct
constructs a specification knowledge base from two perspectives: (i) General
specifications from high-quality patches across projects, capturing fundamental
safe behaviors; and (ii) Domain-specific specifications from repeated
violations in particular repositories relevant to the target code. VulInstruct
retrieves relevant past cases and specifications, enabling LLMs to reason about
expected safe behaviors rather than relying on surface patterns. We evaluate
VulInstruct under strict criteria requiring both correct predictions and valid
reasoning. On PrimeVul, VulInstruct achieves 45.0% F1-score (32.7% improvement)
and 37.7% recall (50.8% improvement) compared to baselines, while uniquely
detecting 24.3% of vulnerabilities -- 2.4x more than any baseline. In pair-wise
evaluation, VulInstruct achieves 32.3% relative improvement. VulInstruct also
discovered a previously unknown high-severity vulnerability (CVE-2025-56538) in
production code, demonstrating practical value for real-world vulnerability
discovery. All code and supplementary materials are available at
https://github.com/zhuhaopku/VulInstruct-temp.