These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Fuzz testing has become a cornerstone technique for identifying software bugs
and security vulnerabilities, with broad adoption in both industry and
open-source communities. Directly fuzzing a function requires fuzz drivers,
which translate random fuzzer inputs into valid arguments for the target
function. Given the cost and expertise required to manually develop fuzz
drivers, methods exist that leverage program analysis and Large Language Models
to automatically generate these drivers. However, the generated fuzz drivers
frequently lead to false positive crashes, especially in functions highly
structured input and complex state requirements. This problem is especially
crucial in industry-scale fuzz driver generation efforts like OSS-Fuzz-en, as
reporting false positive crashes to maintainers impede trust in both the system
and the team.
This paper presents two AI-driven strategies to reduce false positives in
OSS-Fuzz-Gen, a multi-agent system for automated fuzz driver generation. First,
constraint-based fuzz driver generation proactively enforces constraints on a
function's inputs and state to guide driver creation. Second, context-based
crash validation reactively analyzes function callers to determine whether
reported crashes are feasible from program entry points. Using 1,500 benchmark
functions from OSS-Fuzz, we show that these strategies reduce spurious crashes
by up to 8%, cut reported crashes by more than half, and demonstrate that
frontier LLMs can serve as reliable program analysis agents. Our results
highlight the promise and challenges of integrating AI into large-scale fuzzing
pipelines.