Abstract
Fuzzing consists of repeatedly testing an application with modified, or
fuzzed, inputs with the goal of finding security vulnerabilities in
input-parsing code. In this paper, we show how to automate the generation of an
input grammar suitable for input fuzzing using sample inputs and
neural-network-based statistical machine-learning techniques. We present a
detailed case study with a complex input format, namely PDF, and a large,
complex, security-critical parser for this format, namely the PDF parser
embedded in Microsoft's new Edge browser. We discuss (and measure) the tension
between conflicting learning and fuzzing goals: learning wants to capture the
structure of well-formed inputs, while fuzzing wants to break that structure in
order to cover unexpected code paths and find bugs. We also present a new
algorithm for this learn&fuzz challenge which uses a learnt input probability
distribution to intelligently guide where to fuzz inputs.
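
As a rough illustration of the idea of letting a learnt distribution decide where to fuzz, the following minimal sketch may help. It rests on several assumptions that are not taken from the abstract: a character bigram count model stands in for the neural-network model, the parameters `fuzz_rate` and `confidence` are hypothetical, and the rule of substituting the least likely character at positions where the model is confident is only one plausible way to use the distribution, not necessarily the algorithm presented in this paper.

```python
import random
from collections import Counter, defaultdict


def train_bigram(samples):
    """Count next-character frequencies; a toy stand-in for a learnt model."""
    counts = defaultdict(Counter)
    for text in samples:
        for prev, nxt in zip(text, text[1:]):
            counts[prev][nxt] += 1
    return counts


def next_char_distribution(counts, prev):
    """Return {char: probability} for the character following `prev`."""
    total = sum(counts[prev].values())
    return {ch: n / total for ch, n in counts[prev].items()}


def generate(counts, start, length, fuzz_rate, confidence, rng):
    """Sample characters from the model, occasionally injecting an unlikely one.

    Where the model is confident (top probability >= `confidence`), the
    least likely known character is emitted with probability `fuzz_rate`.
    Both parameters are hypothetical names chosen for this sketch.
    """
    out = [start]
    while len(out) < length:
        dist = next_char_distribution(counts, out[-1])
        if not dist:  # no observed successor: stop early
            break
        chars = list(dist)
        weights = [dist[c] for c in chars]
        # Learning goal: follow the structure seen in well-formed samples.
        choice = rng.choices(chars, weights=weights, k=1)[0]
        # Fuzzing goal: break that structure where the model is most certain.
        if max(weights) >= confidence and rng.random() < fuzz_rate:
            choice = min(dist, key=dist.get)
        out.append(choice)
    return "".join(out)


if __name__ == "__main__":
    model = train_bigram(["ab"] * 9 + ["ac"])
    print(generate(model, "a", 2, fuzz_rate=1.0, confidence=0.8,
                   rng=random.Random(0)))
```

With `fuzz_rate` set to 0 the generator only reproduces transitions observed in the samples, while larger values trade well-formedness for anomalous inputs, which mirrors the tension between learning and fuzzing described above.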