Generation-based fuzzing is a software testing approach that can discover
many types of bugs and vulnerabilities in software. Designing and fine-tuning
classical fuzzers to achieve acceptable coverage is, however, known to be very
time-consuming, even for small-scale software systems.
To address this issue, we investigate a machine learning-based approach to fuzz
testing: we outline a family of test-case generators based on Recurrent
Neural Networks (RNNs) and train them on readily available datasets with a
minimum of human fine-tuning. In contrast to previous work, the proposed
generators do not rely on heuristic sampling strategies but on principled
sampling from the predictive distributions. We provide a detailed analysis to
demonstrate the characteristics and efficacy of the proposed generators in a
challenging web browser testing scenario. The empirical results show that the
RNN-based generators provide better coverage than a mutation-based method and
discover paths not discovered by a classical fuzzer. Our results supplement
findings in other domains, suggesting that generation-based fuzzing with RNNs
is a viable route to better software quality, provided that a suitable model
selection/analysis procedure is used.
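
To make the idea of sampling from the predictive distribution concrete, the following is a minimal sketch of how a test case could be drawn token by token from a sequence model. It is an illustration under stated assumptions, not the generators evaluated in this work: the names `softmax`, `sample_sequence`, `toy_next_logits`, the four-token vocabulary, and the lookup table `TOY_LOGITS` are all hypothetical, and the table merely stands in for a trained RNN that would map a prefix to next-token logits.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(next_logits, vocab, start, end, max_len, rng):
    """Draw one sequence by sampling each token from p(token | prefix).

    No argmax, beam search, or hand-tuned cut-off is applied: every token is
    drawn directly from the model's predictive distribution.
    """
    seq = [start]
    while len(seq) < max_len and seq[-1] != end:
        probs = softmax(next_logits(seq))
        seq.append(rng.choices(vocab, weights=probs, k=1)[0])
    return seq


# Hypothetical stand-in for a trained RNN: logits depend only on the last
# token. A real model would condition on the whole prefix via its hidden state.
VOCAB = ["<", "a", ">", "$"]  # "$" marks end of sequence (assumption)
TOY_LOGITS = {
    "<": [0.0, 2.0, 0.0, -1.0],
    "a": [-1.0, 1.0, 2.0, -1.0],
    ">": [1.0, -1.0, -1.0, 1.5],
}


def toy_next_logits(prefix):
    return TOY_LOGITS[prefix[-1]]


if __name__ == "__main__":
    rng = random.Random(0)
    print("".join(sample_sequence(toy_next_logits, VOCAB, "<", "$", 20, rng)))
```

Seeding the random number generator makes a generated test case reproducible, which helps when a sampled input needs to be replayed against the system under test.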