Abstract
MLFuzz, a work accepted at ACM FSE 2023, revisits the performance of NEUZZ,
a machine-learning-based fuzzer. We demonstrate that its main conclusion is
entirely wrong, owing to several fatal implementation bugs and flawed
evaluation setups: an initialization bug in persistent mode, a program crash,
an error in training dataset collection, and a mistake in fuzzing result
collection. Additionally, MLFuzz uses noisy training datasets without
sufficient cleaning and preprocessing, which further contributes to the
drastic performance drop it reports for NEUZZ. We address these issues and
provide a corrected implementation and evaluation setup, showing that NEUZZ
consistently outperforms AFL on the FuzzBench dataset. Finally, we reflect
on the evaluation methods used in MLFuzz and offer practical advice on
conducting fair and scientific fuzzing evaluations.