As machine learning gains prominence in automated decision-making across
various sectors of society, concerns have arisen regarding potential
vulnerabilities in machine learning (ML) frameworks. Nevertheless, testing
these frameworks is a daunting task due to their intricate implementations.
Previous research on fuzzing ML frameworks has struggled to extract input
constraints effectively and to generate valid inputs, so that long fuzzing
campaigns are needed to reach deep execution paths or to trigger a target crash.
In this paper, we propose ConFL, a constraint-guided fuzzer for ML
frameworks. ConFL automatically extracts constraints from kernel code
without requiring any prior knowledge. Guided by these constraints, ConFL
generates valid inputs that pass the verification checks and explore deeper
paths of the kernel code. In addition, we design a grouping technique to boost
fuzzing efficiency.
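
As a rough illustration of what constraint-guided input generation means here,
the following Python sketch samples operator arguments from a set of
constraints and checks them against the same constraints. The `ArgConstraint`
record, its fields, and the helper names are hypothetical and chosen only for
illustration; they are not ConFL's actual constraint format or implementation,
and the constraint-extraction step is not shown.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgConstraint:
    """Hypothetical constraint on one operator argument (illustrative only)."""
    name: str         # argument name
    dtype: str        # e.g. "int32" or "float32"
    rank: int         # required number of dimensions (0 means scalar)
    min_value: float  # inclusive lower bound on element values
    max_value: float  # inclusive upper bound on element values


def generate_arg(c, rng, max_dim=4):
    """Sample one argument whose dtype, rank, and value range match c."""
    shape = [rng.randint(1, max_dim) for _ in range(c.rank)]
    count = 1
    for dim in shape:
        count *= dim
    if c.dtype.startswith("int"):
        values = [rng.randint(int(c.min_value), int(c.max_value))
                  for _ in range(count)]
    else:
        values = [rng.uniform(c.min_value, c.max_value) for _ in range(count)]
    return {"dtype": c.dtype, "shape": shape, "values": values}


def satisfies(arg, c):
    """Check a generated argument against its constraint."""
    return (
        arg["dtype"] == c.dtype
        and len(arg["shape"]) == c.rank
        and all(c.min_value <= v <= c.max_value for v in arg["values"])
    )


def generate_inputs(constraints, seed=0):
    """Build one input per constraint, keyed by argument name."""
    rng = random.Random(seed)
    return {c.name: generate_arg(c, rng) for c in constraints}


if __name__ == "__main__":
    # Assumed constraints for an imaginary operator, not taken from any
    # real framework kernel.
    constraints = [
        ArgConstraint("input", "float32", rank=4, min_value=-1.0, max_value=1.0),
        ArgConstraint("axis", "int32", rank=0, min_value=0, max_value=3),
    ]
    inputs = generate_inputs(constraints, seed=1)
    for c in constraints:
        print(c.name, inputs[c.name]["shape"], satisfies(inputs[c.name], c))
```

Inputs sampled this way satisfy the stated constraints by construction, which
is the property that lets a fuzzer spend its time past the argument checks
rather than being rejected by them.
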
To demonstrate the effectiveness of ConFL, we evaluate its performance
mainly on TensorFlow. We find that ConFL covers more lines of code and
generates more valid inputs than state-of-the-art (SOTA) fuzzers. More
importantly, ConFL found 84 previously unknown vulnerabilities in different
versions of TensorFlow, all of which were assigned new CVE IDs; 3 of them
were rated critical severity and 13 high severity. We also extended ConFL to
test PyTorch and Paddle, where 7 vulnerabilities have been found to date.