Paper Information
- Author
- Raphael Springer, Alexander Schmitz, Artur Leinweber, Tobias Urban, Christian Dietrich
- Published
- 4-30-2025
- Affiliation
- Westphalian University of Applied Sciences
- Country
- Germany
- Conference
- Computing Research Repository (CoRR)
Abstract
Function detection is a well-known problem in binary analysis. While previous
research has primarily focused on Linux/ELF, Windows/PE binaries have been
overlooked or only partially considered. This paper introduces FuncPEval, a new
dataset for Windows x86 and x64 PE files, featuring Chromium and the Conti
ransomware, along with ground truth data for 1,092,820 function starts.
Utilizing FuncPEval, we evaluate five heuristics-based (Ghidra, IDA, Nucleus,
rev.ng, SMDA) and three machine-learning-based (DeepDi, RNN, XDA) function
start detection tools. Among the tested tools, IDA achieves the highest
F1-score (98.44%) for Chromium x64, while DeepDi follows closely (97%) and
stands out as the fastest by a significant margin. Working towards
explainability, we examine the impact of padding between functions on the
detection results. Our analysis shows that all tested tools, except rev.ng, are
susceptible to randomized padding. Randomized padding significantly
diminishes the effectiveness of the RNN, XDA, and Nucleus. Among the
learning-based tools, DeepDi is the least sensitive and is the fastest
overall, while Nucleus is the most adversely affected among the
non-learning-based tools. In addition, we improve the recurrent neural
network (RNN) proposed by Shin et al. and enhance the XDA tool, increasing the
F1-score by approximately 10%.
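
The abstract reports F1-scores for function start detection. As a minimal sketch of how such a score is commonly computed, the snippet below compares a set of predicted function start addresses against a ground truth set. The function name `score_function_starts` and the example addresses are hypothetical and chosen for illustration only; they are not taken from FuncPEval or from any of the evaluated tools, and the paper's exact matching rules may differ.

```python
def score_function_starts(predicted, ground_truth):
    """Return (precision, recall, f1) for two collections of start addresses.

    Assumption: a prediction counts as correct only if it matches a ground
    truth address exactly. The paper's matching rules may differ.
    """
    predicted = set(predicted)
    ground_truth = set(ground_truth)

    true_pos = len(predicted & ground_truth)   # starts found correctly
    false_pos = len(predicted - ground_truth)  # reported, but not real starts
    false_neg = len(ground_truth - predicted)  # real starts that were missed

    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical addresses, for illustration only.
truth = {0x1000, 0x1040, 0x1080, 0x10C0}
found = {0x1000, 0x1040, 0x1100}
precision, recall, f1 = score_function_starts(found, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a tool that reports many spurious starts (low precision) or misses many real ones (low recall) is penalized in either case.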