The rapid advancement of Large Language Models (LLMs) has enabled the
generation of highly realistic synthetic data. We identify a new vulnerability:
LLMs can generate convincing career trajectories in fake resumes, and we explore
effective detection methods. To address this challenge, we construct a dataset
of machine-generated career trajectories using several LLMs and generation methods, and
demonstrate that conventional text-based detectors perform poorly on structured
career data. We propose CareerScape, a novel heterogeneous, hierarchical
multi-layer graph framework that models career entities and their relations in
a unified global graph built from genuine resumes. Unlike conventional
classifiers that treat each instance independently, CareerScape employs a
structure-aware framework that augments user-specific subgraphs with trusted
neighborhood information from a global graph, enabling the model to capture
both global structural patterns and local inconsistencies indicative of
synthetic career paths. Experimental results show that CareerScape achieves
relative improvements of 5.8% to 85.0% over state-of-the-art baselines,
highlighting the importance of structure-aware detection for machine-generated content.
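
To make the idea of augmenting a user-specific subgraph with trusted neighborhood information concrete, the following is a minimal sketch in Python. It is not the CareerScape implementation: the graph schema (typed company and title nodes, edges for employment and for consecutive titles), the function names, and the `support_ratio` score are all assumptions chosen for illustration, and the sketch uses plain dictionaries rather than a learned graph model.

```python
from collections import defaultdict


def _resume_edges(resume):
    """Yield undirected edges for one resume (hypothetical schema).

    A resume is a chronological list of (company, title) pairs. Nodes are
    typed tuples so that companies and titles stay distinct node types.
    """
    prev_title = None
    for company, title in resume:
        c, t = ("company", company), ("title", title)
        yield frozenset((c, t))               # title held at company
        if prev_title is not None and prev_title != t:
            yield frozenset((prev_title, t))  # transition between titles
        prev_title = t


def build_global_graph(genuine_resumes):
    """Build an adjacency map from trusted (genuine) resumes only."""
    graph = defaultdict(set)
    for resume in genuine_resumes:
        for edge in _resume_edges(resume):
            a, b = tuple(edge)
            graph[a].add(b)
            graph[b].add(a)
    return graph


def augment_subgraph(resume, global_graph):
    """Return (user_edges, augmented_edges).

    The augmented set adds every trusted 1-hop neighbor of each node that
    appears in the user's own subgraph.
    """
    user_edges = set(_resume_edges(resume))
    nodes = {n for e in user_edges for n in e}
    augmented = set(user_edges)
    for n in nodes:
        for nb in global_graph.get(n, ()):
            augmented.add(frozenset((n, nb)))
    return user_edges, augmented


def support_ratio(resume, global_graph):
    """Fraction of the user's edges that also exist in the global graph.

    A low value marks a local inconsistency; this is an illustrative
    stand-in for a learned classifier, not a detection method from the paper.
    """
    user_edges, _ = augment_subgraph(resume, global_graph)
    if not user_edges:
        return 0.0
    supported = 0
    for e in user_edges:
        a, b = tuple(e)
        if b in global_graph.get(a, ()):
            supported += 1
    return supported / len(user_edges)
```

In this sketch, a resume whose transitions all appear in the trusted graph scores 1.0, while a trajectory with unseen company-title or title-title links scores lower. A structure-aware model would consume the augmented subgraph itself rather than this single ratio.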
External datasets
dataset of machine-generated career trajectories
real resumes from two datasets used in career trajectory research
4,555 verified genuine resumes
1,000 examples for each category of synthetic resumes
4,000 synthetic career trajectories generated with GPT-4o, LLaMA-3, Gemini-2.0, and Agent