Unmasking Fake Careers: Detecting Machine-Generated Career Trajectories via Multi-layer Heterogeneous Graphs
Michiharu Yamashita 1, Thanh Q. Tran 2, Delvin Ce Zhang 3, Dongwon Lee 1
Published on arXiv
2509.19677
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
CareerScape outperforms state-of-the-art text- and graph-based detectors by a relative 5.8–85.0% at detecting machine-generated career trajectories in structured resume data.
CareerScape
Novel technique introduced
The rapid advancement of Large Language Models (LLMs) has enabled the generation of highly realistic synthetic data. We identify a new vulnerability: LLMs can generate convincing career trajectories for fake resumes. We then explore effective methods for detecting them. To address this challenge, we construct a dataset of machine-generated career trajectories using multiple LLMs and domain-specific generation methods, and demonstrate that conventional text-based detectors perform poorly on structured career data. We propose CareerScape, a novel heterogeneous, hierarchical multi-layer graph framework that models career entities and their relations in a unified global graph built from genuine resumes. Unlike conventional classifiers that treat each instance independently, CareerScape employs a structure-aware framework that augments user-specific subgraphs with trusted neighborhood information from the global graph, enabling the model to capture both global structural patterns and local inconsistencies indicative of synthetic career paths. Experimental results show that CareerScape outperforms state-of-the-art baselines by a relative 5.8–85.0%, highlighting the importance of structure-aware detection for machine-generated content.
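The idea of augmenting a user-specific subgraph with neighborhood information from a global graph of genuine resumes can be illustrated with a minimal sketch. This is not the CareerScape model, whose layers, node types, and learned components are not described in this summary. The node types (`title`, `company`), the relation names, the toy resumes, and the `supported_fraction` score are all assumptions made for illustration, with a simple overlap count standing in for whatever the actual framework learns.

```python
from collections import defaultdict

# A career trajectory is an ordered list of (company, title) positions.
# All resumes below are made-up toy data, not taken from the paper.
GENUINE_RESUMES = [
    [("Acme", "Intern"), ("Acme", "Engineer"), ("Globex", "Senior Engineer")],
    [("Globex", "Engineer"), ("Globex", "Senior Engineer"), ("Initech", "Manager")],
]


def trajectory_edges(resume):
    """Return the typed (source, relation, target) edges induced by one trajectory."""
    edges = set()
    for company, title in resume:
        # Cross-type edge: a title held at a company.
        edges.add((("title", title), "held_at", ("company", company)))
    for (c1, t1), (c2, t2) in zip(resume, resume[1:]):
        # Same-type edges: consecutive titles and consecutive companies.
        edges.add((("title", t1), "next_title", ("title", t2)))
        if c1 != c2:
            edges.add((("company", c1), "next_company", ("company", c2)))
    return edges


def build_global_graph(resumes):
    """Union the edges of all genuine resumes into an adjacency map."""
    adjacency = defaultdict(set)
    for resume in resumes:
        for src, relation, dst in trajectory_edges(resume):
            adjacency[src].add((relation, dst))
    return adjacency


def augment_subgraph(resume, global_graph):
    """Return the user's own edges and those edges plus one-hop global neighbors."""
    user_edges = trajectory_edges(resume)
    user_nodes = {node for src, _, dst in user_edges for node in (src, dst)}
    augmented = set(user_edges)
    for node in user_nodes:
        for relation, dst in global_graph.get(node, ()):
            augmented.add((node, relation, dst))
    return user_edges, augmented


def supported_fraction(resume, global_graph):
    """Share of the user's own edges that also occur in the global graph."""
    user_edges = trajectory_edges(resume)
    if not user_edges:
        return 0.0
    hits = sum(
        (relation, dst) in global_graph.get(src, ())
        for src, relation, dst in user_edges
    )
    return hits / len(user_edges)


global_graph = build_global_graph(GENUINE_RESUMES)
plausible = [("Acme", "Engineer"), ("Globex", "Senior Engineer")]
unusual = [("Initech", "Manager"), ("Acme", "Intern")]
print(supported_fraction(plausible, global_graph))
print(supported_fraction(unusual, global_graph))
```

In this toy setting, a trajectory whose transitions are all attested in genuine resumes scores higher than one containing a manager-to-intern move that the global graph never records. A real detector would feed the augmented subgraph to a trained model instead of counting overlaps.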
Key Contributions
- CareerScape: a heterogeneous, hierarchical multi-layer graph framework that models career entities and relations in a unified global graph to detect LLM-generated career trajectories
- A new benchmark dataset of machine-generated career trajectories produced by multiple LLMs and domain-specific methods
- Empirical demonstration that conventional text-based detectors fail on structured career data, with CareerScape outperforming state-of-the-art baselines by a relative 5.8–85.0%
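The reported gain is relative, not absolute. Assuming the usual definition (the summary states neither the formula nor the evaluation metric), a relative improvement divides the score difference by the baseline score. The function name and the scores below are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline: float, proposed: float) -> float:
    """Relative gain of `proposed` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (proposed - baseline) / baseline * 100.0


# Hypothetical scores for illustration only: a baseline at 0.50 and a
# proposed method at 0.75 differ by 25 points absolute but 50% relative.
print(relative_improvement(0.50, 0.75))  # prints 50.0
```

Under this definition, a weak baseline can yield a large relative gain from a modest absolute difference, which is one reason a range as wide as 5.8–85.0% can arise across baselines.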
🛡️ Threat Analysis
The primary contribution is a novel detection architecture for AI-generated content: a heterogeneous multi-layer graph framework that detects LLM-synthesized career trajectories. This places the work under output integrity and AI-generated content detection.