
SEED: A Large-Scale Benchmark for Provenance Tracing in Sequential Deepfake Facial Edits

Mengieong Hoi 1, Zhedong Zheng 1, Ping Liu 2, Wei Liu 3


Published on arXiv (arXiv:2604.10522)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

High-frequency wavelet components provide effective forensic cues for tracing sequential edits even under image degradation, outperforming spatial-only approaches

FAITH

Novel technique introduced


Deepfake content on social networks is increasingly produced through multiple \emph{sequential} edits to biometric data such as facial imagery. Consequently, the final appearance of an image often reflects a latent chain of operations rather than a single manipulation. Recovering these editing histories is essential for visual provenance analysis, misinformation auditing, and forensic or platform moderation workflows that must trace the origin and evolution of AI-generated media. However, existing datasets predominantly focus on single-step editing and overlook the cumulative artifacts introduced by realistic multi-step pipelines. To address this gap, we introduce Sequential Editing in Diffusion (\textbf{SEED}), a large-scale benchmark for sequential provenance tracing in facial imagery. SEED contains over 90K images constructed via one to four sequential attribute edits using diffusion-based editing pipelines, with fine-grained annotations including edit order, textual instructions, manipulation masks, and generation models. These metadata enable step-wise evidence analysis and support both forgery detection and edit-sequence prediction. To benchmark the challenges posed by SEED, we evaluate representative analysis strategies and observe that spatial-only approaches struggle under subtle and distributed diffusion artifacts, especially when such artifacts accumulate across multiple edits. Motivated by this observation, we further introduce \textbf{FAITH}, a frequency-aware Transformer baseline that aggregates spatial and frequency-domain cues to identify and order latent editing events. Results show that high-frequency signals, particularly wavelet components, provide effective cues even under image degradation. Overall, SEED facilitates systematic study of sequential provenance tracing and evidence aggregation for trustworthy analysis of AI-generated visual content.
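The wavelet components the abstract refers to come from a 2-D discrete wavelet transform, which splits an image into a low-frequency approximation and three high-frequency detail subbands; it is these detail subbands that carry the subtle diffusion artifacts. As a minimal, hedged sketch (not the paper's implementation), a one-level Haar transform can be written directly in NumPy:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar wavelet transform (illustrative sketch).

    Returns the low-frequency approximation (LL) and the three
    high-frequency detail subbands (LH, HL, HH); the detail subbands
    are the kind of high-frequency cue the paper reports as effective
    forensic evidence.
    """
    # Pairwise averages (low-pass) and differences (high-pass) along columns,
    # then the same along rows of the intermediate results.
    a = (img[:, 0::2] + img[:, 1::2]) / 2.0   # horizontal low-pass
    d = (img[:, 0::2] - img[:, 1::2]) / 2.0   # horizontal high-pass
    ll = (a[0::2, :] + a[1::2, :]) / 2.0      # smooth approximation
    lh = (a[0::2, :] - a[1::2, :]) / 2.0      # horizontal edges
    hl = (d[0::2, :] + d[1::2, :]) / 2.0      # vertical edges
    hh = (d[0::2, :] - d[1::2, :]) / 2.0      # diagonal detail
    return ll, lh, hl, hh

# Toy 4x4 grayscale "image"; each subband is half the size per axis.
img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # (2, 2)
```

In practice a library such as PyWavelets (`pywt.dwt2`) would be used instead of this hand-rolled version; the sketch only shows where the high-frequency subbands come from.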


Key Contributions

  • SEED benchmark with 90K+ images covering 1-4 sequential diffusion-based facial edits with fine-grained annotations (edit order, text instructions, masks)
  • FAITH frequency-aware Transformer baseline that combines spatial and wavelet features for sequential provenance tracing
  • Demonstration that frequency-domain cues (especially wavelets) are effective for detecting cumulative diffusion artifacts across multi-step edits
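The second contribution describes combining spatial and wavelet features per image region before feeding them to a Transformer. The sketch below is a deliberately simplified, hypothetical illustration of that fusion idea: each non-overlapping patch is summarized by a spatial statistic (mean intensity) and a frequency statistic (high-pass energy), giving a token-like sequence. FAITH's actual architecture and feature extractors are not reproduced here.

```python
import numpy as np

def fused_patch_features(img, patch=8):
    """Hypothetical spatial + frequency feature fusion per patch.

    For each non-overlapping patch we concatenate a spatial cue
    (mean intensity) with a frequency cue (energy of a row-wise Haar
    high-pass), yielding one 2-D feature per patch -- a toy stand-in
    for the token sequence a Transformer encoder would consume.
    """
    h, w = img.shape
    feats = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            p = img[i:i + patch, j:j + patch]
            spatial = float(p.mean())
            # Cheap high-frequency proxy: Haar high-pass along rows.
            hp = (p[:, 0::2] - p[:, 1::2]) / 2.0
            freq = float((hp ** 2).mean())
            feats.append([spatial, freq])
    return np.asarray(feats)  # shape: (num_patches, 2)

# A 32x32 image with 8x8 patches gives a 16-token sequence.
tokens = fused_patch_features(np.random.default_rng(0).random((32, 32)))
print(tokens.shape)  # (16, 2)
```

A real frequency-aware model would use full multi-band wavelet statistics and learned embeddings rather than two scalars, but the shape of the idea — per-region spatial and frequency cues fused into one sequence — is the same.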

🛡️ Threat Analysis

Output Integrity Attack

Paper focuses on detecting AI-generated/manipulated facial content and tracing the provenance of sequential edits — this is output integrity and authenticity analysis. The SEED benchmark and FAITH detector target verifying whether images are authentic or AI-edited, and recovering the editing history.


Details

Domains
vision, generative
Model Types
diffusion, transformer
Threat Tags
inference_time
Datasets
SEED
Applications
deepfake detection, visual provenance analysis, content authenticity verification, facial manipulation detection