
Detecting Localized Deepfakes: How Well Do Synthetic Image Detectors Handle Inpainting?

Serafino Pandolfini, Lorenzo Pellegrini, Matteo Ferrara, Davide Maltoni

0 citations · 46 references · arXiv


Published on arXiv (arXiv:2512.16688)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Detectors trained on large, diverse generator sets show partial transferability to inpainting edits and reliably detect medium- and large-area manipulations, outperforming many ad hoc inpainting-specific approaches.


The rapid progress of generative AI has enabled highly realistic image manipulations, including inpainting and region-level editing. These approaches preserve most of the original visual context and are increasingly exploited in cybersecurity-relevant threat scenarios. While numerous detectors have been proposed for identifying fully synthetic images, their ability to generalize to localized manipulations remains insufficiently characterized. This work presents a systematic evaluation of state-of-the-art detectors, originally trained for deepfake detection on fully synthetic images, when applied to a distinct challenge: localized inpainting detection. The study leverages multiple datasets spanning diverse generators, mask sizes, and inpainting techniques. Our experiments show that models trained on a large set of generators exhibit partial transferability to inpainting-based edits and can reliably detect medium- and large-area manipulations or regeneration-style inpainting, outperforming many existing ad hoc detection approaches.


Key Contributions

  • Systematic evaluation protocol for state-of-the-art synthetic image detectors applied to localized inpainting detection across diverse generators, mask sizes, and manipulation types
  • Analysis of transferability gaps between fully synthetic image detection and localized deepfake detection, identifying conditions (large masks, regeneration-style inpainting) where transfer succeeds
  • Actionable insights on which detector design factors (training generator diversity) most influence generalization to inpainting-based manipulations
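The evaluation protocol above, scoring inpainted images and breaking results down by manipulated-area fraction, can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the bin edges, the fixed decision threshold, and the input format `(mask_area_fraction, detector_score)` are all assumptions made for the example.

```python
def detection_rate_by_mask_size(samples, threshold=0.5,
                                edges=(0.0, 0.1, 0.3, 1.0)):
    """Compute per-bin detection rates for inpainted images.

    samples: iterable of (mask_area_fraction, detector_score) pairs,
             where mask_area_fraction is the edited portion of the
             image in [0, 1] and detector_score is the detector's
             "synthetic" confidence. Bin edges and the 0.5 threshold
             are illustrative choices, not values from the paper.
    Returns a dict mapping 'small' / 'medium' / 'large' to the
    fraction of inpainted images flagged (score > threshold),
    or None for empty bins.
    """
    labels = ("small", "medium", "large")
    hits = {name: 0 for name in labels}
    totals = {name: 0 for name in labels}
    for frac, score in samples:
        # Assign to the first bin whose range contains the fraction
        # (boundary values fall into the earlier bin).
        for i in range(len(edges) - 1):
            if edges[i] <= frac <= edges[i + 1]:
                name = labels[i]
                break
        totals[name] += 1
        hits[name] += score > threshold
    return {name: hits[name] / totals[name] if totals[name] else None
            for name in labels}
```

A breakdown like this makes the paper's key finding directly measurable: detection rates in the "medium" and "large" bins should be markedly higher than in the "small" bin for detectors that transfer well.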

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated content detection — specifically evaluating whether deepfake detectors generalize from fully synthetic images to localized inpainting manipulations, which is an output integrity and content authenticity problem.


Details

Domains
vision, generative
Model Types
CNN, transformer, diffusion
Threat Tags
inference_time, digital
Datasets
AI-GenBench
Applications
deepfake detection, image forensics, media integrity, disinformation detection