benchmark 2026

TGIF2: Extended Text-Guided Inpainting Forgery Dataset & Benchmark

Hannes Mareen 1, Dimitrios Karageorgiou 2, Paschalis Giakoumoglou 2, Peter Lambert 1, Symeon Papadopoulos 2, Glenn Van Wallendael 1

0 citations

Published on arXiv

2603.28613

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

IFL and SID methods degrade on FLUX.1 manipulations, and generative super-resolution significantly weakens forensic traces, undermining current forensic pipelines

TGIF2

Novel technique introduced


Generative AI has made text-guided inpainting a powerful image editing tool, but at the same time a growing challenge for media forensics. Existing benchmarks, including our text-guided inpainting forgery (TGIF) dataset, show that image forgery localization (IFL) methods can localize manipulations in spliced images but struggle in fully regenerated (FR) images, while synthetic image detection (SID) methods can detect fully regenerated images but cannot perform localization. With new generative inpainting models emerging and the open problem of localization in FR images remaining, updated datasets and benchmarks are needed. We introduce TGIF2, an extended version of TGIF that captures recent advances in text-guided inpainting and enables a deeper analysis of forensic robustness. TGIF2 augments the original dataset with edits generated by FLUX.1 models, as well as with random non-semantic masks. Using the TGIF2 dataset, we conduct a forensic evaluation spanning IFL and SID, including fine-tuning IFL methods on FR images and generative super-resolution attacks. Our experiments show that both IFL and SID methods degrade on FLUX.1 manipulations, highlighting limited generalization. Additionally, while fine-tuning improves localization on FR images, evaluation with random non-semantic masks reveals object bias. Furthermore, generative super-resolution significantly weakens forensic traces, demonstrating that common image enhancement operations can undermine current forensic pipelines. In summary, TGIF2 provides an updated dataset and benchmark, which enables new insights into the challenges posed by modern inpainting and AI-based image enhancements. TGIF2 is available at https://github.com/IDLabMedia/tgif-dataset.
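To make the localization task concrete, the sketch below scores a predicted forgery heatmap against a binary ground-truth inpainting mask using pixel-level F1 and IoU. This is a generic illustration, not the paper's evaluation code: the function name `localization_scores`, the 0.5 threshold, and the choice of these two metrics are assumptions.

```python
import numpy as np

def localization_scores(pred, truth, threshold=0.5):
    """Return (f1, iou) for a predicted heatmap vs. a binary ground-truth mask.

    Assumption: `pred` holds per-pixel forgery scores in [0, 1] and `truth`
    marks manipulated pixels with nonzero values. The threshold is illustrative.
    """
    pred_bin = np.asarray(pred) >= threshold
    truth_bin = np.asarray(truth).astype(bool)

    tp = np.logical_and(pred_bin, truth_bin).sum()   # forged pixels found
    fp = np.logical_and(pred_bin, ~truth_bin).sum()  # pristine pixels flagged
    fn = np.logical_and(~pred_bin, truth_bin).sum()  # forged pixels missed

    # If both masks are empty, treat the prediction as a perfect match.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) > 0 else 1.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) > 0 else 1.0
    return float(f1), float(iou)
```

A detector that flags a region twice as large as the true edit keeps full recall but loses precision, which both scores penalize.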


Key Contributions

  • TGIF2 dataset extending TGIF with FLUX.1-generated inpainting manipulations and random non-semantic masks
  • Comprehensive benchmark evaluating image forgery localization (IFL) and synthetic image detection (SID) methods on modern inpainting models
  • Evaluation of generative super-resolution as an attack vector that weakens forensic traces in manipulated images
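As a rough illustration of what a random non-semantic mask could look like, the sketch below draws a rectangle whose position and size are independent of image content, so it does not follow object boundaries. This summary does not describe how TGIF2 actually generates its masks; the rectangular shape, the size fractions, and the name `random_rect_mask` are all assumptions.

```python
import numpy as np

def random_rect_mask(height, width, min_frac=0.1, max_frac=0.4, seed=None):
    """Return a uint8 mask (1 = region to inpaint) with one random rectangle.

    Assumption: side lengths are drawn uniformly between min_frac and
    max_frac of the image side. These values are illustrative only.
    """
    rng = np.random.default_rng(seed)

    def side(total):
        low = max(1, int(min_frac * total))
        high = max(low + 1, int(max_frac * total) + 1)  # exclusive upper bound
        return min(int(rng.integers(low, high)), total)

    h, w = side(height), side(width)
    top = int(rng.integers(0, height - h + 1))
    left = int(rng.integers(0, width - w + 1))

    mask = np.zeros((height, width), dtype=np.uint8)
    mask[top:top + h, left:left + w] = 1
    return mask
```

Because such a mask ignores image semantics, a localization method that has learned to flag whole objects would be expected to score poorly on it, which is the kind of object bias the benchmark reports.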

🛡️ Threat Analysis

Output Integrity Attack

Paper focuses on detecting and localizing AI-generated image manipulations (deepfakes created via text-guided inpainting) and evaluating forensic methods against adversarial enhancements like generative super-resolution that remove forensic traces. This is output integrity and AI-generated content detection.


Details

Domains
vision, multimodal, generative
Model Types
diffusion, transformer
Threat Tags
inference_time, digital
Datasets
TGIF, TGIF2
Applications
image forgery detection, deepfake detection, media forensics