
VIGIL: Part-Grounded Structured Reasoning for Generalizable Deepfake Detection

Xinghan Li , Junhao Xu , Jingjing Chen



Published on arXiv (2603.21526)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

VIGIL, trained on only three foundational generators, consistently outperforms both expert CNN-based detectors and concurrent MLLM-based deepfake detection methods across all five generalizability levels.

VIGIL

Novel technique introduced


Multimodal large language models (MLLMs) offer a promising path toward interpretable deepfake detection by generating textual explanations. However, current MLLM-based methods merge evidence generation and manipulation localization into a single reasoning step, blurring the boundary between faithful observations and hallucinated explanations and leading to unreliable conclusions. To address this, we present VIGIL, a part-centric structured forensic framework inspired by expert forensic practice and organized as a plan-then-examine pipeline: the model first plans which facial parts warrant inspection based on global visual cues, then examines each part with independently sourced forensic evidence. A stage-gated injection mechanism delivers part-level forensic evidence only during examination, ensuring that part selection remains driven by the model's own perception rather than biased by external signals. We further propose a progressive three-stage training paradigm whose reinforcement learning stage employs part-aware rewards to enforce anatomical validity and evidence-conclusion coherence. To enable rigorous generalizability evaluation, we construct OmniFake, a hierarchical five-level benchmark on which a model trained on only three foundational generators is progressively tested, up to in-the-wild social-media data. Extensive experiments on OmniFake and cross-dataset evaluations demonstrate that VIGIL consistently outperforms both expert detectors and concurrent MLLM-based methods across all generalizability levels.
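The plan-then-examine pipeline with stage gating can be sketched as follows. This is an illustrative assumption of the control flow only, not the authors' implementation: the part vocabulary, scores, thresholds, and all function names (`plan`, `examine`, `detect`) are hypothetical stand-ins for the MLLM's reasoning stages.

```python
# Hypothetical sketch of a plan-then-examine pipeline with stage-gated
# evidence injection. Part names, scores, and thresholds are illustrative.
from dataclasses import dataclass

FACIAL_PARTS = ["eyes", "nose", "mouth", "skin", "hair"]  # assumed vocabulary


@dataclass
class PartVerdict:
    part: str
    evidence: str
    suspicious: bool


def plan(global_cues: dict) -> list[str]:
    """Stage 1: select parts to inspect from global visual cues only.
    Forensic evidence is withheld here (stage gating), so part selection
    is driven by the model's own perception."""
    return [p for p in FACIAL_PARTS if global_cues.get(p, 0.0) > 0.5]


def examine(part: str, forensic_evidence: dict) -> PartVerdict:
    """Stage 2: part-level forensic evidence (e.g. frequency-domain or
    pixel-level signals) is injected only now, per selected part."""
    score = forensic_evidence.get(part, 0.0)
    return PartVerdict(part, f"artifact score={score:.2f}", score > 0.7)


def detect(global_cues: dict, forensic_evidence: dict) -> bool:
    planned = plan(global_cues)  # evidence not visible yet
    verdicts = [examine(p, forensic_evidence) for p in planned]
    return any(v.suspicious for v in verdicts)  # fake if any part flagged
```

For example, `detect({"eyes": 0.9}, {"eyes": 0.8})` flags the image as fake because the planner selects the eyes from global cues and the examiner then finds a high artifact score there; the key design point is that `plan` never sees `forensic_evidence`.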


Key Contributions

  • Part-centric structured reasoning framework (plan-then-examine) that decouples forensic claims from independently sourced region-specific evidence
  • Context-aware dynamic signal injection mechanism delivering part-level frequency-domain and pixel-level forensic evidence only during examination stage
  • Progressive three-stage training with RL-based part-aware rewards enforcing anatomical validity and evidence-conclusion coherence
  • OmniFake benchmark: hierarchical 5-level generalizability evaluation from in-domain to in-the-wild social media data
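The part-aware reward from the RL stage might combine its two stated criteria as in the sketch below. The equal weighting, the part set, and the binary coherence term are assumptions for illustration; the paper does not specify this exact form.

```python
# Hypothetical part-aware reward: anatomical validity of cited parts plus
# evidence-conclusion coherence. Weights and part list are assumptions.
VALID_PARTS = {"eyes", "nose", "mouth", "skin", "hair"}


def part_aware_reward(mentioned_parts: list, evidence_flags: dict,
                      conclusion_fake: bool) -> float:
    """mentioned_parts: facial parts cited in the model's reasoning;
    evidence_flags: {part: True if that part's evidence shows artifacts};
    conclusion_fake: the model's final fake/real verdict."""
    # Anatomical validity: fraction of cited parts that are real facial parts
    validity = sum(p in VALID_PARTS for p in mentioned_parts) / max(len(mentioned_parts), 1)
    # Coherence: the verdict must agree with whether any cited part's
    # evidence was actually flagged as manipulated
    any_flagged = any(evidence_flags.get(p, False) for p in mentioned_parts)
    coherence = 1.0 if any_flagged == conclusion_fake else 0.0
    return 0.5 * validity + 0.5 * coherence  # assumed equal weighting
```

Under this sketch, a response that cites a non-facial "part" or concludes "fake" without any flagged evidence earns a reduced reward, which is the behavior the paper's anatomical-validity and evidence-conclusion-coherence terms are designed to enforce.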

🛡️ Threat Analysis

Output Integrity Attack

The primary contribution is detecting AI-generated facial images (deepfakes) and distinguishing them from authentic content: AI-generated content detection, a core ML09 (Output Integrity Attack) task. The paper builds a detection system for synthetic-face verification and content authenticity.


Details

Domains
vision, multimodal, generative
Model Types
multimodal, transformer, gan, diffusion
Threat Tags
inference_time, digital
Datasets
OmniFake
Applications
deepfake detection, facial image authentication, synthetic media verification