Process Over Outcome: Cultivating Forensic Reasoning for Generalizable Multimodal Manipulation Detection
Yuchen Zhang 1, Yaxiong Wang 2, Kecheng Han 1, Yujiao Wu 3, Lianwei Wu 4, Li Zhu 1, Zhedong Zheng 5
Published on arXiv
2603.01993
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
REFORM achieves 81.52% ACC on ROM, 76.65% ACC on DGM4, and 74.9 F1 on MMFakeBench, outperforming prior manipulation detection methods and generalizing better to unseen manipulation patterns.
REFORM
Novel technique introduced
Recent advances in generative AI have significantly enhanced the realism of multimodal media manipulation, posing substantial challenges to manipulation detection. Existing manipulation detection and grounding approaches predominantly focus on manipulation-type classification under result-oriented supervision, which not only lacks interpretability but also tends to overfit superficial artifacts. In this paper, we argue that generalizable detection requires explicit forensic reasoning rather than mere classification over a limited set of manipulation types, since such classification fails to generalize to unseen manipulation patterns. To this end, we propose REFORM, a reasoning-driven framework that shifts learning from outcome fitting to process modeling. REFORM adopts a three-stage curriculum that first induces forensic rationales, then aligns reasoning with final judgments, and finally refines logical consistency via reinforcement learning. To support this paradigm, we introduce ROM, a large-scale dataset with rich reasoning annotations. Extensive experiments show that REFORM establishes new state-of-the-art performance with superior generalization, achieving 81.52% ACC on ROM, 76.65% ACC on DGM4, and 74.9 F1 on MMFakeBench.
Key Contributions
- REFORM: a three-stage curriculum framework that models forensic reasoning as a process (rationale induction → judgment alignment → RL-based logical consistency refinement) rather than outcome classification
- ROM: a large-scale multimodal manipulation dataset with rich forensic reasoning annotations to support process-oriented supervision
- Demonstrated superior generalization over existing manipulation detection methods, achieving new SOTA on ROM (81.52% ACC), DGM4 (76.65% ACC), and MMFakeBench (74.9 F1)
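The three-stage curriculum can be pictured as a staged training pipeline that supervises the reasoning process before the verdict. The sketch below is a hypothetical illustration only: all names (`Sample`, `stage1_rationale_induction`, etc.) and the toy reward logic are assumptions for exposition, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    inputs: str      # placeholder for a multimodal (image + text) input
    rationale: str   # annotated forensic reasoning, as in the ROM dataset
    label: str       # "authentic" or "manipulated"

def stage1_rationale_induction(samples):
    """Stage 1: build (input -> rationale) targets so the model learns
    to articulate forensic evidence before judging."""
    return [(s.inputs, s.rationale) for s in samples]

def stage2_judgment_alignment(samples):
    """Stage 2: build ((input, rationale) -> label) targets so the
    final verdict is conditioned on the reasoning, not shortcut cues."""
    return [((s.inputs, s.rationale), s.label) for s in samples]

def stage3_consistency_reward(rationale, label):
    """Stage 3: a toy RL reward — +1 when the rationale supports the
    verdict (here, flagging an inconsistency iff manipulated), -1 otherwise."""
    flags_issue = "inconsistent" in rationale
    return 1 if flags_issue == (label == "manipulated") else -1
```

The point of the sketch is the ordering: rationales are supervised first, judgments are tied to rationales second, and only then is a consistency reward applied, mirroring the process-over-outcome framing.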
🛡️ Threat Analysis
Directly addresses AI-generated content detection by proposing a novel detection architecture (REFORM) for multimodal media manipulation. The paper's core contribution is a generalizable forensic detection method for manipulated images, text, and video, placing it squarely within output integrity and content provenance.