Semantic Discrepancy-aware Detector for Image Forgery Identification

With the rapid advancement of image generation techniques, robust forgery detection has become increasingly imperative to ensure the trustworthiness of digital media. Recent research indicates that the learned semantic concepts of pre-trained models are critical for identifying fake images. However, the misalignment between the forgery and semantic concept spaces hinders the model's forgery detection performance. To address this problem, we propose a novel Semantic Discrepancy-aware Detector (SDD) that leverages reconstruction learning to align the two spaces at a fine-grained visual level. By exploiting the conceptual knowledge embedded in the pre-trained vision-language model, we design a semantic token sampling module to mitigate the space shifts caused by features irrelevant to both forgery traces and semantic concepts. A concept-level forgery discrepancy learning module, built upon a visual reconstruction paradigm, is proposed to strengthen the interaction between visual semantic concepts and forgery traces, effectively capturing discrepancies under the concepts' guidance. Finally, the low-level forgery feature enhancer integrates the learned concept-level forgery discrepancies to minimize redundant forgery information. Experiments conducted on two standard image forgery datasets demonstrate the efficacy of the proposed SDD, which achieves superior results compared to existing methods. The code is available at https://github.com/wzy1111111/SSD.

Key Contributions

Semantic token sampling module to mitigate feature space misalignment between forgery traces and semantic concepts in pre-trained vision-language models
Concept-level forgery discrepancy learning module built on a visual reconstruction paradigm to capture forgery discrepancies guided by semantic concepts
Low-level forgery feature enhancer that integrates concept-level discrepancies to reduce redundant forgery information
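
As a rough illustration of the first two contributions, the sketch below ranks patch tokens by cosine similarity to a concept embedding and then measures a per-token reconstruction residual. It is a toy under stated assumptions, not the authors' implementation: the function names (`sample_semantic_tokens`, `reconstruction_discrepancy`), the cosine-similarity selection rule, the top-k cutoff, and the Euclidean residual are all hypothetical choices made for illustration, and the summary above does not specify how SDD actually performs either step.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sample_semantic_tokens(patch_tokens, concept_embedding, k):
    """Return the indices of the k patch tokens most similar to the concept.

    Assumption: relevance is scored by cosine similarity; the paper's actual
    sampling criterion is not described in this summary.
    """
    scores = [cosine(token, concept_embedding) for token in patch_tokens]
    ranked = sorted(range(len(patch_tokens)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])  # keep the original token order


def reconstruction_discrepancy(tokens, reconstructed):
    """Per-token Euclidean distance between original and reconstructed tokens.

    Assumption: a larger residual is treated as a stronger forgery cue.
    """
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(token, recon)))
        for token, recon in zip(tokens, reconstructed)
    ]
```

In a real detector the tokens would come from a vision-language model's image encoder and the reconstruction from a learned decoder; here both are plain lists of floats so the two steps can be inspected in isolation.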

🛡️ Threat Analysis

Output Integrity Attack

The paper's primary contribution is a novel AI-generated image detection architecture — directly fitting the ML09 category of AI-generated content detection and output integrity. It proposes forensic components (semantic token sampling, concept-level forgery discrepancy learning) specifically to identify fake/forged images produced by modern generative techniques.

Details

Domains

vision

Model Types

transformer, vlm

Threat Tags

inference_time

Applications

2025 0 cit.

Output Integrity Attack

89%