CLUE: Leveraging Low-Rank Adaptation to Capture Latent Uncovered Evidence for Image Forgery Localization

The increasing accessibility of image editing tools and generative AI has led to a proliferation of visually convincing forgeries, compromising the authenticity of digital media. In this paper, in addition to leveraging distortions from conventional forgeries, we repurpose the mechanism of a state-of-the-art (SOTA) text-to-image synthesis model by exploiting its internal generative process, turning it into a high-fidelity forgery localization tool. To this end, we propose CLUE (Capture Latent Uncovered Evidence), a framework that employs Low-Rank Adaptation (LoRA) to parameter-efficiently reconfigure Stable Diffusion 3 (SD3) as a forensic feature extractor. Our approach begins with the strategic use of SD3's Rectified Flow (RF) mechanism to inject noise at varying intensities into the latent representation, thereby steering the LoRA-tuned denoising process to amplify subtle statistical inconsistencies indicative of a forgery. To complement the latent analysis with high-level semantic context and precise spatial details, our method incorporates contextual features from the image encoder of the Segment Anything Model (SAM), which is parameter-efficiently adapted to better trace the boundaries of forged regions. Extensive evaluations demonstrate CLUE's SOTA generalization performance, significantly outperforming prior methods. Furthermore, CLUE shows superior robustness against common post-processing attacks and Online Social Networks (OSNs). Code is publicly available at https://github.com/SZAISEC/CLUE.
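
The noise-injection step described in the abstract can be sketched as follows, assuming the standard rectified-flow interpolation z_t = (1 - t) z_0 + t ε between a clean latent z_0 and Gaussian noise ε. The function names, the latent shape, and the noise levels are illustrative assumptions, not values taken from the paper, and a plain NumPy array stands in for an actual SD3 latent.

```python
import numpy as np

def rf_noise(latent, t, rng):
    """Rectified-flow forward interpolation: z_t = (1 - t) * z_0 + t * eps."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    eps = rng.standard_normal(latent.shape)  # Gaussian noise with the latent's shape
    return (1.0 - t) * latent + t * eps

def noised_views(latent, levels=(0.1, 0.3, 0.5), seed=0):
    """Return one noised copy of the latent per noise level (levels are assumed)."""
    rng = np.random.default_rng(seed)
    return [rf_noise(latent, t, rng) for t in levels]

# Hypothetical stand-in for an encoded image latent (channels, height, width).
latent = np.ones((16, 8, 8))
views = noised_views(latent)
```

At t = 0 the latent is returned unchanged and at t = 1 it is replaced by pure noise, so intermediate levels trade off how much of the original content survives. In the framework described above, each noised latent would then be passed to the LoRA-tuned denoiser; that step is omitted here.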

Key Contributions

First framework to repurpose Stable Diffusion 3's internal generative process (via LoRA parameter-efficient adaptation) as a forensic feature extractor for image forgery localization
Demonstrates that SD3's Rectified Flow noise mechanism, guided by LoRA fine-tuning, amplifies subtle statistical inconsistencies in forged image regions across multiple noise levels
Integrates LoRA-adapted SAM image encoder for spatial-semantic context, achieving SOTA generalization against both traditional forgeries (copy-move, splicing) and AI-generated forgeries with robustness to post-processing and OSNs
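
The LoRA adaptation named in the contributions above can be illustrated with a minimal NumPy sketch of the general technique: a frozen weight matrix W is augmented with a trainable low-rank update scaled by alpha / r. The class name, rank, alpha, and layer size are assumptions chosen for illustration; the summary does not state which SD3 or SAM layers are adapted or with what settings.

```python
import numpy as np

class LoRALinear:
    """Linear layer with a frozen weight and a trainable low-rank update."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                 # frozen pretrained weight
        self.A = rng.standard_normal((rank, in_dim)) * 0.01  # trainable down-projection
        self.B = np.zeros((out_dim, rank))                   # trainable up-projection, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x):
        # y = x W^T + (alpha / r) * x A^T B^T
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        return self.A.size + self.B.size

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))   # hypothetical pretrained weight
layer = LoRALinear(W, rank=4)
x = rng.standard_normal((2, 64))
y = layer(x)
```

Because B starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only the 512 entries of A and B would be trained here instead of the 4096 entries of W. This is the sense in which the adaptation is parameter-efficient.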

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel AI-generated content detection and localization framework — detecting tampered/forged image regions including those produced by generative AI models. This is a forensic technique for output/content integrity, fitting squarely within ML09's scope of AI-generated content detection and content authenticity.

Details

Domains

vision, generative

Model Types

diffusion, transformer

Threat Tags

digital, inference_time

Datasets

CASIAv2, Coverage, Columbia, NIST16

Applications

2025, 0 citations

Output Integrity Attack: 92%

Similar Papers

Detecting AI-Generated Forgeries via Iterative Manifold Deviation Amplification

Learning to Watermark in the Latent Space of Generative Models

End4: End-to-end Denoising Diffusion for Diffusion-Based Inpainting Detection

Now You See It, Now You Don't - Instant Concept Erasure for Safe Text-to-Image and Video Generation

DiffusionFF: A Diffusion-based Framework for Joint Face Forgery Detection and Fine-Grained Artifact Localization

Towards Transferable Defense Against Malicious Image Edits

SimuFreeMark: A Noise-Simulation-Free Robust Watermarking Against Image Editing

Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking