Creating Blank Canvas Against AI-enabled Image Forgery
Qi Song, Ziyuan Luo, Renjie Wan
Published on arXiv (2511.22237)
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Frequency-aware adversarial perturbations fully suppress SAM's segmentation capability on protected images, enabling accurate localization of AIGC-tampered regions when the perturbation pattern is disrupted by editing.
Blank Canvas
Novel technique introduced
AIGC-based image editing technology has greatly simplified realistic image modification, posing serious risks of image forgery. This paper introduces a new approach to tampering detection using the Segment Anything Model (SAM). Instead of training SAM to identify tampered areas, we propose a novel strategy: the entire image is transformed into a blank canvas from the perspective of neural models, so that any modification to this blank canvas becomes noticeable to them. To realize this idea, we introduce adversarial perturbations that prevent SAM from "seeing anything", allowing it to identify forged regions once the image is tampered with. Due to SAM's powerful perceiving capabilities, naive adversarial attacks cannot completely tame it. To thoroughly deceive SAM and make it blind to the image, we introduce a frequency-aware optimization strategy, which further enhances tamper localization. Extensive experimental results demonstrate the effectiveness of our method.
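The core mechanism is a budget-constrained adversarial perturbation optimized to suppress the model's segmentation confidence everywhere. A minimal sketch of such a PGD-style loop is below; it attacks a toy differentiable surrogate (a fixed linear map plus sigmoid) rather than SAM's actual mask decoder, which would require the `segment_anything` checkpoints. All names here (`mask_confidence`, `blind_perturbation`) are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for SAM's per-pixel mask confidence: a fixed linear
# response followed by a sigmoid. Only the optimization loop matters.
W = rng.normal(size=(64, 64))

def mask_confidence(img):
    """Per-pixel 'segmentation confidence' of the surrogate model."""
    return 1.0 / (1.0 + np.exp(-W * img))

def blind_perturbation(img, eps=8 / 255, alpha=1 / 255, steps=40):
    """PGD under an L-inf budget that minimizes mean mask confidence,
    turning the image into a 'blank canvas' for the surrogate."""
    delta = np.zeros_like(img)
    for _ in range(steps):
        conf = mask_confidence(img + delta)
        # Analytic gradient of mean(sigmoid(W * (img + delta))) w.r.t. delta.
        grad = W * conf * (1.0 - conf) / img.size
        delta -= alpha * np.sign(grad)             # descend: suppress masks
        delta = np.clip(delta, -eps, eps)          # stay within the budget
        delta = np.clip(img + delta, 0, 1) - img   # keep pixels valid
    return delta

img = rng.uniform(0.2, 0.8, size=(64, 64))
delta = blind_perturbation(img)
before = mask_confidence(img).mean()
after = mask_confidence(img + delta).mean()   # lower: model 'sees' less
```

Against the real SAM, the loss would be taken over the mask decoder's outputs for a grid of prompts, but the budgeted sign-gradient structure is the same.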
Key Contributions
- Novel proactive forgery detection strategy that transforms images into a 'blank canvas' for neural models via adversarial perturbations, enabling tamper localization without retraining SAM
- Frequency-aware optimization strategy to comprehensively suppress SAM's perception across the full image, overcoming the limitations of naive adversarial attacks against a powerful segmentation model
- Demonstrated effectiveness at localizing AIGC-edited regions by exploiting the disruption of the protective perturbation pattern post-tampering
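The second contribution concerns shaping the perturbation across frequency bands rather than only in pixel space. The paper does not spell out its frequency-aware strategy here, so the following is a hypothetical sketch of one common realization: reweighting each gradient step in the Fourier domain to emphasize a chosen band before applying it.

```python
import numpy as np

def frequency_weighted_step(grad, low_freq_weight=2.0, cutoff=0.25):
    """Reweight a gradient step in the Fourier domain, amplifying
    low-frequency components. Illustrative only: the paper's actual
    frequency-aware optimization may differ."""
    H, W = grad.shape
    G = np.fft.fft2(grad)
    fy = np.fft.fftfreq(H)[:, None]           # per-row frequencies
    fx = np.fft.fftfreq(W)[None, :]           # per-column frequencies
    radius = np.sqrt(fy**2 + fx**2)           # radial frequency of each bin
    weight = np.where(radius < cutoff, low_freq_weight, 1.0)
    return np.real(np.fft.ifft2(G * weight))  # back to pixel space
```

Such a step would replace the raw `grad` inside a PGD loop, biasing the perturbation toward bands where the segmentation model is most sensitive.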
🛡️ Threat Analysis
The paper proposes a content-protection and forgery-detection scheme: adversarial perturbations render an image 'invisible' to SAM, so any AIGC-based tampering breaks the pattern and becomes detectable. This is a direct contribution to output integrity and content authenticity against AI-generated forgeries.
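The detection side of this scheme is simple: because the protected image yields no confident segmentation anywhere, any region where the model does respond marks where an edit destroyed the perturbation. A self-contained toy of that localization logic, using a hypothetical per-pixel confidence surrogate in place of SAM:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in confidence map for SAM (hypothetical surrogate): a fixed
# positive per-pixel gain. Only the detection logic is illustrated.
W = np.abs(rng.normal(size=(32, 32))) + 0.5

def confidence(img):
    return 1.0 / (1.0 + np.exp(-W * (img - 0.5)))

# 'Protected' image: every pixel's logit is negative, so the surrogate
# sees a blank canvas (confidence below 0.5 everywhere).
protected = np.full((32, 32), 0.5) - 0.1

# Simulate an AIGC edit that overwrites a region, destroying the
# protective perturbation there.
tampered = protected.copy()
tampered[8:16, 8:16] = 0.9

# Pixels where the model regains confidence localize the forgery.
mask = confidence(tampered) > 0.5
```

In the paper's setting the same comparison would be made on SAM's predicted masks, with no access to the original image required at detection time.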