Defense · 2025

Creating Blank Canvas Against AI-enabled Image Forgery

Qi Song , Ziyuan Luo , Renjie Wan

0 citations · 53 references · arXiv


Published on arXiv · 2511.22237

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Frequency-aware adversarial perturbations fully suppress SAM's segmentation capability on protected images, enabling accurate localization of AIGC-tampered regions when the perturbation pattern is disrupted by editing.

Blank Canvas

Novel technique introduced


AIGC-based image editing technology has greatly simplified realistic image modification, creating serious risks of image forgery. This paper introduces a new approach to tampering detection using the Segment Anything Model (SAM). Instead of training SAM to identify tampered areas, we propose a novel strategy: the entire image is transformed into a blank canvas from the perspective of neural models, so that any modification to this blank canvas becomes noticeable to them. To realize this idea, we introduce adversarial perturbations that prevent SAM from "seeing anything", allowing it to identify forged regions when the image is tampered with. Due to SAM's powerful perceptual capabilities, naive adversarial attacks cannot completely tame it. To thoroughly deceive SAM and make it blind to the image, we introduce a frequency-aware optimization strategy, which further enhances tamper-localization capability. Extensive experimental results demonstrate the effectiveness of our method.
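The "blank canvas" optimization can be sketched as a PGD-style suppression loop. Everything below is illustrative: `objectness` is a toy sigmoid scorer standing in for SAM (whose real mask decoder is far more complex), and the `protect` helper, its `eps`/`steps`/`lr` values, and the loss are assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for SAM: per-pixel "objectness" scores from a fixed random
# linear map plus a sigmoid. Only the optimization loop is the point here.
W = rng.normal(size=(64, 64))

def objectness(img):
    return 1.0 / (1.0 + np.exp(-W * img))  # element-wise toy score

def protect(img, eps=0.05, steps=100, lr=0.01):
    """PGD-style suppression: push every objectness score toward zero
    while keeping the perturbation inside an L-infinity ball of radius eps."""
    delta = np.zeros_like(img)
    for _ in range(steps):
        s = objectness(img + delta)
        grad = s * (1.0 - s) * W           # d/d(delta) of sum(s) for sigmoid(W*x)
        delta -= lr * np.sign(grad)        # signed gradient-descent step
        delta = np.clip(delta, -eps, eps)  # project back into the eps-ball
    return delta

img = rng.uniform(0.0, 1.0, size=(64, 64))
delta = protect(img)
# The perturbed ("protected") image now scores lower everywhere.
```

With a real model, the hand-derived gradient would be replaced by autograd, and the loss would target SAM's predicted masks rather than a per-pixel score.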


Key Contributions

  • Novel proactive forgery detection strategy that transforms images into a 'blank canvas' for neural models via adversarial perturbations, enabling tamper localization without retraining SAM
  • Frequency-aware optimization strategy to comprehensively suppress SAM's perception across the full image, overcoming the limitations of naive adversarial attacks against a powerful segmentation model
  • Demonstrated effectiveness at localizing AIGC-edited regions by exploiting the disruption of the protective perturbation pattern post-tampering
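The frequency-aware strategy can be gestured at with a band-limiting step: shaping the perturbation's spectrum rather than optimizing it purely in pixel space. The `frequency_weight` helper and the low-pass band choice below are hypothetical stand-ins; the paper's actual frequency-aware objective is more involved.

```python
import numpy as np

def frequency_weight(delta, keep_radius=0.3):
    """Hypothetical frequency shaping: keep only the low-frequency band of a
    perturbation via a circular mask in the (shifted) 2-D Fourier domain."""
    f = np.fft.fftshift(np.fft.fft2(delta))
    h, w = delta.shape
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    mask = (radius <= keep_radius).astype(float)
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

delta = np.random.default_rng(2).normal(size=(64, 64))
shaped = frequency_weight(delta)  # same shape, high frequencies removed
```

In practice such shaping would sit inside the optimization loop, so each gradient step is projected onto the targeted frequency band.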

🛡️ Threat Analysis

Output Integrity Attack

The paper proposes a content-protection and forgery-detection scheme: adversarial perturbations render an image 'invisible' to SAM, so any AIGC-based tampering breaks the pattern and becomes detectable. This is a direct contribution to output integrity and content authenticity against AI-generated forgeries.
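The detection side of the scheme reduces to a simple check: on a protected image the model's responses are suppressed everywhere, and an edit destroys the perturbation locally, so responses reappear only inside the tampered region. The scores below are simulated to illustrate the thresholding logic, not produced by SAM, and the 0.5 threshold is an arbitrary choice for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated per-pixel responses on a protected-then-edited image:
# near zero where the protective perturbation survives, high where an
# AIGC edit has overwritten (and thus disrupted) the perturbation.
scores = np.full((64, 64), 0.02)                         # suppressed everywhere
scores[20:40, 20:40] = rng.uniform(0.6, 0.9, (20, 20))   # tampered patch

tamper_mask = scores > 0.5   # localize tampering by thresholding responses
print(tamper_mask.sum())     # → 400 (the 20x20 edited region)
```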


Details

Domains
vision · generative
Model Types
transformer · diffusion
Threat Tags
white_box · digital · inference_time
Applications
image forgery detection · image tamper localization · AIGC manipulation detection