Defense · 2025

EditTrack: Detecting and Attributing AI-assisted Image Editing

Zhengyuan Jiang, Yuyang Zhang, Moyang Guo, Neil Zhenqiang Gong

1 citation · 29 references · arXiv


Published on arXiv · 2510.01173

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

EditTrack consistently achieves accurate detection and attribution of AI-assisted image editing across five editing models and six datasets, significantly outperforming five baselines.

EditTrack

Novel technique introduced


In this work, we formulate and study the problem of image-editing detection and attribution: given a base image and a suspicious image, detection seeks to determine whether the suspicious image was derived from the base image using an AI editing model, while attribution further identifies the specific editing model responsible. Existing methods for detecting and attributing AI-generated images are insufficient for this problem, as they focus on determining whether an image was AI-generated/edited rather than whether it was edited from a particular base image. To bridge this gap, we propose EditTrack, the first framework for this image-editing detection and attribution problem. Building on four key observations about the editing process, EditTrack introduces a novel re-editing strategy and leverages carefully designed similarity metrics to determine whether a suspicious image originates from a base image and, if so, by which model. We evaluate EditTrack on five state-of-the-art editing models across six datasets, demonstrating that it consistently achieves accurate detection and attribution, significantly outperforming five baselines.
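To make the re-editing idea concrete, here is a minimal sketch of that pipeline. It is hypothetical, not the paper's implementation: the `edit_with_model` wrapper, the MSE-based `similarity` stand-in, the assumption that the editing instruction is known, and the 0.8 decision threshold are all illustrative choices made for this demo. Real editing models are stochastic, so a re-edit will not match the suspicious image pixel-for-pixel; EditTrack's carefully designed similarity metrics, which account for that, are not reproduced here.

```python
import numpy as np


def edit_with_model(model_name: str, base_image: np.ndarray, instruction: str) -> np.ndarray:
    """Hypothetical stand-in for a real AI editing model (e.g., an
    instruction-guided diffusion editor). This toy version ignores the
    instruction and perturbs the image deterministically per model name
    so the sketch runs end to end; real editors are stochastic."""
    seed = abs(hash(model_name)) % (2**32)  # stable within one interpreter run
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, 2.0, base_image.shape)
    return np.clip(base_image + noise, 0.0, 255.0)


def similarity(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """Placeholder metric: mean squared error mapped into (0, 1].
    EditTrack's actual similarity metrics are not reproduced here."""
    mse = float(np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2))
    return 1.0 / (1.0 + mse)


def detect_and_attribute(base, suspicious, instruction, candidates, threshold=0.8):
    """Re-editing strategy: re-apply each candidate editor to the base
    image and score the result against the suspicious image.
    Detection: is the best score above the threshold?
    Attribution: which candidate scored highest?"""
    scores = {name: similarity(edit_with_model(name, base, instruction), suspicious)
              for name in candidates}
    best_model, best_score = max(scores.items(), key=lambda kv: kv[1])
    was_edited = best_score >= threshold
    return was_edited, (best_model if was_edited else None), scores


if __name__ == "__main__":
    base = np.full((64, 64, 3), 128.0)
    # Simulate a suspicious image actually produced by "model_b".
    suspicious = edit_with_model("model_b", base, "add a red hat")
    print(detect_and_attribute(base, suspicious, "add a red hat",
                               ["model_a", "model_b", "model_c"]))
```

In this toy run the re-edit by "model_b" reproduces the suspicious image exactly, so it is detected and attributed correctly; in practice the decision hinges entirely on how robust the similarity metrics are to editing-model stochasticity.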


Key Contributions

  • First framework (EditTrack) for the novel problem of image-editing detection and attribution — determining if a suspicious image was derived from a specific base image via an AI editing model
  • Novel re-editing strategy combined with carefully designed similarity metrics grounded in four key observations about the editing process
  • Evaluation across five state-of-the-art editing models and six datasets, significantly outperforming five baselines

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated content detection and model attribution: determines whether a suspicious image was AI-edited from a specific base image and identifies the responsible editing model. This is a novel forensic technique for content provenance and authenticity, a core ML09 concern, and it goes beyond general AI-image detection by framing the problem as base-image-conditioned editing attribution.


Details

Domains
vision
Model Types
diffusion
Threat Tags
inference_time · black_box
Applications
image editing detection · content attribution · digital image forensics