defense 2026

TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery

Li Zhang 1, Shruti Agarwal 2, John Collomosse 2, Pengtao Xie 1, Vishal Asnani 2

0 citations · 59 references · arXiv (Cornell University)


Published on arXiv

2602.19019

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves state-of-the-art multi-concept attribution performance, significantly outperforming existing baselines on both object and style attribution while maintaining high visual quality and transformation robustness.

TokenTrace

Novel technique introduced


Generative AI models pose a significant challenge to intellectual property (IP), as they can replicate unique artistic styles and concepts without attribution. While watermarking offers a potential solution, existing methods often fail in complex scenarios where multiple concepts (e.g., an object and an artistic style) are composed within a single image. These methods struggle to disentangle and attribute each concept individually. In this work, we introduce TokenTrace, a novel proactive watermarking framework for robust, multi-concept attribution. Our method embeds secret signatures into the semantic domain by simultaneously perturbing the text prompt embedding and the initial latent noise that guide the diffusion model's generation process. For retrieval, we propose a query-based TokenTrace module that takes the generated image and a textual query specifying which concepts need to be retrieved (e.g., a specific object or style) as inputs. This query-based mechanism allows the module to disentangle and independently verify the presence of multiple concepts from a single generated image. Extensive experiments show that our method achieves state-of-the-art performance on both single-concept (object and style) and multi-concept attribution tasks, significantly outperforming existing baselines while maintaining high visual quality and robustness to common transformations.
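The dual-channel embedding described above can be illustrated with a toy sketch. This is not the paper's implementation: TokenTrace presumably relies on learned modules and recovers signatures from the generated image, whereas the sketch below uses a classic spread-spectrum scheme and decodes directly from the perturbed vectors. The helper names (`carrier`, `embed_bits`, `recover_bits`), the keys, the vector size, and the strength value are all hypothetical.

```python
import random

def carrier(key, dim):
    """Pseudo-random +/-1 carrier derived from a secret key (hypothetical helper)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]

def embed_bits(vector, bits, key, strength):
    """Add one key-derived carrier per bit; the sign of the carrier encodes the bit."""
    out = list(vector)
    for i, bit in enumerate(bits):
        c = carrier(f"{key}:{i}", len(vector))
        sign = 1.0 if bit else -1.0
        for j in range(len(out)):
            out[j] += strength * sign * c[j]
    return out

def recover_bits(vector, n_bits, key):
    """Correlate with each carrier; a positive correlation decodes to 1."""
    bits = []
    for i in range(n_bits):
        c = carrier(f"{key}:{i}", len(vector))
        corr = sum(v * cj for v, cj in zip(vector, c))
        bits.append(1 if corr > 0 else 0)
    return bits

# Stand-ins for the two quantities the abstract says are perturbed.
rng = random.Random(0)
prompt_embedding = [rng.gauss(0, 1) for _ in range(4096)]
initial_latent = [rng.gauss(0, 1) for _ in range(4096)]

signature = [1, 0, 1, 1, 0, 0, 1, 0]  # assumed 8-bit secret signature
wm_prompt = embed_bits(prompt_embedding, signature, "prompt-key", 0.2)
wm_latent = embed_bits(initial_latent, signature, "latent-key", 0.2)

recovered_prompt = recover_bits(wm_prompt, len(signature), "prompt-key")
recovered_latent = recover_bits(wm_latent, len(signature), "latent-key")
```

In this toy setting both channels carry the same signature under different keys, so a decoder holding either key can read it back. Whether TokenTrace shares one signature across channels or splits it is not stated in this summary.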


Key Contributions

  • Proactive watermarking framework (TokenTrace) that simultaneously perturbs text prompt embeddings and initial latent noise to embed multi-concept signatures into diffusion model outputs
  • Query-based retrieval module that takes a generated image and a textual concept query as inputs to disentangle and independently verify the presence of multiple watermarked concepts from a single image
  • State-of-the-art performance on both single-concept (object and style) and multi-concept attribution tasks with robustness to common image transformations
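To make the query-based step concrete, here is a minimal sketch of what could happen after the retrieval module has decoded a bit string for a given query. It assumes a registry that maps each concept type to known owner signatures and attributes by nearest Hamming distance. The registry, owner names, signature length, and error tolerance are all hypothetical; the summary does not describe how decoded signatures are matched to owners.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical registry: concept type -> {owner id: signature bits}.
REGISTRY = {
    "object": {
        "creator_a": [1, 0, 1, 1, 0, 0, 1, 0],
        "creator_b": [0, 1, 1, 0, 1, 0, 0, 1],
    },
    "style": {
        "creator_c": [1, 1, 0, 0, 1, 1, 0, 0],
        "creator_d": [0, 0, 0, 1, 1, 0, 1, 1],
    },
}

def attribute(decoded_bits, query, max_errors=1):
    """Return the closest registered owner for the queried concept type,
    or None if no signature is within max_errors bit flips."""
    candidates = REGISTRY[query]
    owner, sig = min(candidates.items(),
                     key=lambda kv: hamming(decoded_bits, kv[1]))
    return owner if hamming(decoded_bits, sig) <= max_errors else None

# Assumed decoder outputs for two queries against the same image.
object_owner = attribute([1, 0, 1, 1, 0, 0, 1, 1], "object")  # one bit flipped
style_owner = attribute([0, 0, 0, 1, 1, 0, 1, 1], "style")    # exact match
unknown = attribute([1, 1, 1, 1, 1, 1, 1, 1], "style")        # no close match
```

Routing by query keeps the object and style lookups independent, which mirrors the idea of verifying each concept separately from a single image.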

🛡️ Threat Analysis

Output Integrity Attack

TokenTrace embeds secret signatures into diffusion model outputs (generated images) to trace content provenance and enable multi-concept IP attribution. This is content watermarking for output integrity, not model-weight watermarking.
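Any provenance claim based on a recovered signature depends on how likely an unwatermarked image is to match by chance. The sketch below computes that probability under the standard assumption that bits decoded from unwatermarked content behave like independent fair coin flips. The signature length and match threshold are hypothetical; this summary does not state the values or the decision rule TokenTrace uses.

```python
from math import comb

def false_positive_rate(n_bits, min_matches):
    """Probability that at least min_matches of n_bits agree by chance,
    assuming each decoded bit is an independent fair coin flip."""
    favourable = sum(comb(n_bits, k) for k in range(min_matches, n_bits + 1))
    return favourable / 2 ** n_bits

# Requiring all 8 bits of an assumed 8-bit signature to match.
exact_match_rate = false_positive_rate(8, 8)
# Tolerating a single bit error raises the chance-match probability.
one_error_rate = false_positive_rate(8, 7)
```

Longer signatures drive this probability down quickly, which is why tolerance to bit errors under image transformations is usually paired with a larger bit budget.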


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
inference_time, digital
Applications
ai-generated image attribution, intellectual property protection, artistic style attribution, multi-concept provenance tracking