TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery
Li Zhang 1, Shruti Agarwal 2, John Collomosse 2, Pengtao Xie 1, Vishal Asnani 2
Published on arXiv: 2602.19019
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves state-of-the-art multi-concept attribution performance, significantly outperforming existing baselines on both object and style attribution while maintaining high visual quality and transformation robustness.
Novel technique introduced
TokenTrace
Generative AI models pose a significant challenge to intellectual property (IP), as they can replicate unique artistic styles and concepts without attribution. While watermarking offers a potential solution, existing methods often fail in complex scenarios where multiple concepts (e.g., an object and an artistic style) are composed within a single image. These methods struggle to disentangle and attribute each concept individually. In this work, we introduce TokenTrace, a novel proactive watermarking framework for robust, multi-concept attribution. Our method embeds secret signatures into the semantic domain by simultaneously perturbing the text prompt embedding and the initial latent noise that guide the diffusion model's generation process. For retrieval, we propose a query-based TokenTrace module that takes the generated image and a textual query specifying which concepts need to be retrieved (e.g., a specific object or style) as inputs. This query-based mechanism allows the module to disentangle and independently verify the presence of multiple concepts from a single generated image. Extensive experiments show that our method achieves state-of-the-art performance on both single-concept (object and style) and multi-concept attribution tasks, significantly outperforming existing baselines while maintaining high visual quality and robustness to common transformations.
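The embedding step described above can be illustrated with a minimal sketch. The abstract does not specify how the perturbations are constructed, so everything here is an assumption: the function names (`carriers`, `perturbation`, `embed_signature`), the key-derived pseudo-random carriers, the additive form, the tensor shapes, and the `strength` value are all hypothetical, and no diffusion model is involved. The sketch only shows what "simultaneously perturbing the text prompt embedding and the initial latent noise" could look like numerically.

```python
import numpy as np

def carriers(key: int, n_bits: int, dim: int) -> np.ndarray:
    """One pseudo-random +/-1 carrier per signature bit, derived from a secret key (assumed scheme)."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=(n_bits, dim))

def perturbation(bits, key: int, dim: int, strength: float) -> np.ndarray:
    """Map a bit signature to a small additive perturbation of length `dim`."""
    signs = 2.0 * np.asarray(bits, dtype=float) - 1.0   # 0/1 -> -1/+1
    mixed = signs @ carriers(key, len(bits), dim)       # superimpose the signed carriers
    return strength * mixed / np.sqrt(len(bits))        # keep per-element scale near `strength`

def embed_signature(prompt_emb, init_latent, bits, key: int, strength: float = 0.05):
    """Perturb both inputs of the generator with the same signature (hypothetical helper)."""
    p = prompt_emb + perturbation(bits, key, prompt_emb.size, strength).reshape(prompt_emb.shape)
    # A different sub-key for the latent so the two perturbations are not correlated.
    z = init_latent + perturbation(bits, key + 1, init_latent.size, strength).reshape(init_latent.shape)
    return p, z

# Illustrative shapes only; random arrays stand in for a real prompt embedding and latent.
rng = np.random.default_rng(0)
prompt_emb = rng.normal(size=(77, 768))
init_latent = rng.normal(size=(4, 64, 64))
signature = [1, 0, 1, 1, 0, 0, 1, 0]
wm_prompt, wm_latent = embed_signature(prompt_emb, init_latent, signature, key=1234)
```

In a real pipeline the two perturbed tensors would be passed to the diffusion sampler in place of the originals. Whether the actual method uses fixed carriers, a learned encoder, or something else is not stated in this summary.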
Key Contributions
- Proactive watermarking framework (TokenTrace) that simultaneously perturbs text prompt embeddings and initial latent noise to embed multi-concept signatures into diffusion model outputs
- Query-based retrieval module that takes a generated image and a textual concept query as inputs to disentangle and independently verify the presence of multiple watermarked concepts from a single image
- State-of-the-art performance on both single-concept (object and style) and multi-concept attribution tasks with robustness to common image transformations
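The query-based retrieval idea in the contributions above can be sketched with a toy stand-in. This is not the TokenTrace module, which the summary describes as operating on a generated image and a textual query. The code below substitutes a classical spread-spectrum correlation decoder and a random feature vector in place of an image. The names `query_carriers`, `embed`, and `retrieve`, the hashing of the query string into a key, and the constants `DIM` and `N_BITS` are all assumptions made for illustration. It shows only the principle that a query can select which of several superimposed signatures is read out.

```python
import hashlib
import numpy as np

DIM, N_BITS = 4096, 16  # hypothetical feature size and signature length

def query_carriers(query: str) -> np.ndarray:
    """Derive per-bit +/-1 carriers from the query text (assumed keying scheme)."""
    seed = int.from_bytes(hashlib.sha256(query.encode("utf-8")).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=(N_BITS, DIM))

def embed(signal: np.ndarray, query: str, bits) -> np.ndarray:
    """Superimpose the signature for one concept onto the shared signal."""
    signs = 2.0 * np.asarray(bits, dtype=float) - 1.0
    return signal + signs @ query_carriers(query)

def retrieve(signal: np.ndarray, query: str):
    """Correlate against the query's carriers; return decoded bits and a presence score."""
    corr = query_carriers(query) @ signal / DIM
    bits = (corr > 0).astype(int)
    score = float(np.mean(np.abs(corr)))  # near 1 if the concept was embedded, near 0 otherwise
    return bits, score

rng = np.random.default_rng(0)
obj_bits = rng.integers(0, 2, N_BITS)
style_bits = rng.integers(0, 2, N_BITS)

features = rng.normal(size=DIM)                            # stand-in for image features
features = embed(features, "object: corgi", obj_bits)      # concept 1
features = embed(features, "style: ukiyo-e", style_bits)   # concept 2, same signal

got_obj, score_obj = retrieve(features, "object: corgi")
got_style, score_style = retrieve(features, "style: ukiyo-e")
_, score_absent = retrieve(features, "style: cubism")      # never embedded
```

Each query recovers only its own signature because carriers derived from different queries are nearly orthogonal in high dimension. A query for a concept that was never embedded yields a low presence score, which mirrors the "independently verify the presence" behavior described above.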
🛡️ Threat Analysis
TokenTrace embeds secret signatures into diffusion model outputs (generated images) to trace content provenance and enable multi-concept IP attribution. This makes it a content watermarking scheme for output integrity, not a model weight watermarking scheme.