TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery

Generative AI models pose a significant challenge to intellectual property (IP), as they can replicate unique artistic styles and concepts without attribution. While watermarking offers a potential solution, existing methods often fail in complex scenarios where multiple concepts (e.g., an object and an artistic style) are composed within a single image. These methods struggle to disentangle and attribute each concept individually. In this work, we introduce TokenTrace, a novel proactive watermarking framework for robust, multi-concept attribution. Our method embeds secret signatures into the semantic domain by simultaneously perturbing the text prompt embedding and the initial latent noise that guide the diffusion model's generation process. For retrieval, we propose a query-based TokenTrace module that takes the generated image and a textual query specifying which concepts need to be retrieved (e.g., a specific object or style) as inputs. This query-based mechanism allows the module to disentangle and independently verify the presence of multiple concepts from a single generated image. Extensive experiments show that our method achieves state-of-the-art performance on both single-concept (object and style) and multi-concept attribution tasks, significantly outperforming existing baselines while maintaining high visual quality and robustness to common transformations.
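The dual embedding described above (perturbing both the prompt embedding and the initial latent noise) can be illustrated with a deliberately simplified linear toy. This is not the TokenTrace implementation: the abstract gives no architectural details, so the function names, the key-derived orthonormal carriers, the vector sizes (768 and 4×64×64), and the strength value below are all assumptions chosen for illustration. No diffusion model is run; the sketch only shows how the same bit signature could be written into two separate inputs and read back from each.

```python
import numpy as np

def make_carriers(key: int, n_bits: int, dim: int) -> np.ndarray:
    """Hypothetical helper: key-derived orthonormal directions, one per bit."""
    rng = np.random.default_rng(key)
    q, _ = np.linalg.qr(rng.standard_normal((dim, n_bits)))
    return q.T  # shape (n_bits, dim), rows are orthonormal

def embed_signature(host, bits, carriers, strength=1.0):
    """Set the host's projection on each carrier to +strength or -strength."""
    signs = 2.0 * np.asarray(bits, dtype=float) - 1.0
    # Remove the host's existing component along the carriers ...
    cleaned = host - carriers.T @ (carriers @ host)
    # ... then write the signed signature into that subspace.
    return cleaned + strength * (signs @ carriers)

def read_signature(vec, carriers):
    """Recover bits from the sign of each carrier projection."""
    return (carriers @ vec > 0).astype(int)

rng = np.random.default_rng(0)
signature = rng.integers(0, 2, size=16)

# Stand-ins for the two inputs named in the abstract (sizes are assumed).
prompt_embedding = rng.standard_normal(768)
latent_noise = rng.standard_normal(4 * 64 * 64)

prompt_carriers = make_carriers(1, 16, prompt_embedding.size)
latent_carriers = make_carriers(2, 16, latent_noise.size)

wm_prompt = embed_signature(prompt_embedding, signature, prompt_carriers)
wm_latent = embed_signature(latent_noise, signature, latent_carriers)

recovered_from_prompt = read_signature(wm_prompt, prompt_carriers)
recovered_from_latent = read_signature(wm_latent, latent_carriers)
```

In the paper's setting the signature would have to survive the full generation process and be recovered from the generated image, which this toy does not model.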

Key Contributions

Proactive watermarking framework (TokenTrace) that simultaneously perturbs text prompt embeddings and initial latent noise to embed multi-concept signatures into diffusion model outputs
Query-based retrieval module that takes a generated image and a textual concept query as inputs to disentangle and independently verify the presence of multiple watermarked concepts from a single image
State-of-the-art performance on both single-concept (object and style) and multi-concept attribution tasks with robustness to common image transformations
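The query-based retrieval idea in the second contribution can be sketched in the same toy spirit. The source describes a module that takes a generated image and a textual query; the version below replaces that with a hypothetical feature vector and a lookup table of mutually orthogonal carriers, one group per concept name. The concept vocabulary, dimensions, and function names are assumptions, not the paper's design. The point shown is only that a query can select which signature to read, so two concepts embedded in one vector are verified independently.

```python
import numpy as np

CONCEPTS = ["object", "style"]  # hypothetical query vocabulary
N_BITS, DIM = 8, 256            # illustrative sizes

def build_codebook(key: int) -> dict:
    """One orthonormal basis, split into a carrier group per concept."""
    rng = np.random.default_rng(key)
    q, _ = np.linalg.qr(rng.standard_normal((DIM, N_BITS * len(CONCEPTS))))
    rows = q.T
    return {c: rows[i * N_BITS:(i + 1) * N_BITS] for i, c in enumerate(CONCEPTS)}

def embed_concept(feature, bits, carriers, strength=1.0):
    """Write a concept's bits into its own carrier subspace only."""
    signs = 2.0 * np.asarray(bits, dtype=float) - 1.0
    cleaned = feature - carriers.T @ (carriers @ feature)
    return cleaned + strength * (signs @ carriers)

def query(feature, concept: str, codebook: dict):
    """Read back the signature for the concept named in the query."""
    return (codebook[concept] @ feature > 0).astype(int)

rng = np.random.default_rng(7)
codebook = build_codebook(key=42)
object_bits = rng.integers(0, 2, size=N_BITS)
style_bits = rng.integers(0, 2, size=N_BITS)

# A single feature vector stands in for a composed (object + style) image.
feature = rng.standard_normal(DIM)
feature = embed_concept(feature, object_bits, codebook["object"])
feature = embed_concept(feature, style_bits, codebook["style"])

object_result = query(feature, "object", codebook)
style_result = query(feature, "style", codebook)
```

Because the carrier groups come from one orthonormal basis, embedding the style signature leaves the object projections unchanged. A learned retrieval module would not have this guarantee by construction.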

🛡️ Threat Analysis

Output Integrity Attack

TokenTrace embeds secret signatures into diffusion model outputs (generated images) to trace content provenance and enable multi-concept IP attribution. It is a content watermarking scheme for output integrity, not a model weight watermarking scheme.

Details

Domains

vision, generative

Model Types

diffusion

Threat Tags

inference_time, digital

Applications

2026 0 cit.
