PRISM: Phase-enhanced Radial-based Image Signature Mapping framework for fingerprinting AI-generated images

Emanuele Ricco 1, Elia Onofri 2, Lorenzo Cima 2, Stefano Cresci 2, Roberto Di Pietro 1

Published on arXiv

2509.15270

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

PRISM achieves 92.04% model attribution accuracy on PRISM-36K and 95.06% real-vs-fake accuracy on GenImage, outperforming the original benchmark by 12.86 percentage points

PRISM

Novel technique introduced


Generative AI has created a critical need for attribution methods: solutions that can identify the model that originated a given piece of AI-generated content. This capability, broadly relevant in multimodal applications, is especially sensitive in commercial settings where users subscribe to paid proprietary services and expect guarantees about the source of the content they receive. To address these issues, we introduce PRISM, a scalable Phase-enhanced Radial-based Image Signature Mapping framework for fingerprinting AI-generated images. PRISM is based on a radial reduction of the discrete Fourier transform that leverages both amplitude and phase information to capture model-specific signatures. The output of this process is subsequently clustered via linear discriminant analysis to achieve reliable model attribution in diverse settings, even when the model's internal details are inaccessible. To support our work, we construct PRISM-36K, a novel dataset of 36,000 images generated by six text-to-image GAN- and diffusion-based models. On this dataset, PRISM achieves an attribution accuracy of 92.04%. We additionally evaluate our method on four benchmarks from the literature, reaching an average accuracy of 81.60%. Finally, we also evaluate our methodology on the binary task of detecting real vs. fake images, achieving an average accuracy of 88.41%. We obtain our best result on GenImage with an accuracy of 95.06%, whereas the original benchmark achieved 82.20%. Our results demonstrate the effectiveness of frequency-domain fingerprinting for cross-architecture and cross-dataset model attribution, offering a viable solution for enforcing accountability and trust in generative AI systems.
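The paper does not spell out the exact reduction here, but the core idea — collapsing the 2-D DFT of an image into radial amplitude and phase profiles — can be sketched as follows. The function names, bin count, and use of log-amplitude are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def radial_profile(spectrum, n_bins=64):
    """Average a 2-D spectrum over concentric radial frequency bins."""
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2
    y, x = np.indices((h, w))
    r = np.sqrt((y - cy) ** 2 + (x - cx) ** 2)
    # Map each pixel's radius to one of n_bins equal-width bins.
    bins = np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bins.ravel(), weights=spectrum.ravel(), minlength=n_bins)
    counts = np.bincount(bins.ravel(), minlength=n_bins)
    return sums / np.maximum(counts, 1)

def frequency_signature(img, n_bins=64):
    """Radially reduced DFT amplitude + phase feature vector (sketch)."""
    f = np.fft.fftshift(np.fft.fft2(img.astype(float)))
    amp = radial_profile(np.log1p(np.abs(f)), n_bins)    # log-amplitude profile
    phase = radial_profile(np.angle(f), n_bins)          # phase profile
    return np.concatenate([amp, phase])                  # length 2 * n_bins
```

The resulting fixed-length vector is what makes the approach model-agnostic: it depends only on the generated image, not on access to the generator's weights.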


Key Contributions

  • PRISM: a model-agnostic framework using radially-reduced DFT amplitude and phase features with LDA clustering for AI-generated image model attribution
  • PRISM-36K: a new dataset of 36,000 images from six text-to-image models (GAN- and diffusion-based) across 40 prompts
  • Cross-architecture, cross-dataset evaluation achieving 92.04% attribution accuracy on PRISM-36K and 95.06% real-vs-fake accuracy on GenImage (vs. 82.20% baseline)
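The attribution step groups frequency signatures by source model via linear discriminant analysis. A minimal sketch on synthetic feature vectors, using scikit-learn's `LinearDiscriminantAnalysis` as a supervised stand-in (the class counts, dimensions, and data below are illustrative, not from PRISM-36K):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n_models, per_model, dim = 3, 40, 16

# Synthetic stand-in for per-model frequency signatures:
# each "model" gets a distinct cluster center.
X = np.vstack([
    rng.normal(loc=m, scale=0.5, size=(per_model, dim))
    for m in range(n_models)
])
y = np.repeat(np.arange(n_models), per_model)  # model labels

# Fit LDA and attribute each signature to a source model.
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
acc = (lda.predict(X) == y).mean()
```

Because LDA projects features onto directions that maximize between-model separation relative to within-model scatter, well-separated model fingerprints yield high attribution accuracy even in this low-dimensional toy setting.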

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel content attribution and fingerprinting system for AI-generated images — directly addresses output integrity and provenance by identifying which specific generative model produced an image, including both model attribution and real-vs-fake detection tasks.


Details

Domains
vision, generative
Model Types
diffusion, gan
Threat Tags
black_box, inference_time
Datasets
PRISM-36K, GenImage
Applications
ai-generated image attribution, deepfake detection, content provenance, generative ai accountability