
Provenance of AI-Generated Images: A Vector Similarity and Blockchain-based Approach

Jitendra Sharma , Arthur Carvalho , Suman Bhunia

0 citations · 27 references · Consumer Communications and Ne...


Published on arXiv · 2510.17854

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Perturbed AI-generated images maintain close embedding similarity to their originals even under moderate-to-high perturbations, confirming the robustness of embedding proximity as a detection signal.
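The key finding can be illustrated with a toy calculation. The sketch below perturbs a unit-norm embedding vector directly with Gaussian noise and checks that cosine similarity to the original stays high; this is a simplification for illustration only (the paper perturbs the *images* and re-embeds them, not the embeddings themselves), and the dimensionality and noise scale are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 512-dimensional image embedding, normalized to unit length.
emb = rng.normal(size=512)
emb /= np.linalg.norm(emb)

# Moderate perturbation: additive Gaussian noise, then renormalize.
perturbed = emb + rng.normal(scale=0.01, size=512)
perturbed /= np.linalg.norm(perturbed)

# Cosine similarity of unit vectors is just their dot product.
sim = float(emb @ perturbed)
```

With this noise scale the similarity remains close to 1, mirroring the paper's observation that perturbed images stay near their originals in embedding space.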


Rapid advances in generative AI and large language models (LLMs) have enabled the generation of highly realistic and contextually relevant digital content. Systems such as ChatGPT with DALL-E integration and Stable Diffusion can produce images that are often indistinguishable from those created by humans, which poses challenges for digital content authentication. Verifying the integrity and origin of digital data, to ensure it remains unaltered and genuine, is crucial to maintaining trust and legality in digital media. In this paper, we propose an embedding-based AI image detection framework that uses image embeddings and vector similarity to distinguish AI-generated images from real (human-created) ones. Our methodology is built on the hypothesis that AI-generated images lie closer in embedding space to other AI-generated content, while human-created images cluster similarly within their own domain. To validate this hypothesis, we developed a system that processes a diverse dataset of AI- and human-generated images through five benchmark embedding models. Extensive experimentation demonstrates the robustness of our approach: moderate-to-high perturbations minimally impact the embedding signatures, with perturbed images maintaining close similarity matches to their original versions. Our solution provides a generalizable framework for AI-generated image detection that balances accuracy with computational efficiency.
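The embedding-proximity hypothesis can be sketched as a nearest-neighbor vote: embed the query image, compare it against reference embeddings of known AI-generated and known human-created images, and assign the label of the closest cluster. The function names and the two-dimensional reference vectors below are illustrative assumptions, not the paper's implementation (which uses five benchmark embedding models on real images):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify_by_proximity(query, ai_refs, human_refs):
    """Label a query embedding by whichever reference set contains
    its most similar neighbor (a 1-nearest-neighbor decision rule)."""
    ai_best = max(cosine_similarity(query, r) for r in ai_refs)
    human_best = max(cosine_similarity(query, r) for r in human_refs)
    return "ai" if ai_best >= human_best else "human"

# Toy reference embeddings: AI images cluster near one direction,
# human images near another (purely synthetic for illustration).
ai_refs = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
human_refs = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
```

A query embedding such as `np.array([0.95, 0.15])` would be labeled `"ai"` under this rule, since its best cosine match falls in the AI reference set.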


Key Contributions

  • Embedding-based detection framework that exploits the hypothesis that AI-generated images cluster more closely in embedding space than human-created images
  • Evaluation across five benchmark embedding models demonstrating robustness of embedding signatures under moderate-to-high image perturbations
  • Blockchain-integrated provenance system for authenticating digital image origin and integrity
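The blockchain-integrated provenance contribution can be sketched as a minimal hash-chained ledger: each registered image is stored as a SHA-256 digest of its bytes, and each record commits to the previous block's hash so tampering with history is detectable. The class and method names are hypothetical; the paper's actual system would run on a real blockchain rather than an in-memory list:

```python
import hashlib
import json

class ProvenanceLedger:
    """Toy hash-chained ledger for image provenance records."""

    def __init__(self):
        self.chain = []

    def register(self, image_bytes, label):
        """Append a record binding an image's hash to a label
        (e.g. 'ai-generated') and to the previous block."""
        prev = self.chain[-1]["block_hash"] if self.chain else "0" * 64
        record = {
            "image_hash": hashlib.sha256(image_bytes).hexdigest(),
            "label": label,
            "prev_hash": prev,
        }
        payload = json.dumps(record, sort_keys=True).encode()
        record["block_hash"] = hashlib.sha256(payload).hexdigest()
        self.chain.append(record)
        return record["block_hash"]

    def verify(self, image_bytes):
        """True if this exact image was previously registered."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        return any(b["image_hash"] == digest for b in self.chain)
```

Verification succeeds only for byte-identical images; even a one-bit change produces a different digest, which is what makes the hash a useful integrity anchor.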

🛡️ Threat Analysis

Output Integrity Attack

The primary contribution is an AI-generated image detection system that uses embedding similarity to distinguish synthetic from human-created images, combined with blockchain-based content provenance authentication. This directly addresses output integrity and content authenticity.


Details

Domains
vision, generative
Model Types
diffusion, gan
Threat Tags
inference_time
Applications
ai-generated image detection, digital content authentication, image provenance verification