
Provenance of AI-Generated Images: A Vector Similarity and Blockchain-based Approach

Jitendra Sharma , Arthur Carvalho , Suman Bhunia

0 citations · 27 references · Consumer Communications and Ne...


Published on arXiv · 2510.17854

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Perturbed AI-generated images maintain close embedding similarity to their originals even under moderate-to-high perturbations, confirming the robustness of embedding proximity as a detection signal.
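The key finding can be illustrated with a toy calculation. The sketch below perturbs a unit-norm embedding vector directly with Gaussian noise and checks that cosine similarity to the original stays high; this is a simplification for illustration only (the paper perturbs the *images* and re-embeds them, not the embeddings themselves), and the dimensionality and noise scale are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 512-dimensional image embedding, normalized to unit length.
emb = rng.normal(size=512)
emb /= np.linalg.norm(emb)

# Moderate perturbation: additive Gaussian noise, then renormalize.
perturbed = emb + rng.normal(scale=0.01, size=512)
perturbed /= np.linalg.norm(perturbed)

# Cosine similarity of unit vectors is just their dot product.
sim = float(emb @ perturbed)
```

With this noise scale the similarity remains close to 1, mirroring the paper's observation that perturbed images stay near their originals in embedding space.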


Rapid advances in generative AI and large language models (LLMs) have enabled the generation of highly realistic and contextually relevant digital content. Systems such as ChatGPT with DALL-E integration and Stable Diffusion can produce images that are often indistinguishable from those created by humans, which poses challenges for digital content authentication. Verifying the integrity and origin of digital data, to ensure it remains unaltered and genuine, is crucial to maintaining trust and legality in digital media. In this paper, we propose an embedding-based AI image detection framework that uses image embeddings and vector similarity to distinguish AI-generated images from real (human-created) ones. Our methodology is built on the hypothesis that AI-generated images lie closer in embedding space to other AI-generated content, while human-created images cluster similarly within their own domain. To validate this hypothesis, we developed a system that processes a diverse dataset of AI- and human-generated images through five benchmark embedding models. Extensive experimentation demonstrates the robustness of our approach: moderate-to-high perturbations minimally impact the embedding signatures, with perturbed images maintaining close similarity matches to their original versions. Our solution provides a generalizable framework for AI-generated image detection that balances accuracy with computational efficiency.
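The embedding-proximity hypothesis can be sketched as a nearest-neighbor vote: embed the query image, compare it against reference embeddings of known AI-generated and known human-created images, and assign the label of the closest cluster. The function names and the two-dimensional reference vectors below are illustrative assumptions, not the paper's implementation (which uses five benchmark embedding models on real images):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify_by_proximity(query, ai_refs, human_refs):
    """Label a query embedding by whichever reference set contains
    its most similar neighbor (a 1-nearest-neighbor decision rule)."""
    ai_best = max(cosine_similarity(query, r) for r in ai_refs)
    human_best = max(cosine_similarity(query, r) for r in human_refs)
    return "ai" if ai_best >= human_best else "human"

# Toy reference embeddings: AI images cluster near one direction,
# human images near another (purely synthetic for illustration).
ai_refs = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
human_refs = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
```

A query embedding such as `np.array([0.95, 0.15])` would be labeled `"ai"` under this rule, since its best cosine match falls in the AI reference set.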


Key Contributions

  • Embedding-based detection framework that exploits the hypothesis that AI-generated images cluster more closely in embedding space than human-created images
  • Evaluation across five benchmark embedding models demonstrating robustness of embedding signatures under moderate-to-high image perturbations
  • Blockchain-integrated provenance system for authenticating digital image origin and integrity
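The blockchain-integrated provenance contribution can be sketched as a minimal hash-chained ledger: each registered image is stored as a SHA-256 digest of its bytes, and each record commits to the previous block's hash so tampering with history is detectable. The class and method names are hypothetical; the paper's actual system would run on a real blockchain rather than an in-memory list:

```python
import hashlib
import json

class ProvenanceLedger:
    """Toy hash-chained ledger for image provenance records."""

    def __init__(self):
        self.chain = []

    def register(self, image_bytes, label):
        """Append a record binding an image's hash to a label
        (e.g. 'ai-generated') and to the previous block."""
        prev = self.chain[-1]["block_hash"] if self.chain else "0" * 64
        record = {
            "image_hash": hashlib.sha256(image_bytes).hexdigest(),
            "label": label,
            "prev_hash": prev,
        }
        payload = json.dumps(record, sort_keys=True).encode()
        record["block_hash"] = hashlib.sha256(payload).hexdigest()
        self.chain.append(record)
        return record["block_hash"]

    def verify(self, image_bytes):
        """True if this exact image was previously registered."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        return any(b["image_hash"] == digest for b in self.chain)
```

Verification succeeds only for byte-identical images; even a one-bit change produces a different digest, which is what makes the hash a useful integrity anchor.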

🛡️ Threat Analysis

Output Integrity Attack

The primary contribution is an AI-generated image detection system that uses embedding similarity to distinguish synthetic from human-created images, combined with blockchain-based content provenance authentication. This directly addresses output integrity and content authenticity.


Details

Domains
vision, generative
Model Types
diffusion, gan
Threat Tags
inference_time
Applications
ai-generated image detection, digital content authentication, image provenance verification