Provenance Verification of AI-Generated Images via a Perceptual Hash Registry Anchored on Blockchain
Apoorv Mohit, Bhavya Aggarwal, Chinmay Gondhalekar
Published on arXiv
2602.02412
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The system deterministically identifies re-uploads and near-duplicate variants of registered AI-generated images by thresholding the Hamming distance between perceptual hashes stored in a tamper-resistant blockchain registry.
Perceptual Hash Blockchain Registry
Novel technique introduced
The rapid advancement of artificial intelligence has made the generation of synthetic images widely accessible, increasing concerns related to misinformation, digital forgery, and content authenticity on large-scale online platforms. This paper proposes a blockchain-backed framework for verifying AI-generated images through a registry-based provenance mechanism. Each AI-generated image is assigned a digital fingerprint that preserves similarity using perceptual hashing and is registered at creation time by participating generation platforms. The hashes are stored on a hybrid on-chain/off-chain public blockchain using a Merkle Patricia Trie for tamper-resistant storage (on-chain) and a Burkhard-Keller tree (off-chain) to enable efficient similarity search over large image registries. Verification is performed when images are re-uploaded to digital platforms such as social media services, enabling identification of previously registered AI-generated images even after benign transformations or partial modifications. The proposed system does not aim to universally detect all synthetic images, but instead focuses on verifying the provenance of AI-generated content that has been registered at creation time. By design, this approach complements existing watermarking and learning-based detection methods, providing a platform-agnostic, tamper-proof mechanism for scalable content provenance and authenticity verification at the point of large-scale online distribution.
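The off-chain similarity search described in the abstract can be illustrated with a minimal Burkhard-Keller tree over Hamming distance. This is a sketch under stated assumptions, not the authors' implementation: the 64-bit integer encoding of hashes, the `BKTree` class, its `add` and `query` methods, the sample hash values, and the search radius are all hypothetical choices made for illustration.

```python
from __future__ import annotations


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


class BKTree:
    """Minimal BK-tree keyed on Hamming distance (illustrative only)."""

    def __init__(self) -> None:
        # Each node is [hash_value, {edge_distance: child_node}].
        self.root = None

    def add(self, h: int) -> None:
        if self.root is None:
            self.root = [h, {}]
            return
        node = self.root
        while True:
            d = hamming(h, node[0])
            if d == 0:
                return  # hash already registered
            child = node[1].get(d)
            if child is None:
                node[1][d] = [h, {}]
                return
            node = child

    def query(self, h: int, radius: int) -> list[tuple[int, int]]:
        """Return (distance, hash) pairs within `radius` bits of `h`."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            value, children = stack.pop()
            d = hamming(h, value)
            if d <= radius:
                results.append((d, value))
            # Triangle inequality: a match can only live in a subtree whose
            # edge label lies in [d - radius, d + radius].
            for edge, child in children.items():
                if d - radius <= edge <= d + radius:
                    stack.append(child)
        return sorted(results)


# Hypothetical registered fingerprints (made-up values, not real image hashes).
registry = BKTree()
for registered in (0x0F0F0F0F0F0F0F0F, 0xFFFF0000FFFF0000, 0x123456789ABCDEF0):
    registry.add(registered)

# A probe that differs from the first registered hash in two bits.
probe = 0x0F0F0F0F0F0F0F0F ^ 0b101
matches = registry.query(probe, radius=4)
print(matches)
```

The pruning step is what makes the structure attractive for large registries: subtrees whose edge label falls outside the allowed band are never visited, so a query usually touches far fewer nodes than a linear scan. The radius trades recall on edited images against false matches, and the value used here is arbitrary.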
Key Contributions
- Hybrid on-chain/off-chain blockchain registry that uses a Merkle Patricia Trie for tamper-resistant hash storage and a BK-tree (Burkhard-Keller tree) for efficient similarity search over large image registries
- Perceptual hash (pHash)-based fingerprinting that enables provenance verification of AI-generated images even after benign transformations such as resizing, compression, or minor edits
- Registry-based provenance framework that is platform-agnostic and complementary to watermarking and learning-based AI image detectors, focused on verifying registered content rather than universal synthetic image detection
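The fingerprinting step in the contributions above can be sketched as a DCT-based perceptual hash. The code below is a simplified illustration, assuming the image has already been converted to a square grayscale matrix (for example 32x32) and that a 64-bit hash is wanted; the function names `phash` and `hamming`, the median threshold, and the synthetic test images are assumptions made for this example and may differ from the variant used in the paper.

```python
import math
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def phash(pixels, hash_size=8):
    """DCT-based perceptual hash of a square grayscale matrix.

    Keeps the hash_size x hash_size lowest-frequency DCT-II coefficients
    and sets one bit per coefficient: 1 if above the median, else 0.
    """
    n = len(pixels)

    def alpha(k):
        # Orthonormal DCT-II scaling factor.
        return math.sqrt((1.0 if k == 0 else 2.0) / n)

    # Basis values only for the low frequencies that are kept.
    basis = [
        [alpha(u) * math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
        for u in range(hash_size)
    ]
    coeffs = []
    for u in range(hash_size):
        for v in range(hash_size):
            total = 0.0
            for y in range(n):
                bu = basis[u][y]
                row = pixels[y]
                for x in range(n):
                    total += row[x] * bu * basis[v][x]
            coeffs.append(total)

    # Median of the AC terms; the DC term (index 0) is excluded because it
    # only reflects overall brightness.
    ac_sorted = sorted(coeffs[1:])
    median = ac_sorted[len(ac_sorted) // 2]

    bits = 0
    for c in coeffs:
        bits = (bits << 1) | (1 if c > median else 0)
    return bits


# Synthetic stand-ins for images (random values, purely for demonstration).
rng = random.Random(42)
original = [[rng.randrange(200) for _ in range(32)] for _ in range(32)]
brighter = [[p + 20 for p in row] for row in original]  # uniform brightness shift
unrelated = [[rng.randrange(200) for _ in range(32)] for _ in range(32)]

h_orig, h_bright, h_other = phash(original), phash(brighter), phash(unrelated)
print(hamming(h_orig, h_bright), hamming(h_orig, h_other))
```

A uniform brightness shift changes only the DC coefficient, so the shifted image should hash to nearly the same bits as the original, while an unrelated image should land far away in Hamming distance. Real transformations such as resizing or JPEG compression perturb the coefficients less cleanly, which is why matching uses a distance threshold rather than exact equality.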
🛡️ Threat Analysis
The paper directly addresses content provenance and authenticity of AI-generated images, a core ML09 concern. It proposes a novel registry mechanism that uses perceptual hashing to check whether uploaded images match known AI-generated content, complementing watermarking and learning-based detection.