Media Integrity and Authentication: Status, Directions, and Futures
Jessica Young, Sam Vaughan, Andrew Jenks, Henrique Malvar, Christian Paquin, Paul England, Thomas Roca, Juan LaVista Ferres, Forough Poursabzi, Neil Coles, Ken Archer, Eric Horvitz
Published on arXiv: 2602.18681
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Identifies sociotechnical reversal attacks, which can flip authenticity signals in either direction, as a critical underexplored threat to media integrity systems, and outlines secure-enclave-based provenance as a direction for high-confidence authentication.
We provide background on emerging challenges and future directions with media integrity and authentication methods, focusing on distinguishing AI-generated media from authentic content captured by cameras and microphones. We evaluate several approaches, including provenance, watermarking, and fingerprinting. After defining each method, we analyze three representative technologies: cryptographically secured provenance, imperceptible watermarking, and soft-hash fingerprinting. We analyze how these tools operate across modalities and evaluate relevant threat models, attack categories, and real-world workflows spanning capture, editing, distribution, and verification. We consider sociotechnical reversal attacks that can invert integrity signals, making authentic content appear synthetic and vice versa, highlighting the value of verification systems that are resilient to both technical and psychosocial manipulation. Finally, we outline techniques for delivering high-confidence provenance authentication, including directions for strengthening edge-device security using secure enclaves.
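To make the idea of cryptographically secured provenance concrete, here is a minimal sketch of binding claims to a media file. It is an illustration under stated assumptions, not the paper's method or any standard's format: the key `DEVICE_KEY` and the functions `make_manifest` and `verify_manifest` are hypothetical, and an HMAC with a shared key stands in for the asymmetric signature that a real capture device would produce with a key held in secure hardware.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. A real system would sign with an asymmetric
# private key protected by secure hardware, not a shared secret.
DEVICE_KEY = b"demo-device-key"


def make_manifest(media: bytes, claims: dict) -> dict:
    """Bind claims to media: hash the content, then authenticate the record."""
    record = {
        "content_sha256": hashlib.sha256(media).hexdigest(),
        "claims": claims,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    tag = hmac.new(DEVICE_KEY, payload, hashlib.sha256).hexdigest()
    return {"record": record, "tag": tag}


def verify_manifest(media: bytes, manifest: dict) -> bool:
    """Return True only if the record is untampered and matches the media."""
    payload = json.dumps(manifest["record"], sort_keys=True).encode()
    expected = hmac.new(DEVICE_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["tag"]):
        return False  # the record (for example its claims) was altered
    actual = hashlib.sha256(media).hexdigest()
    return manifest["record"]["content_sha256"] == actual


if __name__ == "__main__":
    media = b"raw sensor bytes"
    manifest = make_manifest(media, {"source": "camera"})
    print(verify_manifest(media, manifest))            # matching media
    print(verify_manifest(media + b"!", manifest))     # altered media
```

In this sketch, changing either the media bytes or the claims causes verification to fail, which is the property that provenance relies on. It says nothing about whether the key itself is trustworthy, which is where the abstract's interest in edge-device security comes in.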
Key Contributions
- Comparative analysis of three representative media integrity technologies: cryptographically secured provenance, imperceptible watermarking, and soft-hash fingerprinting across modalities
- Taxonomy of threat models and attack categories spanning capture, editing, distribution, and verification workflows for media authentication
- Identification of sociotechnical reversal attacks that invert integrity signals (making authentic content appear synthetic and vice versa) and directions for secure-enclave-based edge authentication
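The contrast between soft-hash fingerprinting and an exact cryptographic hash can be sketched with a toy example. This is a simplified average-hash over a tiny grayscale grid, chosen here only for illustration; the paper's abstract does not specify a fingerprinting algorithm, and the names `average_hash` and `hamming` and the pixel values are assumptions.

```python
import hashlib


def average_hash(pixels):
    """Toy soft hash: one bit per pixel, set when the pixel exceeds the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of differing bits between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))


def exact_digest(pixels):
    """Cryptographic hash of the raw pixel bytes; any change alters it."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()


# Hypothetical 4x4 grayscale image with alternating dark and bright columns.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 225],
    [18, 215, 28, 235],
]
# A mild edit (uniform brightening) and a very different image (inversion).
brightened = [[min(255, p + 5) for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

if __name__ == "__main__":
    base = average_hash(original)
    print("soft-hash distance after brightening:", hamming(base, average_hash(brightened)))
    print("soft-hash distance after inversion:", hamming(base, average_hash(inverted)))
    print("exact digests equal after brightening:", exact_digest(original) == exact_digest(brightened))
```

The soft hash stays close under a benign edit while the exact digest changes completely. That tolerance is what lets a fingerprint match re-encoded or lightly edited copies, and it is also why a fingerprint match is weaker evidence than a verified cryptographic binding.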
Threat Analysis
The paper directly surveys output integrity methods (AI-generated content detection, imperceptible watermarking, cryptographic provenance, and soft-hash fingerprinting) and analyzes threat models, including sociotechnical reversal attacks that invert integrity signals. All of these fall squarely within ML09.