
Proto-LeakNet: Towards Signal-Leak Aware Attribution in Synthetic Human Face Imagery

Claudio Giusti, Luca Guarnera, Sebastiano Battiato

0 citations · 50 references · arXiv


Published on arXiv: 2511.04260

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves Macro AUC of 98.13% on closed-set synthetic face attribution while maintaining robust separability between real, known-generator, and unseen-generator images under post-processing, surpassing state-of-the-art methods.

Proto-LeakNet

Novel technique introduced


The growing sophistication of synthetic image and deepfake generation models has turned source attribution and authenticity verification into a critical challenge for modern computer vision systems. Recent studies suggest that diffusion pipelines unintentionally imprint persistent statistical traces, known as signal-leaks, within their outputs, particularly in latent representations. Building on this observation, we propose Proto-LeakNet, a signal-leak-aware and interpretable attribution framework that integrates closed-set classification with a density-based open-set evaluation on the learned embeddings, enabling analysis of unseen generators without retraining. Acting in the latent domain of diffusion models, our method re-simulates partial forward diffusion to expose residual generator-specific cues. A temporal attention encoder aggregates multi-step latent features, while a feature-weighted prototype head structures the embedding space and enables transparent attribution. Trained solely on closed data and achieving a Macro AUC of 98.13%, Proto-LeakNet learns a latent geometry that remains robust under post-processing, surpassing state-of-the-art methods, and achieves strong separability both between real images and known generators, and between known and unseen ones. The codebase will be available after acceptance.
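The abstract's "re-simulates partial forward diffusion" step can be illustrated with the standard DDPM closed form, q(x_t | x_0) = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. This is a minimal sketch, not the paper's implementation: the linear beta schedule, latent shape, and the timesteps {0, 5, 10} (taken from the contributions below) are illustrative assumptions.

```python
import numpy as np

def forward_diffuse(latent, t, betas):
    """Closed-form DDPM forward step: sample q(x_t | x_0) in one shot.

    q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]
    noise = np.random.default_rng(0).standard_normal(latent.shape)
    return np.sqrt(alpha_bar) * latent + np.sqrt(1.0 - alpha_bar) * noise

# Re-simulate a few early forward steps on an image's latent representation.
# Linear schedule and 4x32x32 latent shape are assumptions for illustration.
betas = np.linspace(1e-4, 0.02, 1000)
latent = np.random.default_rng(1).standard_normal((4, 32, 32))
trajectory = [latent] + [forward_diffuse(latent, t, betas) for t in (5, 10)]
```

Each element of `trajectory` is then a candidate input for feature extraction; generator-specific residual statistics ("signal leaks") are expected to persist across these lightly noised latents.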


Key Contributions

  • Signal-leak-aware latent-domain attribution: re-simulates partial forward diffusion to extract generator-specific statistical traces from diffusion model latent representations
  • Temporal attention encoder that aggregates multi-step latent features across diffusion timesteps (t∈{0,5,10}) via a ResNet-18 + attention pooling backbone
  • Feature-weighted prototype head enabling interpretable closed-set attribution and density-based open-set evaluation of unseen generators without retraining
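The temporal attention pooling in the second bullet can be sketched in NumPy: per-timestep backbone features are scored against a query vector and combined by softmax weights. The feature dimension (512, matching ResNet-18's pooled output) and the random query are assumptions here; in the paper the attention parameters would be learned end-to-end.

```python
import numpy as np

def attention_pool(features, query):
    """Aggregate per-timestep feature vectors into one embedding.

    features: (T, D) array, one row per diffusion timestep.
    query:    (D,) scoring vector (learned in the real model).
    Returns the weighted sum (D,) and the softmax weights (T,).
    """
    scores = features @ query
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ features, weights

rng = np.random.default_rng(0)
feats = rng.standard_normal((3, 512))  # backbone features at t in {0, 5, 10}
query = rng.standard_normal(512)       # hypothetical learned attention query
pooled, w = attention_pool(feats, query)
```

The softmax weights also give a per-timestep importance score, which is one way such an encoder supports interpretability.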

🛡️ Threat Analysis

Output Integrity Attack

Proposes a forensic attribution framework for AI-generated content (synthetic face imagery from diffusion models), covering both closed-set source attribution and open-set detection of unseen generators. This is core output-integrity and content-provenance research with novel architectural contributions: signal-leak exploitation, a temporal attention encoder, and a feature-weighted prototype head.
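The closed-set/open-set split described above can be sketched as nearest-prototype attribution with a rejection threshold on the learned embedding. This is a simplification under stated assumptions: the threshold `tau`, the 64-d embedding, and plain Euclidean distance are illustrative stand-ins for the paper's density-based open-set criterion.

```python
import numpy as np

def prototype_attribute(emb, prototypes, tau):
    """Closed-set: assign the nearest class prototype.
    Open-set: reject as 'unseen' when even the nearest prototype is far.
    """
    dists = np.linalg.norm(prototypes - emb, axis=1)
    k = int(np.argmin(dists))
    if dists[k] > tau:
        return ("unseen", None)
    return ("known", k)

rng = np.random.default_rng(0)
protos = rng.standard_normal((4, 64))          # one prototype per known generator
near = protos[2] + 0.01 * rng.standard_normal(64)  # embedding close to class 2
far = 10.0 + rng.standard_normal(64)               # embedding off the manifold
```

With this setup, `near` resolves to the known generator behind prototype 2, while `far` exceeds the threshold and is flagged as coming from an unseen generator, mirroring the closed-set attribution plus open-set evaluation pipeline.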


Details

Domains
vision, generative
Model Types
diffusion, cnn, transformer
Threat Tags
inference_time
Applications
synthetic face attribution, deepfake detection, ai-generated image forensics