DRAG: Data Reconstruction Attack using Guided Diffusion
Wa-Kin Lei¹, Jun-Cheng Chen², Shang-Tse Chen¹
Published on arXiv (arXiv:2509.11724)
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
DRAG significantly outperforms state-of-the-art data reconstruction methods on deep-layer intermediate representations of vision foundation models, demonstrating previously underestimated privacy risks in split inference deployments.
DRAG
Novel technique introduced
With the rise of large foundation models, split inference (SI) has emerged as a popular computational paradigm for deploying models across lightweight edge devices and cloud servers, addressing data privacy and computational cost concerns. However, most existing data reconstruction attacks have focused on smaller CNN classification models, leaving the privacy risks of foundation models in SI settings largely unexplored. To address this gap, we propose a novel data reconstruction attack based on guided diffusion, which leverages the rich prior knowledge embedded in a latent diffusion model (LDM) pre-trained on a large-scale dataset. Our method performs iterative reconstruction on the LDM's learned image prior, effectively generating high-fidelity images resembling the original data from their intermediate representations (IR). Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods, both qualitatively and quantitatively, in reconstructing data from deep-layer IRs of the vision foundation model. The results highlight the urgent need for more robust privacy protection mechanisms for large models in SI scenarios. Code is available at: https://github.com/ntuaislab/DRAG.
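To make the split inference (SI) setting concrete, the sketch below shows a toy model partitioned between an edge client and a cloud server: only the intermediate representation (IR) crosses the network, which is exactly the tensor an attacker in this threat model observes. The model, weights, and split point are illustrative stand-ins, not the paper's architecture.

```python
import numpy as np

# Hypothetical two-layer model split across an edge device and a cloud server.
# Shapes and weights are illustrative only, not taken from the paper.
rng = np.random.default_rng(1)
W1 = rng.standard_normal((16, 32))  # client-side (edge) layer
W2 = rng.standard_normal((4, 16))   # server-side (cloud) layer

def client_forward(x):
    """Edge device: compute the IR that leaves the device."""
    return np.maximum(W1 @ x, 0.0)  # ReLU feature map

def server_forward(ir):
    """Cloud server: finish inference from the IR alone."""
    return W2 @ ir

x = rng.standard_normal(32)   # private input, never sent to the server
ir = client_forward(x)        # only this tensor crosses the network
logits = server_forward(ir)

# Sanity check: the split pipeline matches a monolithic forward pass.
full_logits = W2 @ np.maximum(W1 @ x, 0.0)
assert np.allclose(logits, full_logits)
```

The privacy question the paper studies is whether `ir`, which the server sees by design, leaks enough information to reconstruct `x`.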
Key Contributions
- DRAG: a novel data reconstruction attack using a pre-trained latent diffusion model as a guided prior to reconstruct private inputs from intermediate representations in split inference
- First systematic evaluation of data reconstruction attacks against deep-layer IRs of large vision foundation models (as opposed to prior CNN-focused work)
- Demonstrates that foundation models in split inference are significantly more vulnerable than previously assumed, with DRAG outperforming SOTA reconstruction methods both qualitatively and quantitatively
🛡️ Threat Analysis
The core contribution is a data reconstruction attack: an adversary (a malicious or curious cloud server in split inference) reconstructs private input images from the intermediate representations of a vision foundation model. The attack uses a pre-trained latent diffusion model as a prior to iteratively invert IRs back to high-fidelity images — a direct model inversion attack targeting input data privacy.
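The feature-matching core of such an inversion can be sketched with toy linear stand-ins: a "decoder" `D(z) = V z` plays the role of the generative prior and a "client model" `f(x) = W x` produces the observed IR; gradient descent on the latent `z` then drives the reconstruction's IR toward the observed one. This is a minimal illustration of IR inversion under a latent prior, not the paper's guided-diffusion procedure, and every matrix and step size here is an assumption.

```python
import numpy as np

# Toy stand-ins (NOT the paper's models): a linear "decoder" D(z) = V z plays
# the role of the generative prior, and a linear "client model" f(x) = W x
# produces the intermediate representation (IR) the attacker observes.
rng = np.random.default_rng(0)
d_latent, d_image, d_ir = 8, 32, 16
V = rng.standard_normal((d_image, d_latent))   # hypothetical decoder weights
W = rng.standard_normal((d_ir, d_image))       # hypothetical client-side layers

# Private input the server never sees directly; it observes only the IR.
x_private = V @ rng.standard_normal(d_latent)  # lies in the prior's range
r_observed = W @ x_private

# Attack: gradient descent on the latent z so that the reconstruction's IR
# matches the observed one, i.e. minimize 0.5 * ||W V z - r||^2 over z.
A = W @ V
lr = 1.0 / (np.linalg.norm(A, 2) ** 2)  # step size from the spectral norm
z = np.zeros(d_latent)
for _ in range(3000):
    residual = A @ z - r_observed
    z -= lr * (A.T @ residual)          # analytic gradient of the objective

x_reconstructed = V @ z
rel_err = np.linalg.norm(x_reconstructed - x_private) / np.linalg.norm(x_private)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In DRAG the decoder is a pre-trained LDM and the optimization is a guided diffusion process rather than plain gradient descent, but the objective — matching the victim's IR while staying on the prior's learned image manifold — is the same.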