What Your Features Reveal: Data-Efficient Black-Box Feature Inversion Attack for Split DNNs
Zhihan Ren, Lijun He, Jiaxi Liang, Xinzhu Fu, Haixia Bi, Fan Li
Published on arXiv
2511.15316
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
FIA-Flow achieves higher-fidelity and semantically aligned feature inversion than prior methods across AlexNet, ResNet, Swin Transformer, DINO, and YOLO11, revealing more severe privacy threats in split DNNs than previously recognized.
FIA-Flow
Novel technique introduced
Split DNNs enable edge devices by offloading intensive computation to a cloud server, but this paradigm exposes privacy vulnerabilities, as the intermediate features can be exploited to reconstruct the private inputs via a Feature Inversion Attack (FIA). Existing FIA methods often produce limited reconstruction quality, making it difficult to assess the true extent of privacy leakage. To reveal the privacy risk of the leaked features, we introduce FIA-Flow, a black-box FIA framework that achieves high-fidelity image reconstruction from intermediate features. To exploit the semantic information within intermediate features, we design a Latent Feature Space Alignment Module (LFSAM) to bridge the semantic gap between the intermediate feature space and the latent space. Furthermore, to rectify distributional mismatch, we develop Deterministic Inversion Flow Matching (DIFM), which projects off-manifold features onto the target manifold with one-step inference. This decoupled design simplifies learning and enables effective training with few image-feature pairs. To quantify privacy leakage from a human perspective, we also propose two metrics based on a large vision-language model. Experiments show that FIA-Flow achieves more faithful and semantically aligned feature inversion across various models (AlexNet, ResNet, Swin Transformer, DINO, and YOLO11) and layers, revealing a more severe privacy threat in split DNNs than previously recognized.
Key Contributions
- FIA-Flow: a black-box Feature Inversion Attack framework using flow matching to reconstruct high-fidelity images from intermediate split-DNN features with few image-feature training pairs
- Latent Feature Space Alignment Module (LFSAM) that bridges the semantic gap between the intermediate feature space and the generative latent space
- Deterministic Inversion Flow Matching (DIFM) for one-step projection of off-manifold features onto the data manifold, plus two VLM-based metrics for human-aligned privacy leakage quantification
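The one-step projection idea can be illustrated with a generic straight-path flow matching sketch. This is not the paper's DIFM implementation, whose details are not given in this summary. It assumes the common linear interpolation path between a source point `x0` (standing in for an aligned feature) and a target point `x1` (standing in for a latent on the data manifold). All function names, the toy affine map, and the closed-form `oracle_velocity` are hypothetical stand-ins for a trained velocity network.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from x0 (t=0) to x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Constant velocity of the straight path; the regression target for v(x_t, t)."""
    return [b - a for a, b in zip(x0, x1)]

def make_training_example(x0, x1, rng):
    """Sample a time t and return (x_t, t, u) for one flow matching training step."""
    t = rng.random()
    return interpolate(x0, x1, t), t, target_velocity(x0, x1)

def one_step_inference(x0, velocity_fn):
    """Single Euler step over the whole interval [0, 1]: x1_hat = x0 + v(x0, 0)."""
    v = velocity_fn(x0, 0.0)
    return [a + dv for a, dv in zip(x0, v)]

# Toy deterministic pairing (assumption): each target is an affine map of its source.
def true_map(x0):
    return [2.0 * a + 1.0 for a in x0]

def oracle_velocity(x, t):
    """Closed-form velocity field for the toy map above.

    On the path x_t = (1 + t) * x0 + t, the target x1 - x0 = x0 + 1
    equals (x_t + 1) / (1 + t). A trained network would approximate this.
    """
    return [(xi + 1.0) / (1.0 + t) for xi in x]

x0 = [0.5, -1.0]
x1 = true_map(x0)
x_t, t, u = make_training_example(x0, x1, random.Random(0))
x1_hat = one_step_inference(x0, oracle_velocity)
```

Because the pairing here is deterministic and the paths are straight, one Euler step from `t = 0` lands on the target, which is the property that makes single-step inference plausible in this toy setting.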
🛡️ Threat Analysis
FIA-Flow is a model inversion attack: an adversary (cloud server or eavesdropper) reconstructs private input images from intermediate feature representations transmitted during split DNN inference, directly fitting the 'embedding inversion — recovering data from embedding vectors' criterion of ML03.
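The threat model can be sketched as follows, assuming "black-box" means the adversary observes only the features the edge-side network emits and never its weights. The toy head, its weights, and `collect_pairs` are hypothetical illustrations; the paper's actual models are deep networks such as ResNet or YOLO11.

```python
from typing import Callable, List, Tuple

Vector = List[float]

def make_toy_head(weights: List[Vector]) -> Callable[[Vector], Vector]:
    """Edge-side part of a split DNN: one linear layer followed by ReLU (toy stand-in)."""
    def head(x: Vector) -> Vector:
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]
    return head

def collect_pairs(
    query_head: Callable[[Vector], Vector],
    aux_inputs: List[Vector],
) -> List[Tuple[Vector, Vector]]:
    """Black-box data collection: the attacker sees outputs only, not the weights."""
    return [(x, query_head(x)) for x in aux_inputs]

# The weights stay inside the closure; the attacker only holds `head` as a callable.
head = make_toy_head([[1.0, -1.0], [0.5, 0.5]])
pairs = collect_pairs(head, [[1.0, 2.0], [3.0, 1.0]])
```

An inversion model would then be trained on `pairs` to map each feature back to its input. The summary's data-efficiency claim concerns how few such pairs are needed.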