defense 2025

PrivDFS: Private Inference via Distributed Feature Sharing against Data Reconstruction Attacks

Zihan Liu 1, Jiayi Wen 1, Junru Wu 1, Xuyang Zou 1, Shouhong Tan 1, Zhirun Zheng 2, Cheng Huang 1


Published on arXiv (2508.04346)

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

On CIFAR-10, PrivDFS reduces DRA reconstruction quality from PSNR 23.25 to 12.72 and SSIM 0.963 to 0.260 while maintaining accuracy within 1% of non-private split inference.

PrivDFS

Novel technique introduced


In this paper, we introduce PrivDFS, a distributed feature-sharing framework for input-private inference in image classification. A single holistic intermediate representation in split inference gives diffusion-based Data Reconstruction Attacks (DRAs) sufficient signal to reconstruct the input with high fidelity. PrivDFS restructures this vulnerability by fragmenting the representation and processing the fragments independently across a majority-honest set of servers. As a result, each branch observes only an incomplete and reconstruction-insufficient view of the input. To realize this, PrivDFS employs learnable binary masks that partition the intermediate representation into sparse and largely non-overlapping feature shares, each processed by a separate server, while a lightweight fusion module aggregates their predictions on the client. This design preserves full task accuracy when all branches are combined, yet sharply limits the reconstructive power available to any individual server. PrivDFS applies seamlessly to both ResNet-based CNNs and Vision Transformers. Across CIFAR-10/100, CelebA, and ImageNet-1K, PrivDFS induces a pronounced collapse in DRA performance, e.g., on CIFAR-10, PSNR drops from 23.25 → 12.72 and SSIM from 0.963 → 0.260, while maintaining accuracy within 1% of non-private split inference. These results establish structural feature partitioning as a practical and architecture-agnostic approach to reducing reconstructive leakage in cloud-based vision inference.
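The partition-and-fuse idea in the abstract can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: PrivDFS learns its binary masks and uses a trained fusion module, whereas the masks, linear branch heads, and mean fusion below are stand-in assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_disjoint_masks(num_channels, num_servers):
    """Assign each feature channel to exactly one server, yielding sparse,
    non-overlapping binary masks. (PrivDFS learns these masks; here they
    are random fixed masks purely for illustration.)"""
    assignment = rng.permutation(num_channels) % num_servers
    return [(assignment == s).astype(np.float32) for s in range(num_servers)]

# Client-side encoder output: a holistic intermediate representation.
features = rng.normal(size=(64,))            # 64 feature channels
masks = make_disjoint_masks(64, num_servers=3)

# Each server receives only its masked, incomplete share ...
shares = [features * m for m in masks]

# ... runs its own branch classifier (hypothetical linear heads here) ...
heads = [rng.normal(size=(64, 10)) for _ in masks]
branch_logits = [s @ w for s, w in zip(shares, heads)]

# ... and the client fuses the branch predictions into one output.
fused = np.mean(branch_logits, axis=0)
prediction = int(np.argmax(fused))

# No single share exposes the full representation, yet together the
# shares cover every channel:
assert all(int(m.sum()) < 64 for m in masks)
assert sum(int(m.sum()) for m in masks) == 64
```

The key structural property is that the masks are disjoint: any one server's view is missing most channels, which is what starves a diffusion-based DRA of reconstruction signal.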


Key Contributions

  • PrivDFS framework that fragments intermediate representations into sparse, non-overlapping feature shares distributed across a majority-honest set of servers so no single server can reconstruct the input
  • Learnable binary masks that partition intermediate representations for architecture-agnostic deployment across ResNet CNNs and Vision Transformers
  • Empirical demonstration that structural feature partitioning collapses DRA performance (PSNR 23.25→12.72, SSIM 0.963→0.260 on CIFAR-10) while preserving task accuracy within 1%
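The reconstruction-quality collapse above is reported in PSNR (and SSIM). PSNR has a standard closed form; the sketch below computes it for images scaled to [0, 1], with synthetic "reconstructions" standing in for actual DRA outputs (all data here is illustrative, not from the paper).

```python
import numpy as np

def psnr(original, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB for images in [0, peak].
    Higher PSNR = more faithful reconstruction = more leakage."""
    mse = np.mean((original - reconstruction) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))  # a CIFAR-sized image stand-in

# Mildly corrupted reconstruction vs. a heavily degraded one:
good = np.clip(img + rng.normal(0, 0.05, img.shape), 0, 1)
bad = np.clip(img + rng.normal(0, 0.30, img.shape), 0, 1)

print(psnr(img, good))  # higher: attacker recovered the input well
print(psnr(img, bad))   # lower: reconstruction is badly degraded
```

A drop like the reported 23.25 → 12.72 dB corresponds to roughly an order-of-magnitude increase in per-pixel mean squared error of the attacker's reconstructions. SSIM is a more involved perceptual metric; library implementations (e.g. scikit-image) are typically used for it.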

🛡️ Threat Analysis

Model Inversion Attack

The threat model is an adversarial server in split inference that uses diffusion-based Data Reconstruction Attacks (DRAs) to reconstruct private input data from intermediate feature representations — a data reconstruction attack from model internals. PrivDFS directly defends against this by ensuring no single server receives a reconstruction-sufficient feature view, reducing PSNR from 23.25 to 12.72 and SSIM from 0.963 to 0.260 on CIFAR-10.


Details

Domains: vision
Model Types: cnn, transformer
Threat Tags: inference_time, white_box
Datasets: CIFAR-10, CIFAR-100, CelebA, ImageNet-1K
Applications: image classification, cloud-based vision inference, split inference