PRIVEE: Privacy-Preserving Vertical Federated Learning Against Feature Inference Attacks
Sindhuja Madabushi 1, Ahmad Faraz Khan 1, Haider Ali 1, Ananthram Swami 2, Rui Ning 3, Hongyi Wu 4, Jin-Hee Cho 1
Published on arXiv: 2512.12840
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
PRIVEE achieves a threefold improvement in privacy protection against advanced feature inference attacks compared to state-of-the-art defenses, while fully preserving predictive performance.
PRIVEE
Novel technique introduced
Vertical Federated Learning (VFL) enables collaborative model training across organizations that share common user samples but hold disjoint feature spaces. Despite its potential, VFL is susceptible to feature inference attacks, in which adversarial parties exploit shared confidence scores (i.e., prediction probabilities) during inference to reconstruct the private input features of other participants. To counter this threat, we propose PRIVEE (PRIvacy-preserving Vertical fEderated lEarning), a novel defense mechanism named after the French word privée, meaning "private." PRIVEE obfuscates confidence scores while preserving critical properties such as relative ranking and inter-score distances. Rather than exposing raw scores, PRIVEE shares only the transformed representations, mitigating the risk of reconstruction attacks without degrading model prediction accuracy. Extensive experiments show that PRIVEE achieves a threefold improvement in privacy protection against advanced feature inference attacks compared to state-of-the-art defenses, while fully preserving predictive performance.
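To make the obfuscation idea concrete, here is a minimal sketch of one family of transformations with the properties the abstract names: a per-query positive affine map preserves the ranking of confidence scores and preserves inter-score distances up to a common scale. The function and its parameter ranges are illustrative assumptions, not PRIVEE's actual transformation, which the summary does not specify.

```python
import numpy as np

def obfuscate_scores(scores, rng=None):
    """Illustrative rank- and relative-distance-preserving obfuscation.

    Applies a per-query positive affine map t(s) = a*s + b with random
    a > 0 and b. NOTE: a hypothetical stand-in for PRIVEE's transform,
    shown only to demonstrate the preserved properties.
    """
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    a = rng.uniform(0.5, 2.0)   # random positive scale (keeps ordering)
    b = rng.uniform(-1.0, 1.0)  # random shift (hides absolute values)
    return a * scores + b

scores = np.array([0.7, 0.2, 0.1])
obf = obfuscate_scores(scores, rng=np.random.default_rng(0))

# Predicted class (argmax) is unchanged, so utility is preserved.
assert np.argmax(obf) == np.argmax(scores)

# Inter-score gaps are scaled by the same factor a, so their ratios
# (relative distances) are unchanged.
gaps, obf_gaps = np.diff(np.sort(scores)), np.diff(np.sort(obf))
assert np.allclose(obf_gaps / obf_gaps.sum(), gaps / gaps.sum())
```

Because the raw probabilities are never revealed, an attacker observing only the transformed scores loses the absolute calibration information that reconstruction attacks rely on, while downstream consumers can still rank predictions.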
Key Contributions
- PRIVEE defense mechanism that obfuscates VFL confidence scores while preserving relative ranking and inter-score distances, preventing feature reconstruction without degrading predictive performance
- Threefold improvement in privacy protection against feature inference attacks compared to state-of-the-art VFL defenses
- Addresses inference-time attacks specifically, which are not mitigated by training-time defenses such as differential privacy
🛡️ Threat Analysis
Feature inference attacks in VFL are data reconstruction attacks: the adversary exploits shared confidence scores (model outputs) at inference time to reconstruct private input features belonging to other VFL participants. PRIVEE defends against this by transforming confidence scores to prevent reconstruction while preserving utility — this is the core ML03 threat of recovering private data from model outputs.
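The attack surface described above can be sketched with a toy reconstruction attack: an adversary who holds auxiliary samples with known victim-side features, and who can observe the raw confidence scores the VFL model emits, fits a regression from scores back to features. Everything here (the stand-in score head, the least-squares attacker) is an illustrative assumption, far simpler than the "advanced" attacks the paper evaluates, but it shows why exposing raw scores leaks feature information.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, k = 500, 3, 4                # aux samples, victim feature dim, classes
X_priv = rng.normal(size=(n, d))   # victim-party features known to attacker
W = rng.normal(size=(d, k))        # hypothetical stand-in for the score head

def confidence_scores(x):
    """Softmax over logits; plays the role of the shared prediction vector."""
    z = x @ W
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Adversary observes score vectors for its auxiliary samples...
S = confidence_scores(X_priv)

# ...and fits a least-squares map from scores back to private features.
coef, *_ = np.linalg.lstsq(S, X_priv, rcond=None)

# At inference time, a fresh victim sample is reconstructed from its
# confidence scores alone -- no access to the victim's raw features.
x_new = rng.normal(size=(1, d))
x_hat = confidence_scores(x_new) @ coef
```

PRIVEE breaks exactly this pipeline: because only obfuscated representations are shared, the attacker's regression is fit against transformed scores whose absolute values no longer carry a stable mapping back to the input features.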