RobPI: Robust Private Inference against Malicious Client
Jiaqi Xue, Mengxin Zheng, Qian Lou
Published on arXiv
arXiv:2602.19918
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
RobPI reduces the attack success rate by ~91.9% and forces attackers to issue more than 10x as many queries as against unprotected private inference protocols.
RobPI
Novel technique introduced
The increased deployment of machine learning inference in various applications has sparked privacy concerns. In response, private inference (PI) protocols have been developed to let parties perform inference without revealing their sensitive data. Despite recent advances in PI efficiency, most current methods assume a semi-honest threat model in which the data owner is honest and follows the protocol. In reality, however, data owners can have different motivations and act unpredictably, making this assumption unrealistic. To demonstrate how a malicious client can compromise the semi-honest model, we first design an inference manipulation attack against a range of state-of-the-art private inference protocols. This attack lets a malicious client modify the model output with 3x to 8x fewer queries than current black-box attacks. Motivated by this attack, we propose and implement RobPI, a robust and resilient private inference protocol that withstands malicious clients. RobPI integrates a distinctive cryptographic protocol that strengthens security by weaving encryption-compatible noise into the logits and features of private inference, efficiently warding off malicious-client attacks. Extensive experiments on various neural networks and datasets show that RobPI reduces the attack success rate by ~91.9% and increases the number of queries required by malicious-client attacks by more than 10x.
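The defense's core idea, noise woven into the logits before they reach the client, can be sketched in plain Python. This is a minimal illustration under assumed details: the function name, the zero-mean Gaussian noise model, and the `sigma` scale are all hypothetical, and the paper's actual mechanism uses encryption-compatible noise inside the FHE-based protocol rather than plaintext arithmetic.

```python
import random

def noisy_logits(logits, sigma=0.5, rng=None):
    # Hypothetical sketch: perturb each logit with zero-mean Gaussian
    # noise before returning it to the client. The paper's mechanism
    # applies encryption-compatible noise within the PI protocol.
    rng = rng or random.Random()
    return [z + rng.gauss(0.0, sigma) for z in logits]
```

With `sigma=0` the response is unchanged; with `sigma>0` the precise logit values a query-based attacker relies on are masked, while a large top-1 margin usually survives, which is why accuracy can be preserved while attack signal is destroyed.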
Key Contributions
- Novel inference manipulation attack (PNet-attack) against FHE-based private inference protocols, requiring 3x–8x fewer queries than existing black-box attacks
- RobPI: a cryptographic-compatible defense that injects noise into logits and features of private inference to thwart malicious-client attacks
- Dynamic noise training (RobPI-DNT) algorithm to mitigate the accuracy degradation caused by noise injection
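The third contribution can be sketched as a training loop. Everything below is an illustrative assumption (the callback names, the epoch-indexed noise schedule, and the feature-level perturbation), not the paper's algorithm: the idea is simply that the model sees during training the same kind of noise RobPI injects at inference time, so accuracy degrades less.

```python
import random

def dnt_epoch(examples, forward, backward, sigma_schedule, epoch):
    # Hypothetical sketch of dynamic noise training (RobPI-DNT):
    # perturb intermediate features with an epoch-dependent noise
    # scale during training, so the deployed model tolerates the
    # noise injected at inference time with little accuracy loss.
    rng = rng_for_epoch = random.Random(epoch)
    sigma = sigma_schedule(epoch)
    for x, y in examples:
        feats = forward(x)
        noisy = [f + rng.gauss(0.0, sigma) for f in feats]
        backward(noisy, y)
```

A schedule that varies `sigma` across epochs (the "dynamic" part) would let training start near-clean and gradually match the inference-time noise level.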
🛡️ Threat Analysis
The core threat is a malicious client crafting adversarial queries (a black-box, query-based attack) to manipulate model outputs at inference time within FHE-based private inference protocols. RobPI defends against this inference manipulation attack by injecting encryption-compatible noise into the logits and features of private inference, reducing the attack success rate by ~91.9%.