ZORRO: Zero-Knowledge Robustness and Privacy for Split Learning (Full Version)
Nojan Sheybani 1, Alessandro Pegoraro 2, Jonathan Knauer 2, Phillip Rieger 2, Elissa Mollakuqe 2, Farinaz Koushanfar 1, Ahmad-Reza Sadeghi 2
Published on arXiv: 2509.09787
Model Poisoning
OWASP ML Top 10 — ML10
Key Finding
Reduces backdoor attack success rate to below 6% in Split Learning while incurring less than 10 seconds of overhead for models with up to 1 million client-side parameters.
ZORRO
Novel technique introduced
Split Learning (SL) is a distributed learning approach that enables resource-constrained clients to collaboratively train deep neural networks (DNNs) by offloading most layers to a central server while keeping the input and output layers on the client side. This setup allows SL to leverage the server's computational capacity without sharing raw data, making it highly effective in resource-constrained environments that handle sensitive data. However, its distributed nature also enables malicious clients to manipulate the training process: by sending poisoned intermediate gradients, they can inject backdoors into the shared DNN. Existing defenses are limited, as they often focus on server-side protection and introduce additional overhead for the server. A significant challenge for client-side defenses is forcing malicious clients to correctly execute the defense algorithm. We present ZORRO, a private, verifiable, and robust SL defense scheme. Through our novel design and application of interactive zero-knowledge proofs (ZKPs), clients prove their correct execution of a client-located defense algorithm, resulting in proofs of computational integrity attesting to the benign nature of locally trained DNN portions. Leveraging the frequency representation of model partitions enables ZORRO to conduct an in-depth inspection of locally trained models in an untrusted environment, ensuring that each client forwards a benign checkpoint to its succeeding client. In our extensive evaluation, covering different model architectures as well as various attack strategies and data scenarios, we show ZORRO's effectiveness: it reduces the attack success rate to less than 6% while incurring an overhead of less than 10 seconds, even for models storing 1,000,000 parameters on the client side.
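The frequency-domain inspection can be illustrated with a minimal sketch (this is not ZORRO's actual algorithm; the naive DCT, function names, cutoff, and threshold below are all illustrative assumptions). Backdoor manipulations of a weight vector often shift spectral energy into atypical frequency bands, so a simple check is to take the DCT of the flattened client-side weights and flag checkpoints whose high-frequency energy ratio is too large:

```python
import math

def dct_ii(x):
    """Naive type-II DCT of a list of floats (O(n^2), for illustration only)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n))
        for k in range(n)
    ]

def high_freq_energy_ratio(weights, cutoff=0.5):
    """Fraction of spectral energy above the cutoff frequency index (illustrative)."""
    coeffs = dct_ii(weights)
    split = int(len(coeffs) * cutoff)
    total = sum(c * c for c in coeffs)
    high = sum(c * c for c in coeffs[split:])
    return high / total if total > 0 else 0.0

def looks_benign(weights, threshold=0.3):
    """Accept a checkpoint only if its high-frequency energy is modest."""
    return high_freq_energy_ratio(weights) <= threshold
```

A smooth weight vector such as `[1.0] * 8` concentrates its energy in the DC coefficient and passes, while a rapidly alternating one like `[1, -1, 1, -1, ...]` concentrates energy in the high frequencies and is rejected. In ZORRO, it is the correct execution of such an inspection, not the inspection alone, that the client additionally proves in zero knowledge.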
Key Contributions
- ZORRO: a Split Learning defense scheme combining DCT-based frequency-domain inspection of client model partitions with interactive zero-knowledge proofs (ZKPs) to verifiably enforce correct execution of the defense algorithm on untrusted clients.
- Novel use of ZKPs to produce proofs of computational integrity attesting to the benign nature of locally trained DNN portions, addressing the fundamental challenge of enforcing client-side defenses.
- Empirical evaluation showing attack success rate reduction to <6% across diverse model architectures and attack strategies, with overhead under 10 seconds for 1M-parameter client-side models.
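To give a feel for the interactive-ZKP building block mentioned above, here is a toy Schnorr-style sigma protocol proving knowledge of a discrete logarithm. This is a standard textbook protocol, not ZORRO's proof system, and the group parameters are demo-sized and cryptographically insecure:

```python
import secrets

# Toy group: G = 4 generates the prime-order-Q subgroup of Z_P^*.
# Demo-sized parameters -- insecure, for illustration only.
P, Q, G = 2039, 1019, 4

def schnorr_round(x):
    """One interactive round: prover knows x such that y = G^x mod P."""
    y = pow(G, x, P)             # public statement shared with the verifier
    r = secrets.randbelow(Q)     # prover picks a random nonce...
    t = pow(G, r, P)             # ...and sends the commitment t
    c = secrets.randbelow(Q)     # verifier replies with a random challenge
    s = (r + c * x) % Q          # prover's response; s alone reveals nothing about x
    return pow(G, s, P) == (t * pow(y, c, P)) % P  # verifier's acceptance check
```

ZORRO's statements are far richer (correct execution of the whole client-located defense over a model partition), but the same commit/challenge/response interaction pattern underlies interactive ZKPs of this kind.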
🛡️ Threat Analysis
The paper directly defends against backdoor injection in Split Learning: malicious clients send poisoned intermediate gradients to embed hidden, targeted behavior in the shared DNN. ZORRO uses DCT-based frequency analysis and ZKPs to verify that clients correctly execute the defense, reducing the attack success rate to below 6%. This is a canonical ML10 scenario (backdoor/trojan defense) in a distributed learning setting.