DP-FedLoRA: Privacy-Enhanced Federated Fine-Tuning for On-Device Large Language Models
Honghui Xu 1, Shiva Shrestha 1, Wei Chen 2, Zhiyuan Li 2, Zhipeng Cai 3
Published on arXiv
2509.09097
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
DP-FedLoRA delivers competitive benchmark performance while providing formal (ε,δ)-DP guarantees against membership inference attacks from a semi-honest central server
DP-FedLoRA
Novel technique introduced
As on-device large language model (LLM) systems become increasingly prevalent, federated fine-tuning enables advanced language understanding and generation directly on edge devices; however, it also involves processing sensitive, user-specific data, raising significant privacy concerns within the federated learning framework. To address these challenges, we propose DP-FedLoRA, a privacy-enhanced federated fine-tuning framework that integrates LoRA-based adaptation with differential privacy in a communication-efficient setting. Each client locally clips and perturbs its LoRA matrices using Gaussian noise to satisfy (ε, δ)-differential privacy. We further provide a theoretical analysis demonstrating the unbiased nature of the updates and deriving bounds on the variance introduced by noise, offering practical guidance for privacy-budget calibration. Experimental results across mainstream benchmarks show that DP-FedLoRA delivers competitive performance while offering strong privacy guarantees, paving the way for scalable and privacy-preserving LLM deployment in on-device environments.
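The client-side mechanism described above (clip the LoRA matrices, then add Gaussian noise) can be sketched as follows. This is an illustrative reconstruction, not the paper's exact algorithm: the function name `dp_perturb_lora` and the use of the standard Gaussian-mechanism calibration σ = C·√(2 ln(1.25/δ))/ε are assumptions for the sketch.

```python
import numpy as np

def dp_perturb_lora(A, B, clip_norm, epsilon, delta, rng=None):
    """Clip a client's LoRA update (A, B) to L2 norm `clip_norm`, then add
    Gaussian noise calibrated by the standard Gaussian mechanism,
    sigma = clip_norm * sqrt(2 ln(1.25/delta)) / epsilon, which satisfies
    (epsilon, delta)-DP for epsilon < 1 under the usual analysis."""
    rng = rng or np.random.default_rng()
    # Treat both low-rank factors as one vector for joint norm clipping.
    flat = np.concatenate([A.ravel(), B.ravel()])
    norm = np.linalg.norm(flat)
    scale = min(1.0, clip_norm / (norm + 1e-12))  # shrink only if too large
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy = flat * scale + rng.normal(0.0, sigma, size=flat.shape)
    # Restore the original LoRA shapes before sending to the server.
    return noisy[:A.size].reshape(A.shape), noisy[A.size:].reshape(B.shape)
```

Because the injected noise is zero-mean, the perturbed update remains an unbiased estimate of the clipped update, which is the property the paper's theoretical analysis builds on.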
Key Contributions
- DP-FedLoRA framework combining LoRA-based federated fine-tuning with (ε,δ)-differential privacy via Gaussian noise injection and norm clipping on client LoRA matrices
- Noise-injected aggregation mechanism that preserves LoRA structure and supports heterogeneous clients with varying adaptation ranks
- Theoretical analysis bounding the variance introduced by the Gaussian mechanism, providing practical guidance for privacy-budget calibration
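The variance bound referenced above can be sketched from standard Gaussian-mechanism facts (these are the textbook expressions, not the paper's exact bounds). With L2 clipping norm $C$ and per-coordinate noise scale

```latex
\sigma \;=\; \frac{C\sqrt{2\ln(1.25/\delta)}}{\varepsilon},
```

a $d$-dimensional LoRA update $\Delta$ and its perturbed version $\tilde{\Delta}$ satisfy

```latex
\mathbb{E}[\tilde{\Delta}] = \Delta,
\qquad
\mathbb{E}\bigl\|\tilde{\Delta} - \Delta\bigr\|^2
  \;=\; d\,\sigma^2
  \;=\; \frac{2\,d\,C^{2}\ln(1.25/\delta)}{\varepsilon^{2}},
```

so the noise variance grows with the update dimension and the clipping norm but decays as $1/\varepsilon^{2}$; averaging over $K$ clients further reduces it by a factor of $K$. This is the trade-off that guides privacy-budget calibration.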
🛡️ Threat Analysis
The explicit threat model is a semi-honest central server performing membership inference attacks (MIA) on the shared LoRA update matrices. DP-FedLoRA applies norm clipping and Gaussian noise specifically to defend against this threat, i.e., the binary decision of whether a given record was in a client's training data; the paper's dedicated threat-model section (Section 4) is scoped to MIA.