defense 2025

DP-FedLoRA: Privacy-Enhanced Federated Fine-Tuning for On-Device Large Language Models

Honghui Xu 1, Shiva Shrestha 1, Wei Chen 2, Zhiyuan Li 2, Zhipeng Cai 3


Published on arXiv (2509.09097)

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

DP-FedLoRA delivers competitive benchmark performance while providing formal (ε,δ)-DP guarantees against membership inference attacks from a semi-honest central server

DP-FedLoRA

Novel technique introduced


As on-device large language model (LLM) systems become increasingly prevalent, federated fine-tuning enables advanced language understanding and generation directly on edge devices. However, it also involves processing sensitive, user-specific data, raising significant privacy concerns within the federated learning framework. To address these challenges, we propose DP-FedLoRA, a privacy-enhanced federated fine-tuning framework that integrates LoRA-based adaptation with differential privacy in a communication-efficient setting. Each client locally clips its LoRA matrices and perturbs them with Gaussian noise to satisfy (ε, δ)-differential privacy. We further provide a theoretical analysis demonstrating that the updates are unbiased and deriving bounds on the variance introduced by the noise, offering practical guidance for privacy-budget calibration. Experimental results across mainstream benchmarks show that DP-FedLoRA delivers competitive performance while offering strong privacy guarantees, paving the way for scalable, privacy-preserving LLM deployment in on-device environments.
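The per-client clip-and-perturb step described in the abstract can be sketched as follows. The function name `dp_perturb_lora`, the Frobenius-norm clipping rule, and the use of the standard Gaussian-mechanism noise scale σ = C·√(2 ln(1.25/δ))/ε (valid for ε ≤ 1) are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def dp_perturb_lora(delta_w, clip_norm, epsilon, delta, rng=None):
    """Clip a LoRA update matrix to a bounded Frobenius norm, then add
    Gaussian noise calibrated for (epsilon, delta)-DP.

    Hypothetical sketch: the paper may clip/perturb the A and B factors
    separately and use a different noise calibration.
    """
    rng = rng or np.random.default_rng()
    # Clip: rescale so the update's Frobenius norm is at most clip_norm.
    norm = np.linalg.norm(delta_w)
    clipped = delta_w * min(1.0, clip_norm / norm)
    # Standard Gaussian-mechanism noise scale for sensitivity clip_norm.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=delta_w.shape)
```

With a very large ε the noise term vanishes and the output is essentially the clipped update, which makes the clipping behavior easy to check in isolation.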


Key Contributions

  • DP-FedLoRA framework combining LoRA-based federated fine-tuning with (ε,δ)-differential privacy via Gaussian noise injection and norm clipping on client LoRA matrices
  • Noise-injected aggregation mechanism that preserves LoRA structure and supports heterogeneous clients with varying adaptation ranks
  • Theoretical analysis bounding the variance introduced by the Gaussian mechanism, providing practical guidance for privacy-budget calibration
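The unbiasedness and variance claims in the third contribution can be checked empirically: zero-mean Gaussian noise leaves the expected update equal to the clipped update, and the per-entry variance equals σ². The constants and shapes below are illustrative, not from the paper:

```python
import numpy as np

# Empirical check (hypothetical sketch) that Gaussian-perturbed updates
# are unbiased with per-entry variance sigma^2, mirroring the style of
# the paper's variance bound used for privacy-budget calibration.
rng = np.random.default_rng(0)
clip_norm, epsilon, delta = 1.0, 2.0, 1e-5
sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

true_update = np.full((8, 4), 0.01)  # stands in for a clipped LoRA update
samples = true_update + rng.normal(0.0, sigma, size=(10000, 8, 4))

emp_mean = samples.mean(axis=0)  # should approach true_update (unbiased)
emp_var = samples.var()          # should approach sigma**2
```

Smaller ε (a tighter privacy budget) raises σ and hence the variance, which is exactly the trade-off the calibration guidance addresses.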

🛡️ Threat Analysis

Membership Inference Attack

The explicit threat model is a semi-honest central server mounting membership inference attacks on the shared LoRA update matrices. DP-FedLoRA applies Gaussian noise and norm clipping specifically to defend against this binary membership-inference threat, and a dedicated threat-model section (Section 4) scopes the analysis to MIA.
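Since the server only ever sees noisy low-rank factors, a natural aggregation sketch is to average the full-size products B·A, which share one shape regardless of each client's rank; this is one plausible reading of the heterogeneous-rank aggregation, with all names and the plain averaging scheme being illustrative assumptions:

```python
import numpy as np

def aggregate(updates):
    """Server-side merge of (already noised) client LoRA factors.

    updates: list of (B, A) pairs with B of shape (d, r_i) and
    A of shape (r_i, k); ranks r_i may differ per client, but each
    product B @ A has the common shape (d, k).
    Hypothetical sketch; the paper's scheme may weight clients differently.
    """
    deltas = [B @ A for B, A in updates]
    return sum(deltas) / len(deltas)

rng = np.random.default_rng(1)
d, k = 16, 8
# Three clients with heterogeneous adaptation ranks 2, 4, 8.
clients = [(rng.normal(size=(d, r)), rng.normal(size=(r, k)))
           for r in (2, 4, 8)]
delta_w = aggregate(clients)  # merged update of shape (d, k)
```

Averaging the products rather than the factors sidesteps the rank mismatch entirely, at the cost of communicating or materializing d×k matrices server-side.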


Details

Domains
nlp, federated-learning
Model Types
llm, federated
Threat Tags
training_time, grey_box
Applications
federated llm fine-tuning, on-device llms, edge ai systems