Reconstructing Training Data from Adapter-based Federated Large Language Models
Silong Chen 1, Yuchuan Luo 1, Guilin Deng 1, Yi Liu 2, Min Xu 1, Shaojing Fu 1, Xiaohua Jia 2
Published on arXiv (2601.17533)
Model Inversion Attack
OWASP ML Top 10 — ML03
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Key Finding
UTR achieves near-perfect reconstruction (ROUGE-1/2 scores above 99) on adapter-based FedLLMs, including Qwen2.5-7B, even at large batch sizes where existing gradient inversion attacks fail completely.
UTR (Unordered-word-bag-based Text Reconstruction)
Novel technique introduced
Adapter-based Federated Large Language Models (FedLLMs) are widely adopted to reduce the computational, storage, and communication overhead of full-parameter fine-tuning for web-scale applications while preserving user privacy. By freezing the backbone and training only compact low-rank adapters, these methods appear to limit gradient leakage and thwart existing Gradient Inversion Attacks (GIAs). Contrary to this assumption, we show that low-rank adapters create new, exploitable leakage channels. We propose the Unordered-word-bag-based Text Reconstruction (UTR) attack, a novel GIA tailored to the unique structure of adapter-based FedLLMs. UTR overcomes three core challenges (low-dimensional gradients, frozen backbones, and a combinatorially large reconstruction space) by (i) inferring token presence from attention patterns in frozen layers, (ii) performing sentence-level inversion within the low-rank subspace of adapter gradients, and (iii) enforcing semantic coherence through constrained greedy decoding guided by language priors. Extensive experiments across diverse models (GPT2-Large, BERT, Qwen2.5-7B) and datasets (CoLA, SST-2, Rotten Tomatoes) demonstrate that UTR achieves near-perfect reconstruction accuracy (ROUGE-1/2 > 99), even at large batch sizes where prior GIAs fail completely. Our results reveal a fundamental tension between parameter efficiency and privacy in FedLLMs, challenging the prevailing belief that lightweight adaptation inherently enhances security. Our code and data are available at https://github.com/shwksnshwowk-wq/GIA.
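As background on why adapter-only training changes the leakage picture: in classic full-parameter fine-tuning, the standard GIA token-presence signal is simply the set of nonzero rows in the embedding-table gradient. The toy NumPy sketch below illustrates that classic signal only; it is not UTR's method. With a frozen backbone this gradient is never shared, which is why UTR must instead infer token presence from attention patterns in frozen layers.

```python
import numpy as np

# Toy setup: vocabulary of 10 tokens, embedding dimension 4.
rng = np.random.default_rng(0)
vocab_size, dim = 10, 4
E = rng.normal(size=(vocab_size, dim))   # embedding table (trainable in full FT)

# A "private" training sentence as token ids (note the repeated token 3).
token_ids = [3, 7, 3, 1]

# Backward pass for a toy loss L = sum of looked-up embeddings:
# dL/dE accumulates only in rows of tokens that actually occurred.
grad_E = np.zeros_like(E)
for t in token_ids:
    grad_E[t] += np.ones(dim)

# The honest-but-curious server reads the token bag off the nonzero rows.
recovered_bag = sorted(int(i) for i in np.flatnonzero(np.abs(grad_E).sum(axis=1) > 0))
print(recovered_bag)  # [1, 3, 7]
```

Note that the recovered set is unordered and loses repetition counts in this simple form, which is exactly the "unordered word bag" starting point the UTR name refers to.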
Key Contributions
- UTR attack that infers token presence from frozen-layer attention patterns and performs sentence-level inversion within the low-rank subspace of LoRA adapter gradients
- Constrained greedy decoding guided by language priors to enforce semantic coherence during reconstruction, overcoming the combinatorially large search space
- Near-perfect reconstruction (ROUGE-1/2 > 99) on GPT2-Large, BERT, and Qwen2.5-7B at large batch sizes where all prior GIAs fail, revealing a fundamental tension between parameter efficiency and privacy in FedLLMs
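To make the decoding step concrete, here is a toy sketch of ordering an unordered word bag with a greedy decoder constrained to the bag and scored by a language prior. The `BIGRAM` table and `decode` helper are hypothetical illustrations of the general idea, not the paper's decoder, which uses stronger language priors.

```python
from collections import Counter

# Hypothetical bigram "language prior": score(prev, next). Unlisted pairs score 0.
BIGRAM = {
    ("<s>", "the"): 3.0, ("the", "movie"): 2.5, ("movie", "was"): 2.0,
    ("was", "great"): 1.5, ("the", "was"): 0.1, ("great", "movie"): 0.2,
}

def decode(bag):
    """Greedily order an unordered word bag: at each step take the remaining
    word that the bigram prior scores highest after the current prefix."""
    remaining = Counter(bag)
    prev, out = "<s>", []
    while remaining:
        best = max(remaining, key=lambda w: BIGRAM.get((prev, w), 0.0))
        out.append(best)
        remaining[best] -= 1
        if remaining[best] == 0:
            del remaining[best]
        prev = best
    return " ".join(out)

print(decode(["was", "great", "the", "movie"]))  # the movie was great
```

Constraining the decoder to the recovered bag is what collapses the combinatorially large reconstruction space: only permutations of the bag are reachable, and the prior ranks them.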
🛡️ Threat Analysis
Proposes UTR, a gradient inversion attack that reconstructs private training data from gradients shared by clients in federated learning — the canonical ML03 threat. The adversary (aggregation server) exploits low-rank adapter gradient structure to recover client training text with near-perfect fidelity.
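A minimal NumPy sketch of the gradient structure such a server observes, using the standard LoRA parameterization y = x(W0 + AB) with a toy quadratic loss. This is generic LoRA math under illustrative shapes and loss, not the paper's code: the shared adapter gradients are projections of the full-weight gradient through rank-r maps, so sentence-level inversion must operate inside this low-rank subspace.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_out, r, n = 16, 16, 2, 4      # adapter rank r much smaller than d

W0 = rng.normal(size=(d_in, d_out))   # frozen backbone weight (no gradient shared)
A = rng.normal(size=(d_in, r))        # trainable LoRA down-projection
B = rng.normal(size=(r, d_out))       # trainable LoRA up-projection
x = rng.normal(size=(n, d_in))        # a private client batch

# Forward: y = x (W0 + A B); toy loss L = 0.5 * ||y||^2, so dL/dy = y.
y = x @ (W0 + A @ B)
G = x.T @ y                           # full-weight gradient dL/dW (d_in x d_out)

# What the server actually receives are the adapter gradients:
grad_A = G @ B.T                      # dL/dA  (d_in x r)
grad_B = A.T @ G                      # dL/dB  (r x d_out)

# Both are projections of G through rank-r maps, so their rank is at most r.
print(np.linalg.matrix_rank(grad_A), np.linalg.matrix_rank(grad_B))
```

The server never sees G itself, only its two rank-r projections, which is the "low-dimensional gradients" challenge the abstract lists and the leakage channel UTR exploits.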