LoRA and Privacy: When Random Projections Help (and When They Don't)
Yaxi Hu¹, Johanna Düngler², Bernhard Schölkopf¹, Amartya Sanyal²
Published on arXiv: 2601.21719
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
LoRA-style updates are not inherently private: in the noise-free setting, a membership inference attack achieves AUC > 0.99. With added noise, however, the low-rank variant achieves stronger privacy guarantees than full fine-tuning at the same noise level, so tighter DP accounting permits lower noise and improved accuracy.
Wishart projection mechanism
Novel technique introduced
We introduce the (Wishart) projection mechanism, a randomized map of the form $S \mapsto M f(S)$ with $M \sim W_d(1/r I_d, r)$, and study its differential privacy properties. For vector-valued queries $f$, we prove non-asymptotic DP guarantees without any additive noise, showing that Wishart randomness alone can suffice. For matrix-valued queries, however, we establish a sharp negative result: in the noise-free setting, the mechanism is not DP, and we demonstrate its vulnerability by implementing a near-perfect membership inference attack (AUC $> 0.99$). We then analyze a noisy variant and prove privacy amplification from randomness and low-rank projection, in both large- and small-rank regimes, yielding stronger privacy guarantees than additive noise alone. Finally, we show that LoRA-style updates are an instance of the matrix-valued mechanism, implying that LoRA is not inherently private despite its built-in randomness, but that low-rank fine-tuning can be more private than full fine-tuning at the same noise level. Preliminary experiments suggest that tighter accounting enables lower noise and improved accuracy in practice.
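A minimal sketch of the vector-valued mechanism described above, assuming the standard construction of a Wishart sample $M \sim W_d(\frac{1}{r} I_d, r)$ as $\frac{1}{r} G G^\top$ with $G$ a $d \times r$ matrix of i.i.d. standard normals (the function name and toy query below are illustrative, not from the paper):

```python
import numpy as np

def wishart_projection(f_S, rank, rng):
    """Release M @ f(S) with M ~ W_d((1/r) I_d, r), sampled as (1/r) G G^T.

    Since E[G @ G.T] = r * I_d, we have E[M] = I_d, so the release is an
    unbiased randomization of f(S) with no additive noise.
    """
    d = f_S.shape[0]
    G = rng.standard_normal((d, rank))  # d x r i.i.d. standard normals
    M = (G @ G.T) / rank                # Wishart sample, scale (1/r) I_d
    return M @ f_S                      # noise-free randomized release

rng = np.random.default_rng(0)
query = np.array([1.0, -2.0, 0.5])     # hypothetical vector-valued query f(S)
release = wishart_projection(query, rank=8, rng=rng)
```

Averaging many independent releases recovers `f(S)`, which illustrates why the privacy here must come from the multiplicative randomness of a single draw rather than from any bias or added noise.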
Key Contributions
- Introduces the Wishart projection mechanism and proves non-asymptotic DP guarantees for vector-valued queries without any additive noise
- Establishes a sharp negative result that the matrix-valued (LoRA-style) mechanism is not differentially private, demonstrated concretely via a membership inference attack with AUC > 0.99
- Proves privacy amplification for the noisy variant, showing that low-rank fine-tuning can achieve strictly stronger DP guarantees than full fine-tuning at the same noise level
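To make the LoRA connection concrete, here is a hedged sketch of why one LoRA-style step is an instance of the matrix-valued mechanism. Assuming the common initialization (adapter $W = W_0 + BA$ with $A$ frozen at a Gaussian init and $B = 0$), the chain rule turns a full-matrix gradient $G$ into the update $G (A^\top A)$, i.e. $G$ right-multiplied by a Wishart-type matrix; the dimensions and scaling below are illustrative:

```python
import numpy as np

# With W = W0 + B @ A, A fixed at random init and B = 0, the first
# gradient step on B maps a full-matrix gradient G to
#     dW = (G @ A.T) @ A = G @ (A.T @ A),
# i.e. G right-multiplied by the Wishart-type matrix A.T @ A -- the
# matrix-valued mechanism the paper proves is not DP without noise.

rng = np.random.default_rng(1)
d_out, d_in, r = 6, 6, 3
G = rng.standard_normal((d_out, d_in))            # full fine-tuning gradient
A = rng.standard_normal((r, d_in)) / np.sqrt(r)   # frozen random down-projection
grad_B = G @ A.T                                  # gradient w.r.t. B (chain rule)
delta_W = grad_B @ A                              # effective low-rank update
```

The algebraic identity `delta_W == G @ (A.T @ A)` is exactly the matrix-valued projection form, which is why LoRA's built-in randomness alone does not confer privacy.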
🛡️ Threat Analysis
The core negative result is demonstrated by an explicit membership inference attack (AUC > 0.99) against the noise-free matrix projection mechanism, directly showing that LoRA-style gradient updates are vulnerable to MIA. The DP bounds throughout the paper characterize the corresponding resistance to membership inference advantage.