Protecting Private Code in IDE Autocomplete using Differential Privacy
Evgeny Grigorenko¹, David Stanojević¹, David Ilić¹, Egor Bogomolov¹˒², Kostadin Cvejoski¹
Published on arXiv
2601.22935
Membership Inference Attack
OWASP ML Top 10 — ML04
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Key Finding
DP fine-tuning reduces membership inference attack AUC from 0.901 to 0.606 while maintaining utility comparable to a non-private model trained on 100x more data.
DP-SGD fine-tuning (Mellum)
Novel technique introduced
Modern Integrated Development Environments (IDEs) increasingly leverage Large Language Models (LLMs) to provide advanced features like code autocomplete. While powerful, training these models on user-written code introduces significant privacy risks, making the models themselves a new type of data vulnerability. Malicious actors can exploit this by launching attacks to reconstruct sensitive training data or infer whether a specific code snippet was used for training. This paper investigates the use of Differential Privacy (DP) as a robust defense mechanism for training an LLM for Kotlin code completion. We fine-tune a Mellum model using DP and conduct a comprehensive evaluation of its privacy and utility. Our results demonstrate that DP provides a strong defense against Membership Inference Attacks (MIAs), reducing the attack's success rate close to a random guess (AUC from 0.901 to 0.606). Furthermore, we show that this privacy guarantee comes at a minimal cost to model performance, with the DP-trained model achieving utility scores comparable to its non-private counterpart, even when trained on 100x less data. Our findings suggest that DP is a practical and effective solution for building private and trustworthy AI-powered IDE features.
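The core mechanism behind DP fine-tuning is DP-SGD: each example's gradient is clipped to a fixed L2 norm before averaging, and calibrated Gaussian noise is added to the aggregate so no single training example (e.g. one user's code snippet) can dominate an update. The sketch below illustrates one such aggregation step with plain NumPy; the function name, parameter values, and use of flat gradient vectors are illustrative assumptions, not the paper's actual training setup.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD aggregation step (Abadi et al.-style sketch):
    clip each per-example gradient to L2 norm `clip_norm`, average,
    then add Gaussian noise with std `noise_multiplier * clip_norm / batch_size`.
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down only gradients whose norm exceeds the clip bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(
        0.0, noise_multiplier * clip_norm / len(per_example_grads), size=avg.shape
    )
    return avg + noise
```

The clipping bound caps any one example's influence on the update; the noise multiplier (together with batch size and number of steps) determines the formal (ε, δ) privacy budget via a privacy accountant.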
Key Contributions
- First code completion LLM (Mellum fine-tune) trained with formal differential privacy guarantees for IDE integration
- Empirical demonstration that DP reduces MIA success from AUC 0.901 to 0.606 (near random) with minimal utility loss
- Evidence that the DP-trained model achieves utility scores comparable to its non-private counterpart, even when trained on 100x less data
🛡️ Threat Analysis
The paper's primary empirical evaluation is of the defense against Membership Inference Attacks: it quantitatively measures MIA success rate (AUC) before and after DP fine-tuning, showing a drop from 0.901 to 0.606 (close to the 0.5 of random guessing). This is the core security contribution.