defense 2026

Protecting Private Code in IDE Autocomplete using Differential Privacy

Evgeny Grigorenko 1, David Stanojević 1, David Ilić 1, Egor Bogomolov 1,2, Kostadin Cvejoski 1

0 citations · 44 references · arXiv


Published on arXiv (2601.22935)

Membership Inference Attack — OWASP ML Top 10: ML04

Sensitive Information Disclosure — OWASP LLM Top 10: LLM06

Key Finding

DP fine-tuning reduces membership inference attack AUC from 0.901 to 0.606 while maintaining utility comparable to a non-private model trained on 100x more data.

Novel technique introduced: DP-SGD fine-tuning (Mellum)

Modern Integrated Development Environments (IDEs) increasingly leverage Large Language Models (LLMs) to provide advanced features such as code autocomplete. While powerful, training these models on user-written code introduces significant privacy risks, making the models themselves a new type of data vulnerability. Malicious actors can exploit this by launching attacks to reconstruct sensitive training data or to infer whether a specific code snippet was used for training. This paper investigates Differential Privacy (DP) as a robust defense mechanism for training an LLM for Kotlin code completion. We fine-tune a Mellum model with DP and conduct a comprehensive evaluation of its privacy and utility. Our results demonstrate that DP provides a strong defense against Membership Inference Attacks (MIAs), reducing the attack's success rate to near-random (AUC from 0.901 to 0.606). Furthermore, we show that this privacy guarantee comes at minimal cost to model performance: the DP-trained model achieves utility scores comparable to its non-private counterpart even when trained on 100x less data. Our findings suggest that DP is a practical and effective solution for building private and trustworthy AI-powered IDE features.
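The defense rests on DP-SGD, whose core per-step mechanism is per-example gradient clipping followed by calibrated Gaussian noise; a privacy accountant then converts the clip norm and noise multiplier into an (ε, δ) guarantee. A minimal NumPy sketch of one such update — the function name, hyperparameter values, and plain-SGD setting are illustrative, not taken from the paper:

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.1, rng=None):
    """One DP-SGD update on a flat parameter vector.

    per_example_grads: array of shape (batch, dim), one gradient row per
    training example. Clipping bounds each example's influence; the Gaussian
    noise (scaled to noise_multiplier * clip_norm) is what a privacy
    accountant turns into a formal (epsilon, delta) guarantee.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    # Scale each row down so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * factors
    # Sum the clipped gradients, add isotropic Gaussian noise, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=clipped.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / clipped.shape[0]
    return params - lr * noisy_mean

# Illustrative call: two examples, three parameters.
updated = dp_sgd_step(np.zeros(3),
                      np.array([[3.0, 4.0, 0.0], [0.1, 0.0, 0.0]]))
```

In practice a library such as Opacus handles the per-example gradient computation and the privacy accounting; the sketch only shows the clipping-plus-noise step that distinguishes DP-SGD from ordinary SGD.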


Key Contributions

  • First code completion LLM (Mellum fine-tune) trained with formal differential privacy guarantees for IDE integration
  • Empirical demonstration that DP reduces MIA success from AUC 0.901 to 0.606 (near random) with minimal utility loss
  • Shows that the DP-trained model achieves comparable utility scores even when trained on 100x less data than its non-private counterpart

🛡️ Threat Analysis

Membership Inference Attack

The paper's primary empirical evaluation targets Membership Inference Attacks: it quantitatively measures MIA success (AUC) before and after DP fine-tuning, showing a drop from 0.901 to 0.606 (near random). This is the core security contribution.
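A common black-box MIA of the kind scored by AUC is the loss-threshold attack: examples on which the model has unusually low loss are predicted to be training members, and the AUC of that score is the reported attack success. A minimal sketch, assuming per-example losses are already available (function name and values are illustrative, not the paper's attack implementation):

```python
import numpy as np

def mia_auc(member_losses, nonmember_losses):
    """AUC of a loss-threshold membership inference attack.

    Lower loss => more member-like, so the attack score is the negated loss.
    AUC is computed via the rank-sum (Mann-Whitney U) statistic: the
    probability that a random member scores higher than a random non-member.
    """
    scores = np.concatenate([-np.asarray(member_losses, dtype=float),
                             -np.asarray(nonmember_losses, dtype=float)])
    labels = np.concatenate([np.ones(len(member_losses)),
                             np.zeros(len(nonmember_losses))])
    # Rank all scores ascending (ties ignored in this sketch).
    ranks = np.empty(len(scores))
    ranks[scores.argsort()] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Clearly separated losses => the attack wins (AUC near 1).
leaky = mia_auc([0.1, 0.2], [0.8, 0.9])
# Overlapping losses => near-random attack (AUC near 0.5),
# which is the regime DP training is meant to force.
private = mia_auc([0.1, 0.9], [0.2, 0.8])
```

An AUC of 0.5 means the attacker's score carries no membership signal; the paper's 0.901 → 0.606 drop moves the attack most of the way toward that baseline.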


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
training_time, black_box
Datasets
Kotlin code repositories
Applications
code completion, ide autocomplete, kotlin code generation