NOIR: Privacy-Preserving Generation of Code with Open-Source LLMs
Khoa Nguyen 1, Khiem Ton 1, NhatHai Phan 1, Issa Khalil 2, Khang Tran 1, Cristian Borcea 1, Ruoming Jin 3, Abdallah Khreishah 1, My T. Thai 4
Published on arXiv
2601.16354
Model Inversion Attack
OWASP ML Top 10 — ML03
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Key Finding
NOIR achieves Pass@1 of 76.7/77.4 on MBPP/HumanEval and 38.7 on BigCodeBench (only a 1.77% drop from the unprotected LLM) under strong privacy guarantees against cloud reconstruction attacks.
NOIR
Novel technique introduced
Although it boosts software development productivity, large language model (LLM)-powered code generation introduces intellectual-property and data-security risks: the service provider (cloud) observes a client's prompts and generated code, which can be proprietary in commercial systems. To mitigate this problem, we propose NOIR, the first framework to protect the client's prompts and generated code from the cloud. NOIR keeps an encoder and a decoder at the client; the client encodes its prompts and sends only their embeddings to the cloud, which returns enriched embeddings from the LLM, and these are then decoded locally at the client to generate the code. Since the cloud could use these embeddings to infer the prompt and the generated code, NOIR introduces a new mechanism that achieves indistinguishability (a local differential privacy protection at the token-embedding level) over the vocabulary used in the prompts and code, together with a data-independent, randomized tokenizer on the client side. These components effectively defend against reconstruction and frequency analysis attacks by an honest-but-curious cloud. Extensive analysis and experiments with open-source LLMs show that NOIR significantly outperforms existing baselines on benchmarks, including EvalPlus (MBPP and HumanEval, Pass@1 of 76.7 and 77.4) and BigCodeBench (Pass@1 of 38.7, only a 1.77% drop from the original LLM), while providing strong privacy against these attacks.
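To make the embedding-level protection concrete, below is a minimal sketch of adding calibrated Laplace noise to token embeddings before they leave the client. This is an illustration of the generic local-DP idea only, not NOIR's actual mechanism: the `epsilon` and `sensitivity` parameters, the per-coordinate Laplace noise, and the toy embeddings are all assumptions for the example.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample zero-mean Laplace noise via the inverse-CDF method."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturb_embeddings(embeddings, epsilon=1.0, sensitivity=1.0):
    """Add Laplace noise to every embedding coordinate client-side.

    Illustrative sketch only: NOIR's mechanism targets token-level
    indistinguishability; here we show the plain embedding-level
    local-DP idea with assumed sensitivity 1.
    """
    scale = sensitivity / epsilon
    return [[x + laplace_noise(scale) for x in vec] for vec in embeddings]

# Toy "private" prompt embeddings: 2 tokens, 4 dimensions each.
private = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
noisy = perturb_embeddings(private, epsilon=2.0)
```

Only `noisy` would be transmitted to the cloud; the cloud's middle layers operate on the perturbed embeddings, so reconstructing the exact input tokens becomes an inference problem bounded by the privacy budget.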
Key Contributions
- NOIR: a split-LLM framework in which the client runs the encoder and decoder locally and sends only differentially private embeddings to an untrusted cloud, which executes the middle layers
- Local differential privacy mechanism at the token embedding level achieving indistinguishability against reconstruction and frequency analysis attacks by an honest-but-curious cloud
- Data-independent randomized tokenizer on the client side that further obfuscates vocabulary signals without degrading code generation quality
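The third contribution can be sketched as a client-side random permutation of token ids: because the mapping depends only on client randomness, never on the data, id frequencies observed by the cloud carry no stable vocabulary signal. The helper name, toy vocabulary, and seed below are hypothetical, for illustration only.

```python
import random

def make_randomized_tokenizer(vocab, seed=None):
    """Build a data-independent random token-to-id mapping.

    Hypothetical sketch of the idea: the permutation is drawn from
    client-side randomness alone, so it reveals nothing about the
    prompts or code being tokenized.
    """
    rng = random.Random(seed)
    ids = list(range(len(vocab)))
    rng.shuffle(ids)
    tok2id = dict(zip(vocab, ids))
    id2tok = {i: t for t, i in tok2id.items()}
    return tok2id, id2tok

# Toy vocabulary; only the client holds the mapping.
vocab = ["def", "return", "x", "+", "1"]
tok2id, id2tok = make_randomized_tokenizer(vocab, seed=7)
encoded = [tok2id[t] for t in ["def", "x", "return"]]
decoded = [id2tok[i] for i in encoded]
```

The client inverts the mapping locally when decoding, so generation quality is unaffected while a frequency-analysis adversary at the cloud sees only opaque, client-specific ids.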
🛡️ Threat Analysis
The paper's core threat model is an honest-but-curious cloud adversary performing embedding inversion — reconstructing private client prompts and generated code from token embeddings passed to the cloud LLM. NOIR defends against this via local differential privacy at the token embedding level and a randomized tokenizer, directly targeting the data reconstruction attack vector that ML03 covers.