
Public Data Assisted Differentially Private In-Context Learning

Seongho Joo , Hyukhun Koh , Kyomin Jung



Published on arXiv: 2509.10932

Membership Inference Attack (OWASP ML Top 10 — ML04)

Sensitive Information Disclosure (OWASP LLM Top 10 — LLM06)

Key Finding

Public data assistance significantly improves utility of differentially private ICL while maintaining robustness against membership inference attacks.

DP-ICL (Public Data Assisted Differentially Private In-Context Learning)

Novel technique introduced


In-context learning (ICL) in Large Language Models (LLMs) has shown remarkable performance across various tasks without requiring fine-tuning. However, recent studies have highlighted the risk of private data leakage through the prompt in ICL, especially when LLMs are exposed to malicious attacks. While differential privacy (DP) provides strong privacy guarantees, it often significantly reduces ICL utility. To address this challenge, we incorporate task-related public data into the ICL framework while maintaining the DP guarantee. Based on this approach, we propose a private in-context learning algorithm that effectively balances privacy protection and model utility. Through experiments, we demonstrate that our approach significantly improves the utility of private ICL with the assistance of public data. Additionally, we show that our method is robust against membership inference attacks, demonstrating empirical privacy protection.


Key Contributions

  • Private ICL algorithm that incorporates task-related public data to improve utility while maintaining differential privacy guarantees
  • Demonstrates that public data assistance significantly closes the utility gap introduced by DP in ICL settings
  • Empirically validates robustness against membership inference attacks to confirm privacy protection
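This summary does not include the authors' algorithm, but the general shape of public-data-assisted private ICL can be pictured with the standard subsample-and-aggregate pattern. The sketch below is an illustration under assumptions, not the paper's method: `llm_predict`, the subset partitioning, and the report-noisy-max vote are all hypothetical choices made for the example.

```python
import random

def dp_icl_predict(query, private_examples, public_examples, llm_predict,
                   num_subsets=10, epsilon=1.0, labels=("positive", "negative")):
    """Sketch of public-data-assisted private ICL via subsample-and-aggregate.

    Each disjoint subset of private demonstrations, mixed with task-related
    public examples, yields one prediction; the final label is chosen by a
    noisy majority vote (report-noisy-max with Laplace noise on each count).
    """
    random.shuffle(private_examples)
    size = max(1, len(private_examples) // num_subsets)
    votes = {label: 0 for label in labels}
    for i in range(num_subsets):
        subset = private_examples[i * size:(i + 1) * size]
        prompt_examples = public_examples + subset  # public data boosts utility
        votes[llm_predict(prompt_examples, query)] += 1
    # Each private example affects at most one vote, so each count has
    # sensitivity 1; Laplace(scale=2/epsilon) noise gives an epsilon-DP argmax.
    # (Difference of two iid Exponential(epsilon/2) draws is Laplace(2/epsilon).)
    noisy = {label: count
                    + random.expovariate(epsilon / 2)
                    - random.expovariate(epsilon / 2)
             for label, count in votes.items()}
    return max(noisy, key=noisy.get)
```

The key point the paper's finding rests on: mixing public demonstrations into each subset raises per-subset accuracy without touching the privacy accounting, since DP only needs to cover the private examples.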

🛡️ Threat Analysis

Membership Inference Attack

The paper explicitly evaluates its DP-ICL defense against membership inference attacks to demonstrate empirical privacy protection — MIA is the primary adversarial threat model validated in experiments.
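The empirical check can be pictured with the standard confidence-threshold attack; the helper below is a generic sketch of that evaluation (guess "member" when model confidence on a record exceeds a threshold), not the paper's exact protocol, and `mia_advantage` is a name invented here.

```python
def mia_advantage(member_scores, nonmember_scores, threshold):
    """Confidence-threshold membership inference (hypothetical evaluation).

    The attacker guesses 'member' whenever the model's confidence on a
    candidate record is at or above the threshold. Advantage is the true
    positive rate minus the false positive rate; a value near zero means
    the attack does no better than chance, i.e. empirical privacy holds.
    """
    tpr = sum(s >= threshold for s in member_scores) / len(member_scores)
    fpr = sum(s >= threshold for s in nonmember_scores) / len(nonmember_scores)
    return tpr - fpr
```

A well-calibrated DP defense should drive this advantage toward zero even at the attacker's best threshold, which is the kind of robustness the experiments validate.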


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time, black_box
Applications
large language model inference, in-context learning with private data