
Public Data Assisted Differentially Private In-Context Learning

Seongho Joo , Hyukhun Koh , Kyomin Jung



Published on arXiv: 2509.10932

Membership Inference Attack (OWASP ML Top 10 — ML04)

Sensitive Information Disclosure (OWASP LLM Top 10 — LLM06)

Key Finding

Public data assistance significantly improves utility of differentially private ICL while maintaining robustness against membership inference attacks.

DP-ICL (Public Data Assisted Differentially Private In-Context Learning)

Novel technique introduced


In-context learning (ICL) in Large Language Models (LLMs) has shown remarkable performance across various tasks without requiring fine-tuning. However, recent studies have highlighted the risk of private data leakage through the prompt in ICL, especially when LLMs are exposed to malicious attacks. While differential privacy (DP) provides strong privacy guarantees, it often significantly reduces ICL utility. To address this challenge, we incorporate task-related public data into the ICL framework while maintaining the DP guarantee. Based on this approach, we propose a private in-context learning algorithm that effectively balances privacy protection and model utility. Through experiments, we demonstrate that our approach significantly improves the utility of private ICL with the assistance of public data. Additionally, we show that our method is robust against membership inference attacks, demonstrating empirical privacy protection.


Key Contributions

  • Private ICL algorithm that incorporates task-related public data to improve utility while maintaining differential privacy guarantees
  • Demonstrates that public data assistance significantly closes the utility gap introduced by DP in ICL settings
  • Empirically validates robustness against membership inference attacks to confirm privacy protection
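This summary does not include the authors' algorithm, but the general shape of public-data-assisted private ICL can be pictured with the standard subsample-and-aggregate pattern. The sketch below is an illustration under assumptions, not the paper's method: `llm_predict`, the subset partitioning, and the report-noisy-max vote are all hypothetical choices made for the example.

```python
import random

def dp_icl_predict(query, private_examples, public_examples, llm_predict,
                   num_subsets=10, epsilon=1.0, labels=("positive", "negative")):
    """Sketch of public-data-assisted private ICL via subsample-and-aggregate.

    Each disjoint subset of private demonstrations, mixed with task-related
    public examples, yields one prediction; the final label is chosen by a
    noisy majority vote (report-noisy-max with Laplace noise on each count).
    """
    random.shuffle(private_examples)
    size = max(1, len(private_examples) // num_subsets)
    votes = {label: 0 for label in labels}
    for i in range(num_subsets):
        subset = private_examples[i * size:(i + 1) * size]
        prompt_examples = public_examples + subset  # public data boosts utility
        votes[llm_predict(prompt_examples, query)] += 1
    # Each private example affects at most one vote, so each count has
    # sensitivity 1; Laplace(scale=2/epsilon) noise gives an epsilon-DP argmax.
    # (Difference of two iid Exponential(epsilon/2) draws is Laplace(2/epsilon).)
    noisy = {label: count
                    + random.expovariate(epsilon / 2)
                    - random.expovariate(epsilon / 2)
             for label, count in votes.items()}
    return max(noisy, key=noisy.get)
```

The key point the paper's finding rests on: mixing public demonstrations into each subset raises per-subset accuracy without touching the privacy accounting, since DP only needs to cover the private examples.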

🛡️ Threat Analysis

Membership Inference Attack

The paper explicitly evaluates its DP-ICL defense against membership inference attacks to demonstrate empirical privacy protection — MIA is the primary adversarial threat model validated in experiments.
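The empirical check can be pictured with the standard confidence-threshold attack; the helper below is a generic sketch of that evaluation (guess "member" when model confidence on a record exceeds a threshold), not the paper's exact protocol, and `mia_advantage` is a name invented here.

```python
def mia_advantage(member_scores, nonmember_scores, threshold):
    """Confidence-threshold membership inference (hypothetical evaluation).

    The attacker guesses 'member' whenever the model's confidence on a
    candidate record is at or above the threshold. Advantage is the true
    positive rate minus the false positive rate; a value near zero means
    the attack does no better than chance, i.e. empirical privacy holds.
    """
    tpr = sum(s >= threshold for s in member_scores) / len(member_scores)
    fpr = sum(s >= threshold for s in nonmember_scores) / len(nonmember_scores)
    return tpr - fpr
```

A well-calibrated DP defense should drive this advantage toward zero even at the attacker's best threshold, which is the kind of robustness the experiments validate.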


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time, black_box
Applications
large language model inference, in-context learning with private data