Sangdoo Yun

h-index: 8 618 citations 11 papers (total)

Papers in Database (1)

benchmark arXiv Jan 21, 2026 · 10w ago

Privacy Collapse: Benign Fine-Tuning Can Break Contextual Privacy in Language Models

Anmol Goel, Cornelius Emde, Sangdoo Yun et al. · Parameter Lab · TU Darmstadt +3 more

Benign fine-tuning silently breaks contextual privacy in LLMs, causing inappropriate data disclosure undetected by standard safety benchmarks

Transfer Learning Attack Sensitive Information Disclosure nlp
PDF