Bernhard Schölkopf

h-index: 7 · 368 citations · 19 papers (total)

Papers in Database (2)

defense · arXiv · Jan 29, 2026

LoRA and Privacy: When Random Projections Help (and When They Don't)

Yaxi Hu, Johanna Düngler, Bernhard Schölkopf et al. · Max Planck Institute for Intelligent Systems · University of Copenhagen

Proves LoRA lacks inherent privacy via near-perfect MIA, then derives tighter DP bounds for noisy low-rank fine-tuning

Membership Inference Attack · nlp

benchmark · arXiv · Nov 28, 2025

Are LLMs Good Safety Agents or a Propaganda Engine?

Neemesh Yadav, Francesco Ortu, Jiarui Liu et al. · Southern Methodist University · University of Trieste +6 more

Benchmarks LLM refusal behaviors using prompt injection attacks to distinguish genuine safety guardrails from political censorship

Prompt Injection · nlp