Anthony Hughes

h-index: 2 · 4 citations · 4 papers (total)

Papers in Database (1)

defense · arXiv · Oct 8, 2025

PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit PatcHing

Anthony Hughes, Vasisht Duddu, N. Asokan et al. · University of Sheffield · University of Waterloo

Defends LLMs against PII extraction attacks by identifying and surgically patching memorization circuits, reducing PII leakage recall by 65%

Model Inversion Attack · Sensitive Information Disclosure · nlp