Fatmazohra Rezkellah

h-index: 1 · 2 citations · 1 paper (total)

Papers in Database (1)

defense · arXiv · Oct 3, 2025 · Oct 2025

Machine Unlearning Meets Adversarial Robustness via Constrained Interventions on LLMs

Fatmazohra Rezkellah, Ramzi Dakhmouche · Université Paris-Dauphine · EPFL +1 more

Defends LLMs against jailbreaking and unlearns sensitive content through minimal, constrained weight interventions, with no classifier required.

Prompt Injection · Sensitive Information Disclosure · nlp
2 citations