Latest papers

3 papers
defense · arXiv · Jan 7, 2026

Shadow Unlearning: A Neuro-Semantic Approach to Fidelity-Preserving Faceless Forgetting in LLMs

Dinesh Srivasthav P, Ashok Urlana, Rahul Mishra et al. · TCS Research · IIIT Hyderabad

Protects PII in LLM unlearning requests by operating on anonymized forget sets, validated against membership inference attacks

Membership Inference Attack · nlp
PDF Code
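The paper validates its unlearning against membership inference attacks. A minimal sketch of that kind of check, using a classic loss-threshold attack (all losses below are synthetic, and the thresholding rule is an assumption, not the paper's actual evaluation protocol):

```python
import numpy as np

def mia_advantage(forget_losses, holdout_losses):
    """Loss-threshold membership inference: guess 'member' when an
    example's loss falls below a threshold. If the attacker cannot
    separate forget-set losses from held-out losses, the forget set
    carries no membership signal (advantage ~ 0)."""
    threshold = np.median(np.concatenate([forget_losses, holdout_losses]))
    tpr = np.mean(forget_losses < threshold)   # forget examples flagged as members
    fpr = np.mean(holdout_losses < threshold)  # held-out examples flagged as members
    return tpr - fpr

rng = np.random.default_rng(0)
# Hypothetical per-example losses: after successful unlearning, the forget
# set should behave like unseen data, i.e. match the hold-out distribution.
unlearned = rng.normal(1.5, 0.3, 2000)
holdout = rng.normal(1.5, 0.3, 2000)
print(f"advantage = {mia_advantage(unlearned, holdout):+.3f}")  # near 0
```

An advantage well above zero would indicate the unlearned model still memorizes the forget set.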
benchmark · arXiv · Aug 9, 2025

Who's the Evil Twin? Differential Auditing for Undesired Behavior

Ishwar Balappanawar, Venkata Hasith Vattikuti, Greta Kintzley et al. · IIIT Hyderabad · University of Texas at Austin +1 more

Adversarial auditing game framework detects backdoored CNNs and misaligned LLMs using model diffing, gradients, and adversarial probing

Model Poisoning · Prompt Injection · vision · nlp
PDF
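One of the auditing signals named above is model diffing. A toy illustration of the idea (the layer names, tampering scale, and scoring rule here are invented for the example, not taken from the paper):

```python
import numpy as np

def weight_diff_scores(ref_layers, suspect_layers):
    """Per-layer relative L2 distance between a trusted reference model
    and a suspect model; a single outlier layer is a cheap signal that
    the suspect was fine-tuned with a localized backdoor."""
    return {name: np.linalg.norm(suspect_layers[name] - ref_layers[name])
                  / (np.linalg.norm(ref_layers[name]) + 1e-12)
            for name in ref_layers}

rng = np.random.default_rng(1)
ref = {f"layer{i}": rng.normal(size=(64, 64)) for i in range(4)}
# Suspect = reference plus benign fine-tuning noise everywhere...
suspect = {k: v + rng.normal(scale=0.01, size=v.shape) for k, v in ref.items()}
# ...plus a large planted perturbation in one layer.
suspect["layer2"] = suspect["layer2"] + rng.normal(scale=0.5, size=(64, 64))

scores = weight_diff_scores(ref, suspect)
print(max(scores, key=scores.get))  # → layer2
```

Real audits in this setting combine such weight-space diffs with gradient analysis and adversarial probing, since a careful attacker can spread a backdoor across many layers.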
defense · arXiv · Jan 12, 2025

KeTS: Kernel-based Trust Segmentation against Model Poisoning Attacks

Ankit Gangwal, Mauro Conti, Tommaso Pauselli · IIIT Hyderabad · University of Padua +1 more

Defends federated learning against Byzantine model poisoning by segmenting malicious clients via KDE on historical update evolution

Data Poisoning Attack · federated-learning · vision · tabular
PDF
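The core mechanism named in the summary, segmenting clients via kernel density estimation over their update history, can be sketched as follows. This is a minimal illustration under assumed specifics (per-client scalar trust scores, a fixed Gaussian-kernel bandwidth, and a cut at the deepest density valley); the paper's actual feature extraction and segmentation rule will differ:

```python
import numpy as np

def kde_density(samples, grid, bandwidth):
    # Gaussian kernel density estimate evaluated on `grid`.
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * diffs**2).sum(axis=1) / (
        len(samples) * bandwidth * np.sqrt(2 * np.pi))

def segment_clients(trust_scores, bandwidth=0.05):
    """Fit a KDE over per-client trust scores and cut at the deepest
    interior valley between modes; clients above the cut are trusted."""
    grid = np.linspace(trust_scores.min(), trust_scores.max(), 512)
    density = kde_density(trust_scores, grid, bandwidth)
    # Interior local minima of the estimated density.
    minima = np.where((density[1:-1] < density[:-2])
                      & (density[1:-1] < density[2:]))[0] + 1
    cut = grid[minima[np.argmin(density[minima])]] if len(minima) else grid.mean()
    return trust_scores >= cut

rng = np.random.default_rng(2)
# Hypothetical scores derived from each client's update history: benign
# clients evolve smoothly (high score), poisoned clients erratically (low).
scores = np.concatenate([rng.normal(0.9, 0.05, 40), rng.normal(0.2, 0.05, 10)])
trusted = segment_clients(scores)
print(trusted.sum())  # 40 benign clients kept, 10 flagged
```

The appeal of a density-based cut over a fixed threshold is that it adapts per round: the valley between benign and malicious modes moves with the score distribution instead of being hand-tuned.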