Abhishek Mishra

benchmark arXiv Oct 9, 2025

The Model's Language Matters: A Comparative Privacy Analysis of LLMs

Abhishek K. Mishra, Antoine Boutet, Lucas Magnana · INRIA · INSA Lyon +1 more

Benchmarks training data extraction, memorization, and membership inference attacks on LLMs across four languages, finding Italian most vulnerable due to linguistic redundancy

Model Inversion Attack · Membership Inference Attack · Sensitive Information Disclosure · nlp

benchmark arXiv Jan 30, 2026

Assessing Domain-Level Susceptibility to Emergent Misalignment from Narrow Finetuning

Abhishek Mishra, Mugilan Arulvanan, Reshma Ashok et al. · University of Massachusetts Amherst

Benchmarks domain-level LLM misalignment susceptibility from insecure fine-tuning and backdoor triggers, ranking 11 domains from 0% to 87.67% vulnerability

Transfer Learning Attack · Model Poisoning · nlp
