Sarvesh Bhatnagar

h-index: 3 · 35 citations · 7 papers (total)

Papers in Database (1)

benchmark · arXiv · Oct 22, 2025

Subliminal Corruption: Mechanisms, Thresholds, and Interpretability

Reya Vir, Sarvesh Bhatnagar · Columbia University · University of Michigan

Quantifies subliminal data poisoning in LLM fine-tuning: finds sharp alignment-failure phase transition, not gradual degradation

Data Poisoning Attack · Training Data Poisoning · nlp
2 citations · PDF