Reya Vir

h-index: 1 · 6 citations · 4 papers (total)

Papers in Database (1)

benchmark · arXiv · Oct 22, 2025

Subliminal Corruption: Mechanisms, Thresholds, and Interpretability

Reya Vir, Sarvesh Bhatnagar · Columbia University · University of Michigan

Quantifies subliminal data poisoning in LLM fine-tuning and finds a sharp phase transition into alignment failure rather than gradual degradation

Data Poisoning Attack · Training Data Poisoning · nlp
2 citations