Solon Barocas

h-index: 37 · 10,624 citations · 92 papers (total)

Papers in Database (1)

benchmark · arXiv · Jan 26, 2026

Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming

Alexandra Chouldechova, A. Feder Cooper, Solon Barocas et al. · Microsoft Research · Microsoft

Uses social science measurement theory and inferential statistics to argue that comparisons of attack success rate (ASR) across LLM jailbreaks are methodologically invalid

Prompt Injection nlp
1 citation