A. Feder Cooper

h-index: 5 · 122 citations · 12 papers (total)

Papers in Database (2)

attack · arXiv · Jan 6, 2026

Extracting books from production language models

Ahmed Ahmed, A. Feder Cooper, Sanmi Koyejo et al. · Stanford University · Yale University

Extracts copyrighted books near-verbatim from Claude, GPT-4.1, Gemini, and Grok using Best-of-N jailbreaks and iterative continuation prompts

Model Inversion Attack · Sensitive Information Disclosure · Prompt Injection · nlp
5 citations · PDF

benchmark · arXiv · Jan 26, 2026

Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming

Alexandra Chouldechova, A. Feder Cooper, Solon Barocas et al. · Microsoft Research · Microsoft

Critiques LLM jailbreak ASR comparisons as methodologically invalid using social science measurement theory and inferential statistics

Prompt Injection · nlp
1 citation · PDF