Alexandra Souly

h-index: 6 · 504 citations · 10 papers (total)

Papers in Database (2)

attack arXiv Oct 8, 2025 · Oct 2025

Alexandra Souly, Javier Rando, Ed Chapman et al. · UK AI Security Institute · Anthropic +3 more

Shows LLM backdoor poisoning needs only ~250 documents regardless of model size, making attacks more practical at scale

Model Poisoning · Data Poisoning Attack · Training Data Poisoning · nlp

32 citations · 2 influential

benchmark arXiv Oct 26, 2025 · Oct 2025

Julia Bazinska, Max Mathys, Francesco Casucci et al. · Lakera AI · ETH Zürich +2 more

Benchmarks 34 backbone LLMs against 194K crowdsourced adversarial attacks using a threat-snapshot framework for AI agent security

Prompt Injection · Excessive Agency · nlp

1 citation