Alexandra Souly

h-index: 6 · 504 citations · 10 papers (total)

Papers in Database (2)

attack · arXiv · Oct 8, 2025

Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples

Alexandra Souly, Javier Rando, Ed Chapman et al. · UK AI Security Institute · Anthropic +3 more

Shows LLM backdoor poisoning needs only ~250 documents regardless of model size, making attacks more practical at scale

Model Poisoning · Data Poisoning Attack · Training Data Poisoning · nlp

32 citations · 2 influential · PDF
benchmark · arXiv · Oct 26, 2025

Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents

Julia Bazinska, Max Mathys, Francesco Casucci et al. · Lakera AI · ETH Zürich +2 more

Benchmarks 34 backbone LLMs against 194K crowdsourced adversarial attacks using a threat-snapshot framework for AI agent security

Prompt Injection · Excessive Agency · nlp

1 citation · PDF