Harry Owiredu-Ashley

Papers in Database (1)

benchmark arXiv Mar 10, 2026 · 27d ago

ADVERSA: Measuring Multi-Turn Guardrail Degradation and Judge Reliability in Large Language Models

Harry Owiredu-Ashley

Automated multi-turn red-teaming framework measures LLM guardrail degradation as continuous compliance trajectories, not binary jailbreak events

Prompt Injection nlp