Amazon Nova AI Challenge -- Trusted AI: Advancing secure, AI-assisted software development
Sattvik Sahai, Prasoon Goyal, Michael Johnston, Anna Gottardi, Yao Lu, Lucy Hu, Luke Dai, Shaohua Liu, Samyuth Sagi, Hangjie Shi, Desheng Zhang, Lavina Vaz, Leslie Ball, Maureen Murray, Rahul Gupta, Shankar Ananthakrishna
Published on arXiv
arXiv:2508.10108
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
The adversarial tournament structure drove the development of state-of-the-art multi-turn jailbreaking and safety-alignment techniques across the 10 university teams competing on LLM coding-assistant safety.
AI systems for software development are rapidly gaining prominence, yet significant challenges remain in ensuring their safety. To address this, Amazon launched the Trusted AI track of the Amazon Nova AI Challenge, a global competition among 10 university teams to drive advances in secure AI. In the challenge, five teams focus on developing automated red-teaming bots, while the other five create safe AI assistants. The challenge gives teams a unique platform to evaluate automated red-teaming and safety-alignment methods through head-to-head adversarial tournaments, in which red teams hold multi-turn conversations with the competing AI coding assistants to test their safety alignment. Alongside the tournaments, the challenge provides teams with a feed of high-quality annotated data to fuel iterative improvement. Throughout the challenge, teams developed state-of-the-art techniques, introducing novel approaches in reasoning-based safety alignment, robust model guardrails, multi-turn jailbreaking, and efficient probing of large language models (LLMs). To support these efforts, the Amazon Nova AI Challenge team made substantial scientific and engineering investments, including building a custom baseline coding-specialist model for the challenge from scratch, developing a tournament orchestration service, and creating an evaluation harness. This paper outlines the advances made by the university teams and the Amazon Nova AI Challenge team in addressing the safety challenges of AI for software development, highlighting a collaborative effort to raise the bar for AI safety.
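
The tournament format described in the abstract reduces to a multi-turn conversation loop between an attacker and a defender, with a judge scoring the outcome. The sketch below is a minimal Python illustration of a single red-team-vs-defender round; the `red_team`, `defender`, and `is_unsafe` interfaces are hypothetical placeholders, not the challenge's actual orchestration service or evaluation harness.

```python
# Minimal sketch of one multi-turn adversarial tournament round.
# `red_team`, `defender`, and `is_unsafe` are illustrative stand-ins,
# not components from the Amazon Nova AI Challenge infrastructure.
from dataclasses import dataclass, field
from typing import Callable, Protocol


class ChatModel(Protocol):
    def respond(self, conversation: list[dict]) -> str: ...


@dataclass
class RoundResult:
    transcript: list[dict] = field(default_factory=list)
    violations: int = 0


def run_round(red_team: ChatModel, defender: ChatModel,
              is_unsafe: Callable[[str, str], bool],
              max_turns: int = 5) -> RoundResult:
    """Run one probe: the red team crafts each attack turn from the full
    conversation so far, and a judge flags unsafe defender replies."""
    result = RoundResult()
    for _ in range(max_turns):
        attack = red_team.respond(result.transcript)
        result.transcript.append({"role": "user", "content": attack})
        reply = defender.respond(result.transcript)
        result.transcript.append({"role": "assistant", "content": reply})
        if is_unsafe(attack, reply):  # e.g. human annotators or a judge model
            result.violations += 1
    return result
```

In an actual tournament, rounds like this would be run for every red-team/defender pairing and aggregated into per-team safety and attack-success metrics; the annotated transcripts then feed the iterative improvement loop mentioned above.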
Key Contributions
- Adversarial tournament framework pitting 5 automated red-teaming teams against 5 safety-aligned LLM coding assistant teams in multi-turn head-to-head evaluations
- Custom 8B coding-specialist baseline model, tournament orchestration service, and evaluation harness enabling iterative adversarial improvement via annotated data feeds
- Novel techniques developed by the teams, including reasoning-based safety alignment, robust model guardrails, and multi-turn jailbreaking strategies for AI coding assistants (a toy guardrail wrapper is sketched after this list)
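
As a rough illustration of the guardrail idea in the last contribution, the sketch below wraps a coding assistant with input-side and output-side safety checks. The `assistant` and `classify_risk` names are hypothetical stand-ins for illustration only; the guardrails developed by the teams are model-based and considerably more sophisticated.

```python
# Toy guardrail wrapper around a coding assistant. `assistant` and
# `classify_risk` are hypothetical stand-ins, not components from the paper.
REFUSAL = "Sorry, I can't help with that request."


def guarded_respond(assistant, classify_risk, conversation: list[dict]) -> str:
    """Screen the latest user turn, generate a reply, then screen the reply."""
    user_turn = conversation[-1]["content"]
    if classify_risk(user_turn) == "unsafe":   # input-side guardrail
        return REFUSAL
    reply = assistant.respond(conversation)
    if classify_risk(reply) == "unsafe":       # output-side guardrail
        return REFUSAL
    return reply
```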