Latest papers

1 paper
defense arXiv Feb 26, 2026

CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety

Umid Suleymanov, Rufiz Bayramov, Suad Gafarli et al. · Virginia Tech · ADA University

Retrieval-augmented multi-agent framework enforces LLM safety policies via adversarial debate without fine-tuning, generalizing zero-shot to new governance rules

Prompt Injection nlp