Latest papers

2 papers
defense arXiv Mar 22, 2026

Emergent Formal Verification: How an Autonomous AI Ecosystem Independently Discovered SMT-Based Safety Across Six Domains

Octavian Untila · Aisophical SRL

Autonomous AI system independently discovers SMT-based formal verification for AI safety across six domains with 100% accuracy

Output Integrity Attack · Insecure Plugin Design · Excessive Agency · Vulnerability Discovery · Patch & Remediation · nlp · multimodal
defense arXiv Jan 27, 2026

RvB: Automating AI System Hardening via Iterative Red-Blue Games

Lige Huang, Zicheng Liu, Jie Zhang et al. · Shanghai Artificial Intelligence Laboratory · Institute of Information Engineering +1 more

Automates the hardening of LLM jailbreak guardrails through an iterative red-blue adversarial game, without updating model parameters

Prompt Injection · Red-Team Agents · Patch & Remediation · Blue-Team Agents · nlp