Yuwei Han

h-index: 3 · 19 citations · 9 papers (total)

Papers in Database (1)

defense · arXiv · Oct 2, 2025

AdvEvo-MARL: Shaping Internalized Safety through Adversarial Co-Evolution in Multi-Agent Reinforcement Learning

Zhenyu Pan, Yiting Zhang, Zhuo Liu et al. · Northwestern University · University of Illinois at Chicago +2 more

Adversarial co-evolution MARL framework that trains LLM agents to resist jailbreaks and prompt injection without external guard modules

Prompt Injection · Excessive Agency · nlp · reinforcement-learning
1 citation