ML Security Papers

Latest papers

2 papers

survey arXiv Mar 8, 2026 · 29d ago

From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents

Xiaolei Zhang, Lu Zhou, Xiaogang Xu et al. · Nanjing University of Aeronautics and Astronautics · Collaborative Innovation Center of Novel Software Technology and Industrialization +5 more

Surveys LLM agent security threats across three autonomy tiers: cognitive manipulation, tool misuse, and multi-agent systemic failures

Prompt Injection Insecure Plugin Design Excessive Agency nlp

PDF

attack arXiv Aug 14, 2025 · Aug 2025

Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts

Chiyu Zhang, Lu Zhou, Xiaogang Xu et al. · Nanjing University of Aeronautics and Astronautics · Collaborative Innovation Center of Novel Software Technology and Industrialization +2 more

Novel black-box jailbreak attack combining adversarial context alignment and fake chain-of-thought to bypass reasoning LLM safety guardrails

Prompt Injection nlp

PDF

Latest papers

From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents

Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts

Filters

Time Period

Paper Type

OWASP ML Top 10

OWASP LLM Top 10

Institution

Venue