Latest papers

2 papers
attack arXiv Feb 26, 2026

Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search

Xun Huang, Simeng Qin, Xiaoshuang Jia et al. · Nanyang Technological University · BraneMatrix AI +7 more

Bio-inspired optimization generates classical Chinese jailbreak prompts that defeat modern-language safety guardrails in black-box LLMs

Prompt Injection nlp
defense arXiv Oct 10, 2025

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Xiaonan Si, Meilin Zhu, Simeng Qin et al. · Institute of Software · University of Chinese Academy of Sciences +5 more

Defends RAG systems against corpus poisoning via two-stage semantic and conflict-aware filtering before LLM generation

Prompt Injection nlp
2 citations