Latest papers

2 papers
defense · arXiv · Oct 10, 2025

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Xiaonan Si, Meilin Zhu, Simeng Qin et al. · Institute of Software · University of Chinese Academy of Sciences +5 more

Defends RAG systems against corpus poisoning with two-stage semantic and conflict-aware filtering before LLM generation

Prompt Injection · nlp
2 citations PDF
defense · arXiv · Oct 10, 2025

Provable Watermarking for Data Poisoning Attacks

Yifan Zhu, Lijia Yu, Xiao-Shan Gao · Chinese Academy of Sciences · University of Chinese Academy of Sciences +1 more

Embeds provably detectable watermarks into poisoned training datasets so generators can claim ownership and disclose poisoning to authorized users

Output Integrity Attack · Data Poisoning Attack · vision
1 citation PDF