Latest papers

2 papers
defense · arXiv · Mar 24, 2026

ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning

Xiangyu Yin, Yi Qi, Chih-hong Cheng · Chalmers University of Technology · Carl von Ossietzky University of Oldenburg +1 more

Reranking defense for RAG that detects corpus-poisoned passages using gradient-based instability signals under perturbations

Data Poisoning Attack · Prompt Injection · nlp
defense · arXiv · Sep 19, 2025

Randomized Smoothing Meets Vision-Language Models

Emmanouil Seferis, Changshun Wu, Stefanos Kollias et al. · National Technical University of Athens · Université Grenoble Alpes +2 more

Extends Randomized Smoothing certification to VLMs via oracle classification, defending against adversarial image perturbations and jailbreak-style attacks

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal