Latest papers

3 papers
defense arXiv Feb 5, 2026

ShapePuri: Shape Guided and Appearance Generalized Adversarial Purification

Zhe Li, Bernhard Kainz · FAU Erlangen-Nürnberg

Defends image classifiers against adversarial attacks using shape-guided purification with SDFs, surpassing 80% robust accuracy on AutoAttack

Input Manipulation Attack vision
PDF
benchmark International Conference on Cy... Nov 12, 2025

Sure! Here's a short and concise title for your paper: "Contamination in Generated Text Detection Benchmarks"

Philipp Dingfelder, Christian Riess · FAU Erlangen-Nürnberg

Benchmark contamination in DetectRL causes shortcut learning, enabling spoofing attacks on AI-generated text detectors

Output Integrity Attack nlp
PDF Code
attack arXiv Aug 11, 2025

Towards Effective MLLM Jailbreaking Through Balanced On-Topicness and OOD-Intensity

Zuoou Li, Weitong Zhang, Jingyuan Wang et al. · Imperial College London · FAU Erlangen-Nürnberg +1 more

Jailbreaks MLLMs by balancing on-topic prompts with OOD visual cues, achieving 67% higher attack success across 13 models

Input Manipulation Attack Prompt Injection multimodal nlp vision
PDF