Latest papers

5 papers
benchmark arXiv Feb 2, 2026

AICD Bench: A Challenging Benchmark for AI-Generated Code Detection

Daniil Orel, Dilshod Azizov, Indraneil Paul et al. · Mohamed bin Zayed University of Artificial Intelligence · TU Darmstadt +1 more

Large-scale benchmark revealing AI-generated code detectors fail severely under distribution shift and adversarial conditions

Output Integrity Attack nlp
PDF Code
attack arXiv Jan 19, 2026

ChartAttack: Testing the Vulnerability of LLMs to Malicious Prompting in Chart Generation

Jesus-German Ortiz-Barajas, Jonathan Tonglet, Vivek Gupta et al. · INSAIT · Sofia University +3 more

Jailbreaks MLLMs via adversarial prompting to auto-generate misleading charts, reducing human and MLLM QA accuracy by ~20 points

Prompt Injection multimodal vision nlp
PDF Code
defense arXiv Nov 26, 2025

Multimodal Robust Prompt Distillation for 3D Point Cloud Models

Xiang Gu, Liming Lu, Xu Zheng et al. · Nanjing University of Science and Technology · The Hong Kong University of Science and Technology (Guangzhou) +3 more

Defends 3D point cloud models against adversarial attacks via multimodal teacher-student prompt distillation with zero inference overhead

Input Manipulation Attack vision multimodal
PDF Code
attack arXiv Oct 28, 2025

SPEAR++: Scaling Gradient Inversion via Sparsely-Used Dictionary Learning

Alexander Bakarsky, Dimitar I. Dimitrov, Maximilian Baader et al. · ETH Zürich · INSAIT +1 more

Scales gradient inversion attacks in federated learning to 10x larger batch sizes using sparse dictionary learning

Model Inversion Attack federated-learning
PDF
defense arXiv Sep 16, 2025

CIARD: Cyclic Iterative Adversarial Robustness Distillation

Liming Lu, Shuchao Pang, Xu Zheng et al. · Nanjing University of Science and Technology · HKUST(GZ) +4 more

Defends lightweight student models against adversarial attacks via cyclic multi-teacher distillation with contrastive alignment and continuous adversarial retraining

Input Manipulation Attack vision
PDF Code