Latest papers

4 papers
benchmark · arXiv · Jan 27, 2026

Unveiling Perceptual Artifacts: A Fine-Grained Benchmark for Interpretable AI-Generated Image Detection

Yao Xiao, Weiyan Chen, Jiahao Chen et al. · Sun Yat-Sen University · Xi’an Jiaotong University +3 more

Introduces the X-AIGD benchmark with pixel-level perceptual artifact annotations, enabling interpretable evaluation of AI-generated image detectors

Output Integrity Attack · vision
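Since the benchmark's annotations are pixel-level masks, a natural way to score a detector's artifact localization is mask intersection-over-union. This is a generic sketch, not the paper's evaluation protocol; the `mask_iou` helper and the empty-mask convention are assumptions:

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-union between a predicted artifact mask and a
    pixel-level annotation (boolean arrays of the same shape)."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement (convention)
    return float(np.logical_and(pred, gt).sum() / union)
```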
attack · arXiv · Dec 26, 2025

Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models

Mengqi He, Xinyu Tian, Xin Shen et al. · Australian National University · The University of Queensland +1 more

Targets high-entropy VLM decoding positions with adversarial visual perturbations, converting 35-49% of benign outputs to harmful content at a 93-95% attack success rate

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal
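The core idea of entropy-guided target selection can be illustrated in a few lines: compute the Shannon entropy of the model's next-token distribution at each decoding position and attack only the most uncertain ones. A toy sketch, not the paper's implementation; `token_entropy` and the top-k selection are assumptions:

```python
import numpy as np

def token_entropy(logits):
    """Shannon entropy (nats) of the softmax distribution over one logit vector."""
    z = logits - logits.max()          # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def select_high_entropy_positions(logit_seq, k=3):
    """Indices of the k decoding positions with the highest entropy, i.e.
    where the model is least certain and a visual perturbation is
    hypothesized to flip the output most easily."""
    ents = [token_entropy(l) for l in logit_seq]
    return sorted(np.argsort(ents)[-k:].tolist())
```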
defense · arXiv · Sep 21, 2025

MARS: A Malignity-Aware Backdoor Defense in Federated Learning

Wei Wan, Yuxuan Ning, Zhicong Huang et al. · City University of Macau · Australian National University +4 more

Defends federated learning against backdoor attacks using neuron-level backdoor energy and Wasserstein clustering to detect malicious model updates

Model Poisoning · federated-learning · vision
5 citations
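The detection idea can be sketched end to end: derive a per-neuron statistic from each client's update, then separate clients whose statistic distribution is far, in 1-Wasserstein distance, from the aggregate profile. This is a toy illustration only; using absolute update magnitude as a stand-in for MARS's "backdoor energy" and the mean-plus-2-sigma split are assumptions, not the paper's method:

```python
import numpy as np

def wasserstein_1d(a, b):
    """W1 distance between two equal-size 1-D empirical distributions:
    mean absolute difference of the sorted samples."""
    return float(np.abs(np.sort(a) - np.sort(b)).mean())

def flag_suspicious_clients(updates):
    """Toy sketch: per-neuron |update| magnitude stands in for a neuron-level
    'backdoor energy'; clients whose energy distribution sits far from the
    coordinate-wise median profile are flagged as potentially malicious."""
    energies = [np.abs(np.asarray(u)) for u in updates]
    median_profile = np.median(np.stack(energies), axis=0)
    dists = np.array([wasserstein_1d(e, median_profile) for e in energies])
    threshold = dists.mean() + 2 * dists.std()   # crude outlier split (assumption)
    return [i for i, d in enumerate(dists) if d > threshold]
```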
attack · arXiv · Aug 3, 2025

Are All Prompt Components Value-Neutral? Understanding the Heterogeneous Adversarial Robustness of Dissected Prompt in Large Language Models

Yujia Zheng, Tianhao Li, Haotian Huang et al. · Duke University · North China University of Technology +7 more

Attacks LLMs via component-wise text perturbations, revealing heterogeneous adversarial robustness across dissected prompt structures

Prompt Injection · nlp
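Component-wise perturbation can be sketched as: dissect the prompt into named parts, perturb exactly one part, and reassemble, so robustness can be measured per component. The component names (`instruction`, `context`, `question`) and the character-swap perturbation are illustrative assumptions, not the paper's taxonomy:

```python
import random

def perturb_component(components, target, rng=None, swap_rate=0.1):
    """Apply a character-swap perturbation to one prompt component only,
    leaving the other components intact."""
    rng = rng or random.Random(0)
    out = dict(components)
    chars = list(out[target])
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and rng.random() < swap_rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    out[target] = "".join(chars)
    return out

def assemble(components):
    """Reassemble the dissected prompt for querying the target LLM."""
    return "\n".join(components[k] for k in ("instruction", "context", "question"))
```

Running the same attack query once per component, and comparing output changes, gives a per-component robustness profile.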