ML Security Papers

Latest papers

5 papers

attack arXiv Apr 14, 2026 · Apr 2026

Junyu Ren, Xingjian Pan, Wensheng Gan et al. · Jinan University · University of Illinois Chicago

Automated prompt injection framework combining semantic and character-level mutations to jailbreak DeepSeek LLM safety guardrails

Prompt Injection nlp

defense arXiv Jan 19, 2026 · Jan 2026

Ali Shafiee Sarvestani, Jason Schmidt, Arman Roohi · University of Illinois Chicago

Defends traffic sign classifiers against FGSM/PGD attacks by enforcing symbolic logic constraints on shape and color attributes during training

Input Manipulation Attack vision

defense arXiv Nov 18, 2025 · Nov 2025

Sungik Choi, Hankook Lee, Moontae Lee · LG AI Research · University of Illinois Chicago +1 more

Training-free AI-generated image detector using Haar wavelet sensitivity and self-supervised model cropping robustness

Output Integrity Attack vision generative

1 citation PDF

attack arXiv Oct 16, 2025 · Oct 2025

Yingguang Yang, Xianghua Zeng, Qi Wu et al. · University of Science and Technology of China · Beihang University +3 more

MARL-based black-box evasion attack on GNN social bot detectors using diffusion-generated adversarial accounts and graph manipulation

Input Manipulation Attack graph reinforcement-learning generative

defense arXiv Oct 13, 2025 · Oct 2025

Wei-Chieh Huang, Henry Peng Zou, Yaozu Wu et al. · University of Illinois Chicago · University of Tokyo +2 more

Multi-stage guardrail framework defending LLM deep-research agents from harmful web content injection across planning and synthesis stages

Prompt Injection Excessive Agency nlp

2 citations PDF