Latest papers

10 papers
attack · arXiv · Mar 3, 2026

DSBA: Dynamic Stealthy Backdoor Attack with Collaborative Optimization in Self-Supervised Learning

Jiayao Wang, Mohammad Maruf Hasan, Yiping Zhang et al. · Yangzhou University · Chaohu University +1 more

Proposes a stealthy backdoor attack on SSL encoders via collaborative optimization of dynamic trigger generation and feature-space manipulation; a generic sketch of the feature-alignment idea follows below

Model Poisoning · vision
PDF
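
The collaborative optimization itself is only named here, but the general shape of a feature-space backdoor on an SSL encoder is to pull triggered inputs toward a chosen target embedding while keeping the trigger small. A minimal PyTorch sketch under those assumptions (encoder, trigger_net, target_emb, and the loss weighting are illustrative placeholders, not DSBA's actual components):

```python
import torch
import torch.nn.functional as F

def backdoor_losses(encoder, trigger_net, x, target_emb, lam=0.1):
    """Generic feature-space backdoor objective for an SSL encoder.

    Pulls embeddings of triggered images toward a fixed target embedding
    while penalizing visible perturbations (stealth). Illustrative only.
    """
    delta = trigger_net(x)                      # input-conditioned (dynamic) trigger
    x_poison = torch.clamp(x + delta, 0.0, 1.0) # keep a valid image
    z = F.normalize(encoder(x_poison), dim=-1)  # poisoned embedding
    t = F.normalize(target_emb, dim=-1)
    attack = (1.0 - (z * t).sum(dim=-1)).mean() # cosine alignment to target
    stealth = delta.abs().mean()                # keep the trigger small
    return attack + lam * stealth
```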
attack · arXiv · Mar 1, 2026

BadRSSD: Backdoor Attacks on Regularized Self-Supervised Diffusion Models

Jiayao Wang, Yiping Zhang, Mohammad Maruf Hasan et al. · Yangzhou University · Chaohu University +1 more

Backdoor attack on self-supervised diffusion models that hijacks PCA-space representations to steer generation toward attacker-specified targets when the trigger is present; a generic PCA-steering sketch follows below

Model Poisoning · vision · generative
PDF
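
The summary only says the attack steers representations in PCA space; as a rough illustration of what that can mean, the sketch below shifts a batch of representations toward a target along its top principal components. This is a generic construction, not BadRSSD's algorithm:

```python
import torch

def pca_steer(z, target, k=8):
    """Shift representations toward an attacker target inside the top-k
    PCA subspace of the batch. Generic illustration only.

    z: (n, d) batch of representations; target: (d,); k <= min(n, d).
    """
    zc = z - z.mean(dim=0, keepdim=True)   # center the batch
    V = torch.pca_lowrank(zc, q=k)[2]      # (d, k) principal directions
    shift = ((target - z) @ V) @ V.T       # move only along top-k components
    return z + shift
```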
attack · arXiv · Feb 5, 2026

ADCA: Attention-Driven Multi-Party Collusion Attack in Federated Self-Supervised Learning

Jiayao Wang, Yiping Zhang, Jiale Zhang et al. · Yangzhou University · Jiaxing University +2 more

Proposes a federated SSL backdoor attack that uses distributed trigger decomposition and attention-driven collusion among malicious clients to resist dilution by server aggregation; a minimal decomposition sketch follows below

Model Poisoning · Data Poisoning Attack · vision · federated-learning
PDF
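
A hypothetical sketch of the distributed-trigger idea: split one global trigger into per-client additive shares that sum back to the full pattern, so no single client's update carries the whole trigger. The attention-driven collusion is not reproduced; decompose_trigger and its random convex split are assumptions for illustration:

```python
import torch

def decompose_trigger(trigger, n_clients):
    """Split a global trigger into per-client additive shares that sum
    back to the full pattern, so each client's local poisoning looks weak.
    Hypothetical illustration, not ADCA's actual decomposition.
    """
    weights = torch.rand(n_clients, *trigger.shape)
    weights = weights / weights.sum(dim=0, keepdim=True)  # convex split per pixel
    return weights * trigger                              # shares.sum(0) == trigger
```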
attack · arXiv · Feb 2, 2026

HPE: Hallucinated Positive Entanglement for Backdoor Attacks in Federated Self-Supervised Learning

Jiayao Wang, Yang Song, Zhendong Zhao et al. · Yangzhou University · Chinese Academy of Sciences +3 more

Proposes HPE, a backdoor attack on federated self-supervised learning that uses synthetic positive entanglement and selective parameter poisoning to persist through aggregation

Model Poisoning · vision · federated-learning
PDF
benchmark · arXiv · Jan 26, 2026

MalURLBench: A Benchmark Evaluating Agents' Vulnerabilities When Processing Web URLs

Dezhang Kong, Zhuxi Wu, Shiqi Liu et al. · Zhejiang University · National University of Malaysia +4 more

Benchmark showing that LLM web agents fail to detect disguised malicious URLs, with 61K attack instances spanning 10 real-world scenarios; one classic disguise is illustrated below

Prompt Injection · nlp
PDF · Code
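
For a feel of what "disguised" can mean here, one classic disguise class (not necessarily drawn from the benchmark's 61K instances) exploits the userinfo part of a URL: everything before '@' in the authority is credentials, not the host. evil.example is a hypothetical domain:

```python
from urllib.parse import urlsplit

# The URL reads like a Google login page, but everything before '@' is
# userinfo; the actual host is evil.example.
url = "https://accounts.google.com@evil.example/login"
print(urlsplit(url).hostname)  # -> 'evil.example'
```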
defense · Industrial Conference on Data ... · Jan 2, 2026

Explainability-Guided Defense: Attribution-Aware Model Refinement Against Adversarial Data Attacks

Longwei Wang, Mohammad Navid Nayyem, Abdullah Al Rakin et al. · University of South Dakota · Yangzhou University +1 more

Defends against adversarial examples by using LIME attributions to suppress spurious features during adversarial training of image classifiers; a minimal attribution-masking sketch follows below

Input Manipulation Attack · vision
PDF
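
A minimal sketch of attribution-aware filtering with the lime package: compute which superpixels LIME credits for the top predicted class, then damp everything else before a training step. The paper's refinement procedure is more involved; salient_mask and the masking step are illustrative:

```python
from lime import lime_image

def salient_mask(image, classifier_fn, num_features=5):
    """Binary mask of the superpixels LIME credits for the top class.

    image: HxWx3 numpy array; classifier_fn: batch of images -> class probs.
    Sketch of attribution-aware filtering, not the paper's exact method.
    """
    explainer = lime_image.LimeImageExplainer()
    exp = explainer.explain_instance(image, classifier_fn,
                                     top_labels=1, num_samples=1000)
    _, mask = exp.get_image_and_mask(exp.top_labels[0], positive_only=True,
                                     num_features=num_features, hide_rest=False)
    return mask  # 1 on positively attributed superpixels, 0 elsewhere

# e.g., damp unattributed pixels before an adversarial-training step:
# x_refined = image * salient_mask(image, predict_fn)[..., None]
```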
defense · arXiv · Oct 17, 2025

Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness

Longwei Wang, Ifrat Ikhtear Uddin, KC Santosh et al. · University of South Dakota · Yangzhou University +1 more

Embeds rotation- and scale-equivariant CNN layers as an architectural defense against FGSM and PGD attacks, without adversarial training; a toy rotation-equivariant layer is sketched below

Input Manipulation Attack · vision
3 citations · PDF · Code
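
Scale equivariance aside, the rotation half of the idea can be shown with a toy C4 group-convolution layer: apply each filter at four rotations and pool over orientations. A generic construction, assuming nothing about the paper's actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class C4EquivariantConv(nn.Module):
    """Lifting convolution over the 4-fold rotation group (C4).

    Applies each filter at 0/90/180/270 degree rotations and max-pools
    over orientations, so the pooled response commutes with 90-degree
    rotations of the input (up to the same rotation of the feature map).
    """
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)

    def forward(self, x):
        pad = self.weight.shape[-1] // 2
        responses = [F.conv2d(x, torch.rot90(self.weight, r, dims=(2, 3)),
                              padding=pad)
                     for r in range(4)]                    # one map per rotation
        return torch.stack(responses, 0).max(dim=0).values  # orientation pool
```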
attack · arXiv · Aug 11, 2025

IPBA: Imperceptible Perturbation Backdoor Attack in Federated Self-Supervised Learning

Jiayao Wang, Yang Song, Zhendong Zhao et al. · Yangzhou University · Chinese Academy of Sciences +2 more

Imperceptible backdoor attack on federated self-supervised learning that uses the Sliced-Wasserstein distance for stealthy trigger optimization; a standard SW-distance routine is sketched below

Model Poisoning · vision · federated-learning
PDF
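
The Sliced-Wasserstein distance itself is standard and easy to sketch: project both feature sets onto random directions and average the resulting 1-D Wasserstein distances, which reduce to comparing sorted projections. How IPBA plugs it into trigger optimization is not shown here:

```python
import torch

def sliced_wasserstein(x, y, n_proj=128):
    """Monte-Carlo sliced 1-Wasserstein distance between two point clouds.

    x, y: (n, d) tensors with the same n. Each random unit direction gives
    a 1-D problem whose optimal coupling matches sorted projections.
    """
    d = x.shape[1]
    theta = torch.randn(d, n_proj, device=x.device)
    theta = theta / theta.norm(dim=0, keepdim=True)  # unit projection directions
    px = torch.sort(x @ theta, dim=0).values         # (n, n_proj)
    py = torch.sort(y @ theta, dim=0).values
    return (px - py).abs().mean()
```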
defense · arXiv · Aug 5, 2025

BDFirewall: Towards Effective and Expeditiously Black-Box Backdoor Defense in MLaaS

Ye Li, Chengcheng Zhu, Yanchao Zhao et al. · Nanjing University of Aeronautics and Astronautics · Nanjing University +1 more

Defends against backdoor attacks in black-box MLaaS by progressively purging HVT, SVT, and LVT triggers at inference time

Model Poisoning · vision
PDF
defense · arXiv · Jan 10, 2025

Fine-tuning is Not Fine: Mitigating Backdoor Attacks in GNNs with Limited Clean Data

Jiale Zhang, Bosen Rao, Chengcheng Zhu et al. · Yangzhou University · Zhejiang University +1 more

Defends GNNs against backdoor attacks via attention-transfer distillation, using only 3% clean data to drive the attack success rate (ASR) below 5%; the core AT loss is sketched below

Model Poisoning · graph
PDF
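
The core of attention-transfer distillation, in its standard formulation (the paper's GNN-specific variant may differ), is to match normalized activation-energy "attention" between teacher and student:

```python
import torch
import torch.nn.functional as F

def attention_transfer_loss(student_h, teacher_h):
    """Standard attention-transfer (AT) distillation loss.

    Treats per-node squared activation energy as an attention map,
    normalizes it, and matches student to teacher. Shown as a generic
    illustration of the technique named in the summary.

    student_h, teacher_h: (n_nodes, dim) hidden representations.
    """
    def attn(h):
        a = h.pow(2).sum(dim=-1)      # activation energy per node
        return F.normalize(a, dim=0)  # unit-norm attention vector
    return (attn(student_h) - attn(teacher_h)).pow(2).sum()
```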