ML Security Papers

Latest papers

12 papers

attack arXiv Apr 1, 2026 · 7d ago

G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs

Ravi Ranjan, Utkarsh Grover, Xiaomin Lin et al. · Florida International University · University of South Florida

White-box membership inference attack using gradient-induced feature drift, outperforming confidence-based and reference-based MIAs on LLMs

Membership Inference Attack nlp

PDF

attack arXiv Mar 25, 2026 · 14d ago

Generative Adversarial Perturbations with Cross-paradigm Transferability on Localized Crowd Counting

Alabi Mehzabin Anisha, Guangjing Wang, Sriram Chellappan · University of South Florida

Cross-paradigm transferable adversarial attack on crowd counting models achieving 7X error increase across both density-map and point-regression architectures

Input Manipulation Attack vision

PDF Code

defense arXiv Dec 12, 2025 · Dec 2025

DFedReweighting: A Unified Framework for Objective-Oriented Reweighting in Decentralized Federated Learning

Kaichuang Zhang, Wei Yin, Jinghao Yang et al. · University of South Florida · The University of Texas Rio Grande Valley +1 more

Defends decentralized federated learning against Byzantine attacks via objective-oriented reweighting aggregation with convergence guarantees

Data Poisoning Attack federated-learning

PDF

defense arXiv Dec 3, 2025 · Dec 2025

Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness

Long Dang, Thushari Hapuarachchi, Kaiqi Xiong et al. · University of South Florida

Defends against adversarial examples via advanced adversarial training across ten activation functions and non-IID federated learning settings

Input Manipulation Attack visionfederated-learning

PDF

survey arXiv Oct 27, 2025 · Oct 2025

Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

Anshuman Chhabra, Shrestha Datta, Shahriar Kabir Nahin et al. · University of South Florida

Surveys threats, defenses, and open challenges for agentic LLM systems acting autonomously across digital and physical environments

Prompt Injection Insecure Plugin Design Excessive Agency nlpmultimodal

8 citations 3 influentialPDF

attack arXiv Oct 4, 2025 · Oct 2025

Less Diverse, Less Safe: The Indirect But Pervasive Risk of Test-Time Scaling in Large Language Models

Shahriar Kabir Nahin, Hadi Askari, Muhao Chen et al. · University of South Florida · University of California

RefDiv exploits candidate diversity reduction in test-time scaling to bypass LLM safety guardrails, surpassing direct adversarial prompts

Prompt Injection nlp

1 citations PDF

survey ICDMW Sep 24, 2025 · Sep 2025

RAG Security and Privacy: Formalizing the Threat Model and Attack Surface

Atousa Arzanipour, Rouzbeh Behnia, Reza Ebrahimi et al. · University of South Florida

Surveys and formalizes the RAG security threat landscape: document membership inference, data poisoning, and indirect prompt injection

Membership Inference Attack Data Poisoning Attack Prompt Injection nlp

2 citations PDF

defense arXiv Sep 24, 2025 · Sep 2025

Advancing Practical Homomorphic Encryption for Federated Learning: Theoretical Guarantees and Efficiency Optimizations

Ren-Yi Huang, Dumindu Samaraweera, Prashant Shekhar et al. · University of South Florida · Embry-Riddle Aeronautical University

Theoretical BCRLB framework analyzes selective homomorphic encryption as a defense against gradient reconstruction attacks in federated learning

Model Inversion Attack federated-learning

1 citations PDF

benchmark arXiv Sep 12, 2025 · Sep 2025

When Your Reviewer is an LLM: Biases, Divergence, and Prompt Injection Risks in Peer Review

Changjia Zhu, Junjie Xiong, Renkai Ma et al. · University of South Florida · Missouri University of Science and Technology +2 more

Evaluates LLM peer reviewer bias and susceptibility to indirect prompt injection via covert instructions embedded in academic paper PDFs

Prompt Injection nlp

PDF

defense arXiv Aug 25, 2025 · Aug 2025

ClearMask: Noise-Free and Naturalness-Preserving Protection Against Voice Deepfake Attacks

Yuanda Wang, Bocheng Chen, Hanqing Guo et al. · Michigan State University · University of Hawaii +1 more

Defends against voice deepfakes with noise-free adversarial audio perturbations that disrupt voice synthesis encoders while preserving speech naturalness

Input Manipulation Attack Output Integrity Attack audio

PDF

defense arXiv Aug 8, 2025 · Aug 2025

Learning to Forget with Information Divergence Reweighted Objectives for Noisy Labels

Jeremiah Birrell, Reza Ebrahimi · Texas State University · University of South Florida

Defends against label-flipping data poisoning via information-divergence reweighted loss that adaptively down-weights mislabeled training samples

Data Poisoning Attack vision

PDF Code

survey arXiv Aug 7, 2025 · Aug 2025

Guardians and Offenders: A Survey on Harmful Content Generation and Safety Mitigation of LLM

Chi Zhang, Changjia Zhu, Junjie Xiong et al. · University of South Florida · Missouri University of Science and Technology

Surveys LLM jailbreaking attacks, unintentional toxicity, multimodal exploits, and safety mitigations including RLHF and alignment

Prompt Injection nlpmultimodal

PDF

Latest papers

G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs

Generative Adversarial Perturbations with Cross-paradigm Transferability on Localized Crowd Counting

DFedReweighting: A Unified Framework for Objective-Oriented Reweighting in Decentralized Federated Learning

Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness

Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

Less Diverse, Less Safe: The Indirect But Pervasive Risk of Test-Time Scaling in Large Language Models

RAG Security and Privacy: Formalizing the Threat Model and Attack Surface

Advancing Practical Homomorphic Encryption for Federated Learning: Theoretical Guarantees and Efficiency Optimizations

When Your Reviewer is an LLM: Biases, Divergence, and Prompt Injection Risks in Peer Review

ClearMask: Noise-Free and Naturalness-Preserving Protection Against Voice Deepfake Attacks

Learning to Forget with Information Divergence Reweighted Objectives for Noisy Labels

Guardians and Offenders: A Survey on Harmful Content Generation and Safety Mitigation of LLM

Filters

Time Period

Paper Type

OWASP ML Top 10

OWASP LLM Top 10

Institution

Venue