ML Security Papers

Latest papers

5 papers

survey arXiv Mar 24, 2026 · 13d ago

SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy

Ali Dehghantanha, Sajad Homayoun · University of Guelph · Aalborg University

Surveys the attack surface of agentic LLM systems, covering prompt injection, RAG poisoning, tool exploits, and multi-agent threats, along with a defense taxonomy

Prompt Injection Insecure Plugin Design Excessive Agency nlp multimodal

PDF

defense IEEE Annual Congress on Artifi... Nov 12, 2025 · Nov 2025

Privacy-Preserving Explainable AIoT Application via SHAP Entropy Regularization

Dilli Prasad Sharma, Xiaowei Sun, Liang Xue et al. · York University · University of Guelph +1 more

Defends against membership-inference and attribute-inference attacks on SHAP explanations in smart-home LSTM models via entropy regularization

Membership Inference Attack Model Inversion Attack timeseries

PDF

defense TrustCom Nov 9, 2025 · Nov 2025

Enhancing Adversarial Robustness of IoT Intrusion Detection via SHAP-Based Attribution Fingerprinting

Dilli Prasad Sharma, Liang Xue, Xiaowei Sun et al. · York University · University of Guelph +1 more

Defends ML-based IoT intrusion detection against adversarial evasion by detecting perturbed inputs via SHAP attribution fingerprints and autoencoder anomaly detection

Input Manipulation Attack tabular

PDF

attack CIKM Oct 24, 2025 · Oct 2025

Uncovering the Persuasive Fingerprint of LLMs in Jailbreaking Attacks

Havva Alizadeh Noughabi, Julien Serbanescu, Fattane Zarrinkalam et al. · University of Guelph

Exploits social-science persuasion theories to craft natural-language jailbreak prompts that bypass LLM alignment safeguards

Prompt Injection nlp

PDF Code

defense arXiv Oct 6, 2025 · Oct 2025

Indirect Prompt Injections: Are Firewalls All You Need, or Stronger Benchmarks?

Rishika Bhagwatkar, Kevin Kasa, Abhay Puri et al. · ServiceNow Research · Mila - Québec AI Institute +3 more

Modular agent-tool firewall achieves perfect indirect prompt injection defense on four benchmarks, while exposing those benchmarks as too weak

Prompt Injection nlp

4 citations PDF
