ML Security Papers

Latest papers

5 papers

attack · arXiv · Mar 12, 2026

Cascade: Composing Software-Hardware Attack Gadgets for Adversarial Threat Amplification in Compound AI Systems

Sarbartha Banerjee, Prateek Sahu, Anjo Vahldiek-Oberwagner et al. · Georgia Tech · The University of Texas at Austin +3 more

Compounds Rowhammer hardware faults and RAG database injection with LLM attacks to jailbreak guardrails and exfiltrate user data

Prompt Injection · Sensitive Information Disclosure · nlp

defense · arXiv · Mar 3, 2026

Contextualized Privacy Defense for LLM Agents

Yule Wen, Yanzhe Zhang, Jianxun Lian et al. · Tsinghua University · Georgia Tech +2 more

RL-trained instructor model provides context-aware privacy guidance to LLM agents, preventing sensitive data disclosure with 94.2% preservation rate

Sensitive Information Disclosure · Prompt Injection · nlp

benchmark · arXiv · Feb 23, 2026

Can a Teenager Fool an AI? Evaluating Low-Cost Cosmetic Attacks on Age Estimation Systems

Xingyu Shen, Tommy Duong, Xiaodong An et al. · UC Berkeley · Duke University +4 more

Evaluates cosmetic physical attacks (beard, makeup, wrinkles) that fool age-estimation AI into misclassifying minors as adults, achieving up to 83% success rate

Input Manipulation Attack · vision

attack · arXiv · Aug 16, 2025

ComplicitSplat: Downstream Models are Vulnerable to Blackbox Attacks by 3D Gaussian Splat Camouflages

Matthew Hull, Haoyang Yang, Pratham Mehta et al. · Georgia Tech · Technology Innovation Institute

Black-box adversarial attack embeds viewpoint-specific camouflage in 3DGS scenes to evade object detectors without model access

Input Manipulation Attack · vision

tool · arXiv · Aug 14, 2025

Searching for Privacy Risks in LLM Agents via Simulation

Yanzhe Zhang, Diyi Yang · Stanford University · Georgia Tech

Search-based framework discovers LLM agent privacy extraction attacks and defenses through automated multi-agent simulation

Sensitive Information Disclosure · Prompt Injection · nlp

Code available
