Latest papers

9 papers
benchmark · arXiv · Mar 12, 2026

Understanding LLM Behavior When Encountering User-Supplied Harmful Content in Harmless Tasks

Junjie Chu, Yiting Qu, Ye Leng et al. · CISPA Helmholtz Center for Information Security · Delft University of Technology

Benchmarks LLM safety alignment failures when harmful content is embedded in benign tasks like translation, revealing a content-level ethical blind spot

Prompt Injection · nlp
PDF
attack · arXiv · Mar 10, 2026

Removing the Trigger, Not the Backdoor: Alternative Triggers and Latent Backdoors

Gorka Abad, Ermes Franch, Stefanos Koffas et al. · University of Bergen · Delft University of Technology +2 more

Proves backdoor-trained models stay exploitable via alternative triggers even after defenses neutralize the original training trigger

Model Poisoning · vision
PDF
attack · arXiv · Jan 30, 2026

AST-PAC: AST-guided Membership Inference for Code

Roham Koohestani, Ali Al-Kaswan, Jonathan Katzy et al. · Delft University of Technology

AST-guided membership inference attack for code LLMs using syntax-aware perturbations to audit training data provenance

Membership Inference Attack · nlp
PDF
defense · arXiv · Jan 30, 2026

Protecting Private Code in IDE Autocomplete using Differential Privacy

Evgeny Grigorenko, David Stanojević, David Ilić et al. · JetBrains Research · Delft University of Technology

Defends LLM code completion against membership inference and memorization using DP fine-tuning, cutting MIA AUC from 0.901 to 0.606

Membership Inference Attack · Sensitive Information Disclosure · nlp
PDF
benchmark · arXiv · Jan 23, 2026

How does Graph Structure Modulate Membership-Inference Risk for Graph Neural Networks?

Megha Khosla · Delft University of Technology

Analyzes how graph structure and edge access at inference time modulate membership inference risk in GNNs, beyond generalization gap

Membership Inference Attack · graph
PDF
attack · arXiv · Jan 6, 2026

Quality Degradation Attack in Synthetic Data

Qinyi Liu, Dong Liu, Farhad Vadiee et al. · University of Bergen · Delft University of Technology

Attacks synthetic data generators via label flipping and feature interventions, substantially degrading downstream predictive quality

Data Poisoning Attack · tabular · generative
PDF
survey · arXiv · Nov 17, 2025

SoK: The Last Line of Defense: On Backdoor Defense Evaluation

Gorka Abad, Marina Krček, Stefanos Koffas et al. · University of Bergen · Radboud University +3 more

Surveys 183 backdoor defense papers, revealing critical evaluation inconsistencies and proposing standardized assessment recommendations

Model Poisoning · vision
1 citation · PDF
attack · arXiv · Nov 8, 2025

CatBack: Universal Backdoor Attacks on Tabular Data via Categorical Encoding

Behrad Tajalli, Stefanos Koffas, Stjepan Picek · Radboud University · Delft University of Technology +1 more

Backdoor attack on tabular ML models via categorical-to-float encoding, enabling gradient-based universal triggers with 100% attack success rate

Model Poisoning · tabular
PDF
attack · arXiv · Jan 10, 2025

Towards Backdoor Stealthiness in Model Parameter Space

Xiaoyun Xu, Zhuoran Liu, Stefanos Koffas et al. · Radboud University Nijmegen · Delft University of Technology +1 more

Proposes Grond, a backdoor attack stealthy in parameter space that evades 17 diverse defenses via adaptive neuron-level injection

Model Poisoning · vision
PDF · Code