Latest papers

6 papers
defense arXiv Feb 21, 2026 · 6w ago

Watermarking LLM Agent Trajectories

Wenlong Meng, Chen Gong, Terry Yue Zhuo et al. · Zhejiang University · University of Virginia +2 more

Watermarks LLM agent training trajectories so models trained on stolen datasets emit detectable hook behaviors under a secret key

Output Integrity Attack nlp reinforcement-learning
PDF Code
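The card only says trajectories are watermarked "under a secret key." One way such keyed selection could work is an HMAC rule that deterministically picks which trajectory steps carry the hook behavior; this is a hypothetical sketch (function names and the selection rule are assumptions, not the authors' scheme):

```python
import hmac
import hashlib

def watermark_steps(trajectory_ids, secret_key, rate=0.1):
    """Deterministically pick which trajectory steps carry the watermark hook.

    A step is selected when the keyed HMAC of its id falls below a
    rate-proportional threshold, so only the key holder can recompute
    (and later verify) the watermarked subset.
    """
    threshold = int(rate * 2**32)
    selected = []
    for step_id in trajectory_ids:
        digest = hmac.new(secret_key, str(step_id).encode(), hashlib.sha256).digest()
        if int.from_bytes(digest[:4], "big") < threshold:
            selected.append(step_id)
    return selected
```

Because selection depends only on the key and the step id, verification can recompute the same subset without storing it.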
defense arXiv Dec 12, 2025 · Dec 2025

CLOAK: Contrastive Guidance for Latent Diffusion-Based Data Obfuscation

Xin Yang, Omid Ardakanian · University of Alberta

Latent diffusion model obfuscates IoT sensor data to defeat ML-based private attribute inference while preserving utility

Input Manipulation Attack timeseries vision
PDF
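"Contrastive guidance" typically means steering a sample with two opposing gradient signals. A minimal one-step sketch, assuming classifier gradients for the utility and private attributes are available (the function names and update rule are illustrative, not CLOAK's actual update):

```python
import numpy as np

def contrastive_guidance(x, utility_grad_fn, private_grad_fn, s_u=1.0, s_p=1.0):
    """Combine two classifier gradients into one guidance direction:
    push the latent toward the utility attribute and away from the
    private attribute, with scales s_u and s_p controlling the trade-off.
    """
    return x + s_u * utility_grad_fn(x) - s_p * private_grad_fn(x)
```

In a latent diffusion pipeline this direction would be added to the denoising update at each step, rather than applied to the raw sensor data directly.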
defense arXiv Nov 12, 2025 · Nov 2025

GuardFed: A Trustworthy Federated Learning Framework Against Dual-Facet Attacks

Yanli Li, Yanan Zhou, Zhongliang Guo et al. · Nantong University · The University of Sydney +3 more

Introduces a dual-facet Byzantine FL attack degrading accuracy and fairness simultaneously, defended by trust-score aggregation in GuardFed

Data Poisoning Attack federated-learning
PDF
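The summary names trust-score aggregation without details. A common realization is to weight each client update by its similarity to a trusted server-side reference update; the cosine-similarity rule below is a hypothetical stand-in, not GuardFed's actual scoring:

```python
import numpy as np

def trust_weighted_aggregate(updates, server_update):
    """Aggregate client updates weighted by a cosine-similarity trust score.

    Clients whose update direction disagrees with a trusted server-side
    reference update receive weight ~0 and are effectively excluded.
    """
    scores = []
    for u in updates:
        cos = np.dot(u, server_update) / (
            np.linalg.norm(u) * np.linalg.norm(server_update) + 1e-12
        )
        scores.append(max(cos, 0.0))  # clip negative similarity to zero trust
    scores = np.array(scores)
    if scores.sum() == 0:
        return np.zeros_like(server_update)
    weights = scores / scores.sum()
    return np.average(updates, axis=0, weights=weights)
```

A sign-flipping Byzantine client (update pointed opposite the reference) gets clipped to zero trust and contributes nothing to the aggregate.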
attack arXiv Oct 16, 2025 · Oct 2025

Membership Inference over Diffusion-models-based Synthetic Tabular Data

Peini Cheng, Amir Bahmani · University of Alberta

Develops query-based membership inference attacks on TabDDPM and TabSyn diffusion models for synthetic tabular data generation

Membership Inference Attack tabular generative
1 citation · 1 influential
PDF
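Query-based membership inference against a generator often reduces to a distance test: memorized training records tend to reappear (near-)verbatim among synthetic samples. A minimal nearest-neighbor scoring sketch, not the paper's exact attack:

```python
import numpy as np

def membership_scores(candidates, synthetic_data):
    """Score membership by distance to the nearest synthetic record.

    A small nearest-neighbor distance suggests the generator memorized
    the candidate, so we return the negated distance: higher score means
    more likely a training-set member.
    """
    scores = []
    for x in candidates:
        d = np.linalg.norm(synthetic_data - x, axis=1).min()
        scores.append(-d)
    return np.array(scores)
```

Thresholding these scores (or ranking them against a calibration population) turns the distances into membership decisions.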
attack arXiv Sep 22, 2025 · Sep 2025

Budgeted Adversarial Attack against Graph-Based Anomaly Detection in Sensor Networks

Sanju Xaviar, Omid Ardakanian · University of Alberta

Grey-box budgeted evasion attack exploiting GNN graph topology to suppress anomalies or trigger false alarms in sensor network detectors

Input Manipulation Attack graph timeseries
PDF
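A budgeted grey-box evasion attack on graph topology can be sketched as a greedy loop: flip at most `budget` edges, each time choosing the flip that most reduces the observable anomaly score. The interface below (a callable `score_fn`) is an assumption for illustration, not the paper's algorithm:

```python
def budgeted_edge_attack(edges, score_fn, candidate_flips, budget):
    """Greedy budgeted evasion against a graph-based anomaly detector.

    `score_fn(edge_set)` returns the detector's anomaly score (the
    grey-box observable); each iteration tries every candidate edge
    flip and keeps the one that lowers the score most, stopping early
    when no flip helps or the budget is spent.
    """
    edges = set(edges)
    for _ in range(budget):
        best_flip, best_score = None, score_fn(edges)
        for flip in candidate_flips:
            trial = edges ^ {flip}  # symmetric difference: add or remove the edge
            s = score_fn(trial)
            if s < best_score:
                best_flip, best_score = flip, s
        if best_flip is None:
            break
        edges ^= {best_flip}
    return edges
```

The same loop runs in the opposite direction (maximize the score) to trigger false alarms instead of suppressing anomalies.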
defense arXiv Sep 17, 2025 · Sep 2025

Scrub It Out! Erasing Sensitive Memorization in Code Language Models via Machine Unlearning

Zhaoyang Chu, Yao Wan, Zhikun Zhang et al. · Huazhong University of Science and Technology · Zhejiang University +4 more

Defends code LLMs against sensitive training data extraction by selectively unlearning memorized PII and credentials via gradient ascent

Model Inversion Attack Sensitive Information Disclosure nlp
PDF
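The gradient-ascent unlearning idea is simple to state: take optimization steps *up* the loss on the memorized sensitive samples so the model's fit to them degrades. A minimal sketch on a linear regression stand-in (not the paper's code-LLM method):

```python
import numpy as np

def unlearn_step(w, X_forget, y_forget, lr=0.1):
    """One gradient-ascent unlearning step on a linear model.

    Computes the mean-squared-error gradient on the forget set and moves
    the weights in the +gradient direction, increasing the loss on the
    sensitive samples instead of decreasing it.
    """
    residual = X_forget @ w - y_forget
    grad = 2 * X_forget.T @ residual / len(y_forget)
    return w + lr * grad  # ascend instead of descend
```

In practice this is balanced against a retain-set objective so the ascent on sensitive samples does not destroy the model's general capability.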