Latest papers

23 papers
defense ICLR Mar 2, 2026 · 5w ago

Protection against Source Inference Attacks in Federated Learning

Andreas Athanasiou, Kangsoo Jung, Catuscia Palamidessi · TU Delft · INRIA +1 more

Defends federated learning against source inference attacks using parameter-level shuffling combined with the residue number system

Membership Inference Attack federated-learning
PDF
attack arXiv Feb 18, 2026 · 6w ago

Sequential Membership Inference Attacks

Thomas Michel, Debabrota Basu, Emilie Kaufmann · Univ. Lille · INRIA +2 more

Derives optimal membership inference attack exploiting model update sequences, achieving tighter DP privacy audits than static-model baselines

Membership Inference Attack
PDF
defense arXiv Feb 3, 2026 · 8w ago

From Inexact Gradients to Byzantine Robustness: Acceleration and Optimization under Similarity

Renaud Gaucher, Aymeric Dieuleveut, Hadrien Hendrikx · Institut Polytechnique de Paris · INRIA

Casts Byzantine-robust federated learning as inexact gradient optimization, enabling accelerated algorithms with reduced communication complexity

Data Poisoning Attack federated-learning
PDF
defense arXiv Feb 2, 2026 · 9w ago

Learning Better Certified Models from Empirically-Robust Teachers

Alessandro De Palma · London School of Economics and Political Science · INRIA

Distills adversarially-trained teachers into certifiably-robust student models to improve certified robustness-accuracy trade-offs for ReLU networks

Input Manipulation Attack vision
PDF
benchmark arXiv Feb 2, 2026 · 9w ago

Membership Inference Attacks from Causal Principles

Mathieu Even, Clément Berenfeld, Linus Bleistein et al. · INRIA · EPFL

Reframes MIA evaluation as causal inference, identifying and correcting systematic biases in one-run and zero-run privacy protocols

Membership Inference Attack nlp
PDF
defense arXiv Feb 1, 2026 · 9w ago

Key Principles of Graph Machine Learning: Representation, Robustness, and Generalization

Yassine Abbahaddou, Céline Hudelot, Charlotte Laclau et al. · École Polytechnique · CentraleSupélec +4 more

Defends GNNs against adversarial graph perturbations via orthonormalization and noise-based techniques, alongside representation and generalization contributions

Input Manipulation Attack graph
PDF
tool arXiv Jan 20, 2026 · 10w ago

Orthogonium: A Unified, Efficient Library of Orthogonal and 1-Lipschitz Building Blocks

Thibaut Boissin, Franck Mamalet, Valentin Lafargue et al. · Institut de Recherche Technologique Saint-Exupéry · Artificial and Natural Intelligence Toulouse Institute +3 more

PyTorch library unifying orthogonal and 1-Lipschitz layers to enable certified adversarial robustness at scale

Input Manipulation Attack vision generative
PDF Code
attack arXiv Jan 13, 2026 · 11w ago

Double Strike: Breaking Approximation-Based Side-Channel Countermeasures for DNNs

Lorenzo Casalino, Maria Méndez Real, Jean-Christophe Prévotet et al. · CentraleSupélec · INRIA +7 more

Side-channel attack breaks the MACPRUNING defense, recovering 96–100% of DNN weights from embedded hardware implementations

Model Theft
PDF
defense USENIX Security Dec 17, 2025 · Dec 2025

From Risk to Resilience: Towards Assessing and Mitigating the Risk of Data Reconstruction Attacks in Federated Learning

Xiangrui Xu, Zhize Li, Yufei Han et al. · Beijing Jiaotong University · Singapore Management University +3 more

Theoretical framework quantifying data reconstruction attack risk in federated learning via Jacobian spectral analysis, with adaptive noise defenses

Model Inversion Attack federated-learning vision
1 citation PDF
attack arXiv Dec 12, 2025 · Dec 2025

Persistent Backdoor Attacks under Continual Fine-Tuning of LLMs

Jing Cui, Yufei Han, Jianbin Jiao et al. · University of Chinese Academy of Sciences · Institute of Automation +1 more

P-Trojan backdoor attack survives repeated LLM fine-tuning by aligning poisoned and clean task gradients at injection time

Model Poisoning Transfer Learning Attack nlp
PDF
defense arXiv Nov 27, 2025 · Nov 2025

Do You See What I Say? Generalizable Deepfake Detection based on Visual Speech Recognition

Maheswar Bora, Tashvik Dhamija, Shukesh Reddy et al. · Birla Institute of Technology and Science · INRIA

Proposes FauxNet, a VSR-based deepfake video detector achieving generalizable zero-shot detection across unseen generation techniques

Output Integrity Attack vision multimodal
PDF
attack Machine Learning for Biomedica... Nov 26, 2025 · Nov 2025

Data Exfiltration by Compression Attack: Definition and Evaluation on Medical Image Data

Huiyu Li, Nicholas Ayache, Hervé Delingette · INRIA

Insider attack encodes compressed medical training images into exported model weights, enabling high-fidelity reconstruction outside secure data lakes

Model Inversion Attack vision
PDF
defense arXiv Oct 23, 2025 · Oct 2025

Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization

Antônio H. Ribeiro, David Vävinggren, Dave Zachariah et al. · Uppsala University · PSL Research University +1 more

Defends against adversarial input perturbations by recasting adversarial training as feature-space perturbations in an RKHS, enabling exact inner maximization and adaptive regularization

Input Manipulation Attack
1 citation PDF
defense arXiv Oct 23, 2025 · Oct 2025

Adversary-Aware Private Inference over Wireless Channels

Mohamed Seif, Malcolm Egan, Andrea J. Goldsmith et al. · Princeton University · INRIA +1 more

Defends against adversarial inversion of ML feature embeddings during wireless transmission using differential privacy and channel-aware encoding

Model Inversion Attack vision
PDF
benchmark arXiv Oct 9, 2025 · Oct 2025

The Model's Language Matters: A Comparative Privacy Analysis of LLMs

Abhishek K. Mishra, Antoine Boutet, Lucas Magnana · INRIA · INSA Lyon +1 more

Benchmarks training data extraction, memorization, and membership inference attacks on LLMs across four languages, finding Italian most vulnerable due to linguistic redundancy

Model Inversion Attack Membership Inference Attack Sensitive Information Disclosure nlp
PDF
defense arXiv Oct 7, 2025 · Oct 2025

Data Provenance Auditing of Fine-Tuned Large Language Models with a Text-Preserving Technique

Yanming Li, Cédric Eichler, Nicolas Anciaux et al. · INRIA · INSA CVL +4 more

Embeds invisible Unicode watermarks in training documents to audit whether copyrighted text was used in LLM fine-tuning under black-box access

Output Integrity Attack nlp
PDF
defense arXiv Sep 26, 2025 · Sep 2025

Guidance Watermarking for Diffusion Models

Enoal Gesny, Eva Giboulot, Teddy Furon et al. · Univ. Rennes · INRIA +3 more

Guides diffusion sampling with watermark-decoder gradients to embed robust provenance signals in generated images without retraining

Output Integrity Attack vision generative
1 citation PDF
defense arXiv Sep 15, 2025 · Sep 2025

Improving Out-of-Domain Audio Deepfake Detection via Layer Selection and Fusion of SSL-Based Countermeasures

Pierre Serrano, Raphaël Duroselle, Florian Angulo et al. · INRIA

Improves out-of-domain audio deepfake detection by identifying optimal SSL encoder layers and fusing multiple encoders at score level

Output Integrity Attack audio
PDF
defense arXiv Sep 1, 2025 · Sep 2025

Practical and Private Hybrid ML Inference with Fully Homomorphic Encryption

Sayan Biswas, Philippe Chartier, Akash Dhasade et al. · EPFL · INRIA +4 more

Defends model IP in hybrid FHE inference by randomized shuffling of intermediate outputs, preventing clients from reconstructing server-side model weights

Model Theft vision
PDF
defense 6th MICCAI Workshop on "Distri... Aug 20, 2025 · Aug 2025

Mitigating Data Exfiltration Attacks through Layer-Wise Learning Rate Decay Fine-Tuning

Elie Thellier, Huiyu Li, Nicholas Ayache et al. · INRIA

Defends medical data lakes against training-data exfiltration by corrupting embedded data via layer-wise LR decay fine-tuning at model export time

Model Inversion Attack vision
PDF