ML Security Papers

Latest papers

12 papers

survey arXiv Apr 28, 2026

Verification of Neural Networks (Lecture Notes)

Benedikt Bollig · Université Paris-Saclay · CNRS +1 more

Theoretical introduction to formal verification techniques for neural networks including feed-forward, recurrent, attention, and transformer architectures

Input Manipulation Attack vision nlp

PDF

attack arXiv Apr 16, 2026

No More Guessing: a Verifiable Gradient Inversion Attack in Federated Learning

Francesco Diana, Chuan Xu, André Nusser et al. · Université Côte d’Azur · Inria +2 more

Gradient inversion attack on federated learning that algebraically verifies when exact training records are reconstructed from gradients

Model Inversion Attack tabular federated-learning

PDF

attack arXiv Feb 18, 2026 · Feb 2026

Sequential Membership Inference Attacks

Thomas Michel, Debabrota Basu, Emilie Kaufmann · Univ. Lille · Inria +2 more

Derives optimal membership inference attack exploiting model update sequences, achieving tighter DP privacy audits than static-model baselines

Membership Inference Attack

PDF

benchmark arXiv Feb 17, 2026 · Feb 2026

Generalized Leverage Score for Scalable Assessment of Privacy Vulnerability

Valentin Dorseuil, Jamal Atif, Olivier Cappé · École normale supérieure · Université PSL +3 more

Proposes Generalized Leverage Score as a training-free metric for individual membership inference vulnerability in deep learning

Membership Inference Attack

PDF

attack arXiv Jan 13, 2026 · Jan 2026

Double Strike: Breaking Approximation-Based Side-Channel Countermeasures for DNNs

Lorenzo Casalino, Maria Méndez Real, Jean-Christophe Prévotet et al. · CentraleSupélec · Inria +7 more

Side-channel attack breaks MACPRUNING defense to recover 96–100% of DNN weights from embedded hardware implementations

Model Theft

PDF

benchmark arXiv Nov 10, 2025 · Nov 2025

Formal Reasoning About Confidence and Automated Verification of Neural Networks

Mohammad Afzal, S. Akshay, Blaise Genest et al. · Indian Institute of Technology Bombay · TCS Research +1 more

Formal verification framework extending neural network robustness checking to confidence-based specifications via grammar and layer augmentation

Input Manipulation Attack vision

PDF

survey arXiv Oct 23, 2025 · Oct 2025

On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text?

Mingmeng Geng, Thierry Poibeau · École normale supérieure · Université Paris Sciences et Lettres +1 more

Surveys LLM text detection limitations, arguing reliable detection fails due to definitional gaps and benchmark inadequacies

Output Integrity Attack nlp

PDF

benchmark EMNLP Oct 15, 2025 · Oct 2025

How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study

Matthieu Dubois, François Yvon, Pablo Piantanida · Sorbonne Université · CNRS +2 more

Benchmarks AI text detectors across 37 decoding configs, showing AUROC collapses from 0.99 to 0.01 with minor sampling changes

Output Integrity Attack nlp

2 citations PDF Code

defense arXiv Sep 26, 2025 · Sep 2025

Guidance Watermarking for Diffusion Models

Enoal Gesny, Eva Giboulot, Teddy Furon et al. · Univ. Rennes · Inria +3 more

Guides diffusion sampling with watermark-decoder gradients to embed robust provenance signals in generated images without retraining

Output Integrity Attack vision generative

1 citation PDF

defense arXiv Sep 1, 2025 · Sep 2025

Practical and Private Hybrid ML Inference with Fully Homomorphic Encryption

Sayan Biswas, Philippe Chartier, Akash Dhasade et al. · EPFL · Inria +4 more

Defends model IP in hybrid FHE inference by randomized shuffling of intermediate outputs, preventing clients from reconstructing server-side model weights

Model Theft vision

PDF

benchmark arXiv Aug 15, 2025 · Aug 2025

Semantically Guided Adversarial Testing of Vision Models Using Language Models

Katarzyna Filus, Jorge M. Cruz-Duarte · Polish Academy of Sciences · University of Lille +3 more

Semantically guided target label selection using BERT/CLIP/TinyLLAMA improves adversarial benchmarking interpretability and scalability over WordNet

Input Manipulation Attack vision nlp

PDF Code

attack arXiv Aug 1, 2025 · Aug 2025

Backdoor Attacks on Deep Learning Face Detection

Quentin Le Roux, Yannick Teglia, Teddy Furon et al. · Thales · Inria +3 more

Novel backdoor attacks on face detectors shift landmark coordinates and generate phantom faces via poisoned training data

Model Poisoning vision

PDF
