ML Security Papers

Latest papers

4 papers

defense arXiv Jan 12, 2026 · Jan 2026

Lucas Schott, Elies Gherbi, Hatem Hajri et al. · IRT SystemX · Sorbonne Université +2 more

Adaptive adversarial training for RL using reward-preserving attacks that calibrate perturbation strength to avoid making tasks unsolvable

Input Manipulation Attack reinforcement-learning

benchmark EMNLP Oct 15, 2025 · Oct 2025

Matthieu Dubois, François Yvon, Pablo Piantanida · Sorbonne Université · CNRS +2 more

Benchmarks AI text detectors across 37 decoding configs, showing AUROC collapses from 0.99 to 0.01 with minor sampling changes

Output Integrity Attack nlp

2 citations PDF Code

defense arXiv Sep 30, 2025 · Sep 2025

Akash Dhasade, Sadegh Farhadkhani, Rachid Guerraoui et al. · EPFL · University of Copenhagen +1 more

Defends federated inference aggregators against Byzantine clients using DeepSet adversarial training, beating existing methods by up to 22%

Data Poisoning Attack federated-learning vision nlp

1 citation PDF

attack arXiv Sep 30, 2025 · Sep 2025

Valentin Barbaza, Alan Rodrigo Diaz-Rizo, Hassan Aboushady et al. · Sorbonne Université

Hardware Trojan in AI accelerators covertly exfiltrates model weights via a wireless channel, enabling complete, architecture-agnostic model theft

Model Theft