Latest papers

3 papers
defense · arXiv · Jan 12, 2026

Self-Creating Random Walks for Decentralized Learning under Pac-Man Attacks

Xingran Chen, Parimal Parag, Rohit Bhagat et al. · Rutgers University · Indian Institute of Science

Defends decentralized random-walk learning against Byzantine "Pac-Man" nodes that stealthily terminate walks, halting learning without raising alarms

Data Poisoning Attack · federated-learning
PDF
benchmark · arXiv · Oct 3, 2025

A Granular Study of Safety Pretraining under Model Abliteration

Shashank Agnihotri, Jonas Jakubassa, Priyam Dey et al. · University of Mannheim · Max Planck Institute for Informatics +2 more

Benchmarks the robustness of safety pretraining against model abliteration across 20 LLMs, finding that refusal-only training is the most fragile to activation-level jailbreaking

Prompt Injection · nlp
2 citations · PDF · Code
defense · arXiv · Aug 1, 2025

Random Walk Learning and the Pac-Man Attack

Xingran Chen, Parimal Parag, Rohit Bhagat et al. · Rutgers University · Indian Institute of Science

Defends decentralized RW-SGD against stealthy Pac-Man node attacks that kill random walks, by duplicating walks via the AC algorithm

Data Poisoning Attack · federated-learning
PDF