Latest papers

1 paper
defense · FLLM · Oct 16, 2025

PoTS: Proof-of-Training-Steps for Backdoor Detection in Large Language Models

Issam Seddik, Sami Souihi, Mohamed Tamaazousti et al. · Université Paris-Saclay · CEA LIST

Proposes the PoTS (Proof-of-Training-Steps) protocol, which detects backdoor injections during LLM training by auditing the sensitivity of the LM-Head at each training step
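The summary above suggests a step-wise audit of LM-Head changes. As a rough illustration only (the paper's actual protocol is not reproduced here; the function names, the mean-absolute-change sensitivity measure, and the z-score threshold are all assumptions), a minimal sketch might compare LM-Head snapshots between consecutive steps and flag statistical outliers:

```python
# Hypothetical sketch: flag training steps whose LM-Head weight change is
# anomalously large. All names and the outlier criterion are assumptions,
# not the PoTS protocol itself.

def sensitivity(prev_head, curr_head):
    """Per-step sensitivity: mean absolute change of LM-Head weights."""
    return sum(abs(c - p) for p, c in zip(prev_head, curr_head)) / len(prev_head)

def audit(head_snapshots, threshold=2.0):
    """Return step indices whose sensitivity deviates more than
    threshold * stdev from the mean across all steps."""
    scores = [sensitivity(a, b)
              for a, b in zip(head_snapshots, head_snapshots[1:])]
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = var ** 0.5 or 1e-12  # avoid division issues when all scores equal
    return [i + 1 for i, s in enumerate(scores)
            if abs(s - mean) > threshold * std]
```

A step where a backdoor was injected would typically perturb the LM-Head far more than an honest gradient step, so its sensitivity score stands out against the baseline drift.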

Model Poisoning · Data Poisoning · Attack · NLP
PDF