Latest papers

8 papers
attack · arXiv · Mar 17, 2026

SOMP: Scalable Gradient Inversion for Large Language Models via Subspace-Guided Orthogonal Matching Pursuit

Yibo Li, Qiongxiu Li · Politecnico di Milano · Aalborg University

Scalable gradient inversion attack recovering private training text from aggregated LLM gradients in federated learning settings (the core OMP step is sketched below)

Model Inversion Attack · Sensitive Information Disclosure · nlp · federated-learning
PDF
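
A minimal sketch of the recovery primitive named in the title, orthogonal matching pursuit (OMP), assuming a generic sparse-recovery setup: the paper's subspace guidance and all LLM-specific machinery are omitted, and `D`, `y`, and `k` are illustrative stand-ins (a dictionary of candidate per-token gradient signatures, the observed aggregated gradient, and a sparsity budget), not the authors' code.

```python
import numpy as np

def omp(D, y, k):
    """Greedy sparse recovery: find a support S with y ~= D[:, S] @ coef."""
    residual = y.copy()
    support = []
    for _ in range(k):
        # Select the atom most correlated with what is still unexplained.
        scores = np.abs(D.T @ residual)
        scores[support] = -np.inf            # never reselect an atom
        support.append(int(np.argmax(scores)))
        # Jointly re-fit all selected atoms (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    return support, coef
```

In the gradient-inversion reading, each greedy selection corresponds to deciding which candidate token best explains the remaining aggregated gradient.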
attack · arXiv · Oct 16, 2025

Sequential Comics for Jailbreaking Multimodal Large Language Models via Structured Visual Storytelling

Deyue Zhang, Dongdong Yang, Junjie Mu et al. · 360 AI Security Lab · Politecnico di Milano +1 more

Jailbreaks multimodal LLMs with diffusion-generated comic sequences that exploit narrative coherence to bypass safety alignment

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal · generative
1 citation · PDF
benchmark · arXiv · Oct 7, 2025

Beyond Spectral Peaks: Interpreting the Cues Behind Synthetic Image Detection

Sara Mandelli, Diego Vila-Portela, David Vázquez-Padín et al. · Politecnico di Milano · University of Vigo

Systematic study revealing that most AI-generated image detectors do not rely on spectral peak artifacts as widely assumed

Output Integrity Attack · vision
PDF
benchmark · arXiv · Sep 8, 2025

When Secure Isn't: Assessing the Security of Machine Learning Model Sharing

Gabriele Digregorio, Marco Di Gennaro, Stefano Zanero et al. · Politecnico di Milano

Discovers six 0-day arbitrary-code-execution (ACE) vulnerabilities in ML model-sharing frameworks and hubs, debunking secure-format myths in the supply chain

AI Supply Chain Attacks
PDF
attack · arXiv · Sep 8, 2025

Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?

Junjie Mu, Zonghao Ying, Zhekui Fan et al. · Beihang University · 360 AI Security Lab +4 more

Identifies redundant tokens in GCG adversarial suffixes via learnable masking, reducing LLM jailbreak attack time by 16.8% (the pruning idea is sketched below)

Input Manipulation Attack · Prompt Injection · nlp
PDF
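
To make the redundancy claim concrete, a hedged toy sketch: drop suffix positions whose removal barely changes the attack objective. The paper learns a differentiable mask over positions; this sketch substitutes a brute-force leave-one-out probe, and `attack_loss` is a hypothetical callable (e.g. the target-string negative log-likelihood under the victim model).

```python
def prune_suffix(suffix_tokens, attack_loss, tol=0.05):
    """Greedily drop positions whose removal raises the loss by at most tol."""
    kept = list(suffix_tokens)
    current = attack_loss(kept)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        trial_loss = attack_loss(trial)
        if trial_loss - current <= tol:      # position i is redundant
            kept, current = trial, trial_loss
        else:
            i += 1                           # keep position i, move on
    return kept
```

A shorter suffix is where the speed-up comes from: fewer positions to optimize per GCG step.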
defense · arXiv · Aug 1, 2025

LeakSealer: A Semisupervised Defense for LLMs Against Prompt Injection and Leakage Attacks

Francesco Panebianco, Stefano Bonfanti, Francesco Trovò et al. · Politecnico di Milano · ML cube

Defends LLMs against jailbreaks and PII leakage via semisupervised anomaly detection with forensic usage maps (a minimal flagging sketch follows below)

Prompt Injection · Sensitive Information Disclosure · nlp
PDF
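
As a rough illustration of the semisupervised flagging idea only (the paper's pipeline additionally builds forensic usage maps), one could fit an off-the-shelf anomaly detector on embeddings of known-benign traffic and flag outliers for triage; the detector choice (IsolationForest) and the pre-computed embeddings are assumptions here, not LeakSealer's actual components.

```python
from sklearn.ensemble import IsolationForest

def fit_detector(benign_embeddings):
    # Train only on known-benign prompt embeddings (the semisupervised part).
    det = IsolationForest(contamination="auto", random_state=0)
    det.fit(benign_embeddings)
    return det

def is_suspicious(det, embedding):
    # IsolationForest.predict returns -1 for anomalies, +1 for inliers.
    return det.predict([embedding])[0] == -1
```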
defense · arXiv · Jan 12, 2025

KeTS: Kernel-based Trust Segmentation against Model Poisoning Attacks

Ankit Gangwal, Mauro Conti, Tommaso Pauselli · IIIT Hyderabad · University of Padua +1 more

Defends federated learning against Byzantine model poisoning by segmenting out malicious clients via kernel density estimation (KDE) over their historical update evolution (a toy scoring sketch follows below)

Data Poisoning Attack · federated-learning · vision · tabular
PDF
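
A toy sketch of the KDE scoring idea, under heavy assumptions: each client is reduced to a single feature (the mean norm of its recent updates), and clients falling in low-density regions get low trust. The paper's segmentation of full update histories is richer than this.

```python
import numpy as np
from scipy.stats import gaussian_kde

def trust_scores(update_norm_history):
    """update_norm_history: (clients, rounds) array of per-round update norms."""
    feats = update_norm_history.mean(axis=1)   # one summary feature per client
    density = gaussian_kde(feats)(feats)       # KDE over client behaviour
    return density / density.max()             # low score = likely poisoner

# 9 honest clients vs. 1 client pushing oversized (poisoned) updates:
history = np.vstack([np.random.normal(1.0, 0.1, (9, 5)),
                     np.random.normal(5.0, 0.1, (1, 5))])
print(trust_scores(history))                   # last entry comes out lowest
```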
tool · IEEE Transactions on Software Engineering · Jan 3, 2025

How Toxic Can You Get? Search-based Toxicity Testing for Large Language Models

Simone Corbo, Luca Bancale, Valeria De Gennaro et al. · Politecnico di Milano · Karlsruhe Institute of Technology

Evolutionary search-based tool that auto-generates fluent prompts to elicit toxic outputs from aligned LLMs, outperforming jailbreak baselines (the search loop is sketched below)

Prompt Injection · nlp
PDF
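
Finally, a toy sketch of the evolutionary loop behind search-based toxicity testing: keep the prompts that elicit the most toxic outputs and mutate them. Both `toxicity` (a score over the target model's output for a prompt) and `mutate` (a fluency-preserving prompt rewrite) are hypothetical stand-ins for the paper's components.

```python
import random

def evolve(seed_prompts, toxicity, mutate, generations=20, pop=50):
    population = list(seed_prompts)
    for _ in range(generations):
        # Elitist selection: keep the half that elicits the most toxic outputs.
        elite = sorted(population, key=toxicity, reverse=True)[:pop // 2]
        # Refill the population with mutated copies of the elite.
        offspring = [mutate(random.choice(elite)) for _ in range(pop - len(elite))]
        population = elite + offspring
    return max(population, key=toxicity)
```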