Latest papers

7 papers
survey · arXiv · Mar 24, 2026

SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy

Ali Dehghantanha, Sajad Homayoun · University of Guelph · Aalborg University

Surveys the attack surface of agentic LLM systems: prompt injection, RAG poisoning, tool exploits, and multi-agent threats, with a defense taxonomy

Prompt Injection · Insecure Plugin Design · Excessive Agency · nlp · multimodal
PDF
attack · arXiv · Mar 17, 2026

SOMP: Scalable Gradient Inversion for Large Language Models via Subspace-Guided Orthogonal Matching Pursuit

Yibo Li, Qiongxiu Li · Politecnico di Milano · Aalborg University

Scalable gradient inversion attack recovering private training text from aggregated LLM gradients in federated learning settings

Model Inversion Attack · Sensitive Information Disclosure · nlp · federated-learning
PDF
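A minimal sketch of why aggregated LLM gradients leak training text at all (this is the classic embedding-table observation that motivates gradient inversion, not the paper's SOMP algorithm; the toy model, sizes, and values below are invented for illustration):

```python
# Toy sketch: the gradient of an embedding table is nonzero only at the
# rows of tokens that actually appeared in the private batch, so an
# attacker who sees the gradient can read off which tokens were used.
V, d = 6, 3                      # tiny vocabulary and embedding size
E = [[0.1 * (i + j) for j in range(d)] for i in range(V)]   # embedding table
tokens = [1, 4]                  # private input token ids
target = [0.0, 0.0, 0.0]

# Forward: output = sum of used embeddings; loss = 0.5 * ||output - target||^2
output = [sum(E[t][j] for t in tokens) for j in range(d)]
err = [o - tg for o, tg in zip(output, target)]

# Backward (by hand): dL/dE[v] = err for each occurrence of token v, else 0.
grad_E = [[err[j] if v in tokens else 0.0 for j in range(d)] for v in range(V)]

# Nonzero gradient rows reveal exactly which tokens were in the batch.
recovered = [v for v in range(V) if any(g != 0.0 for g in grad_E[v])]
print(recovered)  # → [1, 4]
```

Recovering token *order* and scaling this to aggregated gradients over many sequences is the hard part that methods like SOMP address.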
benchmark · arXiv · Mar 2, 2026

Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects

Xiaoyu Luo, Wenrui Yu, Qiongxiu Li et al. · Aalborg University

Characterizes training data memorization in diffusion LMs via a generalized extraction framework, showing that sampling resolution controls verbatim PII leakage

Model Inversion Attack · Sensitive Information Disclosure · nlp · generative
PDF
defense · arXiv · Feb 13, 2026

TCRL: Temporal-Coupled Adversarial Training for Robust Constrained Reinforcement Learning in Worst-Case Scenarios

Wentao Xu, Zhongming Yao, Weihao Li et al. · Northeastern University · Zhejiang University +1 more

Defends constrained RL agents against temporally coupled adversarial observation attacks via novel cost constraints and a dual-reward defense

Input Manipulation Attack · reinforcement-learning
PDF
attack · arXiv · Jan 30, 2026

Semantic Leakage from Image Embeddings

Yiyi Chen, Qiongkai Xu, Desmond Elliott et al. · Aalborg University · Macquarie University +1 more

Recovers semantic content from compressed image embeddings via alignment and retrieval, exposing privacy risks in CLIP, Gemini, Cohere, and Nomic embedding APIs

Model Inversion Attack · vision · multimodal
PDF
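A minimal sketch of the retrieval side of this attack (a simplification, not the paper's method): given a leaked embedding and a pool of candidate captions embedded in the same space, nearest-neighbor search over cosine similarity recovers the semantic content. The captions and vectors below are hand-made stand-ins, not real CLIP embeddings:

```python
# Toy sketch: recover semantics from a leaked embedding by ranking a
# candidate caption pool with cosine similarity.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical candidate captions with (fake) embeddings in the shared space.
pool = {
    "a dog on a beach":   [0.9, 0.1, 0.0],
    "a cat on a sofa":    [0.1, 0.9, 0.1],
    "a car on a highway": [0.0, 0.2, 0.9],
}

def retrieve(leaked_embedding):
    # The attacker ranks candidates by similarity to the leaked vector.
    return max(pool, key=lambda c: cosine(leaked_embedding, pool[c]))

leaked = [0.85, 0.15, 0.05]   # embedding exfiltrated from an embedding API
print(retrieve(leaked))       # → "a dog on a beach"
```

The paper's contribution lies in aligning embeddings across model families so this works even without access to the victim encoder; the retrieval step itself is this simple.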
benchmark · arXiv · Jan 7, 2026

Do LLMs Really Memorize Personally Identifiable Information? Revisiting PII Leakage with a Cue-Controlled Memorization Framework

Xiaoyu Luo, Yiyi Chen, Qiongxiu Li et al. · Aalborg University

Proposes CRM framework showing most reported LLM PII leakage is cue-driven generalization, not true memorization, across 32 languages

Membership Inference Attack · Sensitive Information Disclosure · nlp
1 citation · PDF · Code
attack · arXiv · Nov 10, 2025

Breaking Privacy in Federated Clustering: Perfect Input Reconstruction via Temporal Correlations

Guang Yang, Lixia Luo, Qiongxiu Li · Université Paris Cité · Hunan University of Science and Technology +1 more

Exploits temporal correlations in federated k-means iterations to perfectly reconstruct private training data from disclosed centroids

Model Inversion Attack · federated-learning
PDF
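A minimal sketch of why disclosed centroids across k-means rounds can leak exact inputs (a deliberately simplified case, not the paper's algorithm; it assumes cluster sizes are also disclosed and that exactly one point joined the cluster between rounds):

```python
# Toy sketch: if a cluster's centroid c and size n are visible in two
# consecutive federated k-means rounds, and one new point x joined it, then
#   sum_{t+1} = sum_t + x   =>   x = n_{t+1}*c_{t+1} - n_t*c_t   (per coord)

def recover_joined_point(c_prev, n_prev, c_next, n_next):
    """Recover the single point that joined the cluster between rounds."""
    assert n_next == n_prev + 1, "sketch assumes exactly one point joined"
    return [n_next * a - n_prev * b for a, b in zip(c_next, c_prev)]

# Hypothetical disclosed values: cluster had 2 members, then 3.
members_round_t = [[1.0, 2.0], [3.0, 4.0]]
secret_point    = [5.0, 0.0]   # the private input we pretend not to know

c_prev = [(a + b) / 2 for a, b in zip(*members_round_t)]
c_next = [(sum(col) + s) / 3
          for col, s in zip(zip(*members_round_t), secret_point)]

print(recover_joined_point(c_prev, 2, c_next, 3))  # → [5.0, 0.0]
```

The paper generalizes far beyond this one-point case by correlating centroid trajectories across many iterations, but the toy shows why iterative disclosure is strictly more dangerous than a one-shot release.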