Latest papers

3 papers
defense arXiv Feb 16, 2026 · 7w ago

Weight space Detection of Backdoors in LoRA Adapters

David Puertolas Merenciano, Ekaterina Vasyagina, Raghav Dixit et al. · Algoverse AI Research · University of Aberdeen +1 more

Detects backdoored LoRA adapters via SVD spectral statistics on weight matrices, achieving 97% accuracy without model execution

Model Poisoning AI Supply Chain Attacks nlp
attack arXiv Jan 26, 2026 · 10w ago

Malicious Repurposing of Open Science Artefacts by Using Large Language Models

Zahra Hashemi, Zhiqiang Zhong, Jun Pang et al. · University of Luxembourg · University of Aberdeen

Persuasion-based jailbreak pipeline exploits LLMs to repurpose open NLP artefacts into harmful research proposals

Prompt Injection nlp
attack arXiv Oct 22, 2025 · Oct 2025

Training data membership inference via Gaussian process meta-modeling: a post-hoc analysis approach

Yongchao Huang, Pengfei Zhang, Shahzad Mumtaz · University of Aberdeen · Binance

Proposes Gaussian process meta-model for membership inference attacks using post-hoc features without shadow models

Membership Inference Attack vision nlp tabular