Latest papers

4 papers
defense arXiv Mar 7, 2026

Governance Architecture for Autonomous Agent Systems: Threats, Framework, and Engineering Practice

Yuxu Ge · University of York

Four-layer governance framework defends LLM agents against prompt injection, RAG poisoning, and malicious plugins with a 96% interception rate

Prompt Injection · Insecure Plugin Design · Excessive Agency · nlp
PDF
benchmark arXiv Feb 13, 2026 · 7w ago

A Calibrated Memorization Index (MI) for Detecting Training Data Leakage in Generative MRI Models

Yash Deo, Yan Jia, Toni Lassila et al. · University of York · University of Leeds +3 more

Proposes calibrated memorization metrics using MRI foundation model features to detect training data duplication in generative MRI models

Model Inversion Attack · vision
PDF Code
benchmark arXiv Nov 4, 2025

Trustworthy Quantum Machine Learning: A Roadmap for Reliability, Robustness, and Security in the NISQ Era

Ferhat Ozgur Catak, Jungwon Seo, Umit Cali · University of Stavanger · University of York

Roadmap formalizing adversarial robustness, privacy, and uncertainty metrics for trustworthy quantum ML on NISQ hardware

Input Manipulation Attack · Model Inversion Attack · federated-learning
PDF Code
defense arXiv Sep 29, 2025

Guided Uncertainty Learning Using a Post-Hoc Evidential Meta-Model

Charmaine Barker, Daniel Bethell, Simos Gerasimou · University of York

Post-hoc evidential meta-model detects adversarial inputs and OOD samples via noise-driven uncertainty curriculum on frozen models

Input Manipulation Attack · vision
1 citation PDF