Latest papers

2 papers
defense arXiv Jan 29, 2026

The Unseen Threat: Residual Knowledge in Machine Unlearning under Perturbed Samples

Hsiang Hsu, Pradeep Niroula, Zichang He et al. · JPMorganChase

Reveals that unlearned models still recognize adversarially perturbed forget samples, enabling membership inference; proposes RURK, a fine-tuning defense

Membership Inference Attack vision
1 citation
defense SSRN Oct 8, 2025

A2AS: Agentic AI Runtime Security and Self-Defense

Eugene Neelou, Ivan Novikov, Max Moroz et al. · A2AS · OWASP +10 more

Proposes A2AS, a runtime security framework for LLM agents that enforces prompt authentication, behavior boundaries, and in-context defenses

Prompt Injection Excessive Agency nlp
3 citations