Latest papers

4 papers
defense arXiv Mar 27, 2026

ROAST: Risk-aware Outlier-exposure for Adversarial Selective Training of Anomaly Detectors Against Evasion Attacks

Mohammed Elnawawy, Gargi Mitra, Shahrear Iqbal et al. · University of British Columbia · National Research Council Canada

Selective training framework that improves anomaly detector recall against evasion attacks by focusing on less vulnerable patient data

Input Manipulation Attack · tabular
benchmark arXiv Feb 23, 2026

Agents of Chaos

Natalie Shapira, Chris Wendler, Avery Yen et al. · Northeastern University · Independent Researcher +11 more

Red-teams live autonomous LLM agents over two weeks, documenting 11 case studies of dangerous failures including system takeover, DoS, and sensitive data disclosure

Excessive Agency · Prompt Injection · Insecure Plugin Design · nlp
3 citations
attack arXiv Feb 2, 2026

FaceLinkGen: Rethinking Identity Leakage in Privacy-Preserving Face Recognition with Identity Extraction

Wenqi Guo, Shan Du · University of British Columbia · Weathon Software

Attacks privacy-preserving face recognition by inverting protected templates to extract identity embeddings and regenerate realistic faces

Model Inversion Attack · vision
benchmark arXiv Sep 5, 2025

Behind the Mask: Benchmarking Camouflaged Jailbreaks in Large Language Models

Youjia Zheng, Mohammad Zandsalimy, Shanu Sushmita · Stevens Institute of Technology · University of British Columbia +1 more

Benchmarks camouflaged natural-language jailbreaks on LLMs with a 500-prompt dataset and a 7-dimension harmfulness evaluation framework

Prompt Injection · nlp