ML Security Papers

Latest papers

2 papers

defense · arXiv · Mar 12, 2026

OrthoEraser: Coupled-Neuron Orthogonal Projection for Concept Erasure

Chuancheng Shi, Wenhua Wu, Fei Shen et al. · University of Sydney · National University of Singapore +2 more

Defends text-to-image (T2I) diffusion models against adversarially induced harmful content, using an orthogonal projection that preserves benign semantic subspaces during concept erasure

Prompt Injection · vision · generative

attack · arXiv · Mar 1, 2026

Turning Black Box into White Box: Dataset Distillation Leaks

Huajie Chen, Tianqing Zhu, Yuchen Zhong et al. · City University of Macau · CISPA Helmholtz Center for Information Security +2 more

Reveals that dataset distillation leaks training data through a three-stage attack: architecture inference, membership inference, and model inversion

Model Inversion Attack · Membership Inference Attack · vision
