Latest papers

2 papers
defense arXiv Mar 12, 2026

OrthoEraser: Coupled-Neuron Orthogonal Projection for Concept Erasure

Chuancheng Shi, Wenhua Wu, Fei Shen et al. · University of Sydney · National University of Singapore +2 more

Defends text-to-image (T2I) diffusion models against adversarial induction of harmful content, using an orthogonal projection that preserves benign semantic subspaces during concept erasure

Prompt Injection · vision · generative
attack arXiv Mar 1, 2026

Turning Black Box into White Box: Dataset Distillation Leaks

Huajie Chen, Tianqing Zhu, Yuchen Zhong et al. · City University of Macau · CISPA Helmholtz Center for Information Security +2 more

Reveals that dataset distillation leaks training data via a three-stage attack: architecture inference, membership inference, and model inversion

Model Inversion Attack · Membership Inference Attack · vision