Latest papers

3 papers
defense arXiv Mar 12, 2026

OrthoEraser: Coupled-Neuron Orthogonal Projection for Concept Erasure

Chuancheng Shi, Wenhua Wu, Fei Shen et al. · University of Sydney · National University of Singapore +2 more

Defends T2I diffusion models from adversarial induction of harmful content via orthogonal projection that preserves benign semantic subspaces during concept erasure

Prompt Injection vision generative
PDF
attack arXiv Nov 18, 2025

Certified but Fooled! Breaking Certified Defences with Ghost Certificates

Quoc Viet Vo, Tashreque M. Haq, Paul Montague et al. · University of Adelaide · Defence Science and Technology Group +1 more

Imperceptible adversarial examples spoof randomized-smoothing certificates, making misclassified inputs appear strongly certified to bypass DensePure and similar defenses

Input Manipulation Attack vision
PDF Code
survey arXiv Jan 2, 2025

State-of-the-art AI-based Learning Approaches for Deepfake Generation and Detection, Analyzing Opportunities, Threading through Pros, Cons, and Future Prospects

Harshika Goyal, Mohammad Saif Wajid, Mohd Anas Wajid et al. · Indian Institute of Technology · Tecnológico de Monterrey +6 more

Surveys ~400 papers on deepfake generation (GANs, VAEs, Transformers) and detection, covering benchmark datasets and future challenges

Output Integrity Attack vision generative
5 citations PDF