Latest papers

4 papers
defense arXiv Dec 7, 2025

AlignGemini: Generalizable AI-Generated Image Detection Through Task-Model Alignment

Ruoxin Chen, Jiahui Gao, Kaiqing Lin et al. · Tencent · East China University of Science and Technology +2 more

Proposes task-model alignment combining VLMs and vision models for generalizable AI-generated image detection

Output Integrity Attack vision multimodal
defense arXiv Nov 10, 2025

3D-ANC: Adaptive Neural Collapse for Robust 3D Point Cloud Recognition

Yuanmin Huang, Wenxuan Li, Mi Zhang et al. · Fudan University · East China University of Science and Technology

Defends 3D point cloud classifiers against adversarial perturbations using Neural Collapse to build maximally separable, disentangled feature spaces

Input Manipulation Attack vision
attack arXiv Nov 4, 2025

An Automated Framework for Strategy Discovery, Retrieval, and Evolution in LLM Jailbreak Attacks

Xu Liu, Yan Chen, Kan Ling et al. · East China University of Science and Technology

Automated jailbreak framework learns from failed attempts and evolves reusable prompt strategies, achieving 82.7% ASR against LLMs in black-box settings

Prompt Injection nlp
2 citations
defense arXiv Aug 6, 2025

ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments

Yuquan Wang, Mi Zhang, Yining Wang et al. · Fudan University · East China University of Science and Technology

Inference-time defense for Large Reasoning Models that injects safety reflections mid-reasoning to block jailbreak attacks

Prompt Injection nlp