Mengling Feng

Papers in Database (2)

tool arXiv Mar 19, 2026

MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

Zhihui Chen, Kai He, Qingyuan Lei et al. · National University of Singapore · The Chinese University of Hong Kong +3 more

Detects medical image deepfakes via localize-then-analyze reasoning with expert-aligned explanations on synthetic lesion edits

Output Integrity Attack vision multimodal
PDF Code
defense arXiv Sep 8, 2025

Anchoring Refusal Direction: Mitigating Safety Risks in Tuning via Projection Constraint

Yanrui Du, Fenglei Fan, Sendong Zhao et al. · Harbin Institute of Technology · City University of Hong Kong +1 more

Defends LLM safety during fine-tuning by anchoring the internal refusal direction via projection-constrained loss regularization

Transfer Learning Attack Prompt Injection nlp
PDF