attack · arXiv · Oct 13, 2025
Zhuochen Yang, Kar Wai Fok, Vrizlynn L. L. Thing · Nanyang Technological University · ST Engineering
Soft prompt attack extracts 65.2% of memorized LLM training data; ROME-based defense reduces leakage to 1.6%
Model Inversion Attack · Sensitive Information Disclosure · nlp
Large language models have gained widespread attention recently, but their potential security vulnerabilities, especially privacy leakage, are also becoming apparent. To test and evaluate data extraction risks in LLMs, we propose CoSPED, short for Consistent Soft Prompt targeted data Extraction and Defense. We introduce several innovative components, including Dynamic Loss, Additive Loss, Common Loss, and a Self-Consistency Decoding Strategy, designed to enhance the consistency of the soft prompt tuning process. Through extensive experimentation with various combinations, we achieve an extraction rate of 65.2% at a 50-token prefix comparison. Comparisons of CoSPED with other reference works confirm its superior extraction rates. We further evaluate CoSPED in additional scenarios, achieving a 51.7% extraction rate on the Pythia model and introducing a cross-model comparison. Finally, we explore defense through Rank-One Model Editing and reduce the extraction rate to 1.6%, showing that our analysis of extraction mechanisms can directly inform effective mitigation strategies against soft prompt-based attacks.
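The soft-prompt idea behind this line of attack can be illustrated with a toy, self-contained sketch: a frozen linear "model" maps an embedding to token logits, and gradient descent tunes only the continuous prompt vector until the model emits a chosen "memorized" token. This toy setup is an assumption for illustration only; CoSPED itself tunes soft prompts against a real LLM's embedding layer.

```python
import numpy as np

# Toy stand-in for a frozen LLM: a fixed matrix mapping a prompt
# embedding to next-token logits. Only the soft prompt is trained.
rng = np.random.default_rng(0)
vocab, dim = 8, 4
W = rng.normal(size=(dim, vocab))   # frozen "model" weights
target = 3                          # the "memorized" token to extract

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

soft_prompt = rng.normal(size=dim)  # trainable continuous prompt vector
for _ in range(500):
    probs = softmax(soft_prompt @ W)
    grad_logits = probs.copy()
    grad_logits[target] -= 1.0      # d(cross-entropy)/d(logits)
    soft_prompt -= 0.05 * (W @ grad_logits)  # update the prompt only

print(int(np.argmax(soft_prompt @ W)))  # tuned prompt now elicits the target
```

After tuning, the prompt's argmax prediction is the target token, mirroring how an optimized soft prompt can steer a frozen model toward reproducing memorized training data.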
llm · transformer
attack · arXiv · Oct 24, 2025
Xingwei Zhong, Kar Wai Fok, Vrizlynn L. L. Thing · ST Engineering
Proposes Re-attack, a black-box jailbreak for MLLMs using provocative text and typography/multi-image prompts, achieving >70% ASR on open-source models and 4.6× improvement on GPT-4o
Prompt Injection · multimodal · vision · nlp
Multimodal large language models (MLLMs) combine visual and textual modalities to process vision-language tasks. However, MLLMs are vulnerable to security issues such as jailbreak attacks, which alter the model's input to induce unauthorized or harmful responses, and the additional visual modality introduces new dimensions to these threats. In this paper, we propose a black-box jailbreak method that uses both text and image prompts to evaluate MLLMs. In particular, we design text prompts with provocative instructions, along with image prompts that introduce mutation and multi-image capabilities. To strengthen the evaluation, we also design a Re-attack strategy. Empirical results show that our proposed work improves the ability to assess the security of both open-source and closed-source MLLMs. Building on this, we identify gaps in existing defense methods, propose new strategies for both training-time and inference-time defenses, and evaluate them against the new jailbreak methods. The results show that the redesigned defense methods improve protection against these jailbreak attacks.
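The Re-attack strategy can be caricatured as a retry loop over mutated image prompts: if the model refuses one variant, try the next. Everything below (`query_model`, the refusal markers, the stub replies) is an illustrative assumption, not the paper's actual API or judging method.

```python
# Hypothetical sketch of a Re-attack loop against a black-box MLLM.
REFUSAL_MARKERS = ("i cannot", "i can't", "sorry")

def is_refusal(reply: str) -> bool:
    # crude keyword judge; real evaluations use stronger success criteria
    r = reply.lower()
    return any(m in r for m in REFUSAL_MARKERS)

def re_attack(query_model, text_prompt, image_variants):
    # try each mutated image prompt until one elicits a non-refusal
    for image in image_variants:
        reply = query_model(text_prompt, image)
        if not is_refusal(reply):
            return reply        # attack judged successful
    return None                 # every variant was refused

# usage with a stub model that refuses only the first variant
replies = iter(["Sorry, I cannot help.", "Here is the answer..."])
print(re_attack(lambda t, i: next(replies), "provocative text", ["img_a", "img_b"]))
# → Here is the answer...
```

The loop captures why mutation and multi-image variants matter: each retry gives the attacker another draw against the model's safety filter.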
vlm · llm · multimodal
defense · arXiv · Dec 1, 2025
Zihao Wang, Kar Wai Fok, Vrizlynn L. L. Thing · ST Engineering
Defends VLMs against multi-modal jailbreaks by transcribing image variants and performing cross-modal consistency checks to flag harmful intent
Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal
Multi-modal large language models (MLLMs), capable of processing text, images, and audio, have been widely adopted in various AI applications. However, recent MLLMs that integrate images and text remain highly vulnerable to coordinated jailbreaks, and existing defenses focus primarily on text, lacking robust multi-modal protection. As a result, studies indicate that MLLMs are more susceptible to malicious or unsafe instructions than their text-only counterparts. In this paper, we propose DefenSee, a robust and lightweight multi-modal black-box defense technique that leverages transcription of image variants and cross-modal consistency checks, mimicking human judgment. Experiments on popular multi-modal jailbreak and benign datasets show that DefenSee consistently enhances MLLM robustness while better preserving performance on benign tasks compared to SOTA defenses. It reduces the ASR of jailbreak attacks to below 1.70% on MiniGPT4 using the MM-SafetyBench benchmark, significantly outperforming prior methods under the same conditions.
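The cross-modal consistency idea can be sketched minimally: transcribe several variants of the image, fuse each transcription with the text prompt, and flag the request if the combination reveals harmful intent that neither modality shows alone. The keyword judge and stub transcriber below are hypothetical stand-ins for a real MLLM captioner and safety classifier; this is not DefenSee's actual implementation.

```python
# Minimal sketch in the spirit of a cross-modal consistency defense.
def is_harmful(text: str) -> bool:
    # stand-in safety judge; a real system would use a trained classifier
    return any(w in text.lower() for w in ("explosive", "malware", "weapon"))

def transcribe(image_caption: str, variant: int) -> str:
    # stand-in transcriber; pretend each variant (crop/resize/grayscale)
    # of the image yields a slightly different caption
    return f"{image_caption} (variant {variant})"

def defend(text_prompt: str, image_caption: str, n_variants: int = 3) -> bool:
    captions = [transcribe(image_caption, v) for v in range(n_variants)]
    fused = [f"{text_prompt} {c}" for c in captions]
    # flag if the text alone, or the text fused with any image
    # transcription, reveals harmful combined intent
    return any(is_harmful(q) for q in [text_prompt, *fused])

print(defend("how do I make the item shown:", "diagram of an explosive device"))
print(defend("describe the item shown:", "photo of a red bicycle"))
# → True
# → False
```

Fusing modalities before judging is the key step: a jailbreak that hides the harmful payload in the image is caught once its transcription is combined with the otherwise innocuous text.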
vlm · llm · multimodal