Usman Naseem

defense arXiv Oct 15, 2025 · Oct 2025

SHIELD: Classifier-Guided Prompting for Robust and Safer LVLMs

Juan Ren, Mark Dras, Usman Naseem · Macquarie University

Plug-and-play preprocessing guardrail for LVLMs that classifies harm categories and applies tailored Block/Reframe/Forward safety prompts against multimodal jailbreaks

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal

4 citations PDF Code

defense BigData Congress Oct 29, 2025 · Oct 2025

Agentic Moderation: Multi-Agent Design for Safer Vision-Language Models

Juan Ren, Mark Dras, Usman Naseem · Macquarie University

Multi-agent safety framework defending VLMs against jailbreak attacks via cooperative Shield, Evaluator, and Reflector agents with context-aware moderation

Input Manipulation Attack · Prompt Injection · multimodal · vision · nlp

1 citation PDF

defense arXiv Jan 24, 2026 · Jan 2026

Revealing the Truth with ConLLM for Detecting Multi-Modal Deepfakes

Gautam Siddharth Kashyap, Harsh Joshi, Niharika Jain et al. · Macquarie University · Bharati Vidyapeeth’s College Of Engineering +4 more

Proposes ConLLM, a contrastive learning + LLM framework for detecting audio, video, and audio-visual deepfakes

Output Integrity Attack · multimodal · audio · vision · nlp

PDF Code

Papers in Database (3)

SHIELD: Classifier-Guided Prompting for Robust and Safer LVLMs

Agentic Moderation: Multi-Agent Design for Safer Vision-Language Models

Revealing the Truth with ConLLM for Detecting Multi-Modal Deepfakes