Latest papers

11 papers
attack arXiv Mar 24, 2026 · 13d ago

AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents

Yutao Luo, Haotian Zhu, Shuchao Pang et al. · Nanjing University of Science and Technology · Macquarie University +3 more

Backdoor attack on mobile GUI agents using benign notification icons to trigger malicious actions with 90%+ success rate

Model Poisoning visionmultimodal
PDF
attack arXiv Jan 30, 2026 · 9w ago

Semantic Leakage from Image Embeddings

Yiyi Chen, Qiongkai Xu, Desmond Elliott et al. · Aalborg University · Macquarie University +1 more

Recovers semantic content from compressed image embeddings via alignment and retrieval, exposing privacy risks in CLIP, GEMINI, COHERE, and NOMIC APIs

Model Inversion Attack visionmultimodal
PDF
defense arXiv Jan 24, 2026 · 10w ago

Revealing the Truth with ConLLM for Detecting Multi-Modal Deepfakes

Gautam Siddharth Kashyap, Harsh Joshi, Niharika Jain et al. · Macquarie University · Bharati Vidyapeeth’s College Of Engineering +4 more

Proposes ConLLM, a contrastive learning + LLM framework for detecting audio, video, and audio-visual deepfakes

Output Integrity Attack multimodalaudiovisionnlp
PDF Code
attack arXiv Jan 5, 2026 · Jan 2026

Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization

Jiwei Guan, Haibo Jin, Haohan Wang · Macquarie University · University of Illinois Urbana-Champaign

Black-box gradient-free attack crafts adversarial images to jailbreak vision-language models with 83% ASR

Input Manipulation Attack Prompt Injection visionnlpmultimodal
PDF
defense Annual Computer Security Appli... Dec 15, 2025 · Dec 2025

CTIGuardian: A Few-Shot Framework for Mitigating Privacy Leakage in Fine-Tuned LLMs

Shashie Dilhara Batan Arachchige, Benjamin Zi Hao Zhao, Hassan Jameel Asghar et al. · Macquarie University

Defends fine-tuned CTI LLMs against data-extraction attacks using few-shot privacy alignment with classifier and redactor components

Model Inversion Attack Sensitive Information Disclosure nlp
PDF Code
attack arXiv Dec 10, 2025 · Dec 2025

Reference Recommendation based Membership Inference Attack against Hybrid-based Recommender Systems

Xiaoxiao Chi, Xuyun Zhang, Yan Wang et al. · Macquarie University · The University of Newcastle +1 more

Novel metric-based membership inference attack against hybrid recommender systems using reference recommendations to infer user training membership

Membership Inference Attack tabular
PDF
attack arXiv Nov 19, 2025 · Nov 2025

As If We've Met Before: LLMs Exhibit Certainty in Recognizing Seen Files

Haodong Li, Jingqi Zhang, Xiao Cheng et al. · Huazhong University of Science and Technology · National University of Singapore +1 more

Novel membership inference framework exploiting LLM overconfidence and uncertainty signals to detect copyrighted training data

Membership Inference Attack nlp
PDF
defense BigData Congress Oct 29, 2025 · Oct 2025

Agentic Moderation: Multi-Agent Design for Safer Vision-Language Models

Juan Ren, Mark Dras, Usman Naseem · Macquarie University

Multi-agent safety framework defending VLMs against jailbreak attacks via cooperative Shield, Evaluator, and Reflector agents with context-aware moderation

Input Manipulation Attack Prompt Injection multimodalvisionnlp
1 citations PDF
defense arXiv Oct 15, 2025 · Oct 2025

SHIELD: Classifier-Guided Prompting for Robust and Safer LVLMs

Juan Ren, Mark Dras, Usman Naseem · Macquarie University

Plug-and-play preprocessing guardrail for LVLMs that classifies harm categories and applies tailored Block/Reframe/Forward safety prompts against multimodal jailbreaks

Input Manipulation Attack Prompt Injection visionnlpmultimodal
4 citations PDF Code
survey arXiv Sep 10, 2025 · Sep 2025

Adversarial Attacks Against Automated Fact-Checking: A Survey

Fanzhen Liu, Alsharif Abuadbba, Kristen Moore et al. · Macquarie University · CSIRO’s Data61 +1 more

Surveys adversarial attacks against automated fact-checking ML models, covering claim manipulation, evidence injection, and adversary-aware defenses

Input Manipulation Attack Data Poisoning Attack Prompt Injection nlpmultimodal
PDF Code
attack arXiv Aug 21, 2025 · Aug 2025

Retrieval-Augmented Review Generation for Poisoning Recommender Systems

Shiyi Yang, Xinshu Li, Guanglin Zhou et al. · University of New South Wales · CSIRO’s Data61 +2 more

Poisons recommender systems by injecting LLM-generated fake user profiles using retrieval-augmented ICL and jailbreaking to evade detection

Data Poisoning Attack nlp
PDF