Latest papers

19 papers
attack arXiv Mar 18, 2026 · Mar 2026

ARES: Scalable and Practical Gradient Inversion Attack in Federated Learning through Activation Recovery

Zirui Gong, Leo Yu Zhang, Yanjun Zhang et al. · Griffith University · Swinburne University of Technology +2 more

Gradient inversion attack reconstructing training data from federated learning updates via sparse activation recovery without architectural changes

Model Inversion Attack vision federated-learning
PDF
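For background on the gradient-inversion family this entry belongs to, here is a minimal one-dimensional sketch of DLG-style gradient matching: the attacker optimizes a dummy input until the gradient it would produce matches the gradient observed in a federated update. This is a toy illustration, not ARES's activation-recovery method; the model (a single weight `w`), the known label `y`, and all numeric values are hypothetical.

```python
def invert_gradient(w, y, g_observed, x0=1.0, lr=0.004, steps=500):
    """Toy DLG-style gradient matching (not ARES's method): recover a
    scalar training input x from the gradient it produced.
    Model: pred = w * x, loss = (pred - y)^2, so dloss/dw = 2*(w*x - y)*x.
    We descend on the squared gradient mismatch q(x)^2.
    """
    x = x0
    for _ in range(steps):
        q = 2.0 * (w * x - y) * x - g_observed  # gradient mismatch
        dq_dx = 2.0 * (2.0 * w * x - y)         # d/dx of 2*(w*x - y)*x
        x -= lr * 2.0 * q * dq_dx               # gradient step on q^2
    return x
```

With `w = 2.0`, a true input `x = 1.5`, and label `y = 1.0`, the leaked gradient is `2*(2*1.5 - 1)*1.5 = 6.0`, and the loop recovers `x ≈ 1.5` from that single number; real attacks do the same matching over millions of parameters.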
attack arXiv Mar 17, 2026 · Mar 2026

Poisoning the Pixels: Revisiting Backdoor Attacks on Semantic Segmentation

Guangsheng Zhang, Huan Tian, Leo Zhang et al. · University of Technology Sydney · Griffith University +2 more

Backdoor framework for semantic segmentation introducing six attack vectors and optimized triggers, bypassing existing defenses

Model Poisoning Data Poisoning Attack vision
PDF
attack arXiv Mar 5, 2026 · Mar 2026

Osmosis Distillation: Model Hijacking with the Fewest Samples

Yuchen Shi, Huajie Chen, Heng Xu et al. · City University of Macau · Jinan University +1 more

Poisons distilled synthetic datasets to embed hidden hijacking tasks in models fine-tuned via transfer learning

Data Poisoning Attack Transfer Learning Attack vision
PDF
defense arXiv Mar 4, 2026 · Mar 2026

From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration

Yizhe Xie, Congcong Zhu, Xinyue Zhang et al. · City University of Macau · Minzu University of China

Models and defends against injected error-seed cascades in LLM multi-agent systems via genealogy-graph message governance

Prompt Injection Excessive Agency nlp
PDF Code
attack arXiv Mar 1, 2026 · Mar 2026

Turning Black Box into White Box: Dataset Distillation Leaks

Huajie Chen, Tianqing Zhu, Yuchen Zhong et al. · City University of Macau · CISPA Helmholtz Center for Information Security +2 more

Reveals that dataset distillation leaks training data via three-stage attack: architecture inference, membership inference, and model inversion

Model Inversion Attack Membership Inference Attack vision
PDF
attack arXiv Mar 1, 2026 · Mar 2026

Hide&Seek: Remove Image Watermarks with Negligible Cost via Pixel-wise Reconstruction

Huajie Chen, Tianqing Zhu, Hailin Yang et al. · City University of Macau · CISPA Helmholtz Center for Information Security +1 more

Pixel-wise reconstruction attack removes AI-image watermarks without querying detectors or knowing the watermarking scheme

Output Integrity Attack vision generative
PDF
attack arXiv Dec 18, 2025 · Dec 2025

Dual-View Inference Attack: Machine Unlearning Amplifies Privacy Exposure

Lulu Xue, Shengshan Hu, Linqiang Qian et al. · Huazhong University of Science and Technology · Tsinghua University +4 more

Novel black-box membership inference attack exploiting dual-model access after unlearning to infer membership of retained data via likelihood-ratio tests

Membership Inference Attack vision
2 citations PDF
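The dual-view idea can be illustrated with a toy loss-based decision rule: compare an example's loss under the original model and the post-unlearning model. This is a generic sketch of the intuition, not the paper's calibrated likelihood-ratio statistic; the thresholds are hypothetical and a real attack would calibrate them with shadow models.

```python
def classify_example(loss_before, loss_after, low=0.5, delta=0.5):
    """Toy dual-view membership rule (thresholds hypothetical):
    - a forgotten member's loss typically jumps after unlearning;
    - a retained member stays low-loss in BOTH views;
    - a non-member is high-loss in both views."""
    if loss_after - loss_before > delta:
        return "forgotten member"
    if loss_before < low and loss_after < low:
        return "retained member"
    return "non-member"
```

The key observation the entry highlights is that the attacker sees two correlated views of the same training set, so even retained (never-forgotten) data becomes easier to separate from non-members than with single-model access.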
benchmark arXiv Dec 16, 2025 · Dec 2025

Black-Box Auditing of Quantum Model: Lifted Differential Privacy with Quantum Canaries

Baobao Song, Shiva Raj Pokhrel, Athanasios V. Vasilakos et al. · University of Technology Sydney · Deakin University +2 more

Black-box canary framework audits quantum ML models for memorization, empirically lower-bounding privacy leakage via quantum differential privacy

Membership Inference Attack
PDF
defense TDSC Nov 25, 2025 · Nov 2025

Frequency Bias Matters: Diving into Robust and Generalized Deep Image Forgery Detection

Chi Liu, Tianqing Zhu, Wanlei Zhou et al. · City University of Macau · Chinese Academy of Sciences

Frequency alignment method that both evades 12 deepfake detectors as a black-box attack and improves detector robustness and generalization

Output Integrity Attack Input Manipulation Attack vision
PDF
attack arXiv Nov 19, 2025 · Nov 2025

When Harmless Words Harm: A New Threat to LLM Safety via Conceptual Triggers

Zhaoxin Zhang, Borui Chen, Yiming Hu et al. · City University of Macau · University of Vienna +3 more

Novel LLM jailbreak using conceptual morphology triggers to shift ideological orientation in outputs without triggering safety filters

Prompt Injection nlp
PDF
defense arXiv Nov 16, 2025 · Nov 2025

DINO-Detect: A Simple yet Effective Framework for Blur-Robust AI-Generated Image Detection

Jialiang Shen, Jiyang Zheng, Yunqi Xue et al. · The University of Sydney · Shanghai Jiao Tong University +3 more

Proposes a blur-robust AI-generated image detector via DINO-based teacher-student knowledge distillation to handle real-world motion blur

Output Integrity Attack vision
1 citation PDF Code
defense arXiv Nov 12, 2025 · Nov 2025

GuardFed: A Trustworthy Federated Learning Framework Against Dual-Facet Attacks

Yanli Li, Yanan Zhou, Zhongliang Guo et al. · Nantong University · The University of Sydney +3 more

Introduces dual-facet Byzantine FL attack degrading accuracy and fairness simultaneously, defended by trust-score aggregation in GuardFed

Data Poisoning Attack federated-learning
PDF
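Trust-score aggregation in Byzantine-robust FL can be sketched generically: score each client update against a robust reference and down-weight outliers. This toy uses cosine similarity to the coordinate-wise median as the trust score; it illustrates the general idea, not GuardFed's exact rule (which also accounts for fairness degradation).

```python
def trust_weighted_aggregate(updates):
    """Toy trust-score aggregation (generic robust-FL sketch, not
    GuardFed's rule): score each client update by cosine similarity
    to the coordinate-wise median update, clip negative scores to
    zero, and return the trust-weighted average."""
    dim = len(updates[0])
    # Coordinate-wise median as a robust reference direction.
    median = [sorted(u[i] for u in updates)[len(updates) // 2]
              for i in range(dim)]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    weights = [max(cosine(u, median), 0.0) for u in updates]
    total = sum(weights) or 1.0
    return [sum(w * u[i] for w, u in zip(weights, updates)) / total
            for i in range(dim)]
```

A client pushing an update opposed to the honest majority gets a negative cosine score, hence zero weight, so its contribution is dropped from the aggregate.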
benchmark arXiv Oct 21, 2025 · Oct 2025

The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability

Zijie Xu, Minfeng Qi, Shiqing Wu et al. · Minzu University of China · City University of Macau +1 more

Empirically validates that higher inter-agent trust in LLM multi-agent systems increases sensitive data over-exposure and authorization boundary violations

Excessive Agency Sensitive Information Disclosure nlp
2 citations PDF
attack TIFS Oct 9, 2025 · Oct 2025

DarkHash: A Data-Free Backdoor Attack Against Deep Hashing

Ziqi Zhou, Menghao Deng, Yufei Song et al. · Huazhong University of Science and Technology · City University of Macau +1 more

Data-free backdoor attack on deep hashing models using surrogate datasets and topological alignment loss to manipulate image retrieval results

Model Poisoning vision
7 citations PDF
defense arXiv Sep 21, 2025 · Sep 2025

MARS: A Malignity-Aware Backdoor Defense in Federated Learning

Wei Wan, Yuxuan Ning, Zhicong Huang et al. · City University of Macau · Australian National University +4 more

Defends federated learning against backdoor attacks using neuron-level backdoor energy and Wasserstein clustering to detect malicious model updates

Model Poisoning federated-learning vision
5 citations PDF
defense arXiv Sep 18, 2025 · Sep 2025

Causal Fingerprints of AI Generative Models

Hui Xu, Chi Liu, Congcong Zhu et al. · City University of Macau · Qilu University of Technology +1 more

Proposes causal fingerprinting framework to attribute AI-generated images to source GANs or diffusion models via disentangled model traces

Output Integrity Attack vision generative
PDF
attack arXiv Sep 9, 2025 · Sep 2025

Transferable Direct Prompt Injection via Activation-Guided MCMC Sampling

Minghui Li, Hao Zhang, Yechao Zhang et al. · Huazhong University of Science and Technology · Nanyang Technological University +1 more

Transfers direct prompt injection across black-box LLMs using surrogate activations and gradient-free MCMC token optimization

Prompt Injection nlp
PDF
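Gradient-free MCMC token search, the optimization backbone named in this entry, can be sketched as Metropolis-Hastings over token sequences: propose single-token mutations and accept them with probability tied to a surrogate score. The score function here (counting a target token) is a stand-in for the paper's activation-guided surrogate signal; vocabulary and parameters are illustrative.

```python
import math
import random

def mcmc_token_search(score_fn, vocab, seq_len=6, steps=800, temp=0.5, seed=0):
    """Toy Metropolis-Hastings search over token sequences (a sketch of
    gradient-free MCMC token optimization, with a stand-in surrogate
    score): mutate one random position per step; always accept uphill
    moves, accept downhill moves with prob exp((new - old) / temp).
    Returns the best sequence seen and its score."""
    rng = random.Random(seed)
    seq = [rng.choice(vocab) for _ in range(seq_len)]
    cur_score = score_fn(seq)
    best, best_score = list(seq), cur_score
    for _ in range(steps):
        cand = list(seq)
        cand[rng.randrange(seq_len)] = rng.choice(vocab)
        s = score_fn(cand)
        if s >= cur_score or rng.random() < math.exp((s - cur_score) / temp):
            seq, cur_score = cand, s
            if s > best_score:
                best, best_score = list(seq), s
    return best, best_score
```

Because the search needs only black-box score evaluations, no gradients of the target model, it matches the transfer setting: optimize on a surrogate, then replay the sequence against black-box LLMs.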
defense arXiv Sep 7, 2025 · Sep 2025

RetinaGuard: Obfuscating Retinal Age in Fundus Images for Biometric Privacy Preserving

Zhengquan Luo, Chi Liu, Dongfu Xiao et al. · City University of Macau · Monash University +1 more

Defends biometric privacy by generating adversarial GAN perturbations that blind black-box retinal age prediction models while preserving diagnostic image utility

Input Manipulation Attack vision
PDF
attack arXiv Jan 1, 2025 · Jan 2025

Everywhere Attack: Attacking Locally and Globally to Boost Targeted Transferability

Hui Zeng, Sanshuai Cui, Biwei Chen et al. · Southwest University of Science and Technology · Guangan Institute of Technology +2 more

Attacks every local image region simultaneously to boost targeted adversarial transferability across black-box vision models by 28–300%

Input Manipulation Attack vision
5 citations PDF Code