Latest papers

11 papers
attack arXiv Mar 19, 2026

In-the-Wild Camouflage Attack on Vehicle Detectors through Controllable Image Editing

Xiao Fang, Yiming Gong, Stanislav Panev et al. · Carnegie Mellon University · DEVCOM Army Research Laboratory +1 more

Physical-world camouflage attack synthesizing adversarial vehicle textures via ControlNet fine-tuning, achieving a 38% AP50 drop with transferability across detectors

Input Manipulation Attack vision
PDF Code
defense arXiv Feb 23, 2026

CITED: A Decision Boundary-Aware Signature for GNNs Towards Model Extraction Defense

Bolin Shen, Md Shamim Seraj, Zhan Cheng et al. · Florida State University · University of Wisconsin

Defends GNN models against extraction attacks via decision boundary-aware signatures enabling ownership verification at both embedding and label levels

Model Theft graph
PDF Code
defense arXiv Feb 23, 2026

CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks

Bolin Shen, Zhan Cheng, Neil Zhenqiang Gong et al. · Florida State University · University of Wisconsin +2 more

Certifies DNN ownership against model extraction using mutual information similarity with theoretical verification guarantees

Model Theft vision nlp
PDF Code
benchmark arXiv Feb 10, 2026

Benchmarking Knowledge-Extraction Attack and Defense on Retrieval-Augmented Generation

Zhisheng Qi, Utkarsh Sahu, Li Ma et al. · University of Oregon · Michigan State University +6 more

First systematic benchmark comparing knowledge-extraction attacks and defenses on RAG systems under unified evaluation protocols

Sensitive Information Disclosure nlp
PDF Code
benchmark arXiv Dec 4, 2025

Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs

Jinbo Liu, Defu Cao, Yifei Wei et al. · University of Southern California · Florida State University +1 more

Benchmarks PII leakage in multi-agent LLM systems across six topologies, showing that dense connectivity and proximity amplify adversarial memory extraction

Sensitive Information Disclosure nlp
1 citation 1 influential PDF
attack arXiv Nov 14, 2025

A Systematic Study of Model Extraction Attacks on Graph Foundation Models

Haoyan Xu, Ruizhi Qian, Jiate Li et al. · University of Southern California · Florida State University +2 more

Systematically extracts Graph Foundation Models via black-box embedding regression, cloning victim models at 0.07% of original training cost

Model Theft graph multimodal
PDF
defense arXiv Oct 27, 2025

PRO: Enabling Precise and Robust Text Watermark for Open-Source LLMs

Jiaqi Xue, Yifei Zhao, Mansour Al Ghanim et al. · University of Central Florida · Florida State University +1 more

Embeds robust text watermarks into open-source LLM weights so that AI-generated content remains detectable even after fine-tuning or model merging

Output Integrity Attack nlp
PDF
defense arXiv Oct 24, 2025

DictPFL: Efficient and Private Federated Learning on Encrypted Gradients

Jiaqi Xue, Mayank Kumar, Yuzhang Shang et al. · University of Central Florida · Florida State University +2 more

Defends federated learning against gradient inversion attacks via efficient homomorphic encryption, with only 2× the overhead of plaintext FL

Model Inversion Attack federated-learning
1 citation PDF Code
survey arXiv Aug 27, 2025

Intellectual Property in Graph-Based Machine Learning as a Service: Attacks and Defenses

Lincan Li, Bolin Shen, Chenxi Zhao et al. · Florida State University · Northeastern University +3 more

Surveys model theft, data reconstruction, and membership inference attacks and defenses for graph ML-as-a-service, with the open-source evaluation library PyGIP

Model Theft Model Inversion Attack Membership Inference Attack graph
PDF Code
survey arXiv Aug 20, 2025

A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives

Kaixiang Zhao, Lincan Li, Kaize Ding et al. · University of Notre Dame · Florida State University +3 more

Surveys model extraction attacks and defenses across MLaaS platforms, proposing a taxonomy of attack mechanisms and computing environments

Model Theft vision nlp tabular
PDF Code
attack arXiv Aug 20, 2025

Universal and Transferable Adversarial Attack on Large Language Models Using Exponentiated Gradient Descent

Sajib Biswas, Mao Nishino, Samuel Jacob Chacko et al. · Florida State University

Gradient-based adversarial suffix attack on LLMs using exponentiated gradient descent to bypass safety alignment with universal and transferable triggers

Input Manipulation Attack Prompt Injection nlp
PDF Code