ML Security Papers

Latest papers

12 papers

attack arXiv Jan 3, 2026 · Jan 2026

RefSR-Adv: Adversarial Attack on Reference-based Image Super-Resolution Models

Jiazhu Dai, Huihui Jiang · Shanghai University

Adversarial attack on reference-based super-resolution models that degrades output by perturbing only the auxiliary reference image

Input Manipulation Attack vision

PDF

defense arXiv Dec 14, 2025 · Dec 2025

CODE ACROSTIC: Robust Watermarking for Code Generation

Li Lin, Siyuan Xin, Yang Cao et al. · Institute of Science Tokyo · Shanghai University +1 more

Proposes sparse LLM code watermarking using high-entropy token cues to resist comment-removal attacks that defeat existing methods

Output Integrity Attack nlp

PDF

defense arXiv Nov 26, 2025 · Nov 2025

AuthenLoRA: Entangling Stylization with Imperceptible Watermarks for Copyright-Secure LoRA Adapters

Fangming Shi, Li Li, Kejiang Chen et al. · Shanghai University · University of Science and Technology of China +1 more

Embeds imperceptible traceable watermarks into LoRA adapter training so every generated image carries a provenance mark for copyright enforcement

Output Integrity Attack Model Theft vision generative

PDF Code

defense arXiv Nov 12, 2025 · Nov 2025

Tighter Truncated Rectangular Prism Approximation for RNN Robustness Verification

Xingqi Lin, Liangyu Chen, Min Wu et al. · Shanghai Key Laboratory of Trustworthy Computing · Shanghai University

Tighter linear relaxation for RNN robustness certification using truncated rectangular prism over-approximation of Hadamard products

Input Manipulation Attack vision nlp audio

PDF Code

attack arXiv Oct 23, 2025 · Oct 2025

BadGraph: A Backdoor Attack Against Latent Diffusion Model for Text-Guided Graph Generation

Liang Ye, Shengqin Chen, Jiazhu Dai · Shanghai University

Backdoor attack on text-guided graph diffusion models using textual triggers to covertly implant attacker-specified subgraphs in generated molecules

Model Poisoning graph generative

PDF Code

defense arXiv Oct 19, 2025 · Oct 2025

Rotation, Scale, and Translation Resilient Black-box Fingerprinting for Intellectual Property Protection of EaaS Models

Hongjie Zhang, Zhiqi Zhao, Hanzhou Wu et al. · Sichuan Normal University · Shanghai University +3 more

Fingerprints EaaS embedding models via point-cloud topology analysis to verify ownership, resilient to rotation, scale, and translation attacks

Model Theft vision nlp

PDF

defense arXiv Oct 16, 2025 · Oct 2025

An Information Asymmetry Game for Trigger-based DNN Model Watermarking

Chaoyue Huang, Gejian Zhao, Hanzhou Wu et al. · Shanghai University · Guizhou Normal University +2 more

Game-theoretic framework for robust DNN model watermarking that derives the attacker's optimal pruning budget and an exponential lower bound on WSR

Model Theft vision

PDF

attack arXiv Sep 23, 2025 · Sep 2025

Trigger Where It Hurts: Unveiling Hidden Backdoors through Sensitivity with Sensitron

Gejian Zhao, Hanzhou Wu, Xinpeng Zhang · Shanghai University

XAI-guided NLP backdoor attack using SHAP attribution to pinpoint vulnerable tokens and craft high-ASR triggers in language models

Model Poisoning nlp

PDF

defense International Conference on Co... Sep 16, 2025 · Sep 2025

Yet Another Watermark for Large Language Models

Siyuan Bao, Ying Shi, Zhiguang Yang et al. · Shanghai University · Guizhou Normal University

Embeds LLM watermarks via output-layer weight manipulation, detectable from generated text without model access for IP protection

Model Theft Output Integrity Attack nlp

PDF

attack arXiv Aug 27, 2025 · Aug 2025

The Art of Hide and Seek: Making Pickle-Based Model Supply Chain Poisoning Stealthy Again

Tong Liu, Guozhu Meng, Peng Zhou et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +2 more

Reveals 22 pickle model loading attack paths and 133 gadgets that bypass all SOTA supply chain scanners on HuggingFace

AI Supply Chain Attacks

PDF

defense arXiv Aug 15, 2025 · Aug 2025

Robust Convolution Neural ODEs via Contractivity-promoting regularization

Muhammad Zakwan, Liang Xu, Giancarlo Ferrari-Trecate · Inspire AG · ETH Zürich +3 more

Defends Convolutional Neural ODEs against FGSM/PGD attacks using contraction-theory regularization that bounds feature perturbation propagation

Input Manipulation Attack vision

PDF

defense International Symposium on Dig... Jan 2, 2025 · Jan 2025

A Game Between the Defender and the Attacker for Trigger-based Black-box Model Watermarking

Chaoyue Huang, Hanzhou Wu · Shanghai University

Game-theoretic framework derives optimal strategies for defenders and attackers in trigger-based black-box DNN model watermarking

Model Theft vision

1 citation PDF

