ML Security Papers

Latest papers

5 papers

defense · arXiv · Mar 9, 2026

Where, What, Why: Toward Explainable 3D-GS Watermarking

Mingshu Cai, Jiajun Li, Osamu Yoshie et al. · Waseda University · Southeast University +1 more

Watermarks 3D Gaussian Splatting assets with explainable carrier selection, improving visual quality by +0.83 dB and bit-accuracy by +1.24% over prior methods

Output Integrity Attack · vision, generative


defense · arXiv · Jan 5, 2026

FAROS: Robust Federated Learning with Adaptive Scaling against Backdoor Attacks

Chenyu Hu, Qiming Hu, Sinan Chen et al. · Southwest University · University of Electronic Science and Technology of China +3 more

Defends federated learning against adaptive backdoor attacks using dynamic gradient scaling and robust core-set aggregation

Model Poisoning · federated-learning, vision


attack · arXiv · Nov 13, 2025

Trapped by Their Own Light: Deployable and Stealth Retroreflective Patch Attacks on Traffic Sign Recognition Systems

Go Tsuruoka, Takami Sato, Qi Alfred Chen et al. · Waseda University · University of California +2 more

Physical retroreflective adversarial patch on traffic signs achieves 93.4% attack success while remaining visually indistinguishable from benign signs

Input Manipulation Attack · vision


benchmark · arXiv · Oct 18, 2025

OpenLVLM-MIA: A Controlled Benchmark Revealing the Limits of Membership Inference Attacks on Large Vision-Language Models

Ryoto Miyamoto, Xin Fan, Fuyuko Kido et al. · Waseda University · Hitotsubashi University

Controlled benchmark exposes that prior MIA successes on VLMs stem from distributional bias, not true membership detection

Membership Inference Attack · vision, multimodal

1 citation

attack · arXiv · Sep 6, 2025

Yours or Mine? Overwriting Attacks Against Neural Audio Watermarking

Lingfeng Yao, Chenpei Huang, Shengyao Wang et al. · University of Houston · Waseda University +3 more

Overwriting attacks replace legitimate audio watermarks with forged ones, achieving ~100% success across white-, gray-, and black-box threat models

Output Integrity Attack · audio, generative


