ML Security Papers

Latest papers

3 papers

attack arXiv Jan 21, 2026

Muhammad Khalifa, Lajanugen Logeswaran, Jaekyeom Kim et al. · University of Michigan · LG AI Research +1 more

Crafted agent chain-of-thought reasoning inflates LLM/VLM judge false positives by up to 90% across 800 web-task trajectories

Prompt Injection nlp multimodal

1 citation · PDF

defense arXiv Nov 18, 2025

Sungik Choi, Hankook Lee, Moontae Lee · LG AI Research · University of Illinois Chicago +1 more

Training-free AI-generated image detector using Haar wavelet sensitivity and self-supervised model cropping robustness

Output Integrity Attack vision generative

1 citation · PDF

attack arXiv Aug 11, 2025

Yerin Hwang, Dongryeol Lee, Taegwan Kang et al. · Seoul National University · LG AI Research

Embeds Aristotelian persuasion techniques in responses to manipulate LLM judges into inflating scores on incorrect math solutions by up to 8%

Prompt Injection nlp