ML Security Papers

Latest papers

7 papers

benchmark arXiv Feb 8, 2026

Deepfake Synthesis vs. Detection: An Uneven Contest

Md. Tarek Hasan, Sanjay Saha, Shaojing Fan et al. · United International University · National University of Singapore +1 more

Benchmarks state-of-the-art deepfake detectors against modern synthesis methods, revealing critical detection gaps, including poor human performance

Output Integrity Attack vision generative

defense arXiv Dec 25, 2025

FUSE: Unifying Spectral and Semantic Cues for Robust AI-Generated Image Detection

Md. Zahid Hossain, Most. Sharmin Sultana Samu, Md. Kamrozzaman Bhuiyan et al. · Ahsanullah University of Science and Technology · BRAC University +2 more

Proposes FUSE, a hybrid FFT-spectral and CLIP-semantic detector for AI-generated images that achieves state-of-the-art results on the Chameleon benchmark

Output Integrity Attack vision generative

defense arXiv Dec 18, 2025

Beyond the Benchmark: Innovative Defenses Against Prompt Injection Attacks

Safwan Shaheer, G.M. Refatul Islam, Mohammad Rafid Hamid et al. · BRAC University

Defends LLaMA models against goal hijacking by iteratively generating CoT-seeded prompt defenses, reducing attack success rates

Prompt Injection nlp

defense arXiv Dec 14, 2025

Detecting Prompt Injection Attacks Against Application Using Classifiers

Safwan Shaheer, G. M. Refatul Islam, Mohammad Rafid Hamid et al. · BRAC University

Trains LSTM, FNN, Random Forest, and Naive Bayes classifiers to detect prompt injection attacks in LLM-integrated web applications

Prompt Injection nlp

defense arXiv Dec 14, 2025

The Laminar Flow Hypothesis: Detecting Jailbreaks via Semantic Turbulence in Large Language Models

Md. Hasib Ur Rahman · BRAC University

Detects LLM jailbreaks by measuring latent-space trajectory variance, exposing distinct conflict signatures in RLHF-aligned vs. SFT models

Prompt Injection nlp

attack arXiv Nov 13, 2025

destroR: Attacking Transfer Models with Obfuscous Examples to Discard Perplexity

Saadat Rafid Ahmed, Rubayet Shareen, Radoan Sharkar et al. · BRAC University

Proposes adversarial text attacks on NLP transfer models using obfuscated high-perplexity examples, including on Bangla-language text

Input Manipulation Attack nlp

survey arXiv Oct 7, 2025

A Survey on Agentic Security: Applications, Threats and Defenses

Asif Shahriar, Md Nafiu Rahman, Sadif Ahmed et al. · BRAC University · Qatar Computing Research Institute

First holistic survey of LLM agentic security covering 160+ papers across applications, threats, and defenses

Prompt Injection Excessive Agency Insecure Plugin Design nlp

8 citations
