ML Security Papers

attack arXiv Jan 29, 2026

Devanshu Sahoo, Manish Prasad, Vasudev Majhi et al. · BITS Pilani · Trustwise +1 more

Embeds adversarial directives in AST comment nodes to hijack LLM-based code graders, achieving >95% manipulation success across 9 SOTA models

Prompt Injection nlp

defense arXiv Dec 22, 2025

Akshaj Prashanth Rao, Advait Singh, Saumya Kumaar Saksena et al. · Birla Institute of Technology and Science · Trustwise

Lightweight TF-IDF + Linear SVM multi-stage pipeline defends LLMs against prompt injection and jailbreaks with 10x lower latency than ShieldGemma

Prompt Injection nlp

1 citation

Latest papers