Latest papers

2 papers
attack arXiv Jan 29, 2026

The Compliance Paradox: Semantic-Instruction Decoupling in Automated Academic Code Evaluation

Devanshu Sahoo, Manish Prasad, Vasudev Majhi et al. · BITS Pilani · Trustwise +1 more

Embeds adversarial directives in AST comment nodes to hijack LLM-based code graders, achieving >95% manipulation success across 9 SOTA models

Prompt Injection nlp
attack arXiv Dec 11, 2025

When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection

Devanshu Sahoo, Manish Prasad, Vasudev Majhi et al. · BITS Pilani · KIIT University

Indirect prompt injection via adversarial PDF manipulation flips LLM paper-review decisions from Reject to Accept with a success rate of up to 86%

Prompt Injection nlp
2 citations