tool 2025

PhantomLint: Principled Detection of Hidden LLM Prompts in Structured Documents

Toby Murray

0 citations

α

Published on arXiv

2508.17884

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

PhantomLint detects hidden LLM prompts across diverse document types and hiding methods with a false positive rate of approximately 0.092%, and successfully identified real-world hidden prompt injections targeting peer review and hiring pipelines.

PhantomLint

Novel technique introduced


Hidden LLM prompts have appeared in online documents with increasing frequency. Their goal is to trigger indirect prompt injection attacks while remaining undetected from human oversight, to manipulate LLM-powered automated document processing systems, against applications as diverse as résumé screeners through to academic peer review processes. Detecting hidden LLM prompts is therefore important for ensuring trust in AI-assisted human decision making. This paper presents the first principled approach to hidden LLM prompt detection in structured documents. We implement our approach in a prototype tool called PhantomLint. We evaluate PhantomLint against a corpus of 3,402 documents, including both PDF and HTML documents, and covering academic paper preprints, CVs, theses and more. We find that our approach is generally applicable against a wide range of methods for hiding LLM prompts from visual inspection, has a very low false positive rate (approx. 0.092%), is practically useful for detecting hidden LLM prompts in real documents, while achieving acceptable performance.


Key Contributions

  • First principled, format-agnostic approach to detecting hidden LLM prompts in structured documents (PDF and HTML) that is independent of the specific hiding technique used
  • PhantomLint open-source prototype tool with very low false positive rate (~0.092%) evaluated on 3,402 real-world documents including preprints, CVs, and theses
  • Empirical demonstration that hidden prompt injection is widespread in real documents, including academic preprints targeting peer review systems and CVs targeting résumé screeners

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_timedigital
Datasets
3,402 PDF and HTML documents (academic preprints, CVs, theses, blog posts)
Applications
résumé screeningacademic peer reviewdocument summarizationai-assisted document analysis