Prompt-Induced Linguistic Fingerprints for LLM-Generated Fake News Detection

With the rapid development of large language models, the generation of fake news has become increasingly effortless, posing a growing societal threat and underscoring the urgent need for reliable detection methods. Early efforts to identify LLM-generated fake news have predominantly focused on the textual content itself; however, because much of that content may appear coherent and factually consistent, the subtle traces of falsification are often difficult to uncover. Through distributional divergence analysis, we uncover prompt-induced linguistic fingerprints: statistically distinct probability shifts between LLM-generated real and fake news when maliciously prompted. Based on this insight, we propose a novel method named Linguistic Fingerprints Extraction (LIFE). By reconstructing word-level probability distributions, LIFE can find discriminative patterns that facilitate the detection of LLM-generated fake news. To further amplify these fingerprint patterns, we also leverage key-fragment techniques that accentuate subtle linguistic differences, thereby improving detection reliability. Our experiments show that LIFE achieves state-of-the-art performance in detecting LLM-generated fake news and maintains high performance on human-written fake news. The code and data are available at https://anonymous.4open.science/r/LIFE-E86A.
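To make the idea of distributional divergence between word-level probability distributions concrete, here is a minimal sketch. It is not the authors' implementation: it assumes smoothed unigram frequencies stand in for the LLM-reconstructed word probabilities, it uses Jensen-Shannon divergence as one possible divergence measure, and the two tiny corpora and all function names are made up for illustration.

```python
import math
from collections import Counter


def word_distribution(texts, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary.

    Assumption: unigram counts are a stand-in for the word-level
    probabilities that an LLM would assign; the paper's reconstruction
    step is not reproduced here.
    """
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; symmetric and bounded by 1."""
    m = {w: 0.5 * (p[w] + q[w]) for w in p}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical toy corpora, invented for this sketch only.
real = ["officials confirmed the report on monday",
        "the agency confirmed the data"]
fake = ["shocking secret report exposed",
        "the shocking truth they hide"]

vocab = sorted({w for t in real + fake for w in t.lower().split()})
p_real = word_distribution(real, vocab)
p_fake = word_distribution(fake, vocab)
jsd = js_divergence(p_real, p_fake)
print(f"JS divergence between toy real and fake distributions: {jsd:.4f}")
```

A larger divergence means the two word distributions are easier to tell apart. In a realistic setting the probabilities would come from a language model rather than raw counts.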

Key Contributions

Discovery of prompt-induced linguistic fingerprints: statistically distinct word-level probability distribution shifts between LLM-generated real and fake news under malicious prompting
LIFE (Linguistic Fingerprints Extraction) method that reconstructs word-level probability distributions to surface discriminative patterns for fake news detection
Key-fragment technique that selectively amplifies subtle linguistic differences in critical content segments, improving detection reliability
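The key-fragment idea in the last contribution can be illustrated with a small sketch. The summary does not say how fragments are chosen, so the rule below is an assumption: given a per-word score gap (for example, the difference between the log-probabilities a word receives under two reconstructions), pick the contiguous window whose mean absolute gap is largest. The function name `select_key_fragment`, the window size `k`, and the example scores are all hypothetical.

```python
def select_key_fragment(words, gaps, k=3):
    """Return (start_index, fragment) for the k-word window with the
    largest mean absolute gap.

    Assumption: `gaps[i]` is some per-word score difference for
    `words[i]`; how LIFE actually scores and selects fragments is not
    described in this summary.
    """
    if len(words) != len(gaps):
        raise ValueError("words and gaps must have the same length")
    k = min(k, len(words))
    best_start, best_score = 0, float("-inf")
    for start in range(len(words) - k + 1):
        # Sum of absolute gaps; dividing by k would not change the argmax.
        score = sum(abs(g) for g in gaps[start:start + k])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, words[best_start:best_start + k]


# Hypothetical sentence and made-up gap scores, for illustration only.
words = "the agency said the cure was hidden by elites".split()
gaps = [0.1, 0.2, 0.1, 0.1, 1.5, 0.4, 2.0, 0.3, 1.8]

start, fragment = select_key_fragment(words, gaps, k=3)
print(start, " ".join(fragment))
```

Restricting later analysis to such a window is one way to keep a few strongly shifted words from being averaged out by the many neutral words around them.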

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel AI-generated text detection method that forensically identifies LLM-generated fake news via probability distribution divergence and linguistic fingerprint extraction. The method directly targets the output integrity and authenticity of LLM-generated content.

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

inference_time

Applications

2025 4 cit.

Similar Papers

SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis

Detecting LLM-Generated Text with Performance Guarantees

Multi-Hierarchical Feature Detection for Large Language Model Generated Text

SOGPTSpotter: Detecting ChatGPT-Generated Answers on Stack Overflow

On the Effectiveness of LLM-Specific Fine-Tuning for Detecting AI-Generated Text

EMMM, Explain Me My Model! Explainable Machine Generated Text Detection in Dialogues

NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution

Diversity Boosts AI-Generated Text Detection