defense 2025

MGT-Prism: Enhancing Domain Generalization for Machine-Generated Text Detection via Spectral Alignment

Shengchao Liu 1, Xiaoming Liu 1, Chengzhengxu Li 1, Zhaohan Zhang 2, Guoxin Ma 1, Yu Lan 1, Shuai Xiao 3

0 citations

Published on arXiv

2508.13768

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

MGT-Prism outperforms state-of-the-art baselines by an average of 0.90% accuracy and 0.92% F1 score across 11 test datasets in three domain-generalization scenarios

MGT-Prism

Novel technique introduced


Large Language Models (LLMs) increasingly generate fluent, coherent text that closely resembles human writing. Current detectors for Machine-Generated Text (MGT) perform well when trained and tested in the same domain but generalize poorly to unseen domains because of domain shift between data from different sources. In this work, we propose MGT-Prism, an MGT detection method that works in the frequency domain to improve domain generalization. Our key insight comes from analyzing text representations in the frequency domain, where we observe consistent spectral patterns across diverse domains but significant discrepancies in magnitude between MGT and human-written texts (HWTs). This observation motivates two components: a low-frequency domain filtering module that filters out the document-level features sensitive to domain shift, and a dynamic spectrum alignment strategy that extracts task-specific, domain-invariant features to improve the detector's domain generalization. Extensive experiments demonstrate that MGT-Prism outperforms state-of-the-art baselines by an average of 0.90% in accuracy and 0.92% in F1 score on 11 test datasets across three domain-generalization scenarios.


Key Contributions

  • Empirical observation that LLM-generated and human-written texts show consistent, discriminative discrepancies in spectral magnitude across diverse domains
  • Low-frequency domain filtering module that removes domain-sensitive document-level features to reduce domain shift
  • Dynamic spectrum alignment strategy that extracts task-specific, domain-invariant features for improved cross-domain MGT detection generalization
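
To make the frequency-domain idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the transform is a discrete Fourier transform taken along the token-position axis of a single feature channel, and that "low-frequency filtering" means zeroing the bins nearest the DC component. The function names (`dft`, `idft`, `magnitude_spectrum`, `remove_low_frequencies`) and the `cutoff` parameter are hypothetical, and the dynamic spectrum alignment step is not shown because the summary does not describe it in enough detail.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns a list of complex values."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def magnitude_spectrum(signal):
    """Magnitude of each frequency bin (the quantity compared between MGT and HWT)."""
    return [abs(c) for c in dft(signal)]


def remove_low_frequencies(signal, cutoff):
    """Zero every bin whose distance from the DC bin is <= cutoff, then invert.

    Assumption: slowly varying (low-frequency) components stand in for the
    document-level, domain-sensitive features mentioned in the summary.
    """
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # For real input, bins k and n - k carry the same absolute frequency.
        if min(k, n - k) <= cutoff:
            spectrum[k] = 0
    # The filtered spectrum stays conjugate-symmetric, so the result is real
    # up to floating-point error; keep only the real part.
    return [c.real for c in idft(spectrum)]


# Hypothetical feature channel: a constant offset of 5 plus a fast alternation.
channel = [6.0, 4.0] * 4
filtered = remove_low_frequencies(channel, cutoff=0)
```

In this toy input, the constant offset lives entirely in the DC bin, so filtering with `cutoff=0` removes it and leaves only the alternating component. A real detector would apply the same operation per channel to encoder hidden states, with a cutoff chosen empirically.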

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel machine-generated text detection method (MGT-Prism) using DFT-based spectral analysis. It directly addresses AI-generated content detection with a new forensic architecture rather than merely applying existing methods to a new domain.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time
Datasets
11 cross-domain MGT detection test datasets (3 domain-generalization scenarios)
Applications
machine-generated text detection, ai content authentication