benchmark 2025

Mixture of Detectors: A Compact View of Machine-Generated Text Detection

Lekkala Sai Teja, Annepaka Yadagiri, Arun Kumar Challa, Samatha Reddy Machireddy, Partha Pakray, Chukhu Chunka

0 citations · 49 references · arXiv

Published on arXiv

2509.22147

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Introduces BMAS English, a unified multi-task MGT detection benchmark enabling evaluation across binary classification, generator attribution, adversarial robustness against five evasion attack types, and sentence-level boundary detection.

Mixture of Detectors

Novel technique introduced


Large Language Models (LLMs) are often said to be on the verge of surpassing human creativity, a claim that needs careful consideration. Recent developments raise critical questions about the authenticity of human work and the preservation of human creativity and innovation. This paper investigates these issues by addressing machine-generated text detection across several scenarios: document-level binary classification and multiclass classification (generator attribution), sentence-level segmentation of human-AI collaborative text, and adversarial attacks aimed at reducing the detectability of machine-generated text. We introduce BMAS English, an English-language dataset covering four tasks: binary classification of human and machine text; multiclass classification, which not only identifies machine-generated text but also attempts to determine its generator; adversarial attacks, which are commonly used to evade detection; and sentence-level segmentation, which predicts the boundaries between human-written and machine-generated text. We believe this work addresses prior research in Machine-Generated Text Detection (MGTD) in a more meaningful way.


Key Contributions

  • BMAS English dataset covering four MGT detection tasks: binary classification, multiclass/generator attribution, adversarial attack robustness (five attack types), and sentence-level mixed-text boundary detection
  • Comprehensive comparative evaluation of multiple supervised detector models across all four scenarios within a unified framework
  • Mixed-text boundary detection task with precise word-index boundary annotations for three human-AI interleaving patterns
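To make the idea of word-index boundary annotations concrete, here is a minimal sketch that expands a list of boundary indices into one author label per word. The annotation format, the function name `labels_from_boundaries`, and the label strings are assumptions made for illustration; the dataset's actual schema and its three interleaving patterns are not described on this page.

```python
from typing import List


def labels_from_boundaries(
    text: str, boundaries: List[int], first: str = "human"
) -> List[str]:
    """Assign an author label to every whitespace-separated word.

    `boundaries` holds the word indices at which authorship switches.
    This is a hypothetical format; the real dataset schema may differ.
    """
    words = text.split()
    switch_points = set(boundaries)
    other = {"human": "machine", "machine": "human"}
    labels = []
    current = first
    for i in range(len(words)):
        # Flip the author when we reach an annotated boundary index.
        if i in switch_points:
            current = other[current]
        labels.append(current)
    return labels


# Illustrative text: the author changes at word index 5 ("and").
example = "I wrote this part myself and the model continued from here"
labels = labels_from_boundaries(example, boundaries=[5])
```

With a single boundary the text splits into one human span followed by one machine span; passing two or more indices yields alternating spans, which is one way mixed-authorship patterns could be encoded.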

🛡️ Threat Analysis

Output Integrity Attack

Core focus is AI-generated text detection (binary and multiclass/generator attribution), robustness of detectors against evasion techniques (synonym substitution, homoglyph replacement, paraphrasing), and mixed human-AI text boundary detection — all directly addressing output integrity and content provenance authentication.
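As an illustration of one evasion technique named above, the sketch below shows what a homoglyph replacement might look like: Latin letters are swapped for visually similar Cyrillic code points, so the text looks unchanged to a reader but differs at the character level. The mapping table and both function names are assumptions for this example, not the attack implementation used in the paper.

```python
# Hypothetical, deliberately small mapping from Latin letters to
# look-alike Cyrillic code points.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}


def homoglyph_attack(text: str, mapping: dict = HOMOGLYPHS) -> str:
    """Replace every mapped character with its look-alike."""
    return "".join(mapping.get(ch, ch) for ch in text)


def normalize_homoglyphs(text: str, mapping: dict = HOMOGLYPHS) -> str:
    """Undo the substitution, as a simple defensive preprocessing step."""
    reverse = {fake: real for real, fake in mapping.items()}
    return "".join(reverse.get(ch, ch) for ch in text)


original = "a compact space probe"
attacked = homoglyph_attack(original)
restored = normalize_homoglyphs(attacked)
```

Here `attacked` has the same length as `original` but different code points, which can change how a detector tokenizes the text; `normalize_homoglyphs` reverses only the substitutions listed in the table.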


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time
Datasets
BMAS English
Applications
ai-generated text detection, llm generator attribution, mixed-authorship text segmentation, academic integrity