Benchmark · 2025

LLMTrace: A Corpus for Classification and Fine-Grained Localization of AI-Written Text

Irina Tolstykh, Aleksandra Tsybina, Sergey Yakubson, Maksim Kuprashevich

0 citations · 47 references · arXiv

Published on arXiv: 2509.21269

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

LLMTrace provides the first corpus with character-level AI-authorship annotations for mixed human-AI documents, enabling fine-grained interval localization of AI-generated segments across two languages.

LLMTrace

Novel technique introduced


The widespread use of human-like text from Large Language Models (LLMs) necessitates the development of robust detection systems. However, progress is limited by a critical lack of suitable training data; existing datasets are often generated with outdated models, are predominantly in English, and fail to address the increasingly common scenario of mixed human-AI authorship. Crucially, while some datasets address mixed authorship, none provide the character-level annotations required for the precise localization of AI-generated segments within a text. To address these gaps, we introduce LLMTrace, a new large-scale, bilingual (English and Russian) corpus for AI-generated text detection. Constructed using a diverse range of modern proprietary and open-source LLMs, our dataset is designed to support two key tasks: traditional full-text binary classification (human vs. AI) and the novel task of AI-generated interval detection, facilitated by character-level annotations. We believe LLMTrace will serve as a vital resource for training and evaluating the next generation of more nuanced and practical AI detection models. The project page (iitolstykh/LLMTrace) is available at https://sweetdream779.github.io/LLMTrace-info/.
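The character-level interval annotation described above can be pictured with a small sketch. The actual LLMTrace record schema is not given in this summary, so the field names (`text`, `ai_intervals`) and the helper below are hypothetical, chosen only to illustrate [start, end) character-offset annotations on a mixed human-AI document:

```python
# Hypothetical mixed-authorship record with character-level AI-span
# annotations; each interval is a [start, end) offset into `text`.
record = {
    "text": "I wrote this intro myself. The model continued with fluent prose.",
    "ai_intervals": [(27, 65)],  # characters 27..64 are AI-generated
}

def extract_ai_segments(record):
    """Return the AI-written substrings named by the interval annotations."""
    return [record["text"][s:e] for s, e in record["ai_intervals"]]

print(extract_ai_segments(record))
# → ['The model continued with fluent prose.']
```

Character offsets (rather than token indices) keep annotations tokenizer-independent, which matters when the same corpus is used to train models with different vocabularies.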


Key Contributions

  • Large-scale bilingual (English and Russian) corpus for AI-generated text detection covering modern proprietary and open-source LLMs
  • Character-level annotations enabling the novel task of AI-generated interval detection within mixed human-AI authored documents
  • Benchmark supporting both full-text binary classification and fine-grained localization evaluation tasks
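Fine-grained localization is naturally scored at the character level. The benchmark's official metric is not stated in this summary, so the character-level precision/recall/F1 below is one illustrative way to compare predicted AI intervals against gold annotations, not necessarily the paper's evaluation protocol:

```python
def char_prf(pred_intervals, gold_intervals, text_len):
    """Character-level precision/recall/F1 for AI-span localization.

    Intervals are [start, end) character offsets. Each character position
    is labeled AI or human, and the sets of AI-labeled positions from the
    prediction and the gold annotation are compared.
    """
    def to_char_set(intervals):
        chars = set()
        for s, e in intervals:
            chars.update(range(max(0, s), min(text_len, e)))
        return chars

    pred, gold = to_char_set(pred_intervals), to_char_set(gold_intervals)
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: the predicted span overlaps the gold AI span but starts late,
# so precision is perfect while recall is penalized.
p, r, f = char_prf([(30, 65)], [(27, 65)], text_len=66)
```

Set-based scoring handles fragmented predictions gracefully: several short predicted intervals covering one long gold span are credited for exactly the characters they get right.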

🛡️ Threat Analysis

Output Integrity Attack

Directly targets AI-generated content detection: the corpus supports training and evaluating systems that determine whether text was produced by an LLM, including the novel interval-level localization task of identifying which specific segments within a document are AI-authored. This is core output integrity and content provenance work.


Details

Domains
NLP
Model Types
LLM, Transformer
Threat Tags
inference_time
Datasets
LLMTrace
Applications
AI-generated text detection, mixed authorship detection, NLP content provenance