Overview of PAN 2026: Voight-Kampff Generative AI Detection, Text Watermarking, Multi-Author Writing Style Analysis, Generative Plagiarism Detection, and Reasoning Trajectory Detection
Janek Bevendorff 1, Maik Fröbe 2, André Greiner-Petter 3, Andreas Jakoby 1, Maximilian Mayerl 4, Preslav Nakov 5, Henry Plutz 6, Martin Potthast 6,7,8, Benno Stein 1, Minh Ngoc Ta 5, Yuxia Wang 9, Eva Zangerle 10
2 Friedrich Schiller University Jena
4 University of Applied Sciences BFI
5 Mohamed bin Zayed University of Artificial Intelligence
8 ScaDS.AI
Published on arXiv: 2602.09147
Output Integrity Attack (OWASP ML Top 10, ML09)
Prompt Injection (OWASP LLM Top 10, LLM01)
Key Finding
PAN 2026 organizes five shared evaluation tasks spanning AI-text detection, watermarking robustness, and LLM reasoning safety, continuing a tradition of 1,100+ reproducible submissions since 2012.
The goal of the PAN workshop is to advance computational stylometry and text forensics through objective and reproducible evaluation. In 2026, we run the following five tasks: (1) Voight-Kampff Generative AI Detection, particularly in mixed and obfuscated authorship scenarios; (2) Text Watermarking, a new task that aims to develop new text watermarking schemes and to benchmark the robustness of existing ones; (3) Multi-Author Writing Style Analysis, a continued task that aims to identify the positions at which authorship changes; (4) Generative Plagiarism Detection, a continued task that targets source retrieval and text alignment between generated texts and their source documents; and (5) Reasoning Trajectory Detection, a new task concerned with source and safety detection for LLM-generated or human-written reasoning trajectories. As in previous years, PAN invites software submissions as easy-to-reproduce Docker containers for most tasks. Since PAN 2012, more than 1,100 submissions have been made this way via the TIRA experimentation platform.
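To make the watermarking task concrete, one widely used family of schemes biases generation toward a pseudo-random "green list" of tokens and later detects the watermark with a one-proportion z-test (the approach popularized by Kirchenbauer et al.). The sketch below is illustrative only, not the schemes actually benchmarked at PAN; the function names, the hash-based green-list assignment, and the parameter `gamma` are all assumptions:

```python
import hashlib
import math

def is_green(prev_token: str, token: str, gamma: float = 0.5) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the
    preceding token, so the split is reproducible at detection time."""
    h = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64 < gamma

def watermark_z_score(tokens: list[str], gamma: float = 0.5) -> float:
    """One-proportion z-test: how far the observed green-token fraction
    deviates from the fraction `gamma` expected without a watermark."""
    greens = sum(is_green(p, t, gamma) for p, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1  # number of scored bigrams
    return (greens - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

Unwatermarked text should score near zero, while text generated with a green-list bias yields a large positive z-score; paraphrasing attacks, as studied in the task, push the score back toward zero.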
Key Contributions
- Introduces a new Text Watermarking shared task for developing new watermarking schemes and benchmarking the robustness of existing ones against obfuscation
- Introduces Reasoning Trajectory Detection, a new task for source and safety detection of LLM-generated or human-written reasoning chains
- Continues established AI-generated content detection and multi-author stylometry tasks under reproducible Docker-based evaluation via the TIRA platform
🛡️ Threat Analysis
Three of the five tasks directly target output integrity: Voight-Kampff Generative AI Detection, the new text watermarking robustness benchmark, and Generative Plagiarism Detection. All three concern authenticating AI-generated content, watermarking it, or detecting its provenance.
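The text-alignment subtask of plagiarism detection is commonly implemented with a seed-and-extend strategy: first find shared fingerprints between a source and a suspicious document, then merge nearby matches into passages. A minimal sketch of the seeding step, assuming exact character n-gram matches (the function name and parameters are illustrative, not a PAN reference implementation):

```python
def ngram_seeds(src: str, susp: str, n: int = 8) -> list[tuple[int, int]]:
    """Return (src_offset, susp_offset) pairs where the two texts share
    an identical character n-gram -- the 'seed' step of text alignment."""
    # Index every n-gram of the source by its start offset.
    index: dict[str, list[int]] = {}
    for i in range(len(src) - n + 1):
        index.setdefault(src[i:i + n], []).append(i)
    # Look up each n-gram of the suspicious text in that index.
    seeds = []
    for j in range(len(susp) - n + 1):
        for i in index.get(susp[j:j + n], []):
            seeds.append((i, j))
    return seeds
```

A full aligner would then cluster adjacent seeds into contiguous passages and filter short spurious matches; for generated text, seeding is often done on token or semantic fingerprints rather than raw characters.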