tool 2026

Automatic detection of Gen-AI texts: A comparative framework of neural models

Cristian Buttaro , Irene Amerini

0 citations

Published on arXiv

2603.18750

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Supervised neural detectors achieve more stable and robust performance than commercial tools across English, Italian, and domain-shifted evaluation scenarios


The rapid proliferation of Large Language Models has significantly increased the difficulty of distinguishing between human-written and AI-generated texts, raising critical issues across academic, editorial, and social domains. This paper investigates the problem of AI-generated text detection through the design, implementation, and comparative evaluation of multiple machine-learning-based detectors. Four neural architectures are developed and analyzed: a Multilayer Perceptron, a one-dimensional Convolutional Neural Network, a MobileNet-based CNN, and a Transformer model. The proposed models are benchmarked against widely used online detectors, including ZeroGPT, GPTZero, QuillBot, Originality.AI, Sapling, IsGen, Rephrase, and Writer. Experiments are conducted on the COLING Multilingual Dataset, considering both English and Italian configurations, as well as on an original thematic dataset focused on Art and Mental Health. Results show that supervised detectors achieve more stable and robust performance than commercial tools across different languages and domains, highlighting key strengths and limitations of current detection strategies.


Key Contributions

  • Four neural architectures (MLP, CNN 1D, MobileNet-based CNN, Transformer) for AI-generated text detection
  • Comparative benchmark against 8 commercial detectors (ZeroGPT, GPTZero, QuillBot, Originality.AI, Sapling, IsGen, Rephrase, Writer)
  • Multilingual and domain-specific evaluation showing supervised models outperform commercial tools in stability and robustness
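
The summary does not say how "stability and robustness" are measured. As a minimal sketch of one way such a comparison could be set up, the snippet below scores each detector per evaluation scenario and then reports the mean accuracy, the spread across scenarios, and the worst case. Everything here is an assumption for illustration: the detector names, the toy labels and predictions, the label convention (1 = AI-generated), and the choice of accuracy with population standard deviation as the stability proxy. None of it is taken from the paper.

```python
from statistics import mean, pstdev


def accuracy(y_true, y_pred):
    """Fraction of texts whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def stability_report(predictions, labels):
    """Summarize each detector across evaluation scenarios.

    predictions: {detector: {scenario: [0/1 predictions]}}
    labels:      {scenario: [0/1 ground truth]}, with 1 = AI-generated
                 (an assumed convention, not taken from the paper).
    Returns {detector: {"mean": ..., "spread": ..., "worst": ...}}.
    """
    report = {}
    for detector, by_scenario in predictions.items():
        scores = [accuracy(labels[s], preds) for s, preds in by_scenario.items()]
        report[detector] = {
            "mean": mean(scores),      # average accuracy over scenarios
            "spread": pstdev(scores),  # lower spread = more stable (assumed proxy)
            "worst": min(scores),      # worst-case scenario accuracy
        }
    return report


# Hypothetical toy data: three scenarios, two made-up detectors.
labels = {
    "english": [1, 0, 1, 0],
    "italian": [1, 1, 0, 0],
    "thematic": [0, 1, 0, 1],
}
predictions = {
    "detector_a": {
        "english": [1, 0, 1, 0],   # 4 of 4 correct
        "italian": [1, 1, 0, 1],   # 3 of 4 correct
        "thematic": [0, 1, 0, 0],  # 3 of 4 correct
    },
    "detector_b": {
        "english": [1, 0, 1, 0],   # 4 of 4 correct
        "italian": [1, 0, 1, 0],   # 2 of 4 correct
        "thematic": [1, 1, 1, 1],  # 2 of 4 correct
    },
}

report = stability_report(predictions, labels)
for name, stats in report.items():
    print(name, {k: round(v, 3) for k, v in stats.items()})
```

In this toy setup both detectors are perfect on the first scenario, but the second one degrades more on the other two, which shows up as a larger spread and a lower worst-case score. A real evaluation would likely add precision, recall, and F1, since accuracy alone can hide class imbalance.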

🛡️ Threat Analysis

Output Integrity Attack

The paper focuses on detecting AI-generated text to verify content authenticity and to distinguish human-written from LLM-generated outputs. This is output integrity and content provenance, the core of ML09.


Details

Domains
nlp
Model Types
cnn, transformer, traditional_ml
Threat Tags
inference_time
Datasets
COLING Multilingual Dataset, Art and Mental Health dataset (original)
Applications
ai-generated text detection, content authenticity verification, academic plagiarism detection