tool 2026

NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution

Oleksandr Marchenko Breneur, Adelaide Danilov, Aria Nourbakhsh, Salima Lamsiyah

0 citations

Published on arXiv

2603.05617

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

The XGBoost ensemble over 17 heterogeneous features achieves higher detection performance than any single component model on a balanced RAID split.

NOTAI.AI

Novel technique introduced


We present NOTAI.AI, an explainable framework for machine-generated text detection that extends Fast-DetectGPT by integrating curvature-based signals with neural and stylometric features in a supervised setting. The system combines 17 interpretable features, including Conditional Probability Curvature, ModernBERT detector score, readability metrics, and stylometric cues, within a gradient-boosted tree (XGBoost) meta-classifier to determine whether a text is human- or AI-generated. Furthermore, NOTAI.AI applies Shapley Additive Explanations (SHAP) to provide both local and global feature-level attribution. These attributions are further translated into structured natural-language rationales through an LLM-based explanation layer, which enables user-facing interpretability. The system is deployed as an interactive web application that supports real-time analysis, visual feature inspection, and structured evidence presentation. A web interface allows users to input text and inspect how neural and statistical signals influence the final decision. The source code and demo video are publicly available to support reproducibility.
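The curvature signal mentioned above can be illustrated with a small sketch. It follows the analytic form of conditional probability curvature as commonly described for Fast-DetectGPT: the log-likelihood of the observed tokens is compared with its expectation under a sampling model and normalised by the standard deviation. The function name, the toy three-token vocabulary, and the use of one model for both scoring and sampling are assumptions made for illustration; the summary does not specify how NOTAI.AI computes this feature.

```python
import math

def conditional_probability_curvature(token_ids, score_logprobs, sample_probs):
    """Analytic curvature for one text (illustrative sketch, not the authors' code).

    token_ids:      observed token id at each position
    score_logprobs: per position, log p(v | context) for every vocabulary item
    sample_probs:   per position, q(v | context) for every vocabulary item
    """
    log_likelihood = 0.0
    mean_total = 0.0
    var_total = 0.0
    for tok, logp, q in zip(token_ids, score_logprobs, sample_probs):
        log_likelihood += logp[tok]
        # Expected log-probability and its variance under the sampling model.
        mean = sum(qv * lp for qv, lp in zip(q, logp))
        second_moment = sum(qv * lp * lp for qv, lp in zip(q, logp))
        mean_total += mean
        var_total += second_moment - mean * mean
    if var_total <= 0.0:
        return 0.0  # degenerate distributions carry no curvature signal
    return (log_likelihood - mean_total) / math.sqrt(var_total)

# Toy next-token distributions for a two-token text (assumed values).
probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
logprobs = [[math.log(p) for p in row] for row in probs]

# A text made of the most probable tokens versus one made of the least probable.
likely = conditional_probability_curvature([0, 0], logprobs, probs)
unlikely = conditional_probability_curvature([2, 2], logprobs, probs)
print(likely, unlikely)
```

In this toy setup the high-probability text scores above zero and the low-probability text below zero, which matches the intuition that machine-generated text tends to sit near local maxima of the model's conditional probability.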


Key Contributions

  • Hybrid detection pipeline combining 17 interpretable features (Conditional Probability Curvature, ModernBERT score, readability/stylometric cues) in an XGBoost meta-classifier that outperforms each individual component on a balanced RAID benchmark
  • SHAP-based local and global feature attribution layer that translates model decisions into structured natural-language rationales via an LLM explanation module
  • Interactive web application enabling real-time AI-text detection with visual feature inspection and evidence presentation for end users
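To make the attribution step more concrete, the following sketch computes exact Shapley values by enumerating feature orderings and then renders them as short text rationales. Everything here is a simplified stand-in: `toy_score` is a hypothetical linear scorer rather than the XGBoost meta-classifier, the three feature names and their values are invented, and the template-based `rationale` function only imitates the role of the LLM explanation module. Exact enumeration is practical only for a handful of features; tree models are normally handled by dedicated algorithms such as those in the SHAP library.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    Features not yet added to the coalition keep their baseline value.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            score = predict(current)
            totals[name] += score - previous
            previous = score
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_score(features):
    # Hypothetical stand-in for the trained meta-classifier.
    return (0.5 * features["curvature"]
            + 0.3 * features["detector_score"]
            - 0.2 * features["type_token_ratio"])

# Invented feature values for one text and for a reference baseline.
instance = {"curvature": 2.0, "detector_score": 0.9, "type_token_ratio": 0.4}
baseline = {"curvature": 0.0, "detector_score": 0.5, "type_token_ratio": 0.6}

phi = shapley_values(toy_score, instance, baseline)

def rationale(attributions):
    """Template-based rationale, ordered by absolute contribution."""
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked:
        direction = "toward AI-generated" if value > 0 else "toward human-written"
        lines.append(f"{name} pushed the score {direction} ({value:+.2f})")
    return lines

report = rationale(phi)
for line in report:
    print(line)
```

The attributions sum to the difference between the score for the instance and the score for the baseline, which is the additivity property that makes per-feature explanations consistent with the final decision.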

🛡️ Threat Analysis

Output Integrity Attack

The core contribution is detecting AI-generated text, a classic output integrity and content authenticity problem. The paper builds a detection system that classifies text as machine- or human-generated, extending Fast-DetectGPT with a novel ensemble of curvature, neural, and stylometric features.
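As an illustration of what stylometric features can look like, here is a minimal sketch that extracts three common ones from raw text. The specific features (average sentence length, average word length, type-token ratio) and the simple regular-expression tokenisation are assumptions chosen for brevity; the summary does not list which stylometric or readability metrics the system actually uses.

```python
import re

def stylometric_features(text):
    """Compute a few simple stylometric features (illustrative choice only)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        return {"avg_sentence_length": 0.0,
                "avg_word_length": 0.0,
                "type_token_ratio": 0.0}
    return {
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Mean number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Share of distinct words, a rough measure of lexical diversity.
        "type_token_ratio": len(set(words)) / len(words),
    }

features = stylometric_features("The cat sat. The cat ran fast!")
print(features)
```

Values like these could sit alongside the curvature and neural detector scores as inputs to a tabular classifier.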


Details

Domains
nlp
Model Types
transformer, traditional_ml, llm
Threat Tags
inference_time
Datasets
RAID
Applications
ai-generated text detection, academic integrity, journalism, content authentication