NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution

We present NOTAI.AI, an explainable framework for machine-generated text detection that extends Fast-DetectGPT by integrating curvature-based signals with neural and stylometric features in a supervised setting. The system combines 17 interpretable features, including Conditional Probability Curvature, a ModernBERT detector score, readability metrics, and stylometric cues, within a gradient-boosted tree (XGBoost) meta-classifier to determine whether a text is human- or AI-generated. NOTAI.AI also applies Shapley Additive Explanations (SHAP) to provide both local and global feature-level attribution. These attributions are then translated into structured natural-language rationales through an LLM-based explanation layer, which enables user-facing interpretability. The system is deployed as an interactive web application that supports real-time analysis, visual feature inspection, and structured evidence presentation: users input text and inspect how neural and statistical signals influence the final decision. The source code and demo video are publicly available to support reproducibility.
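As a rough illustration of the curvature signal mentioned above, the sketch below computes Conditional Probability Curvature in the analytic form used by Fast-DetectGPT: the log-likelihood of the observed tokens is compared with the expected log-likelihood of alternative tokens at each position, then normalized by the standard deviation. The function name, the toy three-token vocabulary, and the choice to reuse the scoring distribution as the sampling distribution are assumptions made for this example; this is not NOTAI.AI's code, and a real detector would take these distributions from a language model.

```python
import math


def conditional_probability_curvature(tokens, scoring_dists, sampling_dists=None):
    """Analytic curvature sketch: (log p(x) - mu) / sigma.

    tokens         -- observed token at each position
    scoring_dists  -- per-position dict token -> probability under the scoring model
    sampling_dists -- per-position dict under the sampling model (assumed to be
                      the same as scoring_dists when omitted)
    """
    if sampling_dists is None:
        sampling_dists = scoring_dists

    log_likelihood = 0.0  # log p(x) of the observed text
    expected = 0.0        # sum over positions of E_q[log p]
    variance = 0.0        # sum over positions of Var_q[log p]

    for tok, p, q in zip(tokens, scoring_dists, sampling_dists):
        log_likelihood += math.log(p[tok])
        mean_j = sum(q[t] * math.log(p[t]) for t in q if q[t] > 0)
        second_j = sum(q[t] * math.log(p[t]) ** 2 for t in q if q[t] > 0)
        expected += mean_j
        variance += second_j - mean_j ** 2

    if variance <= 0:
        raise ValueError("curvature is undefined when the variance is zero")
    return (log_likelihood - expected) / math.sqrt(variance)


# Hypothetical toy distributions: the same next-token distribution at 3 positions.
toy_dist = {"a": 0.7, "b": 0.2, "c": 0.1}
dists = [toy_dist] * 3

likely_text = ["a", "a", "a"]    # always picks the most probable token
unlikely_text = ["c", "c", "c"]  # always picks the least probable token

high = conditional_probability_curvature(likely_text, dists)
low = conditional_probability_curvature(unlikely_text, dists)
```

In this toy setting the text built from high-probability tokens gets a positive curvature and the text built from low-probability tokens gets a negative one, which is the intuition behind using curvature as a detection feature.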

Key Contributions

Hybrid detection pipeline combining 17 interpretable features (Conditional Probability Curvature, ModernBERT score, readability/stylometric cues) in an XGBoost meta-classifier that outperforms each individual component on a balanced RAID benchmark
SHAP-based local and global feature attribution layer that translates model decisions into structured natural-language rationales via an LLM explanation module
Interactive web application enabling real-time AI-text detection with visual feature inspection and evidence presentation for end users
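To make the attribution idea concrete, here is a minimal sketch of exact Shapley values for a toy scoring function over three features. It enumerates every feature coalition and replaces "missing" features with a baseline value. The SHAP library uses faster estimators for tree models, so this shows the underlying definition rather than the method NOTAI.AI runs. The feature names, weights, and baseline values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values by enumerating all coalitions of features.

    predict  -- function mapping a dict of feature values to a score
    instance -- dict of feature values for the text being explained
    baseline -- dict of reference values used for features outside a coalition
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi


# Hypothetical linear stand-in for the meta-classifier's raw score.
def toy_score(f):
    return 0.6 * f["curvature"] + 1.5 * f["detector_score"] - 0.4 * f["type_token_ratio"]


instance = {"curvature": 2.0, "detector_score": 0.9, "type_token_ratio": 0.5}
baseline = {"curvature": 0.0, "detector_score": 0.5, "type_token_ratio": 0.6}

attributions = shapley_values(toy_score, instance, baseline)
```

For a linear score each attribution reduces to the weight times the feature's deviation from the baseline, and the attributions sum to the difference between the instance score and the baseline score. That additivity is what lets local attributions be presented as per-feature evidence.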

🛡️ Threat Analysis

Output Integrity Attack

The core contribution is detecting AI-generated text, a classic output integrity / content authenticity problem. The paper builds a detection system that identifies whether text was machine- or human-generated, extending Fast-DetectGPT with a novel ensemble of curvature, neural, and stylometric features.
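The summary does not enumerate the stylometric features, so the following is only a sketch of the kind of interpretable cues such an ensemble might include. The four features and their names are assumptions chosen for illustration, and the sentence and word splitting is deliberately naive.

```python
import re


def stylometric_features(text):
    """Compute a few simple, hypothetical stylometric cues from raw text."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[a-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    return {
        "mean_sentence_length": len(words) / len(sentences),
        "type_token_ratio": len(set(words)) / len(words),
        "mean_word_length": sum(len(w) for w in words) / len(words),
        "punctuation_rate": sum(ch in ",;:" for ch in text) / len(words),
    }


features = stylometric_features("The cat sat. The cat ran!")
```

Features like these are cheap to compute and easy to explain to end users, which is presumably why they pair well with a tree-based meta-classifier and feature attribution.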

Details

Domains

nlp

Model Types

transformer, traditional_ml, llm

Threat Tags

inference_time

Datasets

RAID

Applications

2025 0 cit.

Output Integrity Attack: 90%