tool 2026

DependencyAI: Detecting AI Generated Text through Dependency Parsing

Sara Ahmed, Tracy Hammond

0 citations · 26 references

Published on arXiv

2602.15514

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Dependency relation labels alone provide a robust and competitive signal for AI-generated text detection across monolingual, multi-generator, and multilingual settings, with systematic cross-domain overprediction observed for certain generators.

DependencyAI

Novel technique introduced


As large language models (LLMs) become increasingly prevalent, reliable methods for detecting AI-generated text are critical for mitigating potential risks. We introduce DependencyAI, a simple and interpretable approach for detecting AI-generated text using only the labels of linguistic dependency relations. Our method achieves competitive performance across monolingual, multi-generator, and multilingual settings. To increase interpretability, we analyze feature importance to reveal syntactic structures that distinguish AI-generated from human-written text. We also observe a systematic overprediction of certain models on unseen domains, suggesting that generator-specific writing styles may affect cross-domain generalization. Overall, our results demonstrate that dependency relations alone provide a robust signal for AI-generated text detection, establishing DependencyAI as a strong linguistically grounded, interpretable, and non-neural network baseline.


Key Contributions

  • DependencyAI: an interpretable AI-text detector using TF-IDF n-grams of dependency relation labels (discarding all lexical content) with a LightGBM classifier
  • Evaluation across monolingual, multi-generator, and multilingual settings on M4GT-Bench, establishing a competitive non-neural interpretable baseline
  • Feature importance analysis identifying which syntactic dependency structures (e.g., nsubj, ROOT bigrams) most distinguish AI-generated from human-written text
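
The feature-extraction step named in the first contribution can be sketched as follows. This is a minimal, standard-library illustration and not the authors' implementation: it assumes each text has already been reduced to its sequence of dependency relation labels by some parser (for example, the `dep_` attribute of spaCy tokens), and the function names `label_ngrams` and `tfidf_vectors`, the unigram-to-bigram range, and the smoothed IDF formula are all assumptions chosen for illustration. The summary above does not state the n-gram range or TF-IDF variant actually used.

```python
import math
from collections import Counter


def label_ngrams(labels, n_min=1, n_max=2):
    """Return all n-grams of a dependency-label sequence, joined by spaces.

    The n-gram range (1 to 2) is an illustrative assumption.
    """
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(labels) - n + 1):
            grams.append(" ".join(labels[i:i + n]))
    return grams


def tfidf_vectors(docs, n_min=1, n_max=2):
    """Compute sparse TF-IDF weights over dependency-label n-grams.

    docs: list of label sequences, one per text; no words are used.
    Returns (sorted vocabulary, list of {ngram: weight} dicts).
    """
    counts = [Counter(label_ngrams(d, n_min, n_max)) for d in docs]

    # Document frequency: in how many texts each n-gram occurs.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    n_docs = len(docs)
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1
    idf = {g: math.log((1 + n_docs) / (1 + df[g])) + 1.0 for g in df}

    vectors = []
    for c in counts:
        total = sum(c.values())
        if total == 0:
            vectors.append({})
            continue
        # Term frequency is the n-gram's share of all n-grams in the text.
        vectors.append({g: (k / total) * idf[g] for g, k in c.items()})
    return sorted(df), vectors


# Hypothetical label sequences for two short texts.
example_docs = [
    ["nsubj", "ROOT", "dobj"],
    ["nsubj", "ROOT"],
]
vocab, vectors = tfidf_vectors(example_docs)
```

Vectors of this kind would then be passed to a gradient-boosted tree classifier such as LightGBM, as the contribution describes. Because every feature is a named label n-gram, the classifier's feature importances map directly back to syntactic patterns.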

🛡️ Threat Analysis

Output Integrity Attack

DependencyAI is an AI-generated text detection system — it classifies whether text was produced by an LLM or a human, directly addressing content authenticity and output integrity. AI-generated text detection is an explicit ML09 use case.


Details

Domains
nlp
Model Types
llm, traditional_ml
Threat Tags
inference_time, digital
Datasets
M4GT-Bench
Applications
ai-generated text detection, academic integrity, content authenticity