Defense · 2026

HyperPotter: Spell the Charm of High-Order Interactions in Audio Deepfake Detection

Qing Wen 1, Haohao Li 1, Zhongjie Ba 1,2, Peng Cheng 1,2, Miao He 1, Li Lu 1,2, Kui Ren 1,2,3

0 citations · 58 references · arXiv (Cornell University)

Published on arXiv

2602.05670

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Outperforms state-of-the-art methods by 13.96% on 4 challenging cross-domain audio deepfake detection datasets, demonstrating strong generalization to diverse attacks and speakers.

HyperPotter

Novel technique introduced


Advances in AIGC technologies have enabled the synthesis of highly realistic audio deepfakes capable of deceiving human auditory perception. Although numerous audio deepfake detection (ADD) methods have been developed, most rely on local temporal/spectral features or pairwise relations, overlooking high-order interactions (HOIs). HOIs capture discriminative patterns that emerge from multiple feature components beyond their individual contributions. We propose HyperPotter, a hypergraph-based framework that explicitly models these synergistic HOIs through clustering-based hyperedges with class-aware prototype initialization. Extensive experiments demonstrate that HyperPotter surpasses its baseline by an average relative gain of 22.15% across 11 datasets and outperforms state-of-the-art methods by 13.96% on 4 challenging cross-domain datasets, demonstrating superior generalization to diverse attacks and speakers.


Key Contributions

  • HyperPotter: a hypergraph-based framework that explicitly models high-order interactions (HOIs) among audio features for deepfake detection, going beyond pairwise relations
  • Clustering-based hyperedge construction with class-aware prototype initialization to capture synergistic multi-feature patterns
  • Achieves 22.15% average relative improvement over baseline across 11 datasets and 13.96% gain over SOTA on 4 cross-domain datasets
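The clustering-based hyperedge construction can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes per-class feature means as the class-aware prototype initialization and a k-means-style assignment loop, with each resulting cluster of nodes forming one hyperedge in an incidence matrix. All function names and parameters here are hypothetical.

```python
import numpy as np

def class_aware_prototypes(feats, labels, k_per_class, rng):
    """Hypothetical class-aware initialization: spread k prototypes
    around each class's mean feature vector."""
    protos = []
    for c in np.unique(labels):
        centre = feats[labels == c].mean(axis=0)
        protos.append(centre + 0.01 * rng.standard_normal((k_per_class, feats.shape[1])))
    return np.vstack(protos)

def build_hyperedges(feats, protos, n_iters=10):
    """Cluster features with k-means-style updates; each cluster of node
    indices becomes one hyperedge. Returns the incidence matrix H,
    where H[i, e] = 1 iff node i belongs to hyperedge e."""
    for _ in range(n_iters):
        # E-step: assign each node to its nearest prototype
        d = ((feats[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(axis=1)
        # M-step: move each prototype to its members' mean (skip empty clusters)
        for e in range(len(protos)):
            members = feats[assign == e]
            if len(members):
                protos[e] = members.mean(axis=0)
    H = np.zeros((len(feats), len(protos)))
    H[np.arange(len(feats)), assign] = 1.0
    return H

rng = np.random.default_rng(0)
# toy setup: 100 feature nodes, 16-dim, two classes (bona fide vs. spoof)
feats = rng.standard_normal((100, 16))
labels = np.repeat([0, 1], 50)
feats[labels == 1] += 2.0  # separate the classes slightly
protos = class_aware_prototypes(feats, labels, k_per_class=3, rng=rng)
H = build_hyperedges(feats, protos)
print(H.shape)  # (100, 6): one column per hyperedge
```

Because each hyperedge groups many nodes at once, a hypergraph convolution over `H` can mix information across the whole cluster in one step, which is how such a framework captures interactions beyond pairwise relations.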

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses detection of AI-generated audio content (audio deepfakes). This task is analogous to deepfake image detection, which is explicitly categorized under output integrity and content authenticity.


Details

Domains
audio
Model Types
gnn
Threat Tags
inference_time · digital
Applications
audio deepfake detection · synthetic speech detection