Word-Anchored Temporal Forgery Localization

Current temporal forgery localization (TFL) approaches typically rely on temporal boundary regression or continuous frame-level anomaly detection paradigms to derive candidate forgery proposals. However, they suffer not only from feature granularity misalignment but also from costly computation. To address these issues, we propose word-anchored temporal forgery localization (WAFL), a novel paradigm that shifts the TFL task from temporal regression and continuous localization to discrete word-level binary classification. Specifically, we first analyze the essence of temporal forgeries and identify the minimum meaningful forgery units, word tokens, and then align data preprocessing with the natural linguistic boundaries of speech. To adapt powerful pre-trained foundation backbones for feature extraction, we introduce the forensic feature realignment (FFR) module, mapping representations from the pre-trained semantic space to a discriminative forensic manifold. This allows subsequent lightweight linear classifiers to efficiently perform binary classification and accomplish the TFL task. Furthermore, to overcome the extreme class imbalance inherent to forgery detection, we design the artifact-centric asymmetric (ACA) loss, which breaks the standard precision-recall trade-off by dynamically suppressing overwhelming authentic gradients while asymmetrically prioritizing subtle forensic artifacts. Extensive experiments demonstrate that WAFL significantly outperforms state-of-the-art approaches in localization performance under both in- and cross-dataset settings, while requiring substantially fewer learnable parameters and operating at high computational efficiency.
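The word-anchored idea can be illustrated with a small sketch of the final step: turning per-word binary decisions back into temporal forgery segments. This is not the authors' code. The `Word` record, the `words_to_segments` helper, the 0.5 threshold, and the rule of merging consecutive flagged words are all hypothetical choices made for illustration, and the word timestamps are assumed to come from some upstream speech alignment step.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One word token with its time span (hypothetical record layout)."""
    text: str
    start: float   # start time in seconds
    end: float     # end time in seconds
    p_fake: float  # classifier probability that this word is forged


def words_to_segments(words: List[Word],
                      threshold: float = 0.5) -> List[Tuple[float, float]]:
    """Merge runs of consecutive flagged words into (start, end) segments.

    Assumption: a word is flagged when p_fake >= threshold, and only
    directly adjacent flagged words are merged into one segment.
    """
    segments: List[Tuple[float, float]] = []
    prev_flagged = False
    for w in words:
        flagged = w.p_fake >= threshold
        if flagged and prev_flagged:
            # Extend the segment opened by the previous flagged word.
            seg_start, _ = segments[-1]
            segments[-1] = (seg_start, w.end)
        elif flagged:
            # Start a new segment at this word's boundaries.
            segments.append((w.start, w.end))
        prev_flagged = flagged
    return segments


if __name__ == "__main__":
    example = [
        Word("the", 0.00, 0.20, 0.03),
        Word("vote", 0.20, 0.55, 0.10),
        Word("was", 0.55, 0.70, 0.91),
        Word("cancelled", 0.70, 1.30, 0.97),
        Word("today", 1.30, 1.70, 0.08),
    ]
    print(words_to_segments(example))  # [(0.55, 1.3)]
```

Because segment boundaries are inherited from word boundaries, no boundary regression is needed in this sketch; localization quality depends entirely on the per-word classifier and the alignment.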

Key Contributions

Novel word-anchored paradigm reformulating temporal forgery localization as discrete word-token binary classification rather than continuous regression or frame-level anomaly detection
Forensic Feature Realignment (FFR) module that maps pre-trained semantic representations into a discriminative forensic manifold for lightweight linear classification
Artifact-Centric Asymmetric (ACA) loss that handles extreme class imbalance by penalizing fake samples strictly while suppressing easy authentic gradients
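The exact form of the ACA loss is not given in this summary, so the following is only a stand-in that shows the general mechanism described above: a generic asymmetric focal-style binary loss with a probability margin. The function name `asymmetric_bce` and the parameters `gamma_neg` and `margin` are assumptions, not the paper's notation. Fake words (label 1) receive the full log loss, while authentic words (label 0) are down-weighted so that easy, confidently authentic samples contribute almost nothing.

```python
import math
from typing import Sequence


def asymmetric_bce(probs: Sequence[float],
                   labels: Sequence[int],
                   gamma_neg: float = 4.0,
                   margin: float = 0.05,
                   eps: float = 1e-8) -> float:
    """Mean asymmetric focal-style loss over word-level predictions.

    probs  : predicted probability that each word is fake
    labels : 1 for fake, 0 for authentic
    Assumed (illustrative) hyperparameters: gamma_neg controls how strongly
    easy authentic samples are suppressed; margin zeroes out authentic
    samples whose fake probability is already below it.
    """
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        if y == 1:
            # Fake sample: plain log loss, no down-weighting.
            total += -math.log(max(p, eps))
        else:
            # Authentic sample: shift the probability, then apply a focal
            # weight so confident negatives have near-zero loss.
            p_shift = max(p - margin, 0.0)
            total += -(p_shift ** gamma_neg) * math.log(max(1.0 - p_shift, eps))
    return total / len(probs)
```

With these defaults, an authentic word predicted at 0.02 contributes zero loss, whereas a fake word predicted at 0.02 contributes the full `-log(0.02)`, which is the asymmetry the contribution describes in qualitative terms.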

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel forensic detection system (WAFL) for localizing AI-generated/manipulated content within videos — directly addresses output integrity by detecting deepfake audio-visual segments. The primary contribution is a new detection methodology (FFR module, ACA loss) for identifying tampered content, which is the core of ML09 (AI-generated content detection / deepfake detection).

Details

Domains

audio, vision, multimodal

Model Types

transformer, multimodal

Threat Tags

inference_time, digital

Datasets

LAV-DF, AV-Deepfake1M

Applications

2025 0 cit.
