DeepFake Detection in Dyadic Video Calls using Point of Gaze Tracking
Odin Kohler, Rahul Vijaykumar, Masudul H. Imtiaz
Published on arXiv
2509.25503
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves 82% accuracy on a novel, author-created dataset using explainable gaze-based features; the authors report it as the first PoG-based deepfake detection method.
Point-of-Gaze (PoG) Deepfake Detection
Novel technique introduced
With recent advances in deepfake technology, it is now possible to generate convincing deepfakes in real time. Unfortunately, malicious actors have started to use this technology to perform real-time phishing attacks during video meetings. The nature of a video call gives access to what the deepfake is "seeing," that is, the screen displayed to the malicious actor. Combining this with the gaze estimated from the malicious actor's streamed video lets us estimate where on the screen the deepfake is looking: the point of gaze. Because the point of gaze during conversation is not random but serves as a subtle nonverbal communicator, it can be used to detect deepfakes, which are not capable of mimicking this nonverbal communication. This paper proposes a real-time deepfake detection method adapted to this genre of attack, utilizing previously unavailable biometric information. We built our model on explainable features selected after careful review of research on gaze patterns during dyadic conversations. We then tested our model on a novel dataset of our own creation, achieving an accuracy of 82%. This is the first reported method to use point-of-gaze tracking for deepfake detection.
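The geometric idea in the abstract, combining an estimated gaze direction with knowledge of the screen to obtain a point of gaze, can be illustrated with a ray-plane intersection. The following is a minimal sketch, not the authors' pipeline. It assumes a coordinate system with the camera at the origin at the top-centre of the screen, the screen lying in the plane z = 0, x pointing right, y pointing down, and the eye position and gaze direction already estimated by some upstream model. The function names `point_of_gaze` and `to_pixels` and all parameters are hypothetical.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def point_of_gaze(eye: Vec3, gaze_dir: Vec3) -> Optional[Tuple[float, float]]:
    """Intersect the gaze ray with the screen plane z = 0.

    eye      : eye position in cm, with z > 0 meaning "in front of the screen"
               (assumed coordinate convention, not taken from the paper).
    gaze_dir : gaze direction; its z component must be negative for the
               ray to travel toward the screen.
    Returns (x, y) on the screen plane in cm, or None if the ray never
    reaches the plane.
    """
    ex, ey, ez = eye
    dx, dy, dz = gaze_dir
    if ez <= 0 or dz >= 0:
        return None  # eye behind the screen or looking away from it
    t = -ez / dz  # ray parameter at which z reaches 0
    return (ex + t * dx, ey + t * dy)


def to_pixels(
    pog_cm: Tuple[float, float],
    screen_w_cm: float,
    screen_h_cm: float,
    res_w: int,
    res_h: int,
) -> Tuple[float, float]:
    """Map a screen-plane point in cm to pixel coordinates.

    Assumes the camera (origin) sits at the top-centre of the screen,
    so x = 0 maps to the horizontal middle and y = 0 to the top edge.
    """
    x_cm, y_cm = pog_cm
    px = (x_cm + screen_w_cm / 2) / screen_w_cm * res_w
    py = y_cm / screen_h_cm * res_h
    return (px, py)
```

For example, an eye 50 cm in front of the screen and 10 cm below the camera, looking straight ahead, would map to the centre of a 40 cm by 20 cm display at 1920x1080 under these assumptions. In practice the screen geometry of the remote party is unknown to the detector, so a real system would need calibration or further assumptions that this sketch does not cover.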
Key Contributions
- First reported deepfake detection method using point-of-gaze (PoG) tracking as a biometric feature, exploiting the inability of deepfakes to replicate natural gaze behavior during dyadic conversations
- End-to-end real-time detection pipeline tailored to the video-call deepfake phishing threat model, where access to the screen content enables PoG estimation
- Novel custom dataset of dyadic video calls with ground-truth deepfake labels for evaluation
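To make the notion of an explainable gaze feature concrete, the sketch below computes one plausible candidate: the fraction of point-of-gaze samples that land on the other participant's face region on screen. This is an illustrative assumption only. The summary does not list the features the authors selected, and the name `face_dwell_fraction`, the bounding-box representation, and the handling of missing samples are all hypothetical.

```python
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels


def face_dwell_fraction(
    pog_samples: Iterable[Optional[Point]], face_box: Box
) -> float:
    """Fraction of valid PoG samples that fall inside face_box.

    pog_samples : per-frame point-of-gaze estimates in pixels; None marks
                  frames where no estimate was available (hypothetical
                  convention), and those frames are ignored.
    face_box    : on-screen bounding box of the interlocutor's face.
    Returns 0.0 when there are no valid samples.
    """
    left, top, right, bottom = face_box
    valid = [p for p in pog_samples if p is not None]
    if not valid:
        return 0.0
    inside = sum(
        1 for x, y in valid if left <= x <= right and top <= y <= bottom
    )
    return inside / len(valid)
```

A feature of this kind is easy to inspect by hand, which is the sense in which it is explainable: a reviewer can check directly whether the measured share of on-face gaze looks like ordinary conversational behavior. Whether it separates real from deepfaked callers is an empirical question that this sketch does not answer.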
🛡️ Threat Analysis
Proposes a novel AI-generated content detection method specifically targeting real-time deepfake video in video calls; deepfake detection is canonical ML09 (output integrity / content authenticity).