defense 2026

Tutor-Student Reinforcement Learning: A Dynamic Curriculum for Robust Deepfake Detection

Zhanhe Lei 1, Zhongyuan Wang 1, Jikang Cheng 2, Baojin Huang 3, Yuhong Yang 1, Zhen Han 1, Chao Liang 1, Dengpan Ye 4

0 citations · The IEEE/CVF Conference on Com...


Published on arXiv: 2603.24139

Threat Category: Output Integrity Attack (OWASP ML Top 10 — ML09)

Key Finding: The adaptive curriculum improves the Student detector's generalization to unseen manipulation techniques compared with traditional training.

Novel Technique Introduced: TSRL


Standard supervised training for deepfake detection treats all samples with uniform importance, which can be suboptimal for learning robust and generalizable features. In this work, we propose a novel Tutor-Student Reinforcement Learning (TSRL) framework to dynamically optimize the training curriculum. Our method models the training process as a Markov Decision Process where a "Tutor" agent learns to guide a "Student" (the deepfake detector). The Tutor, implemented as a Proximal Policy Optimization (PPO) agent, observes a rich state representation for each training sample, encapsulating not only its visual features but also its historical learning dynamics, such as EMA loss and forgetting counts. Based on this state, the Tutor takes an action by assigning a continuous weight in [0, 1] to the sample's loss, thereby dynamically re-weighting the training batch. The Tutor is rewarded based on the Student's immediate performance change, specifically rewarding transitions from incorrect to correct predictions. This strategy encourages the Tutor to learn a curriculum that prioritizes high-value samples, such as hard-but-learnable examples, leading to a more efficient and effective training process. We demonstrate that this adaptive curriculum improves the Student's generalization capabilities against unseen manipulation techniques compared to traditional training methods. Code is available at https://github.com/wannac1/TSRL.
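The Tutor's action — one continuous weight per sample, used to re-weight the batch loss — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name, the normalization by the weight sum, and the example values are all assumptions.

```python
import numpy as np

def reweighted_batch_loss(per_sample_losses, tutor_weights):
    """Re-weight a batch of per-sample losses with Tutor-assigned weights.

    per_sample_losses: shape (B,), e.g. per-sample cross-entropy values
    tutor_weights:     shape (B,), one Tutor action (weight in [0, 1]) per sample
    """
    w = np.clip(np.asarray(tutor_weights, dtype=float), 0.0, 1.0)
    losses = np.asarray(per_sample_losses, dtype=float)
    # Normalize by the weight sum so the loss scale stays comparable to
    # uniform training regardless of how many samples the Tutor down-weights.
    return float((w * losses).sum() / max(w.sum(), 1e-8))

# Example: the Tutor down-weights an easy (low-loss) sample while keeping
# the harder samples near full weight.
batch_losses = [0.05, 1.2, 0.9]
weights = [0.1, 1.0, 0.8]
loss = reweighted_batch_loss(batch_losses, weights)
```

With uniform weights this reduces exactly to the ordinary batch mean, which makes the re-weighting a strict generalization of standard supervised training.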


Key Contributions

  • Tutor-Student RL framework where a PPO agent dynamically re-weights training samples based on learning dynamics
  • State representation incorporating visual features, EMA loss, and forgetting counts to identify high-value training samples
  • Improved generalization to unseen deepfake manipulation techniques compared to uniform supervised training
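The learning-dynamics part of the state (EMA loss and forgetting counts) and the transition-based reward can be sketched as below. This is an illustrative reconstruction under assumptions: the decay constant, the reward value of 1.0, and the class name are not taken from the paper, which only specifies that incorrect-to-correct transitions are rewarded.

```python
from dataclasses import dataclass

@dataclass
class SampleDynamics:
    """Per-sample learning-dynamics state tracked across training steps."""
    ema_loss: float = 0.0
    forgetting_count: int = 0   # times the sample flipped correct -> incorrect
    was_correct: bool = False

    def update(self, loss: float, correct: bool, decay: float = 0.9) -> float:
        """Update the dynamics for one observation and return the Tutor reward.

        Rewards transitions from incorrect to correct predictions, per the
        paper's reward rule; the exact reward magnitude is assumed here.
        """
        self.ema_loss = decay * self.ema_loss + (1.0 - decay) * loss
        reward = 0.0
        if not self.was_correct and correct:
            reward = 1.0                  # incorrect -> correct: reward the Tutor
        elif self.was_correct and not correct:
            self.forgetting_count += 1    # correct -> incorrect: forgetting event
        self.was_correct = correct
        return reward
```

A sample with a high EMA loss but few forgetting events is a plausible "hard-but-learnable" candidate, which is the kind of signal the Tutor's state representation is meant to expose.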

🛡️ Threat Analysis

Output Integrity Attack

Defends against output integrity attacks that rely on deepfake content by improving detection of AI-generated and manipulated media. Deepfake detection is a core ML09 concern (AI-generated content detection and output integrity).


Details

Domains: vision, generative
Model Types: CNN, GAN
Threat Tags: inference_time
Applications: deepfake detection, synthetic media detection