Fair Deepfake Detectors Can Generalize

Authors: Harry Cheng, Ming-Hui Liu, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli

Published: 2025-07-03 14:10:02+00:00

AI Summary

This paper introduces DAID, a framework that addresses the often conflicting objectives of generalization and fairness in deepfake detection. DAID builds on a formally defined causal relationship between the two: once confounders such as data distribution and model capacity are controlled for, fairness interventions also improve generalization.

Abstract

Deepfake detection models face two critical challenges: generalization to unseen manipulations and demographic fairness among population groups. However, existing approaches often demonstrate that these two objectives are inherently conflicting, revealing a trade-off between them. In this paper, we, for the first time, uncover and formally define a causal relationship between fairness and generalization. Building on the back-door adjustment, we show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions. Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) Demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) Demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals. Across three cross-domain benchmarks, DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art detectors, validating both its theoretical foundation and practical effectiveness.
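
For reference, the back-door adjustment invoked in the abstract has the standard form below (generic notation, not necessarily the paper's): with X a fairness intervention, Y generalization performance, and Z the set of confounders (data distribution and model capacity) blocking all back-door paths,

```latex
P\bigl(Y \mid do(X)\bigr) = \sum_{z} P\bigl(Y \mid X, Z = z\bigr)\, P(Z = z)
```

Conditioning on Z and averaging over its distribution removes the confounding, so the effect of a fairness intervention on generalization can be estimated from observational quantities.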


Key findings

DAID consistently outperforms state-of-the-art detectors on three cross-domain benchmarks in both fairness and generalization. The average causal effect (ACE) of fairness interventions on generalization is positive and statistically significant. Ablation studies confirm the importance of both modules in DAID.
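
The ACE cited above is the standard interventional contrast (again in generic notation, not necessarily the paper's): with F the fairness intervention and Y the generalization outcome,

```latex
\mathrm{ACE} = \mathbb{E}\bigl[\,Y \mid do(F = 1)\,\bigr] - \mathbb{E}\bigl[\,Y \mid do(F = 0)\,\bigr]
```

A positive and statistically significant ACE means that intervening on fairness causally improves generalization, which is the paper's central claim.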

Approach

DAID uses a two-module approach: demographic-aware data rebalancing (inverse-propensity weighting and subgroup-wise feature normalization) to mitigate distributional bias, and demographic-agnostic feature aggregation (an alignment loss) to suppress sensitive-attribute signals. By controlling for the confounders, the framework improves fairness and generalization together.
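
Since the summary names the concrete mechanisms (inverse-propensity weighting, subgroup-wise normalization, an alignment loss), here is a minimal PyTorch sketch of what such a training step could look like. All names and the mean-matching form of the alignment loss are illustrative assumptions, not the authors' implementation.

```python
import torch

def inverse_propensity_weights(group_ids: torch.Tensor) -> torch.Tensor:
    """Weight each sample by 1 / P(subgroup), so rare demographic
    subgroups contribute as much to the loss as common ones."""
    groups, counts = torch.unique(group_ids, return_counts=True)
    propensity = counts.float() / group_ids.numel()      # P(subgroup)
    weights = torch.zeros(group_ids.numel())
    for g, p in zip(groups, propensity):
        weights[group_ids == g] = 1.0 / p
    return weights / weights.mean()                      # keep loss scale stable

def subgroup_normalize(feats: torch.Tensor, group_ids: torch.Tensor,
                       eps: float = 1e-5) -> torch.Tensor:
    """Standardize features within each demographic subgroup, neutralizing
    subgroup-specific shifts in the feature distribution."""
    out = feats.clone()
    for g in torch.unique(group_ids):
        mask = group_ids == g
        mu = feats[mask].mean(dim=0)
        sd = feats[mask].std(dim=0, unbiased=False)
        out[mask] = (feats[mask] - mu) / (sd + eps)
    return out

def alignment_loss(feats: torch.Tensor, group_ids: torch.Tensor) -> torch.Tensor:
    """One simple alignment objective: pull each subgroup's mean feature
    toward the global mean, suppressing sensitive-attribute signals."""
    global_mu = feats.mean(dim=0)
    loss = feats.new_zeros(())
    for g in torch.unique(group_ids):
        loss = loss + (feats[group_ids == g].mean(dim=0) - global_mu).pow(2).sum()
    return loss

# Hypothetical training step (backbone, head, lambda_align are placeholders):
# feats = subgroup_normalize(backbone(images), group_ids)
# per_sample = torch.nn.functional.binary_cross_entropy_with_logits(
#     head(feats).squeeze(1), labels.float(), reduction="none")
# loss = (inverse_propensity_weights(group_ids) * per_sample).mean() \
#        + lambda_align * alignment_loss(feats, group_ids)
```

The alignment term here only matches first moments across subgroups; the paper's loss may be more elaborate, but the role it plays, making the aggregated features insensitive to demographic attributes, is the same.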

Datasets

FaceForensics++, DFDC, DFD, Celeb-DF

Model(s)

Xception, EfficientNet, F3-Net, CADDM (used as backbones with DAID modules added)

Author countries

Singapore, China