Chameleon: On the Scene Diversity and Domain Variety of AI-Generated Videos Detection


Authors: Meiyu Zeng, Xingming Liao, Canyu Chen, Nankai Lin, Zhuowei Wang, Chong Chen, Aimin Yang

Published: 2025-03-09 13:58:43+00:00

AI Summary

This paper introduces Chameleon, a diverse dataset for AI-generated video detection, addressing limitations in existing datasets regarding diversity, complexity, and realism. The dataset is created using multiple generation tools and real-world video sources, encompassing scene switches and dynamic perspective changes.

Abstract

Artificial intelligence generated content (AIGC), known as DeepFakes, has emerged as a growing concern because it is being used as a tool for spreading disinformation. While much research exists on identifying AI-generated text and images, research on detecting AI-generated videos is limited. Existing datasets for AI-generated video detection are limited in diversity, complexity, and realism. To address these issues, this paper focuses on AI-generated video detection and constructs a diverse dataset named Chameleon. We generate videos with multiple generation tools and from various real video sources. At the same time, we preserve the videos' real-world complexity, including scene switches and dynamic perspective changes, and expand beyond face-centered detection to include human actions and environment generation. Our work bridges the gap between AI-generated dataset construction and real-world forensic needs, offering a valuable benchmark to counteract the evolving threats of AI-generated content.

Key findings

Deep learning methods (NPR, FreqNet, BNet) significantly outperformed large vision models at AI-generated video detection on the Chameleon dataset. EfficientNet B0 performed best on the backtracking task, which identifies the real-world source video. Performance varied across video categories and AI generation techniques.

Approach

The authors created the Chameleon dataset by generating videos using multiple AI tools and various real video sources, preserving real-world complexities like scene changes and dynamic perspectives. They then benchmark various deep learning models and large vision models on this dataset for AI-generated video detection.
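
A minimal sketch of how detection results might be broken down by video category or generation tool when benchmarking a detector. This is an illustration only, not the authors' evaluation code: the function name accuracy_by_group, the group names tool_a and tool_b, and the toy records are all hypothetical, and plain accuracy is assumed here as the metric.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy separately for each group.

    records: iterable of (group, true_label, predicted_label) tuples,
    where group could be a video category or a generation tool.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy predictions: 1 = AI-generated, 0 = real.
records = [
    ("tool_a", 1, 1),
    ("tool_a", 1, 0),
    ("tool_b", 1, 1),
    ("tool_b", 0, 0),
]
per_group = accuracy_by_group(records)
```

Reporting a score per group rather than a single aggregate makes it easier to see where a detector is weak, which a single overall number would hide.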

Datasets

Chameleon (created by the authors); FF++, DFDC, WildDeepfake, CoReD, and CDDB (mentioned for comparison)

Model(s)

NPR, FreqNet, BNet, GPT-4V, GPT-4o, Claude 3.5, Gemini-1.5-flash, ResNet-50, VGG16, DenseNet161, EfficientNet B0, ViT-B/16, Swin Transformer

Author countries

China
