Neural Deepfake Detection with Factual Structure of Text

Authors: Wanjun Zhong, Duyu Tang, Zenan Xu, Ruize Wang, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin

Published: 2020-10-15 02:35:31+00:00

Comment: EMNLP 2020; 10 pages

AI Summary

This paper proposes FAST, a graph-based model for deepfake text detection that leverages the factual structure of documents. It represents the factual structure as an entity graph to learn sentence representations with a graph neural network, then composes them into a document representation for prediction. Experiments show FAST significantly improves detection accuracy over strong transformer-based baselines like RoBERTa by better capturing factual inconsistencies in machine-generated text.

Abstract

Deepfake detection, the task of automatically discriminating machine-generated text, is increasingly critical with recent advances in natural language generative models. Existing approaches to deepfake detection typically represent documents with coarse-grained representations. However, they struggle to capture factual structures of documents, which is a discriminative factor between machine-generated and human-written text according to our statistical analysis. To address this, we propose a graph-based model that utilizes the factual structure of a document for deepfake detection of text. Our approach represents the factual structure of a given document as an entity graph, which is further utilized to learn sentence representations with a graph neural network. Sentence representations are then composed to a document representation for making predictions, where consistent relations between neighboring sentences are sequentially modeled. Results of experiments on two public deepfake datasets show that our approach significantly improves strong base models built with RoBERTa. Model analysis further indicates that our model can distinguish the difference in the factual structure between machine-generated text and human-written text.


Key findings
The FAST model significantly outperforms strong transformer-based baselines, including RoBERTa, BERT, and XLNet, on both news-style and webtext-style deepfake text datasets. The model analysis indicates its ability to distinguish differences in factual structure between machine-generated and human-written text. Ablation studies confirm the individual contributions of factual structure modeling, external knowledge, sequential consistency tracking, and coherence scores to the improved detection accuracy.
Approach
The approach utilizes RoBERTa for contextual word representations, then constructs an entity graph for each document where nodes are entities and edges denote relevance. A multi-layer Graph Convolutional Network (GCN) learns graph-enhanced sentence representations, incorporating external Wikipedia-based entity knowledge. These sentence representations are aggregated into a document representation using an LSTM to track factual consistency and a pre-trained Next Sentence Prediction (NSP) model for coherence, leading to the final deepfake prediction.
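The core graph step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: random vectors stand in for RoBERTa entity features, the toy adjacency matrix and the two-sentence entity grouping are invented for the example, and only a single GCN propagation layer with symmetric normalization is shown.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN propagation: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = np.diag(a_hat.sum(axis=1) ** -0.5)
    norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt  # symmetric normalization
    return np.maximum(norm_adj @ feats @ weight, 0.0)

# Toy entity graph: 4 entity nodes; edges mark relevance between entities.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))    # stand-in for RoBERTa-based entity features
weight = rng.normal(size=(8, 8))   # learnable GCN weight (random here)

node_repr = gcn_layer(adj, feats, weight)

# Pool entity nodes back to their sentences to get graph-enhanced
# sentence representations (here: entities 0-1 in sentence 0, 2-3 in sentence 1).
sent_repr = np.stack([node_repr[:2].mean(axis=0), node_repr[2:].mean(axis=0)])
print(sent_repr.shape)  # (2, 8): one vector per sentence
```

In the full model these sentence vectors would then feed the LSTM that tracks factual consistency across neighboring sentences, combined with the NSP coherence score, before the final classification layer.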
Datasets
News-style GROVER-generated dataset, Webtext-style GPT2-generated dataset
Model(s)
FAST (the proposed graph-based detector built on RoBERTa)
Author countries
P.R. China