TOPFORMER: Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles

Authors: Adaku Uchendu, Thai Le, Dongwon Lee

Published: 2023-09-22 15:32:49+00:00

AI Summary

This paper proposes TopFormer, a novel deepfake text detection model that integrates Topological Data Analysis (TDA) with a Transformer-based architecture (RoBERTa). TopFormer improves authorship attribution by capturing both contextual linguistic features (from the Transformer) and structural linguistic patterns (from TDA), yielding up to a 7% gain in Macro F1 over baseline models.

Abstract

Recent advances in Large Language Models (LLMs) have enabled the generation of open-ended, high-quality texts that are non-trivial to distinguish from human-written texts. We refer to such LLM-generated texts as deepfake texts. There are currently over 72K text generation models in the Hugging Face model repository. As such, users with malicious intent can easily use these open-sourced LLMs to generate harmful texts and dis/misinformation at scale. To mitigate this problem, a computational method to determine whether a given text is a deepfake text or not is desired, i.e., the Turing Test (TT). In this work, we investigate the more general version of the problem, known as Authorship Attribution (AA), in a multi-class setting: not only determining whether a given text is a deepfake text, but also pinpointing which LLM is the author. We propose TopFormer, which improves on existing AA solutions by capturing additional linguistic patterns in deepfake texts through a Topological Data Analysis (TDA) layer added to a Transformer-based model. We show the benefits of the TDA layer when dealing with imbalanced and multi-style datasets by extracting TDA features from the reshaped pooled_output of our backbone as input. The Transformer-based model captures contextual representations (i.e., semantic and syntactic linguistic features), while TDA captures the shape and structure of the data (i.e., linguistic structures). TopFormer outperforms all baselines on all three datasets, achieving up to a 7% increase in Macro F1 score. Our code and datasets are available at: https://github.com/AdaUchendu/topformer
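
To make the TDA step concrete, below is a minimal sketch of extracting topological features from a reshaped pooled output. It assumes the 768-dimensional pooled vector is reshaped into a 24 x 32 point cloud, persistent homology is computed with the ripser library, and the resulting diagrams are summarized with simple statistics; the helper name tda_features, the reshape, and the summary statistics are illustrative assumptions, not the paper's exact configuration.

import numpy as np
from ripser import ripser  # persistent homology library (assumed dependency)

def tda_features(pooled_output: np.ndarray, n_points: int = 24) -> np.ndarray:
    """Reshape a (hidden_dim,) pooled vector into a point cloud and summarize
    its persistence diagrams into a fixed-length feature vector."""
    cloud = pooled_output.reshape(n_points, -1)       # e.g., 768 -> (24, 32)
    diagrams = ripser(cloud, maxdim=1)["dgms"]        # H0 and H1 diagrams
    feats = []
    for dgm in diagrams:
        finite = dgm[np.isfinite(dgm[:, 1])]          # drop the infinite H0 bar
        lifetimes = finite[:, 1] - finite[:, 0]
        feats.extend([
            float(len(finite)),                                # number of features
            float(lifetimes.sum()) if len(lifetimes) else 0.0, # total persistence
            float(lifetimes.max()) if len(lifetimes) else 0.0, # longest lifetime
        ])
    return np.asarray(feats, dtype=np.float32)

# Example: a 768-dim RoBERTa pooled output -> 6-dim topological summary
vec = np.random.randn(768).astype(np.float32)
print(tda_features(vec).shape)  # (6,)

In practice, the simple diagram statistics above could be swapped for any fixed-length vectorization of persistence diagrams (e.g., persistence images or landscapes) without changing the overall pipeline.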


Key findings
TopFormer outperforms baseline models across the three evaluation datasets, achieving up to a 7% increase in Macro F1 score. The improvement is attributed to TDA's ability to capture structural linguistic features that complement the contextual representations learned by the Transformer. On homogeneous (single-style) datasets, TopFormer performs comparably to the baseline, indicating that the TDA layer helps most with heterogeneous, multi-style data.
Approach
TopFormer enhances a pre-trained RoBERTa model by adding a TDA layer. TDA features, extracted from a reshaped version of RoBERTa's pooled output, are concatenated with the model's regularized output before final classification. This combines contextual and structural linguistic information for improved deepfake text detection.
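
The following is a minimal end-to-end sketch of this architecture under stated assumptions: a roberta-base backbone, dropout as the regularization applied to the pooled output, and the tda_features() helper from the sketch above for the topological branch. Layer sizes and the fusion scheme are illustrative, not the authors' exact configuration.

import torch
import torch.nn as nn
from transformers import RobertaModel

class TopFormerSketch(nn.Module):
    """Hypothetical TopFormer-style model: RoBERTa contextual features fused
    with topological features computed from the reshaped pooled output."""

    def __init__(self, num_labels: int, tda_dim: int = 6):
        super().__init__()
        self.backbone = RobertaModel.from_pretrained("roberta-base")
        hidden = self.backbone.config.hidden_size         # 768 for roberta-base
        self.dropout = nn.Dropout(0.1)                     # regularized contextual output
        self.classifier = nn.Linear(hidden + tda_dim, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.pooler_output                         # (batch, hidden)

        # Topological branch: persistence summaries of each example's reshaped
        # pooled vector (non-differentiable, so computed outside the graph).
        # Reuses the tda_features() helper sketched after the abstract above.
        with torch.no_grad():
            tda = torch.stack([
                torch.from_numpy(tda_features(p.detach().cpu().numpy()))
                for p in pooled
            ]).to(pooled.device)

        # Concatenate contextual and topological features, then classify.
        fused = torch.cat([self.dropout(pooled), tda], dim=-1)
        return self.classifier(fused)                      # multi-class AA logits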
Datasets
OpenLLMText, SynSciPass, Mixset, TuringBench, M4
Model(s)
RoBERTa, BERT, GPT-who, Contra-BERT, Gaussian-RoBERTa, TopFormer (RoBERTa with TDA layer), TopFormer with BERT as backbone
Author countries
USA