TADA: Training-free Attribution and Out-of-Domain Detection of Audio Deepfakes

Authors: Adriana Stan, David Combei, Dan Oneata, Horia Cucu

Published: 2025-06-06 07:00:23+00:00

AI Summary

This paper introduces TADA, a training-free method for audio deepfake source attribution and out-of-domain detection. It leverages a pre-trained self-supervised learning model and k-Nearest Neighbors (kNN) to achieve high F1-scores for in-domain source attribution (0.93) and out-of-domain detection (0.84) across multiple datasets.

Abstract

Deepfake detection has gained significant attention across audio, text, and image modalities, with high accuracy in distinguishing real from fake. However, identifying the exact source--such as the system or model behind a deepfake--remains a less studied problem. In this paper, we take a significant step forward in audio deepfake model attribution or source tracing by proposing a training-free, green AI approach based entirely on k-Nearest Neighbors (kNN). Leveraging a pre-trained self-supervised learning (SSL) model, we show that grouping samples from the same generator is straightforward--we obtain a 0.93 F1-score across five deepfake datasets. The method also demonstrates strong out-of-domain (OOD) detection, effectively identifying samples from unseen models at an F1-score of 0.84. We further analyse these results in a multi-dimensional approach and provide additional insights. All code and data protocols used in this work are available in our open repository: https://github.com/adrianastan/tada/.


Key findings
TADA achieves a high F1-score of 0.93 for in-domain audio deepfake source attribution and 0.84 for out-of-domain detection. The results demonstrate the effectiveness of using pre-trained SSL models and kNN for this task, highlighting the potential for lightweight and training-free deepfake detection solutions.
Approach
TADA uses a pre-trained self-supervised learning (SSL) model to extract features from audio samples. These features are then used with a k-Nearest Neighbors (kNN) classifier to attribute the source of deepfakes and detect out-of-domain samples without any additional training.
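The pipeline above can be sketched in a few lines. Note this is an illustrative stand-in, not the authors' code: random vectors replace the w2v-bert-2.0 embeddings (in practice these would be extracted from audio), the generator names are made up, and the OOD distance threshold is an arbitrary illustrative value rather than one tuned in the paper.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Hypothetical stand-in for SSL embeddings: in the paper these come from
# w2v-bert-2.0; here each generator is a tight random cluster in 16-d space.
def fake_embeddings(center, n, dim=16):
    return center + 0.1 * rng.standard_normal((n, dim))

centers = {gen: rng.standard_normal(16) for gen in ["tts_A", "tts_B", "vc_C"]}
X_ref = np.vstack([fake_embeddings(c, 50) for c in centers.values()])
y_ref = np.repeat(list(centers), 50)

# Build a kNN index over the labelled reference pool -- no training step.
knn = NearestNeighbors(n_neighbors=5).fit(X_ref)

def attribute(x, ood_threshold=1.0):
    """Attribute a sample to a generator by majority vote over its k nearest
    neighbours; flag it as out-of-domain when the mean neighbour distance
    exceeds a threshold (threshold value here is purely illustrative)."""
    dist, idx = knn.kneighbors(x[None, :])
    if dist.mean() > ood_threshold:
        return "OOD"
    labels, counts = np.unique(y_ref[idx[0]], return_counts=True)
    return labels[counts.argmax()]

print(attribute(fake_embeddings(centers["tts_A"], 1)[0]))  # in-domain sample
print(attribute(rng.standard_normal(16) * 5))              # far-off sample
```

The key property the paper exploits is that samples from the same generator cluster tightly in the SSL embedding space, so a simple vote over nearest neighbours suffices for attribution, and neighbour distance doubles as an OOD score.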
Datasets
ASVspoof 2019, ASVspoof 2021, ASVspoof 5, TIMIT-TTS, Multi-Language Audio Anti-Spoofing Dataset (MLAAD) v5
Model(s)
w2v-bert-2.0 (pre-trained self-supervised learning model), k-Nearest Neighbors (kNN)
Author countries
Romania