DFA-CON: A Contrastive Learning Approach for Detecting Copyright Infringement in DeepFake Art

Authors: Haroon Wahab, Hassan Ugail, Irfan Mehmood

Published: 2025-05-13 13:23:52+00:00

AI Summary

DFA-CON is a novel contrastive learning framework designed to detect copyright infringement in AI-generated art. It learns a discriminative representation space showing affinity between original and forged artworks, outperforming existing pretrained models on the DeepfakeArt Challenge benchmark.

Abstract

The recent proliferation of generative AI tools for visual content creation, particularly in the context of visual artworks, has raised serious concerns about copyright infringement and forgery. The large-scale datasets used to train these models often contain a mixture of copyrighted and non-copyrighted artworks. Given the tendency of generative models to memorize training patterns, they are susceptible to varying degrees of copyright violation. Building on the recently proposed DeepfakeArt Challenge benchmark, this work introduces DFA-CON, a contrastive learning framework designed to detect copyright-infringing or forged AI-generated art. DFA-CON learns a discriminative representation space that enforces affinity between original artworks and their forged counterparts. The model is trained across multiple attack types, including inpainting, style transfer, adversarial perturbation, and CutMix. Evaluation results demonstrate robust detection performance across most attack types, outperforming recent pretrained foundation models. Code and model checkpoints will be released publicly upon acceptance.


Key findings
DFA-CON significantly outperforms several pretrained foundation models in detecting copyright infringement across various attack types (inpainting, style transfer, adversarial perturbation). However, performance on CutMix attacks was lower than expected. Encoder-level representations proved most effective for the detection task.
Approach
DFA-CON uses a supervised contrastive learning approach. It trains a ResNet-50-based encoder and a projection head to produce embeddings for original and forged artworks. Using a supervised contrastive loss, the model learns to pull related images (originals and their forgeries) together in the embedding space while pushing unrelated images apart.
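To make the clustering objective concrete, the following is a minimal sketch of a generic supervised contrastive loss, written in plain Python for readability. It is not the authors' implementation: the function names, the temperature value of 0.1, and the labeling scheme (each original and its forgeries sharing one hypothetical `group_ids` entry) are assumptions made for illustration. In practice the embeddings would come from the encoder and projection head rather than being written by hand.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def supcon_loss(embeddings, group_ids, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative sketch).

    embeddings: list of equal-length float vectors, one per image.
    group_ids:  one label per image; images with the same label are positives.
                Assumption: an original and its forgeries share a label.
    temperature: softmax temperature; 0.1 is an arbitrary illustrative value.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n)
                     if p != i and group_ids[p] == group_ids[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled similarity between the anchor and every other sample.
        logits = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                  for a in range(n) if a != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values()))
        # Average negative log-probability of the positives for this anchor.
        total -= sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight pairs of 2-D embeddings.
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matched = supcon_loss(batch, [0, 0, 1, 1])     # labels agree with geometry
mismatched = supcon_loss(batch, [0, 1, 0, 1])  # labels contradict geometry
```

In this toy batch, `matched` comes out smaller than `mismatched`, because the loss is low only when samples sharing a label are already close in the embedding space. That is the property the training objective exploits to group originals with their forgeries.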
Datasets
DeepfakeArt Challenge benchmark dataset
Model(s)
ResNet-50 (encoder), Linear and MLP projection heads
Author countries
UK