CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes

Authors: Hao Huang, Yongtao Wang, Zhaoyu Chen, Yuze Zhang, Yuheng Li, Zhi Tang, Wei Chu, Jingdong Chen, Weisi Lin, Kai-Kuang Ma

Published: 2021-05-23 07:28:36+00:00

AI Summary

This paper proposes CMUA-Watermark, a cross-model universal adversarial watermark to combat deepfakes. It addresses the low transferability of existing adversarial watermarks by iteratively attacking multiple deepfake models and using a two-level perturbation fusion strategy.

Abstract

Malicious applications of deepfakes (i.e., technologies generating target facial attributes or entire faces from facial images) have posed a huge threat to individuals' reputation and security. To mitigate these threats, recent studies have proposed adversarial watermarks to combat deepfake models, leading them to generate distorted outputs. Despite achieving impressive results, these adversarial watermarks have low image-level and model-level transferability, meaning that they can protect only one facial image from one specific deepfake model. To address these issues, we propose a novel solution that can generate a Cross-Model Universal Adversarial Watermark (CMUA-Watermark), protecting a large number of facial images from multiple deepfake models. Specifically, we begin by proposing a cross-model universal attack pipeline that attacks multiple deepfake models iteratively. Then, we design a two-level perturbation fusion strategy to alleviate the conflict between the adversarial watermarks generated by different facial images and models. Moreover, we address the key problem in cross-model optimization with a heuristic approach to automatically find the suitable attack step sizes for different models, further weakening the model-level conflict. Finally, we introduce a more reasonable and comprehensive evaluation method to fully test the proposed method and compare it with existing ones. Extensive experimental results demonstrate that the proposed CMUA-Watermark can effectively distort the fake facial images generated by multiple deepfake models while achieving a better performance than existing methods.


Key findings
CMUA-Watermark effectively distorts fake facial images generated by multiple deepfake models. It achieves better performance than existing methods, demonstrated by high success rates in protecting images and significantly reducing the confidence scores of liveness detection systems. The ablation study shows that both perturbation fusion and automatic step size tuning are crucial for the method's success.
Approach
The approach uses a cross-model universal attack pipeline that iteratively attacks multiple deepfake models. A two-level perturbation fusion strategy (image-level and model-level) alleviates conflicts between the watermarks generated for different facial images and models, and a heuristic search based on TPE (Tree-structured Parzen Estimator) automatically finds a suitable attack step size for each model.
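The pipeline and fusion strategy described above can be sketched roughly as follows. This is a minimal, illustrative NumPy sketch, not the paper's implementation: the real method attacks actual deepfake models (StarGAN, AGGAN, AttGAN, HiSD) with PGD-style sign gradients, whereas here `surrogate_grad` is a hypothetical stand-in that returns a fixed random sign direction per model, and all shapes, names, and constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
EPS = 0.05            # assumed L-infinity budget for the universal watermark
IMG_SHAPE = (3, 8, 8) # toy image shape; real faces are e.g. (3, 256, 256)

def surrogate_grad(model_seed, image, watermark):
    """Hypothetical stand-in for the sign of the attack-loss gradient
    w.r.t. the watermarked input; a real attack would backpropagate
    through the deepfake model."""
    g = np.random.default_rng(model_seed).standard_normal(IMG_SHAPE)
    return np.sign(g)

def cmua_attack(images, model_seeds, step_sizes, n_iters=10):
    universal = np.zeros(IMG_SHAPE)  # the universal watermark being built
    for _ in range(n_iters):
        model_perts = []
        for seed, alpha in zip(model_seeds, step_sizes):
            # Image-level fusion: average the sign-gradient steps
            # produced by a batch of different facial images.
            batch_pert = np.mean(
                [alpha * surrogate_grad(seed, img, universal) for img in images],
                axis=0,
            )
            model_perts.append(batch_pert)
        # Model-level fusion: average the per-model perturbations,
        # then project back onto the perturbation budget.
        universal += np.mean(model_perts, axis=0)
        universal = np.clip(universal, -EPS, EPS)
    return universal

images = [rng.standard_normal(IMG_SHAPE) for _ in range(4)]
wm = cmua_attack(images, model_seeds=[1, 2, 3, 4], step_sizes=[0.01] * 4)
print(wm.shape, float(np.abs(wm).max()) <= EPS)
```

Averaging at both levels, rather than summing, keeps any single image or model from dominating the shared watermark, which is the intuition behind the two-level fusion.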
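The per-model step-size tuning can likewise be sketched as a search loop. The paper uses TPE; the stand-in below substitutes plain random search over per-model step sizes against a toy objective, purely to show the structure of the tuning loop. The objective, the search range, and the "ideal" values are all invented for illustration.

```python
import random

random.seed(0)
MODELS = ["StarGAN", "AGGAN", "AttGAN", "HiSD"]

def attack_success_rate(step_sizes):
    """Toy objective (assumption): peaks when each model receives its own
    'ideal' step size; the real objective would be the measured success
    rate of the watermark against the actual deepfake models."""
    ideal = {"StarGAN": 0.01, "AGGAN": 0.02, "AttGAN": 0.015, "HiSD": 0.005}
    return -sum((step_sizes[m] - ideal[m]) ** 2 for m in MODELS)

best, best_score = None, float("-inf")
for _ in range(200):
    # Sample one step size per model and keep the best-scoring candidate.
    candidate = {m: random.uniform(0.001, 0.05) for m in MODELS}
    score = attack_success_rate(candidate)
    if score > best_score:
        best, best_score = candidate, score

print({m: round(a, 3) for m, a in best.items()})
```

A TPE-based search (e.g. via a library such as Hyperopt) would replace the uniform sampling with a model of which step-size regions have scored well so far, but the outer loop structure is the same.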
Datasets
CelebA (128 images from its test set used to train the watermark, the rest for evaluation), the LFW dataset, and the Film100 dataset (100 facial images collected from films)
Model(s)
StarGAN, AGGAN, AttGAN, HiSD
Author countries
China, Singapore