UnMarker: A Universal Attack on Defensive Image Watermarking

Authors: Andre Kassis, Urs Hengartner

Published: 2024-05-14 07:05:18+00:00

AI Summary

UnMarker is the first practical universal attack against defensive image watermarking, effectively removing watermarks from images without requiring detector feedback or knowledge of the watermarking scheme. It achieves this by employing novel adversarial optimizations to disrupt the spectral amplitudes of watermarked images.

Abstract

Reports regarding the misuse of Generative AI (GenAI) to create deepfakes are frequent. Defensive watermarking enables GenAI providers to hide fingerprints in their images and use them later for deepfake detection. Yet, its potential has not been fully explored. We present UnMarker -- the first practical universal attack on defensive watermarking. Unlike existing attacks, UnMarker requires no detector feedback, no unrealistic knowledge of the watermarking scheme or similar models, and no advanced denoising pipelines that may not be available. Instead, being the product of an in-depth analysis of the watermarking paradigm revealing that robust schemes must construct their watermarks in the spectral amplitudes, UnMarker employs two novel adversarial optimizations to disrupt the spectra of watermarked images, erasing the watermarks. Evaluations against SOTA schemes prove UnMarker's effectiveness. It not only defeats traditional schemes while retaining superior quality compared to existing attacks but also breaks semantic watermarks that alter an image's structure, reducing the best detection rate to 43% and rendering them useless. To our knowledge, UnMarker is the first practical attack on semantic watermarks, which have been deemed the future of defensive watermarking. Our findings show that defensive watermarking is not a viable defense against deepfakes, and we urge the community to explore alternatives.


Key findings
UnMarker effectively defeats seven state-of-the-art watermarking schemes, including semantic watermarks previously considered robust. Its effectiveness, combined with the failure of common mitigations, demonstrates that current defensive watermarking is not a viable defense against deepfakes.
Approach
The authors analyze the watermarking paradigm and identify spectral amplitudes as the universal carrier that robust watermarking schemes must use. UnMarker then applies two novel adversarial optimizations: one disrupts high-frequency amplitudes to remove non-semantic watermarks, and the other uses optimizable filters to disrupt low-frequency amplitudes to remove semantic watermarks.
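To make the idea of "disrupting spectral amplitudes in a frequency band" concrete, here is a toy sketch using NumPy. It is not UnMarker's method: the paper describes adversarial optimizations, whereas this example only rescales amplitudes by a fixed factor. The function names (`radial_frequency`, `scale_band`, `band_energy`), the cutoff of 0.25 cycles per pixel, and the gain of 0.5 are all assumptions chosen for illustration.

```python
import numpy as np


def radial_frequency(shape):
    """Distance of each FFT bin from the zero-frequency (DC) bin, in cycles per pixel."""
    fy = np.fft.fftfreq(shape[0])[:, None]  # vertical frequencies
    fx = np.fft.fftfreq(shape[1])[None, :]  # horizontal frequencies
    return np.sqrt(fy ** 2 + fx ** 2)


def scale_band(image, cutoff, gain, band="high"):
    """Multiply the spectral amplitudes in one band by `gain`; phases are kept as-is.

    Illustrative only: a fixed rescaling, not an optimized perturbation.
    """
    spectrum = np.fft.fft2(image)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    radius = radial_frequency(image.shape)
    mask = radius > cutoff if band == "high" else radius <= cutoff
    amplitude = np.where(mask, amplitude * gain, amplitude)
    # The radial mask is symmetric around DC, so the modified spectrum stays
    # conjugate-symmetric and the inverse transform is real up to rounding error.
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))


def band_energy(image, cutoff):
    """Return (low-band, high-band) sums of spectral amplitude."""
    amplitude = np.abs(np.fft.fft2(image))
    radius = radial_frequency(image.shape)
    return amplitude[radius <= cutoff].sum(), amplitude[radius > cutoff].sum()


# Hypothetical stand-in for a watermarked grayscale image.
rng = np.random.default_rng(0)
image = rng.random((32, 32))

attenuated = scale_band(image, cutoff=0.25, gain=0.5, band="high")
print("before (low, high):", band_energy(image, 0.25))
print("after  (low, high):", band_energy(attenuated, 0.25))
```

In this sketch the high band loses amplitude while the low band is left unchanged, which mirrors the split described above between high-frequency carriers (non-semantic watermarks) and low-frequency carriers (semantic watermarks). A real attack would also have to constrain the visible change to the image, which this example does not attempt.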
Datasets
LAION-5B, CelebA, FFHQ, COCO
Model(s)
UNKNOWN. The paper describes the attack approach but does not explicitly list models used for watermark detection.
Author countries
Canada