Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Deepfakes pose significant security and privacy threats through malicious
facial manipulations. While robust watermarking can aid in authenticity
verification and source tracking, existing methods often lack the sufficient
robustness against Deepfake manipulations. Diffusion models have demonstrated
remarkable performance in image generation, enabling the seamless fusion of
watermark with image during generation. In this study, we propose a novel
robust watermarking framework based on diffusion model, called DiffMark. By
modifying the training and sampling scheme, we take the facial image and
watermark as conditions to guide the diffusion model to progressively denoise
and generate corresponding watermarked image. In the construction of facial
condition, we weight the facial image by a timestep-dependent factor that
gradually reduces the guidance intensity with the decrease of noise, thus
better adapting to the sampling process of diffusion model. To achieve the
fusion of watermark condition, we introduce a cross information fusion (CIF)
module that leverages a learnable embedding table to adaptively extract
watermark features and integrates them with image features via cross-attention.
To enhance the robustness of the watermark against Deepfake manipulations, we
integrate a frozen autoencoder during training phase to simulate Deepfake
manipulations. Additionally, we introduce Deepfake-resistant guidance that
employs specific Deepfake model to adversarially guide the diffusion sampling
process to generate more robust watermarked images. Experimental results
demonstrate the effectiveness of the proposed DiffMark on typical Deepfakes.
Our code will be available at https://github.com/vpsg-research/DiffMark.
Key Contributions
DiffMark proposes a novel diffusion-based robust watermarking framework designed to combat deepfakes. By integrating watermarks during the diffusion generation process with timestep-dependent guidance, it creates watermarked images that are more resilient to facial manipulations and deepfake attacks, aiding in authenticity verification and source tracking.
Business Value
Enhances trust in digital media by providing a robust mechanism to verify authenticity and deter malicious manipulation, crucial for journalism, legal evidence, and secure communication.