📄 Abstract
Content watermarking is an important tool for the authentication and
copyright protection of digital media. However, it is unclear whether existing
watermarks are robust against adversarial attacks. We present the winning
solution to the NeurIPS 2024 Erasing the Invisible challenge, which
stress-tests watermark robustness under varying degrees of adversary knowledge.
The challenge consisted of two tracks: a black-box track and a beige-box track,
depending on whether the adversary knows which watermarking method was used by
the provider. For the beige-box track, we leverage an adaptive VAE-based
evasion attack, with a test-time optimization and color-contrast restoration in
CIELAB space to preserve the image's quality. For the black-box track, we first
cluster images based on their artifacts in the spatial or frequency domain.
Then, to each cluster, we apply image-to-image diffusion models with controlled
noise injection and semantic priors from ChatGPT-generated captions, using
optimized parameter settings. Empirical evaluations demonstrate that our method
achieves near-perfect watermark removal (95.7%) with negligible
impact on the residual image's quality. We hope that our attacks inspire the
development of more robust image watermarking methods.
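
To make the color-contrast restoration step in CIELAB space more concrete, the following is a minimal sketch of one plausible approach: shifting and scaling each Lab channel of the attacked image so that its mean and standard deviation match those of the watermarked input. The abstract does not specify the actual procedure, so the statistics-matching rule, the function names (`rgb_to_lab`, `lab_to_rgb`, `restore_color_contrast`), and the flat list-of-pixels image format are all assumptions made for illustration.

```python
import math

# D65 reference white for the CIE XYZ <-> CIELAB conversion.
WHITE = (0.95047, 1.0, 1.08883)

def _srgb_to_linear(c):
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def _linear_to_srgb(c):
    c = min(max(c, 0.0), 1.0)  # clamp out-of-gamut values
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def _f(t):
    d = 6 / 29
    return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29

def _f_inv(t):
    d = 6 / 29
    return t ** 3 if t > d else 3 * d * d * (t - 4 / 29)

def rgb_to_lab(rgb):
    """Convert an sRGB pixel with components in [0, 1] to CIELAB (D65)."""
    r, g, b = (_srgb_to_linear(c) for c in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def lab_to_rgb(lab):
    """Convert a CIELAB pixel back to sRGB, clamping to [0, 1]."""
    L, a, b = lab
    fy = (L + 16) / 116
    fx = fy + a / 500
    fz = fy - b / 200
    x, y, z = (w * _f_inv(f) for f, w in zip((fx, fy, fz), WHITE))
    r = 3.2404542 * x - 1.5371385 * y - 0.4985314 * z
    g = -0.9692660 * x + 1.8760108 * y + 0.0415560 * z
    bl = 0.0556434 * x - 0.2040259 * y + 1.0572252 * z
    return tuple(_linear_to_srgb(c) for c in (r, g, bl))

def _mean_std(values):
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var)

def restore_color_contrast(attacked, reference):
    """Assumed restoration rule: match per-channel Lab mean and std of
    `attacked` to those of `reference`. Both are lists of sRGB pixels."""
    lab_att = [rgb_to_lab(p) for p in attacked]
    lab_ref = [rgb_to_lab(p) for p in reference]
    channels = []
    for ch in range(3):
        src = [p[ch] for p in lab_att]
        m_src, s_src = _mean_std(src)
        m_ref, s_ref = _mean_std([p[ch] for p in lab_ref])
        scale = s_ref / s_src if s_src > 1e-8 else 1.0
        channels.append([(v - m_src) * scale + m_ref for v in src])
    return [lab_to_rgb(lab) for lab in zip(*channels)]

# Usage: a washed-out copy of a tiny four-pixel "image" is pulled back
# toward the color and contrast statistics of the original.
reference = [(0.2, 0.3, 0.4), (0.6, 0.5, 0.4), (0.5, 0.6, 0.7), (0.4, 0.4, 0.3)]
attacked = [tuple(0.5 + 0.8 * (c - 0.5) + 0.02 for c in p) for p in reference]
restored = restore_color_contrast(attacked, reference)
```

Working in CIELAB rather than RGB keeps lightness (L) separate from the two chromatic axes (a, b), so contrast and color shifts can be corrected independently. A real pipeline would likely operate on full image arrays rather than pixel lists.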