📄 Abstract
Gaussian Splatting (GS) has recently emerged as an efficient representation
for rendering 3D scenes from 2D images and has been extended to images, videos,
and dynamic 4D content. However, applying style transfer to GS-based
representations, especially beyond simple color changes, remains challenging.
In this work, we introduce CLIPGaussian, the first unified style transfer
framework that supports text- and image-guided stylization across multiple
modalities: 2D images, videos, 3D objects, and 4D scenes. Our method operates
directly on Gaussian primitives and integrates into existing GS pipelines as a
plug-in module, without requiring large generative models or retraining from
scratch. The CLIPGaussian approach enables joint optimization of color and
geometry in 3D and 4D settings, and achieves temporal coherence in videos,
while preserving the model size. We demonstrate superior style fidelity and
consistency across all tasks, validating CLIPGaussian as a universal and
efficient solution for multimodal style transfer.
Authors (6)
Kornel Howil
Joanna Waczyńska
Piotr Borycki
Tadeusz Dziarmaga
Marcin Mazur
Przemysław Spurek
Key Contributions
Introduces CLIPGaussian, the first unified framework for multimodal style transfer directly on Gaussian Splatting representations. It enables text- and image-guided stylization across 2D images, videos, 3D objects, and 4D scenes, and integrates into existing GS pipelines as a plug-in module, without requiring large generative models or retraining from scratch.
Business Value
Streamlines the creation of stylized 3D content and dynamic scenes for gaming, VR/AR, and visual effects: stylization plugs into existing GS pipelines, preserves model size, and does not require a large generative model.