Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Human shape editing enables controllable transformation of a person's body
shape, such as thin, muscular, or overweight, while preserving pose, identity,
clothing, and background. Unlike human pose editing, which has advanced
rapidly, shape editing remains relatively under-explored. Current approaches
typically rely on 3D morphable models or image warping, often introducing
unrealistic body proportions, texture distortions, and background
inconsistencies due to alignment errors and deformations. A key limitation is
the lack of large-scale, publicly available datasets for training and
evaluating body shape manipulation methods. In this work, we introduce the
first large-scale dataset of 18,573 images across 1523 subjects, specifically
designed for controlled human shape editing. It features diverse variations in
body shape, including fat, muscular and thin, captured under consistent
identity, clothing, and background conditions. Using this dataset, we propose
Odo, an end-to-end diffusion-based method that enables realistic and intuitive
body reshaping guided by simple semantic attributes. Our approach combines a
frozen UNet that preserves fine-grained appearance and background details from
the input image with a ControlNet that guides shape transformation using target
SMPL depth maps. Extensive experiments demonstrate that our method outperforms
prior approaches, achieving per-vertex reconstruction errors as low as 7.5mm,
significantly lower than the 13.6mm observed in baseline methods, while
producing realistic results that accurately match the desired target shapes.