Abstract: In-context learning (ICL) offers a promising paradigm for universal medical
image analysis, enabling models to perform diverse image processing tasks
without retraining. However, current ICL models for medical imaging remain
limited in two critical aspects: they cannot simultaneously achieve
high-fidelity predictions and global anatomical understanding, and there is no
unified model trained across diverse medical imaging tasks (e.g., segmentation
and enhancement) and anatomical regions. As a result, the full potential of ICL
in medical imaging remains underexplored. We therefore present \textbf{Medverse}, a
universal ICL model for 3D medical imaging, trained on 22 datasets spanning diverse
tasks, including image segmentation, transformation, and enhancement, across
multiple organs, imaging modalities, and clinical centers. Medverse
employs a next-scale autoregressive in-context learning framework that
progressively refines predictions from coarse to fine, generating consistent,
full-resolution volumetric outputs and enabling multi-scale anatomical
awareness. We further propose a blockwise cross-attention module that
facilitates long-range interactions between context and target inputs while
preserving computational efficiency through spatial sparsity. Medverse is
extensively evaluated on a broad collection of held-out datasets covering
previously unseen clinical centers, organs, species, and imaging modalities.
Results demonstrate that Medverse substantially outperforms existing ICL
baselines and establishes a novel paradigm for in-context learning. Code and
model weights are publicly available at https://github.com/jiesihu/Medverse.
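
The abstract describes a next-scale autoregressive framework that refines predictions from coarse to fine until a full-resolution volume is produced. The sketch below illustrates that general idea only; the `model` callable, its signature, and the scale schedule are hypothetical placeholders, not the paper's actual implementation.

```python
# Illustrative sketch of coarse-to-fine, next-scale autoregressive refinement.
# `model` is a hypothetical callable: (context_pairs, target_at_scale, prev_pred) -> prediction.
import torch
import torch.nn.functional as F

def coarse_to_fine_predict(model, context_pairs, target, scales=(0.25, 0.5, 1.0)):
    """Refine a volumetric prediction across scales, conditioning each scale
    on the upsampled prediction from the previous (coarser) scale."""
    prev_pred = None
    for s in scales:
        size = [max(1, int(round(d * s))) for d in target.shape[2:]]
        tgt_s = F.interpolate(target, size=size, mode="trilinear", align_corners=False)
        if prev_pred is not None:
            # Carry the coarser prediction forward as conditioning for this scale.
            prev_pred = F.interpolate(prev_pred, size=size, mode="trilinear",
                                      align_corners=False)
        prev_pred = model(context_pairs, tgt_s, prev_pred)
    return prev_pred  # full-resolution volumetric output at the final scale
```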
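The blockwise cross-attention module is described as linking context and target inputs while remaining efficient through spatial sparsity. One plausible reading, sketched below under stated assumptions, restricts cross-attention to spatially corresponding blocks of the target and context feature volumes; the block size, class name, and tensor layout are assumptions rather than the published design.

```python
# Illustrative sketch (assumption, not the paper's module): cross-attention computed
# only between matching spatial blocks of target and context 3D features.
import torch
import torch.nn as nn

class BlockwiseCrossAttention(nn.Module):
    def __init__(self, dim, num_heads=4, block=4):
        super().__init__()
        self.block = block
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def _to_blocks(self, x):
        # (B, C, D, H, W) -> (B * n_blocks, block**3, C)
        B, C, D, H, W = x.shape
        b = self.block
        x = x.view(B, C, D // b, b, H // b, b, W // b, b)
        x = x.permute(0, 2, 4, 6, 3, 5, 7, 1).reshape(-1, b * b * b, C)
        return x

    def forward(self, target_feat, context_feat):
        # Both inputs: (B, C, D, H, W) with D, H, W divisible by the block size.
        B, C, D, H, W = target_feat.shape
        b = self.block
        q = self._to_blocks(target_feat)
        kv = self._to_blocks(context_feat)
        out, _ = self.attn(q, kv, kv)  # attention stays within matching blocks (sparse)
        out = out.view(B, D // b, H // b, W // b, b, b, b, C)
        out = out.permute(0, 7, 1, 4, 2, 5, 3, 6).reshape(B, C, D, H, W)
        return out
```

Compared with global attention over all voxels, restricting attention to matching blocks scales linearly in the number of blocks, which is one way the claimed computational efficiency could be obtained.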