📄 Abstract
Recent advances in computational pathology have led to the emergence of
numerous foundation models. These models typically rely on general-purpose
encoders with multi-instance learning for whole slide image (WSI)
classification or apply multimodal approaches to generate reports directly from
images. However, these models cannot emulate the diagnostic approach of
pathologists, who systematically examine slides at low magnification to obtain
an overview before progressively zooming in on suspicious regions to formulate
comprehensive diagnoses. Instead, existing models directly output final
diagnoses without revealing the underlying reasoning process. To address this
gap, we introduce CPathAgent, an innovative agent-based approach that mimics
pathologists' diagnostic workflow by autonomously navigating across WSIs based
on observed visual features, thereby generating substantially more transparent
and interpretable diagnostic summaries. To achieve this, we develop a
multi-stage training strategy that unifies patch-level, region-level, and
WSI-level capabilities within a single model, which is essential for
replicating how pathologists understand and reason across diverse image scales.
Additionally, we construct PathMMU-HR2, the first expert-validated benchmark
for large region analysis. This represents a critical intermediate scale
between patches and whole slides, reflecting a key clinical reality where
pathologists typically examine several key large regions rather than entire
slides at once. Extensive experiments demonstrate that CPathAgent consistently
outperforms existing approaches across benchmarks at three different image
scales, validating the effectiveness of our agent-based diagnostic approach and
highlighting a promising direction for computational pathology.
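The coarse-to-fine workflow described in the abstract (overview at low magnification, then progressive zoom into suspicious regions, with the path kept visible) can be illustrated with a toy sketch. This is not CPathAgent's implementation: the slide is a small square grid of numbers whose side is a power of two, `toy_score` is a hypothetical stand-in for a model's judgement of how suspicious a region looks, and `Step` and `navigate` are names invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    level: int    # 0 = overview; higher values are more zoomed in
    row: int      # top-left row of the viewed region
    col: int      # top-left column of the viewed region
    size: int     # side length of the viewed region, in grid cells
    score: float  # toy suspicion score for the viewed region


def toy_score(slide: List[List[float]], row: int, col: int, size: int) -> float:
    """Hypothetical suspicion score: mean value inside the square window."""
    total = sum(
        slide[r][c]
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    return total / (size * size)


def navigate(slide: List[List[float]], min_size: int = 2) -> List[Step]:
    """View the whole grid, then repeatedly zoom into the highest-scoring quadrant.

    Every view is appended to the returned trace, so the path that led to the
    final region stays inspectable.
    """
    row, col, size, level = 0, 0, len(slide), 0
    trace = [Step(level, row, col, size, toy_score(slide, row, col, size))]
    while size // 2 >= min_size:
        half = size // 2
        level += 1
        # Top-left corners of the four quadrants of the current view.
        candidates = [(row + dr, col + dc) for dr in (0, half) for dc in (0, half)]
        best_score, row, col = max(
            (toy_score(slide, r, c, half), r, c) for r, c in candidates
        )
        size = half
        trace.append(Step(level, row, col, size, best_score))
    return trace


# Toy 8x8 "slide" with one bright 2x2 patch standing in for a suspicious area.
slide = [[0.0] * 8 for _ in range(8)]
for r in (4, 5):
    for c in (0, 1):
        slide[r][c] = 1.0

trace = navigate(slide)
for step in trace:
    print(step)
```

The returned trace is the point of the sketch: it lists every region that was viewed and why it was chosen, which is the kind of transparency the abstract contrasts with models that output only a final diagnosis. A real system would replace `toy_score` with a learned model and the grid with a multi-resolution slide reader.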
Authors (8)
Yuxuan Sun
Yixuan Si
Chenglu Zhu
Kai Zhang
Zhongyi Shui
Bowen Ding
+2 more
Key Contributions
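CPathAgent is an agent-based model for pathology image analysis that mimics pathologists' diagnostic workflow: it autonomously navigates WSIs, starting at low magnification and zooming in on suspicious regions, and generates diagnostic summaries that expose its reasoning. A multi-stage training strategy unifies patch-level, region-level, and WSI-level capabilities in a single model. The authors also introduce PathMMU-HR2, an expert-validated benchmark for large-region analysis.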
Business Value
Interpretable diagnostic summaries could increase clinicians' trust in AI-assisted pathology. If the reported benchmark gains carry over to clinical practice, the approach may also speed up diagnosis and improve patient outcomes.