Abstract
Medical texts, particularly electronic medical records (EMRs), are a
cornerstone of modern healthcare, capturing critical information about patient
care, diagnoses, and treatments. These texts hold immense potential for
advancing clinical decision-making and healthcare analytics. However, their
unstructured nature, domain-specific language, and variability across contexts
make automated understanding difficult. Despite advances
in natural language processing, existing methods often treat all samples as
equally difficult, ignoring the inherent differences in complexity across
clinical records. This oversight limits how well models generalize, especially
on rare or complex cases. In this paper, we present
TACL (Threshold-Adaptive Curriculum Learning), a novel framework designed to
address these challenges by rethinking how models interact with medical texts
during training. Inspired by the principle of progressive learning, TACL
dynamically adjusts the training process based on the complexity of individual
samples. By categorizing data into difficulty levels and prioritizing simpler
cases early in training, the model builds a strong foundation before tackling
more complex records. Applied to multilingual medical data, including
English and Chinese clinical records, TACL yields significant improvements
across diverse clinical tasks, including automatic ICD coding, readmission
prediction, and traditional Chinese medicine (TCM) syndrome differentiation.
TACL not only enhances the
performance of automated systems but also demonstrates the potential to unify
approaches across disparate medical domains, paving the way for more accurate,
scalable, and globally applicable medical text understanding solutions.
Authors (6)
Mucheng Ren
Yucheng Yan
He Chen
Danqing Hu
Jun Xu
Xian Zeng
Submitted
October 17, 2025
Key Contributions
Introduces TACL (Threshold-Adaptive Curriculum Learning), a novel framework that addresses the challenge of varying data complexity in medical texts. TACL dynamically adjusts the learning process based on data difficulty, allowing models to progressively learn from simpler to more complex examples, thereby improving generalization, especially for rare or complex cases.
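The easy-to-hard idea described above can be illustrated with a minimal, generic curriculum-learning sketch. It is not the authors' implementation: the abstract does not say how difficulty is scored or how thresholds adapt, so the per-sample scores, the fixed cutoffs, the epochs_per_level schedule, and the helper names assign_levels and admitted_indices are all assumptions made for illustration.

```python
from typing import List, Sequence


def assign_levels(scores: Sequence[float], cutoffs: Sequence[float]) -> List[int]:
    """Map each difficulty score to a level (0 = easiest).

    Assumption: scores come from some external difficulty estimate;
    the level is the number of cutoffs the score exceeds.
    """
    return [sum(score > c for c in cutoffs) for score in scores]


def admitted_indices(levels: Sequence[int], epoch: int, epochs_per_level: int) -> List[int]:
    """Return indices of samples whose level is unlocked at this epoch.

    Assumption: one additional difficulty level is unlocked every
    `epochs_per_level` epochs (a fixed schedule, chosen for simplicity).
    """
    max_level = epoch // epochs_per_level
    return [i for i, level in enumerate(levels) if level <= max_level]


# Hypothetical difficulty scores for five clinical records.
scores = [0.1, 0.9, 0.4, 0.7, 0.2]
levels = assign_levels(scores, cutoffs=[0.3, 0.6])

for epoch in range(3):
    # Early epochs see only easy samples; later epochs see everything.
    print(epoch, admitted_indices(levels, epoch, epochs_per_level=1))
```

In a real training loop, the admitted indices would select the subset of records used for that epoch, so the model sees only the easiest records first and the full dataset once every level is unlocked.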
Business Value
Enhances the accuracy and reliability of AI systems processing clinical data, leading to better clinical decision support, more efficient data analysis, and improved patient care.