Abstract: Large language models (LLMs) have shown remarkable capabilities in natural
language processing and multi-modal understanding. However, their high
computational cost, limited accessibility, and data privacy concerns hinder
their adoption in resource-constrained healthcare environments. This study
investigates the performance of small language models (SLMs) in a medical
imaging classification task, comparing different models and prompt designs to
identify the combination that best balances accuracy and usability. Using the NIH
Chest X-ray dataset, we evaluate multiple SLMs on the task of classifying chest
X-ray positions (anteroposterior [AP] vs. posteroanterior [PA]) under three
prompt strategies: baseline instruction, incremental summary prompts, and
correction-based reflective prompts. Our results show that certain SLMs achieve
competitive accuracy with well-crafted prompts, suggesting that prompt
engineering can substantially enhance SLM performance in healthcare
applications without requiring deep AI expertise from end users.
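
To make the three prompt strategies concrete, the sketch below shows one way they could be expressed as Python prompt templates. This is an illustrative assumption: the abstract does not give the authors' exact prompt wording, so the template text, the anatomical cues, and the function names here are hypothetical.

```python
# Hypothetical sketch of the three prompt strategies named in the abstract.
# The exact prompts are not specified there, so the wording below is an
# assumption for illustration, not the authors' actual prompts.

TASK = "Classify the view position of this chest X-ray as AP or PA."

def baseline_prompt() -> str:
    """Baseline instruction: a single direct instruction."""
    return f"{TASK} Answer with exactly one label: AP or PA."

def incremental_summary_prompt(prior_summary: str) -> str:
    """Incremental summary: ask the model to summarize relevant image
    features first, carrying the running summary forward as context."""
    return (
        f"Summary of findings so far: {prior_summary}\n"
        "Step 1: Briefly summarize cues relevant to view position "
        "(e.g., scapula position, heart size, clavicle angle).\n"
        f"Step 2: {TASK} Answer with exactly one label: AP or PA."
    )

def correction_reflective_prompt(previous_answer: str) -> str:
    """Correction-based reflection: show the model its previous answer
    and ask it to check the evidence before committing to a final label."""
    return (
        f"Your previous answer was: {previous_answer}\n"
        "Reflect on whether the evidence supports that answer. "
        "If it was wrong, correct it.\n"
        f"Then give a final answer. {TASK} Reply with AP or PA only."
    )

if __name__ == "__main__":
    print(baseline_prompt())
    print(incremental_summary_prompt("clavicles symmetric, scapulae retracted"))
    print(correction_reflective_prompt("AP"))
```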