Path-Consistency with Prefix Enhancement for Efficient Inference in LLMs

Abstract

To enhance the reasoning capabilities of large language models (LLMs), self-consistency has become a popular approach, combining multiple samplings with majority voting. However, current methods are computationally expensive and time-consuming due to the need for numerous samplings. To address this, this paper introduces path-consistency, which leverages the confidence of earlier-generated answers to identify the most promising prefix and uses it to guide the generation of subsequent branches. By steering later branches with this prefix, path-consistency mitigates the errors and redundancies that arise from random or less useful sampling in self-consistency, significantly accelerating inference by minimizing token consumption. Extensive empirical results demonstrate that path-consistency reduces inference latency by up to 40.5% while maintaining task accuracy across mathematical reasoning, commonsense reasoning, and symbolic reasoning tasks.
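To make the mechanism concrete, below is a minimal sketch of the two-phase sampling loop the abstract describes: a few unguided samples (as in self-consistency), followed by branches steered by a prefix taken from the most confident path. The names `sample_completion` and `extract_answer`, the stubbed model, and the half-length prefix heuristic are illustrative assumptions, not the paper's actual implementation.

```python
import random
from collections import Counter

def sample_completion(prompt: str, prefix: str = "") -> tuple[str, float]:
    """Hypothetical stand-in for sampling one reasoning branch from an LLM.

    Returns (reasoning_text, confidence). A real implementation would call
    the model with temperature > 0; here it is stubbed so the sketch runs.
    """
    answer = random.choice(["42", "42", "41"])
    return prefix + f"... therefore the answer is {answer}", random.uniform(0.5, 1.0)

def extract_answer(text: str) -> str:
    # Illustrative parser: take the last token as the final answer.
    return text.rsplit(" ", 1)[-1]

def path_consistency(prompt: str, n_samples: int = 8, n_initial: int = 3) -> str:
    """Sketch of path-consistency: sample a few full branches, keep a prefix
    of the most confident one, and reuse it to guide the remaining branches."""
    answers, best_conf, best_text = [], -1.0, ""
    # Phase 1: a small batch of unguided samples, as in plain self-consistency.
    for _ in range(n_initial):
        text, conf = sample_completion(prompt)
        answers.append(extract_answer(text))
        if conf > best_conf:
            best_conf, best_text = conf, text
    # Phase 2: reuse the first half of the most confident reasoning path as a
    # shared prefix, so later branches avoid regenerating those tokens
    # (assumed heuristic; the paper's prefix selection may differ).
    prefix = best_text[: len(best_text) // 2]
    for _ in range(n_samples - n_initial):
        text, _ = sample_completion(prompt, prefix=prefix)
        answers.append(extract_answer(text))
    # Final answer by majority vote over all branches, as in self-consistency.
    return Counter(answers).most_common(1)[0][0]

print(path_consistency("What is 6 * 7?"))
```

The latency savings come from Phase 2: tokens in the shared prefix are generated once rather than once per branch, so the per-branch cost drops roughly in proportion to the prefix length.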

Key Contributions

The paper introduces path-consistency with prefix enhancement, a method for accelerating LLM inference while preserving reasoning quality. By leveraging the confidence of early outputs to guide subsequent generation branches, it mitigates the errors and redundancies of random sampling in self-consistency, reducing inference latency by up to 40.5% while maintaining task accuracy.

Business Value

Enables faster and more cost-effective deployment of LLMs for complex reasoning tasks, making them more practical for real-time applications and resource-constrained environments.