Abstract
To enhance the reasoning capabilities of large language models (LLMs),
self-consistency has become a popular approach, combining multiple samplings
with majority voting. However, current methods are computationally expensive
and time-consuming because they require numerous samplings. To address this,
this paper introduces path-consistency, which leverages the confidence of
earlier-generated answers to identify the most promising prefix and uses it
to guide the generation of subsequent branches. By dynamically steering later
branches with this prefix, path-consistency mitigates the errors and
redundancies that random or less useful sampling introduces in
self-consistency, and it significantly accelerates inference by minimizing
token consumption. Extensive empirical results demonstrate that
path-consistency reduces inference latency by up to 40.5% while maintaining
task accuracy across a range of tasks, including mathematical reasoning,
commonsense reasoning, and symbolic reasoning.
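
To make the two-phase procedure concrete, here is a minimal sketch of how path-consistency might be implemented. It is an illustration under stated assumptions, not the paper's reference implementation: the `generate` and `extract_answer` callables, the agreement-based confidence rule, and the `prefix_steps` cutoff are all hypothetical stand-ins for the paper's actual sampling interface and prefix-selection criterion.

```python
from collections import Counter

def path_consistency(question, generate, extract_answer,
                     n_initial=4, n_total=12, prefix_steps=2):
    """Sketch of path-consistency over a black-box sampler.

    generate(prompt) -> str        : samples one chain-of-thought (assumed API)
    extract_answer(chain) -> str   : parses the final answer (assumed API)
    """
    # Phase 1: draw a few fully random chains, as plain self-consistency would.
    chains = [generate(question) for _ in range(n_initial)]
    answers = [extract_answer(c) for c in chains]

    # Use agreement among the early answers as the confidence signal, and take
    # the first reasoning steps of a majority-answer chain as the "most
    # promising prefix" (one plausible reading of the paper's rule).
    majority, _ = Counter(answers).most_common(1)[0]
    trusted = next(c for c, a in zip(chains, answers) if a == majority)
    prefix = "\n".join(trusted.split("\n")[:prefix_steps])

    # Phase 2: remaining branches continue from the trusted prefix instead of
    # restarting from scratch, so each sample generates fewer fresh tokens.
    for _ in range(n_total - n_initial):
        chains.append(generate(question + "\n" + prefix))
        answers.append(extract_answer(chains[-1]))

    # Final answer by majority vote over all branches, as in self-consistency.
    return Counter(answers).most_common(1)[0][0]
```

Compared with plain self-consistency, the guided branches in phase 2 skip regenerating the shared prefix, which is where the token and latency savings would come from.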
Key Contributions
Path-consistency with prefix enhancement is introduced as a novel method to accelerate LLM inference on reasoning tasks. By leveraging the confidence of early outputs to guide subsequent generation branches, it mitigates the errors and redundancies that random sampling introduces in self-consistency, reducing inference latency by up to 40.5% while maintaining task accuracy.
Business Value
Enables faster and more cost-effective deployment of LLMs for complex reasoning tasks, making them more practical for real-time applications and resource-constrained environments.