
Incremental Sequence Classification with Temporal Consistency

Abstract

We address the problem of incremental sequence classification, where predictions are updated as new elements in the sequence are revealed. Drawing on temporal-difference learning from reinforcement learning, we identify a temporal-consistency condition that successive predictions should satisfy. We leverage this condition to develop a novel loss function for training incremental sequence classifiers. Through a concrete example, we demonstrate that optimizing this loss can offer substantial gains in data efficiency. We apply our method to text classification tasks and show that it improves predictive accuracy over competing approaches on several benchmark datasets. We further evaluate our approach on the task of verifying large language model generations for correctness in grade-school math problems. Our results show that models trained with our method are better able to distinguish promising generations from unpromising ones after observing only a few tokens.
Authors (8)
Lucas Maystre
Gabriel Barello
Tudor Berariu
Aleix Cambray
Rares Dolga
Alvaro Ortega Gonzalez
(2 additional authors not shown)
Submitted: May 22, 2025
arXiv category: cs.LG

Key Contributions

This paper introduces an approach to incremental sequence classification that leverages a temporal-consistency condition inspired by temporal-difference learning in reinforcement learning. A new loss function enforces this condition between successive predictions, yielding substantial gains in data efficiency when training sequence classifiers. The method improves predictive accuracy on text classification benchmarks and supports early verification of large language model generations for correctness on grade-school math problems. A sketch of what such a loss might look like follows below.
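The listing itself contains no code, so the snippet below is only a minimal PyTorch sketch of the general idea, not the authors' exact objective: each prefix-level prediction is pulled toward the next step's gradient-stopped predicted distribution, TD(0)-style, while only the final prediction is supervised by the true label. The function name, tensor shapes, and the choice of KL divergence as the consistency term are all assumptions.

```python
import torch
import torch.nn.functional as F

def temporal_consistency_loss(logits, labels):
    """TD-style loss sketch for incremental sequence classification.

    logits: (batch, seq_len, num_classes), one prediction per prefix.
    labels: (batch,), ground-truth class of the complete sequence.
    """
    # Bootstrapped targets: the prediction after t elements should match
    # the (detached) prediction after t + 1 elements, as in TD learning.
    next_probs = F.softmax(logits[:, 1:, :], dim=-1).detach()
    log_probs = F.log_softmax(logits[:, :-1, :], dim=-1)
    consistency = F.kl_div(log_probs, next_probs, reduction="none").sum(-1).mean()

    # Only the final prediction is anchored to the ground-truth label;
    # earlier steps learn from it indirectly, through the chain of
    # consistency terms above.
    final = F.cross_entropy(logits[:, -1, :], labels)
    return consistency + final
```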

Business Value

The method enables more efficient and accurate real-time analysis of sequential data, such as text streams or user interactions. It can also improve the reliability of LLM pipelines by scoring generations for correctness after observing only a few tokens, so that unpromising candidates can be discarded early (see the sketch below).
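As a purely hypothetical illustration of that early-verification use, assuming a verifier trained with a loss like the one sketched above and a two-class (incorrect/correct) output head, ranking partial generations might look like this. `verifier` and `prefix_ids` are stand-in names, not from the paper.

```python
import torch

# Hypothetical early verification: score candidate generations after
# only the first k tokens and keep the most promising one.
with torch.no_grad():
    logits = verifier(prefix_ids)          # (num_candidates, k, 2)
    # Probability of the "correct" class at the k-th (latest) step.
    p_correct = logits[:, -1, :].softmax(dim=-1)[:, 1]
best_candidate = prefix_ids[p_correct.argmax()]
```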