Abstract
Current RAG retrievers are designed primarily for human readers, emphasizing
complete, readable, and coherent paragraphs. However, LLMs benefit more from
precise, compact, and well-structured input, which enhances reasoning quality
and efficiency. Existing methods often rely on reranking or summarization to
identify key sentences, but may suffer from semantic breaks and unfaithfulness.
Thus, efficiently extracting and organizing answer-relevant clues from
large-scale documents while reducing LLM reasoning costs remains a challenge
for RAG. Inspired by Occam's razor, we frame LLM-centric retrieval as a MinMax
optimization: maximizing the extraction of potential clues and reranking them
into a well-organized order, while minimizing reasoning costs by truncating to the
smallest sufficient clue set. In this paper, we propose CompSelect, a Compact
clue Selection mechanism for LLM-centric RAG, consisting of a clue extractor, a
reranker, and a truncator. (1) The clue extractor first uses answer-containing
sentences as fine-tuning targets, aiming to extract sufficient potential clues;
(2) The reranker is trained to prioritize effective clues based on real LLM
feedback; (3) The truncator uses the truncated text containing the minimum
sufficient clues for answering the question as fine-tuning targets, thereby
enabling efficient RAG reasoning. Experiments on three QA datasets show that
CompSelect improves QA performance by approximately 11% and reduces Total
Latency and Online Latency by approximately 17% and 67% compared to various
baseline methods on both LLaMA3 and Qwen3. Further analysis confirms its
robustness to unreliable retrieval and generalization across different
scenarios, offering a scalable and cost-efficient solution for web-scale RAG
applications.
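To make the extract–rerank–truncate flow concrete, here is a minimal sketch of such a pipeline. The scoring heuristics below (lexical overlap, a word-count budget) are toy placeholders standing in for the paper's fine-tuned clue extractor, feedback-trained reranker, and truncator; all function names, thresholds, and the budget value are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of a CompSelect-style extract -> rerank -> truncate pipeline.
# Placeholder heuristics stand in for the fine-tuned components described in
# the abstract; names and thresholds are illustrative assumptions only.

from typing import List


def extract_clues(question: str, sentences: List[str], top_k: int = 8) -> List[str]:
    """Keep sentences most likely to contain answer clues (placeholder: lexical overlap)."""
    q_terms = set(question.lower().split())
    scored = [(len(q_terms & set(s.lower().split())), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:top_k] if score > 0]


def rerank_clues(question: str, clues: List[str]) -> List[str]:
    """Order clues by estimated usefulness to the LLM (placeholder: more overlap, then shorter first)."""
    q_terms = set(question.lower().split())
    return sorted(clues, key=lambda s: (-len(q_terms & set(s.lower().split())), len(s)))


def truncate_clues(clues: List[str], token_budget: int = 64) -> List[str]:
    """Keep the smallest prefix of ranked clues that fits the reasoning budget."""
    kept, used = [], 0
    for clue in clues:
        cost = len(clue.split())
        if used + cost > token_budget:
            break
        kept.append(clue)
        used += cost
    return kept


if __name__ == "__main__":
    question = "Who proposed Occam's razor?"
    passages = [
        "Occam's razor is attributed to William of Ockham, a 14th-century friar.",
        "The principle favors the simplest sufficient explanation.",
        "Unrelated sentence about retrieval latency.",
    ]
    clues = truncate_clues(rerank_clues(question, extract_clues(question, passages)))
    print("\n".join(clues))
```

The design point the sketch mirrors is the MinMax framing: the extractor errs toward recall (maximizing candidate clues), while the reranker and truncator push the final prompt toward the smallest ordered subset that still supports the answer, which is where the reported latency savings would come from.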
Key Contributions
Proposes CompSelect, a novel mechanism for LLM-centric RAG that frames retrieval as a MinMax optimization problem. It efficiently extracts, reranks, and truncates answer-relevant clues from large documents to minimize LLM reasoning costs while maximizing reasoning quality, addressing the limitations of traditional RAG retrievers designed for human readers.
Business Value
Significantly reduces the computational cost of using RAG with LLMs, making advanced AI applications more affordable and faster, enabling wider adoption in areas like customer support, knowledge management, and content generation.