
Submodular Context Partitioning and Compression for In-Context Learning (Short Paper)

📄 Abstract

In-context learning (ICL) enables efficient few-shot learning in large language models (LLMs) without training, but it suffers from the quadratic input complexity of transformers, which limits the maximum number of exemplars. While various efficient ICL approaches partition the context into blocks for separate processing (e.g., ensembling, compression, cross-attention), they often ignore the information redundancy or under-representation caused by different partition strategies, leading to suboptimal performance. To tackle this problem, we propose Sub-CP, a block-aware context selection framework that leverages submodular objectives to control block diversity. Sub-CP supports a flexible spectrum of selection strategies, allowing each block to range from globally diverse to locally coherent. This allows fine-grained control over semantic structure while enabling precomputation. Extensive experiments across diverse tasks on multiple datasets show that Sub-CP consistently improves performance across model scales.
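
To make the selection spectrum concrete, below is a minimal sketch of block-aware submodular selection using a greedy facility-location objective over exemplar embeddings. It is an illustration under assumptions, not the authors' Sub-CP implementation: the function names, the `diverse` flag, and the nearest-neighbour heuristic for locally coherent blocks are hypothetical, and the similarity matrix is precomputed once, mirroring the precomputation property mentioned above.

```python
import numpy as np


def greedy_facility_location(sim_rows, candidates, k):
    # Greedy maximization of f(S) = sum_i max_{j in S} sim_rows[i, j],
    # where i ranges over the rows of sim_rows (the "ground set" to cover)
    # and j over selected column indices (candidate exemplars).
    selected = []
    cover = np.zeros(sim_rows.shape[0])      # best similarity reached so far per row
    pool = list(candidates)
    for _ in range(min(k, len(pool))):
        gains = [np.maximum(cover, sim_rows[:, c]).sum() - cover.sum() for c in pool]
        best = int(np.argmax(gains))
        chosen = pool.pop(best)
        selected.append(chosen)
        cover = np.maximum(cover, sim_rows[:, chosen])
    return selected


def select_blocks(embeddings, n_blocks, block_size, diverse=True):
    # Split the exemplar pool into blocks of size `block_size`.
    # diverse=True  -> each block is selected to cover the whole pool
    #                  (globally diverse blocks)
    # diverse=False -> each block is selected to cover only a seed exemplar's
    #                  neighbourhood (locally coherent blocks)
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = x @ x.T                            # cosine similarities, precomputed once
    remaining = list(range(len(x)))
    blocks = []
    for _ in range(n_blocks):
        if not remaining:
            break
        if diverse:
            ground = sim                     # cover the entire exemplar pool
        else:
            seed = remaining[0]
            neighbours = np.argsort(-sim[seed])[: block_size * 4]
            ground = sim[neighbours]         # cover only a local neighbourhood
        block = greedy_facility_location(ground, remaining, block_size)
        blocks.append(block)
        remaining = [i for i in remaining if i not in block]
    return blocks


# Toy usage: 60 random "exemplar embeddings", three blocks of four exemplars each
blocks = select_blocks(np.random.randn(60, 16), n_blocks=3, block_size=4)
print(blocks)
```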

Key Contributions

Proposes Sub-CP, a block-aware context selection framework that uses submodular objectives to control block diversity for in-context learning. This approach mitigates information redundancy and under-representation caused by partitioning strategies, leading to improved performance across diverse tasks.

Business Value

Enables more effective and efficient use of LLMs in few-shot learning scenarios, reducing computational costs and improving accuracy. This is valuable for applications requiring rapid adaptation to new tasks with limited data.