Abstract
With the emergence of large language models (LLMs), there is an expectation
that LLMs can effectively extract explicit information from complex real-world
documents (e.g., papers, reports). However, most LLMs generate paragraph-style
answers that are chaotic, disorganized, and untraceable. To bridge this gap, we
introduce the Arranged and Organized Extraction Benchmark (AOE), a new
bilingual benchmark with data and documents of varying lengths designed to
systematically evaluate the ability of LLMs to comprehend fragmented documents
and reconstruct isolated information into one organized table. Unlike
conventional text-to-table tasks, which rely on fixed schemas and narrow task
domains, AOE includes 11 carefully crafted tasks across three diverse domains,
requiring models to generate context-specific schemas tailored to varied input
queries. In our experiments, we evaluate both open-source and closed-source state-of-the-art LLMs. The results show that even the most advanced models struggle significantly with this task. The benchmark is available at
https://anonymous.4open.science/r/AOE-Benchmark/.
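
To make the task concrete, here is a minimal, hypothetical sketch of how an AOE-style query might be posed to an LLM and its tabular answer parsed. The prompt wording, the `call_llm` stub, and the Markdown-table output format are illustrative assumptions, not the benchmark's actual interface; see the repository linked above for the real data and protocol.

```python
# A minimal, hypothetical sketch of an AOE-style extraction step.
# The prompt wording, the `call_llm` stub, and the Markdown-table output
# format are illustrative assumptions, not the benchmark's actual interface.

def build_prompt(query: str, documents: list[str]) -> str:
    """Ask the model to infer a schema from the query, then fill one table."""
    docs = "\n\n".join(f"[Doc {i + 1}]\n{d}" for i, d in enumerate(documents))
    return (
        "Read the documents below and answer the query as ONE Markdown table.\n"
        "First decide which columns (the schema) the query requires, then fill\n"
        "one row per entity, citing the source document for each row.\n\n"
        f"Query: {query}\n\n{docs}"
    )

def parse_markdown_table(response: str) -> list[dict[str, str]]:
    """Parse the first Markdown table in a model response into row dicts."""
    lines = [ln.strip() for ln in response.splitlines() if ln.strip().startswith("|")]
    if len(lines) < 3:  # need header, separator, and at least one data row
        return []
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # lines[1] is the |---|---| separator
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows

if __name__ == "__main__":
    def call_llm(prompt: str) -> str:
        # Stand-in for a real model call; replace with your LLM client.
        return "| System | Domain | Source |\n|---|---|---|\n| ExampleNet | Finance | Doc 1 |"

    prompt = build_prompt(
        "List each system mentioned and the domain it targets.",
        ["ExampleNet is a hypothetical system for financial reports ..."],
    )
    print(parse_markdown_table(call_llm(prompt)))
    # -> [{'System': 'ExampleNet', 'Domain': 'Finance', 'Source': 'Doc 1'}]
```

The point the benchmark stresses is that the schema (the table's columns) is not fixed in advance: the model must derive it from each query, which is what the prompt above asks for.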
Authors (12)
Tianyun Zhong
Guozhao Mo
Yanjiang Liu
Yihan Chen
Lingdi Kong
Xuanang Chen
Key Contributions
This paper introduces the Arranged and Organized Extraction Benchmark (AOE), a new bilingual benchmark that evaluates LLMs' ability to reconstruct fragmented information into organized tables, requiring models to generate context-specific schemas. It addresses the tendency of LLMs to produce chaotic, untraceable paragraph-style answers and provides a systematic way to benchmark structured extraction capabilities.
Business Value
Enables businesses to extract and organize critical information from large volumes of unstructured or semi-structured documents (e.g., reports, contracts, research papers) more effectively, supporting better data-driven decision-making and greater operational efficiency.