Research Paper. Audience: LLM developers, researchers generating diagrams, technical writers, data visualization specialists

DiagramEval: Evaluating LLM-Generated Diagrams via Graphs

Abstract

Diagrams play a central role in research papers for conveying ideas, yet they are often notoriously complex and labor-intensive to create. Although diagrams are presented as images, standard image generative models struggle to produce clear diagrams with well-defined structure. We argue that a promising direction is to generate demonstration diagrams directly in textual form as SVGs, which can leverage recent advances in large language models (LLMs). However, due to the complexity of components and the multimodal nature of diagrams, sufficiently discriminative and explainable metrics for evaluating the quality of LLM-generated diagrams remain lacking. In this paper, we propose DiagramEval, a novel evaluation metric designed to assess demonstration diagrams generated by LLMs. Specifically, DiagramEval conceptualizes diagrams as graphs, treating text elements as nodes and their connections as directed edges, and evaluates diagram quality using two new groups of metrics: node alignment and path alignment. For the first time, we effectively evaluate diagrams produced by state-of-the-art LLMs on recent research literature, quantitatively demonstrating the validity of our metrics. Furthermore, we show how the enhanced explainability of our proposed metrics offers valuable insights into the characteristics of LLM-generated diagrams. Code: https://github.com/ulab-uiuc/diagram-eval.

Authors (2)

Chumeng Liang

Jiaxuan You

Submitted

October 29, 2025

arXiv Category

cs.CL

Key Contributions

This paper introduces DiagramEval, an evaluation metric for LLM-generated diagrams that addresses the lack of discriminative and explainable metrics for this task. DiagramEval treats a diagram as a graph, with text elements as nodes and their connections as directed edges, and scores diagram quality with two groups of metrics: node alignment and path alignment. It targets diagrams generated in textual form as SVG.
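To make the graph view concrete, the following is a minimal Python sketch of how node alignment and path alignment could be scored between a generated diagram and a reference diagram. It is an illustration under stated assumptions, not the authors' implementation: this summary does not give the exact metric definitions, so the sketch assumes exact matching of normalized labels for nodes, ordered reachable pairs for paths, and precision/recall/F1 as the scores. All function names (`normalize`, `prf`, `reachable_pairs`, `node_alignment`, `path_alignment`) are hypothetical.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so near-identical labels compare equal."""
    return " ".join(label.lower().split())


def prf(predicted: set, reference: set) -> dict:
    """Precision, recall, and F1 of a predicted set against a reference set."""
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def reachable_pairs(edges: list) -> set:
    """All ordered pairs (a, b) such that a directed path leads from a to b."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[normalize(src)].add(normalize(dst))
    pairs = set()
    for start in list(adjacency):
        stack, seen = list(adjacency[start]), set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            pairs.add((start, node))
            stack.extend(adjacency.get(node, ()))
    return pairs


def node_alignment(pred_nodes, ref_nodes) -> dict:
    """Assumed node score: overlap of normalized text labels."""
    return prf({normalize(n) for n in pred_nodes}, {normalize(n) for n in ref_nodes})


def path_alignment(pred_edges, ref_edges) -> dict:
    """Assumed path score: overlap of reachable (source, target) pairs."""
    return prf(reachable_pairs(pred_edges), reachable_pairs(ref_edges))


# Toy example: the generated diagram drops the "Decoder" node.
reference_edges = [("Input", "Encoder"), ("Encoder", "Decoder"), ("Decoder", "Output")]
generated_edges = [("input", "Encoder"), ("Encoder", "Output")]

reference_nodes = {n for edge in reference_edges for n in edge}
generated_nodes = {n for edge in generated_edges for n in edge}

node_scores = node_alignment(generated_nodes, reference_nodes)
path_scores = path_alignment(generated_edges, reference_edges)
print(node_scores)
print(path_scores)
```

Comparing reachable pairs rather than raw edges means a generated diagram that skips an intermediate box is still credited for the flow it preserves, while the missing node lowers recall. Whether DiagramEval makes this choice is not stated in this summary.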

Business Value

Provides a structured way to measure the quality and reliability of AI-generated diagrams, which can make them more useful for technical documentation, research communication, and educational materials and reduce the time and effort spent on manual review.

Paper Metadata

Innovation Type

Evaluation Metric/Methodology

Deployment Feasibility

High, as it's an evaluation metric that can be integrated into LLM generation pipelines.

Limitations Addressed

Lack of effective metrics for evaluating LLM-generated diagrams, Difficulty in assessing diagram quality beyond visual appearance, Challenges in generating structured diagrams with LLMs

Technical Tags

Diagram Generation, LLM Evaluation, Graph Representation, SVG, Evaluation Metrics, Node Metrics, Edge Metrics, Diagram Quality, Textual Diagrams

Research Topics

Evaluating LLM-generated diagrams, Diagram representation and quality assessment, Leveraging LLMs for diagram creation, Graph-based evaluation metrics

Methods & Architectures

Evaluation metric development, Graph conceptualization, Comparative analysis

Applications & Tasks

Applications: Technical Documentation, Research Communication, Data Visualization, Diagramming Tools
Problems addressed: Difficulty in creating clear diagrams, Limitations of standard image generative models for diagrams, Lack of discriminative and explainable metrics for LLM-generated diagrams, Complexity and multimodal nature of diagrams
Tasks: Evaluating LLM-generated diagrams, Assessing diagram quality, Developing new evaluation metrics

Related Fields

Diagramming, Data Visualization, Natural Language Processing, Computer Graphics, Graph Theory

Keywords

LLM, Diagram Generation, Evaluation, SVG, Graph, Metrics, Technical Documentation, Visualization, AI, DiagramEval

Academic Context

#Evaluating LLM-generated diagrams, #Diagram representation and quality assessment, #Leveraging LLMs for diagram creation, #Graph-based evaluation metrics

Commercial Potential

Potential Products

Automated diagram generation tools, Quality assessment tools for diagrams

Target Industries

Technology, Publishing, Education, Software Development

Use Case Examples

Generating flowcharts for software documentation, Creating diagrams for research papers, Automating the creation of technical illustrations

Competitive Edge

Introduces a novel graph-based evaluation approach specifically for diagrams, offering a more structured and potentially more accurate assessment than general image quality metrics.

Market Opportunity

Growing demand for automated content generation tools.

Revenue Models

N/A

Resource Requirements

Compute Needs

Low to Moderate (for running evaluation)

Data Requirements

LLM-generated diagrams (SVGs)

Deployment Constraints

Requires diagrams to be in a structured format (like SVG) that can be parsed into graphs.
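As a rough illustration of this constraint, the sketch below pulls the text labels (the candidate graph nodes) out of an SVG string using Python's standard `xml.etree.ElementTree`. It is an assumption-laden example and not the parser used by DiagramEval: the helper name `extract_text_nodes` and the sample SVG are hypothetical, and recovering edges from lines, paths, and arrow markers is deliberately left out because this summary does not describe how that is done.

```python
import xml.etree.ElementTree as ET

# Standard SVG namespace, written in ElementTree's "{uri}tag" form.
SVG_NS = "{http://www.w3.org/2000/svg}"


def extract_text_nodes(svg_source: str) -> list:
    """Return the whitespace-normalized label of every <text> element, in document order."""
    root = ET.fromstring(svg_source)
    labels = []
    for element in root.iter(f"{SVG_NS}text"):
        # itertext() also collects text held in nested <tspan> children.
        label = " ".join("".join(element.itertext()).split())
        if label:
            labels.append(label)
    return labels


# Hypothetical two-box diagram used only for this example.
sample_svg = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="80">
  <rect x="10" y="10" width="80" height="30"/>
  <text x="20" y="30">Encoder</text>
  <text x="120" y="30"><tspan>Decoder</tspan> <tspan>block</tspan></text>
  <line x1="90" y1="25" x2="110" y2="25"/>
</svg>"""

labels = extract_text_nodes(sample_svg)
print(labels)
```

A raster image offers no such element tree, which is why a structured format is a practical prerequisite for this kind of graph-based evaluation.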

Scalability

The evaluation metric itself is scalable.

Regulatory Considerations

N/A

Production Readiness

Maturity Level

Research

Time to Market

N/A

Licensing

N/A

Patent Potential

Low
