📄 Abstract
Abstract: The complexity of Structured Query Language (SQL) and the specialized nature
of geospatial functions in tools like PostGIS present significant barriers to
non-experts seeking to analyze spatial data. While Large Language Models (LLMs)
offer promise for translating natural language into SQL (Text-to-SQL),
single-agent approaches often struggle with the semantic and syntactic
complexities of spatial queries. To address this, we propose a multi-agent
framework designed to accurately translate natural language questions into
spatial SQL queries. The framework integrates several innovative components,
including a knowledge base with programmatic schema profiling and semantic
enrichment, embeddings for context retrieval, and a collaborative multi-agent
pipeline as its core. This pipeline comprises specialized agents for entity
extraction, metadata retrieval, query logic formulation, SQL generation, and a
review agent that performs programmatic and semantic validation of the
generated SQL to ensure correctness (self-verification). We evaluate our system
using both the non-spatial KaggleDBQA benchmark and a new, comprehensive
SpatialQueryQA benchmark that includes diverse geometry types, predicates, and
three levels of query complexity. On KaggleDBQA, the system achieved an overall
accuracy of 81.2% (221 out of 272 questions) after the review agent's review
and corrections. For spatial queries, the system achieved an overall accuracy
of 87.7% (79 out of 90 questions), compared with 76.7% without the review
agent. Beyond accuracy, results also show that in some instances the system
generates queries that are more semantically aligned with user intent than
those in the benchmarks. This work makes spatial analysis more accessible, and
provides a robust, generalizable foundation for spatial Text-to-SQL systems,
advancing the development of autonomous GIS.
Authors (4)
Ali Khosravi Kazazi
Zhenlong Li
M. Naser Lessani
Guido Cervone
Submitted
October 23, 2025
Key Contributions
Proposes a multi-agent framework for accurate spatial Text-to-SQL translation, addressing LLM limitations with complex spatial queries. It integrates a knowledge base, semantic enrichment, and specialized agents for entity extraction, metadata retrieval, query logic, SQL generation, and validation, improving accuracy for non-experts.
Business Value
Democratizes access to geospatial data analysis by allowing users to query complex spatial databases using natural language, accelerating insights for planning, research, and operations.