AgentNLQ: A General-Purpose Agent for Natural Language to SQL

Abstract: Natural language to SQL (NL2SQL) conversion matters to researchers and enterprises alike because relational databases underpin a wide range of practical applications. Despite rapid advances in the capabilities of LLMs, NL2SQL systems have not yet matched the accuracy of expert human SQL writers, so further improvements in NL2SQL algorithms are needed.

This study presents a new multi-agent method for NL2SQL that achieves 78.1% semantic accuracy on the BIg Bench for LaRge-scale Database (BIRD) benchmark. The method builds a semantically enriched representation of the user-provided schema, incorporates user-provided business rules, and uses both to generate accurate SQL queries.
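
As an illustration of what a semantically enriched schema combined with business rules might look like when handed to an LLM, the following is a minimal sketch and not the method used in this study. The `Column`, `Table`, and `render_context` names, and the choice of descriptions plus sample values as the enrichment, are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    sql_type: str
    description: str = ""  # hypothetical enrichment: meaning of the column
    sample_values: list = field(default_factory=list)  # hypothetical enrichment


@dataclass
class Table:
    name: str
    columns: list
    description: str = ""


def render_context(tables, business_rules):
    """Render the enriched schema and the business rules as plain prompt text."""
    lines = ["-- Schema"]
    for table in tables:
        header = f"TABLE {table.name}"
        if table.description:
            header += f"  -- {table.description}"
        lines.append(header)
        for col in table.columns:
            entry = f"  {col.name} {col.sql_type}"
            notes = []
            if col.description:
                notes.append(col.description)
            if col.sample_values:
                notes.append("e.g. " + ", ".join(map(str, col.sample_values)))
            if notes:
                entry += "  -- " + "; ".join(notes)
            lines.append(entry)
    if business_rules:
        lines.append("-- Business rules")
        lines.extend(f"- {rule}" for rule in business_rules)
    return "\n".join(lines)
```

The rendered text would be placed in the prompt ahead of the user's question, so that column meanings and domain rules are visible to the model alongside the raw table definitions.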

The main contributions of this study are: (a) an optimized orchestrator for a multi-agent solution that uses LLMs to plan, orchestrate, reflect, and self-correct in order to generate accurate SQL queries; (b) an advanced schema enrichment method that creates context-aware metadata to improve accuracy; and (c) an evaluation on the BIRD-SQL benchmark that demonstrates the method's accuracy and its generalizability across different domains and datasets.
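
The reflect and self-correct behaviour named in contribution (a) can be illustrated with a generic execute-and-retry loop. This is a minimal sketch under stated assumptions, not the orchestrator developed in this study: `propose_sql` is a hypothetical stand-in for an LLM call, `fake_llm` is a deterministic stub used only to make the example runnable, and SQLite is used purely for convenience.

```python
import sqlite3


def generate_with_self_correction(question, conn, propose_sql, max_attempts=3):
    """Request SQL, execute it, and feed any database error back for a retry.

    `propose_sql(question, feedback)` is a hypothetical LLM wrapper; `feedback`
    is None on the first attempt, otherwise the failed SQL and its error text.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        sql = propose_sql(question, feedback)
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Reflection step: keep the failing query and the error message.
            feedback = {"sql": sql, "error": str(exc)}
            continue
        return {"sql": sql, "rows": rows, "attempts": attempt}
    raise RuntimeError(f"no executable SQL after {max_attempts} attempts: {feedback}")


def fake_llm(question, feedback):
    """Stub model: uses a wrong column name first, then corrects it."""
    if feedback is None:
        return "SELECT COUNT(*) FROM orders WHERE state = 'shipped'"
    return "SELECT COUNT(*) FROM orders WHERE status = 'shipped'"
```

A loop of this kind only catches queries that fail to execute. Detecting queries that run but answer the wrong question would need an additional check, for example a second model call that reviews the result against the question.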
