ReViSQL: Achieving Human-Level Text-to-SQL
Abstract: Translating natural language to SQL (Text-to-SQL) is a critical challenge in both database research and data analytics applications. Recent efforts have focused on enhancing SQL reasoning by developing large language models and AI agents that decompose Text-to-SQL tasks into manually designed, step-by-step pipelines. However, despite these extensive architectural engineering efforts, a significant gap remains: even state-of-the-art (SOTA) AI agents have not yet achieved human-level accuracy on the BIRD benchmark. In this paper, we show that closing this gap does not require further architectural complexity, but rather clean training data that improves the SQL reasoning of the underlying models. We introduce ReViSQL, a streamlined framework that achieves human-level accuracy on BIRD for the first time. Instead of complex AI agents, ReViSQL leverages reinforcement learning with verifiable rewards (RLVR) on BIRD-Verified, a dataset we curated comprising 2.5k verified Text-to-SQL instances based on the BIRD Train set. To construct BIRD-Verified, we design a data correction and verification workflow involving SQL experts. We identified and corrected data errors in 61.1% of a subset of BIRD Train. By training on BIRD-Verified, we show that improving data quality alone boosts single-generation accuracy by 8.2-13.9% under the same RLVR algorithm. To further enhance performance, ReViSQL performs inference-time scaling via execution-based reconciliation and majority voting. Empirically, we demonstrate the superiority of our framework with two model scales: ReViSQL-235B-A22B and ReViSQL-30B-A3B. On an expert-verified BIRD Mini-Dev set, ReViSQL-235B-A22B achieves 93.2% execution accuracy, exceeding the proxy human-level accuracy (92.96%) and outperforming the prior open-source SOTA method by 9.8%. Our lightweight ReViSQL-30B-A3B matches the prior SOTA at a 7.5$\times$ lower per-query cost.
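The abstract names majority voting over executed candidates as part of inference-time scaling but does not spell out the procedure. The following is a minimal sketch of one common form of execution-based majority voting, not the authors' implementation: the helper names (`execute`, `majority_vote`, `demo`), the SQLite backend, and the toy table are all assumptions made for illustration, and the reconciliation step is not modeled.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run one candidate query (hypothetical helper).

    Returns a hashable, order-insensitive view of the result rows,
    or None if the query fails to execute.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def majority_vote(conn, candidates):
    """Return the first candidate whose execution result is the most common.

    Candidates that fail to execute do not vote. Returns None if every
    candidate fails.
    """
    results = [execute(conn, sql) for sql in candidates]
    counts = Counter(r for r in results if r is not None)
    if not counts:
        return None
    winner, _ = counts.most_common(1)[0]
    for sql, result in zip(candidates, results):
        if result == winner:
            return sql


def demo():
    """Toy example: two of four sampled queries agree on the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, score INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])
    candidates = [
        "SELECT COUNT(*) FROM t WHERE score > 15",
        "SELECT COUNT(id) FROM t WHERE score >= 20",
        "SELECT COUNT(*) FROM t WHERE score > 25",
        "SELECT COUNT(*) FROM missing_table",  # fails, so it casts no vote
    ]
    return majority_vote(conn, candidates)
```

Voting on execution results rather than on SQL text lets syntactically different but equivalent queries pool their votes, which is why the first two candidates in `demo` count as agreeing.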
Subjects: Databases (cs.DB); Computation and Language (cs.CL)
ACM classes: H.2.3
Cite as: arXiv:2603.20004 [cs.DB]
(or arXiv:2603.20004v2 [cs.DB] for this version)
https://doi.org/10.48550/arXiv.2603.20004
arXiv-issued DOI via DataCite
Submission history
From: Yuxuan Zhu
[v1] Fri, 20 Mar 2026 14:49:27 UTC (2,129 KB)
[v2] Mon, 30 Mar 2026 21:21:45 UTC (2,129 KB)