NL-2-SQL Lab
Why generating SQL is easy, and answering real business questions is not.
The hard part of enterprise NL→SQL is not generating SQL. It's finding the right 4 tables
in a warehouse with 127. It's knowing that core.clients has
row-level security (RLS) before the query is even written. It's decomposing "why is revenue down?"
into a hypothesis-driven plan, not a single prompt.
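To make the "finding the right tables" step concrete, here is a minimal sketch of how a Schema Discovery step could score catalog tables against a question. It uses plain keyword overlap as a stand-in for whatever ranking the lab actually uses. Apart from core.clients, every table name, description, and helper below is a hypothetical placeholder.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TableMeta:
    name: str              # schema-qualified table name
    description: str       # free-text catalog description (assumed to exist)
    has_rls: bool = False  # row-level security flag, known before any SQL is written


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it on anything that is not a letter or digit."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return set(cleaned.split())


def score_tables(question: str, catalog: list[TableMeta], top_k: int = 4):
    """Rank tables by the share of question tokens found in their name or description."""
    q = tokenize(question)
    scored = []
    for table in catalog:
        words = tokenize(table.name + " " + table.description)
        score = len(q & words) / len(q) if q else 0.0
        scored.append((table, score))
    # Python's sort is stable, so ties keep their catalog order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical catalog; only core.clients is named in the text above.
CATALOG = [
    TableMeta("core.clients", "client master data and segments", has_rls=True),
    TableMeta("finance.advisory_fees", "monthly advisory fee revenue per account"),
    TableMeta("finance.aum_snapshots", "monthly aum balances per account"),
    TableMeta("hr.employees", "staff directory"),
]

ranked = score_tables("Why is revenue down while AUM is up?", CATALOG, top_k=2)
for table, score in ranked:
    print(f"{table.name:<24} score={score:.2f} rls={table.has_rls}")
```

A real system would more likely rank on embeddings, lineage, or usage statistics. The point of the sketch is only that every table gets a score and that access metadata such as the RLS flag travels with the candidate.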
This lab walks through a 6-agent pipeline (Intent → Schema Discovery → Query Planner → SQL Generator → Guardrail Agent → Synthesis) on 5 real wealth-management scenarios. No LLM, no DB. Pure architecture showcase.
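The six stages named above can be sketched as a chain of functions that each enrich a shared context. This is an illustrative skeleton only: the agent bodies are stubs with made-up outputs, the function names are assumptions, and, like the lab itself, nothing here calls an LLM or a database.

```python
from typing import Callable

Context = dict
Agent = Callable[[Context], Context]


# Each stub returns a copy of the context with one hypothetical field added.
def intent_agent(ctx: Context) -> Context:
    return {**ctx, "intent": "diagnose_metric_divergence"}


def schema_discovery_agent(ctx: Context) -> Context:
    return {**ctx, "tables": ["core.clients"]}


def query_planner_agent(ctx: Context) -> Context:
    return {**ctx, "plan": ["compare monthly revenue and AUM trends"]}


def sql_generator_agent(ctx: Context) -> Context:
    return {**ctx, "sql": ["-- placeholder: no SQL is actually generated"]}


def guardrail_agent(ctx: Context) -> Context:
    return {**ctx, "guardrail_passed": True}


def synthesis_agent(ctx: Context) -> Context:
    return {**ctx, "answer": "stub answer assembled from the steps above"}


# Order follows the pipeline described in the text.
PIPELINE: list[tuple[str, Agent]] = [
    ("Intent", intent_agent),
    ("Schema Discovery", schema_discovery_agent),
    ("Query Planner", query_planner_agent),
    ("SQL Generator", sql_generator_agent),
    ("Guardrail Agent", guardrail_agent),
    ("Synthesis", synthesis_agent),
]


def run_pipeline(question: str) -> Context:
    """Run every agent in order and record which ones executed."""
    ctx: Context = {"question": question, "trace": []}
    for name, agent in PIPELINE:
        ctx = agent(ctx)
        ctx["trace"].append(name)
    return ctx


result = run_pipeline("Why is revenue down while AUM is up?")
print(" → ".join(result["trace"]))
```

Passing one context object through the chain keeps each agent's contribution inspectable, which is what makes a step-by-step trace possible.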
Enterprise Challenges in Production NL-2-SQL
Off-the-shelf LLM prompting fails in enterprise environments not because models are weak, but because production databases have access controls, scale, ambiguity, and domain vocabulary that a general-purpose model cannot reason about without structure. Every challenge below is mapped to the agent in our pipeline that specifically addresses it.
12 Challenges Documented
6 Critical Severity
4 High Severity
7 Agents That Solve Them
Each challenge below is live in the agent trace
Pick any scenario, click Run Pipeline, and watch each agent address its assigned challenges in sequence. The Schema Discovery step shows all 33 tables scored. The Guardrail step shows every RLS and PII check. The Ontology step shows the exact context block injected into the LLM.
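As a rough illustration of what "every RLS and PII check" might involve, the sketch below inspects a planned query before it runs. The policy tables, column names, and the PlannedQuery shape are all assumptions made for this example. The lab does not specify how its Guardrail Agent is implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical policy metadata: the column each RLS-protected table must be filtered on,
# and the columns treated as PII. Only the table name core.clients comes from the text.
RLS_REQUIRED_FILTER = {"core.clients": "advisor_id"}
PII_COLUMNS = {"core.clients": {"ssn", "date_of_birth"}}


@dataclass
class PlannedQuery:
    columns: dict[str, set[str]]                               # table -> selected columns
    filters: dict[str, set[str]] = field(default_factory=dict)  # table -> columns used in predicates


def check_query(query: PlannedQuery) -> list[str]:
    """Return one finding per violated rule; an empty list means the query may proceed."""
    findings = []
    for table, cols in query.columns.items():
        required = RLS_REQUIRED_FILTER.get(table)
        if required and required not in query.filters.get(table, set()):
            findings.append(f"RLS: {table} must be filtered on {required}")
        for col in sorted(cols & PII_COLUMNS.get(table, set())):
            findings.append(f"PII: {table}.{col} must be masked or removed")
    return findings


unsafe = PlannedQuery(columns={"core.clients": {"client_name", "ssn"}})
safe = PlannedQuery(
    columns={"core.clients": {"client_name"}},
    filters={"core.clients": {"advisor_id"}},
)

for finding in check_query(unsafe):
    print(finding)
print("safe query findings:", check_query(safe))
```

Running the checks on the plan rather than on SQL text avoids parsing, at the cost of trusting that the SQL Generator honours the plan.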
Interactive Agent Trace
Choose a Business Question
Multi-Agent Pipeline
Agent Trace
"Why is revenue down while AUM is up?"
- Raw NL question
- User role & domain session context
Entities
Metrics Required
Time Range
trailing 12 months, monthly grain
Ambiguities Resolved Before Planning
- "Revenue" could mean gross advisory fees, net revenue, or total firm revenue; assuming advisory fees
- "AUM" could be client-level, advisor-level, or firm-level; assuming firm-level
- No time period specified; defaulting to trailing 12 months
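The three assumptions above can be sketched as a small lookup of ambiguous terms plus a default time range, applied before any planning happens. The candidate meanings and defaults are copied from the trace. The function name, the substring matching, and the time_range parameter are assumptions for illustration.

```python
from __future__ import annotations

from typing import Optional

# term -> (possible meanings, meaning assumed by default), taken from the trace above
TERM_DEFAULTS = {
    "revenue": (
        ("gross advisory fees", "net revenue", "total firm revenue"),
        "advisory fees",
    ),
    "aum": (
        ("client-level", "advisor-level", "firm-level"),
        "firm-level",
    ),
}

DEFAULT_TIME_RANGE = "trailing 12 months"


def resolve_ambiguities(question: str, time_range: Optional[str] = None) -> list[str]:
    """List every assumption made so it can be shown to the user before planning."""
    text = question.lower()
    notes = []
    for term, (options, assumed) in TERM_DEFAULTS.items():
        # Naive substring match; a real Intent agent would need proper entity detection.
        if term in text:
            notes.append(f'"{term}" could mean {", ".join(options)}; assuming {assumed}')
    if time_range is None:
        notes.append(f"no time period specified; defaulting to {DEFAULT_TIME_RANGE}")
    return notes


notes = resolve_ambiguities("Why is revenue down while AUM is up?")
for note in notes:
    print("-", note)
```

Recording each assumption as text, rather than silently applying it, lets the final answer state what "revenue" and "AUM" were taken to mean.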
The SuperML Take
Most NL→SQL demos fail because they treat business questions as query problems. Real systems treat them as reasoning problems.