🧠

NL-2-SQL Lab

Why generating SQL is easy — and answering real business questions is not.

The hard part of enterprise NL→SQL is not generating SQL. It's finding the right 4 tables in a warehouse with 127. It's knowing that core.clients has RLS before the query is even written. It's decomposing "why is revenue down?" into a hypothesis-driven plan — not a single prompt.

This lab walks through a 6-agent pipeline — Intent → Schema Discovery → Query Planner → SQL Generator → Guardrail Agent → Synthesis — on 5 real wealth-management scenarios. No LLM, no DB. Pure architecture showcase.
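As a rough illustration of the pipeline shape described above, the sketch below chains six stub agents in the stated order. Only the stage names come from the text. `PipelineState`, `make_stub_agent`, and `run_pipeline` are hypothetical names, and each stub merely records that it ran, which matches the lab's "no LLM, no DB" framing rather than any real implementation.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineState:
    """Shared context handed from one agent to the next (hypothetical shape)."""
    question: str
    trace: list = field(default_factory=list)
    artifacts: dict = field(default_factory=dict)


Agent = Callable[[PipelineState], PipelineState]

# Stage order taken from the text; the agent bodies below are stubs.
STAGES = [
    "Intent",
    "Schema Discovery",
    "Query Planner",
    "SQL Generator",
    "Guardrail Agent",
    "Synthesis",
]


def make_stub_agent(name: str) -> Agent:
    """Build a placeholder agent that only records that it ran."""
    def run(state: PipelineState) -> PipelineState:
        state.trace.append(name)
        state.artifacts[name] = f"<{name} output placeholder>"
        return state
    return run


def run_pipeline(question: str) -> PipelineState:
    """Pass one state object through every stage in order."""
    state = PipelineState(question=question)
    for name in STAGES:
        state = make_stub_agent(name)(state)
    return state


if __name__ == "__main__":
    result = run_pipeline("Why is revenue down while AUM is up?")
    print(" -> ".join(result.trace))
```

Passing a single state object through the stages keeps every intermediate artifact available for the trace view, which is one plausible way to support the step-by-step display the lab describes.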

Multi-Agent · Schema Discovery · RLS Guardrails · Query Planning · Wealth Management · Ontology

Enterprise Challenges in Production NL-2-SQL

Off-the-shelf LLM prompting fails in enterprise environments not because models are weak, but because production databases have access controls, scale, ambiguity, and domain vocabulary that a general-purpose model cannot reason about without structure. Every challenge below is mapped to the agent in our pipeline that specifically addresses it.

Each challenge below is live in the agent trace

Pick any scenario, click Run Pipeline, and watch each agent address its assigned challenges in sequence. The Schema Discovery step shows all 33 tables scored. The Guardrail step shows every RLS and PII check. The Ontology step shows the exact context block injected into the LLM.
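To make the table-scoring and RLS-check ideas above concrete, here is a minimal sketch that ranks catalog tables by keyword overlap with the question's terms and then reports which selected tables carry row-level security. Keyword overlap is a stand-in chosen for simplicity, not the lab's actual scoring method. Apart from `core.clients` and its RLS flag, every table name, description, and function name here is a hypothetical placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    name: str
    description: str
    has_rls: bool = False  # row-level security flag


# Hypothetical catalog; only core.clients and its RLS flag come from the text.
CATALOG = [
    TableInfo("core.clients", "client master record with advisor assignment", has_rls=True),
    TableInfo("finance.fee_revenue", "monthly advisory fee revenue by account"),
    TableInfo("finance.aum_snapshots", "month-end assets under management by account"),
    TableInfo("hr.employees", "employee directory and office location"),
]


def score_table(table: TableInfo, terms: set) -> float:
    """Fraction of query terms found in the table name or description."""
    if not terms:
        return 0.0
    name_words = table.name.replace(".", " ").replace("_", " ")
    haystack = set((name_words + " " + table.description).lower().split())
    return len(terms & haystack) / len(terms)


def discover(terms, top_k=2):
    """Score every table and return the top_k (score, table) pairs."""
    lowered = {t.lower() for t in terms}
    scored = [(score_table(t, lowered), t) for t in CATALOG]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def rls_tables(selected):
    """Names of selected tables that need an RLS check before SQL is written."""
    return [table.name for _, table in selected if table.has_rls]


if __name__ == "__main__":
    for score, table in discover({"revenue", "AUM", "fee"}):
        print(f"{table.name}: {score:.2f}")
    print(rls_tables(discover({"clients", "advisor"}, top_k=1)))
```

Running the RLS check on the discovery output, before any SQL exists, mirrors the point made earlier that access constraints should be known before the query is written.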

Interactive Agent Trace

Choose a Business Question

Multi-Agent Pipeline

Agent Trace

"Why is revenue down while AUM is up?"

→ Raw NL question
→ User role & domain session context

diagnostic · revenue_management · 87% confidence

Entities

revenue, AUM, fee_rate

Metrics Required

net_revenue, assets_under_management, effective_fee_rate

Time Range

trailing 12 months, monthly grain
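One way to turn "trailing 12 months, monthly grain" into concrete buckets is sketched below. It assumes the window covers the 12 complete calendar months before the as-of date and excludes the current partial month. The trace does not say which convention the lab uses, so treat that choice, and the function name `trailing_month_starts`, as assumptions.

```python
from datetime import date


def trailing_month_starts(as_of: date, months: int = 12) -> list:
    """First day of each of the last `months` complete months, oldest first.

    Assumption: the month containing `as_of` is partial and is excluded.
    """
    starts = []
    year, month = as_of.year, as_of.month
    for _ in range(months):
        month -= 1
        if month == 0:  # stepped back past January, roll into the prior year
            year, month = year - 1, 12
        starts.append(date(year, month, 1))
    return list(reversed(starts))


if __name__ == "__main__":
    for bucket in trailing_month_starts(date(2024, 3, 15)):
        print(bucket.isoformat())
```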

Ambiguities Resolved Before Planning

⚡ "Revenue" could mean gross advisory fees, net revenue, or total firm revenue; assuming advisory fees
⚡ "AUM" could be client-level, advisor-level, or firm-level; assuming firm-level
⚡ No time period specified; defaulting to trailing 12 months
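The ambiguity handling shown in the trace can be sketched as a small lookup that applies a default reading unless the user supplied one, and records which path was taken so later stages can surface the assumption. The default values are copied from the trace. The dictionary keys, the `resolve_ambiguities` name, and the "user"/"assumed" labels are hypothetical.

```python
# Default readings copied from the trace; the key names are hypothetical.
DEFAULTS = {
    "revenue": "advisory fees",
    "AUM": "firm-level",
    "time_range": "trailing 12 months",
}


def resolve_ambiguities(explicit=None):
    """Return {term: (meaning, source)}, where source is "user" or "assumed".

    Explicit user choices win; anything left unspecified falls back to the
    default and is tagged "assumed" so it can be disclosed in the answer.
    """
    explicit = explicit or {}
    resolved = {}
    for term, default in DEFAULTS.items():
        if term in explicit:
            resolved[term] = (explicit[term], "user")
        else:
            resolved[term] = (default, "assumed")
    return resolved


if __name__ == "__main__":
    for term, (meaning, source) in resolve_ambiguities({"AUM": "advisor-level"}).items():
        print(f"{term}: {meaning} ({source})")
```

Keeping the source tag alongside each resolved meaning lets a downstream synthesis step state its assumptions explicitly instead of silently baking them into the SQL.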

The SuperML Take

Most NL→SQL demos fail because they treat business questions as query problems. Real systems treat them as reasoning problems.