PRD 61: NL2SQL v2 — Top-10 Competitive Upgrade

Status: Draft
Priority: High
Effort: 35-45 hours
Dependencies: PRD-21 (Database Knowledge), PRD-30 (Modular Architecture)
Created: 2026-02-18
Research Base: Deep analysis of 9 leading open-source NL2SQL projects + Automatos codebase audit


Executive Summary

Deep research across the top NL2SQL/Text-to-SQL open-source projects (PandasAI 23.2K stars, Vanna 22.7K, DB-GPT 16K, WrenAI, Dataherald 3.6K, SQLCoder 4K, XiYan-SQL, DBHub, SQLBot 5.5K) reveals that Automatos already has a surprisingly strong NL2SQL implementation — more complete than most open-source alternatives. However, critical gaps remain in error correction, training data management, benchmark-quality generation, and MCP exposure that prevent it from being competitive with the best.

This PRD closes those gaps across 8 phases, taking Automatos from "good internal tool" to "competitive with Vanna/Dataherald" level.


Part 1: How the Top Projects Work

Tier 1: Full Platforms (Closest Competitors)

1. Vanna (22.7K stars) — RAG-Based NL2SQL

  • Core Innovation: Uses vector stores to store DDL schemas, documentation, and verified question/SQL pairs. Retrieves similar examples at generation time for few-shot prompting.

  • Pipeline: generate_sql() → run_sql() → generate_plotly_code() → get_plotly_figure()

  • Training: vn.train(question, sql) / vn.add_ddl() / vn.add_documentation() — builds a RAG corpus of schema + examples

  • Auto-train: If query fails, automatically adds correction to training data

  • Follow-ups: generate_followup_questions() suggests next questions

  • License: MIT

  • Key Lesson: RAG-based few-shot example retrieval dramatically improves SQL accuracy

2. PandasAI (23.2K stars) — Code Generation + SQL

  • Core Innovation: Generates Python code (not just SQL) for data analysis. Agent-based architecture routes between SQL queries and pandas operations.

  • Connectors: MySQL, PostgreSQL, Snowflake, Databricks (production license required for Snowflake/Databricks)

  • License: MIT + EE directory (enterprise features require commercial license)

  • Key Lesson: Smart routing between SQL and code-based analysis; already partially adopted in Automatos via PandasAI integration

3. DB-GPT (16K stars) — Full AI Data Platform

  • Core Innovation: AWEL (Agentic Workflow Expression Language) DAG-based orchestration. Five-stage Text2SQL pipeline: Query Understanding → Schema Linking → SQL Generation → Execution → Visualization

  • Databases: MySQL, PostgreSQL, Oracle, MSSQL, ClickHouse, DuckDB, Hive, Spark (8 databases)

  • Benchmark: 82.5% Spider accuracy

  • Architecture: Modular packages (dbgpt-core, dbgpt-ext, dbgpt-serve, dbgpt-client)

  • Key Lesson: Modular package architecture; schema linking as explicit pipeline stage; fine-tuning framework for Text2SQL

4. WrenAI — Semantic Layer Focus

  • Core Innovation: Modeling Definition Language (MDL) for semantic layer. Strong emphasis on business metric definitions over raw SQL.

  • API: SQL Generation + Chart Generation endpoints (commercial plan required)

  • Key Lesson: Semantic layer as first-class citizen improves accuracy for business queries

Tier 2: Specialized Tools

5. Dataherald (3.6K stars) — Enterprise NL2SQL API

  • Core Innovation: 4 SQL generator implementations (Langchain Agent, Langchain Chain, LlamaIndex, Dataherald Agent). Agent with 7 specialized tools. "Golden SQL" training system.

  • Agent Tools: QuerySQLDatabase, GetCurrentTime, TablesSQLDatabase, SchemaSQLDatabase, InfoRelevantColumns, ColumnEntityChecker, GetFewShotExamples

  • Databases: PostgreSQL, BigQuery, Databricks, Snowflake

  • Training: Golden SQL pairs stored in vector DB + used for fine-tuning (min 20-30 samples/table)

  • Key Lesson: Golden SQL concept; agent with specialized database tools; self-correction

6. SQLCoder (4K stars) — Fine-Tuned Model

  • Core Innovation: Purpose-built fine-tuned model for Text-to-SQL (based on StarCoder/Code Llama)

  • Key Lesson: Fine-tuned models can beat GPT-4 on specific domains at lower cost

7. XiYan-SQL — MCP-Native NL2SQL

  • Core Innovation: State-of-the-art benchmark scores delivered via MCP server. Remote mode (API) or local mode.

  • Databases: MySQL, PostgreSQL

  • Key Lesson: MCP as delivery mechanism for NL2SQL; 2-22% better than raw database MCP servers

8. DBHub — MCP Database Gateway

  • Core Innovation: Zero-dependency MCP server for multi-database access. Custom parameterized SQL tools.

  • Databases: PostgreSQL, MySQL, SQL Server, MariaDB, SQLite

  • Key Lesson: MCP + custom tools pattern; read-only mode with row limits

9. SQLBot (5.5K stars) — Conversational SQL

  • Core Innovation: Multi-turn conversational interface for SQL

  • Key Lesson: Conversation state management for iterative query refinement

Benchmark Reality Check

| Method | Spider EX | Notes |
|---|---|---|
| MCS-SQL + GPT-4 | 89.6% | With Spider training data |
| CHASE-SQL + Gemini 1.5 | 87.6% | Without training data |
| DAIL-SQL + GPT-4 | 86.6% | With training data |
| DIN-SQL + GPT-4 | 85.3% | With training data |
| DB-GPT | 82.5% | Reported |
| Spider 2.0 (best) | 23.8% | Real-world enterprise complexity |
| Real enterprise datasets | ~24% | "NL2SQL is a solved problem... Not!" |

Critical insight: Academic benchmarks (85-91%) dramatically overstate real-world accuracy (~24% on enterprise schemas). Schema linking brittleness causes 10-20% accuracy drops on paraphrased queries. Models are only effective for ~20% of realistic user queries.


Part 2: Automatos Current State (Honest Assessment)

What Already Works (Strengths)

Automatos has a substantial, production-ready NL2SQL system across 16+ files in orchestrator/modules/nl2sql/:

| Feature | Status | Quality | Files |
|---|---|---|---|
| NL→SQL Generation | Working | Production | query/nl2sql_service.py (343 lines) |
| SQL Validation (3-tier) | Working | Excellent | query/validator.py (254 lines) |
| Schema Introspection | Working | Good | schema/introspection.py |
| Schema Provider + Cache | Working | Good | schema/provider.py (374 lines) |
| Smart Agent (multi-turn) | Working | Good | intelligence/agent.py (403 lines) |
| Query Clarification | Working | Good | intelligence/clarifier.py (261 lines) |
| Query Rephrasing | Working | Basic | intelligence/rephraser.py |
| Result Explanation | Working | Basic | intelligence/explainer.py |
| Visualization Suggestion | Working | Good | intelligence/visualizer.py (310 lines) |
| PandasAI Integration | Working | Good | tools/services/pandas_ai_service.py |
| Semantic Layer (metrics/dims) | Working | Good | core/models/database_knowledge.py |
| Database Tool Auto-creation | Working | Good | tools/services/database_tool_integration.py |
| Full REST API | Working | Complete | api/database_knowledge.py (612 lines) |
| Multi-tenant Isolation | Working | Excellent | Workspace-based throughout |
| Audit Trail | Working | Complete | database_query_audit table |
| Frontend UI | Working | Extensive | DatabaseQueryExplorer.tsx (NL input, SQL, results, viz), SemanticLayerBuilder, QueryTemplatesGrid, DataWidget (chatbot results with table/pagination/CSV export), DataVisualizationModal, AddDatabaseModal, plus 15 analytics components |
| Credential Management | Working | Secure | Encrypted, runtime-resolved |
| Query Templates (20+) | Working | Good | Per-dialect templates |
| Analytics Dashboard | Working | Basic | api/database_analytics.py |

Total: ~3,500+ lines of working NL2SQL code

What's Missing (Gaps vs Top Projects)

| Gap | Impact | Who Has It | Priority |
|---|---|---|---|
| 1. RAG-Based Few-Shot Examples | Accuracy drops 15-25% without examples | Vanna, Dataherald | Critical |
| 2. SQL Error Self-Correction | First-try failures become dead ends | Vanna (auto-train), DB-GPT, Dataherald | Critical |
| 3. Golden SQL Training System | No way to improve over time | Vanna, Dataherald | High |
| 4. Schema Linking Stage | Sending full schema wastes tokens, reduces accuracy | DB-GPT, CHASE-SQL, DAIL-SQL | High |
| 5. Multi-Database Connectors | Only PostgreSQL + MySQL | DB-GPT (8), Dataherald (4) | Medium |
| 6. MCP Server Exposure | NL2SQL not available via MCP | XiYan-SQL, DBHub | Medium |
| 7. Query Confidence Scoring | Users can't judge result reliability | Dataherald | Medium |
| 8. SQL Generation Benchmarking | No way to measure/track accuracy | All serious projects | Medium |

5 Bugs Found

| # | Bug | Severity | File | Detail / Fix |
|---|---|---|---|---|
| 1 | No table relevance scoring — sends ALL tables to LLM | High | query/nl2sql_service.py:_get_relevant_tables() | Uses keyword matching only; needs embedding-based similarity |
| 2 | Clarifier patterns too aggressive — triggers on simple queries | Medium | intelligence/clarifier.py | Time-range pattern matches "last" in "last_name"; grouping matches "by" in "baby" |
| 3 | Validator allows subqueries with mutations | Medium | query/validator.py | Nested INSERT INTO ... SELECT bypasses single-statement check |
| 4 | Schema cache never invalidates on DDL changes | Medium | schema/provider.py | _schema_cache uses TTL but no hook for ALTER TABLE / new columns |
| 5 | Visualizer chart config missing axis labels | Low | intelligence/visualizer.py | _build_chart_config() returns x/y fields but no labels/titles for charts |


Part 3: Build vs. Adopt Analysis

The Question

"Is it worth dropping the in-house version and adopting a really good open-source project?"

Verdict: BUILD ON EXISTING (Enhance, Don't Replace)

Why NOT to adopt Vanna/Dataherald/DB-GPT:

| Concern | Detail |
|---|---|
| Multi-tenant isolation | None of the 9 projects support workspace-based multi-tenancy. You'd have to retrofit it — harder than building from scratch. |
| Credential management | Automatos uses encrypted, runtime-resolved credentials via CredentialStore. Open-source projects use connection strings in config files. |
| Agent tool integration | PRD-17 auto-creates 3 database tools per source. No open-source project has this. |
| Audit trail | Your database_query_audit table with user/agent/session tracking is more complete than any open-source project's. |
| Architecture fit | FastAPI + SQLAlchemy + workspace isolation. Vanna is a standalone library, DB-GPT is a full platform, Dataherald is Docker-only. None fit cleanly into your modular architecture. |
| License risk | PandasAI EE requires a commercial license. WrenAI's API is paid. DB-GPT is AGPL-friendly but a massive dependency. |
| Migration cost | Estimated 80-120 hours to rip out the existing code, integrate, retrofit multi-tenancy, and restore feature parity. |

What TO adopt (techniques, not codebases):

| Technique | Source | Effort | Impact |
|---|---|---|---|
| RAG-based few-shot retrieval | Vanna | 8h | +15-25% accuracy |
| Golden SQL training pairs | Dataherald | 4h | Continuous improvement |
| SQL error self-correction loop | DB-GPT, Vanna | 6h | Reduces failure rate 30-50% |
| Explicit schema linking stage | CHASE-SQL, DB-GPT | 4h | Better token efficiency + accuracy |
| MCP tool exposure | XiYan-SQL, DBHub | 3h | New integration channel |

Bottom line: Your existing implementation is worth ~3,500 lines of working, multi-tenant, production-ready code. Adopting would cost more than enhancing. Instead, adopt the techniques (RAG for SQL, Golden SQL, self-correction) that the top projects use.

Same Verdict for RAG (from PRD-60)

The RAG system has a similar story — substantial existing implementation (semantic chunker, hybrid search, ingestion pipeline). Adopt techniques (parent-child chunks, RRF, knowledge graph) not codebases.


Existing Frontend Reality

The NL2SQL frontend is already extensive — unlike RAG (where the DocumentWidget broke), the database UI is largely intact:

| Component | File | What It Does |
|---|---|---|
| DatabaseQueryExplorer.tsx | frontend/components/knowledge/ | NL query input, generated SQL display, query results table, visualization toggle, query history, source selection |
| SemanticLayerBuilder | frontend/components/knowledge/ | Define metrics, dimensions, business terms |
| QueryTemplatesGrid | frontend/components/knowledge/ | 20+ query templates per dialect |
| DataWidget/index.tsx | frontend/components/widgets/ | Chatbot widget — table with pagination, search, CSV export, sort, chart view |
| DataVisualizationModal | frontend/components/knowledge/ | Full chart rendering |
| SimpleDataVisualization | frontend/components/knowledge/ | Inline chart in query explorer |
| AddDatabaseModal | frontend/components/knowledge/ | Database connection setup |
| document-management.tsx | frontend/components/documents/ | Main orchestrator with tabs for all of the above |

Frontend work in this PRD is minimal; only three small additions are needed:

  1. TrainingExamplesManager.tsx (Phase 5) — genuinely new, adds as tab in existing structure

  2. Confidence badge in DatabaseQueryExplorer.tsx (Phase 6) — small addition to existing component

  3. Benchmark charts in DatabaseQueryAnalytics.tsx (Phase 8) — small addition to existing component


Part 4: Implementation Plan

Phase 1: RAG-Based Few-Shot Example Retrieval (8h) — CRITICAL

What: Store verified question/SQL pairs in vector store. At query time, retrieve the most similar examples and include them in the LLM prompt (like Vanna's core approach).

Why: This is the single highest-impact improvement. Vanna's entire value proposition is built on this pattern. Few-shot examples improve accuracy by 15-25% on real-world queries.

Backend Changes

New file: orchestrator/modules/nl2sql/training/example_store.py
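A minimal sketch of what the example store could look like. The class and method names are illustrative, and plain token-overlap similarity stands in here for the vector-store retrieval the real implementation would use:

```python
from dataclasses import dataclass


@dataclass
class TrainingExample:
    question: str
    sql: str
    verified: bool = False


class ExampleStore:
    """Stores verified question/SQL pairs and retrieves the most similar
    ones for few-shot prompting. Token-overlap (Jaccard) similarity is a
    stand-in; production retrieval would use embedding vectors."""

    def __init__(self) -> None:
        self._examples: list[TrainingExample] = []

    def add(self, question: str, sql: str, verified: bool = False) -> None:
        self._examples.append(TrainingExample(question, sql, verified))

    @staticmethod
    def _similarity(a: str, b: str) -> float:
        ta, tb = set(a.lower().split()), set(b.lower().split())
        return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

    def retrieve(self, question: str, k: int = 3) -> list[TrainingExample]:
        # Rank all stored examples by similarity to the incoming question
        ranked = sorted(
            self._examples,
            key=lambda ex: self._similarity(question, ex.question),
            reverse=True,
        )
        return ranked[:k]
```

The retrieved pairs would then be rendered into the generation prompt, mirroring Vanna's core pattern.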

Modify: orchestrator/modules/nl2sql/query/nl2sql_service.py

Add few-shot examples to the SQL generation prompt:
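The injection itself could be as simple as rendering the retrieved pairs into a prompt section. The function name and the dict shape of the examples are illustrative:

```python
def build_few_shot_block(examples: list[dict]) -> str:
    """Render retrieved question/SQL pairs as a few-shot section to be
    prepended to the SQL generation prompt."""
    if not examples:
        return ""
    lines = ["Here are similar questions answered previously:"]
    for ex in examples:
        lines.append(f"-- Question: {ex['question']}")
        lines.append(ex["sql"])
    return "\n".join(lines)
```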

Database Migration

API Endpoints

Files to modify:

  • orchestrator/modules/nl2sql/query/nl2sql_service.py — inject examples into prompt

  • orchestrator/api/database_knowledge.py — add example management endpoints

  • orchestrator/core/models/database_knowledge.py — add SQLAlchemy model


Phase 2: SQL Error Self-Correction Loop (6h) — CRITICAL

What: When generated SQL fails execution, automatically retry with error context. If retry succeeds, optionally auto-save as training example (like Vanna's auto_train).

Why: Currently, a failed SQL query is a dead end. DB-GPT, Vanna, and Dataherald all implement self-correction. This alone can reduce failure rates by 30-50%.

Backend Changes

Modify: orchestrator/modules/nl2sql/service.py — query_database() method

Modify: orchestrator/modules/nl2sql/query/nl2sql_service.py

Add error context to prompt building:
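The correction loop can be sketched as follows. The `generate_sql` and `execute_sql` callables are placeholders for the real service methods, and the retry cap is an assumed setting:

```python
MAX_CORRECTION_ATTEMPTS = 2  # assumed default; would be configurable


def query_with_correction(generate_sql, execute_sql, question: str) -> dict:
    """Retry SQL generation with the execution error fed back into the
    prompt, in the spirit of Vanna/DB-GPT self-correction."""
    error_context = None
    for attempt in range(MAX_CORRECTION_ATTEMPTS + 1):
        sql = generate_sql(question, error_context)
        try:
            rows = execute_sql(sql)
            return {"sql": sql, "rows": rows, "attempts": attempt + 1}
        except Exception as exc:
            # Feed the failing SQL and the database error back into
            # the next generation attempt.
            error_context = f"Previous SQL failed:\n{sql}\nError: {exc}"
    raise RuntimeError(
        f"SQL failed after {MAX_CORRECTION_ATTEMPTS + 1} attempts. {error_context}"
    )
```

On a successful retry, the corrected pair could optionally be saved as an unverified training example (Vanna's auto-train behavior).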

Files to modify:

  • orchestrator/modules/nl2sql/service.py — add correction loop

  • orchestrator/modules/nl2sql/query/nl2sql_service.py — add error context to prompts

  • orchestrator/api/database_knowledge.py — add auto_correct parameter to query endpoint


Phase 3: Explicit Schema Linking Stage (4h) — HIGH

What: Before SQL generation, run a dedicated schema linking step that identifies the most relevant tables/columns for the query. Send only relevant schema to the LLM instead of the full schema.

Why: Current implementation uses keyword matching in _get_relevant_tables() which is brittle. DB-GPT and CHASE-SQL use explicit schema linking as a separate pipeline stage. This improves both accuracy (less noise) and token efficiency (smaller prompts).

Backend Changes

New file: orchestrator/modules/nl2sql/query/schema_linker.py

Modify: orchestrator/modules/nl2sql/query/nl2sql_service.py

Replace _get_relevant_tables() with schema linker:
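A minimal sketch of the linker. Token overlap over table and column names stands in for the embedding similarity the PRD calls for; the class shape is illustrative:

```python
class SchemaLinker:
    """Scores tables for relevance to the question and returns only the
    top-k, so the LLM prompt carries relevant schema instead of all of it."""

    def __init__(self, schema: dict[str, list[str]]) -> None:
        # schema: table name -> list of column names
        self.schema = schema

    def link(self, question: str, k: int = 5) -> list[str]:
        q_tokens = set(question.lower().replace("_", " ").split())

        def score(table: str) -> int:
            # Count question tokens appearing in the table or column names
            terms = set(table.lower().split("_"))
            for col in self.schema[table]:
                terms |= set(col.lower().split("_"))
            return len(q_tokens & terms)

        ranked = sorted(self.schema, key=score, reverse=True)
        hits = [t for t in ranked[:k] if score(t) > 0]
        # Fall back to the top-k rather than the full schema when
        # nothing matches (addresses Bug 1's send-everything behavior).
        return hits or ranked[:k]
```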

Files to modify:

  • orchestrator/modules/nl2sql/query/nl2sql_service.py — replace _get_relevant_tables()

  • orchestrator/modules/nl2sql/__init__.py — export SchemaLinker


Phase 4: Bug Fixes (3h) — HIGH

Fix the 5 bugs identified during the code audit.

Bug 1: Table Relevance Scoring (in nl2sql_service.py)

Current: Keyword-only matching in _get_relevant_tables() — sends all tables if no keywords match.

Fix: Use embedding similarity + keyword matching. This is partially addressed by Phase 3 (Schema Linker), but as an immediate fix:

Bug 2: Clarifier False Positives (in clarifier.py)

Current: Regex patterns match inside words — "last_name" triggers time_range, "baby" triggers grouping.

Fix: Use word boundary matching:

File: orchestrator/modules/nl2sql/intelligence/clarifier.py
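A sketch of the corrected patterns. The exact pattern set in clarifier.py may differ; the point is the `\b` anchors (note that `_` is a word character, so `\blast\b` no longer fires inside "last_name"):

```python
import re

# Word-boundary patterns: "last_name" no longer triggers the time-range
# clarifier, and "baby" no longer triggers the grouping clarifier.
TIME_RANGE = re.compile(
    r"\b(last|past|previous)\s+(\d+\s+)?(day|week|month|year)s?\b", re.I
)
GROUP_BY = re.compile(r"\bby\b", re.I)
```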

Bug 3: Subquery Mutation Bypass (in validator.py)

Current: Only checks top-level for INSERT/UPDATE/DELETE. Nested subqueries can contain mutations.

Fix: Parse recursively:

File: orchestrator/modules/nl2sql/query/validator.py
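One simple way to close the hole, sketched below: scan the whole statement (with string literals stripped) for mutation keywords rather than only the leading keyword. This is stricter than a true recursive parse — a column literally named `update` would be rejected — which is usually the right trade-off for a read-only gateway:

```python
import re

MUTATION_KEYWORDS = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|CREATE|GRANT)\b", re.I
)


def strip_literals(sql: str) -> str:
    """Blank out single-quoted string literals so keywords inside quoted
    text (e.g. WHERE note = 'please update') don't trigger rejection."""
    return re.sub(r"'(?:[^']|'')*'", "''", sql)


def contains_mutation(sql: str) -> bool:
    """Reject mutations anywhere in the statement, not just at the top
    level, so nested INSERT INTO ... SELECT is caught too."""
    return bool(MUTATION_KEYWORDS.search(strip_literals(sql)))
```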

Bug 4: Schema Cache Invalidation (in provider.py)

Fix: Add cache invalidation method + hook on introspection:

File: orchestrator/modules/nl2sql/schema/provider.py
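A sketch of the shape, assuming a simple per-source TTL cache (class and method names are illustrative, not the actual provider API). Introspection, or an admin action after `ALTER TABLE`, calls `invalidate()` to force a refresh before the TTL expires:

```python
import time
from typing import Optional


class SchemaCache:
    """TTL cache with an explicit invalidation hook."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl = ttl_seconds
        # source_id -> (timestamp, cached schema)
        self._entries: dict[str, tuple[float, dict]] = {}

    def get(self, source_id: str) -> Optional[dict]:
        entry = self._entries.get(source_id)
        if entry and time.monotonic() - entry[0] < self.ttl:
            return entry[1]
        return None  # missing or expired

    def set(self, source_id: str, schema: dict) -> None:
        self._entries[source_id] = (time.monotonic(), schema)

    def invalidate(self, source_id: Optional[str] = None) -> None:
        """Drop one source's entry, or everything when no id is given."""
        if source_id is None:
            self._entries.clear()
        else:
            self._entries.pop(source_id, None)
```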

Bug 5: Chart Config Missing Labels (in visualizer.py)

Fix: Add axis labels to chart config:

File: orchestrator/modules/nl2sql/intelligence/visualizer.py
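The fix amounts to deriving human-readable labels from the field names when building the config. The config shape below is illustrative, not the exact visualizer output:

```python
def build_chart_config(chart_type: str, x_field: str, y_field: str,
                       title: str = "") -> dict:
    """Chart config including axis labels and a title derived from the
    field names, so charts are no longer unlabeled."""

    def pretty(name: str) -> str:
        # "total_revenue" -> "Total Revenue"
        return name.replace("_", " ").title()

    return {
        "type": chart_type,
        "x": {"field": x_field, "label": pretty(x_field)},
        "y": {"field": y_field, "label": pretty(y_field)},
        "title": title or f"{pretty(y_field)} by {pretty(x_field)}",
    }
```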


Phase 5: Golden SQL Training System (4h) — HIGH

What: A complete system for managing verified question/SQL pairs, inspired by Dataherald's "Golden SQL" concept. Users can verify auto-generated pairs, import external examples, and track which examples are most effective.

Frontend: Training Examples Manager

New component: frontend/components/knowledge/TrainingExamplesManager.tsx

Integration note: The existing document-management.tsx orchestrator already has 10+ tab panels (CodeGraphPanel, MultimodalKnowledgePanel, DatabaseQueryExplorer, QueryTemplatesGrid, SemanticLayerBuilder, etc.). This new component should be added as a new tab/panel within that existing structure, NOT as a standalone page.

The existing DatabaseQueryExplorer.tsx already has NL query input, generated SQL display, query results, visualization toggle, and query history. The TrainingExamplesManager should complement it — accessed via a "Training Examples" tab or button within the database section.

Features:

  • List all training examples for a database source

  • Filter by verified/unverified/auto-generated

  • Inline edit SQL

  • One-click verify button

  • Import from CSV/JSON

  • Usage statistics (how often each example is retrieved)

  • Delete/archive examples

API Endpoints (in api/database_knowledge.py)

Files to create:

  • frontend/components/knowledge/TrainingExamplesManager.tsx

Files to modify:

  • orchestrator/api/database_knowledge.py — add endpoints

  • orchestrator/core/models/database_knowledge.py — add NL2SQLTrainingExample model


Phase 6: Query Confidence Scoring (3h) — MEDIUM

What: Assign a confidence score to each generated SQL query before execution, so users know how reliable the result is likely to be.

Why: Dataherald implements this. Users need to know when to trust automated SQL and when to review manually.

Scoring Factors
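The factor set is not enumerated in this PRD, so the sketch below assumes a plausible one (few-shot example similarity, schema-link score, validation outcome) combined as a weighted average; factor names and weights are placeholders:

```python
from typing import Optional


def score_confidence(factors: dict[str, float],
                     weights: Optional[dict[str, float]] = None) -> float:
    """Combine normalized (0-1) scoring factors into a single confidence
    value via a weighted average. Unweighted factors default to 1.0."""
    weights = weights or {}
    total = sum(weights.get(k, 1.0) for k in factors)
    if total == 0:
        return 0.0
    combined = sum(v * weights.get(k, 1.0) for k, v in factors.items())
    return round(combined / total, 2)
```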

Files to create:

  • orchestrator/modules/nl2sql/query/confidence.py

Files to modify:

  • orchestrator/modules/nl2sql/service.py — integrate scorer

  • orchestrator/api/database_knowledge.py — return confidence in query response

  • frontend/components/knowledge/DatabaseQueryExplorer.tsx — display confidence badge (small addition — component already has NL input, SQL display, results table, chart toggle, and query history)


Phase 7: MCP Server Exposure (3h) — MEDIUM

What: Expose NL2SQL as an MCP tool so external agents (Claude Desktop, Cursor, etc.) can query databases through Automatos.

Why: XiYan-SQL and DBHub prove that MCP is the dominant integration pattern for database tools. Automatos already has MCP gateway infrastructure (PRD-33).

MCP Tool Definitions
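As a sketch, one of the proposed tools in standard MCP tool-definition shape (name/description/inputSchema). The tool name and parameters are hypothetical; actual registration goes through the PRD-33 gateway:

```python
# Hypothetical definition for the NL2SQL query tool; the gateway would
# expose two more (e.g. schema inspection and example management).
QUERY_DATABASE_TOOL = {
    "name": "nl2sql_query",
    "description": (
        "Answer a natural-language question by generating and running "
        "read-only SQL against a registered database source."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "question": {"type": "string",
                         "description": "Natural-language question"},
            "source_id": {"type": "string",
                          "description": "Registered database source id"},
            "max_rows": {"type": "integer", "default": 100},
        },
        "required": ["question", "source_id"],
    },
}
```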

Files to modify:

  • orchestrator/modules/tools/services/database_tool_integration.py — add MCP tool definitions

  • MCP gateway configuration


Phase 8: Benchmarking & Monitoring (4h) — MEDIUM

What: Implement a benchmarking system to measure and track NL2SQL accuracy over time.

Why: "What gets measured gets improved." Without benchmarks, you can't know if changes help or hurt.

Benchmark Test Suite

New file: orchestrator/modules/nl2sql/benchmarks/runner.py
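A sketch of the runner using execution-accuracy scoring (a case passes when the rows returned for the generated SQL match the expected rows, order-insensitive — the comparison the comparator.py module would own). `answer(question)` is a placeholder for the full NL2SQL pipeline:

```python
from dataclasses import dataclass


@dataclass
class BenchmarkCase:
    question: str
    expected_rows: list


def run_benchmark(cases: list, answer) -> dict:
    """Run every case through the pipeline and report execution accuracy."""
    passed = 0
    failures = []
    for case in cases:
        rows = answer(case.question)
        # Order-insensitive row comparison
        if sorted(rows) == sorted(case.expected_rows):
            passed += 1
        else:
            failures.append(case.question)
    return {
        "total": len(cases),
        "passed": passed,
        "accuracy": passed / len(cases) if cases else 0.0,
        "failures": failures,
    }
```

Each run's report would be persisted (the nl2sql_benchmark_runs table) so accuracy can be charted over time.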

API Endpoints

Monitoring Dashboard

Add to existing DatabaseQueryAnalytics.tsx (part of the 15 existing analytics components):

  • Accuracy trend chart (benchmark scores over time)

  • Error category breakdown (syntax, schema linking, wrong aggregation, etc.)

  • Latency percentiles (p50, p95, p99)

  • Most common failure patterns

Note: The chatbot DataWidget (frontend/components/widgets/DataWidget/index.tsx) already displays NL2SQL results with table, pagination (25 rows/page), search, CSV export, sort, and chart view toggle. No changes needed to the chatbot widget — enhancements are to the analytics/management UI only.

Files to create:

  • orchestrator/modules/nl2sql/benchmarks/runner.py

  • orchestrator/modules/nl2sql/benchmarks/comparator.py

Files to modify:

  • orchestrator/api/database_knowledge.py — add benchmark endpoints

  • frontend/components/context/DatabaseQueryAnalytics.tsx — add accuracy charts


Priority Matrix

| Phase | Effort | Impact | Priority | Dependencies |
|---|---|---|---|---|
| 1. RAG Few-Shot Examples | 8h | Critical (+15-25% accuracy) | P0 | None |
| 2. Error Self-Correction | 6h | Critical (-30-50% failures) | P0 | None |
| 3. Schema Linking | 4h | High (token efficiency + accuracy) | P1 | None |
| 4. Bug Fixes | 3h | High (correctness) | P1 | None |
| 5. Golden SQL Training | 4h | High (continuous improvement) | P1 | Phase 1 |
| 6. Confidence Scoring | 3h | Medium (user trust) | P2 | Phase 3 |
| 7. MCP Exposure | 3h | Medium (integration channel) | P2 | None |
| 8. Benchmarking | 4h | Medium (measurability) | P2 | Phase 5 |

MVP (Phases 1-4): 21 hours — gets you to competitive parity with Vanna/Dataherald
Full (all phases): 35 hours — best-in-class for a multi-tenant NL2SQL platform


Competitive Comparison (After Implementation)

| Feature | Automatos (Current) | Automatos (After PRD-61) | Vanna | Dataherald | DB-GPT |
|---|---|---|---|---|---|
| NL→SQL Generation | Basic LLM | RAG-enhanced + few-shot | RAG-based | Agent-based | Schema-linked |
| Error Self-Correction | None | Auto-retry + context | Auto-train | Agent retry | LLM feedback |
| Training Data Mgmt | None | Golden SQL system | train() API | Golden SQL API | Fine-tuning hub |
| Schema Linking | Keyword-only | Embedding + semantic layer | DDL retrieval | Schema+column tools | Dedicated stage |
| SQL Validation | Excellent (3-tier) | Excellent (3-tier + subquery fix) | Basic | Basic | Basic |
| Multi-tenant | Full isolation | Full isolation | None | None | None |
| Audit Trail | Complete | Complete | None | Basic | None |
| Confidence Scoring | None | Multi-factor scoring | None | Partial | None |
| MCP Tools | None | 3 tools exposed | None | None | None |
| Benchmarking | None | Automated + tracking | None | None | Fine-tuning eval |
| Databases | 2 (PG, MySQL) | 2 (PG, MySQL) | 8+ | 4 | 8 |
| Agent Tools | Auto-created | Auto-created + MCP | None | 7 agent tools | Agent framework |
| Visualization | 10 chart types | 10 chart types + labels | Plotly | None | GPT-Vis |
| Semantic Layer | Metrics + dimensions | Metrics + dimensions | None | None | None |
| License | Proprietary | Proprietary | MIT | MIT | MIT |


Files Summary

New Files

| File | Phase | Purpose |
|---|---|---|
| orchestrator/modules/nl2sql/training/example_store.py | 1 | RAG-based example storage/retrieval |
| orchestrator/modules/nl2sql/training/__init__.py | 1 | Package init |
| orchestrator/modules/nl2sql/query/schema_linker.py | 3 | Explicit schema linking stage |
| orchestrator/modules/nl2sql/query/confidence.py | 6 | Query confidence scoring |
| orchestrator/modules/nl2sql/benchmarks/runner.py | 8 | Benchmark test runner |
| orchestrator/modules/nl2sql/benchmarks/comparator.py | 8 | SQL result comparison |
| orchestrator/modules/nl2sql/benchmarks/__init__.py | 8 | Package init |
| frontend/components/knowledge/TrainingExamplesManager.tsx | 5 | Training examples UI |

Modified Files

| File | Phases | Changes |
|---|---|---|
| orchestrator/modules/nl2sql/service.py | 2, 6 | Add correction loop, confidence scoring |
| orchestrator/modules/nl2sql/query/nl2sql_service.py | 1, 2, 3, 4 | Few-shot injection, error context, schema linker, bug fix |
| orchestrator/modules/nl2sql/query/validator.py | 4 | Subquery mutation fix |
| orchestrator/modules/nl2sql/intelligence/clarifier.py | 4 | Word boundary fix |
| orchestrator/modules/nl2sql/intelligence/visualizer.py | 4 | Chart labels fix |
| orchestrator/modules/nl2sql/schema/provider.py | 4 | Cache invalidation |
| orchestrator/modules/nl2sql/__init__.py | 1, 3, 6 | Export new classes |
| orchestrator/api/database_knowledge.py | 1, 2, 5, 6, 8 | New endpoints |
| orchestrator/core/models/database_knowledge.py | 1, 5 | New models |
| orchestrator/modules/tools/services/database_tool_integration.py | 7 | MCP tool definitions |
| frontend/components/knowledge/DatabaseQueryExplorer.tsx | 6 | Confidence display |
| frontend/components/context/DatabaseQueryAnalytics.tsx | 8 | Benchmark charts |

Database Migration

| Table | Phase | Purpose |
|---|---|---|
| nl2sql_training_examples | 1 | Training example storage |
| nl2sql_benchmark_runs | 8 | Benchmark history |
| nl2sql_benchmark_results | 8 | Individual benchmark results |


Success Criteria


Out of Scope (Future PRDs)

  • Additional database connectors (Snowflake, BigQuery, Redshift, ClickHouse)

  • Fine-tuned model training (like SQLCoder approach)

  • Multi-database JOINs (cross-source queries)

  • Natural language to NoSQL (MongoDB, etc.)

  • Real-time schema change detection (webhooks from database)

  • A/B testing different generation strategies

  • Query plan optimization suggestions

  • Row-level security integration


Estimated Total Effort: 35-45 hours
MVP (Phases 1-4): 21 hours
Priority: High
Dependencies: PRD-21 (completed), PRD-30 (completed)
