PRD 60: RAG v3 — Top-10 Competitive Upgrade

Status: Draft
Priority: Critical
Effort: 34-44 hours (phased, includes 4h prerequisite S3 migration fix + 4h PRD-19 multimodal resurrection)
Dependencies: PRD-08 (RAG v2, completed), PRD-19 (Multimodal Knowledge — partially completed, 5 bugs), PRD-42/46 (Cloud Doc Sync — caused DocumentWidget breakage), PRD-09 (Context Real Data)


Executive Summary

Deep competitive research against the top 10 open-source RAG projects (Dify, LangChain, RAGFlow, Pathway, Flowise, LlamaIndex, Quivr, LightRAG, Haystack, txtai) revealed 8 critical gaps in Automatos AI's RAG implementation. While our mathematical foundations (Knapsack DP, Shannon entropy, RRF fusion) are strong, the system lacks features that every top-10 project delivers: document return to users, proper hybrid search, advanced document parsing, evaluation feedback loops, parent-child chunk retrieval, and a dedicated search UI.

This PRD addresses all 8 gaps in priority order, transforming Automatos from a "good chunker with vector search" into a complete RAG platform competitive with Dify, RAGFlow, and LlamaIndex.


Current State: What We Have (Strengths)

| Component | File | Status |
| --- | --- | --- |
| 5-strategy SemanticChunker | orchestrator/modules/rag/chunking/semantic_chunker.py | Working |
| 0/1 Knapsack DP optimizer | orchestrator/modules/search/optimization/context_optimizer.py | Working |
| RRF multi-query fusion | orchestrator/modules/rag/service.py:299-349 | Working |
| HyDE + query decomposition | orchestrator/modules/rag/query_enhancer.py | Working |
| Iterative/Agentic RAG | orchestrator/modules/rag/iterative_rag.py | Working |
| pgvector store (4 search modes) | orchestrator/modules/search/vector_store/store.py | Working |
| Cross-encoder reranking | orchestrator/modules/rag/service.py:351-387 | Working |
| Cloud file download (Composio) | orchestrator/modules/rag/services/cloud_file_downloader.py | Working |
| Document upload + ingestion pipeline | orchestrator/modules/rag/ingestion/manager.py | Working |
| Frontend document management | frontend/components/documents/ | Working |
| Frontend semantic search UI | frontend/components/documents/semantic-search.tsx | Working |


Gap Analysis: What's Missing

Gap 1: Document Return to Users (CRITICAL)

Impact: Users search but only see chunks — never the actual document.

Every top-10 project does this: Dify shows inline citations with source links. RAGFlow lets you hover citations to see original content with tables/charts. LlamaIndex returns source nodes with page numbers.

Current behavior: The chatbot tool router (consumers/chatbot/tool_router.py:78-157) groups chunks by source file and generates download links, but:

  • Download links point to hardcoded paths (/var/automatos/documents/{source})

  • No document preview capability

  • No page-level or section-level citations

  • No way to navigate from a chunk back to its location in the original document

  • The semantic search component (frontend/components/documents/semantic-search.tsx) shows chunks with similarity scores but no document context

Gap 2: Hybrid Search (BM25 + Vector) Not Wired

Impact: Keyword matches that vector search alone misses are lost.

Every top-10 project does this: Dify has a configurable semantic/keyword weight slider. RAGFlow uses native Elasticsearch hybrid. Pathway combines Splade + dense vectors.

Current behavior: EnhancedVectorStore.search() (modules/search/vector_store/store.py) has a SearchMode.HYBRID mode that combines (1 - (embedding <=> query)) * 0.7 + ts_rank(...) * 0.3, but:

  • RAGService._get_candidates() (modules/rag/service.py:668-685) hardcodes SearchMode.VECTOR_ONLY

  • The ts_rank function requires a tsvector column on document_chunks — this column likely doesn't exist

  • No BM25 implementation; ts_rank is PostgreSQL full-text search (different algorithm)

  • No configurable weight between semantic and keyword results

Gap 3: Document Parsing Quality

Impact: Tables, images, and structured content are lost during ingestion.

RAGFlow's DeepDoc has OCR, table structure recognition, and 14+ document-aware templates. Our parser is basic.

Current behavior: DocumentManager in ingestion/manager.py handles:

  • PDF via pdfplumber (text extraction only — no table structure recognition)

  • DOCX via python-docx (text only — images/charts dropped)

  • Markdown, Text, Python, JSON (basic text extraction)

  • No OCR capability for scanned PDFs

  • No table extraction as structured data

  • No image extraction or captioning

However: PRD-19 built multimodal processors at orchestrator/modules/rag/ingestion/multimodal/processors.py (728 lines) with Camelot table extraction, Tesseract OCR, GPT-4V image descriptions, and LaTeX formula parsing. This code exists but has 5 bugs preventing it from working (see Phase 4B). Dependencies are already installed (camelot-py, pytesseract, pdfplumber, pillow). Fix, don't rebuild.

Gap 4: Parent-Child Chunk Retrieval

Impact: Small chunks are retrieved for precision, but there is no way to expand to surrounding context.

LangChain's ParentDocumentRetriever stores small chunks for matching but returns parent sections. RAGFlow recently added parent-child chunking.

Current behavior:

  • DocumentChunk dataclass in ingestion/manager.py:92-99 already has parent_content and headers fields

  • These fields are populated during Markdown header-based splitting

  • But _get_candidates() always returns parent_content: None and headers: {} (hardcoded at lines 715-716)

  • The document_chunks table metadata JSONB could store parent references, but nothing reads them back

  • ContextRetrievalEngine._get_surrounding_chunks() returns empty list (placeholder)

Gap 5: Evaluation & Feedback Loop

Impact: No way to know if RAG answers are good — can't improve without measurement.

LlamaIndex has built-in evaluation (faithfulness, relevancy, answer correctness). Haystack has EvaluationResult. txtai has built-in scoring.

Current behavior:

  • No user feedback mechanism (thumbs up/down on RAG answers)

  • No automated evaluation (faithfulness, relevancy, hallucination detection)

  • document_usage table tracks queries and execution time but not answer quality

  • No A/B testing between RAG configurations

  • No ground truth dataset for regression testing

Gap 6: Dedicated Search Results Page

Impact: Users have no standalone way to search their knowledge base.

Dify has a retrieval test panel with similarity scores. RAGFlow has a full search interface. Even Flowise shows Document Store previews.

Current behavior:

  • Semantic search exists as a component (frontend/components/documents/semantic-search.tsx) embedded in the documents page

  • No standalone /search route

  • No full-text search option (only vector similarity)

  • No filters (by file type, date range, tags, source)

  • No pagination or infinite scroll

  • Search results show chunks but not document context

Gap 7: Knowledge Graph / Entity Retrieval

Impact: Missing relationships between concepts, can't do multi-hop reasoning.

LightRAG excels here with dual-level entity + thematic graph retrieval. LlamaIndex has KnowledgeGraphIndex.

Current behavior:

  • EntityExtractor exists (modules/search/services/entity_extractor.py) — extracts entities and relationships

  • document_relationships table is created by EnhancedVectorStore but never populated

  • ContextRetrievalEngine._get_related_documents() returns empty list (placeholder at strategy MULTI_HOP)

  • No graph storage, no graph traversal, no entity linking

Gap 8: Real-Time Index Sync

Impact: Documents updated in cloud storage aren't reflected in search until a manual re-sync.

Pathway is the gold standard — incremental updates via Differential Dataflow.

Current behavior:

  • Cloud sync (services/cloud_sync_service.py) is batch-only, triggered manually

  • No change detection or incremental update

  • Full re-ingestion required for updated documents

  • No webhook listeners for cloud storage change notifications


Existing Frontend Reality (IMPORTANT)

Before detailing implementation, note the scale of the existing frontend (~307 .tsx files, ~85K lines):

| Area | Components | Key Files |
| --- | --- | --- |
| Document Management | 28 components | document-management.tsx (orchestrator with 10+ tabs), document-library.tsx, modern-file-manager.tsx |
| Cloud Storage | Full 5-provider OAuth | cloud-storage-panel.tsx, cloud-storage-connections.tsx (Google Drive, Dropbox, OneDrive, Box, SharePoint via Composio) |
| Chatbot Widgets | 9 widget types | DocumentWidget/index.tsx (444 lines — Content/Chunks tabs), DataWidget, CodeWidget, FileWidget, etc. |
| Semantic Search | Already exists | semantic-search.tsx (AI search with similarity slider, 4 context modes, debounced) |
| Knowledge Base | 10 components | DatabaseQueryExplorer.tsx, SemanticLayerBuilder, QueryTemplatesGrid |
| Context Engineering | 7 components | context-engineering.tsx, configure-rag-modal.tsx |
| Analytics | 15 components | Dashboard, charts, metrics |

Critical Broken Feature: DocumentWidget (S3 Migration)

The DocumentWidget (frontend/components/widgets/DocumentWidget/index.tsx, 444 lines) was a working feature that displayed RAG results in the chatbot with Content tab + Chunks tab (similarity scores, relevance %). It broke after PRD-42/46 migrated to S3 Vectors:

  • Widget expects data.content and data.chunks from backend — these stopped being populated

  • Download creates blob from data.content in memory — content is now in S3, not local

  • The semantic-search.tsx hook may be querying pgvector endpoints that changed

  • Tool router (tool_router.py:78-157) has hardcoded local paths (/var/automatos/documents/)

This must be fixed FIRST before adding new features.


Detailed Implementation Plan

Phase 0: Fix S3 Migration Breakage (4h) — PREREQUISITE

Fix the DocumentWidget and semantic search that broke when PRD-42/46 moved to S3 Vectors.

0.1 Fix DocumentWidget Data Pipeline

File: orchestrator/consumers/chatbot/tool_router.py

The tool router builds document context for chatbot responses. After S3 migration, chunk content and document content must be fetched from the new storage layer:
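For the content fetch, a minimal sketch of what the storage-layer call could look like. The bucket name and key layout here are purely illustrative assumptions — the real key schema comes from the PRD-42/46 migration and must be checked against it:

```python
def build_object_key(workspace_id: str, document_id: str) -> str:
    """Key layout is an illustrative assumption, NOT the real S3 schema."""
    return f"{workspace_id}/documents/{document_id}.txt"


def fetch_document_content(bucket: str, workspace_id: str, document_id: str) -> str:
    """Fetch full document text from S3 instead of the old local path."""
    import boto3  # lazy import so the helper above stays framework-free

    key = build_object_key(workspace_id, document_id)
    obj = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return obj["Body"].read().decode("utf-8")
```

The same fetch replaces the in-memory blob download in the widget: the frontend asks an API endpoint, which streams from S3.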

File: frontend/components/widgets/DocumentWidget/index.tsx

The widget already has Content tab + Chunks tab. Fix the data flow:

  • Ensure data.content is populated by fetching from the correct backend endpoint

  • Ensure data.chunks array with similarity, content, metadata is populated

  • Update download handler to use API endpoint instead of in-memory blob from data.content

0.2 Fix Semantic Search Hook

File: frontend/hooks/use-semantic-search-api.ts

Verify the hook queries the correct backend endpoint after S3 migration. The semantic search component already has the UI — it just needs the data pipeline restored.

File: orchestrator/consumers/chatbot/tool_router.py:78-157


Phase 1: Document Return & Citations (6h) — HIGHEST PRIORITY

Enhance existing components to properly return documents to users. Do NOT create new viewer components — the DocumentWidget and artifact-viewer already exist.

1.1 Backend: Document Content API

File: orchestrator/api/documents.py

Add endpoints for retrieving document content with chunk highlighting:

1.2 Backend: Enhanced Search Response

File: orchestrator/modules/rag/service.py

Modify _get_candidates() to include document_id in every result (already does this at line 711), and add a new method:

This method should:

  1. Call _get_candidates() with the workspace_id

  2. Group results by document_id

  3. For each document, fetch filename/type from documents table

  4. Return grouped results sorted by max similarity per document
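Steps 2 and 4 can be sketched as a pure function over the candidate rows, assuming each row carries the document_id and similarity keys that _get_candidates() already returns:

```python
from collections import defaultdict


def group_by_document(results: list[dict]) -> list[dict]:
    """Group chunk hits by document_id, sorted by each document's best similarity."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for r in results:
        groups[r["document_id"]].append(r)
    docs = [
        {
            "document_id": doc_id,
            "max_similarity": max(c["similarity"] for c in chunks),
            "chunks": sorted(chunks, key=lambda c: c["similarity"], reverse=True),
        }
        for doc_id, chunks in groups.items()
    ]
    return sorted(docs, key=lambda d: d["max_similarity"], reverse=True)
```

Filename and file type (step 3) get attached per group from the documents table before returning.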

1.3 Backend: Search API Endpoint

File: orchestrator/api/documents.py (or create new orchestrator/api/search.py)

1.4 Frontend: Enhance Existing DocumentWidget (NOT a new component)

File: frontend/components/widgets/DocumentWidget/index.tsx (already 444 lines)

This widget already has Content tab + Chunks tab with similarity scores, relevance percentages, and download. After Phase 0 restores data flow, enhance with:

  • "View in Document" button that scrolls to the chunk in full document context

  • Clickable chunk → opens document content with that chunk highlighted via <mark> tags

  • Source document link with file type icon

DO NOT create a new document-viewer.tsx — the DocumentWidget + artifact-viewer.tsx already handle document display.

1.5 Frontend: Enhance Existing Semantic Search (small changes)

File: frontend/components/documents/semantic-search.tsx (already exists with similarity slider, 4 context modes)

Small additions to existing component:

  • Group results by document (accordion/collapsible per document)

  • Show document title + max relevance score at document level

  • "View in Document" button that opens DocumentWidget with chunk highlighted

  • Add document download button per group

1.6 Frontend: Citation Badges in Chat Messages

File: Chat message rendering component in frontend/components/chatbot/

When the LLM response includes document references (from build_tool_context_message):

  • Render clickable citation badges inline (e.g., [1], [2])

  • On click, open DocumentWidget with the relevant chunk

  • The chatbot already has artifact-viewer.tsx for rendering artifacts — leverage that pattern


Phase 2: Wire Up Hybrid Search (4h)

2.1 Add tsvector Column

Migration: Create an Alembic migration (or SQL script)
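A sketch of the migration SQL, assuming the document_chunks table and the search_vector column name used elsewhere in this PRD; a generated column (PostgreSQL 12+) keeps the tsvector in sync without triggers:

```sql
ALTER TABLE document_chunks
    ADD COLUMN IF NOT EXISTS search_vector tsvector
    GENERATED ALWAYS AS (to_tsvector('english', coalesce(content, ''))) STORED;

CREATE INDEX IF NOT EXISTS idx_document_chunks_search_vector
    ON document_chunks USING GIN (search_vector);
```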

2.2 Switch RAG Service to Hybrid Mode

File: orchestrator/modules/rag/service.py

In _get_candidates(), change the search mode from VECTOR_ONLY to HYBRID:

2.3 Verify Hybrid Search in EnhancedVectorStore

File: orchestrator/modules/search/vector_store/store.py

Check that the HYBRID search mode SQL correctly references the search_vector column (not computed to_tsvector on the fly, which is slow):
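The shape to verify against might look like this — a sketch using the document's own 0.7/0.3 weights, with placeholder parameter names:

```sql
-- Sketch: hybrid ranking against the stored search_vector column
SELECT id, content,
       (1 - (embedding <=> :query_embedding)) * 0.7
       + ts_rank(search_vector, plainto_tsquery('english', :query)) * 0.3 AS score
FROM document_chunks
WHERE workspace_id = :workspace_id
ORDER BY score DESC
LIMIT :top_k;
```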

Read the full search() method in store.py to find the HYBRID case and verify/fix the SQL.

2.4 Make Hybrid Weight Configurable

File: orchestrator/modules/rag/service.py (RAGConfig dataclass)

Add:
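The new fields and how they feed the blended score might look like this. Field names mirror the Phase 8.3 settings; treat this as a sketch, not the real RAGConfig:

```python
from dataclasses import dataclass


@dataclass
class HybridWeights:
    """Sketch of the new RAGConfig fields (names taken from Phase 8.3 settings)."""
    hybrid_vector_weight: float = 0.7
    hybrid_keyword_weight: float = 0.3


def blend_scores(vector_score: float, keyword_score: float, w: HybridWeights) -> float:
    """Weighted combination used by the HYBRID mode; weights should sum to 1."""
    return vector_score * w.hybrid_vector_weight + keyword_score * w.hybrid_keyword_weight
```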

Wire this through to EnhancedVectorStore.search() so the weights are configurable via system settings.


Phase 3: Parent-Child Chunk Retrieval (4h)

3.1 Store Parent References During Ingestion

File: orchestrator/modules/rag/ingestion/manager.py

During chunking (especially for Markdown with headers), store the parent chunk ID in metadata:
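A sketch of the metadata write, with parent_chunk_id, chunk_index, and headers as assumed JSONB key names (match them to the actual DocumentChunk fields):

```python
def attach_parent_metadata(chunks: list[dict], parent_id: str, headers: dict) -> list[dict]:
    """Record each child chunk's parent section and position for later expansion."""
    for i, chunk in enumerate(chunks):
        meta = chunk.setdefault("metadata", {})
        meta["parent_chunk_id"] = parent_id  # assumed JSONB key name
        meta["chunk_index"] = i
        meta["headers"] = headers
    return chunks
```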

3.2 Context Expansion in RAG Service

File: orchestrator/modules/rag/service.py

Add a method to expand retrieved chunks to include surrounding context:
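The expansion itself is a simple window over chunk_index-ordered chunks; a sketch with the database fetch abstracted away (in the real service the ordered slice would come from a metadata->>'chunk_index' query):

```python
def expand_with_neighbors(hit_index: int, all_chunks: list[str], window: int = 1) -> str:
    """Join the hit chunk with up to `window` neighbors on each side."""
    lo = max(0, hit_index - window)
    hi = min(len(all_chunks), hit_index + window + 1)
    return "\n".join(all_chunks[lo:hi])
```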

Call this in retrieve() before the Knapsack optimization step, and use expanded_content for the context formatting while keeping the original content for token counting and relevance scoring.

3.3 Store chunk_index Reliably

File: orchestrator/modules/rag/ingestion/manager.py

Verify that every chunk stored in document_chunks has chunk_index in its metadata JSONB. Audit the _store_chunks_to_db() method (or equivalent) and ensure:

This is critical for Phase 3.2 to work.


Phase 4: Improved Document Parsing (6h)

4.1 Table Extraction from PDFs

File: orchestrator/modules/rag/ingestion/manager.py

pdfplumber already supports table extraction. Add:
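A sketch of the table path: each pdfplumber table becomes a Markdown block so chunkers keep rows together. The helper names are illustrative, not existing methods:

```python
def table_to_markdown(rows: list[list[str]]) -> str:
    """Render one extracted table (first row as header) as a Markdown table."""
    header, *body = rows
    md = ["| " + " | ".join(c or "" for c in header) + " |",
          "| " + " | ".join("---" for _ in header) + " |"]
    md += ["| " + " | ".join(c or "" for c in row) + " |" for row in body]
    return "\n".join(md)


def extract_pdf_tables(path: str) -> list[str]:
    """Collect every table pdfplumber finds, one Markdown string per table."""
    import pdfplumber  # lazy import; already a project dependency per this PRD

    tables: list[str] = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            for rows in page.extract_tables():
                if rows:
                    tables.append(table_to_markdown(rows))
    return tables
```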

4.2 Add Page Numbers to Chunk Metadata

When chunking PDF content, preserve the [Page N] markers and store the page number(s) in chunk metadata:
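A sketch of recovering page numbers from chunk text; the [Page N] marker format is taken from this PRD's convention and should be matched to the real extractor:

```python
import re

PAGE_MARKER = re.compile(r"\[Page (\d+)\]")


def pages_for_chunk(chunk_text: str) -> list[int]:
    """Collect page numbers from the [Page N] markers embedded by the PDF extractor."""
    return sorted({int(n) for n in PAGE_MARKER.findall(chunk_text)})
```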

This enables "View on Page 5" links in the frontend.

4.3 OCR for Scanned PDFs (Optional — Lower Priority)

Consider adding pytesseract or surya for OCR:

Dependencies to add: pytesseract, Pillow (optional, skip if not needed for MVP)

4.4 Resurrect PRD-19 Multimodal Pipeline (4h)

Context: PRD-19 built a multimodal document processing system with Table, Image, Formula, and orchestrator processors. The code exists at orchestrator/modules/rag/ingestion/multimodal/processors.py (728 lines) with a supporting API at orchestrator/api/knowledge_multimodal.py, database schema in orchestrator/core/database/init_complete_schema.sql, and a service at orchestrator/modules/rag/services/multimodal_knowledge_tools.py. The bones are solid but 5 bugs prevent it from working.

What exists (working processors):

  • TableProcessor — Camelot-based table extraction from PDFs (lattice + stream methods)

  • ImageProcessor — OCR via Tesseract + AI image descriptions

  • FormulaProcessor — LaTeX detection, variable/operator extraction, domain classification

  • MultimodalDocumentProcessor — Orchestrator that runs all processors on a document

  • Database schema: knowledge_items, kb_tables, kb_images, kb_formulas, knowledge_relationships

  • API endpoints: /types, /items, /upload, /search, /items/{id}, /stats

  • Dependencies already installed: camelot-py, pdfplumber, pytesseract, pandas, pillow

Bug 1: workspace_id Column Missing (CRITICAL)

The API code references workspace_id in 46+ places, but the column doesn't exist in the knowledge_items table schema.

Bug 2: Deprecated GPT-4V Model Name

File: orchestrator/modules/rag/ingestion/multimodal/processors.py (line ~431-472)

Bug 3: Factory Ignores OpenAI Key

File: orchestrator/modules/rag/ingestion/multimodal/processors.py (line ~717-727)

Bug 4: Extracted Content Never Embedded

After multimodal extraction, the content is stored in knowledge_items but the embedding column is always NULL. Fix: After extraction, generate embeddings for extracted content and store them.

Bug 5: Multimodal Not Connected to RAG Ingestion Pipeline

The multimodal processors exist as standalone, but the main DocumentManager.ingest_file() pipeline doesn't call them. Wire it in:

Diagram processor: PRD-19 planned diagram analysis but never built it. Out of scope for this fix — can revisit when there's demand.

4.5 XLSX/CSV as Markdown Tables (existing section)

File: orchestrator/modules/rag/ingestion/manager.py

Currently these formats may not be handled well. Add:
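For CSV this is pure stdlib; XLSX could go through openpyxl or pandas into the same converter. A sketch:

```python
import csv
import io


def csv_to_markdown(raw: str) -> str:
    """Convert CSV text into a Markdown table so rows survive chunking intact."""
    rows = [r for r in csv.reader(io.StringIO(raw)) if r]
    if not rows:
        return ""
    header, *body = rows
    md = ["| " + " | ".join(header) + " |",
          "| " + " | ".join("---" for _ in header) + " |"]
    md += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(md)
```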


Phase 5: Evaluation & Feedback Loop (4h)

5.1 Database: Feedback Table

Create migration:
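A sketch of the rag_feedback DDL; column names are assumptions derived from the feedback UI in 5.3:

```sql
CREATE TABLE IF NOT EXISTS rag_feedback (
    id            BIGSERIAL PRIMARY KEY,
    workspace_id  TEXT NOT NULL,
    message_id    TEXT,
    query         TEXT NOT NULL,
    rating        SMALLINT CHECK (rating BETWEEN 1 AND 5),
    thumbs_up     BOOLEAN,
    correction    TEXT,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_rag_feedback_workspace
    ON rag_feedback (workspace_id, created_at);
```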

5.2 Backend: Feedback API

File: orchestrator/api/context.py (or new orchestrator/api/rag_feedback.py)
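Framework aside, the request payload and its validation rules might look like this — field names are assumptions matching the 5.3 UI, not an existing schema:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RagFeedback:
    """Sketch of the POST body for a hypothetical feedback endpoint."""
    query: str
    thumbs_up: Optional[bool] = None
    rating: Optional[int] = None
    correction: Optional[str] = None

    def validate(self) -> None:
        if self.rating is not None and not 1 <= self.rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        if self.thumbs_up is None and self.rating is None and not self.correction:
            raise ValueError("feedback must include a thumb, rating, or correction")
```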

5.3 Frontend: Feedback UI in Chat

When a chatbot message uses RAG (detected by presence of document citations), show:

  • Thumbs up / thumbs down buttons

  • Optional 1-5 star rating on expand

  • "Suggest correction" text area

File: Add to the chat message component in frontend/components/chat/ or frontend/components/chatbot/

5.4 Backend: Automated Relevance Scoring

File: orchestrator/modules/rag/service.py

Add post-retrieval relevance validation:

Log these scores to document_usage for trending analysis.


Phase 6: Dedicated Search Route (2h) — Reduced Scope

Note: semantic-search.tsx already exists as a full component with AI-powered search, similarity slider, and 4 context modes ('documents', 'chatbot', 'agent', 'patterns'). It's currently embedded in the documents page. This phase just gives it a standalone route and adds filters.

6.1 Frontend: Search Route

File: Create frontend/app/search/page.tsx

6.2 Frontend: Enhance Existing Semantic Search for Standalone Mode

File: frontend/components/documents/semantic-search.tsx (MODIFY, not create new)

Add a standalone prop that, when true:

  • Shows a larger search bar at top

  • Adds filter sidebar: file type checkboxes, date range, tags

  • Adds search mode toggle: "Semantic" vs "Keyword" vs "Hybrid" (once Phase 2 is done)

  • Adds sort options: Relevance, Date, Name

  • Shows document-grouped results (from Phase 1.5 enhancements)

DO NOT create a separate knowledge-search.tsx — enhance the existing component.

6.3 Add Search to Navigation

File: The main sidebar/navigation component

Add a "Search" link to the sidebar navigation.


Phase 7: Knowledge Graph Foundation (6h)

This is the largest phase and can be deferred if timeline is tight. However, it's what separates top-5 from top-10.

7.1 Populate document_relationships Table

File: orchestrator/modules/search/vector_store/store.py

The document_relationships table already exists (created by EnhancedVectorStore._ensure_tables()). Currently never populated.

During document ingestion, after chunking:

7.2 Implement Multi-Hop Retrieval

File: orchestrator/modules/search/retrieval/context_retrieval_engine.py

Replace the placeholder _get_related_documents():
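A sketch of the multi-hop walk, treating document_relationships rows as an adjacency list (the real method would load edges with a workspace-scoped query):

```python
from collections import deque


def related_documents(start: str, edges: dict[str, list[str]], max_hops: int = 2) -> set[str]:
    """Breadth-first walk over relationship edges, up to max_hops away from start."""
    seen = {start}
    frontier = deque([(start, 0)])
    related: set[str] = set()
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                related.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return related
```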

7.3 Entity Extraction During Ingestion

File: orchestrator/modules/rag/ingestion/pipeline.py

Add a post-ingestion step that calls entity extraction:

Make this configurable (off by default since it's LLM-intensive).


Phase 8: Polish & Integration (4h)

8.1 Update RAG Configuration UI

File: frontend/components/context/configure-rag-modal.tsx

Add toggles for:

  • Hybrid search enable/disable

  • Vector/keyword weight slider (when hybrid enabled)

  • Parent-child expansion enable/disable

  • Expansion window size (1-3 chunks)

8.2 Update Context Engineering Dashboard

File: frontend/components/context/context-engineering.tsx

Add metrics cards for:

  • Hybrid search hit rate (keyword matches vs vector matches)

  • Average chunk expansion ratio

  • Feedback score trends

  • Entity relationship count

8.3 Update System Settings

File: Relevant system settings components

Add RAG v3 settings:

  • hybrid_search_enabled (boolean, default: true)

  • hybrid_vector_weight (float, default: 0.7)

  • hybrid_keyword_weight (float, default: 0.3)

  • parent_child_expansion (boolean, default: true)

  • expansion_window (int, default: 1)

  • entity_extraction_enabled (boolean, default: false)

  • ocr_enabled (boolean, default: false)


File Change Summary

Backend Files to Modify

| File | Phase | Changes |
| --- | --- | --- |
| orchestrator/consumers/chatbot/tool_router.py | 0, 1 | Fix hardcoded local paths, use document API endpoints, populate DocumentWidget data |
| orchestrator/modules/rag/service.py | 1, 2, 3, 5 | Add search_with_document_context(), wire hybrid search, add _expand_to_parent_context(), add _score_retrieval_quality() |
| orchestrator/modules/rag/ingestion/manager.py | 3, 4 | Add _extract_pdf_with_tables(), _extract_spreadsheet(), page numbers, parent chunk refs, chunk_index |
| orchestrator/modules/search/vector_store/store.py | 2, 7 | Verify HYBRID SQL uses search_vector column, add relationship population |
| orchestrator/modules/search/retrieval/context_retrieval_engine.py | 3, 7 | Implement _get_related_documents() and _get_surrounding_chunks() |
| orchestrator/api/documents.py | 1 | Add GET /{id}/content, GET /search endpoints |
| orchestrator/modules/rag/ingestion/multimodal/processors.py | 4B | Fix GPT-4V model name, fix factory OpenAI key passthrough |
| orchestrator/api/knowledge_multimodal.py | 4B | Fix workspace_id references (46+ places) |
| orchestrator/modules/rag/services/multimodal_knowledge_tools.py | 4B | Fix SQL schema references |
| orchestrator/modules/rag/ingestion/manager.py | 4B | Wire multimodal processors into ingestion pipeline |

Backend Files to Create

| File | Purpose |
| --- | --- |
| orchestrator/api/rag_feedback.py | Feedback submission and stats endpoints |
| SQL migration | Database migration for search_vector column + rag_feedback table + knowledge_items.workspace_id |

Frontend Files to Modify (Enhance Existing — NOT Create New)

| File | Phase | Changes |
| --- | --- | --- |
| frontend/components/widgets/DocumentWidget/index.tsx | 0, 1 | Fix broken data pipeline (S3 migration), add "View in Document" link |
| frontend/hooks/use-semantic-search-api.ts | 0 | Fix endpoint after S3 migration |
| frontend/components/documents/semantic-search.tsx | 1, 6 | Group by document, add standalone mode with filters |
| frontend/components/context/configure-rag-modal.tsx | 8 | Add hybrid search, parent-child, expansion settings |
| frontend/components/context/context-engineering.tsx | 8 | Add new metric cards |
| Navigation/sidebar component | 6 | Add Search link |
| Chat message component | 1 | Add citation badges |

Frontend Files to Create (Minimal — Only What Doesn't Exist)

| File | Purpose |
| --- | --- |
| frontend/app/search/page.tsx | Search route (wraps existing semantic-search.tsx) |
| frontend/components/chat/rag-feedback.tsx | Thumbs up/down + rating UI |
| frontend/components/chat/citation-badge.tsx | Inline citation rendering |

Removed from original plan (already exist):

  • frontend/components/documents/document-viewer.tsx → Use existing DocumentWidget + artifact-viewer.tsx

  • frontend/components/search/knowledge-search.tsx → Enhance existing semantic-search.tsx


Priority Matrix

| Phase | Feature | Impact | Effort | Priority |
| --- | --- | --- | --- | --- |
| 0 | Fix S3 Migration Breakage | CRITICAL | 4h | P0 — Do First (prerequisite) |
| 1 | Document Return & Citations | CRITICAL | 6h (reduced — existing components) | P0 — Do First |
| 2 | Hybrid Search (BM25 + Vector) | HIGH | 4h | P0 — Do First |
| 3 | Parent-Child Chunk Retrieval | HIGH | 4h | P1 — Do Second |
| 4 | Document Parsing (tables, OCR) | MEDIUM | 6h | P1 — Do Second |
| 4B | Resurrect PRD-19 Multimodal (5 bug fixes) | HIGH | 4h | P1 — Do with Phase 4 |
| 5 | Evaluation & Feedback | HIGH | 4h | P1 — Do Second |
| 6 | Dedicated Search Route | MEDIUM | 2h (reduced — enhance existing) | P2 — Do Third |
| 7 | Knowledge Graph Foundation | LOW (now) | 6h | P3 — Future |
| 8 | Polish & Integration | LOW | 4h | P3 — Future |

Recommended order: Phase 0 → Phase 1 → Phase 2 → Phase 3 → Phase 4 + 4B → Phase 5 → Phase 6 → Phase 7 → Phase 8


Success Criteria


Competitive Positioning After Implementation

| Feature | Dify | RAGFlow | LlamaIndex | Automatos (After) |
| --- | --- | --- | --- | --- |
| Chunking Strategies | 2 (basic, custom) | 14+ templates | 20+ splitters | 5 semantic + LangChain |
| Hybrid Search | Yes | Yes (native) | Yes | Yes (pgvector + tsvector) |
| Document Return | Citations + links | Hoverable refs | Source nodes | Viewer + highlights |
| Parent-Child | No | Yes (recent) | Yes | Yes |
| Table Extraction | Via plugins | DeepDoc (excellent) | Via LlamaParse | pdfplumber tables |
| Feedback Loop | Basic | No | Evaluation suite | Ratings + auto-scoring |
| Math Optimization | No | No | No | Knapsack DP (unique) |
| Multi-Query RRF | Agent Node | No | SubQuestionQuery | Built-in |
| Agentic RAG | Yes | Yes | Yes | Iterative + cognitive tools |
| Search UI | Retrieval test | Full UI | No (library) | Full page + filters |


Technical Notes for Implementation

Database Connection Patterns

  • ORM queries: Use db: Session = Depends(get_db) for SQLAlchemy ORM operations

  • Raw SQL in API routes: Use db.execute(text("...")) with SQLAlchemy text()

  • Raw SQL in services: Use asyncpg.connect(db_url) for async operations (as in _get_candidates)

  • Workspace isolation: Always filter by workspace_id from RequestContext

Auth Pattern

Frontend API Pattern

Embedding Dimension

The system uses configurable embedding dimensions via system_settings.vector_store_dimensions (default: 1024). Don't hardcode dimensions.

Existing Search Hook

frontend/hooks/use-semantic-search-api.ts — existing SWR hook for semantic search. Extend or create new hooks for the enhanced search endpoints.


Estimated Total Effort: 34-44 hours (all phases, including Phase 0 prerequisite + Phase 4B multimodal)
MVP (Phases 0-3): 18 hours (includes fixing S3 breakage)
Core (Phases 0-5 + 4B): 32 hours (includes multimodal resurrection)
Priority: Critical — Phase 0 (S3 fix) is a regression that must be fixed regardless
Dependencies: PRD-08 completed, PRD-19 (Multimodal Knowledge — partially completed, has 5 bugs), PRD-42/46 (Cloud Doc Sync — caused the breakage)

Last updated