PRD-68: Progressive Complexity Routing — Atom → Organism
Version: 2.0 Status: Ready for Implementation Priority: P0 Author: Gar Kavanagh + Auto CTO Created: 2026-02-28 Updated: 2026-02-28 Dependencies: PRD-50 (Universal Router — COMPLETE), PRD-59 (Workflow Engine V2 — MERGED), PRD-67 (CTO Agent — COMPLETE) Branch: ralph/progressive-complexity-routing
Executive Summary
Automatos has 850+ tools, 350+ models, multi-agent workflows, and a Neural Swarm architecture. But every chat message — from "Hi" to "Refactor the auth system" — runs through the same monolithic pipeline: regex intent classification → hardcoded tool category filter → full memory fetch → LLM with 15 pre-filtered tools.
This PRD makes the Atom → Organism progressive complexity model the platform's primary routing architecture. AutoBrain — which already sits at the front door of every chat request — evolves from a binary gate (RESPOND/DELEGATE) into an LLM-driven complexity assessor. The assessment threads through existing components that are already wired to receive it but currently get no data.
This is not a new system. The plumbing exists. We finish it.
What We're Building
An AI platform that feels like ChatGPT for "Hi" (0.5s, 100 tokens) and seamlessly transforms into a multi-agent swarm for "Refactor the auth system" (5 min, 12K tokens). Dynamic scaling of intelligence — the user never knows they're using a different pipeline.
What We're Deleting
~1,200 lines of dead code: mock API endpoints, deprecated streaming methods, legacy service stubs. One streaming format. One flow.
1. Current State (What's Wired Today)
The Flow
```
api/chat.py:306           → AutoBrain.assess() → ComplexityAssessment
api/chat.py:406           → passes complexity_assessment to service
service.py:1259           → accepts complexity_assessment parameter
service.py:1371           → passes to smart_chat.prepare(complexity_assessment=...)
integration.py:94         → passes to smart_orchestrator
smart_orchestrator.py:162 → checks complexity_assessment.needs_memory  ← FIELD DOESN'T EXIST YET
smart_orchestrator.py:199 → checks complexity_assessment.tool_hints    ← FIELD DOESN'T EXIST YET
```
The wiring from chat.py → service.py → integration.py → smart_orchestrator.py is COMPLETE. The downstream branching code is partially written — it checks needs_memory and tool_hints fields that don't exist on the ComplexityAssessment dataclass yet.
What AutoBrain Returns Today
Missing fields: needs_memory, tool_hints, needs_multi_agent
What AutoBrain Does Today
Pure regex. Returns RESPOND for atoms/platform/memory, DELEGATE for everything else. The complexity field is set but never drives downstream behavior. Everything that hits DELEGATE gets Complexity.MOLECULE regardless of actual complexity.
2. Dead Code Deletion (Do First)
Delete before building. Clean house.
DELETE: api/multi_agent.py (567 lines)
Why: All 6 endpoints call EnhancedOrchestratorService methods that return {"status": "legacy_mock"}. Mounted in main.py:696 but completely non-functional.
Steps:
1. Delete `orchestrator/api/multi_agent.py`
2. Remove the import from `orchestrator/main.py` (line ~53): `from api.multi_agent import router as multi_agent_router`
3. Remove the registration from `orchestrator/main.py` (line ~696): `app.include_router(multi_agent_router)`
DELETE: api/field_theory.py (552 lines)
Why: Same as above. All endpoints return legacy mocks. Mounted in main.py:697 but non-functional.
Steps:
1. Delete `orchestrator/api/field_theory.py`
2. Remove the import from `orchestrator/main.py` (line ~54): `from api.field_theory import router as field_theory_router`
3. Remove the registration from `orchestrator/main.py` (line ~697): `app.include_router(field_theory_router)`
DELETE: Legacy mock methods in modules/orchestrator/service.py
Why: FieldManager, CoordinationManager, and 7 mock methods (update_field_context, propagate_field_influence, etc.) all return {"status": "legacy_mock"}.
Steps:
1. Delete lines ~298-347 (legacy class stubs and mock methods)
2. Keep the `EnhancedOrchestratorService` class itself — it's still imported by `api/orchestrator.py` for the task decomposition endpoint
DELETE: stream_response_aisdk() in consumers/chatbot/service.py
Why: Explicitly deprecated at line 502. Emits a deprecation warning. Not called from any active path. stream_response_with_agent() is the only active method.
Steps:
Delete the method (lines ~493-530)
DELETE: Legacy SSE format support in consumers/workflows/streaming.py
Why: We standardize on the AI SDK Data Stream format (0:, d:, e: prefixes). The legacy data: {json}\n\n format is not used by the active frontend.
Steps:
1. Remove any legacy format functions/comments
2. Keep only the AI SDK format methods (`format_aisdk_*`)
Total deletion: ~1,200 lines.
3. The Implementation (5 Changes to 5 Files)
Change 1: Evolve ComplexityAssessment dataclass (auto.py)
File: orchestrator/consumers/chatbot/auto.py, Line: 55
BEFORE:
AFTER:
Why: smart_orchestrator.py already checks complexity_assessment.needs_memory (line 162) and complexity_assessment.tool_hints (line 199). These fields just need to exist and be populated.
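A sketch of the evolved dataclass: the three new field names (needs_memory, tool_hints, needs_multi_agent) come from this PRD, but the surrounding fields and the Complexity enum shape are assumed, not the file's actual contents.

```python
# Hedged sketch of the evolved ComplexityAssessment. Only the three new
# fields are specified by the PRD; everything else here is illustrative.
from dataclasses import dataclass, field
from enum import Enum


class Complexity(Enum):
    ATOM = "atom"
    MOLECULE = "molecule"
    CELL = "cell"
    ORGAN = "organ"
    ORGANISM = "organism"


@dataclass
class ComplexityAssessment:
    complexity: Complexity
    action: str                 # RESPOND / DELEGATE / WORKFLOW
    confidence: float = 1.0
    reason: str = ""
    # New fields read by smart_orchestrator.py (lines 162 / 199):
    needs_memory: bool = False
    tool_hints: list = field(default_factory=list)
    needs_multi_agent: bool = False
```

Defaulting the new fields keeps every existing construction site valid — no call needs to change just because the dataclass grew.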
Change 2: Add LLM-driven assessment to AutoBrain.assess() (auto.py)
File: orchestrator/consumers/chatbot/auto.py, Method: assess() at line 145
Keep the existing regex fast-paths for ATOM (greetings) and platform queries. Add a Redis cache layer and an LLM classification call for everything else.
New assess() logic:
The _llm_classify() method:
The _cache_lookup() and _cache_store() methods:
The _get_agent_summaries() method:
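A minimal runnable sketch of the combined flow — regex fast-path, cache lookup, LLM classification with a safe fallback. The cache and classifier are injected as plain callables so the sketch runs standalone; the real code would use Redis and the platform's LLM manager, and the greeting pattern here is illustrative.

```python
# Sketch only: assumes a dict-like cache and an injected llm_classify
# callable. Real implementation: Redis + core/llm/manager.py.
import hashlib
import json
import re

GREETING_RE = re.compile(r"^\s*(hi|hey|hello|thanks|thank you)\b[\s!.?]*$", re.I)

FALLBACK = {"complexity": "MOLECULE", "action": "DELEGATE",
            "needs_memory": False, "tool_hints": [], "needs_multi_agent": False}


def assess(message, cache, llm_classify):
    # Fast path: trivial greetings never touch the cache or the LLM.
    if GREETING_RE.match(message):
        return {"complexity": "ATOM", "action": "RESPOND",
                "needs_memory": False, "tool_hints": [], "needs_multi_agent": False}

    key = "cplx:" + hashlib.sha256(message.lower().strip().encode()).hexdigest()
    if (hit := cache.get(key)) is not None:
        return json.loads(hit)

    try:
        result = llm_classify(message)
    except Exception:
        # Never block the user: fall back to today's behavior
        # (DELEGATE as MOLECULE) if the assessor model is unavailable.
        result = dict(FALLBACK)
    cache[key] = json.dumps(result)
    return result
```

The try/except also answers the fallback question in section 10: an assessor outage degrades to current behavior rather than erroring.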
Model Configuration:
Add to core/llm/manager.py SERVICE_CATEGORY_MAP:
Default in .env.example:
This means: any model, any provider, swappable at runtime via system settings. Free Llama 8B by default, swap to Haiku/Flash/whatever performs best.
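As a concrete illustration, the two .env.example lines might look like this — the variable names come from the File Impact Summary below, but the provider and model values are placeholders, not the project's actual defaults:

```shell
# Hypothetical defaults — provider/model values are illustrative only;
# swappable at runtime via system settings per the PRD.
COMPLEXITY_ASSESSOR_LLM_PROVIDER=groq
COMPLEXITY_ASSESSOR_LLM_MODEL=llama-3.1-8b-instant
```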
Change 3: ATOM fast-path in service.py
File: orchestrator/consumers/chatbot/service.py, Location: lines 1340-1374 (the SmartChatIntegration block)
BEFORE (lines 1353-1374):
AFTER:
What this does:
- ATOM: Skips get_chat_tools() (saves ~50ms DB query) and skips SmartChatIntegration.prepare() (saves intent classification, memory fetch, tool routing). Just builds a minimal system prompt and goes straight to the LLM.
- Everything else: Runs the existing pipeline with the complexity assessment flowing through (memory skip / tool_hints already handled by smart_orchestrator.py).
Delete: Remove the is_simple = self.prompt_analyzer.is_simple_message(latest_text) check entirely. AutoBrain's ATOM detection replaces it. prompt_analyzer.is_simple_message() becomes dead code for this path.
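The shape of the ATOM branch, reduced to a standalone sketch — the helper names and the assessment dict are illustrative, not the actual service.py code:

```python
# Hypothetical sketch of the ATOM fast-path decision. build_minimal_prompt
# and full_pipeline stand in for the real service.py code paths.
def build_minimal_prompt(agent_name):
    # No tool descriptions, no memory: ~100 tokens instead of ~500.
    return f"You are {agent_name}. Reply briefly and helpfully."


def prepare_context(assessment, agent_name, full_pipeline):
    if assessment["complexity"] == "ATOM":
        # Skip get_chat_tools() and SmartChatIntegration.prepare() entirely.
        return {"system_prompt": build_minimal_prompt(agent_name), "tools": []}
    # Everything else runs the existing pipeline with the assessment
    # threaded through for the needs_memory / tool_hints branching.
    return full_pipeline(assessment)
```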
Change 4: tool_hints integration in SmartToolRouter (smart_tool_router.py)
File: orchestrator/consumers/chatbot/smart_tool_router.py, Method: route() at line ~185
smart_orchestrator.py:199 already passes tool_hints to the router. The router needs to use them.
Add to the route() method — BEFORE existing logic:
What this does: When AutoBrain says tool_hints: ["email"], the router searches ALL available tools for "email" in their name or description. No hardcoded category dict. "email" finds GMAIL_SEND_EMAIL, OUTLOOK_SEND_MAIL, composio_email_*, etc. Tools become discoverable instead of filterable.
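A sketch of that hint-matching block, assuming tools are records with name and description fields (the real router's tool shape may differ):

```python
# Substring discovery over name + description — no hardcoded category
# dict. Tool record shape (dict with "name"/"description") is assumed.
def match_by_hints(tools, tool_hints):
    if not tool_hints:
        return []
    hints = [h.lower() for h in tool_hints]
    return [
        t for t in tools
        if any(h in t["name"].lower() or h in t.get("description", "").lower()
               for h in hints)
    ]
```

Because the match is a plain substring scan, a hint like "email" finds GMAIL_SEND_EMAIL and OUTLOOK_SEND_MAIL (via its description) without either being registered under an "email" category.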
Change 5: api/chat.py — ORGAN/ORGANISM workflow bridge
File: orchestrator/api/chat.py, Location: after line 313 (the existing action branching)
BEFORE (line 314):
AFTER:
What this does: Adds the WORKFLOW action handling as a clean branch. Phase 1 treats WORKFLOW same as DELEGATE (the LLM still handles it, just with richer context from the assessment). Phase 2 will bridge to execute_workflow_with_progress().
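The branch shape, reduced to a skeleton — the action names come from the PRD, but the handler callables stand in for the real chat.py paths:

```python
# Illustrative branch shape only; respond/delegate are placeholders for
# the real chat.py handlers.
def handle_action(action, assessment, respond, delegate):
    if action == "RESPOND":
        return respond()
    if action == "WORKFLOW":
        # Phase 1: same as DELEGATE, but the richer assessment flows
        # through. Phase 2 will call execute_workflow_with_progress() here.
        return delegate(assessment)
    return delegate(assessment)
```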
4. CTO Agent Compatibility (PRD-67)
The CTO detection in api/chat.py:298 runs BEFORE AutoBrain:
CTO bypasses AutoBrain entirely. This is correct — CTO Auto always gets full Cell-level context (tools + memory + codebase access). No complexity assessment needed for the platform builder.
The CTO path remains unchanged. PRD-68 only affects the else branch (non-admin, non-explicit-agent).
5. Streaming Format — Standardize on AI SDK
Decision: AI SDK Data Stream format (0:, d:, e: prefixes) for everything.
The chat frontend already handles:
- `0:"text chunk"` — streaming text
- `d:{"type":"tool-start",...}` — tool lifecycle
- `d:{"type":"tool-end",...}` — tool completion
- `d:{"type":"workflow-update",...}` — workflow progress (US-015 widget)
- `d:{"type":"finish",...}` — response complete
When Phase 2 bridges chat → workflow, workflow stage events map to d: events:
No format adapter needed. Same protocol, same parser, same frontend.
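A sketch of that mapping, using the d:-prefixed framing listed above — the helper names and payload fields are illustrative:

```python
# Maps a workflow stage event onto the d:-prefixed data-stream framing
# this PRD standardizes on. Payload fields are assumed, not the real
# streaming.py schema.
import json


def format_aisdk_data(payload):
    # Data parts are "d:"-prefixed and newline-terminated.
    return "d:" + json.dumps(payload, separators=(",", ":")) + "\n"


def workflow_stage_to_sse(stage, status):
    return format_aisdk_data(
        {"type": "workflow-update", "stage": stage, "status": status}
    )
```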
6. File Impact Summary
| File | Change | Phase |
|---|---|---|
| api/multi_agent.py | DELETE (567 lines) | 1 |
| api/field_theory.py | DELETE (552 lines) | 1 |
| main.py | Remove imports + registration for the above 2 files | 1 |
| modules/orchestrator/service.py | Delete legacy mock methods + classes (~40 lines) | 1 |
| consumers/chatbot/service.py | Delete stream_response_aisdk() (~50 lines). Add ATOM branch (~30 lines). Remove is_simple_message() usage. | 1 |
| consumers/chatbot/auto.py | Add 3 fields to dataclass. Add _llm_classify(), _cache_lookup(), _cache_store(), _get_agent_summaries() methods. (~120 lines new) | 1 |
| consumers/chatbot/smart_tool_router.py | Add tool_hints parameter + hint-matching block (~25 lines) | 1 |
| api/chat.py | Add WORKFLOW action branch (~15 lines) | 1 |
| core/llm/manager.py | Add "complexity_assessor" to SERVICE_CATEGORY_MAP (1 line) | 1 |
| .env.example | Add COMPLEXITY_ASSESSOR_LLM_PROVIDER/MODEL (2 lines) | 1 |
| consumers/workflows/streaming.py | Remove legacy format code (cleanup) | 1 |
Net: ~1,200 lines deleted. ~200 lines added.
7. Phasing
Phase 1: Core Routing (This PR)
- Delete dead code (multi_agent.py, field_theory.py, legacy mocks, deprecated method)
- Add fields to ComplexityAssessment dataclass
- Add LLM classification + Redis cache to AutoBrain.assess()
- Add ATOM fast-path in service.py (skip tools/memory/orchestration)
- Add tool_hints to SmartToolRouter
- Add WORKFLOW branch placeholder in chat.py
- Add complexity_assessor to LLM manager service map
Deliverable: "Hi" → 0.5s, 100 tokens. "Send email" → correctly classified as MOLECULE with tool_hints: ["email"], tools discovered by hint match. "Reply to that thread" → CELL with needs_memory: true. "Refactor auth" → ORGAN with needs_multi_agent: true (falls through to delegate for now).
Phase 2: Workflow Bridge (Follow-up PR)
- When action == WORKFLOW, create a transient workflow from the chat message
- Execute via execute_workflow_with_progress() with PhaseSelector
- Stream stage events as AI SDK d: events back to chat
- Display multi-agent progress inline in the chat UI
Deliverable: "Research the bug, plan a fix, open a PR" → user sees planning, execution, and results streamed back in chat, powered by the full PRD-59 Neural Swarm pipeline.
8. Verification & Metrics
| Metric | Before | After |
|---|---|---|
| "Hi" latency | ~2s (loads tools, runs intent classifier, fetches memory) | <1s (ATOM bypass) |
| "Hi" token cost | ~500 tokens (system prompt + tool descriptions) | ~100 tokens (minimal prompt) |
| "Send email" tool accuracy | Depends on regex matching "EXTERNAL_ACTION" | LLM classifies with tool_hints: ["email"], router discovers email tools |
| Cache hit rate (repeat patterns) | 0% (no cache) | >70% after 1 week |
| Complexity assessment cost | $0 (regex) | ~$0.001 per uncached message (Haiku/Llama) |
| Dead code | ~1,200 lines of mocks/stubs | 0 |
9. Non-Goals
NOT replacing the UniversalRouter — AutoBrain classifies complexity + action. The router (for DELEGATE) still picks which agent handles it. Different jobs.
NOT replacing SmartIntentClassifier — Intent classification still runs for MOLECULE/CELL paths. It helps tool routing when tool_hints aren't enough. But it no longer gates the pipeline.
NOT rewriting the workflow engine — PRD-59's PhaseSelector and Neural Swarm remain the ORGAN/ORGANISM execution vehicle. Phase 2 just connects chat to it.
NOT building new modules — No new files except possibly a test file. This is rewiring, not building.
10. Open Questions
Cache scope: Should the complexity cache be per-workspace (current design) or global? Same message in different workspaces might have different complexities based on available agents. Recommendation: Per-workspace. Different agent configurations = different complexity assessments.
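If the per-workspace recommendation is adopted, the cache key would scope on the workspace id — a sketch, with the key prefix and normalization assumed:

```python
# Hypothetical per-workspace cache key: same message, different
# workspace → different key, so agent-dependent assessments never
# bleed across workspaces.
import hashlib


def complexity_cache_key(workspace_id, message):
    digest = hashlib.sha256(message.lower().strip().encode()).hexdigest()
    return f"cplx:{workspace_id}:{digest}"
```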
LLM fallback behavior: If the complexity assessor model is unavailable and Redis cache misses, should we fall back to regex (current behavior) or error? Recommendation: Fall back to DELEGATE as MOLECULE (current behavior). Never block the user.
Phase 2 transient workflows: Should ORGAN/ORGANISM workflows created from chat be visible in the workflows UI, or ephemeral? Recommendation: Visible, with a "created from chat" tag. Users should be able to re-run them.