PRD-82 — Research: Orchestration Readiness Assessment
Version: 1.0
Type: Research / Strategic Assessment
Status: Complete
Priority: P0
Author: Gerard + Claude
Date: 2026-03-14
Purpose: Map the trajectory from origin to current state to orchestration-ready future. Honest gap analysis. No dreaming.
1. Executive Summary
Automatos has evolved from an ambitious multi-agent concept (PRDs 01-06) into a real, working platform with substantial infrastructure. But 81 PRDs later, the gap between "what's planned" and "what's built" needs honest accounting before adding an orchestration layer.
The good news: The foundations for orchestration are closer than they appear. Context Service, Tool Router, Agent Factory, Heartbeat Service, and Memory are all built and wired. The missing piece is smaller than PRD-82's original 24-section draft suggests.
The honest news: Several "foundation" PRDs (01-06) were never implemented as designed. The platform grew organically around the chatbot pipeline, heartbeat system, and tool plumbing — not from those original blueprints. That's fine. The organic path built real things. But it means the orchestration layer needs to be designed for what actually exists, not what was originally planned.
2. Where You Started (The Vision)
Original Foundation PRDs (01-06)
These were drafted as a connected system:
PRD-01 — Core Orchestration Engine (task decomposition, agent assignment, workflow coordination)
PRD-02 — Agent Factory & Lifecycle
PRD-03 — Context Engineering Layer (atoms → molecules → cells → organs)
PRD-04 — Inter-Agent Communication
PRD-05 — Memory & Knowledge Systems
PRD-06 — Monitoring & Analytics Dashboard
Reality: None of these were implemented as written. Instead, the platform evolved through practical needs:
Agent Factory got rewritten 3+ times (latest: clean 1,235-line version)
Context became the ContextService with 8 modes (not the atom→organ hierarchy)
Memory became a 5-layer stack (Redis → Postgres → Mem0 → RAG)
Inter-agent comms became Redis pub/sub + inter_agent.py (1,216 lines)
Monitoring became heartbeat + activity feed + reports (PRDs 72, 76)
Lesson: The original PRDs were architectural aspirations. The actual platform was built bottom-up from real user needs. That's not a failure — that's how good software gets built.
The Pivot Moment
Around PRD-37 (SaaS Foundation), the project shifted from "research platform" to "multi-tenant product." This forced real decisions:
Clerk auth + workspaces
Tool assignments per agent (DB-backed)
ContextService as single entry point
Config centralization (86 files fixed)
Security hardening
This was the right call. It built the infrastructure that orchestration needs.
3. What You've Actually Built (Honest Inventory)
Working Foundation Systems
| Component | Status | File | Lines |
|---|---|---|---|
| ContextService | ✅ Built, 8 modes, 12 sections | modules/context/service.py | ~800 |
| Tool Router | ✅ Single source of truth | modules/tools/tool_router.py | 735 |
| Tool Registry | ✅ Core + Platform + Workspace | tool_registry.py + action_registry.py | ~2,000 |
| Unified Executor | ✅ Prefix-based dispatch | unified_executor.py + 9 exec modules | ~1,500 |
| Agent Factory | ✅ Clean rewrite, tool loop | agent_factory.py | 1,235 |
| Universal Router | ✅ 7-tier routing | core/routing/engine.py | 906 |
| Heartbeat Service | ✅ Cron-based autonomous execution | heartbeat_service.py | 1,459 |
| Memory (5-layer) | ✅ Redis → Postgres → Mem0 → RAG | unified_memory_service.py | 2,068 |
| Chatbot Pipeline | ✅ SSE streaming + tool loop | consumers/chatbot/service.py | 1,963 |
| Inter-Agent Comms | ✅ Redis pub/sub + consensus | inter_agent.py | 1,216 |
| Multi-Agent Coordination | ⚠️ Built but untested at scale | coordination_manager.py | 877 |
| Channel Adapters | ✅ 11 platforms | channels/ | ~2,000 |
| Report Service | ✅ PRD-76 | report_service.py | ~400 |
| Task Reconciler | ✅ Stall detection + retry | task_reconciler.py | ~200 |
| Scheduled Tasks | ✅ PRD-77 agent self-scheduling | scheduled_task_service.py | ~300 |
What Does NOT Exist
| Component | Status | Notes |
|---|---|---|
| orchestration_runs table | ❌ Not created | No migration, no model, no code |
| orchestration_tasks table | ❌ Not created | No migration, no model, no code |
| Task graph / dependency engine | ❌ Not built | No DAG, no dependency resolution |
| Coordinator agent | ❌ Not built | Heartbeat orchestrator is closest but different purpose |
| Verifier / critic loop | ❌ Not built | No output validation against criteria |
| Aggregator | ❌ Not built | No multi-output merging |
| Budget enforcement (per-run) | ❌ Not built | Token budget exists in ContextService but not per-run |
| Run trace / explainability | ❌ Not built | Heartbeat logs exist but no structured run trace |
| Guidance engine | ❌ Not built | No prompt coaching, model recommendation, task structuring |
| Learned patterns | ❌ Not built | No outcome tracking feeding back to recommendations |
| Recipe from run | ❌ Not built | Can't convert a successful run into a reusable recipe |
Partially Built (Exists But Not Orchestration-Grade)
| Component | State | Notes |
|---|---|---|
| 9-stage workflow | Legacy, mostly dead code | modules/orchestrator/service.py marked LEGACY. Stages exist but pipeline isn't wired to live execution path |
| Recipe execution | DB table + scheduler exists | Only 1 concrete recipe (Jira bug triage). No dynamic recipe creation |
| Board tasks | PRD-72 table + bridge | Task board exists but not wired as orchestration task graph |
| Workflow executions | Table exists | No structured run lifecycle (start → tasks → verify → complete) |
4. Competitive Landscape Analysis
Agent Zero — Hierarchical Delegation
Model: Prompt-driven agents, parent-child delegation, conversation sealing.
| Dimension | Agent Zero | Automatos |
|---|---|---|
| Multi-tenancy | ❌ Single user | ✅ Full workspace isolation |
| Persistent state | ❌ In-memory (crashes lose work) | ✅ PostgreSQL + Redis |
| Tool system | Basic (code exec, search, delegate) | ✅ 3-layer registry, 40+ platform actions |
| Memory | FAISS in-process, LLM consolidation | ✅ 5-layer stack with Mem0 |
| Delegation | ✅ Clean parent→child with topic sealing | ⚠️ Inter-agent comms exist but no delegation protocol |
| Verification | ❌ None (prompt-dependent) | ❌ None |
| Context management | Basic history compression | ✅ ContextService with 8 modes, 12 sections, budget |
What to steal:
Conversation sealing after delegation (prevent context bleed)
Utility model separation (cheap model for memory/compression)
Skills as on-demand loading (not eager)
What you already beat them on:
Persistence, multi-tenancy, tool richness, context engineering, channels
OpenClaw — Personal AI Gateway
Model: Hub-and-spoke gateway, channel-first, single-user.
| Dimension | OpenClaw | Automatos |
|---|---|---|
| Channels | ✅ 15+ platforms, native apps | ✅ 11 channels |
| Multi-agent routing | ✅ 6-tier deterministic bindings | ✅ 7-tier Universal Router |
| Multi-tenancy | ❌ Single trusted operator | ✅ Full workspace isolation |
| Persistence | SQLite + JSONL files | ✅ PostgreSQL + Redis + S3 |
| Tool policy layers | ✅ 6-level deny-first | ⚠️ Per-agent assignment, no layered policy |
| Scaling | ❌ Single process | ✅ Multi-worker |
What to steal:
Tool policy layering (gateway > agent > provider > group > sandbox)
Context compaction with dedicated summarization model
ACP protocol for external agent integration
Not relevant: Different use case (personal assistant vs. platform).
OpenAI Symphony — Issue Tracker Daemon
Model: Linear issues → isolated Codex agents → PRs. One agent per issue, no coordination.
| Dimension | Symphony | Automatos |
|---|---|---|
| Coordination | ❌ None (isolation is the strategy) | ⚠️ Has inter-agent, needs orchestration |
| Policy as code | ✅ WORKFLOW.md (brilliant) | ⚠️ Agent config in DB, not versioned |
| Workspace isolation | ✅ Strict per-issue sandboxing | ⚠️ Shared workspace with file tools |
| Reconciliation | ✅ Self-healing poll loop | ✅ TaskReconciler exists |
| Persistence | ❌ In-memory only | ✅ PostgreSQL |
| Multi-agent on same task | ❌ Explicitly avoided | Goal for Phase 2 |
What to steal:
WORKFLOW.md / Policy-as-Code pattern (version agent behavior alongside code)
Reconciliation loop pattern (already have TaskReconciler — extend it)
Lifecycle hooks (before_run, after_run) for workspace setup/teardown
Issue tracker as coordination mechanism (board tasks → orchestration tasks?)
What you already beat them on:
Persistence, multi-agent, tool richness, memory, real-time channels
Perplexity Computer Use
Model: Browser automation agent with search-first approach.
Relevance: Limited. Different problem domain (web interaction vs. multi-agent orchestration). Worth watching for UX patterns around showing agent work in progress.
5. The Actual Gap to Orchestration
Here's the honest distance from where you are to a working orchestration layer:
Already Have (Don't Rebuild)
Need to Build (The Real Gap)
The Key Insight
The gap is narrower than PRD-82's original scope suggested. You don't need a Guidance Engine, Learning Engine, Prompt Coach, Model Recommender, or Recipe Builder to get orchestration working. Those are Phase 2C/2D features.
The core missing piece is:
A coordinator that creates a persistent run, decomposes it into tasks with dependencies, assigns agents, executes sequentially/parallel, verifies outputs, and records the trace.
Everything else (context, tools, agents, memory, scheduling) already works.
6. Dependency Chain — What Blocks What
Critical Path to Orchestration
What's NOT on the critical path (can be deferred)
PRD-80 (Unified Context Service) — already essentially built as modules/context/service.py
PRD-68 (Progressive Complexity) — nice-to-have for routing, not blocking orchestration
PRD-64 (Unified Action Discovery) — partially done via ActionRegistry
PRD-69 (Agent Intelligence Layer) — Phase 2D territory
What IS blocking
PRD-81 (Mission Cleanup) — if this is cleaning up context/memory foundations, it should land first
No orchestration schema — need tables before services
No coordinator logic — this is the actual new code
7. Recommended Path Forward
Step 1: Land PRD-81 (current work)
Finish the mission cleanup. Stabilize context + memory.
Step 2: PRD-82A — Orchestration Schema + Context Modes
Scope: Database only + context modes. No execution logic.
Deliverables:
Alembic migration: orchestration_runs, orchestration_tasks, orchestration_task_dependencies, orchestration_events
SQLAlchemy models
Two new ContextModes: COORDINATOR, VERIFIER
ModeConfig for each (which sections, tool loading strategy)
API endpoints: create run, get run, list tasks, get events
Tests for schema + context modes
Why separate: Schema changes are low-risk, high-value. Once tables exist, everything else can be built incrementally.
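To make the shape of the four tables concrete, here is an illustrative DDL sketch executed against in-memory SQLite. The table names come from this document; every column beyond those names is an assumption, and the real deliverable would be an Alembic revision targeting Postgres, not raw DDL.

```python
import sqlite3

# Illustrative schema only. Table names match PRD-82A; columns are assumed.
DDL = """
CREATE TABLE orchestration_runs (
    id INTEGER PRIMARY KEY,
    goal TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'created',  -- created/planning/executing/verifying/completed/failed
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orchestration_tasks (
    id INTEGER PRIMARY KEY,
    run_id INTEGER NOT NULL REFERENCES orchestration_runs(id),
    description TEXT NOT NULL,
    agent_id TEXT,
    status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE orchestration_task_dependencies (
    task_id INTEGER NOT NULL REFERENCES orchestration_tasks(id),
    depends_on_id INTEGER NOT NULL REFERENCES orchestration_tasks(id),
    PRIMARY KEY (task_id, depends_on_id)
);
CREATE TABLE orchestration_events (
    id INTEGER PRIMARY KEY,
    run_id INTEGER NOT NULL REFERENCES orchestration_runs(id),
    event_type TEXT NOT NULL,
    payload TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

The dependency table as a separate join table (rather than a JSON column on tasks) is what lets "pick next ready task" be a single SQL query later.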
Step 3: PRD-82B — Sequential Coordinator Service
Scope: The coordinator. Sequential execution only. No parallelism.
Deliverables:
CoordinatorService — takes a goal, produces a plan (task list with dependencies)
Plan execution loop: pick next ready task → assign agent → execute via AgentFactory → verify → mark complete
Verification step: LLM-as-judge against success criteria
Run lifecycle: created → planning → executing → verifying → completed/failed
Event logging: every state transition recorded in
orchestration_eventsIntegration with existing
ContextService(COORDINATOR mode for planning, TASK_EXECUTION for agent work, VERIFIER for validation)Tests
Why sequential first: Parallel execution adds complexity (race conditions, resource contention, partial failure handling). Get the lifecycle right first.
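The sequential loop above can be sketched in a few lines. This is a minimal in-memory sketch, not the CoordinatorService itself: `execute` stands in for the AgentFactory call and `verify` for the LLM-as-judge step, both hypothetical signatures.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    id: int
    description: str
    depends_on: list = field(default_factory=list)  # prerequisite task ids
    status: str = "pending"                          # pending -> completed | failed
    output: str = ""

def run_sequential(tasks, execute, verify):
    """Pick the next ready task, execute it, verify the output, repeat."""
    done = set()
    progressed = True
    while progressed:
        progressed = False
        for task in tasks:
            if task.status != "pending" or not all(d in done for d in task.depends_on):
                continue
            task.output = execute(task)    # stand-in for the AgentFactory call
            if not verify(task):           # stand-in for LLM-as-judge verification
                task.status = "failed"
                return "failed"
            task.status = "completed"
            done.add(task.id)
            progressed = True
            break                          # strictly one task at a time
    # anything still pending here is part of a dependency cycle or orphaned
    return "completed" if len(done) == len(tasks) else "stalled"

tasks = [Task(1, "research topic"), Task(2, "draft report", depends_on=[1])]
status = run_sequential(tasks,
                        execute=lambda t: f"out:{t.description}",
                        verify=lambda t: bool(t.output))
print(status, [t.status for t in tasks])
```

The real service would additionally write an orchestration_events row at each state transition; the "stalled" return value is what the TaskReconciler extension would watch for.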
Step 4: PRD-82C — Parallel Execution + Budget + UI
Scope: Scale the coordinator.
Deliverables:
Bounded parallel task execution (asyncio.gather with semaphore)
Per-run token budget tracking (increment on each LLM call)
Per-run tool call budget
Budget exhaustion handling (degrade, pause, or fail)
Run trace API for frontend
Frontend: run viewer with task graph, status, budget gauge, event timeline
Tests
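The bounded-parallelism deliverable is the standard asyncio pattern named above. A minimal sketch, with `run_agent` as a placeholder for the real per-task executor:

```python
import asyncio

async def execute_parallel(task_ids, run_agent, max_concurrency=3):
    """Run one agent call per task, capped at max_concurrency in flight."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(task_id):
        async with sem:                 # blocks when the cap is reached
            return await run_agent(task_id)

    # gather preserves input order: results[i] belongs to task_ids[i]
    return await asyncio.gather(*(bounded(t) for t in task_ids))

async def demo_agent(task_id):
    await asyncio.sleep(0)              # stand-in for an LLM or tool call
    return f"done:{task_id}"

results = asyncio.run(execute_parallel([1, 2, 3, 4], demo_agent, max_concurrency=2))
print(results)
```

The semaphore is what turns "parallel" into "bounded parallel": resource contention (DB connections, rate limits) is capped regardless of plan size.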
Step 5: PRD-82D — Guidance + Learning (Future)
Scope: Intelligence layer on top of working orchestration.
Deliverables:
Prompt coach (analyze request, suggest improvements)
Model recommender (task type → model suggestion)
Outcome capture (link run results to recommendations)
Pattern detection (repeated successful structures → recipe candidates)
Recommendation UI (preflight advice panel)
8. What You Can Reuse (Don't Reinvent)
| Existing Component | Orchestration Use |
|---|---|
| ContextService + COORDINATOR mode | Coordinator's system prompt |
| ContextService + TASK_EXECUTION mode | Agent task execution (already works) |
| AgentFactory.execute_with_prompt() | Execute any task agent |
| get_tools_for_agent() | Tool loading for task agents |
| UnifiedToolExecutor | Tool dispatch |
| TaskReconciler pattern | Stall detection for orchestration tasks |
| HeartbeatService scheduling | Cron-triggered orchestration runs |
| report_service | Run output persistence |
| board_task_bridge | UI task display |
| inter_agent.py | Agent-to-agent messaging during runs |
9. What This Means for PRD Count
You asked if you need 20 more PRDs or 200. Here's the honest answer:
To get orchestration working: 4 PRDs
82A (Schema + Context Modes) — ~1 week
82B (Sequential Coordinator) — ~2 weeks
82C (Parallel + Budget + UI) — ~2 weeks
82D (Guidance + Learning) — ~3 weeks
To get the full Phase 2 vision: ~8-10 PRDs
Add:
Recipe-from-run generation
Workflow pattern learning
Advanced model recommendation
Cross-run analytics
Semi-autonomous workflow builder
External action approval gates
You do NOT need to rewrite foundations
The original PRDs 01-06 were superseded by what you actually built. Don't go back and implement them as designed. The organic evolution produced something more practical.
10. Risk Assessment
Risk 1: Over-scoping PRD-82 (again)
The original draft was 24 sections covering guidance, learning, recipes, coaching, recommendations, AND orchestration. That's 6 systems. Mitigation: This document splits it into 4 focused PRDs.
Risk 2: Coordinator complexity
The coordinator needs to: decompose tasks, assign agents, manage dependencies, handle failures, retry, verify, aggregate. This is the hardest new code. Mitigation: Start sequential-only. No parallel. No dynamic replanning. Just: plan → execute in order → verify → done.
Risk 3: Context window pressure during multi-agent runs
Each agent task consumes context. A 5-task run means 5 separate LLM interactions, each needing full context assembly. Mitigation: ContextService already handles this. COORDINATOR mode gets planning context. TASK_EXECUTION mode gets agent context. They're separate calls, not one bloated prompt.
Risk 4: Cost blowout
Coordinator call + N task agent calls + N verifier calls = 2N+1 LLM calls minimum. Mitigation: Budget tracking from day 1 (PRD-82C). Coordinator can use cheaper model. Verifier can be rule-based for simple criteria.
Risk 5: Nobody uses orchestration if simple chat works
If 90% of requests are simple chat, building orchestration is premature. Mitigation: Orchestration is opt-in (triggered by recipes, heartbeats, or explicit "plan this" requests). Don't force it on simple queries.
11. Conclusion
Where you started: Ambitious 6-PRD foundation that was too abstract to implement directly.
What you built instead: A practical, working platform through 81 PRDs of organic evolution. Context Service, Tool Router, Agent Factory, Memory, Heartbeat, Channels, Routing — all real, all wired, all serving users.
What's actually missing for orchestration: Persistent run/task schema, a coordinator service, verification, and budget tracking. That's it. The execution infrastructure (agents, tools, context, memory) already works.
The path: 4 focused PRDs, building on what exists. No rewriting foundations. No 24-section fantasy documents. Schema first, sequential coordinator second, parallel + budget third, intelligence fourth.
You're closer than you think. The foundations are there. The next step is PRD-82A: the schema.
Appendix A: Competitive Pattern Matrix
| Capability | Agent Zero | OpenClaw | Symphony | Automatos (today) | Automatos (+ PRD-82) |
|---|---|---|---|---|---|
| Persistent runs | ❌ | ❌ | ❌ | ❌ | ✅ |
| Task graph | ❌ | ❌ | ❌ | ❌ | ✅ |
| Multi-tenant | ❌ | ❌ | ❌ | ✅ | ✅ |
| Tool richness | Low | Medium | Low | ✅ High | ✅ High |
| Memory system | FAISS only | SQLite | None | ✅ 5-layer | ✅ 5-layer |
| Context engineering | Basic | Basic | None | ✅ 8 modes | ✅ 10 modes |
| Verification | ❌ | ❌ | CI-based | ❌ | ✅ LLM-as-judge |
| Budget control | ❌ | ❌ | ❌ | ❌ | ✅ Per-run |
| Delegation | ✅ | Subagents | ❌ | ⚠️ Pub/sub | ✅ Coordinator |
| Channels | Web only | ✅ 15+ | None | ✅ 11 | ✅ 11 |
| Self-learning | ❌ | ❌ | ❌ | ❌ | ✅ (82D) |
| Policy as code | Prompts | ✅ JSON5 | ✅ WORKFLOW.md | DB config | DB + SKILL.md |
| Run explainability | ❌ | ❌ | Logs | ❌ | ✅ Event trace |
Appendix B: Patterns to Adopt from Research
From Agent Zero
Conversation sealing — after coordinator delegates to agent, seal that context to prevent bleed into next task
Utility model — use cheap model for coordinator planning, memory ops, verification heuristics
On-demand skill loading — load SKILL.md content only when agent is assigned task that needs it
From OpenClaw
Tool policy layers — consider gateway > workspace > agent > task layering for tool access control
Context compaction model — dedicated cheaper model for summarization during long runs
From Symphony
Reconciliation loop — extend TaskReconciler to cover orchestration runs (detect stalled runs, orphaned tasks)
Lifecycle hooks — before_task, after_task hooks for workspace setup/teardown
Tracker as coordinator — board tasks / mission board as the human-visible coordination layer (agents read from and write to it)
Continuation vs. retry distinction — continuation (same thread, 1s delay) vs. failure retry (fresh attempt, exponential backoff)
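The continuation vs. retry distinction above can be reduced to a delay policy. A minimal sketch under stated assumptions: the 1s continuation delay and exponential retry backoff mirror the pattern described, while `base` and `cap` values are illustrative.

```python
def next_delay(kind, attempt, base=1.0, cap=60.0):
    """Seconds to wait before the next attempt of the given kind."""
    if kind == "continuation":
        return 1.0                              # same thread, fixed short pause
    if kind == "retry":
        return min(cap, base * (2 ** attempt))  # fresh attempt, exponential backoff
    raise ValueError(kind)

print(next_delay("continuation", 5))                    # 1.0
print([next_delay("retry", a) for a in range(4)])       # [1.0, 2.0, 4.0, 8.0]
```

Keeping the two cases in one function makes the reconciler's decision explicit: a stalled-but-healthy task continues cheaply, while a failed task backs off before its fresh attempt.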