PRD-80: Unified Context Service
Version: 1.0
Status: Draft
Priority: P0
Author: Gar Kavanagh + Claude
Created: 2026-03-12
Updated: 2026-03-12
Dependencies: PRD-64 (Unified Action Discovery — COMPLETE), PRD-68 (Progressive Complexity Routing — COMPLETE), PRD-71 (Unified Skills — COMPLETE), PRD-76 (Agent Reporting — COMPLETE)
Executive Summary
Every time we fix a prompt, tool loading, or memory injection bug, we have to patch it in 3–5 different places. The chatbot builds prompts one way (smart_orchestrator.py → personality.py). Agent task execution builds them another way (agent_factory.py → _build_agent_system_prompt). Heartbeats copy-paste from the factory. Recipes have their own path. The orchestrator stages have yet another. This fragmentation has caused:
- Chatbot missing platform actions — fixed in commit `4a8d7e3`, but only because someone noticed Auto couldn't see `platform_execute`
- Composio tools loaded differently per code path — factory built typed schemas, chatbot used generic `composio_execute`
- Memory injected in different formats — chatbot via `get_happy_system_prompt(memories=...)`, factory via string concatenation, heartbeat skipped it entirely
- Tool count explosion — factory sent 107 tools, chatbot sent 76, heartbeat sent all platform tools individually before the dispatcher fix
- Daily logs, action summaries, and personality all wired independently into each path, with different bugs each time
This PRD introduces a single ContextService that every LLM-calling code path uses. One place to build prompts, load tools, inject memory, manage token budgets. Fix it once, fixed everywhere.
What We're Building
- `modules/context/` package — new module containing the unified context service
- `ContextService` class — single entry point: `build_context(agent, mode, messages) → ContextResult`
- Composable prompt sections — identity, skills, platform actions, memory, tools, task context assembled declaratively
- Token budget manager — sections have priority weights; low-priority content gets trimmed first when approaching limits
- `ContextResult` dataclass — contains `system_prompt`, `tools`, `tool_choice`, `messages`, `metadata`, ready for any LLM call
- Migration of all 9 code paths to use `ContextService` instead of building prompts/tools themselves
- Dead code cleanup — remove `_build_agent_system_prompt`, `get_happy_system_prompt` complexity, duplicated tool loading
What We're NOT Building
- A prompt versioning UI (future PRD — admin prompt editing stays in the `system_prompts` table)
- A/B testing framework for prompts (future)
- LLM-specific prompt formatting (we target OpenAI chat format; provider adapters stay in `llm_manager.py`)
- New memory system (Mem0 stays; we just standardize how memories are injected into context)
1. The Problem: 9 Fragmented Code Paths
Every path that calls an LLM independently builds its own prompt, loads its own tools, and injects its own context. Here's the current state:
| # | Path | File | Prompt Building | Tool Loading | Memory | Action Summary |
|---|------|------|-----------------|--------------|--------|----------------|
| 1 | Chatbot (Auto) | `consumers/chatbot/smart_orchestrator.py` | `get_happy_system_prompt()` | `smart_tool_router.route()` | `smart_memory.retrieve_memories()` | `build_prompt_summary()` (added 2026-03-12) |
| 2 | Agent Task Execution | `modules/agents/factory/agent_factory.py` | `_build_agent_system_prompt()` | `get_tools_for_agent()` | String concatenation in prompt | `build_prompt_summary()` |
| 3 | Heartbeat Service | `services/heartbeat_service.py` | Inline f-string | `to_dispatcher_schema()` only | None | Inline summary |
| 4 | Recipe Executor | `api/recipe_executor.py` | Recipe-specific prompt + agent prompt | Inherits from factory | None | None |
| 5 | Execution Manager | `modules/agents/execution/execution_manager.py` | Delegates to factory | Delegates to factory | None | Via factory |
| 6 | Universal Router | `core/routing/engine.py` | Per-tier prompts | Per-tier tool selection | None | None |
| 7 | Orchestrator Stages | `modules/orchestrator/stages/*.py` | Per-stage prompts | None (LLM-only) | None | None |
| 8 | Board Task Chat | Via chatbot path | Via chatbot | Via chatbot | Via chatbot | Via chatbot |
| 9 | NL2SQL | `modules/nl2sql/service.py` | Schema-specific prompt | None (text-only) | None | None |
What Goes Wrong
- Feature added to one path, missing from others — the platform action summary was added to factory + chatbot but not heartbeat or recipes until manually patched
- Memory format divergence — chatbot formats memories as a bullet list via `personality.py`, factory dumps raw strings, heartbeat gets nothing
- Tool count inconsistency — factory sends `get_tools_for_agent()` (core + platform dispatcher + Composio), chatbot sends `smart_tool_router.route()` (filtered subset), heartbeat sends only the platform dispatcher
- Personality applied inconsistently — chatbot uses `AutomatosPersonality` with workspace settings, factory uses a basic system prompt, heartbeat uses hardcoded text
- No token awareness — prompts grow unbounded; adding the platform action summary + daily logs + memory + skill instructions can exceed the model context with zero warning
2. Architecture: The Unified Context Service
2.1 Design Principles (from Context Engineering)
Inspired by David Kamm & IBM's Context Engineering framework:
Composable Sections — Context is built from independent sections (atoms → molecules → cells), each responsible for one concern. Sections can be included, excluded, or reordered without touching other sections.
Token Budgets as First-Class Constraints — Every section declares its priority and max token allocation. The assembly pipeline respects a total budget and trims low-priority sections first.
Declarative Mode-Based Assembly — Each "mode" (chatbot, task_execution, heartbeat, etc.) declares which sections it needs and in what order. No imperative if/else chains.
Single Source of Truth — One module owns prompt construction. Callers provide context (agent, task, messages), service returns ready-to-send payload.
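The declarative-assembly principle can be sketched as a mode registry: each mode lists its sections in render order, and assembly iterates that list instead of branching. Section names and the exact per-mode composition below are illustrative assumptions, not the final config.

```python
from enum import Enum

class ContextMode(Enum):
    CHATBOT = "chatbot"
    TASK_EXECUTION = "task_execution"
    HEARTBEAT = "heartbeat"

# Hypothetical declarative registry: which sections each mode includes,
# in render order. Assembly walks this list; no imperative if/else.
MODE_CONFIGS = {
    ContextMode.CHATBOT: [
        "identity", "skills", "platform_actions", "memory",
        "daily_logs", "datetime", "custom", "conversation",
    ],
    ContextMode.TASK_EXECUTION: [
        "identity", "task_context", "skills", "platform_actions",
        "memory", "tools", "datetime",
    ],
    # Heartbeat currently gets no memory (see the table in §1).
    ContextMode.HEARTBEAT: ["identity", "platform_actions", "datetime"],
}
```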
2.2 Module Structure
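The intended layout, reconstructed from the Phase 1 file list in §5:

```
modules/context/
├── __init__.py              # package exports
├── service.py               # ContextService
├── result.py                # ContextResult dataclass
├── modes.py                 # ContextMode enum + mode configs
├── budget.py                # TokenBudgetManager
├── estimator.py             # token estimator
└── sections/
    ├── base.py              # BaseSection interface
    ├── identity.py
    ├── skills.py
    ├── platform_actions.py
    ├── memory.py
    ├── tools.py
    ├── task_context.py
    ├── recipe_context.py
    ├── datetime_context.py
    ├── conversation.py
    └── custom.py
```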
2.3 Core Interfaces
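A minimal sketch of the core interfaces. `ContextResult` field names and the `build_context(agent, mode, messages)` signature come from this PRD; the `BaseSection` attributes and the assembly loop are illustrative assumptions.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class ContextResult:
    """Ready-to-send payload for any LLM call (field names from this PRD)."""
    system_prompt: str
    tools: List[dict]
    tool_choice: Optional[str]
    messages: List[dict]
    metadata: Dict[str, Any] = field(default_factory=dict)

class BaseSection:
    """One concern per section (sketch); priority 1 is trimmed last."""
    name: str = "base"
    priority: int = 10          # 1 = never dropped, 10 = dropped first
    max_tokens: Optional[int] = None

    async def render(self, ctx: dict) -> str:
        raise NotImplementedError

class ContextService:
    """Single entry point (sketch); real sections come from the mode config."""
    def __init__(self, sections: List[BaseSection]):
        self.sections = sections

    async def build_context(self, agent, mode, messages, **extra) -> ContextResult:
        # extra carries mode-specific inputs such as task or recipe
        ctx = {"agent": agent, "mode": mode, **extra}
        parts = [await s.render(ctx) for s in self.sections]
        return ContextResult(
            system_prompt="\n\n".join(p for p in parts if p),
            tools=[],
            tool_choice=None,
            messages=messages,
            metadata={"mode": mode},
        )
```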
2.4 Token Budget Manager
Priority assignments:
| Priority | Section | Rationale |
|---|---|---|
| 1 | Identity | Agent must know who it is |
| 2 | Task Context / Recipe Context | Must know what to do |
| 3 | Tools (schemas) | Must know what tools are available |
| 4 | Skills | SKILL.md instructions guide behaviour |
| 5 | Platform Actions | Action catalog for `platform_execute` |
| 6 | Memory | User context, preferences |
| 7 | Daily Logs | Recent activity for awareness |
| 8 | Datetime | Nice-to-have temporal context |
| 9 | Custom | Workspace-level custom prompts |
2.5 Tool Loading Strategy
Tool loading is unified in the tools section but varies by mode:
Tool assembly (single implementation, used by all modes):
This replaces:
- `get_tools_for_agent()` in `tool_router.py`
- `smart_tool_router.route()` in the chatbot path
- Inline `to_dispatcher_schema()` calls in heartbeat
- Tool assembly in `agent_factory.py`
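One way the shared tool assembly could look. The strategy names and the `assemble_tools` helper are assumptions; the three behaviours mirror the current paths described above (full set, filtered subset, dispatcher-only).

```python
from enum import Enum
from typing import List, Optional, Set

class ToolLoadingStrategy(Enum):
    # Hypothetical names: FULL mirrors get_tools_for_agent(), FILTERED
    # mirrors smart_tool_router's subset, DISPATCHER_ONLY mirrors heartbeat.
    FULL = "full"
    FILTERED = "filtered"
    DISPATCHER_ONLY = "dispatcher_only"
    NONE = "none"

def assemble_tools(
    strategy: ToolLoadingStrategy,
    core: List[dict],
    dispatcher: dict,
    composio: List[dict],
    allowed: Optional[Set[str]] = None,
) -> List[dict]:
    """Single tool-assembly implementation shared by all modes (sketch)."""
    if strategy is ToolLoadingStrategy.NONE:
        return []
    if strategy is ToolLoadingStrategy.DISPATCHER_ONLY:
        return [dispatcher]
    tools = core + [dispatcher] + composio
    if strategy is ToolLoadingStrategy.FILTERED and allowed is not None:
        tools = [t for t in tools if t["function"]["name"] in allowed]
    return tools
```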
2.6 Memory Integration
Memory retrieval is unified in the memory section:
This replaces:
- Memory retrieval + formatting in `smart_orchestrator.py:157-178`
- Memory string concatenation in `agent_factory._build_agent_system_prompt()`
- Missing memory in heartbeat/recipe paths
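A sketch of the unified memory formatting, assuming the chatbot's bullet-list style becomes the shared format. The function name and the header string are assumptions.

```python
from typing import List

def format_memories(memories: List[str]) -> str:
    """Render retrieved memories as a bullet list (the chatbot's current
    format via personality.py), shared by every mode (sketch)."""
    if not memories:
        return ""
    bullets = "\n".join(f"- {m}" for m in memories)
    return f"Relevant user context:\n{bullets}"
```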
3. Section Details
3.1 Identity Section
Renders:
Replaces:
- `get_happy_system_prompt()` identity portion in `personality.py`
- `_build_agent_system_prompt()` opening in `agent_factory.py`
- Hardcoded "You are a helpful AI assistant" in heartbeat
3.2 Skills Section
Renders: The full SKILL.md text from the agent's assigned skill (loaded from agent_skills table → skills table → content field).
Replaces:
- Skill injection in `agent_factory._build_agent_system_prompt()`
- Missing skill injection in heartbeat (currently heartbeat gets skill content via its own path)
3.3 Platform Actions Section
Renders: Output of ActionRegistry.build_prompt_summary() — grouped by category with parameter hints.
Replaces:
- `build_prompt_summary()` injection in `smart_orchestrator.py:262-268`
- `build_prompt_summary()` injection in `agent_factory.py`
- Inline summary in `heartbeat_service.py`
3.4 Memory Section
See §2.6 above. Renders:
3.5 Task Context Section
Renders:
3.6 Recipe Context Section
Renders:
3.7 Conversation Section
Renders: Formatted message history, filtered and converted:
- Strips system messages (we build our own)
- Converts `parts` format to plain text
- Trims oldest messages if exceeding the token budget
4. How Callers Change
4.1 Agent Factory (Task Execution)
Before:
After:
4.2 Smart Chat Orchestrator (Chatbot)
Before:
After:
4.3 Heartbeat Service
Before:
After:
4.4 Recipe Executor
Before:
After:
5. Migration Strategy
Phase 1: Build the Module (No Breaking Changes)
Goal: Create modules/context/ with full ContextService. All existing code paths continue working unchanged.
Files created:
- `modules/context/__init__.py`
- `modules/context/service.py`
- `modules/context/result.py`
- `modules/context/modes.py`
- `modules/context/budget.py`
- `modules/context/estimator.py`
- `modules/context/sections/base.py`
- `modules/context/sections/identity.py`
- `modules/context/sections/skills.py`
- `modules/context/sections/platform_actions.py`
- `modules/context/sections/memory.py`
- `modules/context/sections/tools.py`
- `modules/context/sections/task_context.py`
- `modules/context/sections/recipe_context.py`
- `modules/context/sections/datetime_context.py`
- `modules/context/sections/conversation.py`
- `modules/context/sections/custom.py`
Verification: Unit tests for each section + integration test that build_context() produces equivalent output to current paths.
Phase 2: Migrate Callers (One at a Time)
Each migration follows the same pattern:
1. Add a `ContextService` call alongside the existing code
2. Log both outputs, verify equivalence
3. Switch to the `ContextService` output
4. Remove the old code
Migration order (least risk → most risk):
| Order | Path | Risk | Rationale |
|---|---|---|---|
| 1 | Heartbeat Service | LOW | Runs on schedule, easy to test, simple prompt |
| 2 | Agent Factory | MEDIUM | Core execution path, well-tested |
| 3 | Recipe Executor | MEDIUM | Uses factory internally, limited usage |
| 4 | Execution Manager | LOW | Delegates to factory, thin wrapper |
| 5 | Smart Orchestrator (Chatbot) | HIGH | User-facing, intent classification interplay |
| 6 | Universal Router | LOW | Tier routing, independent prompts |
| 7 | Orchestrator Stages | LOW | Internal LLM calls, no tools |
| 8 | NL2SQL | LOW | Isolated, schema-specific |
| 9 | Channels (Telegram, etc.) | MEDIUM | Uses factory, needs testing |
Phase 3: Cleanup
- Delete `_build_agent_system_prompt()` from `agent_factory.py`
- Delete `get_happy_system_prompt()` from `personality.py` (move personality logic to `IdentitySection`)
- Delete `smart_tool_router.py` (filtering moves to `ToolLoadingStrategy.FILTERED`)
- Delete tool loading from `tool_router.py:get_tools_for_agent()` (moves to `ToolsSection`)
- Consolidate `build_prompt_summary()` into `PlatformActionsSection`
- Remove memory injection from `smart_orchestrator.py` (moves to `MemorySection`)
Phase 4: Advanced Features (Future)
- Prompt versioning via `system_prompts` table integration
- A/B testing section variants
- Per-workspace section overrides (admin can disable/reorder sections)
- Token usage analytics (which sections consume the most tokens per mode)
6. Token Budget Model
6.1 Default Budgets by Mode
| Mode | Model Context | Response Reserve | Conversation Budget | Section Budget |
|---|---|---|---|---|
| Chatbot | 128K | 4K | 60K | 64K |
| Task Execution | 128K | 4K | 20K | 104K |
| Heartbeat | 128K | 2K | 0 | 8K |
| Recipe | 128K | 4K | 10K | 40K |
| NL2SQL | 128K | 2K | 2K | 8K |
6.2 Token Estimation
We use a character-based estimator (4 chars ≈ 1 token) as the fast path, with optional tiktoken for precise estimation when the rough estimate is within 10% of the budget.
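A sketch of the two-stage estimation, assuming hypothetical helper names; the 4-chars-per-token ratio and the 10% threshold come from the paragraph above.

```python
def estimate_tokens(text: str) -> int:
    """Fast path: roughly 4 characters per token (per §6.2)."""
    return max(1, len(text) // 4) if text else 0

def needs_precise_count(text: str, budget: int, margin: float = 0.10) -> bool:
    """True when the rough estimate lands within `margin` of the budget,
    i.e. when a precise tiktoken count is worth the extra cost (sketch)."""
    est = estimate_tokens(text)
    return abs(budget - est) <= budget * margin
```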
6.3 Trimming Behaviour
When total section tokens exceed the budget:
- Soft trim — Sections with `max_tokens` caps get truncated to their cap
- Hard trim — If still over, drop sections from priority 10 → 1 until within budget
- Never drop — Priority 1-2 sections (identity, task context) are never dropped
- Log warnings — Every trim/drop is logged with section name and tokens saved
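The four rules above can be sketched as one pass over the section list. The dict-based representation and the `apply_budget` helper are assumptions for illustration; a real implementation would operate on section objects.

```python
from typing import List, Tuple

def apply_budget(sections: List[dict], budget: int) -> Tuple[List[dict], List[str]]:
    """Priority-based trimming per §6.3 (sketch). Each section dict has
    name, priority, tokens, max_tokens; returns survivors + warnings."""
    warnings = []
    # Soft trim: cap any section that exceeds its own max_tokens.
    for s in sections:
        cap = s.get("max_tokens")
        if cap is not None and s["tokens"] > cap:
            warnings.append(f"trimmed {s['name']} to {cap} tokens")
            s["tokens"] = cap
    # Hard trim: drop lowest-priority sections (10 -> 1) until under
    # budget, but never drop priority 1-2 (identity, task context).
    kept = sorted(sections, key=lambda s: s["priority"])
    while sum(s["tokens"] for s in kept) > budget:
        droppable = [s for s in kept if s["priority"] > 2]
        if not droppable:
            break
        victim = max(droppable, key=lambda s: s["priority"])
        kept.remove(victim)
        warnings.append(f"dropped {victim['name']} ({victim['tokens']} tokens)")
    return kept, warnings
```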
7. Observability
7.1 Logging
Every build_context() call logs:
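An illustrative shape for that log line; the exact field set is an assumption, the PRD only requires that every call is logged.

```python
from typing import List

def build_log_line(mode: str, sections: List[str], total_tokens: int,
                   trimmed: List[str], duration_ms: int) -> str:
    """Format one build_context() log entry (hypothetical field names)."""
    return (
        f"context_built mode={mode} "
        f"sections={','.join(sections)} "
        f"tokens={total_tokens} "
        f"trimmed={','.join(trimmed) or '-'} "
        f"duration_ms={duration_ms}"
    )
```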
7.2 SSE Events
The chatbot path currently emits memory_retrieved SSE events. ContextResult includes memory_context so the chatbot can continue emitting these events without reaching into internals.
7.3 Metrics (Future)
- `context_build_duration_ms` — histogram by mode
- `context_tokens_used` — gauge by mode + section
- `context_sections_trimmed` — counter by section name
8. Testing Strategy
8.1 Unit Tests
Each section gets its own test file:
Key assertions:
Each section renders expected content given known inputs
Budget manager trims lowest-priority sections first
Budget manager never drops priority 1-2 sections
Token estimator is within 20% of tiktoken for sample texts
8.2 Integration Tests
- `build_context(CHATBOT)` produces a prompt containing identity, memory, platform actions
- `build_context(TASK_EXECUTION)` includes task description and full tool set
- `build_context(HEARTBEAT)` produces a prompt under 8K tokens
- Tool schemas match the expected structure (OpenAI function calling format)
8.3 Equivalence Tests (Migration Phase)
For each caller migration:
1. Capture current output (prompt + tools + messages) for 5 representative inputs
2. Run the same inputs through `ContextService`
3. Assert semantic equivalence (exact match not required; key sections must be present)
9. Risk Assessment
| Risk | Severity | Mitigation |
|---|---|---|
| Breaking existing prompts during migration | HIGH | Migrate one caller at a time, run equivalence tests, dual-write during transition |
| Token estimator inaccuracy | MEDIUM | Use conservative estimates (overcount by 10%), log actual vs estimated |
| Circular imports | MEDIUM | `modules/context/` depends on `modules/tools/`, `modules/memory/`, `core/models/` — keep dependency direction clear, no reverse imports |
| Performance regression (async section rendering) | LOW | Sections that need DB/API calls run in parallel via `asyncio.gather()` |
| Mode config drift (new features added to config but not to service) | MEDIUM | All prompt modifications must go through section classes — no direct string injection |
10. Success Criteria
| Metric | Target |
|---|---|
| Code paths using ContextService | 9/9 (100%) |
| Lines of prompt-building code deleted | > 500 |
| Time to add a new prompt section to all agents | < 30 minutes (add 1 section class + register in modes) |
| Token budget violations (prompts exceeding model context) | 0 |
| Bugs requiring multi-file prompt fixes | 0 (fix in the section class, affects all modes) |
11. File Impact Summary
New Files
| File | Purpose |
|---|---|
| `modules/context/__init__.py` | Package exports |
| `modules/context/service.py` | `ContextService` |
| `modules/context/result.py` | `ContextResult` dataclass |
| `modules/context/modes.py` | `ContextMode` enum + configs |
| `modules/context/budget.py` | `TokenBudgetManager` |
| `modules/context/estimator.py` | Token estimator |
| `modules/context/sections/*.py` | 11 section classes |
| `tests/test_context/*.py` | Unit + integration tests |
Modified Files
| File | Change |
|---|---|
| `modules/agents/factory/agent_factory.py` | Replace `_build_agent_system_prompt` + tool loading with `ContextService.build_context()` |
| `consumers/chatbot/smart_orchestrator.py` | Replace prompt building + memory + tool routing with `ContextService.build_context()` |
| `services/heartbeat_service.py` | Replace inline prompt + tool loading with `ContextService.build_context()` |
| `api/recipe_executor.py` | Replace prompt building with `ContextService.build_context()` |
| `modules/agents/execution/execution_manager.py` | Delegate to `ContextService` |
| `core/routing/engine.py` | Use `ContextService` for per-tier prompts |
| `modules/orchestrator/stages/*.py` | Use `ContextService` for stage prompts |
| `modules/nl2sql/service.py` | Use `ContextService` for the schema prompt |
Deleted Files (Phase 3)
| File | Reason |
|---|---|
| `consumers/chatbot/smart_tool_router.py` | Filtering moves to `ToolsSection` |
| Parts of `consumers/chatbot/personality.py` | Personality moves to `IdentitySection` |
Files NOT Touched
| File | Reason |
|---|---|
| `modules/tools/execution/unified_executor.py` | Tool execution stays separate from context building |
| `modules/tools/discovery/action_registry.py` | Keeps `build_prompt_summary()` — consumed by `PlatformActionsSection` |
| `core/composio/client.py` | Composio SDK stays; tool schemas consumed by `ToolsSection` |
| `modules/memory/` | Memory services stay; consumed by `MemorySection` |
12. Relationship to Other PRDs
PRD-03 (Context Engineering Layer)
PRD-80 supersedes PRD-03's prompt management aspects. PRD-03 was theoretical; PRD-80 is the concrete implementation.
PRD-51 (Orchestrator Unification)
PRD-80 is complementary — PRD-51 unified the routing/execution flow, PRD-80 unifies the context/prompt flow.
PRD-58 (Prompt Management)
PRD-80 subsumes PRD-58. The FutureAGI integration and prompt versioning UI remain future work.
PRD-64 (Unified Action Discovery)
PRD-80 consumes PRD-64's ActionRegistry via PlatformActionsSection.
PRD-68 (Progressive Complexity)
PRD-80's modes support complexity-aware context (e.g., skip memory for simple queries via complexity_assessment).
PRD-69 (Agent Intelligence Layer)
PRD-80 provides the context backbone that PRD-69's intelligence features would plug into.
Appendix A: Context Engineering Patterns Applied
From David Kamm & IBM's Context Engineering framework:
| Pattern | Application |
|---|---|
| Atoms → Molecules → Cells | Sections are atoms; mode configs compose atoms into molecules; `build_context()` is the cell |
| Token budgets as constraints | `TokenBudgetManager` enforces hard limits with priority-based trimming |
| Declarative assembly | `MODE_CONFIGS` dict declares section composition per mode — no imperative if/else |
| Schema-driven context | `ContextResult` is a typed schema; sections implement the `BaseSection` interface |
| Separation of concerns | Each section owns exactly one type of context; no section reaches into another |
Appendix B: Current Prompt Sizes (Estimated)
Measured from production logs and code analysis:
| Section | Tokens (est.) | Notes |
|---|---|---|
| Identity / personality | ~200 | `get_happy_system_prompt` base |
| Skill content (SENTINEL) | ~1,800 | Full SKILL.md |
| Platform action summary | ~1,200 | 58 actions grouped by category |
| Memory injection | ~400 | 5-10 memories as bullets |
| Daily logs | ~500 | Last 2000 chars |
| Datetime context | ~30 | Single line |
| Tool schemas (19 tools) | ~3,000 | OpenAI function format |
| Total (task execution) | ~7,130 | Well within budget |
| Total (chatbot, 20 msgs) | ~12,000 | Includes message history |