System Prompt Management
Purpose and Scope
The System Prompt Management subsystem provides centralized storage, versioning, and optimization for all LLM-facing system prompts used throughout the Automatos AI platform. This includes chatbot personalities, orchestrator prompts, specialized task prompts, and persona templates.
This page covers prompt storage, versioning, the PromptRegistry cache, admin APIs, and FutureAGI-based evaluation and optimization. For runtime agent context assembly (how prompts are combined with plugins, skills, and tools), see Agent Context Assembly. For the FutureAGI worker service architecture, see Worker Service Architecture.
Sources: orchestrator/core/models/system_prompts.py:1-205, orchestrator/core/services/prompt_registry.py:1-200
System Architecture
The prompt management system consists of four primary layers:
- Database Layer: PostgreSQL storage for prompts, versions, and evaluation runs
- Registry Layer: `PromptRegistry` with a 60-second TTL cache and three-tier fallback
- API Layer: Admin REST endpoints for CRUD operations and evaluation triggers
- Worker Layer: Isolated `agent-opt-worker` service for FutureAGI SDK operations
High-Level Architecture Diagram
Sources: orchestrator/core/services/prompt_registry.py:35-143, orchestrator/api/admin_prompts.py:1-466, orchestrator/core/services/futureagi_service.py:44-428
Database Models
SystemPrompt Table
The SystemPrompt model stores the container for a named prompt identified by a unique slug:
| Column | Type | Description |
|---|---|---|
| `id` | UUID | Primary key |
| `slug` | String(120) | Unique stable identifier (e.g., `"chatbot-friendly"`) |
| `display_name` | String(255) | Human-readable name |
| `category` | String(50) | Grouping (personality, orchestrator, specialized, persona) |
| `description` | Text | Optional documentation |
| `variables` | JSONB | Variable schema (e.g., `{"agent_name": "str"}`) |
| `is_active` | Boolean | Global enable/disable flag |
| `futureagi_eval_enabled` | Boolean | Enable live traffic scoring |
| `created_at` | DateTime | Creation timestamp |
| `updated_at` | DateTime | Last modification timestamp |
SystemPromptVersion Table
The SystemPromptVersion model stores immutable snapshots of prompt content:
| Column | Type | Description |
|---|---|---|
| `id` | UUID | Primary key |
| `prompt_id` | UUID | Foreign key to `system_prompts` |
| `version_number` | Integer | Sequential version (1, 2, 3, ...) |
| `content` | Text | The actual prompt text |
| `change_note` | String(500) | Optional change description |
| `status` | String(20) | `draft` \| `active` \| `archived` |
| `created_by` | String(255) | User ID or `"system"` |
| `created_at` | DateTime | Creation timestamp |
| `eval_scores` | JSONB | FutureAGI quality scores (populated after evaluation) |
Constraint: Only one version per prompt can have status='active' at any time. This is enforced at the application layer.
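Because the database does not enforce this uniqueness, the application must archive the outgoing active version in the same operation that activates a new one. A minimal in-memory sketch of that rule (illustrative only; the real code operates on SQLAlchemy models):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Version:
    version_number: int
    status: str = "draft"  # draft | active | archived

@dataclass
class Prompt:
    versions: List[Version] = field(default_factory=list)

    def activate(self, version_number: int) -> None:
        """Archive any current active version, then activate the target."""
        target: Optional[Version] = next(
            (v for v in self.versions if v.version_number == version_number), None
        )
        if target is None:
            raise ValueError(f"no such version: {version_number}")
        for v in self.versions:
            if v.status == "active":
                v.status = "archived"
        target.status = "active"

prompt = Prompt(versions=[Version(1, "active"), Version(2, "draft")])
prompt.activate(2)
# Exactly one version is active afterwards: version 2
```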
SystemPromptEvalRun Table
The SystemPromptEvalRun model tracks FutureAGI assessment runs:
| Column | Type | Description |
|---|---|---|
| `id` | UUID | Primary key |
| `prompt_id` | UUID | Foreign key to `system_prompts` |
| `version_id` | UUID | Foreign key to `system_prompt_versions` |
| `run_type` | String(30) | `assess` \| `optimize` \| `safety` \| `live` |
| `status` | String(20) | `pending` \| `running` \| `completed` \| `failed` |
| `scores` | JSONB | Evaluation results |
| `metadata_` | JSONB | Configuration (algorithm, metrics, etc.) |
| `error_message` | Text | Failure details |
| `started_at` | DateTime | Execution start time |
| `completed_at` | DateTime | Execution end time |
| `created_at` | DateTime | Record creation timestamp |
Sources: orchestrator/core/models/system_prompts.py:32-139
Model Relationships Diagram
Sources: orchestrator/core/models/system_prompts.py:32-139
PromptRegistry Service
The PromptRegistry is a singleton service providing cached access to system prompts with three-tier fallback resolution.
Architecture
Sources: orchestrator/core/services/prompt_registry.py:93-116
Cache Implementation
The CachedPrompt dataclass stores cached content with TTL tracking:
The registry maintains an in-memory dictionary _cache: Dict[str, CachedPrompt] indexed by slug.
Sources: orchestrator/core/services/prompt_registry.py:24-32
Variable Interpolation
The get() method accepts keyword arguments for variable substitution using str.format_map():
Example Usage:
The prompt template "You are {agent_name}, a friendly AI assistant..." becomes "You are Atlas, a friendly AI assistant...".
Sources: orchestrator/core/services/prompt_registry.py:59-76
Hardcoded Defaults
The registry includes hardcoded fallbacks in _HARDCODED_DEFAULTS dictionary for bootstrapping scenarios where the database is unavailable. These are minimal versions of the most critical prompts (chatbot personalities, routing classifier, etc.).
Sources: orchestrator/core/services/prompt_registry.py:149-195
Admin API Endpoints
The admin_prompts.py router provides REST endpoints for managing prompts.
Endpoint Summary
| Method | Path | Description |
|---|---|---|
| GET | `/api/admin/prompts` | List prompts with optional category/search filters |
| GET | `/api/admin/prompts/categories` | Get category counts |
| GET | `/api/admin/prompts/{prompt_id}` | Get single prompt details |
| GET | `/api/admin/prompts/{prompt_id}/versions` | List all versions for a prompt |
| POST | `/api/admin/prompts/{prompt_id}/versions` | Create a new version (draft or activate) |
| POST | `/api/admin/prompts/{prompt_id}/versions/{version_id}/activate` | Activate a specific version |
| POST | `/api/admin/prompts/{prompt_id}/rollback` | Roll back to the previous version |
| DELETE | `/api/admin/prompts/{prompt_id}/versions/{version_id}` | Delete a draft version |
| PATCH | `/api/admin/prompts/{prompt_id}/futureagi-toggle` | Toggle live traffic scoring |
| GET | `/api/admin/prompts/{prompt_id}/assessment-runs` | List FutureAGI runs |
| POST | `/api/admin/prompts/{prompt_id}/assess` | Trigger assessment/optimization |
Sources: orchestrator/api/admin_prompts.py:42-466
Version Creation Flow
Sources: orchestrator/api/admin_prompts.py:171-223
Rollback Operation
The rollback endpoint re-activates the most recent archived version:
1. Find the most recent `status='archived'` version
2. Update the current `status='active'` version to `status='archived'`
3. Update the found version to `status='active'`
4. Clear the registry cache
Sources: orchestrator/api/admin_prompts.py:267-308
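The rollback steps can be sketched over a plain list of version records (illustrative only, not the actual ORM code):

```python
def rollback(versions: list[dict]) -> dict:
    """Re-activate the most recent archived version; archive the current active one."""
    archived = [v for v in versions if v["status"] == "archived"]
    if not archived:
        raise ValueError("no archived version to roll back to")
    target = max(archived, key=lambda v: v["version_number"])  # most recent
    for v in versions:
        if v["status"] == "active":
            v["status"] = "archived"
    target["status"] = "active"
    # A real implementation would also clear the PromptRegistry cache here.
    return target

versions = [
    {"version_number": 1, "status": "archived"},
    {"version_number": 2, "status": "archived"},
    {"version_number": 3, "status": "active"},
]
restored = rollback(versions)
# restored is version 2; version 3 is now archived
```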
Seed System
The seed_system_prompts.py module provides idempotent initialization of default prompts.
Prompt Manifest
The PROMPT_MANIFEST list defines all default prompts across four categories:
| Category | Slugs |
|---|---|
| personality | `chatbot-friendly`, `chatbot-professional`, `chatbot-technical` |
| orchestrator | `routing-classifier`, `task-decomposer`, `complexity-analyzer`, `agent-selector`, `master-orchestrator`, `quality-assessor` |
| specialized | `nl2sql-generator`, `memory-injection` |
| persona | `persona-executive-assistant`, `persona-data-analyst`, `persona-code-reviewer`, `persona-creative-writer` |
Each entry specifies:
- `slug`: Unique identifier
- `display_name`: Human-readable title
- `category`: Grouping
- `description`: Documentation
- `variables`: Expected template variables
- `content`: Initial prompt text
Sources: orchestrator/core/seeds/seed_system_prompts.py:23-318
Seeding Logic
The seed_system_prompts(db: Session) function:
1. Iterates through `PROMPT_MANIFEST`
2. Checks whether a prompt with the same `slug` already exists
3. If not, creates a `SystemPrompt` record
4. Creates the initial `SystemPromptVersion` with `version_number=1`, `status='active'`, `created_by='system'`
5. Commits the transaction
This is called during application startup from the lifespan handler in main.py.
Sources: orchestrator/core/seeds/seed_system_prompts.py:321-365
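The idempotent seeding loop can be sketched as follows, with a plain dict standing in for the database session (manifest shape mirrors the fields listed above; the real code uses SQLAlchemy):

```python
PROMPT_MANIFEST = [
    {"slug": "chatbot-friendly", "category": "personality",
     "content": "You are {agent_name}, a friendly AI assistant..."},
    # ... remaining manifest entries
]

def seed_system_prompts(store: dict) -> int:
    """Insert missing manifest prompts into `store`; return how many were created."""
    created = 0
    for entry in PROMPT_MANIFEST:
        if entry["slug"] in store:
            continue  # idempotency: never overwrite an existing prompt
        store[entry["slug"]] = {
            "prompt": entry,
            "versions": [{"version_number": 1, "status": "active",
                          "created_by": "system", "content": entry["content"]}],
        }
        created += 1
    return created
```

Running it twice against the same store creates records only on the first pass, which is what makes it safe to call from the startup lifespan handler.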
Frontend Interface
The SystemPromptsTab component provides the admin UI for prompt management.
Component States
Sources: frontend/components/settings/SystemPromptsTab.tsx:100-727
List View
The list view displays all prompts with:
Search filter (searches display_name, slug, description)
Category filter buttons (All, personality, orchestrator, specialized, persona)
Prompt cards showing: display_name, category badge, version count
Sources: frontend/components/settings/SystemPromptsTab.tsx:302-379
Detail View Tabs
Content Tab:
- Read mode: Shows the active content in a `<pre>` block, plus quality scores if available
- Edit mode: `<Textarea>` for content, `<Input>` for the change note
- Actions: "Edit Prompt", "Rollback", "Save as Draft", "Save & Activate"
Versions Tab:
Lists all versions with version number, status badge, change note, timestamp
Actions: "Activate" button for non-active versions, "Delete" button for drafts
Assessments Tab:
FutureAGI live scoring toggle
Trigger buttons: "Score Quality", "Optimize", "Safety Scan"
Assessment run results with status badges, scores, optimized prompts
Sources: frontend/components/settings/SystemPromptsTab.tsx:420-724
Assessment Run Polling
When assessment runs have status='pending' or status='running', the component polls every 3 seconds:
Sources: frontend/components/settings/SystemPromptsTab.tsx:165-173
FutureAGI Integration
The FutureAGI integration provides automated prompt evaluation, safety checks, and optimization through the agent-opt and ai-evaluation SDKs.
Service Architecture
Sources: orchestrator/core/services/futureagi_service.py:44-428, services/agent-opt-worker/main.py:1-545
Evaluation Types
Assess (/assess endpoint):
- Runs quality metrics: `completeness`, `is_helpful`, `is_concise`, `prompt_adherence`, `factual_accuracy`
- Uses real input/output pairs from live traffic or synthetic test cases
- Returns scores 0.0-1.0 per metric with pass/fail status and reasoning
Safety (/safety endpoint):
- Runs protection metrics: `toxicity`, `prompt_injection`, `content_moderation`
- Prefixes the prompt with a context preamble: "NOTE: The following text is a SYSTEM PROMPT..."
- Returns an overall safe/unsafe verdict with per-check details
Optimize (/optimize endpoint):
- Starts an async optimization job and returns a `job_id`
- Collects a dataset from recent chat messages
- Runs the `meta_prompt` algorithm (or `bayesian`/`protegi`/`random`)
- Polls via `GET /optimize/{job_id}` until completed
- Returns the optimized prompt text, initial/final scores, and history
Score (/score endpoint):
- Scores a single input/output pair (used for live traffic)
- Fire-and-forget from the chat pipeline when `futureagi_eval_enabled=true`
Sources: services/agent-opt-worker/main.py:218-327
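As a sketch, an assessment request to the worker might bundle the prompt, the quality metrics listed above, and a dataset of real input/output pairs. The payload field names here are assumptions, not the worker's actual schema:

```python
from typing import Any

QUALITY_METRICS = ["completeness", "is_helpful", "is_concise",
                   "prompt_adherence", "factual_accuracy"]

def build_assess_payload(prompt_text: str,
                         pairs: list[dict[str, str]]) -> dict[str, Any]:
    """Bundle the prompt and input/output pairs for an /assess request."""
    return {
        "prompt": prompt_text,
        "metrics": QUALITY_METRICS,
        "dataset": [{"input": p["input"], "output": p["output"]} for p in pairs],
    }

payload = build_assess_payload(
    "You are a helpful assistant.",
    [{"input": "Hi", "output": "Hello! How can I help?"}],
)
# The orchestrator would then POST this to f"{AGENT_OPT_WORKER_URL}/assess".
```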
Template Variable Escaping
The worker implements special escaping for template variables like {agent_name} to prevent .format() crashes during optimization:
After optimization, _restore_template_vars() reverses the process.
Sources: services/agent-opt-worker/main.py:351-372
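The escape/restore pair can be sketched with regular expressions. The sentinel format is an assumption; the worker's real scheme may differ:

```python
import re

_VAR_RE = re.compile(r"\{([a-zA-Z_][a-zA-Z0-9_]*)\}")

def escape_template_vars(text: str) -> str:
    """Replace {agent_name}-style placeholders with a brace-free sentinel."""
    return _VAR_RE.sub(lambda m: f"<<VAR:{m.group(1)}>>", text)

def restore_template_vars(text: str) -> str:
    """Reverse the escaping after optimization completes."""
    return re.sub(r"<<VAR:([a-zA-Z_][a-zA-Z0-9_]*)>>", r"{\1}", text)

original = "You are {agent_name}, a friendly assistant."
escaped = escape_template_vars(original)
# escaped contains no braces, so optimizer-side .format() calls are safe
```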
Dataset Collection
The orchestrator collects datasets from the messages table by joining consecutive user/assistant message pairs:
This extracts real conversational turns for optimization training.
Sources: orchestrator/core/services/futureagi_service.py:307-343
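The pairing step can be sketched in plain Python; in the real service it is a SQL query over the messages table:

```python
def collect_pairs(messages: list[dict]) -> list[dict]:
    """Join each user message with the assistant message that immediately follows."""
    pairs = []
    for prev, curr in zip(messages, messages[1:]):
        if prev["role"] == "user" and curr["role"] == "assistant":
            pairs.append({"input": prev["content"], "output": curr["content"]})
    return pairs

history = [
    {"role": "user", "content": "What is RAG?"},
    {"role": "assistant", "content": "Retrieval-augmented generation..."},
    {"role": "user", "content": "Thanks!"},
]
dataset = collect_pairs(history)
# one (input, output) pair from the first user/assistant turn
```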
Version Lifecycle
System prompt versions follow a strict state machine:
State Diagram
Sources: orchestrator/api/admin_prompts.py:171-308
Activation Rules
- Only one version per prompt can have `status='active'`
- When activating a version, the current active version is automatically archived
- Draft versions can be deleted; active/archived versions cannot
- Rollback re-activates the most recent archived version
- Every activation clears the `PromptRegistry` cache for that slug
Sources: orchestrator/api/admin_prompts.py:225-264
Live Traffic Scoring
When SystemPrompt.futureagi_eval_enabled=true, the system automatically scores every chat response.
Scoring Flow
Sources: orchestrator/core/services/futureagi_service.py:233-301
Integration Point
The chat service calls this after generating each assistant response:
This builds a dataset over time for future optimization runs.
Sources: orchestrator/core/services/futureagi_service.py:233-301
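A fire-and-forget call of this kind can be sketched with `asyncio`: the scoring coroutine is scheduled as a background task so the chat response is never blocked on evaluation. Function names here are assumptions based on the description above:

```python
import asyncio

async def score_response(slug: str, user_input: str, output: str) -> None:
    # Stand-in for the HTTP POST to the worker's /score endpoint.
    await asyncio.sleep(0)

async def handle_chat_turn(prompt: dict, user_input: str, output: str) -> str:
    if prompt.get("futureagi_eval_enabled"):
        # Fire-and-forget: schedule the task but do not await it.
        asyncio.create_task(score_response(prompt["slug"], user_input, output))
    return output

result = asyncio.run(handle_chat_turn(
    {"slug": "chatbot-friendly", "futureagi_eval_enabled": True},
    "Hi", "Hello!",
))
```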
Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `AGENT_OPT_WORKER_URL` | Worker service URL | `http://agent-opt-worker.railway.internal:8080` |
| `FUTUREAGI_API_KEY` | FutureAGI API key (worker reads this) | (none) |
| `FUTUREAGI_SECRET_KEY` | FutureAGI secret key (worker reads this) | (none) |
| `OPENAI_API_KEY` | OpenAI key for optimization (worker reads this) | (none) |
The orchestrator only needs AGENT_OPT_WORKER_URL. The worker reads the FutureAGI and OpenAI keys directly.
Sources: orchestrator/core/services/futureagi_service.py:24-26, services/agent-opt-worker/main.py:41-51
Worker Isolation
The agent-opt-worker runs in a separate container to isolate the agent-opt and ai-evaluation SDKs from the main orchestrator, preventing version conflicts and dependency bloat.
Docker Setup:
Sources: services/agent-opt-worker/Dockerfile:1-16, services/agent-opt-worker/requirements.txt:1-7
Template Configuration
The worker uses a TEMPLATE_CONFIG dictionary to define:
- Required input keys per template (`input`, `output`, `context`)
- Best model per template (`turing_large`, `protect`, etc.)
| Template | Required keys | Model |
|---|---|---|
| `completeness` | input, output | `turing_large` |
| `is_helpful` | input, output | `turing_large` |
| `is_concise` | output | `turing_large` |
| `toxicity` | output | `protect` |
| `prompt_injection` | input | `protect` |
| `content_moderation` | output | `protect` |
This ensures each evaluation template receives only the data it expects.
Sources: services/agent-opt-worker/main.py:124-136
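The idea can be sketched as a dictionary plus a payload filter; the exact dictionary shape is an assumption (see `services/agent-opt-worker/main.py` for the real one):

```python
TEMPLATE_CONFIG = {
    "completeness":       {"keys": ["input", "output"], "model": "turing_large"},
    "is_helpful":         {"keys": ["input", "output"], "model": "turing_large"},
    "is_concise":         {"keys": ["output"],          "model": "turing_large"},
    "toxicity":           {"keys": ["output"],          "model": "protect"},
    "prompt_injection":   {"keys": ["input"],           "model": "protect"},
    "content_moderation": {"keys": ["output"],          "model": "protect"},
}

def build_eval_inputs(template: str, row: dict) -> dict:
    """Pass each evaluation template only the keys it declares."""
    cfg = TEMPLATE_CONFIG[template]
    return {k: row[k] for k in cfg["keys"]}

row = {"input": "Hi", "output": "Hello!", "context": "greeting"}
inputs = build_eval_inputs("is_concise", row)
# only the "output" key survives for is_concise
```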
Summary
The System Prompt Management subsystem provides:
- Versioned Storage: PostgreSQL-backed prompts with a draft/active/archived lifecycle
- Cached Access: 60-second TTL cache with DB and hardcoded fallbacks
- Admin Interface: REST APIs and a React UI for CRUD operations
- Evaluation System: FutureAGI integration for quality scoring and optimization
- Live Learning: Automatic scoring of chat responses to build optimization datasets
- Worker Isolation: Separate container for SDK dependencies
All LLM-facing prompts in the platform are managed through this system, enabling centralized updates, A/B testing via draft versions, and continuous improvement through FutureAGI optimization.
Sources: orchestrator/core/models/system_prompts.py:1-205, orchestrator/core/services/prompt_registry.py:1-200, orchestrator/api/admin_prompts.py:1-466, orchestrator/core/services/futureagi_service.py:1-432, services/agent-opt-worker/main.py:1-545