Quality Assessment & Learning
This page documents the quality assessment and learning system for workflow recipes. After a recipe execution completes, two optional analysis stages can be triggered: quality assessment (Stage 7), which evaluates execution performance across five dimensions, and learning analysis (Stage 6), which extracts patterns and generates improvement suggestions. Results are stored in the recipe's learning_data field and used to continuously improve recipe performance.
For information about recipe execution itself, see Recipe Execution. For creating and configuring recipes, see Creating Recipes.
System Overview
The quality assessment and learning system operates as post-execution analysis stages that provide feedback for recipe improvement. The system consists of:
RecipeQualityService: Evaluates execution quality across 5 dimensions
RecipeLearningService: Extracts patterns and generates improvement suggestions
Learning Data Storage: JSONB field on the workflow_recipes table storing historical analyses
Suggestions API: Retrieves accumulated learning insights for display
Both services analyze the step_results field from a completed RecipeExecution record to derive metrics and insights.
Sources: orchestrator/api/workflow_recipes.py:709-828
Quality Assessment System
Assessment Trigger
Quality assessment is triggered via the assess-quality API endpoint after a recipe execution completes.
The endpoint validates:
Recipe exists in workspace
Execution belongs to this recipe
Execution has completed (status = 'completed' or 'failed')
Sources: orchestrator/api/workflow_recipes.py:770-828
Five-Dimensional Quality Model
Sources: orchestrator/api/workflow_recipes.py:770-828
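This page does not enumerate the five dimensions here. Purely as an illustration of how a weighted five-dimension score could combine into a single 0.0-1.0 value, the sketch below assumes dimension names (success rate, duration, token efficiency, error rate, output quality) and weights; both are invented for this example.

```python
# Illustrative only: the five dimension names and weights below are
# assumptions, since this page does not enumerate them.
DIMENSION_WEIGHTS = {
    "success_rate": 0.30,
    "duration": 0.20,
    "token_efficiency": 0.20,
    "error_rate": 0.15,
    "output_quality": 0.15,
}

def weighted_quality_score(dimension_scores: dict) -> float:
    """Combine per-dimension scores (each 0.0-1.0) into one weighted score."""
    total = sum(
        DIMENSION_WEIGHTS[name] * dimension_scores.get(name, 0.0)
        for name in DIMENSION_WEIGHTS
    )
    return round(total, 3)
```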
Quality Score Calculation
The RecipeQualityService.assess_quality() method computes:
quality_score (float, 0.0-1.0): Weighted average of the 5 dimensions
breakdown (dict): Per-dimension scores and explanations
grade (string): Letter grade (A: 0.9+, B: 0.8+, C: 0.7+, D: 0.6+, F: <0.6)
bottlenecks (list): Steps with poor performance or errors
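The grade mapping is a straight threshold lookup over the score; a minimal sketch:

```python
def letter_grade(score: float) -> str:
    """Map a 0.0-1.0 quality score to the letter grades described above."""
    if score >= 0.9:
        return "A"
    if score >= 0.8:
        return "B"
    if score >= 0.7:
        return "C"
    if score >= 0.6:
        return "D"
    return "F"
```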
The resulting score is stored on the recipe record.
Sources: orchestrator/api/workflow_recipes.py:814-820
Frontend Quality Display
Quality scores are displayed on recipe cards with color-coded progress bars.
Sources: frontend/components/workflows/recipes-tab.tsx:298-371
The quality bar is rendered inline on each recipe card.
Sources: frontend/components/workflows/recipes-tab.tsx:353-371
Learning System
Learning Analysis Trigger
Learning analysis is triggered via API after an execution completes.
The endpoint validates ownership and calls RecipeLearningService.analyze_execution().
Sources: orchestrator/api/workflow_recipes.py:713-768
Pattern Extraction
The learning service extracts three types of patterns from execution results:
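The page does not name the three pattern types here. As an illustration only, the sketch below assumes success, failure, and performance categories derived from step_results; the category names, field names, and thresholds are all assumptions.

```python
# Sketch of pattern extraction over step_results. The three pattern
# categories (success, failure, performance) and the step fields used
# are assumptions, since this page does not name them.
def extract_patterns(step_results: list) -> list:
    patterns = []
    failed = [s for s in step_results if s.get("status") == "failed"]
    if failed:
        patterns.append({"type": "failure",
                         "detail": f"{len(failed)} step(s) failed"})
    slow = [s for s in step_results if s.get("duration_ms", 0) > 60_000]
    if slow:
        patterns.append({"type": "performance",
                         "detail": f"{len(slow)} step(s) exceeded 60s"})
    if not failed:
        patterns.append({"type": "success", "detail": "all steps completed"})
    return patterns
```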
Sources: orchestrator/api/workflow_recipes.py:755-760
Learning Data Schema
The learning_data JSONB field stores:
latest_suggestions (list): Most recent improvement suggestions
latest_patterns (list): Most recent pattern observations
latest_performance (dict): Most recent performance metrics
last_analyzed_at (string): ISO timestamp of last analysis
analyses (list): Historical analysis results (append-only)
Example structure:
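A hypothetical learning_data value: the top-level keys follow the schema above, but the nested field names and all values are invented for illustration.

```python
# Hypothetical learning_data contents. Top-level keys follow the schema
# documented above; nested fields and values are invented.
learning_data = {
    "latest_suggestions": [
        {"type": "retry_logic", "description": "Add error handling to step 3"}
    ],
    "latest_patterns": [
        {"type": "performance", "detail": "step 2 consistently exceeds 60s"}
    ],
    "latest_performance": {"total_duration_ms": 84200, "total_tokens": 15300},
    "last_analyzed_at": "2026-02-01T12:00:00Z",
    "analyses": [
        {"analyzed_at": "2026-02-01T12:00:00Z", "suggestion_count": 1}
    ],
}
```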
Sources: orchestrator/api/workflow_recipes.py:853-863
Suggestions Retrieval
The suggestions endpoint exposes accumulated learning insights.
Response:
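A hypothetical response body for the suggestions endpoint; apart from the suggestions, patterns, and last-analyzed timestamp described on this page, the field names and values are assumptions.

```python
# Hypothetical response for GET .../suggestions. Field names beyond
# those described on this page are assumptions.
response = {
    "recipe_id": "a1b2c3",
    "suggestions": [
        {"type": "parallelization",
         "description": "Steps 2 and 3 have no dependency and could run in parallel"}
    ],
    "patterns": [
        {"type": "performance", "detail": "step 4 is the slowest step"}
    ],
    "last_analyzed_at": "2026-02-01T12:00:00Z",
}
```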
Sources: orchestrator/api/workflow_recipes.py:830-869
Suggestions UI Integration
The recipe card displays a suggestions badge when learning data exists.
Clicking the badge opens the recipe detail modal which displays the full suggestions panel.
Sources: frontend/components/workflows/recipes-tab.tsx:338-347
Execution Tracking
RecipeExecution Model
The recipe_executions table tracks execution state for quality and learning analysis.
Sources: orchestrator/alembic/versions/20260201_add_recipe_executions.py:23-43
Step Results Format
The step_results JSONB array stores per-step execution data:
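A hypothetical step_results array; the per-step field names are assumptions based on the metrics this page says are derived from them.

```python
# Hypothetical step_results entries. Field names are assumptions based
# on the metrics this page says are derived from them.
step_results = [
    {
        "step_id": "step-1",
        "status": "completed",
        "duration_ms": 4200,
        "tokens_used": 1850,
        "error": None,
    },
    {
        "step_id": "step-2",
        "status": "failed",
        "duration_ms": 61000,
        "tokens_used": 900,
        "error": "timeout waiting for agent response",
    },
]
```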
This data is the primary input to both quality assessment and learning analysis.
Sources: orchestrator/alembic/versions/20260201_add_recipe_executions.py:34, frontend/components/workflows/recipe-step-progress.tsx:18-37
API Endpoints
Assessment & Learning Endpoints
POST /api/workflow-recipes/{recipe_id}/assess-quality: Trigger quality assessment
POST /api/workflow-recipes/{recipe_id}/learn: Trigger learning analysis
GET /api/workflow-recipes/{recipe_id}/suggestions: Get improvement suggestions
GET /api/workflow-recipes/{recipe_id}/executions: List executions with quality scores
GET /api/workflow-recipes/{recipe_id}/executions/{execution_id}: Get execution detail
Sources: orchestrator/api/workflow_recipes.py:709-928
Execution Listing with Quality Scores
The executions endpoint supports filtering by status and returns per-execution quality scores.
Response:
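A hypothetical response body for the executions listing; the status filter and quality score come from this page, while the remaining field names and all values are assumptions.

```python
# Hypothetical response for GET .../executions?status=completed.
# Field names beyond status and quality_score are assumptions.
response = {
    "executions": [
        {
            "id": "exec-42",
            "status": "completed",
            "quality_score": 0.87,
            "started_at": "2026-02-01T11:58:00Z",
        }
    ],
    "total": 1,
}
```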
Sources: orchestrator/api/workflow_recipes.py:872-928
Complete Quality & Learning Flow
Sources: orchestrator/api/workflow_recipes.py:542-828
Frontend Integration
Recipe Card Quality Display
Quality scores are displayed directly in the recipe grid:
Quality Score Bar: Progress bar with color coding (green/yellow/red)
Suggestions Badge: Lightbulb icon with count of suggestions
Execution Count: Number of runs for statistical confidence
Sources: frontend/components/workflows/recipes-tab.tsx:294-479
React Query Hooks
The frontend uses dedicated React Query hooks for quality and learning data.
Sources: frontend/hooks/use-recipe-api.ts:162-196
Recipe Detail Modal Integration
The view recipe modal displays:
Quality score with grade badge
Latest suggestions in expandable panel
Recent executions with per-execution quality scores
Performance trends across executions
When a user clicks a recipe card's suggestions badge, it opens the modal and scrolls to the suggestions section.
Sources: frontend/components/workflows/recipes-tab.tsx:117-126, frontend/components/workflows/recipes-tab.tsx:494-517
Auto-Learning Configuration
Recipes can enable automatic learning analysis via execution_config.auto_learn:
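A hypothetical execution_config fragment with auto-learning enabled; only the auto_learn key is documented on this page, and the helper function is an assumption added for illustration.

```python
# Hypothetical execution_config fragment. Only the auto_learn key is
# documented on this page; the helper is an illustrative assumption.
execution_config = {"auto_learn": True}

def should_auto_learn(config: dict) -> bool:
    """Whether learning analysis runs automatically after each execution."""
    return bool(config.get("auto_learn", False))
```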
When enabled, the system automatically triggers learning analysis after each completed execution without requiring manual API calls.
Sources: orchestrator/api/workflow_recipes.py:219-227
Quality Threshold Enforcement
The quality_threshold in execution_config can be used to fail executions that don't meet quality standards:
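A sketch of the threshold check; quality_threshold lives in execution_config per this page, but the helper name and the behavior when no threshold is configured are assumptions.

```python
# Hypothetical threshold check. quality_threshold lives in
# execution_config per this page; the helper name is an assumption.
def meets_quality_threshold(quality_score: float, execution_config: dict) -> bool:
    threshold = execution_config.get("quality_threshold")
    if threshold is None:
        return True  # no threshold configured: never fail on quality
    return quality_score >= threshold
```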
If quality assessment is enabled and the execution scores below this threshold, it can be marked as failed or trigger automatic retries.
Sources: orchestrator/api/workflow_recipes.py:219-227
Best Practices
When to Assess Quality
Always: For production recipes with SLAs
Periodically: For development recipes (e.g., every 5th execution)
Never: For simple single-step recipes (minimal benefit)
When to Trigger Learning
After failures: To identify root causes
After quality degradation: When scores drop below baseline
Periodically: Every 10-20 executions to update patterns
Before optimization: To establish baseline metrics
Interpreting Suggestions
Learning suggestions are categorized by type:
Token reduction: Optimize prompts (Medium impact)
Parallelization: Restructure dependencies (High impact, performance)
Retry logic: Add error handling (High impact, reliability)
Timeout increase: Adjust per-step limits (Low impact)
Agent substitution: Use different agent (Medium impact)
Suggestions should be evaluated based on the recipe's quality score trend and execution frequency.
Sources: orchestrator/api/workflow_recipes.py:713-869