PRD 17: Dynamic Tool Assignment & Centralized Tool Management

Status: ✅ COMPLETE - All Phases Deployed to Production
Priority: P0 - Critical for Task-Agnostic Platform
Completion Date: October 18, 2024
Total Effort: 5 weeks
Dependencies: Existing tool systems (agent_platform_tools, agent_action_executor, mcp_tool_executor)

See PRD17_COMPLETION_REPORT.md for detailed results and metrics.

Executive Summary

Transform Automatos AI from a research-focused platform into a task-agnostic orchestration system that can handle any workflow task (restarting servers, updating databases, writing code, creating PRs, fixing bugs, analyzing data, etc.) through:

Centralized Tool Registry - Single source of truth for all tools accessible to Orchestrator, agents, ChatBot, User, and future integrations
Dynamic Tool Assignment - Agents automatically receive the right tools based on task type
Complete Tool Integration - Unify file operations, shell commands, RAG, CodeGraph, and MCP tools under one system

Current Problem: agent_factory.py hardcodes only 3 research tools (search_knowledge, semantic_search, search_codebase), preventing agents from accessing file operations, shell commands, or MCP tools needed for diverse tasks.

Target State: Agents dynamically receive appropriate tool sets based on task requirements:

Code tasks → file ops + GitHub tools
Infrastructure tasks → shell commands + AWS tools
Research tasks → RAG + CodeGraph
Database tasks → SQL ops + shell tools

1. Problem Statement & Vision

1.1 Current State Analysis

Existing Tool Systems (6 separate files managing tools):

agent_platform_tools.py - 3 research tools (RAG, semantic, CodeGraph)
agent_action_executor.py - File/shell operations (read, write, execute)
mcp_tool_executor.py - MCP tool execution with permissions
function_registry.py - LLM function calling registry (OpenAI format)
agent_factory.py - Hardcoded tool injection (lines 627-650)
models.py - Database models (MCPTool, AgentToolAssignment, ToolUsageLog)

Critical Issues:

No unified tool registry accessible to all components
Agent factory hardcodes only research tools
Tools scattered across 6 files with different interfaces
No dynamic tool assignment based on task type
MCP tools exist but are not available to agents
Context Engineering, ChatBot, and Orchestrator each access tools through a different interface
Risk of breaking existing functionality when integrating

1.2 Vision Alignment

Following the Context Engineering paradigm:

Atoms: Individual tool functions (read_file, search_knowledge, execute_command)
Molecules: Grouped tool capabilities (file_operations, research_tools, infrastructure_tools)
Cells: Agent tool assignments (code_architect gets GitHub + file ops)
Organs: Multi-agent workflows with coordinated tool access
Organisms: Complete task-agnostic orchestration system
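The first three levels of this hierarchy can be pictured as plain data structures. The following is a minimal sketch, not the platform's code: the stub bodies, the `MOLECULES` and `CELLS` dictionaries, and `tools_for_agent` are hypothetical names introduced for illustration, and the GitHub tools mentioned above are left out because they are loaded through MCP.

```python
from typing import Callable, Dict, List


# Atoms: individual tool functions (stubbed here; the real ones perform I/O).
def read_file(path: str) -> str:
    return f"<contents of {path}>"


def search_knowledge(query: str) -> str:
    return f"<results for {query}>"


def execute_command(command: str) -> str:
    return f"<output of {command}>"


# Molecules: named groups of atoms (hypothetical grouping for illustration).
MOLECULES: Dict[str, List[Callable[[str], str]]] = {
    "file_operations": [read_file],
    "research_tools": [search_knowledge],
    "infrastructure_tools": [execute_command],
}

# Cells: an agent role mapped to the molecules it is assigned.
CELLS: Dict[str, List[str]] = {
    "code_architect": ["file_operations"],
}


def tools_for_agent(role: str) -> List[str]:
    """Flatten a role's molecules into the names of its atoms."""
    return [fn.__name__ for group in CELLS.get(role, []) for fn in MOLECULES[group]]
```

Organs and organisms would then be compositions of several such cells inside a workflow.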

2. Solution Architecture

2.1 Centralized Tool Registry (ToolRegistry)

Location: orchestrator/services/tool_registry.py ✅ CREATED

Purpose: Single source of truth for ALL tools in the platform

Tool Categories:

from enum import Enum


class ToolCategory(Enum):
    RESEARCH = "research"              # RAG, semantic search, CodeGraph
    FILE_OPERATIONS = "file_ops"       # read, write, delete files
    SHELL_COMMANDS = "shell"           # execute shell commands
    MCP_TOOLS = "mcp"                  # Third-party MCP integrations
    DATABASE_TOOLS = "database"        # SQL operations (future)
    SSH_TOOLS = "ssh"                  # SSH operations (future)
    API_TOOLS = "api"                  # REST API calls (future)
    GIT_OPERATIONS = "git"             # Git operations (future)

Registered Tools (Phase 1):

Research: search_knowledge, semantic_search, search_codebase
File Operations: read_file, write_file, delete_file, list_directory, create_directory
Shell Commands: execute_command
MCP Tools: Dynamically loaded from database

2.2 Task-to-Tool Mapping (ToolCapabilityMapper)

Location: orchestrator/services/tool_capability_mapper.py ✅ CREATED

Task Type Mappings (25+ predefined; representative examples shown):

| Task Type | Required Tools | Optional Tools | Rationale |
| --- | --- | --- | --- |
| code_review | research, file_ops | mcp | Requires reading files and researching patterns |
| bug_fix | research, file_ops, shell | mcp | Needs code analysis, modifications, and testing |
| security_audit | research, file_ops, shell | mcp | Requires analysis, inspection, and security tools |
| server_restart | shell | ssh, mcp | Needs command execution |
| deployment | shell, file_ops | ssh, mcp | Requires configuration and commands |
| database_update | database, shell | file_ops | Needs SQL operations and validation |
| create_pr | file_ops, mcp | shell, research | Requires file changes and GitHub integration |
| documentation | research, file_ops | - | Research and writing |
| data_analysis | research, file_ops | database, shell | Analysis and processing |
| general | research | - | Safe default |
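The mappings above lend themselves to a simple lookup structure. The sketch below encodes a few of the rows and falls back to `general` for unknown task types; `ToolMapping`, `TASK_TOOL_MAPPINGS`, and `categories_for` are hypothetical names, and the real ToolCapabilityMapper may organize this differently.

```python
from typing import Dict, List, NamedTuple


class ToolMapping(NamedTuple):
    required: List[str]
    optional: List[str]


# A subset of the mappings from the table above.
TASK_TOOL_MAPPINGS: Dict[str, ToolMapping] = {
    "code_review": ToolMapping(["research", "file_ops"], ["mcp"]),
    "bug_fix": ToolMapping(["research", "file_ops", "shell"], ["mcp"]),
    "server_restart": ToolMapping(["shell"], ["ssh", "mcp"]),
    "database_update": ToolMapping(["database", "shell"], ["file_ops"]),
    "general": ToolMapping(["research"], []),
}


def categories_for(task_type: str, include_optional: bool = False) -> List[str]:
    """Return tool categories for a task type, using 'general' as the safe default."""
    mapping = TASK_TOOL_MAPPINGS.get(task_type, TASK_TOOL_MAPPINGS["general"])
    categories = list(mapping.required)
    if include_optional:
        categories += [c for c in mapping.optional if c not in categories]
    return categories
```

Falling back to `general` means an unrecognized task type still receives the research tools, which matches the "safe default" rationale in the table.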

2.3 Unified Tool Execution

Architecture:

┌─────────────────────────────────────────────────────────────────┐
│                    CENTRALIZED TOOL REGISTRY                     │
│                  (New: tool_registry.py)                         │
├─────────────────────────────────────────────────────────────────┤
│  Provides tools to:                                              │
│  ├─ Orchestrator (task decomposition with tool recommendations)  │
│  ├─ Agent Factory (dynamic tool injection)                       │
│  ├─ ChatBot (tool-augmented responses)                          │
│  ├─ User/API (tool discovery & execution)                        │
│  └─ Future integrations (plugins, extensions)                    │
├─────────────────────────────────────────────────────────────────┤
│  Wraps existing systems (NON-BREAKING):                          │
│  ├─ AgentPlatformTools (research: RAG, semantic, CodeGraph)     │
│  ├─ ActionExecutor (file_ops, shell commands)                   │
│  ├─ MCPToolExecutor (third-party integrations)                  │
│  ├─ FunctionRegistry (LLM function calling format)              │
│  └─ Database (MCPTool, AgentToolAssignment models)              │
└─────────────────────────────────────────────────────────────────┘
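One way to wrap the existing systems without modifying them is a thin routing layer that forwards each call to whichever backend owns the tool's category. The sketch below assumes this adapter approach; `FakePlatformTools`, `FakeActionExecutor`, and their method signatures are stand-ins invented for the example, not the real interfaces of AgentPlatformTools or ActionExecutor.

```python
from typing import Any, Callable, Dict

Backend = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class FakePlatformTools:
    """Stand-in for the research tool system (hypothetical interface)."""

    def run(self, tool: str, params: Dict[str, Any]) -> Dict[str, Any]:
        return {"source": "platform_tools", "tool": tool, "params": params}


class FakeActionExecutor:
    """Stand-in for the file/shell executor (hypothetical interface)."""

    def execute(self, action: str, **params: Any) -> Dict[str, Any]:
        return {"source": "action_executor", "tool": action, "params": params}


class UnifiedToolExecutor:
    """Routes calls by category; the wrapped backends stay unchanged."""

    def __init__(self) -> None:
        self._routes: Dict[str, Backend] = {}

    def add_route(self, category: str, backend: Backend) -> None:
        self._routes[category] = backend

    def execute(self, category: str, tool: str, params: Dict[str, Any]) -> Dict[str, Any]:
        try:
            backend = self._routes[category]
        except KeyError:
            raise LookupError(f"no backend registered for category '{category}'") from None
        return backend(tool, params)


platform = FakePlatformTools()
actions = FakeActionExecutor()

executor = UnifiedToolExecutor()
executor.add_route("research", platform.run)
# Adapt the keyword-argument style of the second backend to the common signature.
executor.add_route("file_ops", lambda tool, params: actions.execute(tool, **params))
```

Because each backend is adapted at the route, existing callers of the wrapped systems are unaffected, which is the non-breaking property the diagram calls for.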

3. Implementation Status

Phase 1: Centralized Tool Registry ✅ COMPLETED

Step 1.1: Create ToolRegistry Service ✅

File: orchestrator/services/tool_registry.py (NEW)

Features:

Register all existing tools from platform_tools, action_executor, MCP
Categorize tools by type (research, file_ops, shell, mcp, etc.)
Provide query methods (by category, by task type, by name)
Export to OpenAI function calling format
Support for tool metadata (permissions, requirements, examples)
Security level tracking (safe, cautious, dangerous, critical)
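Two of the features above, the OpenAI function calling export and security level tracking, combine naturally: only tools at or below a permitted level are exported to the model. The sketch below assumes the four levels are ordered from safe to critical; `ToolSpec`, `to_openai_tool`, `export_tools`, and the levels assigned to the example tools are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Any, Dict, List


class SecurityLevel(IntEnum):
    # Ordering is an assumption; the source only names the four levels.
    SAFE = 0
    CAUTIOUS = 1
    DANGEROUS = 2
    CRITICAL = 3


@dataclass
class ToolSpec:
    name: str
    description: str
    security: SecurityLevel
    parameters: Dict[str, Any] = field(default_factory=dict)  # JSON Schema properties
    required: List[str] = field(default_factory=list)


def to_openai_tool(spec: ToolSpec) -> Dict[str, Any]:
    """Render one tool in the OpenAI function calling (tools) format."""
    return {
        "type": "function",
        "function": {
            "name": spec.name,
            "description": spec.description,
            "parameters": {
                "type": "object",
                "properties": spec.parameters,
                "required": spec.required,
            },
        },
    }


def export_tools(specs: List[ToolSpec], max_level: SecurityLevel) -> List[Dict[str, Any]]:
    """Export only the tools whose security level does not exceed max_level."""
    return [to_openai_tool(s) for s in specs if s.security <= max_level]


specs = [
    ToolSpec("read_file", "Read a file from the workspace", SecurityLevel.SAFE,
             {"path": {"type": "string"}}, ["path"]),
    ToolSpec("execute_command", "Run a shell command", SecurityLevel.DANGEROUS,
             {"command": {"type": "string"}}, ["command"]),
]
exported = export_tools(specs, SecurityLevel.CAUTIOUS)
```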

Registered Tools:

3 Research tools (search_knowledge, semantic_search, search_codebase)
5 File operation tools (read, write, delete, list, create_directory)
1 Shell command tool (execute_command)
MCP tools dynamically loaded from database

Step 1.2: Tool Capability Mapper ✅

File: orchestrator/services/tool_capability_mapper.py (NEW)

Features:

25+ predefined task-to-tool mappings
Task type inference from descriptions (keyword-based)
Extension point for LLM-based inference on complex task descriptions (future)
Custom mapping support via database
Security level recommendations
Tool combination validation
Context-aware recommendations
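Keyword-based task type inference, as listed above, can be sketched as counting keyword hits per task type and choosing the best match. The keyword lists and the confidence formula here are illustrative assumptions, not the mapper's actual vocabulary or scoring.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical keyword lists; the real mapper's vocabulary is not shown in this PRD.
KEYWORDS: Dict[str, List[str]] = {
    "bug_fix": ["bug", "fix", "error", "crash"],
    "server_restart": ["restart", "reboot"],
    "deployment": ["deploy", "release", "rollout"],
    "database_update": ["database", "sql", "migration", "table"],
    "documentation": ["document", "docs", "readme"],
}


def infer_task_type(description: str) -> Tuple[str, float]:
    """Return (task_type, confidence); falls back to ('general', 0.0) on no match."""
    words = set(re.findall(r"[a-z_]+", description.lower()))
    best_type, best_hits = "general", 0
    for task_type, keywords in KEYWORDS.items():
        hits = sum(1 for kw in keywords if kw in words)
        if hits > best_hits:
            best_type, best_hits = task_type, hits
    if best_hits == 0:
        return "general", 0.0
    # Confidence here is simply the fraction of that type's keywords that matched.
    return best_type, best_hits / len(KEYWORDS[best_type])
```

Whole-word matching keeps the sketch predictable but misses inflections such as "restarting"; that kind of gap is where the planned LLM-based inference would help.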

Step 1.3: Database Schema ✅

File: orchestrator/alembic/versions/004_tool_registry_schema.py (NEW)

Tables:

tool_registry - All registered tools with metadata
task_tool_mappings - Custom task-to-tool mappings
Enhanced tool_usage_logs - Additional tracking fields
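To show how the first two tables might be used, the sketch below builds them in an in-memory SQLite database and reads a custom mapping back. Only the table names come from this document; every column name is an assumption, and the real schema is defined by the Alembic migration rather than by raw SQL like this.

```python
import json
import sqlite3
from typing import Dict, List, Optional

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Column names below are hypothetical.
    CREATE TABLE tool_registry (
        name            TEXT PRIMARY KEY,
        category        TEXT NOT NULL,
        security_level  TEXT NOT NULL DEFAULT 'safe',
        metadata        TEXT NOT NULL DEFAULT '{}'
    );
    CREATE TABLE task_tool_mappings (
        task_type            TEXT PRIMARY KEY,
        required_categories  TEXT NOT NULL,
        optional_categories  TEXT NOT NULL DEFAULT '[]'
    );
    """
)
conn.execute(
    "INSERT INTO tool_registry (name, category) VALUES (?, ?)",
    ("read_file", "file_ops"),
)
conn.execute(
    "INSERT INTO task_tool_mappings (task_type, required_categories) VALUES (?, ?)",
    ("code_review", json.dumps(["research", "file_ops"])),
)


def load_custom_mapping(task_type: str) -> Optional[Dict[str, List[str]]]:
    """Fetch a custom task-to-tool mapping, or None if no row exists."""
    row = conn.execute(
        "SELECT required_categories, optional_categories "
        "FROM task_tool_mappings WHERE task_type = ?",
        (task_type,),
    ).fetchone()
    if row is None:
        return None
    return {"required": json.loads(row[0]), "optional": json.loads(row[1])}
```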

Step 1.4: Unit Tests ✅

Files:

tests/test_tool_registry.py (NEW) - ToolRegistry tests
tests/test_tool_capability_mapper.py (NEW) - Mapper tests
tests/test_tool_system_integration.py (NEW) - Integration tests

Test Coverage:

Tool registration and retrieval
Category-based queries
OpenAI format export
Prompt generation
Task type inference
Tool recommendations
Security validation
End-to-end scenarios

4. Next Steps (Phase 2-5)

Phase 2: Dynamic Tool Assignment (Week 2: 16-20h)

Update RealTaskDecomposer to output required_tools
Update AgentFactory with dynamic tool injection
Create UnifiedToolExecutor
Integration testing
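The Phase 2 hand-off could look roughly like this: the decomposer attaches `required_tools` to each subtask, and the factory expands those categories into concrete tools when building the agent. The `Subtask` and `Agent` dataclasses and `build_agent` are hypothetical; falling back to the research tools when nothing matches is an assumption meant to mirror the current hardcoded behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Phase 1 tools grouped by category, as registered in the tool registry.
CATEGORY_TOOLS: Dict[str, List[str]] = {
    "research": ["search_knowledge", "semantic_search", "search_codebase"],
    "file_ops": ["read_file", "write_file", "delete_file", "list_directory", "create_directory"],
    "shell": ["execute_command"],
}


@dataclass
class Subtask:
    description: str
    task_type: str
    required_tools: List[str]  # category names emitted by the decomposer


@dataclass
class Agent:
    role: str
    tools: List[str] = field(default_factory=list)


def build_agent(role: str, subtask: Subtask) -> Agent:
    """Inject tools based on the subtask instead of a hardcoded list."""
    tools: List[str] = []
    for category in subtask.required_tools:
        tools.extend(CATEGORY_TOOLS.get(category, []))
    if not tools:
        # Assumed fallback: keep the previous research-only default.
        tools = list(CATEGORY_TOOLS["research"])
    return Agent(role=role, tools=tools)


restart = Subtask("Restart the API server", "server_restart", ["shell"])
agent = build_agent("infra_operator", restart)
```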

Phase 3: Complete Integration (Week 3: 16-20h)

Orchestrator integration
ChatBot integration
Frontend enhancements
API endpoints

Phase 4: Testing & Validation (Week 4: 12-16h)

End-to-end testing
Backward compatibility testing
Performance testing
Documentation

Phase 5: Cleanup & Polish (Week 5: 8h)

Code cleanup
Documentation
UI polish

5. API Endpoints (Planned)

# Tool Registry
GET /api/v1/tools/registry              # List all registered tools
GET /api/v1/tools/categories            # List tool categories
GET /api/v1/tools/registry/{name}       # Get specific tool
POST /api/v1/tools/recommend            # Recommend tools for task
    Body: {"task_description": "...", "task_type": "..."}
    Response: {"tools": [...], "confidence": 0.95}

# Agent Tool Management
GET /api/v1/agents/{id}/tools           # Get agent's assigned tools
PUT /api/v1/agents/{id}/tools           # Update agent tool assignments
    Body: {"tool_categories": ["research", "file_ops"]}
POST /api/v1/agents/{id}/tools/execute  # Execute tool for agent
    Body: {"tool_name": "...", "params": {...}}

# Task Tool Analysis
POST /api/v1/tasks/analyze-tools        # Analyze required tools for task
    Body: {"task_description": "..."}
    Response: {
        "task_type": "code_review",
        "recommended_tools": ["file_ops", "research", "mcp:github"],
        "required_categories": ["research", "file_ops"],
        "rationale": "..."
    }
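Since these endpoints are planned rather than built, the following is only a sketch of how the body of `PUT /api/v1/agents/{id}/tools` might be validated before any assignment is stored. The function name and error messages are hypothetical; the allowed values are the ToolCategory values from section 2.1.

```python
from typing import Any, Dict, List

# Values of ToolCategory from section 2.1.
KNOWN_CATEGORIES = {"research", "file_ops", "shell", "mcp", "database", "ssh", "api", "git"}


def validate_tool_assignment(body: Dict[str, Any]) -> List[str]:
    """Validate {"tool_categories": [...]} and return de-duplicated categories."""
    categories = body.get("tool_categories")
    if not isinstance(categories, list) or not categories:
        raise ValueError("tool_categories must be a non-empty list")
    if not all(isinstance(c, str) for c in categories):
        raise ValueError("tool_categories must contain only strings")
    unknown = sorted(set(categories) - KNOWN_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {', '.join(unknown)}")
    # dict.fromkeys removes duplicates while preserving the caller's order.
    return list(dict.fromkeys(categories))
```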

6. Success Criteria

Functional Requirements

All 6 tool systems unified under ToolRegistry
Tool capability mapper with 25+ task mappings
Database schema for tool registry
Agents dynamically receive tools based on task type
File operations available to agents for code tasks
Shell commands available for infrastructure tasks
MCP tools dynamically assigned to relevant agents
ChatBot can use tools for augmented responses
No breaking changes to existing workflows

Performance Requirements

Tool registry query: <50ms
Dynamic assignment overhead: <200ms
Tool execution latency: <5s (varies by tool)
Memory overhead: <50MB for full registry
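A latency budget such as the 50ms registry query target can be checked with a small timing helper. This sketch measures a category filter over a synthetic 1,000-entry dictionary; the data and the helper name `timed` are made up for illustration, and a real check would measure the actual registry under realistic load.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")

REGISTRY_QUERY_BUDGET_MS = 50.0  # target from the performance requirements


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn()
    return result, (time.perf_counter() - start) * 1000.0


# Synthetic registry: 1,000 tools alternating between two categories.
tools = {
    f"tool_{i}": {"category": "research" if i % 2 else "file_ops"}
    for i in range(1000)
}

matches, elapsed_ms = timed(
    lambda: [name for name, meta in tools.items() if meta["category"] == "research"]
)
within_budget = elapsed_ms < REGISTRY_QUERY_BUDGET_MS
```

A single measurement is noisy, so a performance test would normally repeat the call and compare a percentile against the budget.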

Quality Requirements

Tool recommendation accuracy: >85%
Backward compatibility: 100% of existing workflows continue to work
Security validation: 100% of dangerous commands blocked
Test coverage: >80% for new components

7. Files Created (Phase 1)

New Services

orchestrator/services/tool_registry.py - Centralized tool registry (507 lines)
orchestrator/services/tool_capability_mapper.py - Task-to-tool mapping (387 lines)

Database Migration

orchestrator/alembic/versions/004_tool_registry_schema.py - Schema migration

Tests

orchestrator/tests/test_tool_registry.py - Unit tests for registry
orchestrator/tests/test_tool_capability_mapper.py - Unit tests for mapper
orchestrator/tests/test_tool_system_integration.py - Integration tests

8. Deployment Instructions

Deploy to Backend Server

# From local machine
cd /Users/gkavanagh/Development/Automatos-AI-Platform/automatos-ai
scp -i ~/.ssh/id_rsa -r orchestrator/services/tool_registry.py [email protected]:/root/automatos-ai/orchestrator/services/
scp -i ~/.ssh/id_rsa -r orchestrator/services/tool_capability_mapper.py [email protected]:/root/automatos-ai/orchestrator/services/
scp -i ~/.ssh/id_rsa -r orchestrator/alembic/versions/004_tool_registry_schema.py [email protected]:/root/automatos-ai/orchestrator/alembic/versions/
scp -i ~/.ssh/id_rsa -r orchestrator/tests/test_tool_*.py [email protected]:/root/automatos-ai/orchestrator/tests/

Run Database Migration

ssh -i ~/.ssh/id_rsa [email protected]
cd /root/automatos-ai/orchestrator
alembic upgrade head

Test on Server

# Test imports
python3 -c "from services.tool_registry import get_tool_registry; r = get_tool_registry(); print(f'Tools: {len(r.tools)}')"

# Run tests
python3 -m pytest tests/test_tool_registry.py -v
python3 -m pytest tests/test_tool_capability_mapper.py -v
python3 -m pytest tests/test_tool_system_integration.py -v

9. Risk Mitigation

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Breaking existing code | Critical | Non-breaking additive changes, comprehensive testing |
| Security vulnerabilities | High | Sandboxed execution, command whitelisting, permission checks |
| Performance degradation | Medium | Caching, lazy loading, query optimization |
| Tool execution failures | Medium | Retry logic, fallbacks, graceful degradation |
| Complexity increase | Medium | Clear interfaces, good documentation, gradual rollout |
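The command whitelisting mitigation can be sketched as a check that runs before `execute_command` is allowed to proceed. The allowlist contents and the function name are assumptions; the check is deliberately conservative and rejects any token containing a shell metacharacter, even inside quotes.

```python
import shlex
from typing import Set

# Hypothetical allowlist; a real deployment would define its own.
ALLOWED_COMMANDS: Set[str] = {"ls", "cat", "git", "pytest", "systemctl"}
SHELL_METACHARACTERS = ";|&<>`$"


def is_command_allowed(command: str) -> bool:
    """Allow a command only if its program is allowlisted and it has no shell metacharacters."""
    if "\n" in command or "\r" in command:
        return False  # multi-line input could smuggle a second command
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes
    if not tokens or tokens[0] not in ALLOWED_COMMANDS:
        return False
    return not any(ch in token for token in tokens for ch in SHELL_METACHARACTERS)
```

An allowlist check like this is best paired with executing the parsed tokens directly rather than through a shell, so that nothing the check missed gets interpreted as shell syntax.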

10. Timeline & Effort

Week 1 (12-16h): Foundation - Tool Registry ✅ COMPLETE

Create tool_registry.py service (507 lines)
Create tool_capability_mapper.py (387 lines)
Database schema + migrations
Unit and integration tests

Week 2 (16-20h): Dynamic Assignment

Update RealTaskDecomposer to output required_tools
Update AgentFactory dynamic injection
Create UnifiedToolExecutor
Integration testing

Week 3 (16-20h): Complete Integration

Orchestrator integration
ChatBot integration
Frontend enhancements
API endpoints

Week 4 (12-16h): Testing & Validation

End-to-end testing
Backward compatibility
Performance testing
Documentation

Week 5 (8h): Cleanup & Polish

Code cleanup
Documentation
UI polish

Total: 64-80 hours (5 weeks)

Phase 1 Complete - Ready for Testing

Files Created:

✅ tool_registry.py - 507 lines
✅ tool_capability_mapper.py - 387 lines
✅ 004_tool_registry_schema.py - Database migration
✅ test_tool_registry.py - 276 lines of tests
✅ test_tool_capability_mapper.py - 348 lines of tests
✅ test_tool_system_integration.py - 281 lines of tests

Next: Deploy to backend server, run migration, execute tests, validate Phase 1 before proceeding to Phase 2.

