Deployment & Infrastructure


Purpose and Scope

This document covers the containerization, orchestration, and deployment infrastructure for Automatos AI. It explains the Docker multi-stage build process, Docker Compose service orchestration, environment variable configuration, database and cache setup, and production deployment strategies.

System Overview

Automatos AI uses a containerized architecture with six services orchestrated by Docker Compose. The system supports both development (hot-reload) and production (optimized) deployment targets through multi-stage Dockerfiles and profile-based service activation.


Sources: docker-compose.yml:1-280


Backend Containerization

Multi-Stage Dockerfile Architecture

The backend uses a three-stage build process to optimize for different deployment scenarios while minimizing image size.


Sources: orchestrator/Dockerfile:1-130, orchestrator/requirements.txt:1-108

Base Stage Dependencies

The base stage installs system-level dependencies required for Python packages and AI operations:

| Dependency | Purpose |
|---|---|
| gcc, g++ | Compile native Python extensions (numpy, pandas, scikit-learn) |
| curl, git | Health checks, repository cloning (PRD-11) |
| postgresql-client | Database connectivity validation |
| libmagic1 | File type detection (python-magic) |
| tesseract-ocr | OCR for document processing (pytesseract) |
| ghostscript | PDF rendering support |
| libpango-1.0-0, libcairo2, libgdk-pixbuf-2.0-0, libffi-dev | WeasyPrint HTML-to-PDF rendering (PRD-63) |

Python Dependencies Installation

Python packages are installed with special handling for futureagi (orchestrator/Dockerfile:42-45):

The --no-deps flag prevents futureagi from overwriting already-installed package versions (requests==2.32.3, pandas==2.2.2, etc.) with its pinned dependencies.
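
The referenced Dockerfile lines are not reproduced on this page; a plausible sketch of the install order (exact package pins and paths are assumptions inferred from the surrounding text):

```dockerfile
# Install pinned requirements first, then futureagi without its own pins,
# so --no-deps cannot downgrade requests/pandas/etc.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt && \
    pip install --no-cache-dir --no-deps futureagi
```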

NLTK Data Pre-download

NLTK tokenizers and stopwords are downloaded at build time to avoid runtime downloads (orchestrator/Dockerfile:48-52):
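
A sketch of what such a build-time download step typically looks like (corpus names inferred from "tokenizers and stopwords"; the exact command is not shown on this page):

```dockerfile
# Bake NLTK data into the image so containers never download at runtime
RUN python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords')"
```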

Sources: orchestrator/Dockerfile:13-53, orchestrator/requirements.txt:1-108

Development Stage

The development stage enables hot-reload for rapid iteration:
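
The stage itself is not shown here; a minimal sketch, assuming the app module is `main:app` (an assumption):

```dockerfile
# Development stage: source is bind-mounted at runtime; --reload restarts
# uvicorn whenever a watched file changes
FROM base AS development
COPY . /app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
```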

Sources: orchestrator/Dockerfile:57-86, docker-compose.yml:78-138

Production Stage

The production stage creates a secure, minimal image:

  1. Cleanup: Removes dev dependencies and caches (orchestrator/Dockerfile:105-109):

  2. Non-root User: Creates automatos user (UID 1000) (orchestrator/Dockerfile:112-113)

  3. Multi-worker: Uses 4 uvicorn workers for concurrency (orchestrator/Dockerfile:129)

  4. Dynamic Port: Supports Railway's PORT environment variable with fallback (orchestrator/Dockerfile:129):

  5. Health Check: Uses ${PORT:-8000} variable in health check command (orchestrator/Dockerfile:122)
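
The steps above can be sketched as a single stage (illustrative only; stage and module names are assumptions):

```dockerfile
FROM base AS production
# 1. Cleanup: drop build caches to shrink the final image
RUN rm -rf /root/.cache/pip /tmp/*
# 2. Non-root user (UID 1000)
RUN useradd --uid 1000 --create-home automatos
USER automatos
# 5. Health check honors the dynamic port
HEALTHCHECK --interval=30s CMD curl -f http://localhost:${PORT:-8000}/health || exit 1
# 3-4. Four uvicorn workers bound to Railway's PORT, falling back to 8000
CMD sh -c "uvicorn main:app --host 0.0.0.0 --port ${PORT:-8000} --workers 4"
```

The shell-form CMD matters here: exec-form CMD would not expand `${PORT:-8000}` at container start.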

Sources: orchestrator/Dockerfile:90-130


Frontend Containerization

Multi-Stage Dockerfile Architecture

The frontend uses a four-stage build to separate dependencies, development, build, and production runtime.


Sources: frontend/Dockerfile:1-115

Build Arguments vs Runtime Variables

The frontend distinguishes between build-time and runtime environment variables:

Build-Time Variables (NEXT_PUBLIC_*)

These are embedded into the client JavaScript bundle during npm run build (frontend/Dockerfile:58-71):

| Variable | Purpose | Security | Default |
|---|---|---|---|
| NEXT_PUBLIC_API_URL | Backend API endpoint | Public (embedded in JS) | None (required) |
| NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY | Clerk authentication | Public (publishable key) | None (required) |
| NEXT_PUBLIC_CLERK_SIGN_IN_URL | Sign-in route | Public | /sign-in |
| NEXT_PUBLIC_CLERK_SIGN_UP_URL | Sign-up route | Public | /sign-up |
| NEXT_PUBLIC_CLERK_AFTER_SIGN_IN_URL | Post-login redirect | Public | /dashboard |
| NEXT_PUBLIC_CLERK_AFTER_SIGN_UP_URL | Post-signup redirect | Public | /dashboard |

⚠️ Security Note: NEXT_PUBLIC_* variables are embedded in client-side JavaScript and must not contain secrets. These values are baked into the bundle at build time (frontend/Dockerfile:66-71) and cannot be changed without rebuilding.

Runtime Variables (Server-Side Only)

Secret variables are only available in server-side contexts and are never exposed to the client:

| Variable | Purpose | Access |
|---|---|---|
| CLERK_SECRET_KEY | Server-side Clerk authentication | Next.js API routes only |
| NODE_ENV | Environment mode | Server runtime |
| HOSTNAME, PORT | Server binding | Production container |

Sources: frontend/Dockerfile:58-71, frontend/Dockerfile:111-113, docker-compose.yml:156-160

Production Stage Optimization

The production image uses Next.js standalone output mode for minimal footprint:

  1. Standalone Build: Next.js traces dependencies and outputs only required files (frontend/Dockerfile:80)

  2. Selective Copy: Copies traced dependencies and static assets from builder stage (frontend/Dockerfile:97-99):

  3. Non-root User: Runs as nextjs user (UID 1001, group nodejs GID 1001) (frontend/Dockerfile:93-94)

  4. Direct Execution: Uses node server.js instead of npm for faster startup (frontend/Dockerfile:114)

  5. Health Check: curl -f http://localhost:3000 with 60s start period (frontend/Dockerfile:107-108)

The standalone output eliminates the full node_modules directory, reducing image size by 60-70%.
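
A sketch of the copy pattern (base image, stage name, and paths are assumptions based on the standard Next.js standalone layout):

```dockerfile
FROM node:20-alpine AS production
WORKDIR /app
# 3. Non-root user (UID 1001, group nodejs GID 1001)
RUN addgroup -g 1001 nodejs && adduser -u 1001 -G nodejs -D nextjs
# 2. Selective copy: traced server files plus static assets only
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
COPY --from=builder --chown=nextjs:nodejs /app/public ./public
USER nextjs
EXPOSE 3000
# 4. Direct node execution, no npm wrapper
CMD ["node", "server.js"]
```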

Sources: frontend/Dockerfile:85-114, frontend/.dockerignore:1-9


Docker Compose Orchestration

Service Configuration

The docker-compose.yml defines six services with health checks and dependency ordering, organized into three profiles:


Sources: docker-compose.yml:1-280

PostgreSQL Service

The database service uses the official pgvector/pgvector:pg16 image with optimized settings:

| Configuration | Value | Purpose |
|---|---|---|
| Image | pgvector/pgvector:pg16 | PostgreSQL 16 with vector extension |
| Container Name | automatos_postgres | DNS hostname within automatos network |
| Environment | POSTGRES_DB=${POSTGRES_DB:-orchestrator_db} | Database name with fallback |
| | POSTGRES_USER=${POSTGRES_USER:-postgres} | Database user with fallback |
| | POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?required} | Required in .env file |
| | POSTGRES_INITDB_ARGS=-c max_connections=200 -c shared_buffers=256MB | Connection limit and shared-buffer tuning |
| Volumes | postgres_data:/var/lib/postgresql/data | Persistent data storage |
| | init_complete_schema.sql:/docker-entrypoint-initdb.d/01-schema.sql:ro | Schema initialization (first start only) |
| Health Check | pg_isready -U postgres | Runs every 10s, 5 retries, 10s start period |
| Port | ${POSTGRES_PORT:-5432}:5432 | Configurable host port |

Connection Pooling Configuration

The max_connections=200 setting supports concurrent workflows and API requests. SQLAlchemy connection pools in the backend are sized accordingly:
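
The exact pool settings are not reproduced on this page; a sketch of the connection budget, assuming pool_size=10 and max_overflow=5 per worker (illustrative values):

```python
# Connection budget for one backend instance against max_connections=200.
# pool_size / max_overflow are illustrative assumptions, not the actual config.
MAX_CONNECTIONS = 200
UVICORN_WORKERS = 4
POOL_SIZE = 10       # persistent connections per worker
MAX_OVERFLOW = 5     # burst connections per worker

backend_ceiling = UVICORN_WORKERS * (POOL_SIZE + MAX_OVERFLOW)  # 60
remaining = MAX_CONNECTIONS - backend_ceiling                   # 140
```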

This leaves 140 connections for workspace-worker and admin tools.

Schema Initialization

The init_complete_schema.sql script runs automatically on first database creation via Docker's entrypoint-initdb.d mechanism (docker-compose.yml:35):

  • Executes only when data directory is empty (first start)

  • Creates all tables, indexes, and extensions

  • Loads seed data (personas, categories, system settings)

Sources: docker-compose.yml:22-43

Redis Service

The cache service uses Redis 7 with LRU eviction policy and security hardening:

| Configuration | Value | Purpose |
|---|---|---|
| Image | redis:7-alpine | Minimal Redis 7 image |
| Container Name | automatos_redis | DNS hostname within automatos network |
| Command | redis-server with flags | Custom configuration via CLI arguments |
| Security (PRD-70) | --requirepass ${REDIS_PASSWORD:?required} | Password authentication required |
| | --rename-command FLUSHDB "" | Disable FLUSHDB (prevents data wipe) |
| | --rename-command FLUSHALL "" | Disable FLUSHALL (prevents data wipe) |
| | --rename-command DEBUG "" | Disable DEBUG command |
| Memory | --maxmemory 256mb | Memory limit for cache |
| | --maxmemory-policy allkeys-lru | LRU eviction when full |
| Volumes | redis_data:/data | RDB snapshot persistence |
| Health Check | redis-cli --no-auth-warning -a ${REDIS_PASSWORD} ping | Runs every 10s, 5 retries |
| Port | ${REDIS_PORT:-6379}:6379 | Configurable host port |

Security Hardening (PRD-70 FIX-05)

Dangerous Redis commands are disabled by renaming them to empty strings (docker-compose.yml:54-61):
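
A compose fragment matching the flags listed above (approximate; the actual file is not reproduced here):

```yaml
command: >
  redis-server
  --requirepass ${REDIS_PASSWORD:?required}
  --rename-command FLUSHDB ""
  --rename-command FLUSHALL ""
  --rename-command DEBUG ""
  --maxmemory 256mb
  --maxmemory-policy allkeys-lru
```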

Note: Renaming CONFIG (commented in source) would break redis-cli CONFIG commands, which may be needed for debugging.

Redis Usage Patterns

Redis serves three purposes in the system:

  1. Caching: Plugin content cache, Composio metadata cache (orchestrator/core/services/plugin_cache.py)

  2. Pub/Sub: Real-time workflow execution events (orchestrator/core/redis/client.py:91-119)

  3. Task Queues: Workspace worker task coordination (services/workspace-worker/task_queue.py)

Sources: docker-compose.yml:48-73

Backend Service

The FastAPI service builds from the development target for hot-reload:

| Configuration | Value | Purpose |
|---|---|---|
| Build | context: ./orchestrator | Build from orchestrator directory |
| | target: development | Use development stage (hot-reload) |
| Container Name | automatos_backend | DNS hostname |
| Dependencies | postgres: {condition: service_healthy} | Wait for PostgreSQL health check |
| | redis: {condition: service_healthy} | Wait for Redis health check |
| Volumes | ./orchestrator:/app | Mount source for hot-reload |
| | ./docker-entrypoint.sh:/usr/local/bin/docker-entrypoint.sh:ro | Custom entrypoint script |
| | backend_logs:/app/logs | Persistent application logs |
| | workspace_data:/workspaces:ro | Read-only workspace access (PRD-66) |
| Environment | DATABASE_URL | PostgreSQL connection string |
| | REDIS_HOST=redis, REDIS_PORT=6379, REDIS_PASSWORD | Redis connection |
| | GOTENBERG_URL=http://gotenberg:3000 | Document generation service (PRD-63) |
| | API_KEY, CLERK_SECRET_KEY, CLERK_JWKS_URL | Authentication |
| | OPENAI_API_KEY, ANTHROPIC_API_KEY | LLM providers (optional) |
| | ENVIRONMENT=${ENVIRONMENT:-development} | Runtime mode |
| Health Check | curl -f http://localhost:8000/health | 30s interval, 40s start period |
| Port | ${API_PORT:-8000}:8000 | Configurable host port |

Service Discovery

The backend uses Docker Compose service names as hostnames for inter-service communication:

  • postgres resolves to PostgreSQL container IP

  • redis resolves to Redis container IP

  • gotenberg resolves to Gotenberg container IP (when running)

This eliminates the need for hardcoded IP addresses or environment-specific DNS configuration.

Volume Mounts

Three volume types are used:

  1. Bind Mount (./orchestrator:/app): Source code for hot-reload in development

  2. Named Volume (backend_logs): Persistent logs across container restarts

  3. Shared Named Volume (workspace_data:ro): Read-only access to workspace files for code viewer widget (PRD-66)

Sources: docker-compose.yml:78-138

Frontend Service

The Next.js service builds from the development target:

| Configuration | Value | Purpose |
|---|---|---|
| Build | context: ./frontend | Build from frontend directory |
| | target: development | Use development stage (hot-reload) |
| Container Name | automatos_frontend | DNS hostname |
| Dependencies | backend: {condition: service_healthy} | Wait for backend health check |
| Volumes | ./frontend:/app | Mount source for hot-reload |
| | /app/node_modules | Anonymous volume (prevent host overwrite) |
| | /app/.next | Anonymous volume (prevent host overwrite) |
| Environment | NEXT_PUBLIC_API_URL=${NEXT_PUBLIC_API_URL:-http://localhost:8000} | Backend API endpoint |
| | NEXT_PUBLIC_WS_URL=${NEXT_PUBLIC_WS_URL:-ws://localhost:8000/ws} | WebSocket endpoint |
| | NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY | Clerk authentication |
| | NODE_ENV=development | Development mode |
| Health Check | wget --no-verbose --tries=1 --spider http://localhost:3000 | 30s interval, 60s start period |
| Port | ${FRONTEND_PORT:-3000}:3000 | Configurable host port |

Anonymous Volume Pattern

The anonymous volumes (/app/node_modules, /app/.next) prevent the bind mount from overwriting container-generated files (docker-compose.yml:167-168):
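
The pattern looks like this in the service definition (fragment; approximate):

```yaml
volumes:
  - ./frontend:/app      # bind mount: host source for hot-reload
  - /app/node_modules    # anonymous: keeps image-installed dependencies
  - /app/.next           # anonymous: keeps container build output
```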

This allows hot-reload while preserving dependencies installed during image build.

API URL Configuration

The NEXT_PUBLIC_API_URL uses localhost:8000 instead of backend:8000 because:

  • The browser (client-side) resolves this URL, not the container

  • In local development, both services are exposed on localhost

  • In production (Railway), this would be set to the backend's public URL

Sources: docker-compose.yml:146-170

Workspace Worker Service

The workspace worker service executes agent tasks in isolated workspaces (PRD-56 Phase 2):

| Configuration | Value | Purpose |
|---|---|---|
| Build | context: ./services/workspace-worker | Separate worker service |
| Container Name | automatos_workspace_worker | DNS hostname |
| Profile | workers | Start with --profile workers |
| Dependencies | postgres: {condition: service_healthy} | Database access for task metadata |
| | redis: {condition: service_healthy} | Task queue coordination |
| Environment | DATABASE_URL | PostgreSQL connection |
| | REDIS_URL=redis://:${REDIS_PASSWORD}@redis:6379/0 | Redis task queue |
| | WORKSPACE_VOLUME_PATH=/workspaces | Base path for workspace directories |
| | WORKSPACE_DEFAULT_QUOTA_GB=${WORKSPACE_DEFAULT_QUOTA_GB:-5} | Per-workspace storage limit |
| | WORKER_CONCURRENCY=${WORKER_CONCURRENCY:-3} | Concurrent task limit |
| | WORKER_HEALTH_PORT=8081 | Health check endpoint port |
| Volumes | workspace_data:/workspaces | Read-write workspace access |
| Resource Limits | cpus: 2.0, memory: 2G | Prevent resource exhaustion |
| Health Check | curl -f http://localhost:8081/health | 30s interval, 15s start period |

Task Execution Model

The workspace worker:

  1. Polls Redis task queue for pending tasks (services/workspace-worker/task_queue.py)

  2. Creates isolated workspace directories under /workspaces/{workspace_id}/ (services/workspace-worker/workspace_manager.py:115-145)

  3. Executes commands with sandboxing (path safety, command whitelist) (services/workspace-worker/executor.py:36-470)

  4. Publishes progress events via Redis Pub/Sub (services/workspace-worker/task_queue.py)

  5. Stores results in PostgreSQL and S3 (services/workspace-worker/storage.py)

Sources: docker-compose.yml:178-217

Admin Tools Profile

Optional services are available with --profile all:

Adminer (Database Admin UI)

| Configuration | Value |
|---|---|
| Image | adminer:latest |
| Container Name | automatos_adminer |
| Port | ${ADMINER_PORT:-8080}:8080 |
| Environment | ADMINER_DEFAULT_SERVER=postgres, ADMINER_DESIGN=nette |
| Usage | Browse to http://localhost:8080, log in with PostgreSQL credentials |

Gotenberg (Document Conversion)

| Configuration | Value | Purpose |
|---|---|---|
| Image | gotenberg/gotenberg:8 | Chromium + LibreOffice for conversions |
| Container Name | automatos_gotenberg | |
| Port | ${GOTENBERG_PORT:-3001}:3000 | HTTP API |
| Environment | GOTENBERG_API_TIMEOUT=120s | Timeout for large documents |
| | GOTENBERG_LOG_LEVEL=info | Logging verbosity |
| Usage | Backend sends POST requests for DOCX/XLSX → PDF conversion (PRD-63) | |

Starting Admin Tools
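
Profile names are taken from this page; the exact invocations may differ:

```shell
# Core services plus admin tools (Adminer, Gotenberg)
docker-compose --profile all up -d

# Core services plus the workspace worker
docker-compose --profile workers up -d
```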

Sources: docker-compose.yml:223-253


Configuration Management

Centralized Configuration Class

All environment variables are accessed through a single Config class to prevent scattered os.getenv() calls:
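
A hypothetical sketch of the shape of such a class (attribute names beyond those discussed on this page are assumptions):

```python
import os

class Config:
    """Illustrative centralized configuration; not the actual class."""
    # Database: composite URL takes precedence when present
    DATABASE_URL = os.getenv("DATABASE_URL")
    # Redis: non-sensitive defaults only
    REDIS_HOST = os.getenv("REDIS_HOST", "127.0.0.1")
    REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
    # Runtime mode
    ENVIRONMENT = os.getenv("ENVIRONMENT", "development")
```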

spinner

Sources: orchestrator/config.py:1-285

Configuration Categories

The Config class organizes settings into logical groups:

Database Configuration

Precedence: If DATABASE_URL is set (Railway format), it takes precedence over individual parameters.
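
A minimal sketch of that precedence, assuming standard POSTGRES_* component names (the actual implementation is not shown on this page):

```python
import os

def database_url() -> str:
    """DATABASE_URL (Railway format) wins; otherwise assemble from components."""
    url = os.getenv("DATABASE_URL")
    if url:
        return url
    user = os.getenv("POSTGRES_USER", "postgres")
    password = os.getenv("POSTGRES_PASSWORD", "")
    host = os.getenv("POSTGRES_HOST", "postgres")
    port = os.getenv("POSTGRES_PORT", "5432")
    db = os.getenv("POSTGRES_DB", "orchestrator_db")
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"
```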

Sources: orchestrator/config.py:34-42

Redis Configuration

Redis configuration supports both URL format (Railway, Heroku) and component variables:

Precedence:

  1. REDIS_URL environment variable (Railway format: redis://:password@host:port/db)

  2. Component variables (REDIS_HOST, REDIS_PORT, REDIS_PASSWORD)

  3. Default host 127.0.0.1, port 6379, no password

The get_redis_client() function uses this configuration to initialize the connection pool (orchestrator/core/redis/client.py:149-198).

Sources: orchestrator/config.py:46-62

LLM Configuration (Database-Backed)

LLM settings are loaded from the system_settings database table, with environment variable fallbacks:

Precedence: Database → Environment → None (no hardcoded defaults)

Sources: orchestrator/config.py:88-106

AWS S3 Configuration

Sources: orchestrator/config.py:158-185

Feature Flags

Sources: orchestrator/config.py:154-174

Environment Variable Precedence

The configuration resolution follows this priority:

  1. Composite URLs (e.g., DATABASE_URL, REDIS_URL) - highest priority

    • Format: postgresql://user:pass@host:port/db

    • Common in Railway, Heroku deployments

  2. Component Variables (e.g., POSTGRES_HOST, REDIS_HOST)

    • Individual connection parameters

    • Used in docker-compose.yml

  3. Database Settings (LLM configuration only)

    • system_settings table via get_system_setting()

    • Allows runtime configuration changes

  4. Hardcoded Defaults (only for non-sensitive values)

    • Example: REDIS_HOST="127.0.0.1", AWS_REGION="us-east-1"

    • Secrets (passwords, API keys) have no defaults

Variable Categories

| Category | Requires .env | Supports Database | Has Defaults |
|---|---|---|---|
| Database credentials | Yes | No | No |
| Redis credentials | Yes (if used) | No | Host/port only |
| LLM configuration | No (optional) | Yes | No |
| AWS credentials | Yes (if S3 used) | No | Region only |
| Feature flags | No | No | Yes |
| Authentication (Clerk) | Yes | No | No |

Sources: orchestrator/config.py:28-285

Configuration Validation

The validate() method checks required settings on startup:
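
The method body is not shown on this page; a minimal sketch of fail-fast validation (the required-setting list is illustrative):

```python
import os

REQUIRED = ("DATABASE_URL", "API_KEY")  # illustrative required settings

def validate() -> list[str]:
    """Return the names of required settings that are missing."""
    return [name for name in REQUIRED if not os.getenv(name)]

# At startup:
#   missing = validate()
#   if missing:
#       raise RuntimeError(f"Configuration validation failed: {missing}")
```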

Usage: Can be called at application startup to fail fast on misconfiguration.

Sources: orchestrator/config.py:225-247


Database Infrastructure

PostgreSQL with pgvector

The database service uses PostgreSQL 16 with the pgvector extension for embedding-based retrieval.

Schema Initialization

The schema is automatically loaded on first database creation via Docker's entrypoint-initdb.d mechanism:
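
The mount looks like this in the postgres service (fragment; host path assumed relative to the compose file):

```yaml
volumes:
  - ./init_complete_schema.sql:/docker-entrypoint-initdb.d/01-schema.sql:ro
```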

Execution: Scripts in /docker-entrypoint-initdb.d/ run only when the data directory is empty (first start).

Sources: docker-compose.yml:34

Connection Pooling

The backend uses SQLAlchemy with connection pooling to handle concurrent requests efficiently. The database is configured for 200 max connections:

Rationale: Supports multiple workflow executions and API requests simultaneously.

Sources: docker-compose.yml:29

Health Check Strategy

The health check uses pg_isready to verify database availability:
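
In compose form, using the intervals stated below (approximate fragment):

```yaml
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 10s
  retries: 5
  start_period: 10s
```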

Benefits:

  • Fast Startup: 10-second start period prevents premature failures

  • Reliable Detection: pg_isready is more reliable than TCP checks

  • Retry Logic: 5 retries with 10s interval handles transient failures

Sources: docker-compose.yml:36-40


Redis Infrastructure

Cache Architecture

Redis serves three purposes: caching, real-time messaging, and task queues.


Sources: orchestrator/core/redis/client.py:149-198, orchestrator/core/services/plugin_cache.py:1-200

Connection Management

The RedisClient class uses a connection pool for efficiency:

Connection Pool Configuration

| Parameter | Value | Rationale |
|---|---|---|
| max_connections | 50 | Supports concurrent workflows (10-15), API requests (20-30), worker tasks (5-10) |
| decode_responses | True | Automatic UTF-8 decoding (JSON strings) |
| db | 0 | Default database (single-tenant per Redis instance) |
With 4 uvicorn workers, each can use ~12 connections without exhausting the pool.

Context Manager for Pub/Sub

Ensures proper cleanup of pub/sub subscriptions (orchestrator/core/redis/client.py:38-46).
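
A hypothetical sketch of the pattern (not the actual RedisClient API): subscribe on entry, always unsubscribe and close on exit, even when the body raises.

```python
from contextlib import contextmanager

@contextmanager
def subscription(client, channel):
    """Yield a pubsub handle for `channel`, guaranteeing cleanup."""
    pubsub = client.pubsub()
    pubsub.subscribe(channel)
    try:
        yield pubsub
    finally:
        pubsub.unsubscribe(channel)
        pubsub.close()
```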

Sources: orchestrator/core/redis/client.py:14-46

Lazy Initialization Pattern

Redis is optional—if not configured, the system gracefully degrades:

Graceful Degradation Behavior

When Redis is unavailable (get_redis_client() returns None):

  • Caching: Skipped, falls back to direct S3/database access

  • Pub/Sub: Events not published, SSE clients poll instead

  • Task Queues: Workspace worker tasks unavailable (HTTP API still works)

  • Core Functions: Agent creation, chat, recipe execution all continue working

Callers Check for None

All Redis-dependent code checks for None:
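
A sketch of the caller-side pattern, with a stand-in `get_redis_client` that simulates Redis being unconfigured (names and key format are assumptions):

```python
def get_redis_client():
    """Stand-in: returns None when Redis is not configured or unreachable."""
    return None

def get_plugin_content(slug: str, fetch_from_s3) -> bytes:
    client = get_redis_client()
    if client is not None:                        # cache path
        cached = client.get(f"plugin:content:{slug}")
        if cached:
            return cached
    content = fetch_from_s3(slug)                 # graceful fallback to S3
    if client is not None:
        client.set(f"plugin:content:{slug}", content)
    return content
```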

Sources: orchestrator/core/redis/client.py:149-198

Plugin Content Caching

The PluginContentCache wraps S3 access with Redis caching:

Cache Configuration

| Setting | Default | Environment Variable | Purpose |
|---|---|---|---|
| TTL | 3600 seconds (1 hour) | PLUGIN_CACHE_TTL_SECONDS | Balance between freshness and S3 costs |
| Key Pattern | plugin:content:{slug}:{version} | N/A | Unique per plugin version |
| Size Limit | 10 MB per plugin | PLUGIN_MAX_UPLOAD_SIZE_MB | Prevents cache exhaustion |

Performance Impact

Without cache:

  • Marketplace page load: 15-20 S3 API calls per plugin

  • Cost: $0.0004 per 1000 requests (S3 GET)

  • Latency: 50-100ms per plugin

With cache (1 hour TTL):

  • Marketplace page load: 0 S3 calls (after first load)

  • Cost: Redis memory only

  • Latency: <5ms per plugin

Sources: orchestrator/core/services/plugin_cache.py:119-159, orchestrator/config.py:180-185

Pub/Sub for Real-Time Updates

Workflow execution events are published to Redis channels for SSE streaming:
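
The channel name can be built from the two IDs described below; a one-line sketch:

```python
def execution_channel(workflow_id: str, execution_id: str) -> str:
    """Channel for one execution: workflow:{workflow_id}:execution:{execution_id}."""
    return f"workflow:{workflow_id}:execution:{execution_id}"
```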

Channel Naming: workflow:{workflow_id}:execution:{execution_id} allows fine-grained subscriptions.

Sources: orchestrator/core/redis/client.py:91-119


Production Deployment

Railway Deployment

Automatos AI is optimized for Railway deployment with automatic detection of Railway-specific environment variables.

PORT Variable Handling

Railway provides a dynamic PORT variable that changes on each deployment. The backend Dockerfile handles this:

Fallback: If PORT is not set (local deployment), defaults to 8000.

Sources: orchestrator/Dockerfile:114-115

Database URL Format

Railway provides PostgreSQL as DATABASE_URL in the connection string format:

The Config class prioritizes DATABASE_URL over component variables:

Sources: orchestrator/config.py:42

Redis URL Format

Similarly, Railway provides Redis as REDIS_URL:

The REDIS_URL property handles URL parsing:
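
A stdlib sketch of that parsing (the actual property is not shown; field names are assumptions):

```python
from urllib.parse import urlparse

def redis_components(url: str) -> dict:
    """Split a Railway-style redis:// URL into connection parameters."""
    parsed = urlparse(url)
    return {
        "host": parsed.hostname,
        "port": parsed.port or 6379,
        "password": parsed.password,
        "db": int(parsed.path.lstrip("/") or 0),
    }

# redis_components("redis://:s3cret@redis:6379/0")
# returns {'host': 'redis', 'port': 6379, 'password': 's3cret', 'db': 0}
```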

Sources: orchestrator/config.py:51-62

Environment-Specific Configuration

The ENVIRONMENT variable controls behavior across dev/staging/production:

Usage Examples:

  • Logging: More verbose in development

  • CORS: Stricter in production

  • Error Messages: Detailed stack traces only in development

Sources: orchestrator/config.py:114-123

CORS Configuration

The backend allows multiple frontend origins via comma-separated list:
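
A sketch of the parsing (the variable name `CORS_ORIGINS` and its default are assumptions):

```python
import os

def allowed_origins() -> list[str]:
    """Split the comma-separated origin list, trimming whitespace."""
    raw = os.getenv("CORS_ORIGINS", "http://localhost:3000")
    return [origin.strip() for origin in raw.split(",") if origin.strip()]
```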

Railway Default: Includes both localhost (dev) and Railway frontend domain.

Sources: orchestrator/config.py:72-79

Scaling Considerations

Stateless Architecture

Both frontend and backend are stateless, enabling horizontal scaling:

  • Frontend: Multiple Next.js instances can run behind a load balancer

  • Backend: Multiple uvicorn workers handle concurrent requests

  • Database: PostgreSQL with connection pooling handles load

  • Redis: Single instance sufficient for caching/Pub/Sub at current scale

Multi-Worker Backend

The production Dockerfile uses 4 uvicorn workers:

Worker Count: Can be increased via environment variable override if needed.

Sources: orchestrator/Dockerfile:115

Database Connection Limits

Each uvicorn worker maintains its own connection pool. The database is configured for 200 max connections:

Calculation: With 4 workers and ~10 connections per worker, this supports 5 backend instances.

Sources: docker-compose.yml:29

Security Hardening

Non-Root Users

Both production images run as non-root users:

Backend:

Frontend:

Sources: orchestrator/Dockerfile:98-101, frontend/Dockerfile:103-107

Secret Management

Secrets are never hardcoded or committed:

  • Development: .env file (git-ignored via .gitignore)

  • Production: Environment variables set in Railway dashboard

  • Credentials: Stored encrypted in database via CredentialStore

Sources: .gitignore:99-105, orchestrator/core/credentials/service.py:42-465


Local Development Setup

Quick Start Commands
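
An illustrative sequence (file locations assumed; see the repository README for the authoritative steps):

```shell
cp .env.example .env                     # then fill in POSTGRES_PASSWORD, REDIS_PASSWORD, API keys
docker-compose up -d                     # core: postgres, redis, backend, frontend
docker-compose --profile workers up -d   # optionally add the workspace worker
docker-compose logs -f backend           # follow backend logs
```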

Access Points

| Service | URL | Credentials |
|---|---|---|
| Frontend | http://localhost:3000 | Clerk sign-in |
| Backend API Docs | http://localhost:8000/docs | API key in header |
| Adminer (DB UI) | http://localhost:8080 | postgres / POSTGRES_PASSWORD |
| PostgreSQL | localhost:5432 | Connection via client |
| Redis | localhost:6379 | redis-cli -a REDIS_PASSWORD |

Service Startup Order

Due to health check dependencies, services start in this order:

  1. PostgreSQL (10s start period)

  2. Redis (5s start period)

  3. Backend (40s start period) - waits for DB + Redis healthy

  4. Frontend (60s start period) - waits for backend healthy

  5. Workspace Worker (15s start period) - waits for DB + Redis healthy (if --profile workers)

Sources: docker-compose.yml:1-280, README.md:1-150

Manual Service Startup (Non-Docker)

For active development without Docker:
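
An illustrative two-terminal setup (module name and scripts are assumptions; dependencies must already be installed):

```shell
# Terminal 1 - backend with hot-reload
cd orchestrator && uvicorn main:app --reload --port 8000

# Terminal 2 - frontend dev server
cd frontend && npm run dev
```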

Sources: docs/LOCAL_SETUP_GUIDE.md:177-199

Port Assignments

| Service | Port | Protocol | Configurable Via |
|---|---|---|---|
| Frontend | 3000 | HTTP | FRONTEND_PORT env var |
| Backend | 8000 | HTTP | API_PORT env var |
| PostgreSQL | 5432 | PostgreSQL | POSTGRES_PORT env var |
| Redis | 6379 | Redis | REDIS_PORT env var |
| Workspace Worker | 8081 | HTTP (health) | Fixed |
| Adminer | 8080 | HTTP | ADMINER_PORT env var |
| Gotenberg | 3001 | HTTP | GOTENBERG_PORT env var |

All ports can be customized via .env file or environment variables.

Sources: docker-compose.yml:32-251


Troubleshooting

Common Deployment Issues

Port Conflicts

Symptom: Error: bind: address already in use

Solution:
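
For example (using port 8000; adjust for the conflicting service):

```shell
lsof -i :8000                  # find the process holding the port
echo "API_PORT=8001" >> .env   # or remap the host port instead of killing it
docker-compose up -d
```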

Sources: docs/LOCAL_SETUP_GUIDE.md:173-174

Database Connection Failures

Symptom: sqlalchemy.exc.OperationalError: could not connect to server

Solution:

  1. Check PostgreSQL health: docker-compose ps postgres

  2. Verify health check passes: docker-compose logs postgres | grep healthy

  3. Check connection string in .env: DATABASE_URL or POSTGRES_* variables

Sources: docker-compose.yml:76-87

Redis Connection Failures

Symptom: redis.exceptions.ConnectionError: Error connecting to Redis

Solution:

  1. Check Redis health: docker-compose ps redis

  2. Verify password: docker-compose exec redis redis-cli -a automatos_redis_dev ping

  3. System gracefully degrades if Redis unavailable (caching disabled)

Sources: orchestrator/core/redis/client.py:149-198

Frontend Build Failures

Symptom: Error: Cannot find module ... during build

Solution:
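
A typical remedy is to rebuild without cache and recreate the container so the anonymous node_modules volume is repopulated:

```shell
docker-compose build --no-cache frontend
docker-compose up -d --force-recreate frontend
```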

Sources: docs/LOCAL_SETUP_GUIDE.md:150-153

Missing Environment Variables

Symptom: Configuration validation failed: API_KEY required

Solution:

  1. Copy .env.example to .env: cp orchestrator/.env.example orchestrator/.env

  2. Fill in required values (at minimum: LLM API keys)

  3. Restart services: docker-compose restart backend

Sources: orchestrator/config.py:225-247, orchestrator/.env.example:1-64

Health Check Debugging

To debug service health:
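
Illustrative commands:

```shell
docker-compose ps                                                  # service states and health
docker inspect --format '{{json .State.Health}}' automatos_backend # last health-check results
docker-compose logs --tail=50 backend                              # recent backend output
```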

Sources: docker-compose.yml:36-121


Summary

The Automatos AI deployment infrastructure provides:

  1. Multi-stage Docker builds for optimized development and production images

  2. Docker Compose orchestration with health checks and dependency management

  3. Centralized configuration via config.py with database-backed settings

  4. PostgreSQL with pgvector for relational data and embeddings

  5. Redis caching and Pub/Sub for performance and real-time updates

  6. Railway-optimized deployment with dynamic port and URL handling

  7. Security hardening via non-root users and encrypted credentials

The architecture supports both local development (hot-reload, debug tools) and production deployment (multi-worker, optimized images) with minimal configuration changes.

Sources: orchestrator/Dockerfile:1-116, frontend/Dockerfile:1-120, docker-compose.yml:1-197, orchestrator/config.py:1-285

