Tier 1: Cache Lookup


Purpose and Scope

Tier 1: Cache Lookup is the second tier in the Universal Router's decision-making pipeline, immediately after Tier 0: User Overrides. When a request has no explicit override, the router first checks a Redis-backed routing cache to see if an identical request has been routed recently. This tier provides sub-millisecond routing decisions for repeated requests, dramatically reducing LLM API costs and latency compared to Tier 3: LLM Classification.

The cache stores complete RoutingDecision objects keyed by workspace, content, and source. Cache hits return immediately with high confidence; cache misses fall through to Tier 2: Rule-Based Routing or Tier 3: LLM Classification, which then populate the cache for future requests.


Cache Lookup Flow


Sources: orchestrator/core/routing/engine.py:171-176

The cache lookup implementation is minimal and fast.

If the RoutingCache instance is not available (Redis unavailable or not configured), Tier 1 immediately returns None and routing proceeds to Tier 2. Otherwise, it queries the cache with three parameters:

| Parameter | Description | Example |
|---|---|---|
| `workspace_id` | UUID of the workspace making the request | `550e8400-e29b-41d4-a716-446655440000` |
| `content` | Normalized request content/message text | `"Create a new GitHub PR for bug fix"` |
| `source` | Channel source enum value | `ChannelSource.WEB_CHAT` |
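
The lookup described above can be sketched as a thin async wrapper around the cache, with a stand-in cache so it runs without Redis. The function and class names here (`tier1_cache_lookup`, `FakeCache`) are illustrative assumptions, not the exact implementation in `engine.py`:

```python
import asyncio
from typing import Any, Optional

async def tier1_cache_lookup(
    cache: Optional[Any],
    workspace_id: str,
    content: str,
    source: str,
) -> Optional[dict]:
    """Tier 1: return a cached routing decision, or None to fall through."""
    # Graceful degradation: if Redis is unavailable or not configured,
    # skip Tier 1 entirely and let Tier 2 / Tier 3 handle the request.
    if cache is None:
        return None
    # A hit returns the stored decision immediately; a miss returns None.
    return await cache.get(workspace_id=workspace_id, content=content, source=source)

class FakeCache:
    """Stand-in for RoutingCache, keyed by the same three parameters."""
    def __init__(self):
        self._store = {}
    async def get(self, workspace_id, content, source):
        return self._store.get((workspace_id, content, source))
    async def set(self, workspace_id, content, source, decision):
        self._store[(workspace_id, content, source)] = decision

cache = FakeCache()
miss = asyncio.run(tier1_cache_lookup(cache, "ws-1", "create a pr", "web_chat"))
asyncio.run(cache.set("ws-1", "create a pr", "web_chat", {"target": "github-agent"}))
hit = asyncio.run(tier1_cache_lookup(cache, "ws-1", "create a pr", "web_chat"))
```

The first call is a miss (falls through to Tier 2/3); once the decision is stored, the identical request hits without touching the LLM.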

Sources: orchestrator/core/routing/engine.py:171-176, orchestrator/core/models/routing.py


Cache Key Generation


Sources: orchestrator/core/routing/engine.py:54, orchestrator/core/routing/cache.py

The cache key is constructed by:

  1. Normalizing content - The _normalize_content() function strips whitespace, lowercases the text, and removes punctuation so that near-identical requests match

  2. Hashing - A SHA-256 hash is computed from normalized_content + "|" + source.value to create a deterministic, compact key

  3. Workspace scoping - The workspace ID is prepended to ensure complete tenant isolation
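
The three steps above can be sketched as follows. The exact normalization rules are an assumption; the key layout follows the `routing:{workspace}:{content_hash}` pattern documented later on this page:

```python
import hashlib
import re

def _normalize_content(content: str) -> str:
    # Strip surrounding whitespace, lowercase, drop punctuation, and
    # collapse whitespace runs so near-identical requests match.
    text = content.strip().lower()
    text = re.sub(r"[^\w\s]", "", text)
    return re.sub(r"\s+", " ", text)

def make_cache_key(workspace_id: str, content: str, source_value: str) -> str:
    normalized = _normalize_content(content)
    # Deterministic, compact key from "normalized_content|source".
    digest = hashlib.sha256(f"{normalized}|{source_value}".encode()).hexdigest()
    # Workspace-scoped prefix: tenants never share entries.
    return f"routing:{workspace_id}:{digest}"

# Minor formatting differences map to the same key...
k1 = make_cache_key("ws-1", "Create a new GitHub PR for bug fix  ", "web_chat")
k2 = make_cache_key("ws-1", "create a new github pr for bug fix", "web_chat")
# ...but a different workspace or channel gets its own entry.
k3 = make_cache_key("ws-2", "create a new github pr for bug fix", "web_chat")
k4 = make_cache_key("ws-1", "create a new github pr for bug fix", "slack")
```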

This ensures that:

  • Identical requests from the same workspace always hit the same cache entry

  • Minor formatting differences (e.g., trailing spaces) don't create cache misses

  • Different workspaces never share routing decisions, even for identical text

  • Different channels (web chat vs Slack vs email) maintain separate cache entries

Sources: orchestrator/core/routing/engine.py:51-54, orchestrator/core/routing/cache.py


RoutingCache Implementation


Sources: orchestrator/core/routing/cache.py, orchestrator/config.py:143

The RoutingCache class provides a Redis-backed storage layer for routing decisions. Key characteristics:

Redis Connection

  • Uses the centralized Redis client from core.redis.client.get_redis_client()

  • Lazy initialization - connection established on first cache access

  • Graceful degradation - if Redis is unavailable, cache operations return None and routing continues

Data Structure

The cache value is the complete RoutingDecision object, serialized as JSON.
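
The authoritative fields live in orchestrator/core/models/routing.py. A plausible shape, inferred from this page (target selection, confidence, and the tier that produced the decision; the field names here are assumptions):

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class RoutingDecision:
    # Illustrative fields only; see core/models/routing.py for the real model.
    target_type: str      # e.g. "agent" or "workflow"
    target_id: str        # identifier of the selected agent/workflow
    confidence: float     # classifier confidence, 0.0-1.0
    tier: int             # which tier produced the decision
    reasoning: str = ""   # optional explanation from the LLM

decision = RoutingDecision("agent", "github-agent", 0.92, 3, "GitHub PR intent")
payload = json.dumps(asdict(decision))            # JSON string stored in Redis
restored = RoutingDecision(**json.loads(payload)) # round-trips on a cache hit
```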

TTL Configuration

Cache entries expire after ROUTING_CACHE_TTL_HOURS (default: 24 hours) to ensure:

  • Routing logic changes eventually propagate to all requests

  • Stale decisions don't persist indefinitely if agents are deleted or modified

  • Redis memory usage remains bounded

Sources: orchestrator/core/routing/cache.py, orchestrator/config.py:143, orchestrator/core/models/routing.py


Cache Population


Sources: orchestrator/core/routing/engine.py:404-409, orchestrator/core/routing/engine.py:421-427

Cache population occurs in two scenarios:

1. After Tier 3 LLM Classification (High Confidence)

When the LLM classifies a request with confidence ≥ ROUTING_LLM_CONFIDENCE_THRESHOLD (default: 0.5), the router immediately caches the decision:

Location: orchestrator/core/routing/engine.py:413-428

2. After Tier 3 LLM Classification (Low Confidence)

Even when confidence is below threshold (triggering orchestrated workflow execution), the router still caches the decision to avoid re-invoking the LLM:

Location: orchestrator/core/routing/engine.py:396-410

This ensures that even uncertain routing decisions are cached, preventing repeated LLM invocations for the same ambiguous request.
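
Both branches reduce to the same write. A minimal sketch, assuming the cache serializes the decision and applies the configured TTL via Redis SETEX (`cache_decision` and `FakeRedis` are illustrative names, not the orchestrator's API):

```python
import asyncio
import json
from datetime import timedelta

ROUTING_CACHE_TTL_HOURS = 24  # config default

async def cache_decision(redis, key: str, decision: dict) -> None:
    """Store the decision regardless of confidence, so the same request
    (confident or ambiguous) never re-invokes the LLM within the TTL."""
    await redis.setex(
        key,
        timedelta(hours=ROUTING_CACHE_TTL_HOURS),
        json.dumps(decision),
    )

class FakeRedis:
    """In-memory stand-in recording (ttl, value) pairs for inspection."""
    def __init__(self):
        self.store = {}
    async def setex(self, key, ttl, value):
        self.store[key] = (ttl, value)

r = FakeRedis()
asyncio.run(cache_decision(
    r, "routing:ws-1:abc", {"target_id": "github-agent", "confidence": 0.3}
))
```

Note the low-confidence decision (0.3) is cached just like a high-confidence one.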

Sources: orchestrator/core/routing/engine.py:390-428


Cache Configuration

Environment Variables

The routing cache is controlled by the following configuration settings:

| Variable | Default | Description |
|---|---|---|
| `ROUTING_CACHE_TTL_HOURS` | `24` | Cache entry lifetime in hours |
| `REDIS_HOST` | (required) | Redis server hostname or IP |
| `REDIS_PORT` | `6379` | Redis server port |
| `REDIS_PASSWORD` | (optional) | Redis authentication password |
| `REDIS_URL` | (optional) | Complete Redis URL (overrides individual params) |

Sources: orchestrator/config.py:47-62, orchestrator/config.py:143

Cache TTL Strategy

The 24-hour default TTL balances competing concerns: long enough that repeated requests stay cached and LLM costs stay low, short enough that routing changes propagate within a day and Redis memory stays bounded.


Sources: orchestrator/config.py:143

Redis Configuration Fallback

The configuration system supports multiple Redis configuration patterns:

  1. Complete URL - REDIS_URL=redis://:password@host:port/0 (highest priority)

  2. Component-based - Individual REDIS_HOST, REDIS_PORT, REDIS_PASSWORD variables

  3. No Redis - Cache gracefully degrades, all requests skip Tier 1
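
The priority order above can be sketched as a small resolution function. This is not the actual get_redis_client() implementation, just the documented fallback logic under stated assumptions:

```python
from typing import Optional

def resolve_redis_url(env: dict) -> Optional[str]:
    """Resolve a Redis connection URL with the documented priority:
    REDIS_URL > component variables > None (Tier 1 disabled)."""
    # 1. A complete URL wins outright.
    if env.get("REDIS_URL"):
        return env["REDIS_URL"]
    # 2. Component-based configuration.
    host = env.get("REDIS_HOST")
    if host:
        port = env.get("REDIS_PORT", "6379")
        password = env.get("REDIS_PASSWORD")
        auth = f":{password}@" if password else ""
        return f"redis://{auth}{host}:{port}/0"
    # 3. No Redis configured: cache degrades gracefully, Tier 1 is skipped.
    return None
```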

Sources: orchestrator/config.py:51-62


Cache Lifecycle and Invalidation


Sources: orchestrator/core/routing/cache.py, orchestrator/config.py:143

Automatic Expiration

Cache entries are automatically removed by Redis after ROUTING_CACHE_TTL_HOURS. When this happens:

  1. The next identical request becomes a cache miss at Tier 1

  2. Routing falls through to Tier 2 (rules/subscriptions/intent matching)

  3. If Tier 2 still produces no match, Tier 3 LLM re-classifies the request

  4. The new LLM decision (which may differ from the expired one) is cached again

This ensures that routing logic naturally adapts to:

  • Agent description changes

  • New agent additions

  • Routing rule modifications

  • Model provider updates

No Explicit Invalidation

The current implementation does not provide manual cache invalidation APIs. When agents or routing rules are modified, cached decisions persist until their TTL expires. This is a deliberate design choice to:

  • Keep the cache implementation simple and stateless

  • Avoid complex invalidation logic tracking which cache entries are affected by which agent changes

  • Rely on the 24-hour TTL to provide "eventual freshness" within a reasonable timeframe

For immediate routing changes, administrators can manually flush Redis keys or restart the Redis instance (which clears all cached routing decisions).
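
Because keys follow the `routing:{workspace}:{content_hash}` pattern, a one-off flush can target a single workspace without touching other Redis data. A hedged sketch (not part of the orchestrator codebase), written against the redis-py client interface but shown here with an in-memory stand-in so it runs without a server:

```python
import fnmatch

def flush_routing_cache(client, workspace_id: str) -> int:
    """Delete cached routing decisions for one workspace. Returns count.
    `client` is duck-typed to redis-py's scan_iter/delete interface."""
    deleted = 0
    # SCAN (not KEYS) so large keyspaces don't block Redis.
    for key in client.scan_iter(match=f"routing:{workspace_id}:*", count=500):
        deleted += client.delete(key)
    return deleted

class _FakeRedis:
    """Minimal stand-in for redis.Redis used only for this demonstration."""
    def __init__(self, keys):
        self.keys = set(keys)
    def scan_iter(self, match, count=10):
        return [k for k in list(self.keys) if fnmatch.fnmatch(k, match)]
    def delete(self, key):
        self.keys.discard(key)
        return 1

client = _FakeRedis({
    "routing:ws-1:aaa", "routing:ws-1:bbb",
    "routing:ws-2:ccc", "plugin_content:x:1",
})
removed = flush_routing_cache(client, "ws-1")
```

Only ws-1's routing entries are removed; other workspaces and the plugin cache are untouched.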

Sources: orchestrator/core/routing/cache.py, orchestrator/core/routing/engine.py:390-428


Performance Impact

Cache Hit Metrics


Sources: orchestrator/core/routing/engine.py:171-176, orchestrator/core/routing/engine.py:332-433

For a workspace with 1000 routing requests per day and a 70% cache hit rate:

| Metric | Without Cache | With Cache (70% hit) | Improvement |
|---|---|---|---|
| LLM API Calls | 1000/day | 300/day | 70% reduction |
| LLM Cost (at $0.005/call) | $5.00/day | $1.50/day | $3.50/day saved |
| Avg Response Time | 800ms | 310ms | 61% faster |
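
The table's figures follow directly from the hit rate. The 310 ms average implies a ~100 ms end-to-end latency on the cached path, which is an assumption of this worked example rather than a measured number:

```python
requests_per_day = 1000
hit_rate = 0.70
cost_per_llm_call = 0.005        # USD
llm_path_latency_ms = 800
cached_path_latency_ms = 100     # implied by the table's 310 ms average

# Only cache misses reach the LLM.
llm_calls = round(requests_per_day * (1 - hit_rate))
daily_cost = llm_calls * cost_per_llm_call

# Weighted average over hits and misses.
avg_latency = (hit_rate * cached_path_latency_ms
               + (1 - hit_rate) * llm_path_latency_ms)
speedup_pct = round(100 * (llm_path_latency_ms - avg_latency)
                    / llm_path_latency_ms)
```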

The cache hit rate typically improves over time as:

  • Common user requests establish stable routing patterns

  • Repeated questions from the same channels populate the cache

  • High-frequency workflows (e.g., "create JIRA ticket") become instant-route

Sources: orchestrator/core/routing/engine.py:171-176, orchestrator/core/routing/engine.py:332-433


Relationship to Plugin Cache

The routing cache shares architectural patterns with the plugin content cache, but serves a fundamentally different purpose:

| Aspect | RoutingCache | PluginContentCache |
|---|---|---|
| Purpose | Cache routing decisions (agent/workflow selection) | Cache marketplace plugin files from S3 |
| Key Structure | `routing:{workspace}:{content_hash}` | `plugin_content:{slug}:{version}` |
| Value Type | JSON RoutingDecision object | JSON Dict[filepath, content] |
| TTL | 24 hours (routing freshness) | 1 hour (S3 read reduction) |
| Population | After Tier 3 LLM classification | On-demand when plugins loaded |
| Backend | Redis only | Redis (cache) + S3 (source of truth) |

Both caches use similar Redis interaction patterns (lazy initialization, graceful degradation, TTL-based expiration) but operate at different layers of the system.

Sources: orchestrator/core/routing/cache.py, orchestrator/core/services/plugin_cache.py:1-263

