Command Execution
Purpose and Scope
This document covers sandboxed shell command execution within workspace environments. The command execution system allows AI agents to run development commands (tests, linters, builds, git operations) in isolated workspace directories with strict security controls.
For broader workspace architecture and task lifecycle, see Workspace Worker Architecture. For file operations (read/write/grep), see File Operations. For detailed security policies, see Security & Sandboxing.
Architecture Overview
Command execution follows a multi-tier architecture where requests flow from agents through the orchestrator to the workspace worker, which performs the actual execution in a sandboxed environment.
End-to-End Request Flow
Sources: services/workspace-worker/executor.py:122-225, orchestrator/core/workspace_client.py:133-151, services/workspace-worker/main.py:643-668, orchestrator/modules/tools/execution/unified_executor.py
Command Validation Pipeline
All commands pass through a three-stage validation pipeline before execution. This is the primary security boundary preventing malicious or destructive operations.
Validation Stages
Sources: services/workspace-worker/executor.py:448-500
Command Whitelist
The ALLOWED_COMMANDS set defines the only binaries that can execute. This whitelist covers development toolchains but excludes system administration commands.
| Category | Commands |
| --- | --- |
| Shell | sh, bash, cd, pwd, export, source, test, true, false |
| Version Control | git |
| Python | python, python3, pip, pip3, uv, pytest, ruff, black, mypy, isort, flake8, coverage, tox |
| Node.js | node, npm, npx, pnpm, yarn, vitest, jest, tsc, eslint, prettier |
| File Tools | ls, cat, grep, find, tree, wc, sort, uniq, cut, tr, head, tail, diff, patch, jq, sed, awk |
| Network | curl, wget |
| Build Tools | make, cmake |
| Archive | tar, gzip, gunzip, zip, unzip, bzip2 |
| File Ops | touch, mkdir, cp, mv, rm, ln, chmod |
| System Info | echo, printf, env, which, whoami, id, date, basename, dirname, realpath, stat, file, du, df |
| Process | ps, kill, sleep, timeout |
| Polyglot | cargo, go, ruby, java, javac, mvn, gradle, rustc, gcc, g++ |
| Docker | docker-compose (read-only inspection) |
Sources: services/workspace-worker/executor.py:35-73
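The whitelist check can be sketched as follows. This is a minimal illustration, not the code in executor.py: the function name `find_disallowed_binaries` is hypothetical, `ALLOWED_COMMANDS` is abbreviated to a handful of entries, and treating the first token after `|`, `||`, `&&` or `;` as a binary is an assumption about how segments are identified.

```python
import shlex

# Abbreviated, illustrative subset of the whitelist in the table above.
ALLOWED_COMMANDS = {"git", "python", "pytest", "ruff", "ls", "cat", "grep", "echo", "tail"}

# Operators after which a new binary invocation starts (assumed segment boundaries).
_SEPARATORS = {"|", "||", "&&", ";"}


def find_disallowed_binaries(command: str) -> list:
    """Return the binaries in `command` that are not in ALLOWED_COMMANDS."""
    # punctuation_chars=True makes shlex emit operators such as && and | as
    # separate tokens while still respecting quotes.
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True

    rejected = []
    expect_binary = True  # the first token of the command is a binary
    for token in lexer:
        if token in _SEPARATORS:
            expect_binary = True
        elif expect_binary:
            binary = token.rsplit("/", 1)[-1]  # treat /usr/bin/git as git
            if binary not in ALLOWED_COMMANDS:
                rejected.append(binary)
            expect_binary = False
    return rejected
```

An empty return value means every segment starts with a whitelisted binary; anything else would be grounds for rejecting the command before it reaches a subprocess.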
Blocked Patterns
Even if a binary is whitelisted, commands matching any blocked pattern are rejected. These patterns use regex to catch dangerous operations.
| Pattern | Blocks | Example |
| --- | --- | --- |
| rm\s+-rf\s+/\s*$ | Delete root filesystem | rm -rf / |
| rm\s+-rf\s+/[^w] | Delete non-workspace paths | rm -rf /etc |
| \bsudo\b | Privilege escalation | sudo apt install |
| \bsu\s | User switching | su root |
| \bchmod\s+777\b | Dangerous permissions | chmod 777 /tmp |
| \bkubectl\b | Kubernetes access | kubectl delete pod |
| >\s*/dev/ | Device file access | cat data > /dev/sda |
| \bmkfs\b | Filesystem formatting | mkfs.ext4 /dev/sdb |
| \bdd\s+if= | Raw disk operations | dd if=/dev/zero of=/dev/sda |
| \biptables\b | Firewall manipulation | iptables -F |
| \bsystemctl\b | Service management | systemctl stop nginx |
| \bpasswd\b | Password changes | passwd root |
| \buseradd\b | User creation | useradd attacker |
| \bmount\b / \bumount\b | Filesystem mounting | mount /dev/sdb /mnt |
| ` | Backtick execution | cat `malicious.sh` |
| \n | Embedded newlines | Multi-line injection |
Sources: services/workspace-worker/executor.py:76-98
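A pattern check of this kind might look like the sketch below. The regexes are copied from the table above, but the list is abbreviated, and the names `BLOCKED_PATTERNS` and `first_blocked_reason` are assumptions made for illustration.

```python
import re

# Abbreviated subset of the blocked patterns listed above, paired with a reason.
BLOCKED_PATTERNS = [
    (r"rm\s+-rf\s+/\s*$", "delete root filesystem"),
    (r"\bsudo\b", "privilege escalation"),
    (r">\s*/dev/", "device file access"),
    (r"\bdd\s+if=", "raw disk operations"),
    (r"`", "backtick execution"),
    (r"\n", "embedded newline"),
]
_COMPILED = [(re.compile(pattern), reason) for pattern, reason in BLOCKED_PATTERNS]


def first_blocked_reason(command: str):
    """Return the reason for the first matching blocked pattern, or None."""
    for pattern, reason in _COMPILED:
        if pattern.search(command):
            return reason
    return None
```

Note that a pattern such as `>\s*/dev/` is broad by construction: it also matches a redirect like `2>/dev/null`, so commands that discard output that way would be rejected under this sketch.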
Execution Modes
The executor automatically selects between shell mode and exec mode based on command syntax. This balances security (exec mode is safer) with functionality (shell mode supports pipes and redirects).
Mode Selection Logic
Sources: services/workspace-worker/executor.py:164-184
Shell Mode (Compound Commands)
When the command contains shell operators (|, &&, ||, ;, >, <), the executor uses asyncio.create_subprocess_shell, which hands the whole command string to a shell.
Security: Even in shell mode, the command has already passed the validation pipeline, so each binary segment was checked against the whitelist.
Exec Mode (Simple Commands)
When the command is a simple binary invocation with no shell operators, the executor splits it with shlex.split and passes the resulting argument list to asyncio.create_subprocess_exec.
Advantages: Exec mode avoids shell injection because arguments are passed as an array and no shell ever parses the command string.
Sources: services/workspace-worker/executor.py:167-184
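The two modes can be illustrated with a short sketch. The asyncio and shlex calls are the ones named above; the helper names `needs_shell` and `run_command`, the substring-based operator check, and the returned tuple are assumptions for illustration only. The sketch also assumes the command has already passed validation.

```python
import asyncio
import shlex

# Operators that force shell mode, as listed above.
SHELL_OPERATORS = ("|", "&&", "||", ";", ">", "<")


def needs_shell(command: str) -> bool:
    """Naive check: any operator character in the string selects shell mode."""
    return any(op in command for op in SHELL_OPERATORS)


async def run_command(command: str, cwd=None):
    """Run an already-validated command; return (mode, exit_code, stdout_bytes)."""
    if needs_shell(command):
        mode = "shell"
        # The whole string is parsed by the shell, so pipes and redirects work.
        proc = await asyncio.create_subprocess_shell(
            command,
            cwd=cwd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
    else:
        mode = "exec"
        # Arguments are passed as a list; no shell parses them.
        proc = await asyncio.create_subprocess_exec(
            *shlex.split(command),
            cwd=cwd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
    stdout, _stderr = await proc.communicate()
    return mode, proc.returncode, stdout
```

A substring check like this one also selects shell mode when an operator appears inside quotes (for example `grep "a|b"`), which is the conservative direction for functionality but not for safety; a real implementation may tokenize first.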
Sandboxed Environment
Every command executes in a stripped-down environment with minimal privileges. The environment is rebuilt for each command to prevent contamination.
Environment Variables
The _build_sandboxed_env() method constructs a clean environment that strips all host variables:
| Variable | Value | Purpose |
| --- | --- | --- |
| PATH | /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin | Standard binary locations only |
| WORKSPACE_ID | {workspace_id} | Identity tracking |
| HOME | /workspaces/{workspace_id} | User home directory |
| GIT_CONFIG_GLOBAL | /workspaces/{workspace_id}/.gitconfig | Per-workspace git config |
| GIT_SSH_COMMAND | ssh -F {root}/.ssh/config -i {root}/.ssh/id_ed25519 -o StrictHostKeyChecking=no | Secure git auth |
| LANG / LC_ALL | en_US.UTF-8 | Locale settings |
| PYTHONDONTWRITEBYTECODE | 1 | Prevent .pyc files |
| PYTHONUNBUFFERED | 1 | Immediate stdout/stderr |
| NODE_ENV | test | Node.js environment |
| npm_config_cache | /workspaces/{workspace_id}/.npm_cache | NPM cache location |
Sources: services/workspace-worker/executor.py:506-536
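As a rough sketch, an environment builder along these lines starts from an empty dict rather than copying os.environ, which is what keeps host variables out. The function below is a standalone stand-in for `_build_sandboxed_env()`; its signature, the `workspaces_root` parameter, and the reading of `{root}` as the workspace directory are assumptions.

```python
def build_sandboxed_env(workspace_id: str, workspaces_root: str = "/workspaces") -> dict:
    """Build a fresh environment dict; nothing is inherited from the host."""
    # Assumption: {root} in GIT_SSH_COMMAND refers to the workspace directory.
    root = f"{workspaces_root}/{workspace_id}"
    return {
        "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "WORKSPACE_ID": workspace_id,
        "HOME": root,
        "GIT_CONFIG_GLOBAL": f"{root}/.gitconfig",
        "GIT_SSH_COMMAND": (
            f"ssh -F {root}/.ssh/config -i {root}/.ssh/id_ed25519 "
            "-o StrictHostKeyChecking=no"
        ),
        "LANG": "en_US.UTF-8",
        "LC_ALL": "en_US.UTF-8",
        "PYTHONDONTWRITEBYTECODE": "1",
        "PYTHONUNBUFFERED": "1",
        "NODE_ENV": "test",
        "npm_config_cache": f"{root}/.npm_cache",
    }
```

The resulting dict would be passed as the `env` argument of the subprocess call, so the child process sees only these keys.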
Working Directory
Commands execute in a working directory that is:
Validated: Passed through WorkspaceManager.resolve_safe_path() to prevent traversal
Workspace-scoped: Must resolve to a path inside /workspaces/{workspace_id}/
Defaults to root: If no cwd parameter is provided, uses /workspaces/{workspace_id}/
Sources: services/workspace-worker/executor.py:146-152
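One common way to implement this kind of check is to resolve the candidate path (following symlinks and `..`) and then verify that it is still under the workspace root. The sketch below is a free-function approximation, not the actual `WorkspaceManager.resolve_safe_path()` method; its signature and the `PathEscapeError` exception are hypothetical.

```python
from pathlib import Path


class PathEscapeError(ValueError):
    """Raised when a requested path resolves outside the workspace (hypothetical)."""


def resolve_safe_path(workspace_root: str, cwd=None) -> Path:
    """Resolve `cwd` relative to the workspace root and refuse anything outside it."""
    root = Path(workspace_root).resolve()
    # No cwd given: default to the workspace root itself.
    candidate = (root / cwd).resolve() if cwd else root
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise PathEscapeError(f"{cwd!r} resolves outside the workspace") from None
    return candidate
```

Because the check runs after `resolve()`, both `../` sequences and absolute paths such as `/etc` are rejected, and a symlink pointing outside the workspace is caught once it has been followed.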
Output Handling
Command output is captured, truncated, and sanitized to prevent resource exhaustion and ensure consistent behavior.
Output Limits
| Stream | Limit | When exceeded |
| --- | --- | --- |
| stdout | 100 KB (MAX_STDOUT_BYTES) | Truncated, truncated: true flag set |
| stderr | 50 KB (MAX_STDERR_BYTES) | Truncated, truncated: true flag set |
Sources: services/workspace-worker/executor.py:100-102
Timeout Enforcement
Commands that exceed their timeout are killed:
Default timeout: 120 seconds (DEFAULT_TIMEOUT)
Maximum timeout: 300 seconds (enforced at API layer)
On timeout: proc.kill() is called and the result returns {"timed_out": true, "exit_code": -1}
Sources: services/workspace-worker/executor.py:186-200
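The timeout behavior described above can be approximated with asyncio.wait_for around proc.communicate(). This is a sketch under assumptions: the function name `run_with_timeout` and the shape of the non-timeout result are illustrative, and only the `timed_out` and `exit_code` values on timeout come from the text above.

```python
import asyncio

DEFAULT_TIMEOUT = 120  # seconds, as stated above


async def run_with_timeout(argv, timeout=DEFAULT_TIMEOUT) -> dict:
    """Run argv in exec mode and kill the process if it exceeds `timeout` seconds."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()  # reap the killed process so it does not linger
        return {"timed_out": True, "exit_code": -1}
    return {
        "timed_out": False,
        "exit_code": proc.returncode,
        "stdout": stdout,
        "stderr": stderr,
    }
```

Note that `proc.kill()` only signals the direct child; processes it spawned may need separate handling (for example a process group), which this sketch does not attempt.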
Encoding and Truncation
Output bytes are decoded with error replacement and truncated to the size limits.
This ensures:
Non-UTF-8 output doesn't crash the worker
Binary output is represented as replacement characters
Large outputs don't exhaust memory
Clients know when output was truncated
Sources: services/workspace-worker/executor.py:204-207
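A minimal sketch of this step is shown below. The constant names come from the Output Limits section; interpreting "KB" as 1024 bytes, truncating before decoding, and the helper name `decode_and_truncate` are assumptions.

```python
MAX_STDOUT_BYTES = 100 * 1024  # assumes 1 KB = 1024 bytes
MAX_STDERR_BYTES = 50 * 1024


def decode_and_truncate(raw: bytes, limit: int):
    """Return (text, truncated) for raw process output capped at `limit` bytes."""
    truncated = len(raw) > limit
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising,
    # including a multi-byte character cut in half by the truncation.
    text = raw[:limit].decode("utf-8", errors="replace")
    return text, truncated
```

For example, `decode_and_truncate(stdout_bytes, MAX_STDOUT_BYTES)` would yield the string to return plus the value for the `truncated` flag.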
API Integration
Command execution is exposed through multiple API layers to support different use cases.
Orchestrator API (Frontend → Worker)
The /api/workspaces/{workspace_id}/exec endpoint proxies requests from the frontend to the worker:
Request Schema (orchestrator layer):
Response Schema:
Sources: orchestrator/api/workspace_files.py:77-107, orchestrator/core/workspace_client.py:133-151
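To make the request side concrete, here is a hedged sketch of a request model. The field names `command`, `cwd` and `timeout` are inferred from parameters mentioned elsewhere in this document, and the class name `ExecRequest`, the `to_payload` helper, and the choice to reject (rather than clamp) an over-limit timeout are assumptions, not the orchestrator's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_TIMEOUT = 120  # seconds
MAX_TIMEOUT = 300      # seconds, the API-layer maximum stated above


@dataclass(frozen=True)
class ExecRequest:
    """Hypothetical request body for the exec endpoint."""

    command: str
    cwd: Optional[str] = None
    timeout: int = DEFAULT_TIMEOUT

    def __post_init__(self):
        if not self.command.strip():
            raise ValueError("command must not be empty")
        if not 0 < self.timeout <= MAX_TIMEOUT:
            raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT} seconds")

    def to_payload(self) -> dict:
        """Serialize to a JSON-compatible dict, omitting cwd when unset."""
        payload = {"command": self.command, "timeout": self.timeout}
        if self.cwd is not None:
            payload["cwd"] = self.cwd
        return payload
```

Validating at construction time mirrors the idea that the maximum timeout is enforced before the request ever reaches the worker.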
Agent Tool Integration
Agents access command execution via the workspace_exec tool registered in the action registry:
Sources: orchestrator/modules/tools/discovery/workspace_actions.py:159-200
Worker HTTP Endpoints
The workspace worker exposes a direct HTTP API for both internal (orchestrator) and agent tool execution:
| Endpoint | Method | Description |
| --- | --- | --- |
| /workspaces/{id}/exec | POST | Execute command with JSON body |
| /workspaces/{id}/git | POST | Execute git operation (calls exec internally) |
Both endpoints require the X-Internal-Token header for authentication when WORKER_INTERNAL_TOKEN is configured.
Sources: services/workspace-worker/main.py:643-668, services/workspace-worker/main.py:761-796
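The token rule above (enforced only when WORKER_INTERNAL_TOKEN is configured) might be checked as in the sketch below. The function name `is_authorized` and the plain-dict header lookup are assumptions; a real web framework would normally provide case-insensitive header access.

```python
import hmac


def is_authorized(headers: dict, configured_token) -> bool:
    """Check X-Internal-Token against the configured worker token."""
    if not configured_token:
        # No token configured: authentication is not enforced.
        return True
    supplied = headers.get("X-Internal-Token", "")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(supplied.encode(), configured_token.encode())
```

In a worker process the second argument would typically come from something like `os.environ.get("WORKER_INTERNAL_TOKEN")`.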
Execution Result Structure
Commands return a consistent result structure regardless of success or failure:
Success Result
Failure Result (Non-Zero Exit)
Timeout Result
Validation Error
Sources: services/workspace-worker/executor.py:141-224
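The four cases can be pictured as one record type with different field values. In the sketch below, `exit_code`, `stdout`, `stderr`, `truncated` and `timed_out` are names that appear elsewhere in this document; the `error` field, the `ExecResult` class, the sample values, and the use of exit code -1 for a validation error are assumptions for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class ExecResult:
    """Illustrative result record; every case shares the same keys."""

    exit_code: int
    stdout: str = ""
    stderr: str = ""
    truncated: bool = False
    timed_out: bool = False
    error: str = ""  # hypothetical field for validation failures


# Sample values are made up for illustration.
success = ExecResult(exit_code=0, stdout="3 passed\n")
failure = ExecResult(exit_code=1, stderr="1 failed\n")
timeout = ExecResult(exit_code=-1, timed_out=True)
rejected = ExecResult(exit_code=-1, error="command not allowed")

results = [asdict(r) for r in (success, failure, timeout, rejected)]
```

Keeping the key set identical across cases lets callers branch on `timed_out`, `error` and `exit_code` without first checking which keys exist.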
Git Operations Integration
Git commands are exposed through a specialized endpoint that validates operations and constructs safe git command strings:
Allowed Git Operations
Git Endpoint Flow
Example Request:
Constructed Command:
Sources: services/workspace-worker/main.py:761-796
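A sketch of the validate-then-construct step follows. The set of allowed operations here is purely hypothetical, since the actual list is not reproduced in this document, and the function name `build_git_command` is likewise an assumption. shlex.join is used so that each argument is quoted before the string is handed to the exec path.

```python
import shlex

# Hypothetical allow-list; the worker's real set of operations may differ.
ALLOWED_GIT_OPERATIONS = {"status", "diff", "log", "add", "commit", "branch", "checkout"}


def build_git_command(operation: str, args=()) -> str:
    """Validate the git operation and return a safely quoted command string."""
    if operation not in ALLOWED_GIT_OPERATIONS:
        raise ValueError(f"git operation not allowed: {operation}")
    # shlex.join quotes each argument, so spaces in a commit message
    # cannot split it into extra arguments.
    return shlex.join(["git", operation, *args])
```

For instance, `build_git_command("commit", ["-m", "fix: handle empty input"])` produces a single command string in which the message stays one quoted argument.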
Security Guarantees
The command execution system provides defense-in-depth through multiple security layers:
1. Binary Whitelist
Only pre-approved development tools can execute. System administration commands (sudo, systemctl, useradd, etc.) are excluded.
2. Pattern Blacklist
Dangerous operations are blocked even if the binary is whitelisted (rm -rf /, device access, mounting).
3. Path Validation
All working directories must resolve within the workspace boundary. Symlink escapes and ../ traversal are prevented by resolve_safe_path().
4. Environment Isolation
Commands execute with a stripped environment that excludes host variables, SSH keys, and cloud credentials.
5. Resource Limits
Timeout: Max 300 seconds per command
Output: Max 100 KB stdout, 50 KB stderr
Concurrency: Worker semaphore limits concurrent executions
6. Credential Injection
OAuth tokens and SSH keys are injected per-task and cleaned up after execution. They're never stored in environment variables accessible to user code.
What's Prevented
| Threat | Mitigation |
| --- | --- |
| Privilege escalation | sudo, su blocked by pattern regex |
| Filesystem escape | Path validation via resolve_safe_path() |
| Root filesystem deletion | rm -rf / blocked by pattern regex |
| Network attack tools | nmap, netcat not in whitelist |
| Container escape | docker run, kubectl blocked |
| Credential theft | Sandboxed environment strips host credentials |
| Resource exhaustion | Output limits, timeout enforcement, storage quotas |
| Backdoor installation | No persistent system access, ephemeral task dirs |
Sources: services/workspace-worker/executor.py:76-98, services/workspace-worker/workspace_manager.py:228-253, orchestrator/api/workspace_files.py:77-107
Example Usage Patterns
Running Tests
Running Linter
Build Command
Pipeline with Output Redirection
Sources: orchestrator/modules/tools/discovery/workspace_actions.py:159-200
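The four patterns named above could correspond to request bodies like the ones sketched here. The payload keys (`command`, `cwd`, `timeout`), the specific commands, and the file and directory names are all illustrative assumptions; only the binaries used are taken from the whitelist.

```python
import json

# Hypothetical request bodies, one per usage pattern named above.
EXAMPLE_REQUESTS = {
    "tests": {"command": "pytest tests/ -q", "timeout": 300},
    "lint": {"command": "ruff check .", "timeout": 120},
    "build": {"command": "npm run build", "cwd": "frontend", "timeout": 300},
    # Pipes and redirects select shell mode; the redirect target stays inside
    # the workspace because redirects to /dev/ match a blocked pattern.
    "pipeline": {"command": "pytest -q 2>&1 | tail -n 20 > test_summary.txt"},
}

# Each body is plain JSON, suitable for a POST to the exec endpoint.
bodies = {name: json.dumps(body) for name, body in EXAMPLE_REQUESTS.items()}
```

The first three commands contain no shell operators and would run in exec mode, while the pipeline example would run in shell mode.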