# Evidence Chain

## Purpose

This document separates evidence by type and strength so counsel can assess what is solid and what needs caveats.
## Conception Timeline

| Date (UTC) | Event | Source |
|---|---|---|
| 2026-03-15 | PRD-108 specification completed | docs/PRDS/108-MEMORY-FIELD-PROTOTYPE.md |
| 2026-03-21 03:23 | First commit with working implementation | Git commit 674474722ac998c06a016cda4c2e5d05f9fbda68 |
| 2026-03-21 ~04:00 | Unit tests passing (57/57) | orchestrator/tests/test_vector_field.py |
| 2026-03-21 ~04:30 | Proof suite passing (33/33) | orchestrator/tests/test_prd108_proof.py |
| 2026-03-21 ~05:00 | Multi-scenario suite passing (13/13) | orchestrator/tests/test_prd108_scenarios.py |
| 2026-03-21 05:59 | Production mission started | Mission 77c58227, Railway deployment logs |
| 2026-03-21 06:17 | Production mission completed | Mission accepted by user |
| 2026-03-21 ~06:30 | Evidence documents written | This folder |
All timestamps are verifiable via git log and Railway deployment logs.
## Evidence Type 1: Code (Strong)

What it proves: The invention is implemented and functional.

| Component | File | Notes |
|---|---|---|
| Abstract interface | orchestrator/core/ports/context.py | SharedContextPort ABC with 4 methods |
| Vector field adapter | orchestrator/modules/context/adapters/vector_field.py | Full implementation (~357 lines) |
| Message-passing baseline | orchestrator/modules/context/adapters/redis_context.py | Comparison backend |
| Backend factory | orchestrator/modules/context/factory.py | A/B switch via environment variable |
| Coordinator integration | orchestrator/services/coordinator_service.py | Mission lifecycle (create/seed/inject/destroy) |
| Agent tool definitions | orchestrator/modules/tools/discovery/actions_field.py | 3 ActionDefinitions |
| Agent tool handlers | orchestrator/modules/tools/discovery/handlers_field.py | 3 handler functions |
| Configuration | orchestrator/config.py | 8 environment variables |

Strength: Strong. The code is in a private repository with git history showing authorship and dates.
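The backend factory's environment-variable A/B switch can be sketched in a few lines. All names below (the classes, the function, and the `CONTEXT_BACKEND` variable) are hypothetical stand-ins for illustration, not the actual identifiers in orchestrator/modules/context/factory.py or orchestrator/config.py:

```python
import os

# Hypothetical placeholder classes; the real adapters live in
# orchestrator/modules/context/adapters/.
class VectorFieldContext:
    """Shared-context backend backed by a vector field."""

class RedisMessageContext:
    """Message-passing baseline backed by Redis."""

_BACKENDS = {
    "vector_field": VectorFieldContext,
    "message_passing": RedisMessageContext,
}

def make_context_backend(env_var: str = "CONTEXT_BACKEND"):
    """Select the shared-context backend from an environment variable,
    defaulting to the message-passing baseline."""
    name = os.environ.get(env_var, "message_passing")
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown context backend: {name!r}")
```

Under this sketch, setting the variable to `vector_field` before a mission selects the field adapter, and leaving it unset falls back to the baseline, which is what makes a controlled A/B run a one-variable change.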
## Evidence Type 2: Local Tests with Deterministic Embeddings (Moderate)
What it proves: The algorithms work correctly. The resonance formula, decay, reinforcement, deduplication, and scoring pipeline produce the expected mathematical results.
What it does NOT prove: That the system works with production-grade embeddings or at production scale with real network latency.
Important caveat: These tests use bag-of-words deterministic embeddings, not real neural network embeddings. This is intentional — it makes the tests reproducible and isolates the algorithmic behavior from embedding model variability. But it means the A/B comparison results (62% vs 29% coverage) are mechanism tests, not production benchmarks. The embeddings produce predictable cosine similarities based on word overlap, which is sufficient to verify the algorithms but does not represent real-world semantic similarity quality.
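The kind of deterministic bag-of-words embedding described above can be sketched minimally. The function names and the tiny vocabulary are illustrative assumptions, not the project's actual test fixtures; the point is only that cosine similarity is driven purely by word overlap, so results are exactly reproducible:

```python
import math
from collections import Counter

def bow_embed(text: str, vocab: list[str]) -> list[float]:
    """Deterministic bag-of-words 'embedding': one count per vocab word."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    """Standard cosine similarity, 0.0 if either vector is all-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

vocab = ["field", "vector", "agent", "memory", "redis"]
a = bow_embed("vector field memory", vocab)  # overlaps b on two words
b = bow_embed("vector field agent", vocab)
c = bow_embed("redis", vocab)                # shares no words with a
```

Texts sharing words score higher than texts sharing none, which is enough to exercise the retrieval pipeline, but says nothing about the semantic quality of a real neural embedding model.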
| Suite | Tests | What it covers | Embeddings |
|---|---|---|---|
| Unit tests (test_vector_field.py) | 57 | Adapter mechanics (mocked Qdrant) | Mocked |
| Proof suite (test_prd108_proof.py) | 33 | All claim elements on real Qdrant in-memory | Bag-of-words deterministic |
| Multi-scenario (test_prd108_scenarios.py) | 13 | 5 domains, A/B comparison | Bag-of-words deterministic |
| Stress tests (demo_field_stress.py) | 16 | Scale, performance, cross-agent visibility | Bag-of-words deterministic |
Saved outputs:

| File | Result |
|---|---|
| runs/2026-03-21/01-proof-suite.txt | 33/33 passed in 1.55s |
| runs/2026-03-21/02-unit-tests.txt | 57/57 passed in 0.33s |
| runs/2026-03-21/03-ab-comparison.txt | VF 86% vs MP 43% (single scenario) |
| runs/2026-03-21/04-stress-tests.txt | 16/16 passed, 755 qps |
| runs/2026-03-21/05-environment.txt | Python 3.10.9, qdrant-client 1.17.1 |
| runs/2026-03-21/06-multi-scenario.txt | 5-scenario summary, VF 62% avg vs MP 29% avg |
| runs/2026-03-21/07-multi-scenario-pytest.txt | 13/13 passed in 1.66s |
Strength: Moderate. Proves the algorithms work as specified. Does not prove production-grade effectiveness. The distinction must be stated clearly.
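The decay and reinforcement behaviors these suites verify can be illustrated with a minimal sketch. The exponential form, the 600-second half-life, and the linear boost below are assumptions chosen for exposition; the actual PRD-108 formulas are specified in PRD-108-ALGORITHMS.md and verified by the tests above:

```python
import math

def decayed_score(base: float, age_s: float, half_life_s: float = 600.0) -> float:
    """Illustrative exponential decay: the score halves every half_life_s
    seconds, so stale field entries rank progressively lower."""
    return base * math.exp(-math.log(2) * age_s / half_life_s)

def reinforce(base: float, hits: int, boost: float = 0.1) -> float:
    """Illustrative reinforcement: each retrieval bumps the weight by a
    fixed boost, capped at 1.0, so frequently used entries resist decay."""
    return min(1.0, base + boost * hits)
```

The deterministic nature of both functions is what lets a test suite assert exact expected values rather than statistical trends.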
## Evidence Type 3: Production Mission (Strong for Reduction to Practice)
What it proves: The system runs in production with real agents, real embeddings (Qwen 2048-dim via OpenRouter), and a real vector database (Qdrant on Railway). Three agents shared a field, queried by meaning, and the writer produced output incorporating findings from all research agents.
What it does NOT prove: That the vector field systematically outperforms message passing in production. This was a single mission without a controlled baseline comparison on the same prompt.
| Item | Value |
|---|---|
| Mission ID | 77c58227-defb-42c9-b070-c04a1b918764 |
| Field ID | 8bdb19ba-cb03-45d7-a005-3ba04765ad17 |
| Qdrant instance | qdrant-production-c691.up.railway.app |
| Embedding model | qwen/qwen3-embedding-8b (2048 dimensions) |
| Agents | 141 (Researcher), 191 (Writer), 102 (Document Generator) |
| Field queries | 7 (all via platform_field_query) |
| Tasks completed | 6 |
| Total tokens | 158,564 |
| Duration | ~18 minutes (05:59 to 06:17 UTC) |
| Branch | ralph/prd-82b-mission-intelligence |
| Platform | Automatos AI on Railway |
Evidence source: Railway deployment logs (timestamped), Qdrant collection creation logs, API logs showing [PRD-108] tagged events.
Strength: Strong for proving reduction to practice. The system works end-to-end in a real environment. This is the single strongest piece of evidence for a provisional filing.
## Evidence Type 4: Technical Documentation (Supporting)

| Document | Contents |
|---|---|
| PRD-108-ALGORITHMS.md | 15 algorithms with math, pseudocode, implementation references |
| PRD-108-IMPLEMENTATION.md | Architecture decisions, files changed, engineering narrative |
| PRD-108-TECHNICAL-DISCLOSURE.md | Formal disclosure with prior art analysis and 6 novelty claims |
Strength: Supporting. These establish that the invention was thoughtfully designed and documented, not accidental.
## What the Evidence Chain Proves (Honestly)

1. The invention is real. Working code in a private repository with git timestamps.
2. The algorithms are correct. 119 automated assertions (57 unit + 33 proof + 13 scenario + 16 stress) verify the mathematical behavior.
3. The system runs in production. One live mission with 3 agents, real embeddings, real Qdrant.
4. The A/B comparison shows a directional advantage. In controlled local tests with deterministic embeddings, the vector field recovered more information than message passing across 5 scenarios (62% vs 29% average coverage).
## What the Evidence Chain Does NOT Prove

1. Production superiority. No controlled A/B comparison has been run in production (same mission, both backends, real embeddings).
2. Generalizability. The 5 test scenarios use deterministic embeddings and synthetic data. Real mission performance may vary.
3. Novelty under formal patent examination. The inventor's prior art review is preliminary. Counsel should conduct a formal search.
4. Commercial viability. Technical evidence only. No market validation data.
## Reproduction Instructions

No external API calls. No credentials needed. Qdrant runs in in-memory mode. Fully deterministic.
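The no-server property comes from qdrant-client's in-memory mode (`QdrantClient(":memory:")`), which keeps the whole vector store inside the test process. A stdlib-only toy store illustrates the same idea; it is an illustration of why no credentials or network are needed, not the project's actual adapter:

```python
import math

class ToyVectorStore:
    """A minimal in-process vector store: upsert points, rank by cosine.
    Illustrates what the in-memory Qdrant mode provides for the test suites."""

    def __init__(self) -> None:
        self._points: dict[int, tuple[list[float], dict]] = {}

    def upsert(self, point_id: int, vector: list[float], payload: dict) -> None:
        self._points[point_id] = (vector, payload)

    def search(self, query: list[float], limit: int = 3):
        def cos(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        scored = [
            (cos(query, vec), pid, payload)
            for pid, (vec, payload) in self._points.items()
        ]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:limit]
```

Because everything lives in process memory and the embeddings are deterministic, repeated runs produce byte-identical rankings, which is what makes the saved outputs in runs/2026-03-21/ reproducible.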
Last updated