Horizon Vision
Architecture captured, not yet built
Cognitive Evolution & Earned Agency
AD-357
7 reinforcement gaps addressed: multi-dimensional rewards, hindsight replay, tournament evaluation, emergent capability profiles, memetic evolution, curiosity-driven exploration, semantic Hebbian generalization. Plus Earned Agency tier progression: Ensign → Lieutenant → Commander → Senior Officer. Agency is earned, not granted.
Captain's Yeoman — Phase 36
AD-359
Dedicated scheduling and logistics assistant for the Captain. Manages the Captain's operational workload: reminders, briefings, meeting prep, status rollups, and cross-department coordination. Like a real yeoman — handles administrative burden so the Captain can focus on command decisions.
Model Diversity & Neural Routing
Planned
Model Registry with catalog of available providers (Ollama, Copilot proxy, OpenAI, Anthropic, local). Hebbian-learned routing: successful completions strengthen model→task type pairings. Over time, the system routes poetry to the model that writes best poetry, code to the model that codes best.
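A minimal sketch of how Hebbian-style routing could behave, assuming a simple weight per (task type, model) pair. The class name, method names, learning rate, and decay value are all hypothetical illustrations, not the planned ProbOS design.

```python
from collections import defaultdict


class HebbianRouter:
    """Toy router: successful completions strengthen a (task_type, model) weight."""

    def __init__(self, models, learning_rate=0.1, decay=0.05):
        self.models = list(models)          # registry order doubles as the tie-breaker
        self.learning_rate = learning_rate  # assumed value, not from the roadmap
        self.decay = decay                  # assumed value, not from the roadmap
        self.weights = defaultdict(float)   # (task_type, model) -> strength in [0, 1)

    def route(self, task_type):
        # Pick the strongest pairing; on a tie, max() returns the first model listed.
        return max(self.models, key=lambda m: self.weights[(task_type, m)])

    def record(self, task_type, model, success):
        key = (task_type, model)
        w = self.weights[key]
        if success:
            w += self.learning_rate * (1.0 - w)  # strengthen toward 1.0
        else:
            w -= self.decay * w                  # weaken toward 0.0
        self.weights[key] = w


router = HebbianRouter(["model_a", "model_b"])
for _ in range(5):
    router.record("poetry", "model_b", success=True)
router.record("poetry", "model_a", success=False)
```

In this sketch an unseen task type falls back to the first registered model, so registry order acts as the default preference until outcomes accumulate.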
Cognitive Journal
Planned
Complete token ledger recording every LLM request/response with full context, duration, cost, and outcome. Enables replay, learning analysis, cost attribution by department, and retrospective training. The ship's complete cognitive log — every thought the ship has ever had.
Ward Room & IntentBus Enhancements
Planned
Ward Room: direct agent-to-agent messaging for structured deliberation. IntentBus priority lanes (critical/high/normal/background). Alert Conditions (like DEFCON levels — normal/elevated/critical) that reconfigure the entire system simultaneously. EPS compute budgeting across departments.
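The four priority lanes can be illustrated with a small heap-backed queue. This is only a sketch under assumptions: the Lane enum, the PriorityIntentQueue class, and its methods are hypothetical names, and the real IntentBus may dispatch very differently.

```python
import heapq
import itertools
from enum import IntEnum


class Lane(IntEnum):
    # Lower number is served first.
    CRITICAL = 0
    HIGH = 1
    NORMAL = 2
    BACKGROUND = 3


class PriorityIntentQueue:
    """Hypothetical queue: strict priority across lanes, FIFO within a lane."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # sequence number keeps FIFO order per lane

    def publish(self, intent, lane=Lane.NORMAL):
        heapq.heappush(self._heap, (int(lane), next(self._counter), intent))

    def next_intent(self):
        if not self._heap:
            return None
        _, _, intent = heapq.heappop(self._heap)
        return intent


queue = PriorityIntentQueue()
queue.publish("reindex archive", Lane.BACKGROUND)
queue.publish("hull breach", Lane.CRITICAL)
queue.publish("status rollup")  # defaults to NORMAL
queue.publish("reactor alarm", Lane.CRITICAL)
order = [queue.next_intent() for _ in range(4)]
```

Strict priority like this can starve the background lane under sustained load, which is one reason a compute budget per department may be needed alongside it.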
The Nooplex — Model of Models
Long Horizon
Full multi-ship federation. Emergent meta-intelligence that arises from cooperative, governed agent ecosystems. Each ProbOS instance is one Cognitive Mesh. The Nooplex is the network formed when many meshes federate. General intelligence as a property of the ecosystem, not any single model.
High Priority
Next sprint — Phase 30+
Phase 30: Self-Improvement Pipeline
Planned
Northstar III. ProbOS proposes improvements to itself based on failure analysis, capability gaps, and usage patterns. The Architect and Builder work autonomously; the Captain approves.
Medium Priority
Phase 31: Security Team
Planned
Chief of Security agent, ThreatDetectionAgent, InputValidationAgent, AuditTrailAgent. Security Council quorum integration. Anomaly detection from runtime event stream. The Security team protects the ship's systems and data.
Phase 32 Remaining: Engineering Ops
Planned
Structural Integrity Field (invariant enforcement), Cognitive Journal, Model Diversity & Neural Routing, Ship's Telemetry (performance instrumentation). The remaining engineering infrastructure items after Northstar I/II completion.
Phase 33: Operations Team
Planned
Ward Room (agent messaging), IntentBus priority lanes, Alert Conditions (system-wide operational modes), EPS resource budgeting. Operations keeps the ship running efficiently at scale.
Phase 24 Remaining: Channel Adapters
Planned
Slack, Telegram, Microsoft Teams adapters built on the ChannelAdapter ABC. Discord is done (AD-274–278). Each adapter exposes per-channel conversation history and routes requests through the full ProbOS cognitive pipeline.
Phase 25: Persistent Task Scheduling
Planned
Tasks survive runtime restarts with checkpoint and resume. The current TaskScheduler (AD-281–284) is session-scoped. Phase 25 adds KnowledgeStore persistence so scheduled reminders and recurring tasks survive restarts.
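As a rough illustration of checkpoint and resume, the sketch below persists a task list to a JSON file and reloads it, standing in for KnowledgeStore persistence. The file layout, field names, and the save_tasks/load_tasks helpers are assumptions made for the example only.

```python
import json
import os
import tempfile


def save_tasks(path, tasks):
    """Checkpoint tasks: write to a temp file, then atomically swap it in."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(tasks, handle)
    os.replace(tmp_path, path)  # a crash mid-write never leaves a half-written file


def load_tasks(path):
    """Resume: return the last checkpoint, or an empty list on first start."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as workdir:
    checkpoint = os.path.join(workdir, "tasks.json")
    first_start = load_tasks(checkpoint)
    save_tasks(checkpoint, [{"name": "daily briefing", "interval_s": 86400}])
    after_restart = load_tasks(checkpoint)  # simulates a fresh process reading state
```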
Low Priority / Deferred
Phase 26: Inter-Agent Deliberation
Deferred
Structured debate protocol between agents with conflicting perspectives. More nuanced than quorum voting — agents reason against each other's positions. Novel ideas emerge from disagreement.
Phase 28: Meta-Learning
Deferred
System learns from its own learning — detects which feedback types cause the most improvement, which episodic patterns repeat, which agents improve fastest. Learning-to-learn at the civilization level.
Federation Hardening
Deferred
Multi-node stability, fault tolerance, automatic peer reconnection, partition healing. Current ZeroMQ federation works for demos. Production federation requires hardened transport with retries, gossip convergence guarantees, and graceful split-brain handling.
MCP & A2A Protocol Adapters
Deferred
MCP Federation Adapter (join/leave MCP-compatible networks). A2A Protocol Adapter (Google Agent-to-Agent protocol compatibility). Enables ProbOS to interoperate with the broader agent ecosystem.
Wave 6 Complete
0 open bugs — Wave 6 COMPLETE (8/8)
Wave 6 Closed
7 closed
BF-095 — God Object Reduction
Closed
ontology.py (1,060 lines, 53 methods) → ontology/ package (5 files). ward_room.py (1,612 lines, 39 methods) → ward_room/ package (6 files). 7 LoD violations fixed. Dead code removed. 2 import compat tests added.
BF-094 — Sync File I/O in Async
Closed
All sync open() in async paths eliminated. _read_yaml_sync() + _write_archive_sync() + load_seed_profile_async() via run_in_executor. 3 modules fixed. 2 new tests.
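The general pattern behind this fix, sketched with hypothetical helper names (the real _read_yaml_sync() is not shown in this document): blocking file reads move into a plain function, and the async path awaits it through the event loop's default executor.

```python
import asyncio
import os
import tempfile


def _read_text_sync(path):
    # Blocking I/O lives in an ordinary function so it can run in a worker thread.
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


async def read_text_async(path):
    # Passing None selects the loop's default ThreadPoolExecutor.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, _read_text_sync, path)


# Demo setup runs outside the event loop, so sync I/O is fine here.
fd, demo_path = tempfile.mkstemp(suffix=".yaml")
with os.fdopen(fd, "w", encoding="utf-8") as handle:
    handle.write("callsign: Wesley\n")
try:
    content = asyncio.run(read_text_async(demo_path))
finally:
    os.remove(demo_path)
```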
BF-093 — API Boundary Validation
Closed
All raw-dict endpoints eliminated. AgentLifecycleRequest + SetCooldownRequest Pydantic models. ACM errors → HTTPException(503/409). Cooldown range 60–1800 enforced. 15 new tests.
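To show the kind of boundary check involved, here is a dependency-free sketch that uses a standard-library dataclass as a stand-in for the Pydantic models named above. The field names, the CooldownRequest class, and parse_cooldown_request() are hypothetical; only the 60–1800 range comes from the fix description.

```python
from dataclasses import dataclass

COOLDOWN_MIN = 60    # lower bound from the fix description
COOLDOWN_MAX = 1800  # upper bound from the fix description


@dataclass(frozen=True)
class CooldownRequest:
    """Stand-in for a validated request model (not the real SetCooldownRequest)."""

    agent_id: str
    cooldown: int

    def __post_init__(self):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.cooldown, bool) or not isinstance(self.cooldown, int):
            raise ValueError("cooldown must be an integer")
        if not COOLDOWN_MIN <= self.cooldown <= COOLDOWN_MAX:
            raise ValueError(
                f"cooldown must be between {COOLDOWN_MIN} and {COOLDOWN_MAX}"
            )


def parse_cooldown_request(payload):
    """Turn a raw dict into a validated object, or raise ValueError."""
    try:
        return CooldownRequest(
            agent_id=str(payload["agent_id"]), cooldown=payload["cooldown"]
        )
    except KeyError as exc:
        raise ValueError(f"missing field: {exc.args[0]}") from exc


valid = parse_cooldown_request({"agent_id": "wesley", "cooldown": 300})
```

With Pydantic the range check would normally be declared on the field itself, and the web framework would turn a validation failure into an error response before the handler runs.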
BF-091 — Mock Discipline Phase 2
Closed
Spec compliance 22.6% → 51.9% (+222 spec'd mocks across 19 files). 3 real bugs caught by spec= (BF-078 class): phantom generate(), get_trust(), get_trust_score() methods.
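Why spec= catches this class of bug can be shown with the standard library alone. The TrustNetwork class below is a hypothetical collaborator invented for the example; the behavior of Mock(spec=...) is standard unittest.mock.

```python
from unittest.mock import Mock


class TrustNetwork:
    """Hypothetical collaborator with exactly one real method."""

    def get_score(self, agent_id):
        return 0.5


# An unspec'd mock accepts any attribute, so a call to a method that does not
# exist on the real class passes silently.
loose = Mock()
loose.get_trust_score("agent-1")

# A spec'd mock only exposes attributes that exist on the real class.
strict = Mock(spec=TrustNetwork)
strict.get_score("agent-1")

try:
    strict.get_trust_score("agent-1")
    phantom_caught = False
except AttributeError:
    phantom_caught = True
```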
BF-092 — Trust Threshold Constants
Closed
19 named constants in config.py replacing ~30 magic numbers. format_trust() utility replacing 52+ round(x,4) calls. EventEmitterMixin deduplicating 4 identical _emit() methods.
BF-090 — Exception Audit Phase 2
Closed
71 silent swallows fixed (43 logger.debug, 4 narrowed to sqlite3.OperationalError, 24 justified). 42 bare catches fixed (exc_info=True). DRY helper _safe_log_event() in feedback.py.
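A small sketch of the two patterns described here: narrowing a catch to sqlite3.OperationalError and logging with exc_info=True instead of swallowing silently. The count_rows() function and logger name are hypothetical and do not come from the ProbOS codebase.

```python
import logging
import sqlite3

logger = logging.getLogger("exception_audit_sketch")


def count_rows(conn, table):
    """Return the row count, or None if the table cannot be queried.

    The table name is interpolated directly, so it must come from trusted code.
    """
    try:
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    except sqlite3.OperationalError:
        # Narrow catch: only database-level failures such as a missing table are
        # tolerated, and the traceback is kept in the log rather than discarded.
        logger.debug("count_rows failed for %s", table, exc_info=True)
        return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE episodes (id INTEGER)")
conn.execute("INSERT INTO episodes VALUES (1)")
present = count_rows(conn, "episodes")
missing = count_rows(conn, "no_such_table")
conn.close()
```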
BF-089 — Emergent Detector Trust Anomaly False Positives
Closed
Crew-reported (Forge + Reyes). Seven rapid-fire alerts during normal duty cycles. Fixed: adaptive baselines + temporal buffer + configurable sustain window.
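One way the three ingredients could fit together, as a sketch only: a rolling buffer supplies an adaptive baseline, and an alert fires only after the deviation persists for a configurable number of samples. The class name, thresholds, and parameter names are assumptions, not the shipped detector.

```python
from collections import deque
from statistics import mean, pstdev


class SustainedAnomalyDetector:
    """Hypothetical detector: rolling baseline plus a sustain window."""

    def __init__(self, window=20, threshold=3.0, sustain=3,
                 min_samples=5, min_spread=0.01):
        self.history = deque(maxlen=window)  # temporal buffer feeding the baseline
        self.threshold = threshold           # allowed deviation, in spreads
        self.sustain = sustain               # consecutive anomalies needed to alert
        self.min_samples = min_samples       # warm-up before any judgement
        self.min_spread = min_spread         # floor so a flat baseline is not hair-trigger
        self._streak = 0

    def observe(self, value):
        """Return True only once an anomaly has persisted for `sustain` samples."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            baseline = mean(self.history)
            spread = max(pstdev(self.history), self.min_spread)
            anomalous = abs(value - baseline) > self.threshold * spread
        if anomalous:
            self._streak += 1
        else:
            self._streak = 0
            self.history.append(value)  # only normal samples move the baseline
        return self._streak >= self.sustain


detector = SustainedAnomalyDetector()
warmup = [detector.observe(v) for v in [0.50, 0.52] * 5]
blip = [detector.observe(v) for v in [0.90, 0.51]]          # single spike, then normal
sustained = [detector.observe(v) for v in [0.90, 0.90, 0.90]]
```

A single spike during a normal duty cycle resets without alerting, while three consecutive deviations raise one alert on the third sample.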
18 closed (BF-001–023)
BF-023 — Degraded Agent Death Spiral
Closed
LLM exceptions in proactive loop swallowed at DEBUG level without calling update_confidence(). Agents stuck at ~0.185 confidence (DEGRADED) with no recovery path. Fix: (a) exception handler tracks failures, (b) DEGRADED->ACTIVE recovery when confidence rises above 0.2. 5 tests.
BF-022 — Crew Cannot Respond to Ship's Computer Advisories
Closed
Bridge alerts posted to All Hands with same_department=False. Earned Agency blocked all Lieutenants. Fixed in AD-424: INFORM threads skip notification, DISCUSS threads pass same_department=True.
BF-021 — Duty Schedule Hard Gate Missing
Closed
Agents with no duty due were still called by the proactive loop, relying on LLM to return [NO_RESPONSE]. Wesley ignored the instruction and kept producing scout reports. Fixed: skip agent entirely when no duty due (no LLM call).
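The hard gate amounts to checking the schedule before the model is ever invoked. A minimal sketch, assuming a hypothetical Duty record and a proactive_tick() helper that are not taken from the ProbOS source:

```python
from dataclasses import dataclass


@dataclass
class Duty:
    """Hypothetical duty record; timestamps are plain seconds."""

    name: str
    interval_s: float
    last_run: float = 0.0

    def is_due(self, now):
        return now - self.last_run >= self.interval_s


def proactive_tick(agents, now, call_llm):
    """Call the LLM only for agents with a due duty; return the names called."""
    called = []
    for agent_name, duties in agents.items():
        due = [duty for duty in duties if duty.is_due(now)]
        if not due:
            continue  # hard gate: skip the agent entirely, no LLM call is made
        call_llm(agent_name, due)
        for duty in due:
            duty.last_run = now
        called.append(agent_name)
    return called


llm_calls = []
crew = {
    "wesley": [Duty("scout_report", interval_s=3600, last_run=1000)],
    "forge": [Duty("diagnostics", interval_s=60, last_run=1000)],
}
called = proactive_tick(crew, now=1100, call_llm=lambda name, due: llm_calls.append(name))
```

Gating in code rather than in the prompt means an agent that ignores instructions cannot produce output when nothing is due, and no tokens are spent on the skipped call.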
BF-020 — Discord Adapter False Success Report
Closed
Discord adapter startup reported success even when discord.py was not installed. Fixed: check adapter._started before printing success; show install command on failure.
BF-013 — Ship's Computer Callsign Awareness
Closed
Ship's computer didn't recognize crew callsigns. "Is Wesley aboard?" returned "no agents found." Fixed: callsign fallback in _agent_info(), callsign injection into decomposer prompt.
BF-012 — Discord Shutdown Hang Redux
Closed
SelectorEventLoop + asyncio.to_thread() hangs on Windows. Replaced with async polling loop using asyncio.sleep(0.1).
BF-008 — Dream Cycle Double-Replay After Dolphin Dreaming
Closed
Micro-dream (Tier 1) already replayed episodes incrementally; full dream re-replayed same 50. Fixed: dream_cycle() now starts with micro_dream() flush, then maintenance only.
BF-007 — Verification False Positive on Per-Pool Agent Counts
Closed
Per-pool/per-department agent counts flagged against system-wide total. Fixed: context-window analysis + known pool size whitelist in _verify_response().
BF-004 — Transporter HXI Visualization Not Rendered
Closed
Transporter Pattern WebSocket events fire correctly but IntentSurface.tsx had no rendering block for transporterProgress. Chunk status panel added.
BF-001 — Self-Mod False Positive on Knowledge Questions
Closed · AD-348
Knowledge questions ("who is Alan Turing?") incorrectly triggered capability_gap and self-mod pipeline. Fixed by updating prompt rules to classify well-known factual questions as conversational responses, not task gaps.
BF-002 — Agent Orbs Escaping Pool Group Spheres
Closed · AD-349
Newly added agents appeared at origin (0,0,0) center instead of their correct crew cluster. The agent_state WebSocket handler was calling computeLayout() without persisted pool group data. Fixed by persisting poolToGroup and poolGroups from state_snapshot in Zustand.
BF-003 — Diagnostician Bypassing VitalsMonitor
Closed · AD-350
DiagnosticianAgent was answering diagnose_system intents from LLM training memory instead of fetching live metrics from VitalsMonitorAgent. Fixed by adding scan_now() to VitalsMonitor and overriding perceive() in Diagnostician to fetch live metrics proactively.
Post-Wave Fixes
3 closed
BF-110 — Game Board Invisible to Agents
Closed
Agents can't see game board in proactive context: get_recent_activity() only returns top-level threads, not replies. Fixed: inject active game state directly from RecreationService into proactive context.
BF-109 — Qualification Probe Param Key Mismatch
Closed
_send_probe() used "message" key but perceive() reads "text". All prior qualification probe results unreliable. One-line fix.
BF-108 — LLM Unreachable — No Runtime Visibility
Closed
Fixed: MockLLMClient.get_health_status() reports mock/offline, runtime.llm_is_mock property exposes the state, chat endpoint shows an explicit offline message, and self-mod is suppressed when the client is a mock.