SWARMAI

Agentic OS Architecture

High-Level Design Document
Harness Engineering: How a Stateless LLM Becomes a Persistent, Evolving Agent
Version: 1.0  •  Date: March 26, 2026
Author: Xiaogang Wang (XG) + Swarm (AI Co-Architect)
Status: For PE / Tech Leadership Review
Classification: Internal

Table of Contents

  1. Executive Summary
  2. Architecture Overview
    2.1 Six-Layer Architecture
    2.2 The Compound Loop
  3. The Harness — Core Innovation
    3.1 Context Engineering
    3.2 Memory Pipeline
    3.3 Self-Evolution
    3.4 Safety & Self-Harness
  4. Intelligence Layer
    4.1 Autonomous Pipeline
    4.2 Proactive Intelligence
    4.3 Job System
  5. Session Architecture
    5.1 Multi-Tab Parallel Sessions
    5.2 Swarm Brain — Multi-Channel
  6. Interface Layer
    6.1 Three-Column Command Center
  7. Core Engine & Growth Trajectory
  8. Key Design Decisions & Tradeoffs
  9. Competitive Positioning
  10. Future Roadmap

1. Executive Summary

SwarmAI is a desktop application that wraps Claude's Agent SDK inside a harness — a structured layer of context management, persistent memory, self-evolution, and safety controls that transforms a stateless large language model into a persistent, evolving personal AI agent.

The core thesis is simple: most AI tools reset when you close them. Context is lost, decisions are forgotten, and users re-explain the same things session after session. SwarmAI solves this structurally, not through fine-tuning but through engineered knowledge persistence.

Key Innovation: The "Harness" — an 11-file context priority chain, 3-layer memory distillation pipeline, self-evolution registry, and 7 post-session hooks that create a compound loop: every session makes the next one better. Every correction prevents a class of future mistakes. The system doesn't just run; it compounds.

Key Metrics (as of March 2026)

Metric | Value
Commits | 613+
Built-in Skills | 55+
Context Files | 11 (P0–P10 priority chain)
Post-Session Hooks | 7 (auto-commit, DailyActivity, distillation, evolution, context-health, improvement, evolution-trigger)
Pipeline Stages | 8 (EVALUATE → REFLECT)
Session States | 5 (COLD → STREAMING → IDLE → WAITING_INPUT → DEAD)
Core Engine Level | L2 (Self-Improving) complete; L3 (Self-Governing) in progress
Channels | Desktop + Slack (unified brain)
Tech Stack | 4 languages: Rust (Tauri), TypeScript (React), Python (FastAPI), SQL (SQLite)

2. Architecture Overview

2.1 Six-Layer Architecture

SwarmAI's architecture is organized into six horizontal layers. Each layer has a clear responsibility boundary and communicates with adjacent layers through well-defined interfaces. The Harness layer (Layer 3) is the core innovation — it is what differentiates SwarmAI from a simple LLM wrapper.

[Figure 1 diagram. Layers and panels:
INTERFACE: SwarmWS Explorer, Chat Center (1-4 tabs), Swarm Radar, Channel Gateway
INTELLIGENCE: Proactive Intelligence, Signal Pipeline, Autonomous Pipeline, Job System
HARNESS: Context Engineering, Memory Pipeline, Self-Evolution, Safety + Self-Harness
SESSION: SessionRouter, SessionUnit (5-state), LifecycleManager, Post-Session Hooks
ENGINE: Claude Agent SDK, Bedrock / Anthropic API (Opus 4.6 (1M) + Sonnet 4.6), MCP Servers (GitHub, Slack, Sentral, Taskei, Builder), Skills Engine
PLATFORM: Tauri 2.0 (Rust), React 19 + TypeScript, FastAPI (Python), SQLite (WAL), Filesystem (local), launchd (macOS)
Compound loop: Session -> Hooks -> Memory -> Context -> Next Session]
Figure 1: SwarmAI Agentic OS Architecture — Six-layer design with the Harness as the core innovation layer
Layer | Responsibility | Key Components
Interface | User interaction surfaces: visual workspace, multi-tab chat, attention dashboard, multi-channel messaging | SwarmWS Explorer, Chat Center (1–4 tabs), Swarm Radar, Channel Gateway (Slack)
Intelligence | Proactive awareness, autonomous execution, background automation, external signal processing | Proactive Intelligence (L0–L4), Signal Pipeline, Autonomous Pipeline (8 stages), Job System (launchd)
Harness | The core innovation: transforms raw Claude into a persistent, evolving agent with safety controls | Context Engineering (11 files), Memory Pipeline (3-layer), Self-Evolution (55+ skills), Safety + Self-Harness
Session | Multi-session lifecycle management with process isolation, health monitoring, and automatic recovery | SessionRouter, SessionUnit (5-state), LifecycleManager (60s loop), 7 Post-Session Hooks
Engine | AI model access, tool ecosystem, external integrations | Claude Agent SDK, Bedrock/Anthropic API, MCP Servers (5+), Skills Engine
Platform | Desktop application infrastructure, all local, zero cloud dependency for user data | Tauri 2.0 (Rust), React 19 + TypeScript, FastAPI (Python), SQLite (WAL), local filesystem, launchd

2.2 The Compound Loop

The defining characteristic of SwarmAI's architecture is the compound loop — a feedback cycle where every session's output becomes the next session's input:

  1. Session executes — user interacts with the agent, decisions are made, code is written, files are created
  2. Hooks fire — 7 post-session hooks automatically capture: DailyActivity (raw logs), auto-commit (workspace changes), distillation (promote to memory), evolution (skill/correction updates), context-health (validate integrity), improvement (DDD writeback), evolution-trigger (gap detection)
  3. Memory updates — DailyActivity files accumulate; when ≥3 unprocessed files exist, distillation promotes recurring themes, key decisions, and corrections to MEMORY.md
  4. Context enriched — next session's system prompt is assembled from the updated 11-file chain, now containing the latest memory, evolution state, and project context
  5. Agent is smarter — the next session starts with full awareness of everything that happened, mistakes to avoid, and capabilities to leverage
Design Principle: Prevention over recovery. The compound loop is designed to make errors structurally impossible over time, not to handle them after they occur. Every correction captured in EVOLUTION.md prevents an entire class of future mistakes.
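
Step 2 of the loop can be sketched as a small hook runner. This is a minimal illustration, assuming hooks are plain callables executed in a fixed order and that one failing hook should not block the others. `SessionSummary`, `run_post_session_hooks`, and the ordering are hypothetical; only the seven hook names come from this document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SessionSummary:
    """Hypothetical container for what a finished session produced."""
    session_id: str
    decisions: List[str] = field(default_factory=list)


# Hook names as listed in this document; the execution order is an assumption.
HOOK_ORDER = [
    "DailyActivity", "auto-commit", "distillation", "evolution",
    "context-health", "improvement", "evolution-trigger",
]

Hook = Callable[[SessionSummary], None]


def run_post_session_hooks(summary: SessionSummary,
                           hooks: Dict[str, Hook]) -> Dict[str, str]:
    """Run every registered hook in order and isolate failures per hook."""
    results: Dict[str, str] = {}
    for name in HOOK_ORDER:
        hook = hooks.get(name)
        if hook is None:
            results[name] = "skipped"
            continue
        try:
            hook(summary)
            results[name] = "ok"
        except Exception as exc:  # a broken hook must not stop the rest
            results[name] = f"failed: {exc}"
    return results
```

Returning a per-hook status map keeps the loop observable: a failed distillation is visible in the result without losing the auto-commit that ran before it.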

3. The Harness — Core Innovation

The Harness is what makes SwarmAI more than a thin LLM wrapper. It is a structured layer of engineering between the user interface and the raw LLM that provides four critical capabilities: context continuity, memory persistence, self-improvement, and safety. Without the Harness, Claude is a stateless function. With it, Claude becomes a persistent, evolving agent that compounds value across sessions.

3.1 Context Engineering

Most AI tools assemble a single system prompt and send it with every request. SwarmAI maintains an 11-file priority chain (P0–P10) that is assembled, cached, and budget-managed through a multi-stage pipeline. This is the most token-intensive subsystem and the one with the highest impact on agent quality.

[Figure 2 diagram. Panels: Source Files (P0-P10, marked SAFE, TRIM, or HEAD-TRIM), Assembly Pipeline (ContextDirectoryLoader, PromptBuilder, SystemPromptBuilder, Token Budget Engine), Delivered to Claude SDK (system prompt, ~40K tokens used of a 100K budget). L1 cache: git-first freshness, mtime fallback.]
Figure 2: Context Engineering — 11-file priority chain with token budget management and L0/L1 caching

Priority Chain

Priority | File | Owner | Truncation | Purpose
P0 | SWARMAI.md | System | Never | Core identity & principles
P1 | IDENTITY.md | System | Never | Agent name, avatar, intro
P2 | SOUL.md | System | Never | Personality & tone
P3 | AGENT.md | System | Truncatable | Behavioral directives
P4 | USER.md | User | Truncatable | User preferences & background
P5 | STEERING.md | User | Truncatable | Session-level overrides
P6 | TOOLS.md | User | Truncatable | Tool & environment config
P7 | MEMORY.md | Agent | Head-trimmed | Persistent memory (newest kept)
P8 | EVOLUTION.md | Agent | Head-trimmed | Self-evolution registry
P9 | KNOWLEDGE.md | Auto | Truncatable | Domain knowledge index
P10 | PROJECTS.md | Auto | Lowest priority | Active projects index

Assembly Pipeline

  1. ContextDirectoryLoader.ensure_directory() — provisions/updates .context/ files from source templates
  2. ContextDirectoryLoader.load_all() — L1 cache check (git-first freshness), assemble from sources if stale
  3. PromptBuilder.build_system_prompt() — adds BOOTSTRAP.md, DailyActivity (today + yesterday), session briefing, metadata
  4. SystemPromptBuilder.build() — adds identity, safety rules, workspace info, datetime, runtime context
  5. Token Budget Engine — enforces budget (100K tokens for 1M context models), truncates P10–P3 as needed, never touches P0–P2
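
The budget step (step 5) can be illustrated with a short sketch. It assumes a crude estimate of four characters per token in place of a real tokenizer, and it assumes that head-trimmed files keep their newest content at the end of the file. `estimate_tokens`, `fit_to_budget`, and the `"never"` / `"tail"` / `"head"` mode labels are hypothetical names, not the actual Token Budget Engine API.

```python
from typing import Dict, List, Tuple

# (priority, filename, mode, text). Modes mirror the Priority Chain table:
# "never" (P0-P2), "tail" (truncatable), "head" (head-trimmed, newest kept).
ContextFile = Tuple[int, str, str, str]


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: about 4 characters per token."""
    return (len(text) + 3) // 4


def fit_to_budget(files: List[ContextFile], budget_tokens: int) -> Dict[str, str]:
    """Trim the lowest-priority files first until the total fits the budget."""
    texts = {name: text for _, name, _, text in files}
    total = sum(estimate_tokens(t) for t in texts.values())
    # Walk from P10 upward; files marked "never" are left untouched.
    for _, name, mode, text in sorted(files, key=lambda f: f[0], reverse=True):
        if total <= budget_tokens:
            break
        if mode == "never":
            continue
        have = estimate_tokens(text)
        keep_chars = 4 * max(0, have - (total - budget_tokens))
        if mode == "head":
            trimmed = text[len(text) - keep_chars:]  # drop the oldest part
        else:
            trimmed = text[:keep_chars]              # drop the end
        texts[name] = trimmed
        total -= have - estimate_tokens(trimmed)
    return texts
```

Because P0–P2 are skipped rather than trimmed, the result can still exceed the budget if those three files alone are too large; a real implementation would need to decide how to report that case.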

Key Design Decisions

3.2 Memory Pipeline

The memory pipeline is a three-layer distillation system that converts raw session activity into durable, curated knowledge. It solves the fundamental problem of AI amnesia: without this pipeline, every session starts from zero.

[Figure 3 diagram. Panels: Layer 1 Capture (chat session, DailyActivityExtractionHook), Layer 2 Raw Logs (DailyActivity/), Layer 3 Curated Memory (MEMORY.md, read at every session start), Git Cross-Reference, Weekly LLM Maintenance. Flow: Capture, Log (30 days), Verify (git), Promote, Prune (weekly).]
Figure 3: Memory Pipeline — Three-layer distillation from session capture to curated long-term memory

Three Layers

Layer | Storage | Lifecycle | Content
1. Capture | DailyActivity/YYYY-MM-DD.md | 30 days, then archived | Per-session: deliverables, git commits, decisions, lessons, next steps. Auto-extracted by DailyActivityExtractionHook
2. Distillation | Triggered when ≥3 unprocessed files | At session start (silent) | Recurring themes promoted; one-off noise filtered; implementation claims verified against git log
3. Curated Memory | MEMORY.md | Permanent (weekly LLM maintenance) | Open Threads (P0/P1/P2), Key Decisions, Lessons Learned, COE Registry, Recent Context
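
The two lifecycle rules in the table (distill at three or more unprocessed files, archive after 30 days) can be sketched as one planning function. This is an illustration only: `plan_memory_work` is a hypothetical name, and the choice to archive only files that have already been processed is an assumption, not something the document states.

```python
from datetime import date, timedelta
from typing import Iterable, List, Set, Tuple

DISTILL_THRESHOLD = 3   # ">=3 unprocessed files" from the table
RETENTION_DAYS = 30     # "30 days, then archived" from the table


def plan_memory_work(filenames: Iterable[str], processed: Set[str],
                     today: date) -> Tuple[bool, List[str]]:
    """Return (should_distill, files_to_archive) for DailyActivity/ files."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    unprocessed = 0
    to_archive: List[str] = []
    for name in sorted(filenames):
        if not name.endswith(".md"):
            continue
        try:
            day = date.fromisoformat(name[:-3])
        except ValueError:
            continue  # ignore files that are not named YYYY-MM-DD.md
        if name not in processed:
            unprocessed += 1
        elif day < cutoff:
            # Assumption: only already-distilled files are safe to archive.
            to_archive.append(name)
    return unprocessed >= DISTILL_THRESHOLD, to_archive
```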

Git Cross-Reference

This is a critical safety mechanism born from a real Sev-2 incident (COE C005). The distillation hook verifies all implementation claims against git log before promoting them to MEMORY.md. Without this, a mid-session DailyActivity snapshot captured before later commits can create false memories that compound across sessions. Claims that fail verification are tagged [UNVERIFIED] rather than promoted as fact.
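
A minimal sketch of the verification step, assuming each claim carries a keyword that should appear in some commit subject. That matching rule, and the names `git_log_subjects` and `tag_claims`, are hypothetical; the actual distillation hook may verify claims differently. The only external call is the standard `git log` CLI.

```python
import subprocess
from typing import Iterable, List, Tuple


def git_log_subjects(repo_dir: str, limit: int = 200) -> List[str]:
    """Return recent commit subjects by calling the `git log` CLI."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "-n", str(limit), "--pretty=format:%s"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def tag_claims(claims: Iterable[Tuple[str, str]],
               subjects: Iterable[str]) -> List[str]:
    """Keep verified claims as-is; prefix the rest with [UNVERIFIED].

    Each claim is (claim_text, keyword); a claim counts as verified when
    its keyword appears in at least one commit subject (case-insensitive).
    """
    lowered = [s.lower() for s in subjects]
    tagged = []
    for claim_text, keyword in claims:
        verified = any(keyword.lower() in s for s in lowered)
        tagged.append(claim_text if verified else f"[UNVERIFIED] {claim_text}")
    return tagged
```

Keeping `tag_claims` free of I/O means the tagging rule can be tested against a fixed list of subjects without a repository.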

Weekly LLM Maintenance

A scheduled background job (self-tune) performs weekly maintenance: prune resolved Open Threads, archive stale Recent Context entries (>30 days), verify Key Decisions still reflect reality. This prevents memory bloat and ensures the agent's knowledge stays current.

3.3 Self-Evolution

Self-evolution is the capability that makes SwarmAI get better over time. When the agent encounters a capability gap, it doesn't just fail — it can build a new skill, test it, and register it for future sessions. When the user corrects a mistake, the correction is captured permanently so the same class of error never recurs.

[Figure 4 diagram. Panels around the EVOLUTION.md persistent registry: Gap Detection (s_self-evolution, always active), Skill Building (s_skill-builder, 3 attempts max), Correction Capture (never deleted), Competence Registry (0 usage for 30 days = archived), 55+ Built-in Skills.]
Figure 4: Self-Evolution — Capability building, correction capture, and continuous growth loop

EVOLUTION.md Registry

The persistent registry tracks five categories of evolutionary data:

Category | Lifecycle | Examples
Capabilities Built | Active until 0 usage for 30 days → archived | Browser agent, context monitor, workspace finder
Optimizations Learned | Permanent | Use CDP over WebSocket for persistent browser sessions
Corrections Captured | Permanent (never deleted) | Reported features as "not started" when fully shipped (C005)
Competence Learned | Cross-referenced | SSE streaming pipeline, multi-session re-architecture
Failed Evolutions | Permanent | Approaches attempted and abandoned (with reasons)
Design Principle: Corrections are the highest-value entries. They represent proven failure modes with known patterns. Deleting a correction is equivalent to removing a safety guard. The registry is append-mostly; corrections are append-only.
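
The two lifecycle rules above (corrections are append-only, unused capabilities are archived after 30 days) can be sketched in memory as follows. This is an illustration under assumptions: the real registry is the EVOLUTION.md file, and `EvolutionRegistry`, `Capability`, and their methods are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List, Tuple


@dataclass
class Capability:
    name: str
    last_used: date
    archived: bool = False


@dataclass
class EvolutionRegistry:
    capabilities: Dict[str, Capability] = field(default_factory=dict)
    _corrections: List[str] = field(default_factory=list)

    def add_correction(self, text: str) -> None:
        """Corrections can only be appended; no delete method exists."""
        self._corrections.append(text)

    @property
    def corrections(self) -> Tuple[str, ...]:
        """Read-only view, so callers cannot drop entries by accident."""
        return tuple(self._corrections)

    def record_use(self, name: str, when: date) -> None:
        """Register a use; using an archived capability reactivates it."""
        cap = self.capabilities.setdefault(name, Capability(name, when))
        cap.last_used = max(cap.last_used, when)
        cap.archived = False

    def archive_unused(self, today: date, idle_days: int = 30) -> List[str]:
        """Archive capabilities with no recorded use in the last idle_days."""
        cutoff = today - timedelta(days=idle_days)
        newly_archived = []
        for cap in self.capabilities.values():
            if not cap.archived and cap.last_used < cutoff:
                cap.archived = True
                newly_archived.append(cap.name)
        return newly_archived
```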

Gap Detection (Reactive + Proactive)

3.4 Safety & Self-Harness

Safety is not a feature — it is a structural property of the architecture. SwarmAI implements defense-in-depth through multiple independent safety layers:

Layer | Mechanism | Details
Tool Logger | Audit trail | Every tool invocation logged with timestamp, parameters, and result
Command Blocker | Pattern matching | 13 dangerous patterns blocked (rm -rf, DROP TABLE, force push, etc.)
Permission Dialog | Human approval | First-time external actions require explicit approval; approvals persist
Bash Sandbox | Claude SDK sandbox | Filesystem write restrictions, network allowlists, process isolation
Escalation Protocol | Confidence-gated | 3 levels: INFORM (act + tell), CONSULT (options + ask), BLOCK (stop + wait)
ContextHealthHook | Integrity validation | Light mode (every session): file existence, format. Deep mode (weekly): DDD staleness, cross-reference
Decision Classification | Judgment framework | Every decision in autonomous pipeline tagged: mechanical (auto), taste (batch at delivery), judgment (block for human)
The Self-Harness subsystem (context_health_hook.py) performs continuous validation: checking that all 11 context files exist and parse correctly, detecting DDD document staleness, auto-refreshing KNOWLEDGE.md and PROJECTS.md indexes, and auto-committing workspace changes.
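
The Command Blocker row in the table above describes pattern matching over shell commands. A minimal sketch of that idea follows, covering only the three patterns this document names. The regular expressions are illustrative guesses, not the shipped rules, and `check_command` is a hypothetical function name.

```python
import re
from typing import Optional

# Three of the 13 patterns are named in this document. The regexes below
# are illustrative guesses and deliberately simple, not the real rule set.
BLOCKED_PATTERNS = {
    "rm -rf": re.compile(r"\brm\s+-\w*(rf|fr)"),
    "DROP TABLE": re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    "force push": re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"),
}


def check_command(command: str) -> Optional[str]:
    """Return the name of the first blocked pattern, or None if allowed."""
    for name, pattern in BLOCKED_PATTERNS.items():
        if pattern.search(command):
            return name
    return None
```

Pattern matching of this kind is easy to evade (for example with separated flags such as `rm -r -f`), which is one reason the document treats it as a single layer among several rather than a complete safeguard.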

4. Intelligence Layer

The Intelligence layer sits above the Harness and provides proactive awareness, autonomous execution capabilities, and background automation. While the Harness ensures the agent remembers and improves, the Intelligence layer ensures it anticipates, acts, and automates.

4.1 Autonomous Pipeline

The autonomous pipeline drives the full development lifecycle from a one-sentence requirement to a PR-ready delivery. It is the implementation of AIDLC Phase 3 (AI-Management) where AI makes autonomous decisions and humans step in only when needed.

[Figure 5 diagram. Panels: eight stages (EVALUATE, THINK, PLAN, BUILD, REVIEW, TEST, DELIVER, REFLECT), Methodology Stack (DDD, SDD, TDD), Safety Mechanisms (ROI Gate, Decision Classification, Escalation Protocol, WTF Gate), Infrastructure (per-run artifact isolation, token budget tracking, background execution, pipeline validator, 5 profiles: full/trivial/research/docs/bugfix).]
Figure 5: Autonomous Pipeline — 8-stage lifecycle with DDD+SDD+TDD methodology and safety mechanisms

Eight Stages

Stage | Output | Gate
EVALUATE | ROI score, GO/DEFER/REJECT recommendation | ROI ≥ 3.5 to proceed
THINK | 3 alternatives (Minimal/Ideal/Creative) with tradeoffs | User picks approach
PLAN | Design doc (SDD) with acceptance criteria, file list, effort | Design approval
BUILD | Code + tests (TDD: RED → GREEN → VERIFY) | All tests pass
REVIEW | Code quality scan + security scan (confidence-gated) | No high-severity findings
TEST | Full test suite, regression check | WTF Gate (halt if fixes get risky)
DELIVER | PR description, decision log, delivery report | Taste decisions batched for review
REFLECT | Lessons written to IMPROVEMENT.md |
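
The stage-and-gate structure in the table can be sketched as a loop that halts at the first failed gate. This is a simplified illustration: `run_pipeline` and `roi_gate` are hypothetical names, gates are modeled as zero-argument callables, and only the stage order and the 3.5 ROI threshold come from this document.

```python
from typing import Callable, Dict, List, Optional, Tuple

STAGES = ["EVALUATE", "THINK", "PLAN", "BUILD",
          "REVIEW", "TEST", "DELIVER", "REFLECT"]
ROI_THRESHOLD = 3.5  # "ROI >= 3.5 to proceed" from the table


def roi_gate(score: float) -> bool:
    """EVALUATE gate: proceed only when the ROI score meets the threshold."""
    return score >= ROI_THRESHOLD


def run_pipeline(
    gates: Dict[str, Callable[[], bool]],
) -> Tuple[List[str], Optional[str]]:
    """Run stages in order; return (completed_stages, stage_that_halted)."""
    completed: List[str] = []
    for stage in STAGES:
        gate = gates.get(stage)
        if gate is not None and not gate():
            return completed, stage  # halt here and hand control to a human
        completed.append(stage)
    return completed, None
```

For example, `run_pipeline({"EVALUATE": lambda: roi_gate(3.0)})` halts immediately at EVALUATE, while a failing TEST gate leaves the first five stages completed.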

Methodology Stack: DDD + SDD + TDD

Three methodologies form a closed loop: DDD (4 project documents) provides autonomous judgment — "should we build this?". SDD (design doc with acceptance criteria) produces specs — "here's exactly what to build". TDD (tests before code) verifies delivery — "proof we built it correctly". The key insight: when no human reviews every line, the test suite IS the quality gate.

4.2 Proactive Intelligence

The proactive intelligence subsystem (1,142 lines, 106+ tests) provides session-start briefings and suggestions through five levels of analysis (L0–L4):

Level | Capability | How
L0 | Parsing | Extract structured data from DailyActivity, MEMORY.md, open threads
L1 | Temporal awareness | Time-sensitive items, deadlines, recency weighting
L2 | Scoring engine | Priority × staleness × frequency × blocking × momentum scoring per item
L3 | Cross-session learning | JSON-persisted learning state: skip penalty for ignored suggestions, affinity bonus for accepted ones
L4 | Signal highlights | Reads signal_digest.json for external intelligence (HN, RSS, GitHub); effectiveness scoring with trend detection
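
The L2 and L3 rows can be illustrated together: a multiplicative score per item, then an adjustment from learned skip and accept counts. This is a sketch under assumptions. The factor scales, the penalty and bonus sizes (0.1 each), and the names `Item`, `base_score`, `adjusted_score`, and `rank` are all hypothetical; only the five factors and the skip-penalty / affinity-bonus idea come from the table.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    title: str
    priority: float
    staleness: float
    frequency: float
    blocking: float
    momentum: float


def base_score(item: Item) -> float:
    """L2: product of the five factors named in the table."""
    return (item.priority * item.staleness * item.frequency
            * item.blocking * item.momentum)


def adjusted_score(item: Item, skips: Dict[str, int], accepts: Dict[str, int],
                   skip_penalty: float = 0.1,
                   affinity_bonus: float = 0.1) -> float:
    """L3: scale the base score down per ignored and up per accepted suggestion."""
    factor = (1.0
              - skip_penalty * skips.get(item.title, 0)
              + affinity_bonus * accepts.get(item.title, 0))
    return base_score(item) * max(0.0, factor)


def rank(items: List[Item], skips: Dict[str, int],
         accepts: Dict[str, int]) -> List[Item]:
    """Order suggestions from highest to lowest adjusted score."""
    return sorted(items, key=lambda i: adjusted_score(i, skips, accepts),
                  reverse=True)
```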

4.3 Job System

The job system provides background automation that runs independently of chat sessions. Jobs are scheduled via macOS launchd and execute as headless Claude CLI processes with full MCP tool access.
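
As an illustration of launchd scheduling, the sketch below builds a job definition with Python's standard `plistlib`. `Label`, `ProgramArguments`, `StartCalendarInterval`, and `RunAtLoad` are standard launchd keys; the label, the wrapper script path, and its `--job` argument are hypothetical placeholders, not the actual SwarmAI job definitions.

```python
import plistlib
from typing import List


def build_launchd_plist(label: str, program_args: List[str],
                        hour: int, minute: int = 0) -> bytes:
    """Serialize a daily launchd job definition to plist XML bytes."""
    job = {
        "Label": label,
        "ProgramArguments": program_args,
        # Fire once a day at hour:minute (local time).
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
        "RunAtLoad": False,
    }
    return plistlib.dumps(job)


# Hypothetical example: a daily 08:00 job that runs a placeholder wrapper.
EXAMPLE_PLIST = build_launchd_plist(
    "com.example.swarm.morning-inbox",
    ["/usr/bin/python3", "/path/to/run_job.py", "--job", "morning-inbox"],
    hour=8,
)
```

The resulting bytes would typically be written to a `.plist` file under `~/Library/LaunchAgents/` and loaded with `launchctl`; because launchd owns the schedule, the job fires even when the desktop app is closed.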

5. Session Architecture

The session layer manages the lifecycle of Claude subprocess instances. It replaced a monolithic AgentManager (5,428 lines) with four focused components during the v7 re-architecture in March 2026. The decomposition was driven by a real need: supporting parallel chat tabs and dedicated channel slots without resource exhaustion.

5.1 Multi-Tab Parallel Sessions

[Figure 6 diagram. Panels: SessionRouter, Tab 1 (STREAMING, protected from eviction), Tab 2 (IDLE, evictable, 12hr TTL), Tab 3 (available, spawned on demand, RAM-adaptive 1-4 tabs), Channel (1 dedicated slot), LifecycleManager, SessionRegistry, 5-State Machine. Each tab has its own subprocess, state machine, and error recovery, so a crash in Tab 1 does not affect Tab 2.]
Figure 6: Multi-Tab Parallel Sessions — SessionRouter, 5-state SessionUnits, and dedicated channel slot

Four Components

Component | File | Responsibility
SessionRouter | session_router.py | Thin routing layer. Slot acquisition with IDLE eviction. Queue timeout (60s). Maps session_id → SessionUnit. MAX_CONCURRENT=2.
SessionUnit | session_unit.py | One unit per session. 5-state machine (COLD → STREAMING → IDLE → WAITING_INPUT → DEAD). Owns subprocess spawn, 3x exponential retry with --resume, SSE streaming.
LifecycleManager | lifecycle_manager.py | Background loop (60s interval). 12hr TTL kill, health check, DEAD→COLD cleanup, startup orphan reaper.
SessionRegistry | session_registry.py | Module-level singletons. initialize() wires all components at startup. configure_hooks() for post-session hooks.
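
Slot acquisition with IDLE eviction can be sketched as a pure function over a map of session states. This is an illustration under assumptions: which states count as holding a slot, and the choice to move an evicted unit back to COLD, are guesses; `acquire_slot` is a hypothetical name. The five state names and MAX_CONCURRENT=2 come from the table above.

```python
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    COLD = "COLD"
    STREAMING = "STREAMING"
    IDLE = "IDLE"
    WAITING_INPUT = "WAITING_INPUT"
    DEAD = "DEAD"


MAX_CONCURRENT = 2  # from the SessionRouter row

# Assumption: these states hold a live subprocess and therefore a slot.
ALIVE = {State.STREAMING, State.IDLE, State.WAITING_INPUT}


def acquire_slot(units: Dict[str, State], session_id: str) -> Optional[str]:
    """Try to give session_id a slot, mutating `units` in place.

    Returns "granted", "evicted:<id>" when an IDLE unit was evicted to
    make room, or None when every slot is busy and the caller must queue
    (the document cites a 60s queue timeout).
    """
    if units.get(session_id) in ALIVE:
        units[session_id] = State.STREAMING
        return "granted"
    alive = [sid for sid, state in units.items() if state in ALIVE]
    if len(alive) < MAX_CONCURRENT:
        units[session_id] = State.STREAMING
        return "granted"
    for sid in alive:
        if units[sid] is State.IDLE:  # only IDLE units are evicted
            units[sid] = State.COLD   # assumption: evicted units go COLD
            units[session_id] = State.STREAMING
            return f"evicted:{sid}"
    return None
```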

Key Invariants

5.2 Swarm Brain — Multi-Channel

The Swarm Brain architecture ensures that regardless of which channel the user communicates through — desktop chat tab, Slack, or future platforms — it is always the same Swarm, same memory, same context.

[Figure 7 diagram. Panels: L1 Shared Memory (MEMORY.md, USER.md, EVOLUTION.md, DailyActivity, Knowledge/, +6 more files), L3 Active Session Digest (cross-session awareness), Chat Tabs (independent, parallel, per-topic), Channel Session (shared, serialized, 1 dedicated slot, Claude --resume), User Identity Map (platform user IDs mapped to one user_key).]
Figure 7: Swarm Brain — One AI, every channel, shared memory with three layers of continuity

Three Layers of Continuity

Layer | Mechanism | Scope
L1: Shared Memory | 11 context files loaded at every prompt build | All sessions (tabs + channels)
L2: Cross-Channel Session | All channels share ONE Claude conversation (--resume) | Slack + future
L3: Active Session Digest | Sibling session summaries injected into prompts | Tabs ↔ Channels (bidirectional)

Adding a new channel: Write an adapter (~250 lines implementing ChannelAdapterBase), register in the gateway, map user identity. Zero architecture change required.
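
The three steps (write an adapter, register it, map identity) can be sketched as follows. Only the class name `ChannelAdapterBase` comes from this document; its methods, the `Gateway` class, and the `EchoAdapter` example are hypothetical and exist purely to show the shape of the extension point.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class ChannelAdapterBase(ABC):
    """Name from this document; the methods below are hypothetical."""
    name: str

    @abstractmethod
    def parse_incoming(self, payload: dict) -> Tuple[str, str]:
        """Return (platform_user_id, message_text) from a raw payload."""

    @abstractmethod
    def format_outgoing(self, text: str) -> dict:
        """Wrap reply text in the platform's message format."""


class EchoAdapter(ChannelAdapterBase):
    """Toy adapter standing in for a real platform integration."""
    name = "echo"

    def parse_incoming(self, payload: dict) -> Tuple[str, str]:
        return payload["user"], payload["text"]

    def format_outgoing(self, text: str) -> dict:
        return {"text": text}


class Gateway:
    def __init__(self) -> None:
        self._adapters: Dict[str, ChannelAdapterBase] = {}
        # (channel, platform_user_id) -> shared user_key
        self._identity: Dict[Tuple[str, str], str] = {}

    def register(self, adapter: ChannelAdapterBase) -> None:
        self._adapters[adapter.name] = adapter

    def map_identity(self, channel: str, platform_user_id: str,
                     user_key: str) -> None:
        self._identity[(channel, platform_user_id)] = user_key

    def route(self, channel: str, payload: dict) -> Optional[Tuple[str, str]]:
        """Resolve a message to (user_key, text); None for unknown users."""
        user_id, text = self._adapters[channel].parse_incoming(payload)
        user_key = self._identity.get((channel, user_id))
        return None if user_key is None else (user_key, text)
```

Resolving every platform identity to one `user_key` before the message reaches the session is what lets several channels feed a single shared conversation.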

6. Interface Layer

6.1 Three-Column Command Center

The interface is designed as a single integrated system where the Chat Center orchestrates everything. The three columns are not independent panels — they are views into one unified workspace connected by drag-to-chat context injection.

[Figure 8 diagram. Panels: SwarmWS Explorer (Knowledge/, Projects/, .context/; git-tracked + ETag polling), Chat Center (tabs with example requests; 55+ skills, SSE streaming, per-tab state), Swarm Radar (ToDos, Active Sessions, Background Jobs: morning inbox daily 8am, self-tune daily, signal digest daily), Drag-to-Chat (drag any file from Explorer or any ToDo/Artifact from Radar into Chat).]
Figure 8: Three-Column Command Center — SwarmWS Explorer, Chat Center, and Swarm Radar with drag-to-chat
Column | Purpose | Key Interactions
SwarmWS Explorer (left) | Persistent local workspace: Knowledge/, Projects/, .context/, DailyActivity/ | Git-tracked with ETag polling (5s). Drag files to chat for instant context. Agent reads/writes/organizes/commits directly.
Chat Center (center) | Multi-session command surface with 1–4 parallel tabs | SSE streaming, per-tab state isolation, 55+ skills, MCP tools. Controls both Explorer (write files) and Radar (create todos).
Swarm Radar (right) | Attention dashboard: ToDos, active sessions, artifacts, background jobs | Drag ToDo/artifact to chat — agent gets full work packet. Background job results appear here. Session status visible.

7. Core Engine & Growth Trajectory

The Swarm Core Engine is the meta-architecture that ties all six flywheels together. Each flywheel feeds the others, creating compound growth: memory informs context, context improves sessions, sessions trigger evolution, evolution builds skills, skills improve memory capture, and the cycle continues.

[Figure 9 diagram. Flywheels arranged around the compound loop: Self-Evolution, Self-Memory, Self-Context, Self-Health, Self-Jobs, Self-Harness. Growth trajectory: L0 Reactive, L1 Self-Maintaining, L2 Self-Improving, L3 Self-Governing, L4 Autonomous.]
Figure 9: Swarm Core Engine — Six interconnected flywheels with compound loop and growth trajectory

Growth Trajectory

Level | State | Capabilities | Status
L0 | Reactive | Responds to questions, no memory between sessions | Passed
L1 | Self-Maintaining | Remembers, self-commits, captures corrections, monitors health, auto-generates DailyActivity | Complete
L2 | Self-Improving | Weekly LLM maintenance, unified job system, feedback loops closed (10/12), stale correction detection | Complete
L3 | Self-Governing | Context adapts per session type, proactive gap detection, DDD auto-sync, growth metrics | In Progress (3/6)
L4 | Autonomous | Full AIDLC pipeline with checkpoint/resume, self-directed learning, human-in-the-loop judgment framework | Planned

8. Key Design Decisions & Tradeoffs

Decision | Choice | Alternative Considered | Rationale
Memory architecture | 3-layer distillation (file-based) | Vector database (RAG) | Files are git-trackable, human-readable, editable. Vector DB adds latency, opacity, and a dependency. File-based memory can be inspected, corrected, and version-controlled.
Session management | 4-component decomposition | Monolithic AgentManager | 5,428-line God Object caused 15+ bugs during v7 migration (COE). Decomposition into Router/Unit/Lifecycle/Registry enabled parallel sessions and clean error boundaries.
Context assembly | 11-file priority chain with budget | Single large system prompt | Priority-based truncation ensures identity and safety survive even under extreme context pressure. Budget management prevents context overflow crashes.
Channel architecture | Shared session (serialized) | Independent sessions per channel | "One brain" principle: whatever the user says on Slack, every other channel knows too. Independent sessions fragment the agent's understanding of the user.
Skill system | SKILL.md instruction files | Compiled plugins / function registry | SKILL.md files are LLM-native: the agent reads them as natural language instructions. No compilation step, no registration API. A new skill is a markdown file.
Data storage | All local (SQLite + filesystem) | Cloud database | Zero cloud dependency for user data. Privacy by default. Works offline. No account required. User owns all their data.
Safety model | Defense-in-depth (7 layers) | Single permission gate | No single layer is sufficient. Tool logger + command blocker + sandbox + permission dialog + escalation + health hook + decision classification provide redundant protection.
Background jobs | macOS launchd | In-process cron / cloud scheduler | launchd survives app restarts, runs when app is closed, managed by OS. In-process cron dies with the app. Cloud scheduler adds dependency.

9. Competitive Positioning

SwarmAI occupies a unique position in the AI tooling landscape: it is not a code editor (Cursor/Windsurf), not an IDE (Kiro), not a CLI agent (Claude Code), and not a multi-platform connector (OpenClaw). It is an agentic operating system that optimizes for depth over breadth.

Capability | SwarmAI | Claude Code | Kiro | Cursor | OpenClaw
Persistent memory | 3-layer pipeline | CLAUDE.md (manual) | Per-project specs | Per-project | Session pruning
Context system | 11-file P0-P10 + budgets | Single prompt | Spec-driven | Codebase indexing | Standard prompt
Multi-session | 1-4 parallel tabs | 1 session | 1 session | 1 session | Per-channel
Self-evolution | 55+ skills, corrections | No | No | No | No
Autonomous pipeline | 8-stage + DDD+TDD | Manual | Spec-driven | No | No
Multi-channel | Unified brain | Terminal | IDE only | IDE only | 21+ (isolated)
Scope | All knowledge work | Coding | Coding | Coding | Messaging + skills
Core Differentiator: The Harness. Every competitor either provides a raw LLM with a chat interface (Cursor, Claude Code) or a skill marketplace with session management (OpenClaw). None provides the compound loop of context engineering + memory distillation + self-evolution + safety harness that makes an AI agent genuinely improve over time.

10. Future Roadmap

Phase | Target | Key Deliverables
L3 Completion | Q2 2026 | Growth metrics dashboard, DDD auto-sync, stale correction auto-healing, full session-type optimization
L4 Autonomous | Q3 2026 | Full AIDLC pipeline with checkpoint/resume across sessions, self-directed learning loops, judgment framework with calibrated confidence
MCP Gateway | When SDK supports | Shared MCP server instances across sessions (currently 4 sessions × 5 MCPs = 20 instances). Reduces memory from ~2.9GB to ~750MB.
Multi-User | Q4 2026 | Team workspace with shared projects, role-based access, collaborative memory (separate from personal memory)
Cross-Platform | Q4 2026 | Linux support (currently macOS + Windows). launchd → systemd adaptation for background jobs.