Agent Security Posture — OrchestKit vs NanoClaw vs Hermes

Pick an attack, press play, watch each system respond step by step. The honest verdict up front: harnesses contain a hostile agent at a hardware boundary; OrchestKit only blocks known-bad patterns — it has no containment. A seatbelt, not a roll cage.

Harness layer
NanoClaw · Hermes

Own the runtime. Sandbox at the OS: container / microVM per agent. Answer: “can the agent escape onto my host?”

Plugin layer
OrchestKit

Runs inside a harness. Guards at PreToolUse: deny-rules, write-time secret block, supply-chain gates. Answer: “should this command run at all?” — but a guard is advisory in-process text, not a trust boundary.

⚠️

The question that matters: “if the agent turns hostile, am I contained?”  NanoClaw: yes (microVM). Hermes (Docker): yes; Hermes (local default): no. OrchestKit: no — bypass one regex or jailbreak the model once, and your host (~/.ssh, ~/.aws, .env, every other repo) is fully reachable. ork is prevention; they are containment. Prevention-by-denylist loses to novel input.

Scenario

agent@session
step 1 of 1
Blocked / private — defended here Contained — runs but can't escape Asks / partial Exposed — executes on host Unverified

What NanoClaw & Hermes genuinely do better — the part I downplayed

✕ containment
A real trust boundary ork has none of

microVM / seccomp / cap-drop holds even when everything inside is hostile. ork’s denylist stops known-bad; base64, $IFS, char-by-char script writes, python -c all walk through. One bypass → full host.

✕ credential isolation
Vault vs ambient access

NanoClaw’s OneCLI vault means the agent never holds raw keys (injected per-request, rate-limited). ork’s session has full ambient access to every key already on disk. The secret-scanner only stops a key being written, not read + exfiltrated.

✕ egress control
ork has zero, today

A network-namespaced container can rule out exfil destinations. ork can’t — the agent can curl your data anywhere. (This is improvement #2 below, not a shipped feature.)

~ scrutiny & posture
Audited & assume-breach

Hermes took an independent audit (#7826). ork’s security is self-asserted. And their architecture is “never trust the agent, contain it” — the correct post-prompt-injection stance. ork’s is “trust the agent, catch mistakes.”

Where I oversold ork last turn — honest reframe

“Immune to skill-poisoning” → ork has no runtime self-improvement at all; it’s immune because the feature is absent. Crash-proof because there’s no engine — not a win on equal terms.
“Supply-chain hygiene wins” → true, but that protects against a malicious ork distribution, not against the agent on your box. Different threat; I conflated them.
What actually holds up: telemetry discipline (off-by-default, self-hosted, hashed — beats NanoClaw’s unverified); write-time secret-commit block (narrow but real); hooks fire in every permission mode. All prevention. None is containment.

Where OrchestKit should grow its sandboxing story — top 3

▲ 1 · highest leverage
Nudge on CC's native Bash-sandbox

Claude Code ships an opt-in OS Bash-sandbox (Seatbelt/bwrap). ork:doctor reads settings.local.json for sandbox.enabled and nudges /sandbox-on with a sane allowlist. Honest limits: no runtime detection API, and it's Bash-only (Read/Write/MCP unsandboxed) — raises the floor, not the ceiling.

effort: Spartial: bash egress + write
▲ 2 · closes the one real gap
Network-egress / remote-exec guard hook

One PreToolUse Bash guard that asks on outbound curl/wget/nc to non-allowlisted hosts and on staged curl→sh / eval $(curl) / base64-decode-exec. Pure policy-layer win.

effort: Scloses: curl|sh exfil
▲ 3 · honest positioning
Documented hardened-harness recipe

Ship a reference devcontainer.json + “run CC+ork inside NanoClaw’s microVM” guide. OrchestKit = the policy layer; here’s how to pair it with an isolation layer for the strongest posture.

effort: Munlocks: compose story

🛡 Defense architecture — how OrchestKit actually judges a command

Everything below is prevention: it decides whether a tool call runs. None of it is containment (see the missing layer 0). This is the honest design, faithfully traced from src/hooks/src/entries/pretool.ts.

1 · The PreToolUse pipeline — two lanes, one verdict

Synchronous blocking lane FAIL-CLOSED · can DENY / ASK
Bash calltool_input.command
compound-splita&&b ; c|d → parts
dangerous-command3-tier match
git-validatorbranch / push
agent-browser-safetyautomation guard
DENY · catastrophic ASK · escalate to you ALLOW · silent
Async advisory lane NON-BLOCKING · hints only
default-timeouterror-pattern-warnerconflict-predictoraffected-testspre-commit-test-gateci-simulationversion-syncchangelog-gengh-label / milestonecommit-atomicity+ more

The key design move: compound-split runs first so echo ok && rm -rf / can't smuggle a dangerous command past the matcher by hiding behind a harmless prefix. Security guards are synchronous (the command waits on the verdict); advisory hooks are async and can never block — so a slow hint never delays you, and a security verdict is never skipped.

2 · Defense in depth — 3 enforcement layers, and the one that's missing

runtime · every mode
Layer 3 — runtime
PreToolUse hook guards

The pipeline above. Fires in every permission mode (incl. skip-permissions) — the deny rules get bypassed there, the hooks don't.

runtime · config
Layer 2 — config
settings.json permission deny-rules

Bash(rm -rf /), chmod 777, force-push to main/master, mkfs, dd if=, fork-bomb — hard CC-level deny.

pre-merge · CI
Layer 1 — supply chain
security-test CI gate (16 suites)

npm run test:security must pass before merge — command-injection, dep-confusion, secret-scanning, path-traversal, symlink, unicode, npm-audit (moderate+, 4 trees).

opt-in
Layer 0 — isolation
⚠ available in the harness — ork doesn't manage it

Claude Code has a native OS sandbox (Seatbelt / bubblewrap, sandbox.* settings, since 2.1.83). But it's opt-in and ork neither enables nor requires it. With it OFF (the common default), a bypassed guard has nothing underneath. Turn it on — or run in a container — and there's a real boundary. ork ships zero sandbox config by design.

3 · The guard registry — what's actually wired (security-relevant subset)

dangerous-command-blockerDENY/ASK

3-tier: DENY catastrophic (rm -rf /, fork-bomb, DROP DATABASE, |sh), ASK gray-zone (terraform destroy, sudo, kill), ALLOW rest.

pretool/bash/dangerous-command-blocker.ts
compound-command-validatorNORMALIZE

Splits && ; | $( ) so a dangerous command can't evade matching by hiding in a compound. Anti-evasion.

pretool/bash/compound-command-validator.ts
git-validatorDENY/ASK

Branch protection (no direct commit to main/dev), branch-naming, commit-message, atomic-commit, stacked-PR delete guard.

pretool/bash/git-validator.ts
content-secret-scannerDENY

15 high-confidence patterns (OpenAI/Anthropic/GitHub keys…) on Write/Edit content. OWASP ASI02/03 — blocks the secret before it lands.

pretool/write-edit/content-secret-scanner.ts
file-guardDENY

Protects sensitive files (.env, keys); resolves symlinks first (anti-bypass); enforces file-size gate (300 / 500 test lines).

pretool/write-edit/file-guard.ts
security-pattern-validatorWARN/ASK

Flags risky code on Write — eval, dynamic exec, shell-string interpolation, injection-shaped patterns.

pretool/Write/security-pattern-validator.ts
memory-validatorVALIDATE

Sanitizes MCP memory writes — untrusted graph input validated before it persists to the knowledge graph.

pretool/mcp/memory-validator.ts
cron-guardGUARD

CI / headless guard on CronCreate — prevents unattended scheduling escapes in non-interactive runs.

pretool/cron-guard.ts

4 · Pre-merge supply-chain gate — npm run test:security (must pass to merge)

test-command-injectiontest-dependency-confusiontest-secret-scanningtest-npm-audittest-compound-commandstest-path-traversaltest-symlink-attackstest-unicode-attackstest-line-continuation-bypasstest-jq-injectiontest-sqlite-injectiontest-mcp-deny-casetest-input-validationtest-packaging-leakstest-additional-securityrun-security-tests

Honest caveat on this layer: several of these (command-injection, path-traversal, unicode-attacks, jq/sqlite-injection) harden the guards' own parsers against malicious input — i.e. they secure the security tool, not your machine from the agent. Necessary, but it's meta-security, not containment.

🧭 “Do we use a sandbox?” — the trust-boundary map

The diff in one diagram. A boundary (the box) is what holds when the agent turns hostile. ork's guards always live inside the boundary — they pick what runs; the boundary picks what can escape.

Does ork use a sandbox? No. ork is a plugin — hooks running inside Claude Code. A plugin can't wrap an OS boundary around the process that loads it. Wrong layer.
So am I unprotected? Not necessarily — Claude Code itself has a native sandbox (/sandbox → Seatbelt on macOS, bubblewrap on Linux). ork just doesn't enable or require it. Most people run with it off.
Should ork “be like them”? It can't be a sandbox (it'd have to become a harness — a different product). It can orchestrate the one already under it. That's the fit.
ork on vanilla CCsandbox OFF — the common case
your host · keys · network
✕ no boundary
agent + ork guards
Hostile agent = full host. Guards are a lock on a door with no walls.
ork + CC Bash-sandbox/sandbox ON
your host · keys · network
◐ Bash sandbox (Seatbelt/bwrap)
agent + ork guards
Partial. Real egress-allowlist + write-scope — but Bash only.
Read/Write tools, MCP & hooks run unsandboxed; ~/.ssh still readable by default.
ork + container / microVMdevcontainer or NanoClaw
your host
▓ container / microVM
agent + ork guards
Full boundary + ork's policy inside. Strongest ork posture.
Hermes (Docker backend)local default = no box
your host
▓ container · cap-drop · tmpfs
agent (env-filtered)
Full boundary in Docker mode. ⚠ local default mode has none.
NanoClawmicroVM per agent
your host
▓ microVM per agent
agent
🔑 keys via OneCLI vault — outside the agent
Full boundary + credential isolation. The reference bar.
Read it left→right: the only stacks where a hostile agent can't reach your host have a solid box around the agent. ork on vanilla CC has a red gap — that's the honest picture. ork's value (the guards) is identical in every column; what changes is whether there's a wall behind the lock. ork can't build the wall — it can only tell you to turn one on.

Does sandboxing “fit” ork? — the 3 honest options

✓ FITS · effort S
Orchestrate CC's native sandbox

ork:doctor reads settings.local.json for sandbox.enabled and nudges you to /sandbox-on (+ a recommended fs/network allowlist). Caveat: no runtime detection API, and it's Bash-only — so this raises the floor, not the ceiling.

✓ FITS · effort M
Ship a hardened container recipe

A reference devcontainer.json + “run CC+ork inside a container / NanoClaw microVM” guide. This is the real full boundary — column C. ork stays the policy layer; the container is the wall.

✕ DOESN'T FIT
Become a harness like NanoClaw

To own an OS boundary, ork would have to stop being a CC plugin and become a runtime that spawns containers. That's a different product, not a feature. Don't.

Research: NanoClaw = nanocoai/nanoclaw (MIT, ~30k★, microVM per agent, OneCLI vault). Hermes = NousResearch/hermes-agent (MIT, Docker-hardened, local default = no isolation; audit #7826: 4 critical / 9 high). OrchestKit verdicts file-cited to src/hooks/src/… + tests/security/. ⚠️ Avoid typosquat forks qwibitai/nanoclaw, thenvoi/nanoclaw-thenvoi — install only from nanocoai/nanoclaw.