Open Source · Apache 2.0 · Alpha

The multi-agent layer that learns with your team.

Praxia is a workflow-specialized multi-agent orchestrator with a built-in personal-to-organizational memory loop. Your senior engineers' tacit knowledge promotes itself into shared best practices — automatically.

Windows 10 / 11 x64 alpha. No pip install or Python required. macOS / Linux coming soon.

🗓 Chat → cron schedules (NL → POSIX) ⚡ Chat → parallel batches 👀 Documents auto-watch on launch 🐍 Python 3.11+ 🧠 6 LTM backends + multi-LTM fusion 🤖 Claude · ChatGPT · Gemini · Gemma · Qwen · DeepSeek · Mistral · Grok · Llama 🔐 SSO + RBAC + audit · OAuth per-user 🛡 Resource ACL 📄 PDF · Word · Excel · PowerPoint 🖨 HTML / PPTX / DOCX output 🎙 Voice in / out (STT + TTS) 🪪 Memory mode (accumulate / read-only) 🌐 SDK · Streamlit UI · FastAPI HTTP 🔑 KMS-backed encryption (AWS / Azure / GCP / Vault) ⚖️ A/B experiments + LLM-quality eval 🪐 MCP (stdio + remote HTTP/SSE) 🔗 20 connectors (Box / Notion / Slack / Jira / GitHub / S3 / …) 📱 Mobile-responsive 🤖 Autonomous agent (LLM-driven tool-use loop) ✨ Prompt Designer (intent → polished template) 🎨 Document Designer (code-gen pptx / docx) ✅ Hermetic test harness (stubs & drivers)
New in alpha13 · June 2026

Stop clicking. Start talking.

The 3 things that make Praxia different from ChatGPT Desktop and Claude Desktop today:

🗓

Chat → cron schedules

Say “every Monday morning, summarise yesterday's Documents folder”. Praxia parses the intent, generates the POSIX cron expression, persists the schedule, and tells you when the next run fires. Each firing creates a regular Task — viewable, cancellable, auditable.

“Every Monday morning, do X” → 0 9 * * 1
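To read the generated expression, it helps to label its five fields. The sketch below is plain Python, not Praxia's parser; the `describe_cron` helper and the `CRON_FIELDS` names are hypothetical and only illustrate the standard POSIX field order.

```python
# Hypothetical helper, not part of Praxia: it only labels the five
# standard POSIX cron fields so a generated expression can be read.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Split a 5-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("0 9 * * 1")
# minute 0, hour 9, any day of month, any month, day_of_week 1 (Monday)
print(fields)
```

Mapping "morning" to 09:00 is the choice shown in the example above; a different default hour would only change the second field.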

Chat → parallel batches

Say “for each of these 10 PDFs, extract the action items” and Praxia fans out 10 agent runs in parallel under a single Batch. Concurrency capped so provider rate-limits don't 429 you. Cancel the whole batch — or any single child — from the UI.

10 items · 4 in flight · 1 batch id
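The fan-out with a concurrency cap can be sketched with a semaphore. This is a minimal illustration using only `asyncio`, not Praxia's batch engine; `run_batch`, `fake_extract`, and the cap of 4 are assumptions chosen to mirror the "10 items · 4 in flight" example.

```python
import asyncio


async def run_batch(items, worker, max_in_flight=4):
    """Run worker(item) for every item, at most max_in_flight at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(item):
        async with semaphore:  # waits while max_in_flight runs are active
            return await worker(item)

    # gather returns results in the same order as the input items
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_extract(pdf_name):
    # Stand-in for one agent run; a real run would call an LLM provider.
    await asyncio.sleep(0.01)
    return f"action items for {pdf_name}"


pdfs = [f"doc{i}.pdf" for i in range(10)]
results = asyncio.run(run_batch(pdfs, fake_extract))
print(len(results))
```

Capping in-flight runs rather than total runs keeps throughput high while staying under a provider's rate limit.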

🎨

Decks with actual design

Ask for a pitch deck and you get a deck with colored title bars, accent stripes, and matplotlib charts — generated by the LLM authoring real python-pptx code in a sandboxed runtime. Themes are JSON; bring your own. Fallback to plain export if codegen fails — you always get a file.

Designer · sandboxed · themable

How it stacks up

Praxia vs. ChatGPT Desktop vs. Claude Desktop

The capabilities other AI desktops don't have today. As of June 2026.

Praxia Desktop ChatGPT Desktop Claude Desktop
Cron schedules from natural language
Parallel batch fan-out (one prompt per item)
Auto-re-index of local docs on file change
Code-gen PowerPoint with theme + charts
9+ LLM providers in one app
Bring your own API key (local-first, no SaaS proxy)
Personal → organizational memory promotion
UI in 8 languages
Open source (Apache 2.0 server)

Information current as of June 2026, based on each product's public release notes. ChatGPT Desktop and Claude Desktop are excellent at what they do — Praxia just does different things, primarily around workflow automation and local-first document handling.

Why Praxia

38+ advantages no other framework offers in one package.

🧠

Personal → org memory loop

Senior staff's "magic prompts" auto-promote into shared knowledge via three independent paths: frequency, outcome correlation, and LLM self-eval.

🔄

3-path promotion engine

Frequency-based, outcome-correlated, LLM-scored. Run in parallel — never depending on a single signal. Configurable thresholds for auto-promote vs review.
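One way to picture "never depending on a single signal" is a rule that needs agreement between paths before auto-promoting. The sketch below is an assumption for illustration only: the function name, the score scale of 0 to 1, both thresholds, and the two-of-three rule are hypothetical and not taken from Praxia's consolidator.

```python
def promotion_decision(frequency, outcome, llm_score,
                       auto_threshold=0.8, review_threshold=0.5):
    """Classify a candidate from three path scores, each in [0, 1]."""
    scores = (frequency, outcome, llm_score)
    strong = sum(score >= auto_threshold for score in scores)
    moderate = sum(score >= review_threshold for score in scores)
    if strong >= 2:  # assumed rule: one path alone cannot auto-promote
        return "auto_promote"
    if moderate >= 1:
        return "review"
    return "skip"


print(promotion_decision(0.9, 0.85, 0.2))  # two strong paths
print(promotion_decision(0.9, 0.1, 0.1))   # one strong path only
```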

🧬

Skills also promoted

Not just memory — your personal skills get tracked, scored, and promoted to the org skill catalog when they prove themselves.

📊

Outcome tracking built-in

record_outcome() attaches success/failure to episodes. The consolidator uses these signals statistically — no separate analytics pipeline needed.

🪪

Memory mode toggle

Per-user switch: accumulate (default) or read_only. Read-only sessions silently drop writes — useful for sensitive content. Admins can lock the mode tenant-wide or by role.

🧬

Multi-LTM fusion + routing

Run several LTMs in parallel and fuse with Reciprocal Rank Fusion — or route per query (temporal → Zep, audit → JSON, entity → Mem0). English + Japanese keyword detection. Higher recall without picking a winner.
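Reciprocal Rank Fusion itself is small enough to show in full. The sketch below is a generic implementation, not Praxia's fusion module; the backend result lists are made-up IDs, and `k=60` is a commonly used default rather than a documented Praxia setting.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical hits from three backends for the same query
mem0_hits = ["m1", "m2", "m3"]
zep_hits = ["m2", "m1", "m4"]
json_hits = ["m2", "m5"]

fused = reciprocal_rank_fusion([mem0_hits, zep_hits, json_hits])
print(fused)
```

Because RRF uses ranks instead of raw scores, backends with incomparable scoring scales can be fused without normalization.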

🔌

6 LTM backends

JSON, Mem0, LangMem, Letta, Zep, HindSight — switch with one line. Plus Graph layer (optional) for relationship-heavy domains. Zero vendor lock-in.

🤖

Multi-LLM (15+ first-class · 100+ via LiteLLM)

Claude, ChatGPT, Gemini, Gemma, Qwen-API, Qwen-local (Ollama), DeepSeek, Mistral, Grok, Llama, Cohere, Perplexity, Phi + 100+ via LiteLLM. Same models on enterprise clouds: azure/* (Azure OpenAI), azure_ai/* (AI Foundry), bedrock/* (AWS Bedrock), vertex_ai/* (GCP Vertex AI). Auto-detect from env vars; switch model per-call.

🔐

Auth, RBAC, SSO, audit — in OSS

API key + JWT + OIDC (Google/MS/Okta/GitHub/Keycloak) + 4 default roles + append-only audit log. Most competitors paywall this.

🔐

User-delegated OAuth (13 providers)

Each Praxia user authorizes Box / Microsoft 365 / Dropbox / Drive / Salesforce / Notion / Atlassian / Slack / GitHub / HubSpot / Zendesk / Linear / kintone with their own credentials. The external system's native ACL is enforced per Praxia user — alice can only see what alice has access to. Self-service UI under Preferences → Service Connections.

🔑

KMS-backed token encryption

OAuth tokens use envelope encryption — fresh DEK per write, AES-GCM payload, DEK wrapped by your KMS. 5 adapters: local / aws / azure / gcp / vault. Master key never lives on the application host.

🌐

Production OAuth callback (HTTP)

praxia serve exposes /api/v1/oauth/{provider}/{start,callback,status}. Multi-worker safe state cache (TTL-pruned JSON), pinned redirect URI via PRAXIA_PUBLIC_URL, optional success-redirect to your frontend.

🛡

Resource access policies (ACL)

Glob-pattern allow / deny rules per resource type (connector, memory, prompt, skill). Built for enterprise IS departments. Every decision audit-logged.
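Glob-based allow / deny evaluation can be sketched with the standard library. This is an illustration, not Praxia's policy engine: the rule tuple shape, the deny-overrides ordering, and the default-deny fallback are all assumptions, and principals are left out for brevity.

```python
from fnmatch import fnmatchcase

# Hypothetical rule shape: (effect, resource_type, glob_pattern)
RULES = [
    ("allow", "connector", "box:/*"),
    ("deny", "connector", "box:/Confidential/*"),
]


def is_allowed(resource_type, resource, rules=RULES):
    """Assumed semantics: deny overrides allow; unmatched means denied."""
    matched = [
        effect
        for effect, rtype, pattern in rules
        if rtype == resource_type and fnmatchcase(resource, pattern)
    ]
    if "deny" in matched:
        return False
    return "allow" in matched


print(is_allowed("connector", "box:/Public/readme.md"))
print(is_allowed("connector", "box:/Confidential/q3.pdf"))
```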

🤖

Autonomous agent — LLM-driven tool-use loop

An LLM-driven tool-use loop over your full Praxia stack — personal memory, org memory, frozen layer, skills, connectors. The agent picks tools on its own (search → run skill → pull connector → answer) with ACL gates and audit logging. Ships as praxia.agent.AutonomousAgent, praxia agent run, and an MCP meta-tool for remote clients.

🛡️

CommandedAgent — autonomous agent with external verification

For workloads where the environment doesn't give you a free answer key — private-corpus fact QA, SOP / compliance, customer support over manuals. Wraps the autonomous agent with task-type routing + query decomposition (multi-hop) + pre-retrieval + grounding verification + no-improvement early-stop, plus an explicit abstain path when sources don't support a confident answer. Calibrated against an in-house HotpotQA / SQuAD v2 / JEMHopQA harness — decomposition added +12pt on HotpotQA-distractor 40q. Every accepted claim carries [L1#0, L3#2, …] citations; every round is in the audit log. Verifier / QueryDecomposer / TaskClassifier are all pluggable — drop in TiDB Vector / pgvector / GraphRAG / HHEM-grounding / your own without touching core.

Prompt Designer — turn intent into a polished template

Describe the task in one line ("score contract risk 1-5 in JSON") → get a production-grade prompt design back: tuned system message, ${variable} user template, 2-3 few-shot examples, 5-criterion rubric. Per-LLM idioms applied automatically (Claude XML / OpenAI JSON-mode / DeepSeek-R1 reasoning / Mistral concise / Llama numbered steps).

🎨

Document Designer — code-gen pptx / docx (Claude-Skills-style)

The LLM authors python-pptx / python-docx code, a sandbox runs it (AST allowlist + 30s timeout + 512MB cap on POSIX), and you get a design-rich .pptx / .docx back — multi-column layouts, matrix slides, embedded matplotlib charts, themed branding (colors / fonts / logo / footer from .praxia/themes/). On traceback the error is fed back to the LLM and the attempt repeats up to 3 times. Themes managed in Admin → 🎨 Themes.
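An AST allowlist can be illustrated with a check on imports alone. The sketch below is not Praxia's sandbox: `ALLOWED_IMPORTS` and `disallowed_imports` are hypothetical, and a real allowlist would also need to cover calls, attribute access, and builtins.

```python
import ast

ALLOWED_IMPORTS = {"pptx", "docx", "matplotlib"}  # assumed allowlist


def disallowed_imports(source):
    """Return top-level module names imported outside the allowlist."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports without a module name are skipped here
            found.add(node.module.split(".")[0])
    return sorted(found - ALLOWED_IMPORTS)


generated = "from pptx import Presentation\nimport os\n"
print(disallowed_imports(generated))
```

The code is only parsed, never executed, so the check is safe to run before the generated script reaches the sandboxed runtime.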

⚙️

Workflow-specialized flows

Sales prep, logic checking, RAG self-correction — three production-ready multi-agent pipelines that run in 5 minutes. No bespoke orchestration code required.

🎯

6 default business skills

Investment, sales, design, purchasing, patent, legal — domain-tuned agents with built-in guardrails (tax law, jurisdictional caveats, hallucination guards).

🔬

MCP / Claude Skills compatible

Skills serialize to standard SKILL.md. Drop into Claude Skills, Cursor Skills, or any MCP-compatible registry without code changes.

🛡️

Evidence by default

Sentence-level hallucination detection and retrieval metrics ship as first-class modules. "It works" comes with proof attached.

🎯

LLM output quality eval (CI gate)

Catch quality regressions before merge. tests/llm_eval/ grades real LLM output against rubrics + a committed baseline. Score drop > 5pt fails the build. Per-skill cases ship for all 6 skills.
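The gate logic amounts to comparing current scores against the committed baseline. The sketch below is an illustration, not the contents of tests/llm_eval/: the `regressions` helper, the per-case comparison, and the sample scores are assumptions.

```python
def regressions(baseline, current, max_drop=5.0):
    """Return cases whose score fell by more than max_drop points."""
    return {
        case: (baseline[case], current.get(case, 0.0))
        for case in baseline
        # A case missing from the current run is treated as a score of 0
        if baseline[case] - current.get(case, 0.0) > max_drop
    }


baseline_scores = {"legal": 82.0, "sales": 75.0}  # made-up numbers
current_scores = {"legal": 80.0, "sales": 68.5}

failed = regressions(baseline_scores, current_scores)
print(failed)  # a non-empty result would fail the build
```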

⚖️

A/B experiments built in

Test prompt variants on real users with deterministic per-user assignment (SHA-256 bucket). Audience filter (roles / users / window). Outcome rollup + tentative winner detection. CLI + SDK.
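Deterministic SHA-256 bucketing can be sketched in a few lines. This is a generic illustration, not Praxia's experiment code: the `assign_variant` name, the `experiment:user` hash input, and the use of the first 8 digest bytes are assumptions.

```python
import hashlib


def assign_variant(experiment_id, user_id, split):
    """Map a user to a variant; split is variant -> share, summing to 1.0."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).digest()
    # First 8 bytes as an integer, scaled into the unit interval
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    variants = list(split.items())
    cumulative = 0.0
    for variant, share in variants:
        cumulative += share
        if bucket < cumulative:
            return variant
    return variants[-1][0]  # guard against float rounding at the top edge


split = {"control": 0.5, "candidate": 0.5}
first = assign_variant("proposal_v2", "alice", split)
second = assign_variant("proposal_v2", "alice", split)
print(first == second)
```

Hashing the experiment ID together with the user ID keeps a user's assignment stable within one experiment while decorrelating it across experiments.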

🧮

Hermetic test harness — stubs & drivers for every surface

Every public surface (auth / memory / fusion / exporters / OAuth / parsers / CLI / extensions / experiments / connectors / agent) ships with backend stubs, fixture factories, and protocol-conforming drivers — so contributors can write hermetic tests without standing up real services. CI runs them on every PR.

🔗

20 storage / SaaS connectors

Box / SharePoint / Dropbox / Drive / kintone / Salesforce + Notion / Confluence / Jira / Slack / Teams / GitHub / HubSpot / Zendesk / Linear / S3 / Azure Blob / GCS / WebDAV / Email. Per-user OAuth means alice only sees what alice can in each system.

📄

File parsers (PDF · Word · Excel · PowerPoint · CSV · HTML)

Drop a file in — auto-dispatch by extension. PDF page-by-page, Word with heading detection, Excel as Markdown tables, PowerPoint with speaker notes. Custom formats register via entry points.

🖨

Output exporters (HTML · PPTX · DOCX · MD · JSON)

Skills produce Markdown by default. OutputFormatSkill infers the requested format from natural-language hints ("パワポで", Japanese for "as a PowerPoint" → PPTX; "as a Word doc" → DOCX). Custom formats register via entry points.
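Hint-based format inference can be pictured as a keyword lookup with a Markdown fallback. The sketch below is not OutputFormatSkill's implementation: the `FORMAT_HINTS` table and `infer_format` function are hypothetical, and only the two hints quoted above come from this page.

```python
# Hypothetical hint table; the real skill's rules are not shown here.
FORMAT_HINTS = {
    "pptx": ("パワポ", "powerpoint", "slides", "deck"),
    "docx": ("word doc", "docx"),
    "html": ("html", "web page"),
}


def infer_format(request, default="md"):
    """Return the first format whose hint appears in the request."""
    text = request.lower()
    for fmt, hints in FORMAT_HINTS.items():
        if any(hint in text for hint in hints):
            return fmt
    return default  # Markdown when no hint matches


print(infer_format("パワポで"))
print(infer_format("Send it as a Word doc"))
```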

🎙

Voice input + voice output

Speech-to-text (Whisper) and text-to-speech (OpenAI TTS / ElevenLabs / Piper). Embedded in Streamlit UI as record-and-go input and read-aloud output.

👥

Full admin user CRUD

Create / update / delete / deactivate / rotate keys / change roles — all via CLI, UI, or SDK. All operations audited.

🛡

Admin-controlled LTM policy

Pin which backend(s) users may pick and what the default mode is, at the tenant level. Resolution: admin enforced > call-site > user pref > admin default.
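The resolution order reads as "first value that is set wins". A minimal sketch, assuming each level is either a backend name or unset; `resolve_backend` and the `"json"` admin default are hypothetical, not Praxia's API.

```python
def resolve_backend(admin_enforced=None, call_site=None,
                    user_pref=None, admin_default="json"):
    """Return the first value that is set, in precedence order."""
    for candidate in (admin_enforced, call_site, user_pref, admin_default):
        if candidate is not None:
            return candidate
    return None


print(resolve_backend(user_pref="zep"))
print(resolve_backend(admin_enforced="mem0", call_site="zep"))
```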

💾

Admin data exports

CSV / JSON / JSONL exports of audit log, users, usage, memory, policies — for compliance, SIEM, backups. Each export action self-audited.

📊

Personal & org dashboards

Flow / skill counts, success rate, top users, promoted blocks, frozen files, distributed skills — out of the box, with no separate analytics pipeline.

📝

Custom prompt distribution

Users save personal prompts. Admins promote them to org or push to specific roles / users. Three scopes with merge precedence.

🪐

MCP server (stdio + remote HTTP/SSE)

Use Praxia from Claude Desktop / Cursor / Continue.dev. Local: praxia mcp serve. Remote (multi-host): praxia serve exposes /api/v1/mcp with auth + audit log. Every skill + flow becomes an MCP tool automatically.

🌐

Backend-only or full-stack

Use Praxia as a brain behind your own frontend (SDK embed or praxia serve FastAPI HTTP API), or run the bundled Streamlit UI for the fastest path. Same auth, memory, skills.

📜

Apache 2.0 + Open Core ready

Permissive license, commercial-friendly. NOTICE.md inventories every dependency's license. Open Core path for enterprise extras planned.

📱

Mobile-responsive UI + landing

Landing has chip-style nav on phones, scrollable tabs, ≥44px touch targets, prefers-reduced-motion respected. Streamlit UI injects responsive CSS + a "Compact mode" toggle for slow connections.

Try it — pick your scenario

See exactly what Praxia does for your work.

Pick a role + use case to see the matching CLI command, sample output, and concrete Before/After. Then click Run preview to see a typed-out simulation in your browser — no install yet.

Pre-meeting research for a B2B account

You're meeting Acme Manufacturing tomorrow at 14:00. Praxia ingests their IR, recent press, and your past wins → produces top-3 pain hypotheses, a 5-row FAQ with citations, and a proposal outline.

Before

  • 6 hours of LinkedIn skim + 10-K reading
  • Hit-or-miss prep — CFO asks about a recent capex you didn't know about
  • Acceptance rate ≈ 55%

After

  • 1 hour total prep
  • 3-pain-hypotheses + 5-row FAQ + proposal outline
  • Acceptance rate +15–20pt

Variation: attach .pdf board deck → Praxia auto-parses + cites it. Or pull straight from Salesforce → praxia connector pull salesforce "SELECT Id,Name FROM Account WHERE Id='001..'".

~/your-project terminal
praxia run sales \
  --customer-name "Acme Manufacturing" \
  --product "Praxia"
# Click ▶ Run preview to see a typed-out simulation
Who it's for

Seven target personas, one platform.

Praxia is opinionated about where it shines — mid-cap to large enterprises with senior staff whose tacit knowledge is currently locked in one person's editor.

🏢

Information Systems / Platform team (300–5,000 employees)

Need: Roll out AI tools across the org without handing every team a different vendor — and without paywalling SSO / RBAC / audit.

Fit: Auth + RBAC + ACL + per-user OAuth + audit log all in OSS, not behind an enterprise tier. Self-hostable on-prem or private cloud. Same code as the OSS, just operated by you.

Typical year-1 result: 100 knowledge workers, ~$1.25M net benefit, full audit trail, no per-seat licensing surprises.

🏗️

Engineering / Product VP (50–500 in scope)

Need: Senior architects' code-review and design intuition is the bottleneck. Junior PMs ramp in 12–18 months. Best practices live in Slack threads and one staff engineer's head.

Fit: DesignSkill + sleep-time consolidation distills "how senior X reviews specs" into reusable shared blocks. Markdown + git frozen layer fits existing PR review workflow.

Typical year-1 result: Senior load 16h/wk → 4h/wk, junior PM ramp 6–9 months, NFR coverage 5–7 → 15–20 axes.

⚖️

Legal / Compliance lead (regulated industry)

Need: 50–100 contracts/month bottlenecked on 2–3 people. Critical risk slips through under deadline. Need an auditable AI workflow with no vendor lock-in.

Fit: LegalSkill (RACE framework) + read-only memory mode for sensitive contracts + per-user OAuth respects external system ACL + every action audited. Apache 2.0 means you can show the source to your auditors.

Typical year-1 result: Per-contract review 60–90min → 10–15min, throughput 50–80/mo → 200–300/mo, critical-miss rate 5–10% → 1–2%.

🧪

OSS / Research integrator (engineering team)

Need: Build a domain-specific agent system over Mem0 / LangGraph / your-own-vector-DB without re-implementing auth, memory cycling, dashboards, exporters yourself.

Fit: 7 plugin types (~50 LoC each) — connectors, memory backends, parsers, exporters, OAuth providers, skills, flows. Use as a Python library, run praxia serve as a backend, embed in LangGraph. Apache 2.0.

Typical day-30: domain skill PR'd, custom connector pip-installable, memory cycling working, ~3 weeks ahead of building it from scratch.

📈

Sales / Revenue Operations lead

Need: 50+ AEs prepping for meetings; quality of pre-call research is uneven. Senior reps win 2× more deals than juniors and the pattern doesn't transfer.

Fit: SalesSkill + memory cycling distills "how senior X researches an account" into shared playbooks. Salesforce + Slack + GitHub connectors feed real customer context. Per-user OAuth means each AE only sees their own pipeline.

Typical year-1 result: Pre-call prep 6h → 1h, proposal acceptance rate +15-20pt, meetings/wk per AE 3 → 6-8.

🛒

Procurement / Supply Chain lead

Need: 5-supplier RFQs take 2-3 weeks. ESG / BCP / single-source risk is treated as an afterthought. Subcontract Act / Anti-Bribery compliance creates legal exposure if missed.

Fit: PurchasingSkill (QCD+S framework) + connectors to Salesforce / kintone / Box for RFQ documents. Audit log captures every supplier evaluation step.

Typical year-1 result: 5-supplier eval 3-4wk → 3-5 days, hidden cost discovery +30%, single-source detection 70% → 95%+.

📑

IP / Patent agent or in-house counsel

Need: Prior-art searches cost $3-5k each via outside counsel. Cross-domain art is often missed. Inventors expect first-pass results in days, not weeks.

Fit: PatentSkill (5-step framework) + file parsers for inventor disclosure docs. Memory cycling captures "patterns that distinguish prior art from real novelty" across cases. Read-only memory mode for confidential client work.

Typical year-1 result: Per-case time 1-2 days → 2-4h internal, external counsel fees −50-70%, faster turn for inventors.

Why OSS matters here

The capabilities you typically pay for — already in the package.

SSO + RBAC + audit are not paywalled

OIDC SSO (Google / Microsoft / Okta / GitHub / Keycloak) is in the OSS. Most agent frameworks ship without it; most agent platforms paywall it. Praxia treats it as table stakes.

Memory format is not locked in

Layer 4 is plain Markdown in your git repo. Layer 3 exports to JSONL. Layer 1 is your chosen backend's native format. The framework doesn't hold your data hostage — leaving costs nothing.

You can read every line

Apache 2.0. Show the source to your auditors, your security team, your customers. No "trust us, the SaaS is secure" — inspect the auth manager yourself.

Multi-LTM ensembles, not single-vendor

Run Mem0 + Zep + HindSight in parallel and fuse with RRF, or route per query. No commercial agent platform exposes this — they pick a backend and lock you in. Praxia treats it as a first-class feature.

Per-user OAuth respects external ACL

When alice pulls from Box, Box's own ACL applies — alice only sees what alice can see. Service-account designs (typical SaaS shortcut) leak data across users. Praxia's per-user OAuth makes this the default.

Run fully on-prem with Gemma / Qwen

Set PRAXIA_LOCAL_MODEL=gemma, run Ollama, choose backend=json. No cloud LLM, no cloud vector DB, no telemetry. Air-gapped customers run identical code as cloud customers.

KMS-backed token encryption (5 adapters)

OAuth tokens are envelope-encrypted with the master key in AWS KMS, Azure Key Vault, GCP KMS, HashiCorp Vault — or locally for dev. Most agent frameworks store tokens with a local symmetric key; Praxia treats KMS as a first-class concern in the OSS.

Production OAuth callback handler

Multi-worker safe — state cache survives across processes (TTL-pruned JSON file), redirect URI pinned via env var. Run praxia serve behind nginx and the callback works correctly with N replicas. Most OSS competitors only support CLI loopback.

A/B testing + LLM quality eval included

Run controlled experiments on prompts / skills / LLMs with deterministic assignment + outcome tracking. CI-gate quality regressions with a baseline-flagging eval framework. Both in the OSS — no separate "experimentation platform" subscription.

How it works

From individual usage to organizational standard — automatically.

1

You just work

Run flows and skills via CLI / SDK / UI. Every interaction lands in your personal memory automatically — no save() calls in business code.

p = Praxia(user_id="alice")
p.run(SalesAgentFlow, inputs={...})
# Memory accumulates implicitly
2

Outcomes get attached

When deals close, tests pass, or PRs merge, attach an outcome. The consolidator uses these to weight which patterns are actually effective.

p.personal_memory.record_outcome(
    episode_id=ep.id,
    success=True, score=0.9,
    notes="closed-won",
)
3

Nightly distillation

The Sleep-time Consolidator clusters similar memories across users, runs each through the 3-path engine, and auto-promotes the high-confidence ones.

praxia consolidate
# auto_promoted: 3, review_queued: 5
4

Living → frozen

Promoted shared blocks become living org knowledge. The most stable get frozen into Markdown + git for PR review. Every step is auditable.

praxia freeze --block manufacturing_pain
# → .praxia/frozen/.../*.md
Architecture

Six layers that turn one expert's drawer into everyone's playbook.

UI · CLI · SDK
Orchestrator · Flows · Skills
Layer 1 — Personal Memory (auto-extracted)
Layer 2 — Sleep-time Consolidation · 3 promotion paths
Layer 3 — Shared Memory Blocks (living)
Layer 4 — Markdown + git frozen layer (stable)
Layer 5 — Graph layer (optional)
Layer 6 — Skills Registry (promotion-aware)
Auth · RBAC · Audit · SSO (OIDC / SAML)
Three workflow flows

Run a multi-agent pipeline in 30 seconds.

Sales Agent Flow

Customer IR + past minutes + RAG → hypotheses → FAQ → proposal outline.

praxia run sales \
  --customer-name "Acme" \
  --product "BizFlow"

Logic Checker Flow

Three agents (structure / contradiction / reader) review long docs.

praxia run logic \
  --document spec.md

RAG Optimization Flow

Self-correcting RAG: query expansion → eval → hallucination check loop.

praxia run rag \
  --question "What license?"
Six default business skills

Domain-tuned agents, ready out of the box.

Investment

InvestmentSkill

Equity research, due diligence, portfolio decisions with bull/bear analysis.

Sales

SalesSkill

Account research, proposal drafting, FAQ prep, objection handling.

Engineering

DesignSkill

System design review, requirements engineering, architecture trade-offs.

Procurement

PurchasingSkill

Supplier evaluation, RFQ analysis, TCO calculation, BCP risk scoring.

IP / Patent

PatentSkill

Prior-art search, claims drafting, patent maps, filing strategy.

Legal

LegalSkill

Contract review, compliance checks, M&A diligence, policy drafting.

5-minute Quickstart

From pip install to live agent in under 5 minutes.

🖥

Native desktop app (no Python required)

The easiest way to try Praxia is the native desktop installer. The app embeds the Praxia server inside the installer — install, launch, paste an LLM provider key, and you're running. No praxia serve process to start, no pip install, no Python on the user's machine. Settings exposes only the three things a user controls: LLM provider keys (Anthropic / OpenAI / Google / Azure OpenAI / Qwen DashScope / Hugging Face — Gemma covered via all three cloud paths), local LLM (Ollama URL + model), and optional SSO tenant URL. Everything else (port, API key, storage layout, CORS) is managed automatically.

Desktop-only features: 🗂 Local folder ingestion with auto-discovery — point Praxia at a folder (e.g. ~/Documents/Contracts/) and the app walks it recursively, parses every supported file (PDF / DOCX / PPTX / XLSX / TXT / MD / code) and makes the contents searchable by the agent alongside L1 / L3 / L4 memory. Useful for confidential documents you don't want uploaded to cloud storage. Plus native notifications when a long-running agent task finishes, and native file dialogs for drag-and-drop attachments. (🚧 Phase 1b)

📦 Download Praxia Desktop (.exe, ~165 MB)

Windows 10 / 11 x64. Unsigned alpha — Windows SmartScreen will warn on first launch; click 「詳細情報」→「実行」 (or "More info" → "Run anyway") to proceed. Also available: .msi for managed deployment · all releases & notes. macOS / Linux land in Phase 1b.

For library / SDK / CLI use, the Python install:

# Install (with UI + connectors + office parsers)
pip install "praxia[ui,connectors,office]"

# Initialize
praxia init

# Run flows + skills
praxia run sales --customer-name "Acme"
praxia skill run investment "3-year thesis on Acme Mfg (fictional)"

# Launch the UI (11 tabs incl. Dashboard / Policies / Admin / Connectors)
praxia ui --port 8501

# OR — backend-only mode for your own frontend (FastAPI HTTP)
pip install "praxia[server]"
praxia serve --host 0.0.0.0 --port 8000

# Output exporters — render skill output to HTML / PPTX / DOCX
praxia export report.md slides.pptx --title "Q3 Review"

# Memory mode — accumulate (default) or read-only per user
praxia memory mode --user-id alice read_only
praxia admin memory-policy-set --enforced-backend mem0 --allowed mem0,zep

# A/B experiments — test prompt variants with deterministic assignment
praxia experiment create proposal_v2 --name "Prompt v2" \
  --variants '{"control":{"prompt":"..."},"candidate":{"prompt":"..."}}' \
  --traffic-split "control=0.5,candidate=0.5"
praxia experiment start proposal_v2

# Production-grade OAuth + KMS-encrypted tokens
export PRAXIA_KMS_ADAPTER=aws
export PRAXIA_KMS_KEY_ID=arn:aws:kms:...
pip install "praxia[server,kms-aws]"
praxia serve --host 0.0.0.0 --port 8000

# Personal → org memory distillation
praxia consolidate

# Enterprise: resource policies, audit exports, connectors
praxia policy add deny connector "box:/Confidential/*" --principals "role:member"
praxia admin export-audit audit.csv --since-days 30
praxia connector pull salesforce "SELECT Id, Name FROM Account"

Bring your own LLM key — Anthropic, OpenAI, Google (Gemini / Gemma), Alibaba (Qwen), or run Gemma / Qwen locally via Ollama. Two deployment modes: full-stack praxia ui or backend-only praxia serve behind your own frontend — see deployment-modes.md.

Use cases

Concrete Before / After across six business functions.

Function | Before | After Praxia | Lift
Investment DD | 4–6h / deck | 45–60 min | −80%
Sales prep | Hit-or-miss prep | Storyboard + FAQ + RAG | +15–20pt acceptance
Design review | 16h / week senior load | 4h / week | −75%
Procurement RFQ | Direct cost only | Full TCO + ESG | +30% true cost surfaced
Patent prior art | 1–2 days + counsel | 2–4h internal | −50–70% counsel fees
Legal M&A | 4–8 weeks | 2–3 weeks | −50% external costs

See full Before/After tables in docs/use-cases.md.

UI tour

The Streamlit dashboard at a glance.

Run Flow tab — execute a multi-agent flow with file upload
🎬 Run Flow — pick a flow, fill inputs (or attach files), watch each agent step run
Business Skill tab — invoke a domain-tuned agent like legal_reviewer
🛠 Business Skill — one of six domain-tuned agents (here: legal contract review)
Dashboard tab — personal and organizational usage metrics
📊 Dashboard — flow / skill counts, success rates, top users, top skills
Policies tab — resource access control list management
🛡 Policies — glob-pattern allow/deny ACL for IS departments
Connectors tab — pull and push data between Praxia and external systems
🔌 Connectors — Pull / Push to Box · SharePoint · Dropbox · Drive · kintone · Salesforce
Admin Downloads tab — export audit log and other data
💾 Admin Downloads — CSV / JSON / JSONL export with chain-of-custody audit logs

Plus 👥 Users, 📝 Prompts, 🧠 Memory, 🌙 Consolidate, ℹ About tabs (11 in total). Local file upload supported throughout.

Concrete examples

One CLI invocation, real business output.

VC pre-screening: 1 deck, 45 minutes

Before: 4–6h reading the deck, scrubbing competitor research, modeling financials.

After: Full 5-section memo (Profile / Quant / Qual / Risk / Decision) with bull-and-bear cases and confidence intervals.

  • 📉 Time: 4–6h → 45–60 min
  • 📊 Coverage: 3–5 competitors → 10–15 surrounding-domain peers
  • 🎯 Capacity: 5–10 deals/wk → 20–30 deals/wk
# CLI
praxia skill run investment "\
Mid-term thesis on a hypothetical issuer:
- sector: consumer electronics, mid-cap JP
- horizon: 3 years
- compare with two anonymized peers
"

B2B account research: 6h → 1h

Before: Hit-or-miss prep based on LinkedIn skim. CFO asks about a recent capex you didn't know about.

After: Praxia ingests IR + 6 months of press, extracts top-3 pain hypotheses, and generates a 5-row FAQ with citations.

  • ⏱ Prep: 6h → 1h
  • 📈 Acceptance rate: +15–20pt
  • 📞 Meetings/week: 3 → 6–8
# Multi-agent flow
praxia run sales \
  --customer-name "Acme Manufacturing" \
  --product "Praxia Sales" \
  --additional-context "Mid-term plan
calls for 30B JPY DX investment"

Architecture review: 4h → 30min

Before: Senior architect spending 16h/wk on PR-style design reviews. NFRs slip through.

After: DRAGON framework (Data flow / Requirements / Architectural fit / Gaps / Operation / NFRs) — checks all 6 axes systematically.

  • ⏱ Senior load: 16h/wk → 4h/wk
  • 📋 NFR coverage: 5–7 → 15–20 axes
  • 👶 Junior PM ramp: 12–18mo → 6–9mo
praxia run logic --document spec.md

# or single-skill review
praxia skill run design "\
Review the attached architecture for
the new payments microservice...
"

RFQ analysis: 2 weeks → 3 days

Before: Direct cost only; ESG / geopolitics / BCP risk treated as afterthoughts.

After: Full TCO matrix + QCD+S framework + Subcontract Act compliance check + risk grid.

  • 📦 5-supplier eval: 3–4wk → 3–5 days
  • 💸 Hidden cost discovery: +30% of initial quote
  • 🚨 Single-source detection: 70% → 95%+
praxia skill run purchasing "\
Evaluate 5 PCB suppliers for our new
product line. Annual volume 2M units.
Constraints: Japan-domiciled HQ,
ISO9001, no Russia/Belarus exposure.
"

Prior-art search: 2 days → 4 hours

Before: ¥300,000–500,000 / case to outside counsel for first-pass research.

After: 5-step framework (element extraction → IPC/FI/F-term search formula → hit analysis → novelty → inventive step). Counsel only reviews the draft.

  • ⏱ Per-case: 1–2 days → 2–4h
  • 💴 External fees: −50–70%
  • 📊 Cross-domain art: significantly improved
praxia skill run patent "\
Prior-art search: solid-state battery
with three-layer ceramic electrolyte
and Li-rich cathode. Provide:
1. Element decomposition
2. IPC/FI/F-term search strategy
3. Hit-analysis table
4. Novelty + inventive-step verdict
"

Resume screening: 200 candidates in 2h

Before: Recruiter screens 50-80 resumes/day; quality varies; senior recruiters' "spot the right hire" instinct doesn't transfer to juniors.

After: Custom HRSkill applies your role criteria + culture fit signals consistently. Memory cycling captures "what predicted a successful hire" from past placements.

  • ⏱ 200 resumes: 6h → 2h
  • 🎯 Top-10 quality: junior matches senior recruiter
  • 📊 Time-to-hire -30%; offer-acceptance +10pt
# Custom skill (yours) + connectors
praxia connector pull s3 \
  "hiring-bucket/q3-applicants/" \
  --user-id alice
praxia skill run hr_screener "\
Apply ICP criteria + grade against
the Senior PM role posted Sep 5.
Output: top-10 ranked + flag risks.
"

Customer ticket triage: hands-on or autonomous

Before: 200+ tickets/day, junior agents escalate ~40% to seniors; SLA breaches in regulated industries trigger fines.

After: Zendesk + GitHub + Confluence connectors give context. Custom SupportSkill drafts replies in your voice. Memory cycling captures "how senior X handled the tricky ones".

  • 📂 Median resolution: 4h → 1.5h
  • ⬆ Senior escalation rate 40% → 15%
  • 🎯 CSAT +8pt; SLA breaches -70%
praxia connector pull zendesk \
  "tickets:status:open priority:high"
praxia skill run support_triage "\
Read the ticket and last 5 comments.
Suggest a reply matching our brand voice.
Flag if escalation is needed.
"

Platform rollout: 5,000 users in 4 weeks

Before: 6-month vendor evaluation, lawyer review, custom SSO integration, separate audit log pipeline. Each tool needs its own.

After: OIDC SSO (Microsoft / Okta) day-one. SCIM provisioning auto-syncs user lifecycle. KMS-backed token encryption per cloud. Audit log for SIEM ingest.

  • ⏱ Vendor eval: weeks → days (open source)
  • 🔐 SSO + SCIM + KMS in OSS, no enterprise tier
  • 📊 Same code as paid tier — auditors verify
# Production deploy
export PRAXIA_SSO_PROVIDER=microsoft
export PRAXIA_KMS_ADAPTER=aws
export PRAXIA_SCIM_TOKEN="$(openssl rand -hex 32)"
praxia serve --host 0.0.0.0 --port 8000

# Okta admin: point SCIM at /scim/v2/Users

Lit review: 200 papers in 3 hours

Before: PhD students spend weeks doing prior-art / state-of-the-art reviews. Sometimes they miss the one paper that already solved the problem.

After: Email + GitHub + WebDAV (institutional repo) + S3 (preprints) connectors feed papers in. Custom ResearchSkill extracts methodology + findings + relevance score. RAG-fused memory across the lab's history.

  • 📚 Lit review: 3 weeks → 3 hours initial pass
  • 🎯 Cross-domain hits found: +40%
  • 🤝 Lab-wide memory: students inherit predecessors' knowledge
praxia connector pull s3 "arxiv-mirror/2024/cond-mat/"
praxia run rag --question "\
Latest research on three-layer ceramic
electrolytes for solid-state batteries —
group by approach, flag contradictions.
"

Full Before/After tables (10 industries × 3 use cases each) in docs/use-cases.md.

ROI projection

Savings compound year over year as the memory loop matures.

Annual ROI formula

Year 1 ROI = (N × C × t × s₁) + Q − P
Year 2+    = (N × C × t × s₂) + Q × g − P

N  = knowledge workers in scope
C  = annual loaded cost per FTE
t  = share of time spent on routine work
s₁ = year-1 time savings (typ. 30–50%)
s₂ = year-2 time savings (typ. 50–75%)
   ↑ s₂ > s₁ because org memory compounds
Q  = quality lift (errors avoided)
g  = growth factor applied to Q from year 2 on
P  = Praxia cost (license + infra)

Worked example: 100 knowledge workers

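| Variable | Year 1 | Year 2 |
| --- | --- | --- |
| Workers in scope (N) | 100 | 100 |
| Loaded cost (C) | $90k | $90k |
| Routine work share (t) | 40% | 40% |
| Time savings (s) | 35% | 60% |
| Quality lift (Q) | $65k | $200k |
| Praxia cost (P) | $80k | $80k |
| Net benefit | $1.25M | $2.30M |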

3-year cumulative net ≈ $5.2M. Even after halving each parameter, ROI remains > 10×.
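
To make the arithmetic easy to check, here is a minimal Python sketch of the formula applied to the table above. The function name `net_benefit` is illustrative, and the table does not state a value for g, so the sketch assumes g = 1.0 and treats the $200k as the full year-2 quality term.

```python
def net_benefit(n, c, t, s, q, p, g=1.0):
    """Net annual benefit: (N x C x t x s) + Q x g - P."""
    return n * c * t * s + q * g - p


# Inputs copied from the worked-example table.
year1 = net_benefit(n=100, c=90_000, t=0.40, s=0.35, q=65_000, p=80_000)

# Assumption: g = 1.0, because the table gives no growth factor.
year2 = net_benefit(n=100, c=90_000, t=0.40, s=0.60, q=200_000, p=80_000, g=1.0)

print(f"Year 1 net benefit: ${year1:,.0f}")
print(f"Year 2 net benefit: ${year2:,.0f}")
```

Under these assumptions the sketch yields roughly $1.245M for year 1 and $2.28M for year 2, in line with the rounded figures in the table. The 3-year cumulative figure is not reproduced here because the table lists no year-3 inputs.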

3-year compounding effects

| KPI | Before | 1 year | 3 years |
| --- | --- | --- | --- |
| New-hire ramp time | 6–12 months | 4–6 months | 2–3 months |
| Knowledge loss on departure | Several / yr | 50% reduction | Zero |
| Output quality variance | 2–3× spread | 50% narrower | ≤ 20% spread |
| Cross-team best-practice flow | Almost none | 5–10 / mo | 30+ / mo |
| AI utilization (org avg / individual best) | 30–50% | 60–70% | 80%+ |

Extensibility

Built to grow with your team.

Custom flow (~30 lines)

Define a multi-agent pipeline by subclassing Flow. Each step references prior outputs via ${var} templates.

class IncidentResponseFlow(Flow):
    name = "incident_response"
    steps = [
        FlowStep("triage", ...),
        FlowStep("hypothesis", ...),
        FlowStep("mitigation", ...),
    ]
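
The `${var}` referencing between steps can be illustrated with Python's standard `string.Template`, which uses the same placeholder syntax. This is a sketch of the idea only, not Praxia's Flow engine: the step prompts, `run_steps`, and `echo_agent` are hypothetical stand-ins.

```python
from string import Template

# Hypothetical step prompts; each may reference earlier outputs by step name.
steps = [
    ("triage", "Classify the incident: ${incident}"),
    ("hypothesis", "Given the triage result '${triage}', list likely causes."),
    ("mitigation", "Propose mitigations for: ${hypothesis}"),
]


def run_steps(steps, context, agent):
    """Run steps in order, storing each output under the step's name."""
    outputs = dict(context)
    for name, prompt in steps:
        rendered = Template(prompt).substitute(outputs)
        outputs[name] = agent(name, rendered)
    return outputs


def echo_agent(step_name, prompt):
    # Stand-in for an LLM call so the sketch runs offline.
    return f"[{step_name}] {prompt}"


result = run_steps(steps, {"incident": "API latency spike"}, echo_agent)
print(result["mitigation"])
```

Because each output is stored under its step name, a later step can only reference steps that ran before it; `substitute` raises `KeyError` on an unknown placeholder, which surfaces ordering mistakes early.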

Custom skill (~20 lines)

Subclass Skill with a system prompt + manifest. Auto-serializes to SKILL.md for MCP / Claude Skills.

class HRRecruitingSkill(Skill):
    manifest = SkillManifest(
        name="hr_recruiting",
        domain="hr",
        ...
    )
    system_prompt = """..."""
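
As a rough picture of what serializing to SKILL.md could look like, the sketch below writes a frontmatter block followed by the prompt body. It assumes the frontmatter carries `name` and `description` fields; `ManifestSketch` and `to_skill_md` are hypothetical names, and Praxia's real serializer may emit more fields.

```python
from dataclasses import dataclass


@dataclass
class ManifestSketch:
    # Hypothetical stand-in for SkillManifest; only two fields are modeled.
    name: str
    description: str


def to_skill_md(manifest, system_prompt):
    """Render frontmatter plus the prompt body as one SKILL.md string."""
    lines = [
        "---",
        f"name: {manifest.name}",
        f"description: {manifest.description}",
        "---",
        "",
        system_prompt.strip(),
        "",
    ]
    return "\n".join(lines)


manifest = ManifestSketch("hr_recruiting", "Screen resumes against role criteria")
print(to_skill_md(manifest, "You are a careful recruiter."))
```

The sketch does no YAML quoting, so values containing characters such as `:` or `#` would need escaping in a real serializer.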

Custom LTM backend

Implement the 4-method MemoryBackend protocol. Plug in any vector DB (Pinecone, Weaviate, Qdrant, ...) — and optionally combine with built-ins via CompositeBackend / RoutedBackend.

class PineconeBackend:
    def add(...): ...
    def search(...): ...
    def all(...): ...
    def clear(...): ...
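
The four method names above can be expressed as a `typing.Protocol` and satisfied by a small in-memory class. The parameter lists and return types below are assumptions made for illustration, since the original elides them; check the extension guide for the real signatures.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class MemoryBackendSketch(Protocol):
    # Signatures are hypothetical; only the four method names come from the text.
    def add(self, text: str, metadata: Optional[dict] = None) -> str: ...
    def search(self, query: str, limit: int = 5) -> list: ...
    def all(self) -> list: ...
    def clear(self) -> None: ...


class InMemoryBackend:
    """Toy backend that ranks items by word overlap instead of vectors."""

    def __init__(self):
        self._items = []

    def add(self, text, metadata=None):
        item_id = str(len(self._items))
        self._items.append({"id": item_id, "text": text, "metadata": metadata or {}})
        return item_id

    def search(self, query, limit=5):
        words = set(query.lower().split())
        scored = []
        for item in self._items:
            overlap = len(words & set(item["text"].lower().split()))
            if overlap:
                scored.append((overlap, item))
        # Highest overlap first; ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [item for _, item in scored[:limit]]

    def all(self):
        return list(self._items)

    def clear(self):
        self._items.clear()


backend = InMemoryBackend()
backend.add("Refund policy: 30 days for enterprise plans")
backend.add("Escalate outages to the on-call lead")
print(backend.search("refund policy"))
```

A vector-database backend would keep the same four methods and replace the word-overlap scoring with an embedding query.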

Custom connector (~50 lines)

Implement the 2-method Connector protocol (pull / push). Per-user OAuth, ACL enforcement, and audit logging plug in for free. End-to-end Notion example in the guide.

class NotionConnector:
    name = "notion"
    def pull(self, path, *, limit): ...
    def push(self, path, data): ...
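
To show the shape of the two-method protocol without any network calls, here is a sketch backed by a plain dictionary. The `pull` and `push` signatures mirror the snippet above; the return types, the prefix-matching behavior, and the `DictConnector` class are assumptions for illustration.

```python
from typing import Any, Protocol


class ConnectorSketch(Protocol):
    name: str

    def pull(self, path: str, *, limit: int) -> list: ...
    def push(self, path: str, data: Any) -> None: ...


class DictConnector:
    """Toy connector that stores pushed data in memory, keyed by path."""

    name = "dict"

    def __init__(self):
        self._store = {}

    def pull(self, path, *, limit):
        # Assumption: a pull returns every item whose path starts with the prefix.
        matches = [
            {"path": key, "data": value}
            for key, value in sorted(self._store.items())
            if key.startswith(path)
        ]
        return matches[:limit]

    def push(self, path, data):
        self._store[path] = data


connector = DictConnector()
connector.push("notes/a", "first note")
connector.push("notes/b", "second note")
print(connector.pull("notes/", limit=1))
```

A real connector would replace the dictionary with API calls; OAuth, ACL checks, and audit logging are described above as handled by the framework, so they are left out here.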

Custom output format

Built-in: HTML, PPTX, DOCX, MD, JSON. Add your own (LaTeX? RTF? Confluence Storage?) by implementing the Exporter protocol and declaring an entry-point.

class LatexExporter:
    format = "latex"
    extensions = ("tex",)
    def export(self, content) -> bytes: ...
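
A filled-in version of the exporter could look like the sketch below. It assumes `content` arrives as a plain string and wraps it in a minimal LaTeX document; the escaping table and the `LatexExporterSketch` name are illustrative, and entry-point registration is not shown.

```python
class LatexExporterSketch:
    format = "latex"
    extensions = ("tex",)

    # Characters with special meaning in LaTeX and their escaped forms.
    _ESCAPES = {
        "\\": r"\textbackslash{}",
        "&": r"\&",
        "%": r"\%",
        "$": r"\$",
        "#": r"\#",
        "_": r"\_",
        "{": r"\{",
        "}": r"\}",
        "~": r"\textasciitilde{}",
        "^": r"\textasciicircum{}",
    }

    def export(self, content: str) -> bytes:
        # Escape one character at a time so replacements are never re-escaped.
        escaped = "".join(self._ESCAPES.get(ch, ch) for ch in content)
        document = (
            "\\documentclass{article}\n"
            "\\begin{document}\n"
            f"{escaped}\n"
            "\\end{document}\n"
        )
        return document.encode("utf-8")


print(LatexExporterSketch().export("Cost: 50% & rising").decode("utf-8"))
```

Returning bytes rather than text keeps the interface uniform with binary formats such as PPTX and DOCX.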

Detailed extension guides: PLUGINS.md · CUSTOM_CONNECTORS.md · design specs (EN + JA).

Where it fits

What Praxia is best at.

Best for…

  • Teams where senior staff's tacit knowledge needs to compound
  • Workflows in sales / legal / patent / design / purchasing / investment
  • Enterprises that need RBAC + ACL + audit out of the box
  • Self-hosted / on-prem deployments (LLM + memory both)

Less ideal when…

  • You need a fully generic agent graph builder — try LangGraph
  • You want a hosted-only knowledge platform — try a SaaS
  • You're building a single-shot chatbot — Praxia is overkill
  • Your team is a single user with no organizational learning need

Plays well with…

  • Mem0 / LangMem / Letta / Zep / HindSight — Praxia uses them as backends, or several at once via Composite / Routed
  • Claude / ChatGPT / Gemini / Gemma / Qwen via LiteLLM (local Ollama or cloud)
  • Box / SharePoint / Dropbox / Drive / kintone / Salesforce connectors
  • MCP / Claude Skills / Cursor Skills (skill format compatible)
  • Your own frontend (Next.js / Slack / mobile) — run praxia serve as the HTTP backend

Detailed feature inventory and integration matrix in docs/FEATURES.md.

FAQ

The questions everyone asks.

Does Praxia send my data to a third party?

Not unless you choose to. The default json backend stores everything on local disk. LLM calls go to whichever provider you configure, so pick qwen-local (Ollama) for fully in-house operation. You choose the trust boundary.

How is this different from "just using Mem0"?

Mem0 is a memory layer. Praxia is the orchestrator + memory + skill registry + flows + eval + auth. Mem0 is one of six interchangeable backends inside Praxia.

Is auto-promotion actually safe?

Three guardrails: (1) auto-threshold defaults to 0.75 — high; (2) review queue catches mid-confidence items for human approval; (3) the audit log records every promotion so rollback is trivial.
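
The three guardrails can be pictured as one routing function. In this sketch only the 0.75 auto-threshold comes from the text above; the 0.50 lower bound for the review queue, the decision labels, and the choice to log every decision are hypothetical.

```python
AUTO_THRESHOLD = 0.75    # default stated above
REVIEW_THRESHOLD = 0.50  # hypothetical lower bound for the review queue


def route_candidate(confidence, audit_log):
    """Decide what happens to a promotion candidate and record the decision."""
    if confidence >= AUTO_THRESHOLD:
        decision = "auto_promote"
    elif confidence >= REVIEW_THRESHOLD:
        decision = "review_queue"
    else:
        decision = "keep_personal"
    # Every decision is logged so a promotion can be traced and rolled back.
    audit_log.append({"confidence": confidence, "decision": decision})
    return decision


log = []
for score in (0.91, 0.62, 0.30):
    print(score, route_candidate(score, log))
```

Raising `AUTO_THRESHOLD` pushes more candidates into human review; lowering it trades review effort for faster propagation.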

Can I run Praxia fully offline / on-prem?

Yes. Pick qwen-local (Ollama) for the LLM and json or self-hosted Mem0/HindSight for memory. No cloud calls.

How does Praxia compare to LangGraph?

LangGraph excels at general agent orchestration but doesn't ship workflow templates, business skills, memory cycling, or auth. Praxia is opinionated and batteries-included for the "specialized multi-agent + organizational memory" niche.

Can I use this commercially?

Yes — Apache 2.0. Even auth/SSO is in the OSS, where competitors typically paywall those features.

Is the 6-skill set fixed?

No. Add your own with ~20 lines. PRs that contribute new skills are very welcome.

What about MCP / Claude Skills compatibility?

Skills serialize to standard SKILL.md frontmatter. Drop any Praxia skill into Claude Skills, Cursor Skills, or any MCP registry without code changes.

Is my org memory locked into Praxia?

No. Layer 4 is plain Markdown in your git repo. Layer 3 exports to JSONL. Layer 1 personal memory is standard JSONL or your chosen backend's native format. You can leave at any time.

How big can my org grow before hitting limits?

JSON backend handles ~10k users comfortably. Beyond that, switch to Mem0 + Qdrant/Pinecone or HindSight. The promotion engine scales with LLM tokens — budget 10–50 LLM calls per consolidation per cluster.

Can I use Praxia behind my own frontend (Next.js / mobile / Slack bot)?

Yes — that's mode B. Two paths: embed the Python SDK directly if your backend is Python, or run praxia serve (FastAPI, 8 endpoints under /api/v1) and call it from any HTTP client. Same auth, RBAC, ACL, and audit log as the Streamlit UI. Setup recipe: deployment-modes.md.

Does the user have to write to memory every time?

No. Personal memory accumulates implicitly during normal use. Per user, you can also flip read_only mode for sensitive sessions — writes are silently dropped, reads still work. Admins can lock the mode tenant-wide or per role.
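
The read-only behavior (writes dropped, reads served) can be sketched as a thin wrapper around a store. The class name, the `mode` values, and the substring search are assumptions for illustration, not Praxia's implementation.

```python
class MemoryModeWrapper:
    """Wraps a list-backed store and honors an accumulate / read_only mode."""

    def __init__(self, store=None, mode="accumulate"):
        if mode not in ("accumulate", "read_only"):
            raise ValueError(f"unknown mode: {mode}")
        self._store = store if store is not None else []
        self.mode = mode

    def add(self, text):
        if self.mode == "read_only":
            return False  # write is silently dropped, no exception raised
        self._store.append(text)
        return True

    def search(self, query):
        # Reads work in both modes.
        return [item for item in self._store if query.lower() in item.lower()]


memory = MemoryModeWrapper(mode="accumulate")
memory.add("Customer prefers quarterly invoicing")

memory.mode = "read_only"
memory.add("Sensitive detail from a confidential call")

print(memory.search("invoicing"))
print(memory.search("sensitive"))
```

Returning a flag instead of raising keeps calling code unchanged when an admin flips the mode mid-session.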

How do I get a PowerPoint / Word output instead of just text?

Use OutputFormatSkill. It detects format hints in natural language, such as "パワポで" (Japanese for "in PowerPoint"), "as a Word doc", or "HTML please", and renders via the matching exporter. CLI: praxia export report.md slides.pptx. Custom formats register via the praxia.exporters entry-point.
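
One simple way to detect such hints is keyword matching, sketched below. This is an assumption about the general technique, not a description of how OutputFormatSkill works internally (it may well use the LLM); the pattern list and `detect_format` are hypothetical.

```python
import re

# Ordered (format, pattern) pairs; the first match wins.
FORMAT_HINTS = [
    ("pptx", re.compile(r"パワポ|powerpoint|pptx|\bslides?\b", re.IGNORECASE)),
    ("docx", re.compile(r"word doc|docx|ワード", re.IGNORECASE)),
    ("html", re.compile(r"\bhtml\b", re.IGNORECASE)),
]


def detect_format(request, default="md"):
    """Return the first output format hinted at in the request text."""
    for fmt, pattern in FORMAT_HINTS:
        if pattern.search(request):
            return fmt
    return default


for text in ("パワポで", "as a Word doc", "HTML please", "summarize this"):
    print(text, "->", detect_format(text))
```

Keyword matching is cheap and predictable, but it misses paraphrases, so a production version would likely fall back to a model when no pattern matches.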

Can I plug in a connector to a system you don't ship?

Yes. The Connector protocol is two methods (pull / push) and ~50 lines. Per-user OAuth, ACL enforcement, and audit logging plug in for free. End-to-end Notion example: CUSTOM_CONNECTORS.md.

Is Gemma supported?

Yes. gemma / gemma-2b / gemma-9b / gemma-27b via local Ollama; gemma-cloud via Google Vertex AI. PRAXIA_LOCAL_MODEL=gemma makes auto_detect() fall back to Gemma instead of Qwen-local when no cloud key is set.

How are OAuth tokens protected at rest?

Envelope encryption: a fresh 256-bit DEK per token, AES-GCM payload encryption, and the DEK wrapped by a configurable KmsAdapter. 5 adapters ship: local (HKDF, dev), aws (AWS KMS CMK), azure (Key Vault Keys), gcp (Cloud KMS), vault (HashiCorp Transit). Master key never leaves the KMS / HSM. Switch with PRAXIA_KMS_ADAPTER=aws.

Can I A/B test a new prompt before rolling it out?

Yes — built into the OSS. Define an experiment with control / treatment variants, set traffic split, restrict the audience (roles / users / time window). Each user's assignment is deterministic (SHA-256 hash) so they always see the same variant during the experiment. Outcomes auto-track via the existing record_outcome() API. praxia experiment results <id> shows tentative winner.
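
Deterministic assignment by hash can be sketched with the standard library. The SHA-256 choice comes from the text above; the key format (`experiment_id:user_id`), the 10,000-bucket resolution, and the function name are assumptions.

```python
import hashlib

BUCKETS = 10_000  # hypothetical resolution: 0.01% steps of traffic


def assign_variant(experiment_id, user_id, treatment_share=0.5):
    """Map a user to 'treatment' or 'control', stably for a given experiment."""
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an integer and fold it into a fixed bucket range.
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return "treatment" if bucket < round(treatment_share * BUCKETS) else "control"


print(assign_variant("prompt-v2", "alice"))
print(assign_variant("prompt-v2", "alice"))  # same user, same variant
```

Including the experiment id in the hashed key means a user's bucket in one experiment does not predict their bucket in another.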

Can I use Praxia from Claude Desktop / Cursor over the network?

Yes. Praxia is an MCP server in two flavors. Local (recommended for desktop): run praxia mcp serve and configure Claude Desktop's claude_desktop_config.json (or Cursor's mcp.json) to spawn it via stdio. Remote (multi-host / team): run praxia serve, which exposes the MCP HTTP+SSE endpoints under /api/v1/mcp. Auth via API key, JWT, or a shared X-MCP-Token. Every business skill, multi-agent flow, and memory search becomes an MCP tool automatically, with no per-tool wiring required.

How do I prevent prompt changes from silently degrading quality?

Run tests/llm_eval/ in CI. Each PR runs 6 canonical cases (one per business skill) against the configured LLM and grades output with rubrics (keyword / structure / length / must-not-contain / LLM-as-judge). Scores below the committed baseline minus 5pt fail the build. Update the baseline with --update-baselines after a known-good change.
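
The baseline gate described above reduces to a small comparison. In this sketch the 5-point tolerance comes from the text; the case names, the score scale, and the rule that a missing score counts as a failure are assumptions.

```python
def failing_cases(scores, baselines, tolerance=5.0):
    """Return the cases whose score fell below baseline minus tolerance."""
    failures = []
    for case, baseline in baselines.items():
        score = scores.get(case)
        # Assumption: a case with no score is treated as a failure.
        if score is None or score < baseline - tolerance:
            failures.append(case)
    return failures


# Hypothetical case names and scores on a 0-100 scale.
baselines = {"sales_summary": 82.0, "contract_review": 90.0}
scores = {"sales_summary": 78.5, "contract_review": 84.0}

failed = failing_cases(scores, baselines)
print("FAIL" if failed else "PASS", failed)
```

A CI job would exit non-zero when the returned list is non-empty, which is what fails the build.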

Editions

Self-host today, or join the hosted alpha.

Praxia is fully open source under Apache 2.0 — every feature (SSO, RBAC, ACL, audit, OAuth, all skills, all connectors, AutonomousAgent) is in the OSS package. The hosted edition is invitation-only alpha while we tune onboarding; commercial pricing will be set at v1.0.

Hosted (alpha)

Invitation-only

pricing TBD at v1.0 · waitlist open

  • Same OSS framework — we run it for you
  • Hosted infrastructure + automatic upgrades
  • Onboarding session + best-practice templates
  • Selected pilots run free during alpha
  • Custom deployment (on-prem / VPC / air-gapped) negotiable per engagement

Alpha status: hosted backend is being stabilized. We onboard waitlist organizations in batches of ~10 as capacity allows. Need on-prem or compliance review? Mention it in the waitlist form and we'll coordinate.

Join waitlist

Looking for OSS license interpretation, embedded use, or revenue share? See LICENSE and NOTICE.

Ready to make your team's tacit knowledge unforgettable?

Star us on GitHub, run the quickstart, or reach out for a tailored PoC.