🧠
Personal → org memory loop
Senior staff's "magic prompts" auto-promote into shared knowledge via three independent paths: frequency, outcome correlation, and LLM self-eval.
🔄
3-path promotion engine
Frequency-based, outcome-correlated, LLM-scored. The three paths run in parallel, so promotion never depends on a single signal. Configurable thresholds for auto-promote vs review.
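As a rough illustration of the threshold idea, here is a minimal sketch. It assumes each path yields a score in [0, 1] and that the strongest path decides the outcome. The names `PromotionThresholds` and `decide`, the threshold values, and the max-based combination are all hypothetical, not Praxia's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionThresholds:
    auto_promote: float = 0.8  # hypothetical default
    review: float = 0.5        # hypothetical default


def decide(scores: dict[str, float],
           t: PromotionThresholds = PromotionThresholds()) -> str:
    """Return 'auto_promote', 'review', or 'keep_personal'.

    Assumption: any single path may trigger promotion, so the best score wins.
    """
    best = max(scores.values(), default=0.0)
    if best >= t.auto_promote:
        return "auto_promote"
    if best >= t.review:
        return "review"
    return "keep_personal"


# One strong path is enough here, even though frequency alone is weak.
decision = decide({"frequency": 0.35, "outcome": 0.9, "llm_eval": 0.6})
```

Under these assumptions, a candidate that is rarely used but strongly correlated with good outcomes can still be promoted.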
🧬
Skills also promoted
Not just memory — your personal skills get tracked, scored, and promoted to the org skill catalog when they prove themselves.
📊
Outcome tracking built-in
record_outcome() attaches success/failure to episodes. The consolidator uses these signals statistically — no separate analytics pipeline needed.
🪪
Memory mode toggle
Per-user switch: accumulate (default) or read_only. Read-only sessions silently drop writes — useful for sensitive content. Admins can lock the mode tenant-wide or by role.
🧬
Multi-LTM fusion + routing
Run several LTMs in parallel and fuse with Reciprocal Rank Fusion — or route per query (temporal → Zep, audit → JSON, entity → Mem0). English + Japanese keyword detection. Higher recall without picking a winner.
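For readers unfamiliar with Reciprocal Rank Fusion, a minimal sketch follows. Each backend returns a ranked list of IDs, and every ID scores the sum of `1 / (k + rank)` over the lists it appears in. The function name `rrf_fuse` and the sample IDs are illustrative only. `k = 60` is the constant commonly used with RRF; whether Praxia uses the same value is an assumption.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists into one using Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical results from three backends for the same query.
fused = rrf_fuse([
    ["a", "b", "c"],
    ["b", "c", "d"],
    ["b", "a"],
])
```

Because only ranks are used, the backends' raw scores never need to be on a comparable scale.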
🔌
6 LTM backends
JSON, Mem0, LangMem, Letta, Zep, HindSight — switch with one line. Plus Graph layer (optional) for relationship-heavy domains. Zero vendor lock-in.
🤖
Multi-LLM (15+ first-class · 100+ via LiteLLM)
Claude, ChatGPT, Gemini, Gemma, Qwen-API, Qwen-local (Ollama), DeepSeek, Mistral, Grok, Llama, Cohere, Perplexity, Phi + 100+ via LiteLLM. Same models on enterprise clouds: azure/* (Azure OpenAI), azure_ai/* (AI Foundry), bedrock/* (AWS Bedrock), vertex_ai/* (GCP Vertex AI). Auto-detect from env vars; switch model per-call.
🔐
Auth, RBAC, SSO, audit — in OSS
API key + JWT + OIDC (Google/MS/Okta/GitHub/Keycloak) + 4 default roles + append-only audit log. Most competitors paywall this.
🔐
User-delegated OAuth (13 providers)
Each Praxia user authorizes Box / Microsoft 365 / Dropbox / Drive / Salesforce / Notion / Atlassian / Slack / GitHub / HubSpot / Zendesk / Linear / kintone with their own credentials. The external system's native ACL is enforced per Praxia user — alice can only see what alice has access to. Self-service UI under Preferences → Service Connections.
🔑
KMS-backed token encryption
OAuth tokens use envelope encryption — fresh DEK per write, AES-GCM payload, DEK wrapped by your KMS. 5 adapters: local / aws / azure / gcp / vault. Master key never lives on the application host.
🌐
Production OAuth callback (HTTP)
praxia serve exposes /api/v1/oauth/{provider}/{start,callback,status}. Multi-worker safe state cache (TTL-pruned JSON), pinned redirect URI via PRAXIA_PUBLIC_URL, optional success-redirect to your frontend.
🛡
Resource access policies (ACL)
Glob-pattern allow / deny rules per resource type (connector, memory, prompt, skill). Built for enterprise IS departments. Every decision audit-logged.
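A minimal sketch of glob-based allow / deny evaluation, using Python's standard `fnmatch` module. The `Rule` shape, the "deny wins over allow" ordering, and the default-deny fallback are assumptions made for illustration. The source does not specify how conflicting rules are resolved.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    effect: str         # "allow" or "deny"
    resource_type: str  # "connector", "memory", "prompt", or "skill"
    pattern: str        # glob matched against the resource name


def is_allowed(rules: list[Rule], resource_type: str, name: str) -> bool:
    """Assumed semantics: any matching deny wins; no match means deny."""
    matches = [
        r for r in rules
        if r.resource_type == resource_type and fnmatchcase(name, r.pattern)
    ]
    if any(r.effect == "deny" for r in matches):
        return False
    return any(r.effect == "allow" for r in matches)


# Hypothetical policy: Box is open except for HR folders.
rules = [
    Rule("allow", "connector", "box/*"),
    Rule("deny", "connector", "box/hr-*"),
]
```

With this policy, `is_allowed(rules, "connector", "box/sales-2024")` is true, while `box/hr-payroll` and anything of another resource type are refused.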
🤖
Autonomous agent — LLM-driven tool-use loop
An LLM-driven tool-use loop over your full Praxia stack — personal memory, org memory, frozen layer, skills, connectors. The agent picks tools on its own (search → run skill → pull connector → answer) with ACL gates and audit logging. Ships as praxia.agent.AutonomousAgent, praxia agent run, and an MCP meta-tool for remote clients.
🛡️
CommandedAgent — autonomous agent with external verification
For workloads where the environment doesn't give you a free answer key — private-corpus fact QA, SOP / compliance, customer support over manuals. Wraps the autonomous agent with task-type routing + query decomposition (multi-hop) + pre-retrieval + grounding verification + no-improvement early-stop, plus an explicit abstain path when sources don't support a confident answer. Calibrated against an in-house HotpotQA / SQuAD v2 / JEMHopQA harness — decomposition added +12pt on HotpotQA-distractor 40q. Every accepted claim carries [L1#0, L3#2, …] citations; every round is in the audit log. Verifier / QueryDecomposer / TaskClassifier are all pluggable — drop in TiDB Vector / pgvector / GraphRAG / HHEM-grounding / your own without touching core.
✨
Prompt Designer — turn intent into a polished template
Describe the task in one line ("score contract risk 1-5 in JSON") → get a production-grade prompt design back: tuned system message, ${variable} user template, 2-3 few-shot examples, 5-criterion rubric. Per-LLM idioms applied automatically (Claude XML / OpenAI JSON-mode / DeepSeek-R1 reasoning / Mistral concise / Llama numbered steps).
🎨
Document Designer — code-gen pptx / docx (Claude-Skills-style)
The LLM authors python-pptx / python-docx code, a sandbox runs it (AST allowlist + 30s timeout + 512MB cap on POSIX), and you get a design-rich .pptx / .docx back — multi-column layouts, matrix slides, embedded matplotlib charts, themed branding (colors / fonts / logo / footer from .praxia/themes/). On traceback the error is fed back to the LLM and the attempt repeats up to 3 times. Themes managed in Admin → 🎨 Themes.
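To make the static-check step concrete, here is a minimal sketch built on Python's standard `ast` module. It combines an import allowlist with a small denylist of builtin names. Both sets and the function name `check_source` are hypothetical, and the timeout, memory cap, and retry loop are not shown.

```python
import ast

ALLOWED_MODULES = {"pptx", "docx", "matplotlib", "math"}          # hypothetical
BANNED_NAMES = {"eval", "exec", "open", "__import__", "compile"}  # hypothetical


def check_source(source: str) -> list[str]:
    """Return a list of violations; an empty list means the code passed."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have module=None and are rejected below.
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            if module.split(".")[0] not in ALLOWED_MODULES:
                violations.append(f"import of {module!r} is not allowed")
        if isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            violations.append(f"use of {node.id!r} is not allowed")
    return violations


ok = check_source("from pptx import Presentation\nprs = Presentation()")
bad = check_source("import os")
```

A static check like this is only a first gate. Running the code in a separate process with resource limits, as the description states, is what bounds runtime and memory.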
⚙️
Workflow-specialized flows
Sales prep, logic checking, RAG self-correction — three production-ready multi-agent pipelines that run in 5 minutes. No bespoke orchestration code required.
🎯
6 default business skills
Investment, sales, design, purchasing, patent, legal — domain-tuned agents with built-in guardrails (tax law, jurisdictional caveats, hallucination guards).
🔬
MCP / Claude Skills compatible
Skills serialize to standard SKILL.md. Drop into Claude Skills, Cursor Skills, or any MCP-compatible registry without code changes.
🛡️
Evidence by default
Sentence-level hallucination detection and retrieval metrics ship as first-class modules. "It works" comes with proof attached.
🎯
LLM output quality eval (CI gate)
Catch quality regressions before merge. tests/llm_eval/ grades real LLM output against rubrics + a committed baseline. Score drop > 5pt fails the build. Per-skill cases ship for all 6 skills.
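The gate logic can be sketched as follows, assuming scores on a 0-100 scale keyed by case name. The function names, the sample scores, and the choice to treat a missing case as a score of 0 are hypothetical. The actual harness under tests/llm_eval/ may work differently.

```python
def find_regressions(baseline: dict[str, float],
                     current: dict[str, float],
                     max_drop: float = 5.0) -> dict[str, float]:
    """Return {case: drop} for every case whose score fell by more than max_drop."""
    failures = {}
    for case, base_score in baseline.items():
        # Assumption: a case missing from the current run counts as score 0.
        drop = base_score - current.get(case, 0.0)
        if drop > max_drop:
            failures[case] = drop
    return failures


def exit_code(failures: dict[str, float]) -> int:
    """Non-zero exit code fails the CI job."""
    return 1 if failures else 0


# Hypothetical scores: "legal" drops 4 points (passes), "sales" drops 6 (fails).
baseline = {"legal": 82.0, "sales": 90.0}
current = {"legal": 78.0, "sales": 84.0}
failures = find_regressions(baseline, current)
```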
⚖️
A/B experiments built in
Test prompt variants on real users with deterministic per-user assignment (SHA-256 bucket). Audience filter (roles / users / window). Outcome rollup + tentative winner detection. CLI + SDK.
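Deterministic assignment can be sketched with the standard `hashlib` module: hash the experiment and user IDs together, then map the digest onto the list of variants. The function name, the `experiment:user` key format, and the use of the first 8 digest bytes are assumptions for illustration.

```python
import hashlib


def assign_variant(experiment_id: str, user_id: str, variants: list[str]) -> str:
    """Map a user to a variant; the same inputs always give the same result."""
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and bucket it by variant count.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


variant = assign_variant("prompt-tone-test", "alice", ["control", "variant_b"])
```

Including the experiment ID in the hash means a user's bucket in one experiment does not determine their bucket in another.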
🧮
Hermetic test harness — stubs & drivers for every surface
Every public surface (auth / memory / fusion / exporters / OAuth / parsers / CLI / extensions / experiments / connectors / agent) ships with backend stubs, fixture factories, and protocol-conforming drivers — so contributors can write hermetic tests without standing up real services. CI runs them on every PR.
🔗
20 storage / SaaS connectors
Box / SharePoint / Dropbox / Drive / kintone / Salesforce + Notion / Confluence / Jira / Slack / Teams / GitHub / HubSpot / Zendesk / Linear / S3 / Azure Blob / GCS / WebDAV / Email. Per-user OAuth means alice sees only what alice can access in each system.
📄
File parsers (PDF · Word · Excel · PowerPoint · CSV · HTML)
Drop a file in — auto-dispatch by extension. PDF page-by-page, Word with heading detection, Excel as Markdown tables, PowerPoint with speaker notes. Custom formats register via entry points.
🖨
Output exporters (HTML · PPTX · DOCX · MD · JSON)
Skills produce Markdown by default. OutputFormatSkill infers the requested format from natural-language hints ("パワポで" → PPTX, "as a Word doc" → DOCX). Custom formats register via entry points.
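One simple way to infer a format from such hints is keyword matching, sketched below. The `HINTS` table and `infer_format` function are hypothetical and are not the actual OutputFormatSkill implementation, which may use a different technique.

```python
# Hypothetical keyword table; Japanese and English hints side by side.
HINTS = {
    "pptx": ("パワポ", "powerpoint", "slides"),
    "docx": ("ワード", "word doc", "docx"),
}


def infer_format(request: str, default: str = "md") -> str:
    """Return the first format whose keyword appears in the request."""
    text = request.lower()
    for fmt, keywords in HINTS.items():
        if any(keyword in text for keyword in keywords):
            return fmt
    return default  # Markdown when no hint is found


fmt = infer_format("Send it as a Word doc, please")
```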
🎙
Voice input + voice output
Speech-to-text (Whisper) and text-to-speech (OpenAI TTS / ElevenLabs / Piper). Embedded in the Streamlit UI as record-and-go input and read-aloud output.
👥
Full admin user CRUD
Create / update / delete / deactivate / rotate keys / change roles — all via CLI, UI, or SDK. All operations audited.
🛡
Admin-controlled LTM policy
Pin which backend(s) users may pick and what the default mode is, at the tenant level. Resolution: admin enforced > call-site > user pref > admin default.
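The resolution order above amounts to "first configured value wins", which can be sketched as follows. The function name and the use of `None` for "not set" are assumptions for illustration.

```python
from typing import Optional


def resolve_backend(admin_enforced: Optional[str],
                    call_site: Optional[str],
                    user_pref: Optional[str],
                    admin_default: Optional[str]) -> str:
    """Apply the order: admin enforced > call-site > user pref > admin default."""
    for candidate in (admin_enforced, call_site, user_pref, admin_default):
        if candidate is not None:
            return candidate
    raise ValueError("no LTM backend configured")


# No enforced value, so the call-site choice beats the user preference.
backend = resolve_backend(None, "zep", "mem0", "json")
```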
💾
Admin data exports
CSV / JSON / JSONL exports of audit log, users, usage, memory, policies — for compliance, SIEM, backups. Each export action self-audited.
📊
Personal & org dashboards
Flow / skill counts, success rate, top users, promoted blocks, frozen files, distributed skills — out of the box, with no separate analytics pipeline.
📝
Custom prompt distribution
Users save personal prompts. Admins promote them to org or push to specific roles / users. Three scopes with merge precedence.
🪐
MCP server (stdio + remote HTTP/SSE)
Use Praxia from Claude Desktop / Cursor / Continue.dev. Local: praxia mcp serve. Remote (multi-host): praxia serve exposes /api/v1/mcp with auth + audit log. Every skill + flow becomes an MCP tool automatically.
🌐
Backend-only or full-stack
Use Praxia as a brain behind your own frontend (SDK embed or praxia serve FastAPI HTTP API), or run the bundled Streamlit UI for the fastest path. Same auth, memory, skills.
📜
Apache 2.0 + Open Core ready
Permissive license, commercial-friendly. NOTICE.md inventories every dependency's license. An Open Core path for enterprise extras is planned.
📱
Mobile-responsive UI + landing
The landing page has chip-style nav on phones, scrollable tabs, and ≥44px touch targets, and it respects prefers-reduced-motion. The Streamlit UI injects responsive CSS + a "Compact mode" toggle for slow connections.