gleann REST API Reference

gleann exposes a full OpenAPI-compatible REST API for programmatic index management, semantic search, RAG queries, and memory operations.

Base URL

Default: http://localhost:8080. Configure it via server_addr in config.json or the GLEANN_SERVER_ADDR environment variable.

Index Operations

POST /api/v1/indexes

Create a new vector index from a list of text passages.

{
  "name": "my-docs",
  "texts": ["passage one", "passage two"],
  "metadata": [{"source": "doc.md", "section": "intro"}]
}
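
A minimal client-side sketch of the request above, using only the Python standard library. The helper name build_create_index_request is hypothetical, and the code assumes the server listens on the default base URL and accepts a JSON body with a Content-Type: application/json header. It only builds the request; sending it is shown in the trailing comment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default from the Base URL section


def build_create_index_request(name, texts, metadata=None, base_url=BASE_URL):
    """Build (but do not send) a POST /api/v1/indexes request.

    Hypothetical helper: the field names mirror the example body above.
    """
    payload = {"name": name, "texts": list(texts)}
    if metadata is not None:
        payload["metadata"] = list(metadata)
    return urllib.request.Request(
        url=f"{base_url}/api/v1/indexes",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed to be required
        method="POST",
    )


create_req = build_create_index_request(
    "my-docs",
    ["passage one", "passage two"],
    metadata=[{"source": "doc.md", "section": "intro"}],
)

# To send it against a running server:
#   with urllib.request.urlopen(create_req) as resp:
#       print(resp.status, resp.read().decode("utf-8"))
```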

GET /api/v1/indexes

List all available indexes with metadata including passage count, embedding model, and backend.

DELETE /api/v1/indexes/{name}

Remove an index and all associated embeddings. This operation is irreversible.
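
The list and delete endpoints take no request body, so a client only needs the right method and path. The sketch below uses hypothetical helper names and percent-encodes the index name before placing it in the path. Whether the server restricts index names to URL-safe characters is not stated here, so the encoding is a defensive assumption.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # default from the Base URL section


def build_list_indexes_request(base_url=BASE_URL):
    """Build a GET /api/v1/indexes request (hypothetical helper)."""
    return urllib.request.Request(f"{base_url}/api/v1/indexes", method="GET")


def build_delete_index_request(name, base_url=BASE_URL):
    """Build a DELETE /api/v1/indexes/{name} request (hypothetical helper)."""
    # Percent-encode the name so spaces or slashes cannot change the path.
    safe_name = urllib.parse.quote(name, safe="")
    return urllib.request.Request(
        f"{base_url}/api/v1/indexes/{safe_name}", method="DELETE"
    )


list_req = build_list_indexes_request()
delete_req = build_delete_index_request("my-docs")

# Deletion is irreversible, so confirm the name before sending:
#   with urllib.request.urlopen(delete_req) as resp:
#       print(resp.status)
```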

Search

POST /api/v1/indexes/{name}/search

Semantic vector search with optional BM25 hybrid reranking.

{
  "query": "Byzantine fault tolerant consensus",
  "top_k": 10,
  "hybrid_alpha": 0.7,
  "filters": [{"field": "source", "op": "contains", "value": "protocol"}]
}
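
A sketch of building the search request from Python. The helper name is hypothetical, and the code assumes hybrid_alpha and filters may be omitted when hybrid reranking or filtering is not wanted. The exact meaning of the hybrid_alpha weight is not defined in this section, so the value is passed through unchanged.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # default from the Base URL section


def build_search_request(index, query, top_k=10, hybrid_alpha=None,
                         filters=None, base_url=BASE_URL):
    """Build a POST /api/v1/indexes/{name}/search request (hypothetical helper)."""
    payload = {"query": query, "top_k": top_k}
    # Assumption: optional fields can simply be left out of the body.
    if hybrid_alpha is not None:
        payload["hybrid_alpha"] = hybrid_alpha
    if filters:
        payload["filters"] = list(filters)
    safe_index = urllib.parse.quote(index, safe="")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/indexes/{safe_index}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


search_req = build_search_request(
    "my-docs",
    "Byzantine fault tolerant consensus",
    top_k=10,
    hybrid_alpha=0.7,
    filters=[{"field": "source", "op": "contains", "value": "protocol"}],
)
```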

Memory API

POST /api/v1/memory/ingest

Store facts, summaries, or any other text content in the hierarchical memory system.

{
  "content": "The user prefers concise technical answers with code examples",
  "tier": "long",
  "tags": ["preference", "style"],
  "project": "my-project"
}
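
The ingest call can be sketched in the same way. The helper name is hypothetical, and the code assumes tags and project are optional. The only tier value shown in this section is "long", so the sketch does not validate the tier.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default from the Base URL section


def build_memory_ingest_request(content, tier, tags=None, project=None,
                                base_url=BASE_URL):
    """Build a POST /api/v1/memory/ingest request (hypothetical helper)."""
    payload = {"content": content, "tier": tier}
    # Assumption: tags and project may be omitted.
    if tags:
        payload["tags"] = list(tags)
    if project is not None:
        payload["project"] = project
    return urllib.request.Request(
        url=f"{base_url}/api/v1/memory/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


ingest_req = build_memory_ingest_request(
    "The user prefers concise technical answers with code examples",
    tier="long",
    tags=["preference", "style"],
    project="my-project",
)
```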

POST /api/v1/memory/recall

Retrieve relevant memories using semantic similarity and tier/tag filters.

OpenAI-Compatible Proxy

gleann implements the OpenAI Chat Completions API, allowing any OpenAI-compatible client to use gleann indexes as knowledge sources by setting model to gleann/{index-name}.

POST /v1/chat/completions

{
  "model": "gleann/my-docs",
  "messages": [{"role": "user", "content": "What is the token bucket algorithm?"}]
}
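
A sketch of calling the proxy without an OpenAI SDK. The helper names are hypothetical. first_answer assumes the response follows the standard Chat Completions shape (choices[0].message.content), which is implied by the compatibility claim above but not spelled out in this section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default from the Base URL section


def build_chat_request(index_name, question, base_url=BASE_URL):
    """Build a POST /v1/chat/completions request (hypothetical helper)."""
    payload = {
        # The model field selects the index used as the knowledge source.
        "model": f"gleann/{index_name}",
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_answer(response_body):
    """Return the first choice's text, assuming the OpenAI response shape."""
    return response_body["choices"][0]["message"]["content"]


chat_req = build_chat_request("my-docs", "What is the token bucket algorithm?")

# To send it and read the answer from a running server:
#   with urllib.request.urlopen(chat_req) as resp:
#       print(first_answer(json.load(resp)))
```

Clients built on an OpenAI SDK would instead point their base URL setting at the gleann server and pass the same model string.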

Authentication

Configure an API key via the GLEANN_API_KEY environment variable. When it is set, every API request must include the header Authorization: Bearer {key}.
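
A small sketch of attaching the header on the client side. The helper name is hypothetical. Reading the key from the same GLEANN_API_KEY variable on the client is a convenience assumption, not something the server requires.

```python
import os
import urllib.request

BASE_URL = "http://localhost:8080"  # default from the Base URL section


def auth_headers(env=None):
    """Return request headers, adding a Bearer token when a key is configured."""
    env = os.environ if env is None else env
    headers = {"Content-Type": "application/json"}
    key = env.get("GLEANN_API_KEY", "")
    if key:
        headers["Authorization"] = f"Bearer {key}"
    return headers


# Example: an authenticated request that lists indexes.
authed_req = urllib.request.Request(
    f"{BASE_URL}/api/v1/indexes",
    headers=auth_headers({"GLEANN_API_KEY": "example-key"}),  # placeholder key
    method="GET",
)
```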