Capability tags

🤖 Agent 🔄 Workflow 🐳 Docker 💻 CLI 🔗 REST API 📚 RAG 🧠 Claude ✨ GPT 🖥 Local LLM

🛠 AI tool

Log Analysis Tool

Built with Python · Free and open source, deployed locally, with your data fully under your own control

Project name: logatory

⭐ 10 Stars 💻 Python 📄 Apache-2.0 🏷 AI score 7.5

Tags: anomaly-detection, cli, devops, fastapi, llm, python

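✦ AI Skill Hub recommendation

After a curated evaluation, AI Skill Hub rates the Log Analysis Tool as "Recommended". It scores well on feature completeness, community activity, and ease of use, with an AI score of 7.5, and it suits users with some technical background.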

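📚 In-depth analysis

The Log Analysis Tool (logatory) is an open-source, Python-based tool with 10 stars on GitHub, covering anomaly detection, CLI, DevOps, and FastAPI. The main advantage of an open-source tool is that its code is fully transparent: you can audit every line for security, and you can adapt or extend it to your own needs.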

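**Why use an open-source tool instead of commercial SaaS?**
For individual developers and users with privacy requirements, a locally deployed open-source tool keeps data on the local machine and outside third-party providers' data policies. Open-source tools also usually have no usage caps or monthly fees: you install once and use it long-term, so for high-frequency use the total cost of ownership (TCO) is typically well below that of subscription-based commercial tools.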

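**Installation and environment setup**
The Log Analysis Tool requires a Python runtime (the README specifies Python 3.11+). Managing Python versions with pyenv is recommended to avoid polluting the global environment. New users should first create a virtual environment (python -m venv venv && source venv/bin/activate) and then install dependencies. If something goes wrong, the virtual environment can be deleted and recreated without affecting the rest of the system.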

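**Community and maintenance**
GitHub Issues and Discussions are the fastest way to get help. Before asking, check the closed issues, since most common problems have already been answered. When reporting a bug, include the output of pip list, the full error stack trace, and a minimal reproducible example, which noticeably speeds up maintainer response. AI Skill Hub will keep tracking the Log Analysis Tool's releases and report important feature changes.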

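📋 Tool overview

Local log analysis with PII redaction, threat detection, and anomaly detection

The Log Analysis Tool is an open-source tool written in Python, focused on anomaly detection, a CLI, and DevOps use. As an open-source project on GitHub, it is maintained in the open, its code is transparent and auditable, and it supports local deployment to protect data privacy. It suits both individual use and integration into enterprise workflows.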

Field	Value
GitHub Stars	⭐ 10
Language	Python
Supported platforms	Windows / macOS / Linux
Maintenance status	Lightweight project, updated as needed
License	Apache-2.0
Overall AI score	7.5
Tool type	AI tool
Forks	n/a

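📖 Documentation

The following content was compiled automatically by AI Skill Hub from project information. For the complete original documentation, see the "Original source" link at the bottom.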

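📌 Key features

Free and open source, with local deployment and full control over your data
An open-source community on GitHub with ongoing updates
Detailed documentation and usage examples, beginner-friendly
Custom configuration to fit different environments
Can be integrated into an existing stack as a base component or extended through custom development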

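🎯 Main use cases

Run locally to protect data privacy and meet compliance requirements
Integrate into existing systems to extend your stack
Use as an open-source base component for commercial derivative development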

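The install commands below were auto-generated from the project's language and type; defer to the official README where they differ.

Install commands

# Option 1: install with pip (recommended)
pip install logatory

# Option 2: install into a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install logatory

# Option 3: install from source (for the latest features)
git clone https://github.com/T0nd3/logatory
cd logatory
pip install -e .

# Verify the installation
python -c "import logatory; print('installed successfully')"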

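📋 Installation steps

Visit the GitHub repository page
Install dependencies as described in the README
Complete the initial configuration for your system
Start with the official examples or documentation
If you run into problems, search GitHub Issues for answers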

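The usage examples below were compiled by AI Skill Hub and cover the most common scenarios.

Common commands / code examples

# Command-line help
logatory --help

# Basic usage: scan a log file (subcommand and flag as shown in the README's Docker example)
logatory scan /var/log/syslog --track-errors

# Calling from Python
# The README documents a CLI rather than a Python API, so this sketch shells out to it.
import subprocess

result = subprocess.run(
    ["logatory", "scan", "/var/log/syslog"],
    capture_output=True,
    text=True,
)
print(result.stdout)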

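The configuration example below is based on typical use; adjust the parameters according to the official documentation.

Configuration example

# logatory config file example (config.yaml), using keys that appear in the README
llm:
  provider: openai
  model: gpt-4o-mini
plugins_dir: plugins/
api_token: ""  # prefer the LOGATORY_API_TOKEN environment variable

# Point a command at a specific config file
logatory doctor --config config.yaml

# Or configure via environment variables
export LOGATORY_API_TOKEN="your-token"
export LOGATORY_PII_SALT="a-long-random-string"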

📑 README deep dive · Documentation completeness: 95/100

The following content was parsed directly from the GitHub README, preserving code blocks, tables, and list structure.

Logatory

Local log analysis with PII redaction, rule-based threat detection, anomaly detection, LLM-powered insights, and a web dashboard — all running on your machine, no data leaves your infrastructure by default.

[Screenshot: Logatory web dashboard]

It also runs entirely in the terminal, where the log format is auto-detected, PII is redacted, and threats are flagged.

Features

Capability	Details
Format support	Syslog, Nginx/Apache access, Apache error, HAProxy, Traefik, JSON Lines, logfmt, CEF, LEEF, plaintext — auto-detected; reads plain, gzip, and `.xlsx` files
PII redaction	Emails, IPv4/IPv6, credit cards (Luhn-checked), IBANs, German phone numbers — deterministic pseudonymisation or masking
Rule engine	YAML-based rules with `eq`, `ne`, `contains`, `startswith`, `endswith`, `re`, `gt`, `lt`, `gte`, `lte` operators; multi-field AND/OR
Sigma support	Convert Sigma rules to native format
Anomaly detection	Statistical Z-score baseline over 60-second buckets, trains automatically from historical logs
LLM integration	Ollama (default), Claude, OpenAI-compatible APIs; explain findings, summarize errors, RAG Q&A
Web dashboard	FastAPI + HTMX; findings/errors table, trend chart (ECharts), inline LLM explain, log file upload
Log upload	Drag-and-drop log upload in the browser — instant scan with PII redaction, results shown inline
REST API v1	Bearer-token auth, JSON endpoints for findings, errors, stats, live event ingestion
OpenSearch	Query and analyse logs from OpenSearch / Elasticsearch clusters
systemd journal	Read logs straight from journald via `journalctl` — scan history or follow live
Docker logs	Read container logs straight from the Docker daemon — scan or follow, no log stack required
Kubernetes	Read pod logs through `kubectl` — by namespace, label selector or pod; scan or follow, no log stack required
Windows Event Log	Analyze a JSON event export anywhere (even on Linux), or read a live log on Windows via `Get-WinEvent`
S3 / object storage	Read log objects straight from a bucket via the `aws` CLI — AWS S3 or any S3-compatible store; gzip decompressed on the fly
Syslog listener	Bind UDP/TCP 514 and receive syslog (RFC 3164 / RFC 5424) from network devices, firewalls and appliances
AWS CloudWatch	Pull events from a CloudWatch log group via the `aws` CLI — no boto3; scan or follow live
GCP Cloud Logging	Read entries via the `gcloud` CLI — no google-cloud dependency; scan or follow live with native severities
Remote over SSH	Pull logs from any SSH-reachable host — no agent on the remote box; scan or follow live with auto-reconnect
Grafana Loki	Query a Loki instance with LogQL — scan or follow live
Graylog	Query a Graylog server via its search API — scan or follow live
Fleet	Declare many log sources in one file — scan, follow, and manage a whole fleet at once
Finding persistence	SQLite store for HIGH/CRITICAL findings with retention, dedup, severity filtering
FP suppression	Dismiss rules globally or per source file; reversible
Markdown export	Automated security reports from the SQLite database
Plugin system	Drop Python files into a directory to add custom rules, PII patterns, parsers and source adapters
Docker	Multi-stage image, non-root user, `/data` volume — production-ready
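The Anomaly detection row above describes a statistical Z-score baseline over 60-second buckets. As a rough illustration of that general technique (not Logatory's actual implementation; the function names and the threshold of 3.0 are assumptions made for this sketch), the code below counts events per 60-second bucket and flags a bucket whose count deviates strongly from historical counts.

```python
from collections import Counter
from statistics import mean, pstdev

BUCKET_SECONDS = 60  # bucket width taken from the feature table


def bucket_counts(timestamps: list[float]) -> dict[int, int]:
    """Count events per 60-second bucket, keyed by the bucket's start time."""
    counts = Counter(
        int(ts // BUCKET_SECONDS) * BUCKET_SECONDS for ts in timestamps
    )
    return dict(counts)


def z_score(count: int, baseline: list[int]) -> float:
    """Z-score of one bucket count against historical bucket counts."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A flat baseline has no spread: any deviation is infinitely unusual.
        return 0.0 if count == mu else float("inf")
    return (count - mu) / sigma


def is_anomalous(count: int, baseline: list[int], threshold: float = 3.0) -> bool:
    """Flag a bucket whose count is at least `threshold` deviations from the mean."""
    return abs(z_score(count, baseline)) >= threshold


history = [10, 12, 11, 9, 10, 8]   # events per minute in past buckets
print(is_anomalous(11, history))   # a typical minute
print(is_anomalous(60, history))   # a sudden burst
```

A real baseline would usually be trained per source or per event type, since a count that is normal for an access log may be extreme for an auth log.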

---

Installation

Requirements: Python 3.11+

# Install (core only, no external dependencies beyond PyYAML and typer)
pip install logatory

# Build the vector index before using llm ask (requires pip install 'logatory[embed]')
logatory llm index

Docker container logs

No log aggregation stack (ELK, Loki, Graylog) required — if your services run in Docker, Logatory reads their logs straight from the daemon. Install the optional dependency and use the native docker command:

```bash
pip install 'logatory[docker]'
```

For the default Ollama LLM provider, install and start Ollama (https://ollama.ai), then pull a model:

ollama pull gemma3:4b

Docker

Environment variables for Docker

```bash
# docker-compose.yml (or .env file)
LOGATORY_API_TOKEN=change-me-in-production
LOGATORY_PII_SALT=a-long-random-string
```

Build and run manually

docker build -t logatory .

docker run -d \
  -p 8080:8080 \
  -v logatory-data:/data \
  -e LOGATORY_API_TOKEN=mytoken \
  -e LOGATORY_PII_SALT=mysalt \
  logatory

The container runs as a non-root user (logatory, UID 1001). The database and config are stored in /data.

Scanning log files inside Docker

Mount the host log directory and run a one-shot scan:

docker run --rm \
  -v /var/log:/logs:ro \
  -v logatory-data:/data \
  logatory \
  logatory scan /logs/syslog --track-errors

Quick Start

```bash
docker compose up -d
```

The stack starts Logatory on port 8080 with a named volume for the SQLite database.

demo

Interactive demo and database seeding using synthetic data — no real log files, Ollama, or database required for demo run.

logatory demo [run|seed|clear]

`demo run`

Guided CLI walkthrough of all 7 feature sections (log parsing, PII, rules, error tracking, findings, anomaly detection, LLM):

logatory demo run           # pause after each section
logatory demo run --no-pause  # print everything at once

`demo seed`

Populate the SQLite database with synthetic findings and errors so the web dashboard has something to display immediately. Inserts 25 findings spread over 14 days (for the trend chart) and 5 error groups. All records are tagged internally and never mixed with real data.

logatory demo seed

`demo clear`

Remove every record written by demo seed. Real findings and errors are never touched.

logatory demo clear

---

Demo data for the web dashboard

Seed the database with synthetic findings and errors so the dashboard shows data immediately:

```bash
# Seed the database with demo data
docker compose exec logatory logatory demo seed

# Remove all demo data (real data is untouched)
docker compose exec logatory logatory demo clear
```

Alternatively, upload a real log file via the browser at http://localhost:8080/upload for an instant, transient scan.

---

Optional feature sets

pip install 'logatory[web]'         # web dashboard + REST API (FastAPI, uvicorn, Jinja2)
pip install 'logatory[docker]'      # read logs from local Docker containers
pip install 'logatory[opensearch]'  # OpenSearch / Elasticsearch integration
pip install 'logatory[xlsx]'        # read .xlsx spreadsheet log exports
pip install 'logatory[claude]'      # Anthropic Claude API
pip install 'logatory[embed]'       # ChromaDB for RAG (llm ask command)

Install everything:

pip install 'logatory[web,docker,opensearch,xlsx,claude,embed]'

List the configured targets; --check probes each for reachability

logatory fleet list --check

Configuration

Run logatory init to generate a config.yaml (with a freshly generated PII salt), or copy config.yaml.example and adapt. When no --config is passed, Logatory looks for a config in $LOGATORY_CONFIG, ./config.yaml, then ~/.config/logatory/config.yaml.

logatory init                              # write ./config.yaml
logatory init --minimal                    # smaller starter config
logatory init -o ~/.config/logatory/config.yaml

Run logatory doctor to verify your setup — it checks that the config loads, a PII salt is set, the database directory is writable, the LLM provider is configured/reachable (and that cloud API keys are present), plugins load, and alert channels build. It exits non-zero on hard failures, so it works in CI too.

logatory doctor
logatory doctor --config ~/.config/logatory/config.yaml
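Because `logatory doctor` exits non-zero on hard failures, a CI step can gate on its exit code. The sketch below shows one way to do that from Python; `doctor_command` and `run_check` are hypothetical helper names, and the code assumes `logatory` is installed and on PATH when the check is actually run.

```python
import subprocess
import sys
from typing import Optional


def doctor_command(config: Optional[str] = None) -> list[str]:
    """Build the `logatory doctor` command line, optionally with --config."""
    command = ["logatory", "doctor"]
    if config:
        command += ["--config", config]
    return command


def run_check(command: list[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface the tool's own output so the CI log shows what failed.
        print(completed.stdout, completed.stderr, sep="\n", file=sys.stderr)
    return completed.returncode == 0
```

A pipeline script could then call `sys.exit(0 if run_check(doctor_command()) else 1)` before starting the scan or server step.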

```yaml
# Custom PII patterns file (optional)
pii_rules_path: pii_rules.yaml

# Prefer env var LOGATORY_PII_SALT over storing here
pii_salt: ""

# Prefer env var LOGATORY_API_TOKEN
api_token: ""
```

Environment variables

Variable	Description
`LOGATORY_PII_SALT`	Salt for PII pseudonymisation
`LOGATORY_API_TOKEN`	Bearer token for REST API auth
`ANTHROPIC_API_KEY`	API key when `llm.provider: claude`
`OPENAI_API_KEY`	API key when `llm.provider: openai`
`GROQ_API_KEY`	API key when `llm.provider: groq`
`MISTRAL_API_KEY`	API key when `llm.provider: mistral`
`OPENSEARCH_USERNAME`	OpenSearch basic auth username
`OPENSEARCH_PASSWORD`	OpenSearch basic auth password
`OPENSEARCH_API_KEY`	OpenSearch API key (`id:base64key`)
`OPENSEARCH_CLIENT_CERT`	Path to client certificate
`OPENSEARCH_CLIENT_KEY`	Path to client private key
`OPENSEARCH_CA_CERTS`	Path to CA certificate bundle
`LOGATORY_CONFIG`	Config file path used by `logatory serve --reload`
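`LOGATORY_PII_SALT` salts the pseudonymisation that the feature table describes as deterministic. One common way to get deterministic, salted pseudonyms is a keyed hash such as HMAC-SHA256; the sketch below illustrates that idea only. It is an assumption about the general technique, not Logatory's actual algorithm, and the `email_<hex>` token format, the email regex, and the function names are all hypothetical.

```python
import hashlib
import hmac
import re

# Deliberately simple email pattern for illustration; real-world matching is stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymise(value: str, salt: str, prefix: str) -> str:
    """Map a value to a stable token: same value and salt give the same token."""
    digest = hmac.new(salt.encode(), value.encode(), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


def redact_emails(line: str, salt: str) -> str:
    """Replace every email address in a log line with its pseudonym."""
    return EMAIL_RE.sub(lambda m: pseudonymise(m.group(0), salt, "email"), line)


line = "login failed for alice@example.com"
print(redact_emails(line, salt="a-long-random-string"))
```

With a fixed salt, the same address always maps to the same token, so events can still be correlated across log lines without exposing the address. Changing the salt changes every token, which is why the salt should stay stable and secret.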

---

```bash
# Default config already points to http://localhost:11434
logatory llm info
```

```yaml
# config.yaml
llm:
  provider: claude
  model: claude-3-5-haiku-20241022
```

```bash
export ANTHROPIC_API_KEY=sk-ant-...
logatory llm info
```

CLI Reference

All commands accept --config/-c <path> to specify a config file. Without it, Logatory auto-discovers one from (in order) $LOGATORY_CONFIG, ./config.yaml, then ~/.config/logatory/config.yaml; if none exist, built-in defaults are used.
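The discovery order described above can be sketched as a small lookup function. This is an illustration of the documented order, not Logatory's code: `discover_config` is a hypothetical name, and skipping a `$LOGATORY_CONFIG` path that does not exist is an assumption (the real tool may report an error instead).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def discover_config(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the config path to use, or None to fall back to built-in defaults."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    # --config/-c always wins.
    if explicit:
        return Path(explicit)

    # Otherwise try the documented locations in order.
    candidates = []
    if env.get("LOGATORY_CONFIG"):
        candidates.append(Path(env["LOGATORY_CONFIG"]))
    candidates.append(cwd / "config.yaml")
    candidates.append(home / ".config" / "logatory" / "config.yaml")

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


print(discover_config(explicit="custom.yaml"))
```

Passing `env`, `cwd`, and `home` as parameters keeps the lookup easy to test without touching the real environment.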

---


OpenAI-compatible APIs

llm:
  provider: openai
  model: gpt-4o-mini
  endpoint: https://api.openai.com/v1

export OPENAI_API_KEY=sk-...

Web Dashboard & REST API

Start the server (requires pip install 'logatory[web]'):

logatory serve --port 8080

REST API v1

Base path: /api/v1/
Interactive docs: /api/docs

Method	Path	Description
`GET`	`/api/v1/health`	Liveness probe (no auth)
`GET`	`/api/v1/findings`	List findings (`?severity=high&since_hours=24&source=nginx.log`)
`GET`	`/api/v1/findings/{id}`	Get finding by ID
`GET`	`/api/v1/errors`	List error groups (`?sort=count`)
`GET`	`/api/v1/errors/{fingerprint}`	Get error group + recent occurrences
`GET`	`/api/v1/stats`	Aggregate counts
`POST`	`/api/v1/events`	Ingest a raw log line → returns triggered findings
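The query parameters shown for `/api/v1/findings` can be assembled programmatically. The sketch below builds such a URL with the standard library; `findings_url` is a hypothetical helper, and only the three parameters that appear in the table (`severity`, `since_hours`, `source`) are used.

```python
from typing import Optional
from urllib.parse import urlencode


def findings_url(
    base_url: str,
    severity: Optional[str] = None,
    since_hours: Optional[int] = None,
    source: Optional[str] = None,
) -> str:
    """Build a /api/v1/findings URL, including only the filters that are set."""
    params = {}
    if severity is not None:
        params["severity"] = severity
    if since_hours is not None:
        params["since_hours"] = since_hours
    if source is not None:
        params["source"] = source

    url = base_url.rstrip("/") + "/api/v1/findings"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(findings_url("http://localhost:8080", severity="high", since_hours=24))
```

`urlencode` takes care of escaping, so a source name containing spaces or slashes still produces a valid query string.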

Authentication

Set api_token in config.yaml or via LOGATORY_API_TOKEN. Pass it as:

Authorization: Bearer <token>

Leave empty to disable auth (for local development or Docker with network-level access control).

Event ingestion example

curl -X POST http://localhost:8080/api/v1/events \
  -H "Authorization: Bearer mytoken" \
  -H "Content-Type: application/json" \
  -d '{"raw": "Failed password for root from 1.2.3.4 port 22", "source": "sshd"}'
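The same ingestion call can be made from Python with only the standard library. This is a minimal sketch mirroring the curl example: `build_event_request` and `send_event` are hypothetical helper names, and the response is simply parsed as JSON because its exact shape is not documented here.

```python
import json
import urllib.request


def build_event_request(
    base_url: str, token: str, raw: str, source: str
) -> urllib.request.Request:
    """Build the POST /api/v1/events request for one raw log line."""
    body = json.dumps({"raw": raw, "source": source}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # An empty token means auth is disabled, so no header is sent.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/events",
        data=body,
        headers=headers,
        method="POST",
    )


def send_event(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_event_request(
    "http://localhost:8080",
    "mytoken",
    "Failed password for root from 1.2.3.4 port 22",
    "sshd",
)
print(request.get_method(), request.full_url)
```

Calling `send_event(request)` requires a running server, for example one started with `logatory serve --port 8080`.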

Plugin System

Drop Python files into a directory and register custom rules, PII patterns, log-format parsers and source adapters. Enable in config.yaml:

# Plugin directory: all *.py files here are auto-loaded at startup
plugins_dir: plugins/

A plugin file must expose a register(registry) function:

```python
# plugins/my_plugin.py

def register(registry) -> None:
    # Custom detection rule
    registry.add_rule({
        "id": "MY_DB_LEAK",
        "title": "Database credentials exposed in log",
        "description": "Fires when a connection string appears in a log message.",
        "level": "critical",
        "detection": {
            "match": [
                {"field": "message", "op": "re", "value": r"postgresql://\S+:\S+@"},
            ]
        },
    })

    # Custom PII pattern: redacts internal employee IDs
    registry.add_pii_pattern(
        name="employee_id",
        pattern=r"\bEMP-\d{4,8}\b",
        prefix="employee",
    )

    # Load an entire directory of YAML rule files
    from pathlib import Path
    registry.add_rule_dir(Path(__file__).parent / "my_rules")

    # Custom log-format parser, auto-detected like any built-in format
    # registry.add_parser(name="myfmt", detect=looks_like_myfmt, factory=MyParser)

    # Custom source adapter, looked up by name like any built-in source
    # registry.add_adapter(name="kafka", adapter_cls=KafkaAdapter)
```

Plugin rules participate in logatory scan, logatory tail, and the web dashboard rule engine. Plugin PII patterns apply to every redaction pass; plugin parsers and adapters register into the global parser/adapter registries, so format auto-detection and source lookup pick them up everywhere. A plugin that raises an exception is logged as a warning and skipped — it never crashes the host process.
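To make the rule format more concrete, the sketch below evaluates a rule's `detection.match` list against a parsed event using the operators named in the feature table. It is a simplified illustration, not the real rule engine: treating every condition in `match` as required (AND) is an assumption, and `condition_matches` and `rule_matches` are hypothetical names.

```python
import operator
import re

# Operator names taken from the feature table; the implementations are illustrative.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "contains": lambda actual, expected: str(expected) in str(actual),
    "startswith": lambda actual, expected: str(actual).startswith(str(expected)),
    "endswith": lambda actual, expected: str(actual).endswith(str(expected)),
    "re": lambda actual, expected: re.search(expected, str(actual)) is not None,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
}


def condition_matches(event: dict, condition: dict) -> bool:
    """Check one {"field", "op", "value"} condition against an event."""
    if condition["field"] not in event:
        return False
    compare = OPERATORS[condition["op"]]
    return bool(compare(event[condition["field"]], condition["value"]))


def rule_matches(event: dict, rule: dict) -> bool:
    """Assumed semantics: every condition under detection.match must hold."""
    return all(condition_matches(event, c) for c in rule["detection"]["match"])


rule = {
    "id": "MY_DB_LEAK",
    "detection": {
        "match": [
            {"field": "message", "op": "re", "value": r"postgresql://\S+:\S+@"},
        ]
    },
}
event = {"message": "dsn=postgresql://app:s3cret@db/prod"}
print(rule_matches(event, rule))
```

Keeping the operators in a lookup table means that supporting a new operator only takes one more dictionary entry.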

A complete, runnable example covering all four contribution types lives in plugins/example_plugin.py, and the full guide is in docs/PLUGINS.md.
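A plugin's `register` function can be exercised in isolation by passing it a stand-in registry that records the calls. The sketch below is one way to do that in a unit test. `RecordingRegistry` is a hypothetical test double, not a Logatory class, and it implements only the two registry methods this trimmed plugin uses.

```python
import re


class RecordingRegistry:
    """Hypothetical stand-in that records what a plugin registers."""

    def __init__(self) -> None:
        self.rules: list[dict] = []
        self.pii_patterns: dict[str, tuple[re.Pattern, str]] = {}

    def add_rule(self, rule: dict) -> None:
        self.rules.append(rule)

    def add_pii_pattern(self, name: str, pattern: str, prefix: str) -> None:
        # Compiling here catches invalid regular expressions early.
        self.pii_patterns[name] = (re.compile(pattern), prefix)


def register(registry) -> None:
    """Trimmed copy of the plugin example: one rule and one PII pattern."""
    registry.add_rule({
        "id": "MY_DB_LEAK",
        "level": "critical",
        "detection": {
            "match": [
                {"field": "message", "op": "re", "value": r"postgresql://\S+:\S+@"},
            ]
        },
    })
    registry.add_pii_pattern(
        name="employee_id",
        pattern=r"\bEMP-\d{4,8}\b",
        prefix="employee",
    )


registry = RecordingRegistry()
register(registry)
employee_re, _prefix = registry.pii_patterns["employee_id"]
print(len(registry.rules), bool(employee_re.search("badge EMP-12345 scanned")))
```

Checking the registered patterns against a few sample strings catches regex mistakes before the plugin is dropped into `plugins_dir`.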

---