📄 工具详情 ⚙️ 安装教程 📚 使用教程

💬

Prompt模板

ds4

Q: ds4 如何安装和开始使用？

访问 ds4 的 GitHub 仓库或官方网站，按照 README 文档中的步骤安装依赖并运行。通常需要 Python 3.8+ 或 Node.js 16+ 基础环境。

Q: ds4 是否免费？许可证是什么？

ds4 完全免费，采用 MIT 许可证开源发布，任何人都可以免费使用、修改和分发。

Q: ds4 适合哪些用户使用？

ds4 主要面向有一定技术基础的用户，包括开发者、数据分析师、AI 工程师等专业人士。

Q: ds4 的社区活跃度和项目维护状况如何？

ds4 是一款社区驱动的开源项目，欢迎通过 GitHub 参与贡献和反馈。

专业级提示词模板，解锁 AI 的真实潜力

📄 MIT 🏷 AI 8.5分

8.5AI 综合评分

⬇ 下载源码 ZIP ⚙️ 配置说明

✦ AI Skill Hub 推荐

AI Skill Hub 强烈推荐：ds4 是一款优质的Prompt模板。AI 综合评分 8.5 分，在同类工具中表现稳健。如果你正在寻找可靠的Prompt模板解决方案，这是一个值得深入了解的选择。

📚 深度解析

ds4 是经过精心设计和实践验证的专业 Prompt 模板。Prompt 工程（Prompt Engineering）是充分发挥 Claude、ChatGPT 等大型语言模型潜力的关键技能，而一套经过优化的 Prompt 模板可以将 AI 输出质量提升数倍。

优质 Prompt 模板的核心价值在于其结构化设计：明确的角色设定、精确的任务描述、具体的输出格式要求和必要的边界条件，这些要素共同构成了一个能够持续产出高质量结果的 Prompt 框架。ds4 提供的模板经过反复迭代和用户验证，能够有效减少 AI 的"幻觉"（Hallucination）和输出不稳定问题。

无论你使用 Claude 3.5 Sonnet、GPT-4、Gemini 还是国内的文心一言、智谱 AI，优质的 Prompt 设计都能跨模型复用。AI Skill Hub 建议将本模板保存为个人 Prompt 库的标准组件，根据具体场景调整参数后反复使用，形成自己的 AI 提效工作流。

📋 工具概览

ds4 是经过精心设计和反复验证的专业 Prompt 模板集合。这些 Prompt 框架能够有效激活 Claude、ChatGPT 等大型语言模型的深层能力，让 AI 生成更准确、更有价值的输出结果。无需任何安装，直接复制模板内容到 AI 对话框即可使用。

GitHub Stars

—

开发语言

多语言

支持平台

Windows / macOS / Linux

维护状态

轻量级项目，按需更新

开源协议

MIT

AI 综合评分

8.5 分

工具类型

Prompt模板

Forks

—

📖 中文文档

以下内容由 AI Skill Hub 根据项目信息自动整理，如需查看完整原始文档请访问底部「原始来源」。

📌 核心特色

精心设计的 Prompt 框架，快速激活 AI 的深层能力
支持参数化替换，灵活适配多种业务场景
经过反复验证的指令结构，显著提升 AI 输出质量和一致性
适用于 Claude、ChatGPT 等主流大语言模型
可作为团队标准 Prompt 模板复用和二次开发

🎯 主要使用场景

快速生成高质量的专业文案、分析报告或结构化内容
利用 Prompt 框架引导 AI 解决特定领域的复杂问题
在不同 AI 工具间复用经过验证的提示词模板

以下安装命令基于项目开发语言和类型自动生成，实际以官方 README 为准。

安装命令

# Prompt 无需安装，直接复制使用
# 支持：Claude / ChatGPT / Gemini / 通义千问 等主流模型

# 使用步骤
# 1. 复制 Prompt 模板内容
# 2. 粘贴到 AI 对话框
# 3. 替换 [占位符] 为实际内容
# 4. 发送后获取结构化输出

# 获取原始文件
git clone https://github.com/antirez/ds4

📋 安装步骤说明

复制本工具的 Prompt 模板内容
打开 Claude、ChatGPT 或其他 AI 对话工具
将 Prompt 粘贴到对话框开头
根据实际需求替换 [占位符] 中的内容
发送后 AI 将按照模板格式执行，获得结构化输出

以下用法示例由 AI Skill Hub 整理，涵盖最常见的使用场景。

常用命令 / 代码示例

# 粘贴到 Claude/ChatGPT 使用
# 示例 Prompt 结构：

你是一位 [角色]，擅长 [领域]。
请根据以下要求完成任务：

任务背景：[描述背景]
具体要求：[详细说明]
输出格式：[期望格式]

# 将 [] 内内容替换为实际需求

以下配置示例基于典型使用场景生成，具体参数请参照官方文档调整。

配置示例

# ds4 配置说明
# 查看配置选项
ds4 --config-example > config.yml

# 常见配置项
# output_dir: ./output
# log_level: info
# workers: 4

# 环境变量（覆盖配置文件）
export DS4_CONFIG="/path/to/config.yml"

📑 README 深度解析真实文档完整度 20/100 查看 GitHub 原文 →

以下内容由系统直接从 GitHub README 解析整理，保留代码块、表格与列表结构。

DwarfStar 4

DwarfStar 4 is a small native inference engine specific for DeepSeek V4 Flash. It is intentionally narrow: not a generic GGUF runner, not a wrapper around another runtime: it is completely self-contained. Other than running the model in a correct and fast way, the project goal is to provide DS4 specific loading, prompt rendering, tool calling, KV state handling (RAM and on-disk), server API and integrated coding agent, all ready to work with coding agents or with the provided CLI interface. There are also tools for GGUF and imatrix generation, and for quality and speed testing.

We support the following backends: Metal is our primary target. Starting from MacBooks with 96GB of RAM. NVIDIA CUDA with special care for the DGX Spark. * AMD ROCm is only supported in the rocm branch. It is kept separate from main since I (antirez) don't have direct hardware access, so the community rebases the branch as needed.

This project would not exist without llama.cpp and GGML, make sure to read the acknowledgements section, a big thank you to Georgi Gerganov and all the other contributors.

Agent Client Usage

ds4-server can be used by local coding agents that speak OpenAI-compatible chat completions. Start the server first, and set the client context limit no higher than the --ctx value you started the server with:

./ds4-server --ctx 100000 --kv-disk-dir /tmp/ds4-kv --kv-disk-space-mb 8192

You can use larger context and larger cache if you wish. Full context of 1M tokens is going to use more or less 26GB of memory (compressed indexer alone will be like 22GB), so configure a context which makes sense in your system. With 128GB of RAM you would run the 2-bit quants, which are already 81GB, 26GB are going to be likely too much, so a context window of 100~300k tokens is wiser. However users reported being able to run 2bit quants with 250k ctx window in a Macs with just 96GB of system memory: make sure to kill processes that use too much memory, if you plan doing so ;)

The 384000 output limit below avoids token caps since the model is able to generate very long replies otherwise (up to 384k tokens). The server still stops when the configured context window is full.

For opencode, add a provider and agent entry to ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ds4": {
      "name": "ds4.c (local)",
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://127.0.0.1:8000/v1",
        "apiKey": "dsv4-local"
      },
      "models": {
        "deepseek-v4-flash": {
          "name": "DeepSeek V4 Flash (ds4.c local)",
          "limit": {
            "context": 100000,
            "output": 384000
          }
        }
      }
    }
  },
  "agent": {
    "ds4": {
      "description": "DeepSeek V4 Flash served by local ds4-server",
      "model": "ds4/deepseek-v4-flash",
      "temperature": 0
    }
  }
}

For Pi, add a provider to ~/.pi/agent/models.json:

{
  "providers": {
    "ds4": {
      "name": "ds4.c local",
      "baseUrl": "http://127.0.0.1:8000/v1",
      "api": "openai-completions",
      "apiKey": "dsv4-local",
      "compat": {
        "supportsStore": false,
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": true,
        "supportsUsageInStreaming": true,
        "maxTokensField": "max_tokens",
        "supportsStrictMode": false,
        "thinkingFormat": "deepseek",
        "requiresReasoningContentOnAssistantMessages": true
      },
      "models": [
        {
          "id": "deepseek-v4-flash",
          "name": "DeepSeek V4 Flash (ds4.c local)",
          "reasoning": true,
          "thinkingLevelMap": {
            "off": null,
            "minimal": "low",
            "low": "low",
            "medium": "medium",
            "high": "high",
            "xhigh": "xhigh"
          },
          "input": ["text"],
          "contextWindow": 100000,
          "maxTokens": 384000,
          "cost": {
            "input": 0,
            "output": 0,
            "cacheRead": 0,
            "cacheWrite": 0
          }
        }
      ]
    }
  }
}

Optionally make it the default Pi model in ~/.pi/agent/settings.json:

{
  "defaultProvider": "ds4",
  "defaultModel": "deepseek-v4-flash"
}

For Codex CLI, use the Responses wire API:

[model_providers.ds4]
name = "DS4"
base_url = "http://127.0.0.1:8000/v1"
wire_api = "responses"
stream_idle_timeout_ms = 1000000

Then run:

codex --model deepseek-v4-flash -c model_provider=ds4

For Claude Code, use the Anthropic-compatible endpoint. A wrapper like this matches the local ~/bin/claude-ds4 setup:

#!/bin/sh
unset ANTHROPIC_API_KEY

export ANTHROPIC_BASE_URL="${DS4_ANTHROPIC_BASE_URL:-http://127.0.0.1:8000}"
export ANTHROPIC_AUTH_TOKEN="${DS4_API_KEY:-dsv4-local}"
export ANTHROPIC_MODEL="deepseek-v4-flash"

export ANTHROPIC_CUSTOM_MODEL_OPTION="deepseek-v4-flash"
export ANTHROPIC_CUSTOM_MODEL_OPTION_NAME="DeepSeek V4 Flash local ds4"
export ANTHROPIC_CUSTOM_MODEL_OPTION_DESCRIPTION="ds4.c local GGUF"

export ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek-v4-flash"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="deepseek-v4-flash"
export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-v4-flash"
export CLAUDE_CODE_SUBAGENT_MODEL="deepseek-v4-flash"

export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
export CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK=1
export CLAUDE_STREAM_IDLE_TIMEOUT_MS=600000

exec "$HOME/.local/bin/claude" "$@"

Claude Code may send a large initial prompt, often around 25k tokens, before it starts doing useful work. Keep --kv-disk-dir enabled: after the first expensive prefill, the disk KV cache lets later continuations or restarted sessions reuse the saved prefix instead of processing the whole prompt again.

⚡ 核心功能