AI Skill Hub 强烈推荐:ds4 是一款优质的Prompt模板。AI 综合评分 8.5 分,在同类工具中表现稳健。如果你正在寻找可靠的Prompt模板解决方案,这是一个值得深入了解的选择。
ds4 是经过精心设计和反复验证的专业 Prompt 模板集合。这些 Prompt 框架能够有效激活 Claude、ChatGPT 等大型语言模型的深层能力,让 AI 生成更准确、更有价值的输出结果。无需任何安装,直接复制模板内容到 AI 对话框即可使用。
ds4 是经过精心设计和反复验证的专业 Prompt 模板集合。这些 Prompt 框架能够有效激活 Claude、ChatGPT 等大型语言模型的深层能力,让 AI 生成更准确、更有价值的输出结果。无需任何安装,直接复制模板内容到 AI 对话框即可使用。
# Prompt 无需安装,直接复制使用 # 支持:Claude / ChatGPT / Gemini / 通义千问 等主流模型 # 使用步骤 # 1. 复制 Prompt 模板内容 # 2. 粘贴到 AI 对话框 # 3. 替换 [占位符] 为实际内容 # 4. 发送后获取结构化输出 # 获取原始文件 git clone https://github.com/antirez/ds4
# 粘贴到 Claude/ChatGPT 使用 # 示例 Prompt 结构: 你是一位 [角色],擅长 [领域]。 请根据以下要求完成任务: 任务背景:[描述背景] 具体要求:[详细说明] 输出格式:[期望格式] # 将 [] 内内容替换为实际需求
# ds4 配置说明 # 查看配置选项 ds4 --config-example > config.yml # 常见配置项 # output_dir: ./output # log_level: info # workers: 4 # 环境变量(覆盖配置文件) export DS4_CONFIG="/path/to/config.yml"
DwarfStar 4 is a small native inference engine specific for DeepSeek V4 Flash. It is intentionally narrow: not a generic GGUF runner, not a wrapper around another runtime: it is completely self-contained. Other than running the model in a correct and fast way, the project goal is to provide DS4 specific loading, prompt rendering, tool calling, KV state handling (RAM and on-disk), server API and integrated coding agent, all ready to work with coding agents or with the provided CLI interface. There are also tools for GGUF and imatrix generation, and for quality and speed testing.
We support the following backends: Metal is our primary target. Starting from MacBooks with 96GB of RAM. NVIDIA CUDA with special care for the DGX Spark. * AMD ROCm is only supported in the rocm branch. It is kept separate from main since I (antirez) don't have direct hardware access, so the community rebases the branch as needed.
This project would not exist without llama.cpp and GGML, make sure to read the acknowledgements section, a big thank you to Georgi Gerganov and all the other contributors.
ds4-server can be used by local coding agents that speak OpenAI-compatible chat completions. Start the server first, and set the client context limit no higher than the --ctx value you started the server with:
./ds4-server --ctx 100000 --kv-disk-dir /tmp/ds4-kv --kv-disk-space-mb 8192
You can use larger context and larger cache if you wish. Full context of 1M tokens is going to use more or less 26GB of memory (compressed indexer alone will be like 22GB), so configure a context which makes sense in your system. With 128GB of RAM you would run the 2-bit quants, which are already 81GB, 26GB are going to be likely too much, so a context window of 100~300k tokens is wiser. However users reported being able to run 2bit quants with 250k ctx window in a Macs with just 96GB of system memory: make sure to kill processes that use too much memory, if you plan doing so ;)
The 384000 output limit below avoids token caps since the model is able to generate very long replies otherwise (up to 384k tokens). The server still stops when the configured context window is full.
For opencode, add a provider and agent entry to ~/.config/opencode/opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ds4": {
"name": "ds4.c (local)",
"npm": "@ai-sdk/openai-compatible",
"options": {
"baseURL": "http://127.0.0.1:8000/v1",
"apiKey": "dsv4-local"
},
"models": {
"deepseek-v4-flash": {
"name": "DeepSeek V4 Flash (ds4.c local)",
"limit": {
"context": 100000,
"output": 384000
}
}
}
}
},
"agent": {
"ds4": {
"description": "DeepSeek V4 Flash served by local ds4-server",
"model": "ds4/deepseek-v4-flash",
"temperature": 0
}
}
}
For Pi, add a provider to ~/.pi/agent/models.json:
{
"providers": {
"ds4": {
"name": "ds4.c local",
"baseUrl": "http://127.0.0.1:8000/v1",
"api": "openai-completions",
"apiKey": "dsv4-local",
"compat": {
"supportsStore": false,
"supportsDeveloperRole": false,
"supportsReasoningEffort": true,
"supportsUsageInStreaming": true,
"maxTokensField": "max_tokens",
"supportsStrictMode": false,
"thinkingFormat": "deepseek",
"requiresReasoningContentOnAssistantMessages": true
},
"models": [
{
"id": "deepseek-v4-flash",
"name": "DeepSeek V4 Flash (ds4.c local)",
"reasoning": true,
"thinkingLevelMap": {
"off": null,
"minimal": "low",
"low": "low",
"medium": "medium",
"high": "high",
"xhigh": "xhigh"
},
"input": ["text"],
"contextWindow": 100000,
"maxTokens": 384000,
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
}
}
]
}
}
}
Optionally make it the default Pi model in ~/.pi/agent/settings.json:
{
"defaultProvider": "ds4",
"defaultModel": "deepseek-v4-flash"
}
For Codex CLI, use the Responses wire API:
[model_providers.ds4]
name = "DS4"
base_url = "http://127.0.0.1:8000/v1"
wire_api = "responses"
stream_idle_timeout_ms = 1000000
Then run:
codex --model deepseek-v4-flash -c model_provider=ds4
For Claude Code, use the Anthropic-compatible endpoint. A wrapper like this matches the local ~/bin/claude-ds4 setup:
#!/bin/sh
unset ANTHROPIC_API_KEY
export ANTHROPIC_BASE_URL="${DS4_ANTHROPIC_BASE_URL:-http://127.0.0.1:8000}"
export ANTHROPIC_AUTH_TOKEN="${DS4_API_KEY:-dsv4-local}"
export ANTHROPIC_MODEL="deepseek-v4-flash"
export ANTHROPIC_CUSTOM_MODEL_OPTION="deepseek-v4-flash"
export ANTHROPIC_CUSTOM_MODEL_OPTION_NAME="DeepSeek V4 Flash local ds4"
export ANTHROPIC_CUSTOM_MODEL_OPTION_DESCRIPTION="ds4.c local GGUF"
export ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek-v4-flash"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="deepseek-v4-flash"
export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-v4-flash"
export CLAUDE_CODE_SUBAGENT_MODEL="deepseek-v4-flash"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
export CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK=1
export CLAUDE_STREAM_IDLE_TIMEOUT_MS=600000
exec "$HOME/.local/bin/claude" "$@"
Claude Code may send a large initial prompt, often around 25k tokens, before it starts doing useful work. Keep --kv-disk-dir enabled: after the first expensive prefill, the disk KV cache lets later continuations or restarted sessions reuse the saved prefix instead of processing the whole prompt again.
AI Skill Hub 为第三方内容聚合平台,本页面信息基于公开数据整理,不对工具功能和质量作任何法律背书。
建议在沙箱或测试环境中充分验证后,再部署至生产环境,并做好必要的安全评估。
✅ MIT 协议 — 最宽松的开源协议之一,可自由商用、修改、分发,仅需保留版权声明。
总体来看,ds4 是一款质量优秀的Prompt模板,在同类工具中具备一定竞争力。AI Skill Hub 将持续追踪其更新动态,建议收藏备用,结合自身场景选择合适时机引入使用。
| 原始名称 | ds4 |
| 原始描述 | DeepSeek 4 Flash local inference engine for Metal and CUDA |
| Topics | C |
| GitHub | https://github.com/antirez/ds4 |
| License | MIT |
收录时间:2026-05-14 · 更新时间:2026-05-22 · License:MIT · AI Skill Hub 不对第三方内容的准确性作法律背书。
选择 Agent 类型,复制安装指令后粘贴到对应客户端