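After a curated evaluation by AI Skill Hub, LLMChess is rated "Recommended". The tool scores well on feature completeness, community activity, and ease of use, with an AI score of 7.5, and is suited to users with some technical background.
LLMChess is an open-source tool written in Python, focused on AI, chess, and LLMs. As a GitHub open-source project, it has active community support and ongoing releases; its code is fully transparent and auditable, and it can be deployed locally to protect data privacy. It can be used individually or integrated into an enterprise workflow.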
# Option 1: install with pip (recommended)
pip install llm_chess
# Option 2: install into a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install llm_chess
# Option 3: install from source (to get the latest features)
git clone https://github.com/maxim-saplin/llm_chess
cd llm_chess
pip install -e .
# Verify the installation
python -c "import llm_chess; print('Installation successful')"
# Command-line usage
llm_chess --help
# Basic usage
llm_chess input_file -o output_file
# Calling from Python code
import llm_chess
# Example
result = llm_chess.process("input")
print(result)
# Example llm_chess configuration file (config.yml)
app:
  name: "llm_chess"
  debug: false
  log_level: "INFO"
# Specify the configuration file at run time
llm_chess --config config.yml
# Or configure through environment variables
export LLM_CHESS_API_KEY="your-key"
export LLM_CHESS_OUTPUT_DIR="./output"
LLM Chess is a benchmark that evaluates Large Language Models (LLMs) on their reasoning and instruction-following abilities in an agentic setting. LLMs engage in multi-turn dialogs to play chess against opponents like a Random Player or the Komodo Dragon chess engine. This setup tests both strategic reasoning (chess skill) and protocol adherence (sustained interaction without errors).
Key insights from the benchmark:
- Early models (2024) struggled with basic instruction following, often hallucinating illegal moves or failing dialogs.
- Advanced reasoning models (e.g., o1, o3, o4-mini) in 2025 saturated random-based evaluations, prompting the addition of Dragon as a stronger opponent for Elo anchoring.
- Metrics separate chess skill (Win/Loss, Elo) from durability (Game Duration), revealing trade-offs in model capabilities.
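Since the benchmark reports Elo alongside win/loss, it may help to recall how Elo relates a rating gap to an expected result. The sketch below uses the textbook Elo formulas only; it is not necessarily how the benchmark computes or anchors its ratings, and the function names and the K-factor of 32 are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """One Elo update step; `actual` is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating + k * (actual - expected)


if __name__ == "__main__":
    e = expected_score(1500, 1500)
    print(e)  # equal ratings give 0.5
    print(updated_rating(1500, e, 1.0))  # a win against an equal opponent gives 1516.0
```

A 400-point rating advantage corresponds to an expected score of roughly 0.91, which is why a fixed-strength engine opponent can serve as an anchor for placing models on a common scale.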
See the live leaderboard for rankings and the NeurIPS FoRLM 2025 paper for full details.
<img width="2118" height="1582" alt="image" src="https://github.com/user-attachments/assets/4375a8a8-e226-4ed1-820f-86006d0404e2" />
1. Clone the repository:
git clone https://github.com/maxim-saplin/llm_chess.git
cd llm_chess
2. Create a virtual environment (recommended):
# Using uv (recommended)
uv sync
3. Install dependencies:
# Already handled by `uv sync` above
4. Configure LLMs:
   - Copy .env.sample to .env and add your API keys.
   - Suffixes like _W (white) and _B (black) distinguish configs for multi-LLM setups.
   - Supports Azure OpenAI chat completions (MODEL_KIND=azure), Azure OpenAI Responses API (MODEL_KIND=azure_responses), OpenAI, Anthropic, Google, Groq, and local models via Autogen.
   - For local models, ensure Ollama or LM Studio is running.
5. Chess Engines (optional, for stronger opponents):
   - Komodo Dragon: Download binaries from komodochess.com and place in dragon/. Set llm_chess.dragon_path.
   - Stockfish: Install via brew install stockfish (macOS) or equivalent. Set llm_chess.stockfish_path (default: /opt/homebrew/bin/stockfish).
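The _W / _B suffix convention from step 4 can be pictured as a small lookup. The following is an illustrative sketch only: `config_for_side` is a hypothetical helper, not a function from the repository, `EXAMPLE_TIMEOUT` is a placeholder name, and the fallback to unsuffixed keys is an assumption of this sketch rather than documented project behavior.

```python
from typing import Mapping


def config_for_side(env: Mapping[str, str], side: str) -> dict[str, str]:
    """Collect settings for one side ("W" or "B") from suffixed keys.

    Hypothetical helper: a key such as MODEL_KIND_W applies to white only.
    Unsuffixed keys are used as shared defaults (an assumption of this sketch).
    """
    if side not in ("W", "B"):
        raise ValueError("side must be 'W' or 'B'")
    suffix = f"_{side}"
    other = "_B" if side == "W" else "_W"
    result: dict[str, str] = {}
    # First pass: shared (unsuffixed) keys act as defaults.
    for key, value in env.items():
        if not key.endswith((suffix, other)):
            result[key] = value
    # Second pass: side-specific keys override the defaults, suffix stripped.
    for key, value in env.items():
        if key.endswith(suffix):
            result[key[: -len(suffix)]] = value
    return result


if __name__ == "__main__":
    sample = {
        "MODEL_KIND_W": "azure",
        "MODEL_KIND_B": "azure_responses",
        "EXAMPLE_TIMEOUT": "30",  # placeholder name, not from the project
    }
    print(config_for_side(sample, "W"))
    print(config_for_side(sample, "B"))
```

In practice the mapping would be `os.environ` after the .env file has been loaded; a plain dict is used here so the sketch runs without any project files.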
Example with GPT-4o Mini (Black):
Proxy (to Player_Black):
You are a professional chess player and you play as black. Now is your turn to make a move. Before making a move you can pick one of the following actions:
- 'get_current_board' to get the schema and current status of the board
- 'get_legal_moves' to get a UCI formatted list of available moves
- 'make_move <UCI formatted move>' when you are ready to complete your turn (e.g., 'make_move e2e4')
Respond with the action.
Player_Black (to Proxy):
get_current_board
Proxy (to Player_Black):
♜ ♞ ♝ ♛ ♚ ♝ ♞ ♜
♟ ♟ ♟ ♟ ♟ ♟ ♟ ♟
· · · · · · · ·
· · · · · · · ·
· · · · · · · ·
· · · · · · · ·
♙ ♙ ♙ ♙ ♙ ♙ ♙ ♙
♖ ♘ ♗ ♕ ♔ ♗ ♘ ♖
Player_Black (to Proxy):
get_legal_moves
Proxy (to Player_Black):
a7a6, b7b6, c7c6, d7d6, e7e6, f7f6, g7g6, h7h6, a7a5, b7b5, c7c5, d7d5, e7e5, f7f5, g7g5, h7h5
Player_Black (to Proxy):
make_move e7e5
Proxy (to Player_Black):
Move made, switching player
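The proxy in the dialog above accepts three actions. A minimal sketch of how such replies could be parsed is given below. `parse_action` is a hypothetical helper written for illustration, not the project's actual parser, and it checks only that the move is syntactically valid UCI, not that it is legal on the current board.

```python
import re
from typing import Optional, Tuple

# UCI move: from-square, to-square, optional promotion piece (e.g. e7e8q).
UCI_MOVE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")


def parse_action(reply: str) -> Tuple[str, Optional[str]]:
    """Return (action, argument) for a model reply, or raise ValueError."""
    text = reply.strip()
    if text in ("get_current_board", "get_legal_moves"):
        return text, None
    parts = text.split()
    if len(parts) == 2 and parts[0] == "make_move":
        move = parts[1].lower()
        if UCI_MOVE.match(move):
            return "make_move", move
        raise ValueError(f"not a UCI formatted move: {parts[1]!r}")
    raise ValueError(f"unknown action: {text!r}")


if __name__ == "__main__":
    print(parse_action("get_current_board"))
    print(parse_action("make_move e7e5"))
```

Raising an error on anything outside the three actions mirrors the benchmark's emphasis on protocol adherence: a reply that wraps the move in extra prose would count as a failed attempt under this sketch.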
Edit globals in llm_chess.py or pass via run_multiple_games.py:
- MODEL_KIND_W / MODEL_KIND_B in .env:
  - azure for classic Azure chat-completions deployments.
  - azure_responses for Azure deployments that require the Responses API. Keep AZURE_OPENAI_ENDPOINT_* at the resource root such as https://your-resource.openai.azure.com; the runtime will normalize it to the Responses base path automatically.
- white_player_type / black_player_type: RANDOM_PLAYER, LLM, CHESS_ENGINE_DRAGON, CHESS_ENGINE_STOCKFISH.
- enable_reflection: Enable "reflect" action for strategic thinking (extra tokens).
- use_fen_board: Use FEN notation instead of Unicode board (default: False).
- max_game_moves: Max moves (default: 200).
- max_llm_turns: Max dialog turns (default: 10).
- max_failed_attempts: Max errors before loss (default: 3).
- throttle_delay_moves: API delay (default: 1s) to avoid rate limits.
- chess (board rules), Autogen (agents/dialogs), Stockfish/Dragon (engines).
- Logs in _logs/; analysis in data/.

For issues or questions, open a GitHub issue.
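To make the interplay of max_llm_turns and max_failed_attempts concrete, here is a simplified sketch of a single move under those limits. The default values are the ones listed above; the `Limits` class, the `play_turn` function, and the outcome strings are hypothetical, and the real loop in llm_chess.py may count turns and failures differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Limits:
    max_llm_turns: int = 10       # dialog turns allowed while choosing one move
    max_failed_attempts: int = 3  # invalid replies tolerated before a loss


def play_turn(replies: Iterable[str], is_valid: Callable[[str], bool], limits: Limits) -> str:
    """Consume model replies for one move and report how the turn ended."""
    failures = 0
    for turn, reply in enumerate(replies, start=1):
        if turn > limits.max_llm_turns:
            break  # dialog budget spent without a move
        if not is_valid(reply):
            failures += 1
            if failures >= limits.max_failed_attempts:
                return "too_many_failures"
            continue
        if reply.startswith("make_move"):
            return "move_made"
    # Turn limit hit, or replies ran out, without a completed move.
    return "no_move"


if __name__ == "__main__":
    def valid(reply: str) -> bool:
        return reply in ("get_current_board", "get_legal_moves") or reply.startswith("make_move ")

    print(play_turn(["get_current_board", "make_move e7e5"], valid, Limits()))
    print(play_turn(["hello", "e5", "pawn to e5"], valid, Limits()))
```

Keeping the two counters separate reflects the distinction the benchmark draws between chess skill and durability: a model can lose a game through protocol errors alone, without ever being outplayed on the board.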
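AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and carrying out any necessary security assessment.
✅ Apache 2.0: a permissive open-source license that allows commercial use, requires retaining the copyright notice and NOTICE file, and includes a patent grant.
AI Skill Hub review: LLMChess has complete core functionality and good quality. For AI enthusiasts, it is worth adding to a personal toolkit. We suggest trying it in a non-production environment first, then rolling it out gradually.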
| Original name | llm_chess |
| Topics | AI, Chess, LLM |
| GitHub | https://github.com/maxim-saplin/llm_chess |
| License | Apache-2.0 |
| Language | Python |
Listed: 2026-05-27 · Updated: 2026-05-27 · License: Apache-2.0 · AI Skill Hub provides no legal endorsement of the accuracy of third-party content.