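AI Skill Hub highly recommends Speech Transcription Tool (SoulX-Transcriber), a quality AI tool. Its AI composite score is 8.0, a solid result among comparable tools. If you are looking for a reliable AI tool, it is worth a closer look.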
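Speech Transcription Tool is an open-source tool written in Python and focused on automatic speech recognition (its GitHub topics are asr, speech-recognition, and python). As a GitHub open-source project, it has community support and ongoing releases, its code is fully transparent and auditable, and it supports local deployment to protect data privacy. It can be used individually or integrated into enterprise workflows.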
# Option 1: install with pip (recommended)
pip install soulx-transcriber
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install soulx-transcriber
# Option 3: install from source (for the latest features)
git clone https://github.com/Soul-AILab/SoulX-Transcriber
cd SoulX-Transcriber
pip install -e .
# Verify the installation
python -c "import soulx_transcriber; print('Installation successful')"
# Command-line usage
soulx-transcriber --help
# Basic usage
soulx-transcriber input_file -o output_file
# Calling from Python code
import soulx_transcriber
# Example
result = soulx_transcriber.process("input")
print(result)
# Example soulx-transcriber configuration file (config.yml)
app:
  name: "soulx-transcriber"
  debug: false
  log_level: "INFO"

# Specify the configuration file at runtime
soulx-transcriber --config config.yml

# Or configure through environment variables
export SOULX_TRANSCRIBER_API_KEY="your-key"
export SOULX_TRANSCRIBER_OUTPUT_DIR="./output"
</div>
<p> <sup>*</sup>Equal contribution. <sup>†</sup>Corresponding author </p>
<p> <sup>1</sup>Audio, Speech and Language Processing Group (ASLP@NPU), Northwestern Polytechnical University, Xi’an, China<br> <sup>2</sup>Soul AI Lab, China<br> <sup>3</sup>Moonstep AI, China<br> </p> </div>
SoulX-Transcriber is a unified end-to-end large audio language model for multi-speaker diarization and recognition in multi-speaker dialogue scenarios. Rather than relying on a cascaded pipeline, the model directly learns speaker attribution, timestamped segmentation, and transcription in a single framework, producing coherent speaker-consistent transcripts for overlapping and fast-turn conversations.
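To make the three jointly modeled outputs (speaker attribution, timestamped segmentation, transcription) concrete, here is a minimal Python sketch of how such a transcript could be represented and rendered. The `Segment` class, its field names, and the text layout are illustrative assumptions and are not the model's actual output format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One speaker-attributed, timestamped utterance (hypothetical schema)."""
    speaker: str   # speaker label, e.g. "S1"
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str      # transcribed words


def format_transcript(segments: List[Segment]) -> str:
    """Render segments in chronological order, one line per utterance."""
    ordered = sorted(segments, key=lambda s: (s.start, s.end))
    return "\n".join(
        f"[{s.start:.2f}-{s.end:.2f}] {s.speaker}: {s.text}" for s in ordered
    )


def overlapping_pairs(segments: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Return pairs of segments from different speakers that overlap in time."""
    ordered = sorted(segments, key=lambda s: s.start)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # later segments start even later, so none can overlap a
            if a.speaker != b.speaker:
                pairs.append((a, b))
    return pairs


# Made-up example data for illustration only.
example_segments = [
    Segment("S2", 1.8, 3.1, "Sure, go ahead."),
    Segment("S1", 0.0, 2.0, "Can I ask a question?"),
]
print(format_transcript(example_segments))
```

Keeping timestamps on every segment is what lets overlapping speech be detected after the fact, as `overlapping_pairs` does for the two example utterances.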
git clone https://github.com/Soul-AILab/SoulX-Transcriber.git
cd SoulX-Transcriber
conda create -n soulx_transcriber python=3.12 -y
conda activate soulx_transcriber
Install MS-Swift and dependencies:
pip install ms-swift
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install vllm --torch-backend=auto --index-url https://mirrors.aliyun.com/pypi/simple/
uv pip install vllm-omni --index-url https://mirrors.aliyun.com/pypi/simple/
uv pip install 'vllm-omni[demo]' --index-url https://mirrors.aliyun.com/pypi/simple/
git clone https://github.com/vllm-project/vllm-omni.git
cd vllm-omni
uv pip install -e . --index-url https://mirrors.aliyun.com/pypi/simple/

> For more details on compiling vLLM from source, refer to the vLLM official documentation.
<https://github.com/user-attachments/assets/4020c95b-8cce-4611-a7b5-ffe6f49c1fb6>
</div>
Please visit our ✨demopage✨ for more demos.
uv venv vllm_omni --python 3.12 --seed --index-url https://mirrors.aliyun.com/pypi/simple/
source vllm_omni/bin/activate
To improve out-of-domain generalization, we build an agent-based multi-speaker dialogue simulation pipeline with a speaker-aware prompt audio matching mechanism. Given a target dialogue text, the system analyzes speaker tags, selects the most suitable reference audio for each speaker using multi-dimensional speaker representations, and synthesizes context-consistent multi-turn dialogue audio.
Workflow: building dialogue text database → building reference audio database → target text analysis → reference audio matching → dialogue audio generation. The figure below shows the pipeline in detail.
<div align="center"> <img src="figs/simulation_pipeline.png" width="100%" alt="simulation_pipeline"> </div>
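The reference audio matching step can be pictured with a small Python sketch. It assumes each speaker tag and each reference clip is already described by a fixed-length embedding vector, and it greedily picks the most similar unused clip by cosine similarity. The function names, the greedy strategy, and the toy vectors are assumptions for illustration; the source does not specify how the multi-dimensional speaker representations are actually compared.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_reference_audio(
    speaker_profiles: Dict[str, List[float]],
    reference_db: Dict[str, List[float]],
) -> Dict[str, str]:
    """Greedily assign each speaker tag the most similar unused reference clip."""
    assignments: Dict[str, str] = {}
    used = set()
    for speaker, profile in speaker_profiles.items():
        best_id, best_score = None, float("-inf")
        for ref_id, embedding in reference_db.items():
            if ref_id in used:
                continue  # keep voices distinct across speakers
            score = cosine_similarity(profile, embedding)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_id is None:
            raise ValueError(f"no reference audio left for speaker {speaker!r}")
        assignments[speaker] = best_id
        used.add(best_id)
    return assignments


# Toy vectors standing in for real speaker representations (hypothetical data).
speaker_profiles = {"S1": [1.0, 0.0, 0.2], "S2": [0.0, 1.0, 0.1]}
reference_db = {
    "ref_a.wav": [0.9, 0.1, 0.2],
    "ref_b.wav": [0.1, 0.8, 0.0],
    "ref_c.wav": [0.5, 0.5, 0.5],
}
matches = match_reference_audio(speaker_profiles, reference_db)
print(matches)
```

Greedy assignment is the simplest choice here; it does not guarantee the best overall pairing when speakers compete for the same clip, so a global assignment method could be substituted if that matters.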
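A high-quality open-source speech transcription tool with support for multi-speaker transcription.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute a legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and carrying out any necessary security assessment.
✅ Apache 2.0: a permissive open-source license that allows commercial use, requires retaining the copyright notice and NOTICE file, and includes a patent grant.
Overall, Speech Transcription Tool is a high-quality AI tool that is competitive among comparable tools. AI Skill Hub will continue to track its updates. Consider bookmarking it and adopting it when it suits your own use case.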
| Original name | SoulX-Transcriber |
| Original description | Open-source AI tool: An end-to-end framework for multi-speaker transcription that jointly models who . ⭐18 · Python |
| Topics | asr, speech-recognition, python |
| GitHub | https://github.com/Soul-AILab/SoulX-Transcriber |
| License | Apache-2.0 |
| Language | Python |
Listed: 2026-06-02 · Updated: 2026-06-02 · License: Apache-2.0 · AI Skill Hub does not legally endorse the accuracy of third-party content.