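After a curated evaluation by AI Skill Hub, MCore-Bridge is rated "Recommended". The tool does well on feature completeness, community activity, and ease of use, with an AI score of 7.5, and suits users with some technical background.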
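MCore-Bridge is an open-source AI tool that provides Megatron-Core model definitions for state-of-the-art language models. It supports many models, including GPT-OSS and Llama4, and helps developers build high-quality language model applications quickly.
MCore-Bridge is an open-source tool written in Python; its repository topics include installable, deepseek-r1, and gemma4. As an open-source project on GitHub, it has active community support and ongoing releases, its code is fully transparent and auditable, and it supports local deployment to protect data privacy. It offers a stable, reliable solution for both personal use and integration into enterprise workflows.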
# Option 1: install with pip (recommended)
pip install mcore-bridge
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install mcore-bridge
# Option 3: install from source (to get the latest features)
git clone https://github.com/modelscope/mcore-bridge
cd mcore-bridge
pip install -e .
# Verify the installation
python -c "import mcore_bridge; print('Installation succeeded')"
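The import check above only confirms that the module can be imported. As a small complement, the sketch below reports which version is installed using the standard library's `importlib.metadata`. It assumes the distribution name is `mcore-bridge`, as in the pip commands above; `installed_version` is a hypothetical helper name, not part of MCore-Bridge.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string when the package is installed, otherwise None.
print(installed_version("mcore-bridge"))
```

Returning `None` instead of raising keeps the check usable in setup scripts that need to decide whether to install the package.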
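# Command-line usage (generic template; the upstream README reproduced below documents no CLI)
mcore-bridge --help
# Basic usage
mcore-bridge input_file -o output_file
# Calling from Python code
import mcore_bridge
# Example (placeholder; `process` is not among the functions shown in the upstream README)
result = mcore_bridge.process("input")
print(result)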
# mcore-bridge configuration file example (config.yml)
app:
  name: "mcore-bridge"
  debug: false
  log_level: "INFO"
# Specify the configuration file at run time
mcore-bridge --config config.yml
# Or configure through environment variables
export MCORE_BRIDGE_API_KEY="your-key"
export MCORE_BRIDGE_OUTPUT_DIR="./output"
<p align="center"> <b>Providing Megatron-Core model definitions for state-of-the-art large models</b> </p>
<p align="center"> <a href="https://modelscope.cn">ModelScope</a> <br> <a href="README_zh.md">中文</a>   |   English   </p>
<p align="center"> <img src="https://img.shields.io/badge/python-3.12-5be.svg"> <img src="https://img.shields.io/badge/pytorch-%E2%89%A52.0-orange.svg"> <a href="https://github.com/NVIDIA/Megatron-LM/"><img src="https://img.shields.io/badge/megatron--core-%E2%89%A50.15-76B900.svg"></a>
<a href="https://pypi.org/project/mcore-bridge/"><img src="https://badge.fury.io/py/mcore-bridge.svg"></a> <a href="https://github.com/modelscope/mcore-bridge/blob/main/LICENSE"><img src="https://img.shields.io/github/license/modelscope/mcore-bridge"></a> <a href="https://pepy.tech/project/mcore-bridge"><img src="https://pepy.tech/badge/mcore-bridge"></a> <a href="https://github.com/modelscope/mcore-bridge/pulls"><img src="https://img.shields.io/badge/PR-welcome-55EB99.svg"></a> </p>
mcore-bridge is a large language model and multimodal large model definition library built on the Megatron-Core ecosystem, developed by the ModelScope community. It currently supports 300+ text-only models and 200+ multimodal models, including large language models such as Qwen3-Next, GLM5.1, DeepSeek-V3.2, Minimax2.7, Kimi K2.5, and GPT-OSS, as well as multimodal large models such as Qwen3.5, Qwen3-Omni, Gemma4, GLM4.6-V, InternVL3.5, and Ovis2.5.
------
Why Choose mcore-bridge?
------
Related Documentation:
To install using pip:
```shell
pip install mcore-bridge -U

# Or install from source
git clone https://github.com/modelscope/mcore-bridge.git
cd mcore-bridge
pip install -e .
```
For training with MCore-Bridge, refer to the ms-swift project. This section shows how to use MCore-Bridge programmatically.
Create the following file (test.py), then run `CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 test.py`. Below is sample code demonstrating how to use MCore-Bridge for model creation, weight loading, export, and saving.
The saved model can be used for inference by referring to the example code in the model card.
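As a quick sanity check before launching, the parallel sizes used in the sample script (TP, PP, EP, ETP = 2, 2, 2, 1) can be compared against the four processes started by `torchrun`. The sketch below is a standalone helper, not part of test.py and not an MCore-Bridge API: `data_parallel_sizes` is a hypothetical name, and the divisibility rules are an assumption about how Megatron-Core derives its data-parallel groups, so defer to Megatron-Core's own checks if they disagree.

```python
def data_parallel_sizes(world_size, tp, pp, ep, etp, cp=1):
    """Return (dense_dp, expert_dp), or raise ValueError if the sizes do not fit.

    Assumption: dense layers need world_size divisible by tp * pp * cp, and
    expert layers need world_size divisible by etp * ep * pp.
    """
    dense_group = tp * pp * cp
    expert_group = etp * ep * pp
    if world_size % dense_group != 0:
        raise ValueError(f"world_size={world_size} is not divisible by TP*PP*CP={dense_group}")
    if world_size % expert_group != 0:
        raise ValueError(f"world_size={world_size} is not divisible by ETP*EP*PP={expert_group}")
    return world_size // dense_group, world_size // expert_group


# Four processes with TP, PP, EP, ETP = 2, 2, 2, 1
print(data_parallel_sizes(4, tp=2, pp=2, ep=2, etp=1))  # (1, 1)
```

Under the same assumption, doubling the process count to eight with unchanged parallel sizes would give a data-parallel size of two for both dense and expert layers.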
```python
import os

import torch
import torch.distributed as dist
from megatron.core import mpu
from modelscope import snapshot_download
from transformers import AutoConfig, AutoProcessor

from mcore_bridge import ModelConfig, get_mcore_model, hf_to_mcore_config

is_rank0 = int(os.getenv('RANK')) == 0
torch.cuda.set_device(f"cuda:{os.getenv('LOCAL_RANK')}")
dist.init_process_group(backend='nccl')
TP, PP, EP, ETP = 2, 2, 2, 1
mpu.initialize_model_parallel(
    tensor_model_parallel_size=TP,
    pipeline_model_parallel_size=PP,
    expert_model_parallel_size=EP,
    expert_tensor_parallel_size=ETP,
)

model_dir = snapshot_download('Qwen/Qwen3.5-35B-A3B')
hf_config = AutoConfig.from_pretrained(model_dir, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_dir, trust_remote_code=True)
config_kwargs = hf_to_mcore_config(hf_config)
config = ModelConfig(
    params_dtype=torch.bfloat16,
    tensor_model_parallel_size=TP,
    pipeline_model_parallel_size=PP,
    expert_model_parallel_size=EP,
    expert_tensor_parallel_size=ETP,
    sequence_parallel=True,
    mtp_num_layers=1,
    **config_kwargs)
```
MCore-Bridge integrates seamlessly with the ms-swift template for model training. You can also replace the ms-swift template module with a custom data processing pipeline to suit your own workflow.
The following minimal example runs a forward pass and computes the loss with a model created by MCore-Bridge, as a starting point for integrating MCore-Bridge into other projects.
Create the following file (test.py) and run it with: CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 test.py.
import os
import torch
import torch.distributed as dist
from megatron.core import mpu
from modelscope import snapshot_download
from swift import get_processor, get_template
from swift.megatron.utils import get_packed_seq_params, get_padding_to
from swift.utils import to_device
from mcore_bridge import ModelConfig, get_mcore_model, hf_to_mcore_config, set_random_seed
data = {
    'messages': [{
        'role': 'user',
        'content': '<image>describe the image.'
    }, {
        'role':
        'assistant',
        'content':
        'The image depicts a close-up of a kitten with striking features. '
        'The kitten has a white and gray coat with distinct black stripes, '
        'particularly noticeable on its face and ears. Its eyes are large '
        'and expressive, with a captivating blue hue that stands out against '
        "the darker fur around them. The kitten's nose is small and pink, "
        'and it has long, delicate whiskers extending from either side of its mouth. '
        "The background is blurred, drawing attention to the kitten's face and "
        'making it the focal point of the image. The overall impression is '
        'one of cuteness and charm.'
    }],
    'images': ['http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/cat.png']
}
def forward_mg_model(mg_model, template):
    template.use_megatron = True
    template.set_mode('train')
    # Encode the single sample and collate it into a padded batch on the GPU
    inputs = template.encode(data, return_length=True)
    mg_inputs = to_device(template.data_collator([inputs], padding_to=get_padding_to(mg_model.config)), 'cuda')
    text_position_ids = mg_inputs.pop('text_position_ids', None)
    if text_position_ids is None:
        text_position_ids = mg_inputs.get('position_ids')
    # Drop collator outputs that are not passed to the model here
    for key in ['num_samples', 'attention_mask_2d', 'loss_scale']:
        mg_inputs.pop(key, None)
    if template.padding_free:
        mg_inputs['packed_seq_params'] = get_packed_seq_params(text_position_ids)
    # Shift labels left by one so each position is scored against the next token
    mg_inputs['labels'] = torch.roll(mg_inputs['labels'], -1, dims=-1)
    loss = mg_model(**mg_inputs)
    # Average the loss over positions whose label is not the ignore value -100
    loss_mask = mg_inputs['labels'] != -100
    loss = loss * loss_mask
    return loss.sum() / loss_mask.sum()
torch.cuda.set_device(f"cuda:{os.getenv('LOCAL_RANK')}")
dist.init_process_group(backend='nccl')
TP, PP, EP, ETP = 2, 1, 2, 1
mpu.initialize_model_parallel(
    tensor_model_parallel_size=TP,
    pipeline_model_parallel_size=PP,
    expert_model_parallel_size=EP,
    expert_tensor_parallel_size=ETP,
)
set_random_seed(42)
model_dir = snapshot_download('Qwen/Qwen3.5-35B-A3B')
template = get_template(get_processor(model_dir), padding_free=True)
config_kwargs = hf_to_mcore_config(template.config)
config = ModelConfig(
    params_dtype=torch.bfloat16,
    tensor_model_parallel_size=TP,
    pipeline_model_parallel_size=PP,
    expert_model_parallel_size=EP,
    expert_tensor_parallel_size=ETP,
    sequence_parallel=True,
    mtp_num_layers=1,
    **config_kwargs)
mg_model = get_mcore_model(config)[0]
mg_model.cuda()
config.bridge.load_weights([mg_model], model_dir)
loss = forward_mg_model(mg_model, template)
print(f'loss: {loss}') # loss: 0.8161308169364929
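The loss reduction in `forward_mg_model` above (shift the labels left by one, mask positions labelled -100, then average) can be traced on a toy sequence. The sketch below uses plain Python lists rather than tensors so that it runs without a GPU. The numbers are made up for illustration, `roll_left` and `masked_mean_loss` are hypothetical helper names, and the sketch assumes the model returns one loss value per token position.

```python
IGNORE_INDEX = -100  # label value excluded from the loss, as in the script above


def roll_left(values):
    """Shift a list left by one with wrap-around, like torch.roll(x, -1)."""
    return values[1:] + values[:1]


def masked_mean_loss(per_token_loss, labels):
    """Average per-token losses over positions whose label is not IGNORE_INDEX."""
    mask = [label != IGNORE_INDEX for label in labels]
    total = sum(loss for loss, keep in zip(per_token_loss, mask) if keep)
    return total / sum(mask)


# Made-up example: the first two tokens belong to the prompt and are ignored.
labels = [-100, -100, 11, 12, 13]
shifted = roll_left(labels)
print(shifted)  # [-100, 11, 12, 13, -100]

per_token_loss = [0.9, 0.5, 1.0, 1.5, 0.7]
print(masked_mean_loss(per_token_loss, shifted))  # (0.5 + 1.0 + 1.5) / 3 = 1.0
```

The wrap-around moves the first label to the last position. Here that label is -100, so the final position, which has no next token to predict, drops out of the average.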
The following is the list of models supported by MCore-Bridge:
text-only large models:
| Series | model_type |
|---|---|
| Qwen | qwen2, qwen2_moe<br />qwen3, qwen3_moe, qwen3_next |
| DeepSeek | deepseek_v3, deepseek_v32 |
| GLM | glm4, glm4_moe, glm4_moe_lite<br />glm_moe_dsa |
| MiniMax | minimax_m2 |
| Kimi | kimi_k2, kimi_k25 |
| Bailing | bailing_moe |
| InternLM | internlm3 |
| Llama | llama |
| GPT-OSS | gpt_oss |
| Hunyuan | hy_v3 |
| ERNIE | ernie4_5, ernie4_5_moe |
| MiMo | mimo |
| Dots | dots1 |
| OLMoE | olmoe |
multimodal large models:
| Series | model_type |
|---|---|
| Qwen | qwen2_vl, qwen2_5_vl, qwen2_5_omni<br />qwen3_vl, qwen3_vl_moe, qwen3_omni_moe, qwen3_asr<br />qwen3_5, qwen3_5_moe |
| Gemma | gemma4 |
| GLM | glm4v, glm4v_moe |
| Kimi | kimi_vl |
| InternVL | internvl_chat, internvl |
| Ovis | ovis2_5 |
| Llama | llama4 |
| Llava | llava-onevision |
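For scripts that need to check a model identifier before building a model, the two tables above can be encoded as a simple lookup. This is only a sketch: `TEXT_ONLY`, `MULTIMODAL`, and `find_series` are hypothetical names rather than MCore-Bridge APIs, and the entries are copied by hand from the tables, so they will drift as the library adds models.

```python
# model_type values copied from the two tables above, grouped by series.
TEXT_ONLY = {
    'Qwen': ['qwen2', 'qwen2_moe', 'qwen3', 'qwen3_moe', 'qwen3_next'],
    'DeepSeek': ['deepseek_v3', 'deepseek_v32'],
    'GLM': ['glm4', 'glm4_moe', 'glm4_moe_lite', 'glm_moe_dsa'],
    'MiniMax': ['minimax_m2'],
    'Kimi': ['kimi_k2', 'kimi_k25'],
    'Bailing': ['bailing_moe'],
    'InternLM': ['internlm3'],
    'Llama': ['llama'],
    'GPT-OSS': ['gpt_oss'],
    'Hunyuan': ['hy_v3'],
    'ERNIE': ['ernie4_5', 'ernie4_5_moe'],
    'MiMo': ['mimo'],
    'Dots': ['dots1'],
    'OLMoE': ['olmoe'],
}
MULTIMODAL = {
    'Qwen': ['qwen2_vl', 'qwen2_5_vl', 'qwen2_5_omni', 'qwen3_vl', 'qwen3_vl_moe',
             'qwen3_omni_moe', 'qwen3_asr', 'qwen3_5', 'qwen3_5_moe'],
    'Gemma': ['gemma4'],
    'GLM': ['glm4v', 'glm4v_moe'],
    'Kimi': ['kimi_vl'],
    'InternVL': ['internvl_chat', 'internvl'],
    'Ovis': ['ovis2_5'],
    'Llama': ['llama4'],
    'Llava': ['llava-onevision'],
}


def find_series(model_type):
    """Return (category, series) for a listed model_type, or None if it is not listed."""
    for category, table in (('text-only', TEXT_ONLY), ('multimodal', MULTIMODAL)):
        for series, model_types in table.items():
            if model_type in model_types:
                return category, series
    return None


print(find_series('gemma4'))      # ('multimodal', 'Gemma')
print(find_series('qwen3_next'))  # ('text-only', 'Qwen')
print(find_series('unknown'))     # None
```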
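from peft import LoraConfig, get_peft_model

# Regex selecting the Megatron-Core linear layers of the language model to wrap with LoRA
target_modules = r'^language_model.*\.(in_proj|out_proj|linear_fc1|linear_fc2|linear_qkv|linear_proj)$'
# Counterpart pattern in Hugging Face module naming (defined here but not passed to LoraConfig)
hf_target_modules = r'^model.language_model.*\.(in_proj_qkv|in_proj_z|in_proj_b|in_proj_a|out_proj|gate_proj|up_proj|down_proj|q_proj|k_proj|v_proj|o_proj)$'
lora_config = LoraConfig(task_type='CAUSAL_LM', r=8, lora_alpha=32, lora_dropout=0.05, target_modules=target_modules)
# mg_models: the list of models returned by get_mcore_model(config)
peft_models = [get_peft_model(model, lora_config) for model in mg_models]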
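To see which layers the `target_modules` pattern above would select, it can be tried against a few module names with Python's `re` module. When `target_modules` is a string, PEFT treats it as a regular expression that must match the whole module name, which is why the sketch uses `fullmatch`. The module names below are made up for illustration and may not match the names in a real MCore-Bridge model.

```python
import re

target_modules = r'^language_model.*\.(in_proj|out_proj|linear_fc1|linear_fc2|linear_qkv|linear_proj)$'
pattern = re.compile(target_modules)

# Hypothetical module names, for illustration only.
names = [
    'language_model.decoder.layers.0.self_attention.linear_qkv',
    'language_model.decoder.layers.0.mlp.linear_fc1',
    'language_model.decoder.layers.0.self_attention.linear_qkv.lora_A',
    'visual.blocks.0.attn.linear_qkv',
]

matched = [name for name in names if pattern.fullmatch(name)]
for name in matched:
    print(name)
```

Only the first two names match: the third ends in a suffix outside the alternation, and the fourth does not start with `language_model`, so modules outside the language model are left without adapters.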
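MCore-Bridge is a promising open-source AI tool that covers a range of models and use cases and helps developers build high-quality language model applications quickly. Its documentation and example applications, however, need further work.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.
We recommend validating thoroughly in a sandbox or test environment before deploying to production, and carrying out any necessary security assessment.
✅ Apache 2.0: a permissive open-source license that allows commercial use, requires retaining the copyright notice and NOTICE file, and includes a patent grant.
AI Skill Hub review: MCore-Bridge's core functionality is complete and of good quality. For AI enthusiasts, it is worth adding to a personal toolkit. Try it in a non-production environment first, then roll it out gradually.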
| Field | Value |
|---|---|
| Original name | mcore-bridge |
| Original description | Open-source AI tool: MCore-Bridge: Providing Megatron-Core model definitions for state-of-the-art lar. ⭐66 · Python |
| Topics | installable, deepseek-r1, gemma4, glm-5, gpt-oss, llama4, python |
| GitHub | https://github.com/modelscope/mcore-bridge |
| License | Apache-2.0 |
| Language | Python |
Listed: 2026-05-21 · Updated: 2026-05-22 · License: Apache-2.0 · AI Skill Hub makes no legal endorsement of the accuracy of third-party content.