AI News Flash 2026-05-30 Source: Reddit r/LocalLLaMA

AI Frontier News: Can't get over 250TPS on RTX50…

📄 Event Summary

My main model is qwen3.6-27b-mtp, and I'm getting around 100 tps and 2500 tps prefill, which is great. I've tried adding a second small model for auxiliary tasks, and even when it's the only model running, it doesn't go over 200-250 tps. I'm building llama.cpp and running it in Docker on Windows. I've also tried havenoammo/llama:cuda13-server and get exactly the same performance, so I think my build flags are OK.…

🌐 Background

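Reddit r/LocalLLaMA is one of the world's leading technical communities, gathering high-quality content from developers around the world every day. This post drew considerable attention in the community, which suggests it is reasonably representative of, and near the leading edge of, the AI news space.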

💡 Why It Matters

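This post sparked active discussion in the community and reflects an important direction of progress in the AI news space. Whether you are a developer, a product manager, or an industry researcher, following frontier developments like this one helps you make better-informed technology choices and strategic decisions.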

✦ AI Skill Hub's Take

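From AI Skill Hub's perspective, technical developments like this in the AI news space often signal that new tools and solutions are about to emerge. We will keep tracking related developments and provide Chinese-speaking users with a timely, accurate aggregation of AI skills and news.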

📰 Related News

📰

AI Frontier News: The next AI problem might not …

Reddit r/artificial · 2026-05-30

📰

Google Gemini AI Updates