Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Computer Science > Computation and Language arXiv:2606.15007 (cs) [Submitted on 12 Jun 2026]

Abstract: We introduce Nemotron 3 Ultra, a state-of-the-art open-weight model designed for complex agentic reasoning tasks. By combining the architectural strengths of Mamba (a state-space model) and Transformers, Nemotron 3 Ultra processes long contexts more efficiently while retaining the reasoning performance of traditional Transformer architectures. The model uses a Mixture-of-Experts (MoE) framework, whose sparse activation significantly reduces computational overhead during inference. Our results show that Nemotron 3 Ultra outperforms existing open models on multi-step reasoning, tool use, and autonomous agent benchmarks, providing a robust foundation for next-generation AI applications.
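
The abstract does not describe the router, the number of experts, or how many are active per token, so the following is only a generic sketch of top-k MoE gating, not the Nemotron 3 Ultra implementation. It illustrates why sparse activation lowers inference cost: only the `k` selected experts are evaluated. All names here (`top_k_route`, `moe_forward`, the toy `experts`) are hypothetical and chosen for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized over the selected experts so they sum to 1.
    This is one common convention, assumed here for illustration.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the k selected experts.

    Experts that are not selected are never called, which is the source
    of the compute saving from sparse activation.
    """
    out = [0.0] * len(x)
    for idx, weight in top_k_route(router_logits, k):
        y = experts[idx](x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


# Toy experts (hypothetical): expert i scales its input by (i + 1).
experts = [lambda x, s=s: [s * xi for xi in x] for s in range(1, 5)]

# Router scores for one token; in a real model these come from a learned layer.
logits = [0.0, 2.0, 1.0, -1.0]
routes = top_k_route(logits, k=2)
output = moe_forward([1.0], experts, logits, k=2)
```

With four experts and `k=2`, each token pays for two expert evaluations rather than four; the ratio of active to total experts is what sets the saving.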
