Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
arXiv:2606.15007 (cs), Computation and Language. Submitted on 12 Jun 2026.
Abstract: We introduce Nemotron 3 Ultra, a state-of-the-art open-weights model designed for complex agentic reasoning tasks. By combining the architectural strengths of Mamba (state-space models) and Transformers, Nemotron 3 Ultra processes long contexts more efficiently while retaining the reasoning performance of conventional Transformer architectures. The model uses a Mixture-of-Experts (MoE) framework whose sparse activation significantly reduces computational overhead during inference. Our results show that Nemotron 3 Ultra outperforms existing open models on multi-step reasoning, tool-use, and autonomous agent benchmarks, providing a robust foundation for next-generation AI applications.
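To make the sparse-activation idea concrete, the following is a minimal, generic sketch of top-k Mixture-of-Experts routing in plain Python. It is not taken from the paper: the function names (`moe_forward`, `matvec`, `softmax`), the toy dimensions, the linear router, the single-matrix experts, and the choice of `top_k=2` are all illustrative assumptions, and the abstract does not state how Nemotron 3 Ultra implements its routing.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def moe_forward(token, router, experts, top_k=2):
    """Route one token through only the top_k highest-scoring experts.

    Hypothetical toy layer: `router` is a (num_experts x dim) matrix and each
    expert is a (dim x dim) matrix. Returns the mixed output and the indices
    of the experts that were actually evaluated.
    """
    logits = matvec(router, token)  # one routing score per expert
    ranked = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([logits[i] for i in chosen])  # renormalise over chosen experts
    output = [0.0] * len(token)
    for gate, idx in zip(gates, chosen):
        expert_out = matvec(experts[idx], token)  # unchosen experts are never run
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen


# Toy setup (assumed sizes): 8 experts, 4-dimensional token, 2 experts active.
rng = random.Random(0)
dim, num_experts = 4, 8
router = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
experts = [
    [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
    for _ in range(num_experts)
]
token = [rng.gauss(0, 1) for _ in range(dim)]

output, chosen = moe_forward(token, router, experts, top_k=2)
print(f"evaluated {len(chosen)} of {num_experts} experts: {sorted(chosen)}")
```

In this sketch the per-token compute scales with `top_k` rather than with the total number of experts, which is the general sense in which sparse activation reduces inference cost.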