
NVIDIA Releases the Nemotron 3 Family of Open AI Models, with 4x the Throughput of the Previous Generation

Published: 2026-05-22 | Source: NVIDIA (official) | Author: NVIDIA (official)

Nemotron 3 technologies

　　Hybrid MoE: The Nemotron 3 family of models uses a hybrid Mamba-Transformer MoE architecture to provide best-in-class throughput with accuracy that is better than or on par with standard Transformers.

　　LatentMoE: Super and Ultra use LatentMoE, a novel hardware-aware expert design for improved accuracy.

　　Multi-Token Prediction: Super and Ultra incorporate MTP layers for improved long-form text generation efficiency and better model quality.

　　NVFP4: Super and Ultra are trained with NVFP4.

　　Long Context: Nemotron 3 models support context lengths of up to 1M tokens.

　　Multi-environment Reinforcement Learning Post-training: Nemotron 3 models are trained using a diverse set of RL environments, which helps them achieve superior accuracy across a broad range of tasks.

　　Granular Reasoning Budget Control at Inference Time: Nemotron 3 models are trained to work with inference-time budget control.
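
　　The text above does not describe how budget control is enforced. One common approach, shown here only as a toy sketch, is to count the tokens produced during the reasoning phase and force an end-of-reasoning marker once the budget is spent, so the model moves on to its answer. The marker strings, the `generate_with_budget` function, and the `toy_model` stand-in are all hypothetical names for illustration and are not taken from NVIDIA's release.

```python
from typing import Callable, List

THINK_END = "</think>"  # hypothetical end-of-reasoning marker (assumption)
EOS = "<eos>"           # hypothetical end-of-sequence marker (assumption)


def generate_with_budget(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    reasoning_budget: int,
    max_new_tokens: int = 64,
) -> List[str]:
    """Decode token by token, capping the reasoning phase at `reasoning_budget` tokens."""
    context = list(prompt)
    output: List[str] = []
    thinking = True
    reasoning_used = 0
    while len(output) < max_new_tokens:
        if thinking and reasoning_used >= reasoning_budget:
            tok = THINK_END  # budget spent: force the reasoning phase to close
        else:
            tok = next_token(context)
        if tok == EOS:
            break
        if thinking:
            if tok == THINK_END:
                thinking = False
            else:
                reasoning_used += 1
        context.append(tok)
        output.append(tok)
    return output


def toy_model(context: List[str]) -> str:
    """Stand-in for a model: reasons until THINK_END appears, then answers once."""
    if THINK_END not in context:
        return "step"
    if context[-1] == THINK_END:
        return "answer"
    return EOS


if __name__ == "__main__":
    print(generate_with_budget(toy_model, ["question"], reasoning_budget=3))
```

　　With a budget of 3, the toy model emits three reasoning tokens, the forced marker, and then its answer. A budget of 0 skips reasoning entirely.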

Nemotron 3 Nano

　　Nemotron 3 Nano is a model with 3.2B active parameters (3.6B including embeddings) and 31.6B total parameters. It achieves better accuracy than our previous-generation Nemotron 2 Nano while activating less than half as many parameters per forward pass.
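
　　To make the active-versus-total distinction concrete, the short calculation below works out what share of the model's parameters is used per forward pass. It uses only the parameter counts quoted above. The function name `active_fraction` is introduced here for illustration.

```python
# Parameter counts in billions, as quoted in the text.
ACTIVE_B = 3.2             # active per forward pass, excluding embeddings
ACTIVE_WITH_EMB_B = 3.6    # active per forward pass, including embeddings
TOTAL_B = 31.6             # total parameters


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of total parameters that participate in one forward pass."""
    return active_b / total_b


if __name__ == "__main__":
    print(f"without embeddings: {active_fraction(ACTIVE_B, TOTAL_B):.1%}")          # 10.1%
    print(f"with embeddings:    {active_fraction(ACTIVE_WITH_EMB_B, TOTAL_B):.1%}")  # 11.4%
```

　　Roughly one tenth of the weights are exercised for any given token, which is what lets a sparse MoE model of this size run at the compute cost of a much smaller dense model.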

Key highlights:

More accurate than GPT-OSS-20B and Qwen3-30B-A3B-Thinking-2507 on popular benchmarks spanning different categories.
On the 8K input / 16K output setting with a single H200, Nemotron 3 Nano provides inference throughput that is 3.3x higher than Qwen3-30B-A3B and 2.2x higher than GPT-OSS-20B.
Supports context length up to 1M tokens while outperforming both GPT-OSS-20B and Qwen3-30B-A3B-Instruct-2507 on RULER across different context lengths.
We are releasing the model weights, training recipe, and all the data for which we hold redistribution rights.
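
　　The two throughput multipliers quoted above share the same setting (8K input / 16K output, single H200), so they also imply how the two baselines compare with each other. The sketch below derives that ratio. Because the published figures are rounded, the result is only approximate, and the helper name `implied_ratio` is introduced here for illustration.

```python
# Speedups of Nemotron 3 Nano over each baseline, as quoted in the text.
NANO_VS_QWEN3 = 3.3
NANO_VS_GPT_OSS = 2.2


def implied_ratio(speedup_over_x: float, speedup_over_y: float) -> float:
    """Throughput of baseline y relative to baseline x, given one model's speedup over each."""
    return speedup_over_x / speedup_over_y


if __name__ == "__main__":
    ratio = implied_ratio(NANO_VS_QWEN3, NANO_VS_GPT_OSS)
    print(f"GPT-OSS-20B vs Qwen3-30B-A3B: about {ratio:.2f}x")
```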

Open Source

　　Along with the Nemotron 3 white paper and the Nemotron 3 Nano technical report, we are releasing the following:

Checkpoints:

Nemotron 3 Nano 30B-A3B FP8: the final post-trained and FP8 quantized Nano model
Nemotron 3 Nano 30B-A3B BF16: the post-trained Nano model
Nemotron 3 Nano 30B-A3B Base BF16: the pre-trained base Nano model
Qwen-3-Nemotron-235B-A22B-GenRM: the GenRM used for RLHF
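
　　As a rough guide to what the BF16 and FP8 checkpoints mean for memory, the estimate below multiplies the 31.6B total parameter count by the bytes per parameter of each format. It is a back-of-the-envelope sketch that assumes every weight is stored in the named format. It ignores quantization scale factors, any layers kept at higher precision, the KV cache, and activations, so real footprints will differ.

```python
TOTAL_PARAMS = 31.6e9  # total parameters of Nemotron 3 Nano, from the text

# Storage size per parameter for each checkpoint format.
BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}


def weight_gb(total_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, nbytes in BYTES_PER_PARAM.items():
        print(f"{fmt}: ~{weight_gb(TOTAL_PARAMS, nbytes):.1f} GB of weights")
```

　　Under these assumptions the BF16 weights need about 63.2 GB and the FP8 weights about 31.6 GB. Note that all experts must be resident in memory even though only a fraction is active per token.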

Data:

Nemotron-CC-v2.1: 2.5 trillion new English tokens from Common Crawl, including curated data from 3 recent snapshots, synthetic rephrasing, and translation to English from other languages.
Nemotron-CC-Code-v1: A pretraining dataset consisting of 428 billion high-quality code tokens obtained by processing Common Crawl code pages with the Lynx + LLM pipeline from Nemotron-CC-Math-v1. The pipeline preserves equations and code, standardizes math equations to LaTeX, and removes noise.
Nemotron-Pretraining-Code-v2: Refresh of curated GitHub code references with multi-stage filtering, deduplication, and quality filters, plus large-scale synthetic code data.
Nemotron-Pretraining-Specialized-v1: Collection of synthetic datasets for specialized areas like STEM reasoning and scientific coding.
Nemotron-SFT-Data: Collection of new Nemotron 3 Nano SFT datasets.
Nemotron-RL-Data: Collection of new Nemotron 3 Nano RL datasets.

Model Recipes:

NVIDIA Nemotron Developer Repository

　　For more details, please refer to the following:

Nemotron 3 Blogs
- HuggingFace
- NVIDIA Tech Blog
Nemotron 3 white paper: NVIDIA Nemotron 3: Efficient and Open Intelligence
Nemotron 3 Nano technical report: Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

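This article is reposted from NVIDIA (official). Author: NVIDIA (official). Original title: "NVIDIA Releases the Nemotron 3 Family of Open AI Models, with 4x the Throughput of the Previous Generation". Original link: https://research.nvidia.com/labs/nemotron/Nemotron-3/. This platform shares and recommends the article only and has no commercial purpose. Copyright belongs to the original author. If you have concerns about the content, copyright, or other issues, please contact us and we will remove the content promptly.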
