NVIDIA Releases the Nemotron 3 Family of Open-Source AI Models, with 4x the Throughput of the Previous Generation

Published: 2026-05-22 | Source: NVIDIA (official) | Author: NVIDIA (official)

Nemotron 3 technologies

  Hybrid MoE: The Nemotron 3 family of models uses a hybrid Mamba-Transformer MoE architecture to provide best-in-class throughput with accuracy on par with or better than standard Transformers.

  LatentMoE: The Super and Ultra models use LatentMoE, a novel hardware-aware expert design for improved accuracy.

  Multi-Token Prediction: Super and Ultra incorporate MTP layers for improved long-form text generation efficiency and better model quality.

  NVFP4: Super and Ultra are trained with NVFP4.

  Long Context: Nemotron 3 models support context lengths of up to 1M tokens.

  Multi-environment Reinforcement Learning Post-training: Nemotron 3 models are post-trained using a diverse set of RL environments, which helps them achieve superior accuracy across a broad range of tasks.

  Granular Reasoning Budget Control at Inference Time: Nemotron 3 models are trained to work with inference-time budget control.
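
  One way to picture inference-time budget control is a hard cap on the number of reasoning tokens, after which the reasoning span is closed and the model moves on to its answer. The sketch below illustrates only that idea and makes no claim about how Nemotron 3 implements it: the function name `apply_reasoning_budget`, the `</think>` end marker, and the toy token list are all hypothetical and do not come from the Nemotron 3 release.

```python
from typing import List


def apply_reasoning_budget(
    reasoning_tokens: List[str],
    budget: int,
    end_marker: str = "</think>",  # hypothetical delimiter, assumed for illustration
) -> List[str]:
    """Cap a reasoning trace at `budget` tokens and close it with `end_marker`."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Keep at most `budget` reasoning tokens, dropping any overflow.
    kept = reasoning_tokens[:budget]
    # Always close the reasoning span so the final answer can begin.
    return kept + [end_marker]


trace = ["step1", "step2", "step3", "step4"]
capped = apply_reasoning_budget(trace, budget=2)
print(capped)
```

  A trace shorter than the budget passes through unchanged apart from the closing marker, so the same call works for both small and large budgets.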

Nemotron 3 Nano

[Figure: Nano V3 comparison]

  Nemotron 3 Nano is a model with 31.6B total parameters, of which 3.2B are active per forward pass (3.6B including embeddings). It achieves better accuracy than our previous generation Nemotron 2 Nano while activating less than half as many parameters per forward pass.
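
  To put the parameter counts above in proportion, the short calculation below derives the share of the model that is active on each forward pass. It uses only the figures quoted above; the helper name `active_fraction` is made up for this illustration.

```python
def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used per forward pass."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


TOTAL_B = 31.6            # total parameters, in billions
ACTIVE_B = 3.2            # active parameters, excluding embeddings
ACTIVE_WITH_EMB_B = 3.6   # active parameters, including embeddings

share = active_fraction(ACTIVE_B, TOTAL_B)
share_with_emb = active_fraction(ACTIVE_WITH_EMB_B, TOTAL_B)

print(f"active share: {share:.1%}")                      # active share: 10.1%
print(f"active share (with embeddings): {share_with_emb:.1%}")  # 11.4%
```

  Roughly one parameter in ten participates in any given forward pass, which is the property a sparse MoE design relies on for throughput.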

Key highlights:

  • More accurate than GPT-OSS-20B and Qwen3-30B-A3B-Thinking-2507 on popular benchmarks spanning different categories.
  • On the 8K input / 16K output setting with a single H200, Nemotron 3 Nano provides inference throughput that is 3.3x higher than Qwen3-30B-A3B and 2.2x higher than GPT-OSS-20B.
  • Supports context lengths of up to 1M tokens while outperforming both GPT-OSS-20B and Qwen3-30B-A3B-Instruct-2507 on RULER across different context lengths.
  • We are releasing the model weights, training recipe, and all the data for which we hold redistribution rights.
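
  The two throughput ratios quoted in the highlights can be put on a common scale. The snippet below normalizes Nemotron 3 Nano to 1.0 and derives the implied relationship between the two baselines. That derived ratio is an inference from the two quoted numbers, assuming both were measured under the same 8K input / 16K output, single-H200 setting; it is not a figure reported in the source.

```python
# Speedups of Nemotron 3 Nano over each baseline, as quoted in the highlights.
NANO_OVER_QWEN3 = 3.3
NANO_OVER_GPT_OSS = 2.2


def relative_throughput(speedup: float) -> float:
    """Baseline throughput as a fraction of Nano's (Nano = 1.0)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


qwen3_rel = relative_throughput(NANO_OVER_QWEN3)
gpt_oss_rel = relative_throughput(NANO_OVER_GPT_OSS)

# Implied ratio between the two baselines (derived, not reported).
gpt_oss_over_qwen3 = NANO_OVER_QWEN3 / NANO_OVER_GPT_OSS

print(f"Qwen3-30B-A3B vs Nano: {qwen3_rel:.2f}")
print(f"GPT-OSS-20B vs Nano:   {gpt_oss_rel:.2f}")
print(f"GPT-OSS-20B vs Qwen3:  {gpt_oss_over_qwen3:.2f}")
```

  Under these assumptions, GPT-OSS-20B would run at about 1.5x the throughput of Qwen3-30B-A3B in that setting.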

Open Source

  Along with the Nemotron 3 white paper and the Nemotron 3 Nano technical report, we are releasing the following:

Checkpoints:

  • Nemotron 3 Nano 30B-A3B FP8: the final post-trained and FP8 quantized Nano model
  • Nemotron 3 Nano 30B-A3B BF16: the post-trained Nano model
  • Nemotron 3 Nano 30B-A3B Base BF16: the pre-trained base Nano model
  • Qwen-3-Nemotron-235B-A22B-GenRM: the GenRM used for RLHF

Data:

  • Nemotron-CC-v2.1: 2.5 trillion new English tokens from Common Crawl, including curated data from 3 recent snapshots, synthetic rephrasing, and translation to English from other languages.
  • Nemotron-CC-Code-v1: A pretraining dataset consisting of 428 billion high-quality code tokens obtained by processing Common Crawl Code pages using the Lynx + LLM pipeline from Nemotron-CC-Math-v1. The pipeline preserves equations and code, standardizes math equations to LaTeX, and removes noise.
  • Nemotron-Pretraining-Code-v2: A refresh of curated GitHub code references with multi-stage filtering, deduplication, and quality filters, plus large-scale synthetic code data.
  • Nemotron-Pretraining-Specialized-v1: Collection of synthetic datasets for specialized areas like STEM reasoning and scientific coding.
  • Nemotron-SFT-Data: Collection of new Nemotron 3 Nano SFT datasets.
  • Nemotron-RL-Data: Collection of new Nemotron 3 Nano RL datasets.

Model Recipes:

  For more details, please refer to the following:

  • Nemotron 3 Blogs
  • Nemotron 3 white paper: NVIDIA Nemotron 3: Efficient and Open Intelligence
  • Nemotron 3 Nano technical report: Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
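This article is reposted from NVIDIA (official). Author: NVIDIA (official). Original title: "NVIDIA Releases the Nemotron 3 Family of Open-Source AI Models, with 4x the Throughput of the Previous Generation". Original link: https://research.nvidia.com/labs/nemotron/Nemotron-3/. This platform only shares and recommends content and has no commercial purpose. Copyright belongs to the original author. If you have concerns about the content, copyright, or other matters, please contact us and we will remove the content promptly.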