Qwen iconQwen: Qwen3 Next 80B A3B Instruct

Model Type

Open weight model icon

Open Weight Model

80B parameters

Recommended Use Cases

Text Generation

Try Qwen3 Next 80B A3B Instruct

Qwen3 Next 80B A3B Instruct is Alibaba's novel-architecture instruction model featuring hybrid attention and extreme MoE sparsity, optimized for fast, stable responses without thinking traces.

Qwen3-Next-80B-A3B-Base outperforms Qwen3-32B-Base on downstream tasks with 10% of the total training cost and with 10x inference throughput for context over 32K.

  • Qwen Team

Overview

Qwen3-Next-80B-A3B-Instruct is the instruction-tuned variant of Alibaba's next-generation architecture, combining hybrid attention and extreme MoE sparsity for fast, direct responses. It targets production workloads requiring high throughput, RAG, tool use, and agentic workflows.

Key Features

  • Novel architecture: Hybrid Transformer-Mamba design
  • Extreme efficiency: 80B total, only 3.9B active per token
  • 10x throughput: Compared to Qwen3-32B at long contexts
  • No thinking traces: Direct responses without <think> blocks
  • 256K context: Native long-context, extendable to 1M

Technical Specifications

SpecificationValue
Total Parameters80B
Active Parameters3.9B
ArchitectureHybrid attention + High-sparsity MoE
Layers48
Hidden Dimension2048
Experts512 (10 activated + 1 shared)
Context Length256K tokens (1M with YaRN)
Training Data15T tokens
Release DateSeptember 2025

When to Use Qwen3-Next-80B-A3B-Instruct

Choose this model when you need:

  • Fast, direct responses at scale
  • High throughput on long contexts
  • RAG and retrieval workflows
  • Tool calling and agentic tasks
  • Production deployment with consistent output

Choose Thinking variant when you need:

  • Complex reasoning tasks
  • Visible thought processes
  • Mathematical problem solving
  • Maximum accuracy over speed

Availability

  • Open Weights: Hugging Face (Qwen/Qwen3-Next-80B-A3B-Instruct)
  • API: NVIDIA NIM, OpenRouter
  • Local: SGLang, vLLM, Ollama, llama.cpp

Role in Series

Qwen3-Next models:

  1. Qwen3-Next-80B-A3B-Instruct: Fast, production-optimized (this model)
  2. Qwen3-Next-80B-A3B-Thinking: Deep reasoning
  3. Qwen3-Coder-Next: Coding-specialized variant

Links