Qwen: Qwen3 Max Thinking

Model Type

Proprietary Model

API access only

Recommended Use Cases

Text Generation

Qwen3-Max-Thinking is Alibaba's flagship reasoning model, combining trillion-scale parameters with advanced test-time scaling and adaptive tool use to achieve performance comparable to GPT-5.2-Thinking, Claude Opus 4.5, and Gemini 3 Pro.

By scaling up model parameters and leveraging substantial computational resources for reinforcement learning, Qwen3-Max-Thinking achieves significant performance improvements across multiple dimensions.

(Qwen Team)

Overview

Released January 2026, Qwen3-Max-Thinking builds on Qwen3-Max with extended reasoning capabilities. It autonomously selects and leverages built-in Search, Memory, and Code Interpreter tools during conversations, and employs an experience-cumulative test-time scaling strategy that outperforms standard parallel sampling approaches.

Key Features

  • Adaptive tool use: Automatically invokes Search, Memory, and Code Interpreter without user intervention
  • Test-time scaling: Experience-cumulative, multi-round reasoning with self-reflection
  • 262K context: Long-context understanding for complex documents
  • Claude Code compatible: Works seamlessly with Claude Code via Anthropic API protocol
  • Reduced hallucinations: Search and Memory tools provide real-time information access

Capabilities

Adaptive Tools:

  • Search: Real-time web information retrieval during reasoning
  • Memory: Personalized responses based on conversation history
  • Code Interpreter: Execute code for computational reasoning
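
The adaptive tool loop described above can be pictured as a dispatcher that routes model-chosen tool calls to the three built-ins. This is a hypothetical sketch with stubbed tools, not the actual Qwen runtime; the function names merely mirror the list above.

```python
# Hypothetical sketch of adaptive tool use; all stubs are illustrative,
# none of this is Qwen's real tool runtime.

def search(query: str) -> str:
    """Stub for real-time web retrieval during reasoning."""
    return f"search results for: {query}"

def memory(key: str) -> str:
    """Stub for conversation-history lookup."""
    return f"recalled context for: {key}"

def code_interpreter(source: str) -> str:
    """Stub that would execute model-written code in a sandbox."""
    return f"executed: {source}"

TOOLS = {"search": search, "memory": memory, "code_interpreter": code_interpreter}

def run_turn(tool_calls: list[tuple[str, str]]) -> list[str]:
    """Route tool calls the model chose on its own to the matching tool."""
    return [TOOLS[name](arg) for name, arg in tool_calls]

# The model, not the user, decides which tools a turn needs:
results = run_turn([("search", "HMMT 2025 results"), ("code_interpreter", "2**10")])
```

The key point is that tool selection happens inside the reasoning loop, with no user intervention per call.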

Test-Time Scaling: The model uses an experience-cumulative strategy that distills key insights from past reasoning rounds, avoiding redundant re-derivation and focusing on unresolved uncertainties. This achieves higher context efficiency than naive parallel sampling.
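
The contrast with naive parallel sampling can be shown in a toy sketch. Everything here is a hypothetical illustration of the idea, not Qwen's actual algorithm; `attempt` stands in for one reasoning round.

```python
# Toy contrast: parallel sampling vs. an experience-cumulative loop.
# `attempt` is a stand-in for one reasoning round (hypothetical).

def attempt(problem: str, insights: list[str]) -> tuple[str, str]:
    """One reasoning round: returns (answer, distilled insight)."""
    # A real model would reason here; the stub just records what it reused.
    return (f"answer({len(insights)} insights reused)",
            f"insight from round {len(insights) + 1}")

def parallel_sampling(problem: str, n: int) -> list[str]:
    """Naive baseline: n independent rounds, nothing shared between them."""
    return [attempt(problem, [])[0] for _ in range(n)]

def experience_cumulative(problem: str, n: int) -> str:
    """Each round starts from insights distilled out of earlier rounds,
    so later rounds skip re-derivation and target open uncertainties."""
    insights: list[str] = []
    answer = ""
    for _ in range(n):
        answer, new_insight = attempt(problem, insights)
        insights.append(new_insight)  # carry distilled experience forward
    return answer
```

In the cumulative loop, round *k* sees the distilled output of rounds 1 through *k*-1, which is where the context-efficiency gain over independent samples comes from.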

Benchmark Performance

Qwen3-Max-Thinking demonstrates competitive performance across 19 benchmarks:

  • Knowledge: Strong C-Eval (93.7%), competitive MMLU-Pro
  • Reasoning: HMMT Feb 25 (98.0%), IMOAnswerBench (83.9%)
  • Agentic Search: HLE with tools (49.8%), leads all models
  • Instruction Following: Arena-Hard v2 (90.2%), leads all models
  • Tool Use: Competitive Tau² Bench, BFCL-V4

When to Use Qwen3-Max-Thinking

Choose Qwen3-Max-Thinking when you need:

  • Complex multi-step reasoning with tool integration
  • Problems requiring computational verification (math, code)
  • Tasks benefiting from real-time web search during reasoning
  • Research and analysis requiring deep thinking
  • Mathematical competitions and STEM problem solving
  • Agentic workflows with autonomous tool selection

Consider Qwen3-Max (non-thinking) when you need:

  • Faster responses for simpler tasks
  • Lower token consumption
  • Direct answers without extended reasoning traces
  • Production workloads with tight latency requirements

Consider other Qwen3 models when you need:

  • Open weights for customization → Qwen3-235B-A22B
  • Vision capabilities → Qwen3-VL
  • Coding specialist → Qwen3-Coder

Availability

  • Web: chat.qwen.ai (with adaptive tool-use)
  • API: Alibaba Cloud Model Studio (qwen3-max-2026-01-23)
  • Compatible: OpenAI API protocol, Anthropic API protocol (Claude Code)
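
Because the endpoint speaks the OpenAI API protocol, a request body can be assembled the usual way. The base URL below is an assumption to be checked against the Model Studio documentation; the model ID is the one listed above.

```python
import json

# Assumed Model Studio compatible-mode endpoint (verify against Alibaba
# Cloud docs); the model ID is the one from the Availability list.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
MODEL = "qwen3-max-2026-01-23"

def build_chat_request(prompt: str) -> dict:
    """Assemble an OpenAI-protocol chat.completions request body."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        # Thinking models emit long reasoning traces; leave token headroom.
        "max_tokens": 4096,
    }

body = json.dumps(build_chat_request("Prove that sqrt(2) is irrational."))
# POST `body` to f"{BASE_URL}/chat/completions" with an Authorization header.
```

Any OpenAI-compatible client can be pointed at the same base URL instead of building requests by hand.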

Role in Series

Qwen3-Max variants:

  1. Qwen3-Max: Trillion-parameter flagship, fast responses
  2. Qwen3-Max-Thinking: Extended reasoning with adaptive tools (this model)

Links