Qwen: Qwen3 VL 235B A22B Thinking
Model Type
Open Weight Model
235B parameters
Recommended Use Cases
Text Generation
Qwen3-VL-235B-A22B-Thinking is Alibaba's most powerful reasoning-enhanced vision-language model, with 235B total parameters and 22B active per token, optimized for complex multimodal reasoning and STEM tasks.
Overview
Qwen3-VL-235B-A22B-Thinking is the reasoning-optimized variant of the flagship Qwen3-VL model. It combines massive MoE capacity with extended chain-of-thought capabilities, excelling at mathematical reasoning from diagrams, scientific visual analysis, and multi-step video understanding where accuracy outweighs speed.
Key Features
- Flagship reasoning: Maximum capability with deliberate thinking
- Extended thinking traces: Visible <think> blocks for complex problems
- 1M context: 256K tokens native, expandable to 1M
- STEM excellence: Mathematical and scientific reasoning from visuals
- 100% on AIME25: top reported score on competition-math reasoning
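Because the model emits its reasoning inside visible <think> blocks, client code typically separates the trace from the final answer before display. A minimal sketch, assuming the trace is delimited by literal `<think>...</think>` tags in the response text (the helper name and sample response below are illustrative):

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Separate a <think>...</think> reasoning trace from the
    final answer that follows it. Returns (trace, answer);
    trace is empty if no think block is present."""
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        return "", response.strip()
    trace = match.group(1).strip()
    answer = response[match.end():].strip()
    return trace, answer

# Illustrative response string, not actual model output:
raw = "<think>Legs are 3 and 4, so the hypotenuse is 5.</think>The hypotenuse is 5."
trace, answer = split_thinking(raw)
```

Keeping the trace separate lets you log it for auditing while showing users only the final answer.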
Technical Specifications
| Specification | Value |
|---|---|
| Total Parameters | 235B |
| Active Parameters | 22B |
| Architecture | MoE transformer with CoT training |
| Vision Encoder | DeepStack multi-level ViT fusion |
| Context Length | 256K tokens (expandable to 1M) |
| Release Date | September 2025 |
Thinking vs Instruct (235B-A22B)
| Aspect | Thinking (this model) | Instruct |
|---|---|---|
| Response style | Extended chain-of-thought | Direct answers |
| Latency | Higher | Lower |
| Token consumption | Higher | Lower |
| Accuracy on hard tasks | Maximum | High |
| Best for | STEM, complex reasoning | Production, general tasks |
When to Use Qwen3-VL-235B-A22B-Thinking
Choose this model when you need:
- Maximum accuracy on complex visual reasoning
- Mathematical problem solving from diagrams
- Scientific analysis requiring step-by-step logic
- Long video understanding with causal reasoning
- Research and analysis where accuracy is critical
Consider alternatives when you need:
- Fast production responses (use Instruct variant)
- Lower deployment costs (use 30B-A3B models)
- Edge deployment (use 8B models)
Availability
- Open Weights: Hugging Face (Qwen/Qwen3-VL-235B-A22B-Thinking)
- API: OpenRouter, DeepInfra, Novita
- Local: vLLM with tensor parallelism (~471GB weights)
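The API providers above expose OpenAI-compatible chat-completions endpoints, so a multimodal request pairs an image part with a text part in one user turn. A minimal sketch of building such a payload; the model slug is an assumption here, so check the provider's model list before using it:

```python
import json

def build_vision_request(image_url: str, question: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload with one
    user turn containing an image part followed by a text part."""
    return {
        "model": "qwen/qwen3-vl-235b-a22b-thinking",  # assumed slug; verify with the provider
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
    }

payload = build_vision_request(
    "https://example.com/circuit.png",  # placeholder image URL
    "What current flows through R1? Show your reasoning.",
)
body = json.dumps(payload)  # POST this to the provider's /chat/completions endpoint
```

Since the Thinking variant produces long reasoning traces, budget a generous `max_tokens` value for hard STEM prompts.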
Role in Series
Qwen3-VL models by reasoning capability:
- Qwen3-VL-8B-Instruct: Fast, edge deployment
- Qwen3-VL-8B-Thinking: Edge reasoning
- Qwen3-VL-30B-A3B-Instruct: Efficient production
- Qwen3-VL-30B-A3B-Thinking: Efficient reasoning
- Qwen3-VL-235B-A22B-Instruct: Flagship production
- Qwen3-VL-235B-A22B-Thinking: Maximum reasoning (this model)