Qwen: Qwen3.5 397B A17B
Model Type
Open-weight MoE model: 397B total parameters, 17B active
Qwen3.5-397B-A17B is Alibaba's unified vision-language foundation model, combining text and visual understanding in a single architecture with 397B parameters (17B active) and native 262K context.
"Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility."
- Qwen Team
Overview
Released in February 2026, Qwen3.5 introduces a fundamentally new approach: early-fusion multimodal training that achieves cross-generational parity with both the Qwen3 (text) and Qwen3-VL (vision) models in a single unified architecture. The model combines Gated Delta Networks with sparse MoE for high-throughput inference.
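Because the weights are open, the unified model can be run locally. The sketch below uses Hugging Face transformers and assumes Qwen3.5 exposes the same image-text-to-text interface as Qwen3-VL; the repo id is hypothetical until the official model card is published.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# Assumptions: the repo id below, and that Qwen3.5 reuses Qwen3-VL's
# image-text-to-text interface. Verify against the official model card.
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "Qwen/Qwen3.5-397B-A17B"  # hypothetical repo id

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",
    device_map="auto",  # shards the MoE weights across visible GPUs
)

# Text and images go through the same chat template: one model,
# no separate vision endpoint.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},
            {"type": "text", "text": "Summarize the trend in this chart."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

Note that a 397B checkpoint will not fit on a single GPU even with only 17B parameters active; device_map="auto" distributes the weights across whatever accelerators are visible.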
Key Features
- 397B total / 17B active parameters (MoE)
- 262K native context, extensible to 1M+ with YaRN (see the configuration sketch after this list)
- Unified multimodal: Text, image, and video in one model
- 201 languages (expanded from Qwen3's 119)
- Gated Delta Networks: Hybrid attention for efficiency
- Multi-token prediction: Faster inference
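The 262K-to-1M extension works by rescaling rotary position embeddings with YaRN rather than by retraining. A minimal sketch, assuming the checkpoint follows the rope_scaling convention of earlier open-weight Qwen releases (the factor shown is illustrative, and for a multimodal checkpoint the block may sit under the text sub-config):

```python
# Sketch: extending the context window beyond the native 262K with YaRN.
# Earlier open-weight Qwen releases configure YaRN via a rope_scaling
# block of this shape; whether Qwen3.5 does the same is an assumption.
from transformers import AutoConfig, AutoModelForImageTextToText

MODEL_ID = "Qwen/Qwen3.5-397B-A17B"  # hypothetical repo id

config = AutoConfig.from_pretrained(MODEL_ID)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                               # 262,144 * 4 ~ 1M tokens
    "original_max_position_embeddings": 262144,  # native training length
}

model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```

YaRN scaling is static, so earlier Qwen model cards recommend enabling it only when inputs actually exceed the native window, since it can slightly degrade short-context quality.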
Benchmark Performance
Language tasks: Competitive with GPT-5.2, Claude 4.5 Opus, and Gemini 3 Pro across knowledge, reasoning, coding, and multilingual benchmarks.
Vision tasks: Strong performance on MMMU (85.0%), MathVision (88.6%), OCRBench (93.1%), and video understanding benchmarks.
Agent tasks: Leading scores on τ²-Bench (86.7%), BFCL-V4 (72.9%), and visual agent benchmarks.
When to Use Qwen3.5-397B-A17B
Choose Qwen3.5-397B-A17B when you need:
- Unified text and vision capabilities in one model
- Self-hosted deployment with open weights (see the serving sketch after this list)
- Strong multilingual support (201 languages)
- Visual agent capabilities (GUI operation)
- Long-context processing up to 1M tokens
- Apache 2.0 licensing for commercial use
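For the self-hosted path, a serving sketch with vLLM, assuming vLLM supports the Gated Delta Network plus sparse-MoE hybrid as it did for earlier Qwen MoE models; the model id and parallelism degree are illustrative:

```python
# Sketch: self-hosted serving with vLLM. The repo id, GPU count, and
# vLLM support for this architecture are assumptions; a 397B checkpoint
# needs a multi-GPU node even with only 17B active parameters.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.5-397B-A17B",  # hypothetical repo id
    tensor_parallel_size=8,          # e.g. one 8-GPU node
    max_model_len=262144,            # the native context window
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.chat(
    [{"role": "user", "content": "Explain multi-token prediction briefly."}],
    params,
)
print(outputs[0].outputs[0].text)
```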
Choose Qwen3.5-Plus (API) when you need:
- 1M context by default without configuration
- Built-in official tools and adaptive tool use
- Managed production infrastructure
- No deployment overhead
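For comparison, the managed option is a standard OpenAI-compatible call. The base URL below is DashScope's published compatible-mode endpoint for current Qwen models; the "qwen3.5-plus" model id is an assumption:

```python
# Sketch: calling the managed Qwen3.5-Plus endpoint through Alibaba
# Cloud's OpenAI-compatible API. The base URL is DashScope's published
# compatible-mode URL; "qwen3.5-plus" is an assumed model id.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen3.5-plus",  # assumed model id
    messages=[{"role": "user", "content": "Hello, Qwen3.5!"}],
)
print(response.choices[0].message.content)
```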
Choose Qwen3-VL when you need:
- Dedicated vision model with proven track record
- Smaller deployment footprint
- Existing Qwen3-VL integrations
Choose Qwen3 text models when you need:
- Text-only workloads without vision overhead
- Maximum text performance per parameter
Role in Series
Qwen flagship evolution:
- Qwen3 (Apr 2025): Text-only foundation, hybrid thinking
- Qwen3-VL (Sep 2025): Separate vision-language model
- Qwen3.5 (Feb 2026): Unified vision-language foundation (this model)