Qwen: Qwen3.5 397B A17B
Model Type
Open-weight MoE model: 397B total parameters, 17B active
Qwen3.5-397B-A17B is Alibaba's unified vision-language foundation model, combining text and visual understanding in a single architecture with 397B parameters (17B active) and native 262K context.
"Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility."
- Qwen Team
Overview
Released in February 2026, Qwen3.5 introduces a fundamentally new approach: early-fusion multimodal training that achieves cross-generational parity with both the Qwen3 (text) and Qwen3-VL (vision) models in a single unified architecture. The model combines Gated Delta Networks with sparse MoE for high-throughput inference.
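Because the weights are open, the unified model can be run locally. The sketch below uses Hugging Face transformers and assumes Qwen3.5 exposes the same image-text-to-text interface as Qwen3-VL; the repo id is hypothetical until the official model card is published.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# Assumptions: the repo id below, and that Qwen3.5 reuses Qwen3-VL's
# image-text-to-text interface. Verify against the official model card.
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "Qwen/Qwen3.5-397B-A17B"  # hypothetical repo id

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",
    device_map="auto",  # shards the MoE weights across visible GPUs
)

# Text and images go through the same chat template: one model,
# no separate vision endpoint.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},
            {"type": "text", "text": "Summarize the trend in this chart."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

Note that a 397B checkpoint will not fit on a single GPU even with only 17B parameters active; device_map="auto" distributes the weights across whatever accelerators are visible.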
Key Features
- 397B total / 17B active parameters (MoE)
- 262K native context, extensible to 1M+ with YaRN (see the configuration sketch after this list)
- Unified multimodal: Text, image, and video in one model
- 201 languages (expanded from Qwen3's 119)
- Gated Delta Networks: Hybrid attention for efficiency
- Multi-token prediction: Faster inference
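The 262K-to-1M extension works by rescaling rotary position embeddings with YaRN rather than by retraining. A minimal sketch, assuming the checkpoint follows the rope_scaling convention of earlier open-weight Qwen releases (the factor shown is illustrative, and for a multimodal checkpoint the block may sit under the text sub-config):

```python
# Sketch: extending the context window beyond the native 262K with YaRN.
# Earlier open-weight Qwen releases configure YaRN via a rope_scaling
# block of this shape; whether Qwen3.5 does the same is an assumption.
from transformers import AutoConfig, AutoModelForImageTextToText

MODEL_ID = "Qwen/Qwen3.5-397B-A17B"  # hypothetical repo id

config = AutoConfig.from_pretrained(MODEL_ID)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                               # 262,144 * 4 ~ 1M tokens
    "original_max_position_embeddings": 262144,  # native training length
}

model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```

YaRN scaling is static, so earlier Qwen model cards recommend enabling it only when inputs actually exceed the native window, since it can slightly degrade short-context quality.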
Benchmark Performance
Language tasks: Competitive with GPT-5.2, Claude 4.5 Opus, and Gemini 3 Pro across knowledge, reasoning, coding, and multilingual benchmarks.
Vision tasks: Strong performance on MMMU (85.0%), MathVision (88.6%), OCRBench (93.1%), and video understanding benchmarks.
Agent tasks: Leading scores on τ²-Bench (86.7%), BFCL-V4 (72.9%), and visual agent benchmarks.
When to Use Qwen3.5-397B-A17B
Choose Qwen3.5-397B-A17B when you need:
- Unified text and vision capabilities in one model
- Self-hosted deployment with open weights (see the serving sketch after this list)
- Strong multilingual support (201 languages)
- Visual agent capabilities (GUI operation)
- Long-context processing up to 1M tokens
- Apache 2.0 licensing for commercial use
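For the self-hosted path, a serving sketch with vLLM, assuming vLLM supports the Gated Delta Network plus sparse-MoE hybrid as it did for earlier Qwen MoE models; the model id and parallelism degree are illustrative:

```python
# Sketch: self-hosted serving with vLLM. The repo id, GPU count, and
# vLLM support for this architecture are assumptions; a 397B checkpoint
# needs a multi-GPU node even with only 17B active parameters.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.5-397B-A17B",  # hypothetical repo id
    tensor_parallel_size=8,          # e.g. one 8-GPU node
    max_model_len=262144,            # the native context window
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.chat(
    [{"role": "user", "content": "Explain multi-token prediction briefly."}],
    params,
)
print(outputs[0].outputs[0].text)
```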
Choose Qwen3.5-Plus (API) when you need:
- 1M context by default without configuration
- Built-in official tools and adaptive tool use
- Managed production infrastructure
- No deployment overhead
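For comparison, the managed option is a standard OpenAI-compatible call. The base URL below is DashScope's published compatible-mode endpoint for current Qwen models; the "qwen3.5-plus" model id is an assumption:

```python
# Sketch: calling the managed Qwen3.5-Plus endpoint through Alibaba
# Cloud's OpenAI-compatible API. The base URL is DashScope's published
# compatible-mode URL; "qwen3.5-plus" is an assumed model id.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen3.5-plus",  # assumed model id
    messages=[{"role": "user", "content": "Hello, Qwen3.5!"}],
)
print(response.choices[0].message.content)
```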
Choose Qwen3-VL when you need:
- Dedicated vision model with proven track record
- Smaller deployment footprint
- Existing Qwen3-VL integrations
Choose Qwen3 text models when you need:
- Text-only workloads without vision overhead
- Maximum text performance per parameter
Role in Series
Qwen flagship evolution:
- Qwen3 (Apr 2025): Text-only foundation, hybrid thinking
- Qwen3-VL (Sep 2025): Separate vision-language model
- Qwen3.5 (Feb 2026): Unified vision-language foundation (this model)