DeepSeek: DeepSeek V3.2 Exp
Model Type
Open-weights model (MIT license)
Available via API or self-hosted weights
An experimental version of DeepSeek-V3.2, serving as an intermediate step toward next-generation architecture (October 2025). V3.2-Exp introduces DeepSeek Sparse Attention while maintaining performance parity with V3.1-Terminus.
Per DeepSeek:
DeepSeek-V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.
Role in V3.2 Series
V3.2-Exp is a research-oriented release focused on validating the DeepSeek Sparse Attention architecture rather than advancing raw task accuracy. Training configurations were deliberately aligned with V3.1-Terminus to enable direct comparison.
Key Features
- Architecture: 685B MoE with DeepSeek Sparse Attention (DSA)
- Context Window: 128K tokens
- Purpose: Architecture validation and research
- License: MIT
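The model can be reached through DeepSeek's OpenAI-compatible chat-completions API. The sketch below builds (but does not send) such a request; the endpoint URL and the `deepseek-chat` model alias are assumptions here, so check DeepSeek's official API documentation before use.

```python
# Hypothetical sketch of an OpenAI-compatible request to the DeepSeek API.
# API_URL and the model alias are assumptions, not confirmed by this page.
import json
import urllib.request

API_URL = "https://api.deepseek.com/chat/completions"  # assumed endpoint

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Construct a chat-completion request object; sending it is up to the caller."""
    payload = {
        "model": "deepseek-chat",  # assumed alias routing to V3.2-Exp
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(build_request(...))` returns a JSON body whose `choices[0].message.content` field holds the completion, per the OpenAI-compatible schema.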
DeepSeek Sparse Attention (DSA)
- Fine-grained sparse attention mechanism
- Substantial improvements in long-context training and inference efficiency
- Maintains virtually identical model output quality
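The core idea behind a fine-grained sparse attention mechanism like DSA can be illustrated with a minimal NumPy sketch: each query attends only to its top-k highest-scoring keys instead of all keys. This is a simplification under stated assumptions, not DeepSeek's implementation; DSA uses a dedicated indexer to select tokens, whereas this sketch selects directly from the raw attention scores.

```python
# Minimal top-k sparse attention sketch (illustrative only, not DSA itself).
import numpy as np

def sparse_attention(Q, K, V, k):
    """For each query row, attend only to its k highest-scoring keys."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)              # (n_q, n_k) scaled dot products
    # Threshold at each row's k-th largest score; mask the rest to -inf
    kth = np.sort(scores, axis=-1)[:, -k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving keys (exp(-inf) contributes zero weight)
    w = np.exp(masked - masked.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V                               # (n_q, d) attended values
```

With `k` equal to the full key count this reduces exactly to dense softmax attention; shrinking `k` cuts the per-query work from O(n) to O(k), which is the source of the long-context efficiency gains the bullets above describe.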