DeepSeek: DeepSeek V3.1 Base

Model Type

Open-Weights Model

Downloadable weights (MIT license)

Recommended Use Cases

Text Generation

The pre-trained base model for the V3.1 series, built upon the original V3 checkpoint with extended long-context training (August 2025). V3.1-Base serves as the foundation for all V3.1 instruct and chat models.

Per DeepSeek:

DeepSeek-V3.1 is post-trained on the top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long context extension approach.

Role in V3.1 Series

V3.1-Base is the pre-trained foundation model without instruction tuning or RLHF. It's intended for researchers who want to build custom fine-tuned models or study the base capabilities before post-training.
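
For researchers loading it directly, a minimal sketch of sampling a raw completion through the Hugging Face transformers library is below. The repo ID deepseek-ai/DeepSeek-V3.1-Base matches DeepSeek's published weights; at this scale the full model needs a multi-GPU node, so treat this as illustrative rather than laptop-ready.

```python
# Minimal sketch: prompting the base model as a plain text completer.
# Assumes the published Hugging Face repo "deepseek-ai/DeepSeek-V3.1-Base"
# and enough GPU memory to shard a ~685B-parameter checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-V3.1-Base"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",   # keep the precision the weights were released in
    device_map="auto",    # shard layers across all visible GPUs
    trust_remote_code=True,
)

# Base models complete text; there is no chat template to apply.
prompt = "The key idea behind mixture-of-experts language models is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Because there is no instruction tuning, the model will continue the prompt rather than answer it; few-shot formatting in the prompt itself is the usual way to steer a base model.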

Key Features

  • Architecture: Mixture-of-Experts (MoE) with 671B total parameters (685B counting the Multi-Token Prediction module), 37B activated per token
  • Context Window: 128K tokens
  • Format: FP8 weights with the UE8M0 scale data format (decoded in the sketch after this list)
  • License: MIT
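
UE8M0 stores each block scale as a bare 8-bit exponent. Below is a minimal sketch of the decode, assuming UE8M0 follows the OCP Microscaling (MX) E8M0 scale encoding (bias 127, the all-ones byte reserved for NaN); it illustrates the format, not DeepSeek's actual kernels.

```python
# Minimal sketch of decoding a UE8M0 block-scale byte, assuming the
# OCP MX E8M0 layout: unsigned 8-bit exponent, no sign, no mantissa,
# so every representable scale is an exact power of two.

def decode_ue8m0(byte: int) -> float:
    """Decode one UE8M0 scale byte to its floating-point value."""
    if not 0 <= byte <= 255:
        raise ValueError("UE8M0 is a single unsigned byte")
    if byte == 255:
        return float("nan")     # reserved NaN encoding in E8M0
    return 2.0 ** (byte - 127)  # exponent bias of 127

# Dequantizing a block is then fp8_value * scale:
print(decode_ue8m0(127))  # 1.0, the bias point
print(decode_ue8m0(120))  # 2**-7 = 0.0078125
```

Power-of-two scales keep dequantization down to an exponent adjustment, which is part of what makes the format cheap to support in hardware.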

Long-Context Training

Context capability was extended through a two-phase training approach (the stated multipliers are worked through in the sketch after this list):

  • 32K Extension Phase: 630B tokens (10x increase from V3)
  • 128K Extension Phase: 209B tokens (3.3x increase from V3)
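
A quick arithmetic sketch of what those multipliers imply about V3's original long-context budgets; the 3.3x figure is presumably rounded, so the derived baselines are approximate.

```python
# Derive the implied V3-era token budgets from the figures above.
phases = {
    "32K extension":  {"v31_tokens_B": 630, "increase": 10.0},
    "128K extension": {"v31_tokens_B": 209, "increase": 3.3},
}

for name, p in phases.items():
    implied_v3 = p["v31_tokens_B"] / p["increase"]
    print(f"{name}: {p['v31_tokens_B']}B tokens in V3.1 (~{implied_v3:.0f}B in V3)")

# Total additional long-context training in V3.1-Base:
print(f"Total: {sum(p['v31_tokens_B'] for p in phases.values())}B tokens")
```

Both phases work out to roughly 63B tokens in V3, for a total of 839B long-context tokens in V3.1-Base.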
