DeepSeek: R1
Model Type
Open-Weights Model
API access or self-hosted
Recommended Use Cases
Text Generation
DeepSeek's first-generation reasoning model, released in January 2025 and trained via large-scale reinforcement learning to reach performance comparable to OpenAI o1 on math, code, and reasoning tasks. Its precursor, DeepSeek-R1-Zero, showed that reasoning capabilities can emerge through pure RL without supervised fine-tuning; R1 adds a small cold-start SFT stage before RL to improve readability and performance.
Per DeepSeek:
DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.
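The hosted model is served through an OpenAI-compatible chat-completions API. A minimal sketch using only the standard library is below; the endpoint URL and the `deepseek-reasoner` model id match DeepSeek's published API docs at the time of writing, but verify both before relying on them. The request is only sent if a `DEEPSEEK_API_KEY` environment variable is set.

```python
import json
import os
import urllib.request

# Assumption: DeepSeek exposes an OpenAI-compatible chat-completions
# endpoint, and "deepseek-reasoner" is the model id that routes to R1.
# Check DeepSeek's API documentation for the current values.
API_URL = "https://api.deepseek.com/chat/completions"

def build_request(prompt: str) -> dict:
    """Construct the JSON payload for a single-turn chat completion."""
    return {
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Prove that the square root of 2 is irrational.")

# Only perform the network call when an API key is actually configured.
api_key = os.environ.get("DEEPSEEK_API_KEY")
if api_key:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```

The same payload works with the official `openai` Python SDK pointed at DeepSeek's base URL, since the wire format is identical.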
Key Features
- Architecture: Mixture-of-Experts (MoE), 671B total parameters, 37B activated per token
- Base Model: DeepSeek-V3-Base
- Training: Large-scale reinforcement learning with rule-based rewards
- License: MIT (supports commercial use and distillation)
Benchmarks
- AIME 2024: ~79.8% pass@1
- MATH-500: ~97.3% pass@1
- Codeforces: 2,029 Elo rating
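The scores above are pass@1: the fraction of problems solved per sampled attempt. The general pass@k metric is commonly computed with the unbiased estimator from the HumanEval paper (Chen et al., 2021), which at k=1 reduces to plain per-sample accuracy; a small reference implementation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples drawn (without replacement) from n generations is correct,
    given that c of the n generations are correct."""
    if n - c < k:
        # Fewer incorrect samples than k: every draw contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 16 samples per problem, 12 correct -> pass@1 = 12/16 = 0.75.
print(pass_at_k(16, 12, 1))  # 0.75
```

Averaging this quantity over all benchmark problems yields the reported score.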
Distilled Versions
DeepSeek open-sourced smaller distilled models based on Qwen and Llama:
- DeepSeek-R1-Distill-Qwen: 1.5B, 7B, 14B, 32B
- DeepSeek-R1-Distill-Llama: 8B, 70B
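The distills are small enough to run locally. A sketch for the smallest one, assuming the Hugging Face repo id `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` and the sampling settings DeepSeek recommends for the distilled models (temperature around 0.6, top_p 0.95); adjust to the official model card if these have changed:

```python
# Assumed repo id and recommended sampling settings for the R1 distills;
# verify both against the Hugging Face model card before use.
MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
GEN_KWARGS = {"temperature": 0.6, "top_p": 0.95, "max_new_tokens": 1024}

def generate(prompt: str) -> str:
    """Load the distilled model with transformers and sample one reply."""
    # Lazy import so the module can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    messages = [{"role": "user", "content": prompt}]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, do_sample=True, **GEN_KWARGS)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Because the distills inherit R1's chat template, the reply begins with a `<think>...</think>` reasoning trace followed by the final answer.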