Try That LLM

DeepSeek is a Chinese AI company founded in 2023, dedicated to making AGI a reality. Based in Hangzhou, DeepSeek is known for producing frontier-level open-weight models at a fraction of typical training costs.

Notable Models

DeepSeek-V3 Series — Flagship general-purpose models (685B MoE)

DeepSeek-V3.2 — Reasoning-first model built for agents (December 2025)
DeepSeek-V3.1 — General-purpose flagship
DeepSeek-V3.1-Terminus — Optimized variant

DeepSeek-R1 Series — Reasoning models trained via reinforcement learning (January 2025)

DeepSeek-R1 — 671B MoE (37B active); on par with OpenAI o1
DeepSeek-R1-Distill — Distilled variants based on Qwen and Llama (1.5B to 70B)

Specialized Models

DeepSeek-Coder-V2 — Code generation
DeepSeek-Math-V2 — Mathematical reasoning
DeepSeek-VL2 — Vision-language model
DeepSeek-OCR — Optical character recognition (3B)

Multimodal

Janus-Pro-7B — Unified multimodal understanding and generation

Key Innovations

Mixture of Experts (MoE): 671B total parameters in the main model (685B on Hugging Face once the 14B multi-token prediction module is counted), with only 37B active per token
Reinforcement Learning for Reasoning: R1-Zero was trained via large-scale RL without supervised fine-tuning; R1 adds a small cold-start fine-tuning stage before RL
Cost Efficiency: V3's final training run reportedly cost about $5.6M in GPU time (2.788M H800 GPU-hours at an assumed $2 per hour), vs. a reported $100M+ for comparable models
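
To make the "only a fraction of parameters active" idea concrete, here is a minimal sketch of top-k expert routing. It assumes a generic softmax gate and is not DeepSeek's actual router, which adds details such as shared experts and load balancing. The names `route_top_k` and `moe_forward`, the four toy experts, and the gate scores are all hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and mix their outputs by gate weight."""
    selected = route_top_k(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in selected)
    return output, [i for i, _ in selected]

# Four toy "experts"; expert i just scales its input by (i + 1).
experts = [lambda x, scale=i + 1: x * scale for i in range(4)]
gate_scores = [0.1, 2.0, -1.0, 1.5]  # hypothetical router logits for one token

output, used = moe_forward(3.0, experts, gate_scores, k=2)
print("experts used:", used)

# Fraction of parameters active per token, using the R1 figures quoted above
# (671B total, 37B active).
active_fraction = 37 / 671
print(f"active fraction: {active_fraction:.1%}")
```

Only two of the four experts run for this token, so compute scales with the active experts rather than the total count. The same reasoning is why a model with 37B active parameters out of 671B touches roughly one eighteenth of its weights per token.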
