DeepSeek

DeepSeek is a Chinese AI company founded in 2023, dedicated to making AGI a reality. Based in Hangzhou, DeepSeek is known for producing frontier-level open-weight models at a fraction of typical training costs.

Notable Models

DeepSeek-V3 Series — Flagship general-purpose models (671B MoE, 37B active per token; released checkpoints total 685B including the multi-token-prediction module)

DeepSeek-R1 Series — Reasoning models trained via reinforcement learning (January 2025)

  • DeepSeek-R1 — 671B MoE (37B active); reported as comparable to OpenAI o1 on math, code, and reasoning benchmarks
  • DeepSeek-R1-Distill — Distilled variants based on Qwen and Llama (1.5B to 70B)

Specialized Models

Multimodal

  • Janus-Pro-7B — Unified multimodal understanding and generation

Key Innovations

  • Mixture of Experts (MoE): 671B total parameters (685B in the released checkpoint, which also counts the multi-token-prediction module) with only 37B active per token
  • Reinforcement Learning for Reasoning: R1-Zero trained via large-scale RL without supervised fine-tuning; R1 adds a small cold-start supervised stage before RL
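  • Cost Efficiency: V3's final training run reportedly used about 2.79M H800 GPU-hours, roughly $5.6M at an assumed $2 per GPU-hour (excluding prior research and ablation experiments), vs. $100M+ estimated for comparable models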
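
To make the Mixture of Experts point above concrete, here is a minimal sketch of top-k expert routing in plain Python. It illustrates the generic technique only, not DeepSeek's actual router: the eight toy experts, the softmax gate, and the names `route_top_k` and `moe_forward` are all assumptions made for this example. The final lines compute the active-parameter share from the 671B total and 37B active figures listed for R1.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of only the k selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy setup: 8 experts, each of which just scales its input.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
random.seed(0)
router_logits = [random.gauss(0, 1) for _ in experts]

selected = route_top_k(router_logits, k=2)
output = moe_forward(3.0, experts, router_logits, k=2)

# Share of parameters used per token, from the figures quoted for R1.
active_fraction = 37 / 671

print(f"selected experts and weights: {selected}")
print(f"toy MoE output: {output}")
print(f"active parameter share: {active_fraction:.1%}")
```

Only two of the eight toy experts run for this input, which is the same idea that lets a very large model spend a small share of its parameters (about 5.5% with the figures above) on each token.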
