MiniMax M2.5
Model Type
Proprietary model (API access only)
MiniMax M2.5 is MiniMax's flagship model for coding and agentic workflows, achieving state-of-the-art performance on software engineering benchmarks while delivering industry-leading inference speed.
Extensively trained with reinforcement learning in hundreds of thousands of complex real-world environments, M2.5 is SOTA in coding, agentic tool use and search, office work, and a range of other economically valuable tasks. — MiniMax
Overview
Released in February 2026, M2.5 caps a year of rapid advancement in the M-series, climbing from 56% to 80.2% on SWE-Bench Verified. The model is trained to reason efficiently and decompose tasks optimally, completing complex agentic tasks 37% faster than its predecessor while matching the speed of Claude Opus 4.6.
Key Capabilities
Software Engineering:
- Full development lifecycle coverage: system design, development, feature iteration, code review, and testing
- Full-stack across Web, Android, iOS, and Windows platforms
- Server-side development: APIs, business logic, and databases, not just frontend demos
- Multi-language proficiency beyond Python
Agentic Tasks:
- Precise search iterations with better token efficiency
- Web browsing and context management
- Multi-expert workflow coordination
- Tool-use reasoning across multiple turns
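Multi-turn tool use means the agent framework loops: it sends the conversation to the model, executes any tool the model requests, appends the result, and repeats until the model produces a final answer. The sketch below shows that loop with the model stubbed out; in a real deployment each `stub_model` call would be a chat-completions request to M2.5, and the tool and message format would follow your API provider's conventions.

```python
# Minimal sketch of the multi-turn tool-use loop an agent framework runs
# around a model like M2.5. The model is stubbed out here; in practice each
# step would be an API call that may return one or more tool calls.
import json

def get_weather(city: str) -> str:
    """Toy tool the 'model' can request."""
    return json.dumps({"city": city, "temp_c": 21})

TOOLS = {"get_weather": get_weather}

def stub_model(messages):
    """Stand-in for the model: request a tool once, then answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant",
                "tool_call": {"name": "get_weather",
                              "arguments": {"city": "Shanghai"}}}
    last_tool = [m for m in messages if m["role"] == "tool"][-1]
    result = json.loads(last_tool["content"])
    return {"role": "assistant",
            "content": f"It is {result['temp_c']}°C in {result['city']}."}

def run_agent(user_prompt: str) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(8):  # cap the number of turns
        reply = stub_model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]  # final answer ends the loop
        tool_output = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "content": tool_output})
    raise RuntimeError("turn limit exceeded")

print(run_agent("What's the weather in Shanghai?"))
```

The BFCL Multi-Turn benchmark cited below measures exactly this kind of behavior: choosing the right tool, passing well-formed arguments, and carrying state across turns.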
Office & Productivity:
- Advanced workspace scenarios: Word, PPT, Excel financial modeling
- Visual and interactive application development
Performance Highlights
| Benchmark | Score | Notes |
|---|---|---|
| SWE-Bench Verified | 80.2% | Surpasses GPT-5.2, near Claude Opus 4.6 |
| Multi-SWE-Bench | 51.3% | Industry-leading multilingual coding |
| BrowseComp | 76.3% | Web search and context management |
| BFCL Multi-Turn | 76.8% | Tool-use reasoning |
Developer Tool Compatibility
M2.5 generalizes across popular coding agent frameworks:
- Claude Code
- Cursor
- Cline
- Roo Code
- Kilo Code
- Codex CLI
- OpenCode
- Droid
- TRAE
- Grok CLI
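Most of the tools above can target any OpenAI-compatible endpoint, so pointing them at M2.5 usually reduces to three settings. The base URL, model identifier, and environment-variable name below are illustrative assumptions, not MiniMax's actual values; check the official API documentation for the real ones.

```python
# Sketch: the three settings most coding agents need to target an
# OpenAI-compatible model endpoint. The URL, model id, and env var name
# are hypothetical placeholders, not MiniMax's actual values.
import os

def make_agent_config(api_key: str) -> dict:
    """Assemble the minimal client configuration for a coding agent."""
    return {
        "base_url": "https://api.example-minimax.invalid/v1",  # hypothetical
        "model": "minimax-m2.5",                                # hypothetical
        "api_key": api_key,
    }

config = make_agent_config(os.environ.get("MINIMAX_API_KEY", "sk-placeholder"))
print(sorted(config))  # the three keys every OpenAI-style client expects
```

In practice each tool exposes these settings differently (environment variables, a settings UI, or a config file), but the underlying triple of endpoint, model name, and key is the same.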
When to Use MiniMax M2.5
Choose M2.5 when you need:
- High-speed agentic coding workflows
- Multi-language software engineering beyond Python
- Full-stack application development
- Complex tool-use chains with tight latency requirements
- Open weights for self-hosting or fine-tuning
- Cost-effective high-throughput production workloads
Choose M2.5-lightning when you need:
- Identical results at even higher inference speed
- Latency-critical applications
Consider alternatives when you need:
- Maximum accuracy regardless of speed → Claude Opus 4.6
- Extended reasoning with adaptive tools → Qwen3-Max-Thinking
- Vision and multimodal capabilities → dedicated vision models
Role in Series
MiniMax M-series progression:
- M1: Initial release (~56% SWE-Bench)
- M2: Open-weight coding model with 10B active / 230B total parameters
- M2.1: Improved multilingual and full-stack capabilities
- M2.5: Current flagship, SOTA coding and agents (this model)