Claude Opus 4.6

Claude Opus 4.6 is Anthropic's most intelligent model, featuring state-of-the-art performance on agentic coding, knowledge work, and expert-level reasoning, with a 1M token context window in beta—a first for Opus-class models.

Claude Opus 4.6 is the strongest model Anthropic has shipped. It takes complicated requests and actually follows through, breaking them into concrete steps, executing, and producing polished work even when the task is ambitious.

Sarah Sachs, AI Lead, Notion

Overview

Released on February 5, 2026, Claude Opus 4.6 improves on its predecessor's coding skills with more careful planning, longer agentic task sustenance, better operation in large codebases, and enhanced code review and debugging abilities. It outperforms GPT-5.2 by 144 Elo points on GDPval-AA (economically valuable knowledge work) and leads all frontier models on Humanity's Last Exam.

Key Features

1M token context (beta): First Opus-class model with million-token context
128K output tokens: Complete larger tasks in single requests
Adaptive thinking: Model decides when deeper reasoning is helpful
Effort controls: Four levels (low, medium, high, max) for intelligence/speed tradeoffs
Context compaction: Auto-summarizes context for longer-running tasks
Agent teams: Coordinate multiple agents in parallel (Claude Code)

Technical Specifications

Specification	Value
Model ID	`claude-opus-4-6`
Context Length	200K standard, 1M beta
Max Output	128K tokens
Release Date	February 5, 2026
Knowledge Cutoff	May 2025

Benchmark Performance

State-of-the-art results:

Terminal-Bench 2.0: Highest score (agentic coding)
Humanity's Last Exam: Leads all frontier models (multidisciplinary reasoning)
GDPval-AA: +144 Elo vs GPT-5.2, +190 vs Opus 4.5 (knowledge work)
BrowseComp: Best performance (hard-to-find information retrieval)
BigLaw Bench: 90.2% (legal reasoning)
MRCR v2 (8-needle 1M): 76% vs Sonnet 4.5's 18.5% (long-context retrieval)

Domain strengths:

Cybersecurity investigations
Life sciences (2× improvement over Opus 4.5)
Financial analysis
Multilingual coding
Root cause analysis

New API Features

Adaptive Thinking: Claude decides when to use extended thinking based on task complexity. Enable with effort parameter.

Effort Levels:

low: Fast responses, minimal reasoning
medium: Balanced performance
high (default): Deep reasoning when useful
max: Maximum reasoning depth

When to Use Claude Opus 4.6

Choose Opus 4.6 when you need:

Maximum intelligence for complex tasks
Long-context understanding (1M tokens)
Agentic coding with long-horizon planning
Expert-level reasoning in specialized domains
Financial, legal, or scientific analysis
Large codebase navigation and refactoring

Consider alternatives when you need:

Fast, cost-effective responses → Claude Sonnet 4.5
High volume, simple tasks → Claude Haiku 4.5
Lower latency requirements → Reduce effort level or use Sonnet

Safety Profile

Opus 4.6 maintains Anthropic's safety standards:

Low rates of misaligned behaviors (deception, sycophancy)
Lowest over-refusal rate of any recent Claude model
Enhanced cybersecurity safeguards with six new probes
Comprehensive safety evaluations including interpretability methods

Role in Series

Claude model tiers by capability:

Claude Haiku 4.5: Fastest, most cost-effective
Claude Sonnet 4.5: Balanced performance and cost
Claude Opus 4.6: Maximum intelligence (this model)

Anthropic: Claude Opus 4.6

Model Type

Recommended Use Cases