evals.report

Claude Opus 4.6

Anthropic · Claude Opus. Released Feb 5, 2026.

Benchmark results (12)

| Benchmark            | Category        | Score  | Metric                | Status   | Date         |
|----------------------|-----------------|--------|-----------------------|----------|--------------|
| FrontierMath         | Reasoning       | 40.7%  | accuracy              | Official | May 30, 2026 |
| DeepSWE              | Coding          | 27.06% | % resolved            | Official |              |
| ARC-AGI-3            | Reasoning       | 0.51%  | accuracy              | Official | May 19, 2026 |
| ARC-AGI-2            | Reasoning       | 69.17% | accuracy              | Official | May 19, 2026 |
| LMArena              | Chat preference | 1497   | source-defined rating | Official | May 27, 2026 |
| LiveBench            | Reasoning       | 76.33% | score                 | Official | Jan 8, 2026  |
| GPQA Diamond         | Reasoning       | 90.5%  | accuracy              | Official | May 30, 2026 |
| SWE-bench Verified   | Coding          | 78.7%  | % resolved            | Official | May 30, 2026 |
| Humanity's Last Exam | Reasoning       | 34.2%  | accuracy              | Official | May 31, 2026 |
| MMMU-Pro             | Multimodal      | 77.3%  | accuracy              | Official | Apr 8, 2026  |
| AIME (OTIS Mock)     | Reasoning       | 94.4%  | accuracy              | Official | May 30, 2026 |
| SimpleQA Verified    | Other           | 46.5%  | accuracy              | Official | May 30, 2026 |