evals.report

MiniMax M2.5

MiniMax · MiniMax M2. Released Feb 12, 2026.

Benchmark results (13)

| Benchmark | Category | Score | Metric | Status | Date |
| --- | --- | --- | --- | --- | --- |
| SWE-bench Verified | Coding | 80.2% | % resolved | Verified | Feb 12, 2026 |
| SWE-bench Pro | Coding | 55.4% | % resolved | Verified | Feb 12, 2026 |
| GPQA Diamond | Reasoning | 85.2% | accuracy | Verified | Feb 12, 2026 |
| Humanity's Last Exam | Reasoning | 19.4% | accuracy | Verified | Feb 12, 2026 |
| Epoch Capabilities Index | Reasoning | 147.4 | Index | Official | Feb 12, 2026 |
| GDPval | Agents | 1176 | Elo | Official | Feb 12, 2026 |
| SciCode | Coding | 42.6% | accuracy | Unverified | Feb 12, 2026 |
| Global-MMLU | Reasoning | 84.2% | accuracy | Unverified | Feb 12, 2026 |
| WebDev Arena | Chat preference | 1382 | Elo | Verified | Feb 12, 2026 |
| EQ-Bench Creative Writing v3 | Chat preference | 1331 | Elo | Verified | Feb 12, 2026 |
| Design Arena | Chat preference | 1261 | Elo | Verified | Feb 12, 2026 |
| SWE-bench Multilingual | Coding | 68.3% | % resolved | Official | Feb 12, 2026 |
| Vectara Hallucination Leaderboard | Other | 9.1% | Hallucination Rate | Official | Feb 12, 2026 |