evals.report

MiniMax M2.7

MiniMax · MiniMax M2. Released Mar 18, 2026.

Benchmark results (12)

| Benchmark | Category | Score | Metric | Status | Date |
|---|---|---|---|---|---|
| SWE-bench Pro | Coding | 56.22% | % resolved | Verified | Mar 18, 2026 |
| SWE-bench Verified | Coding | 78% | % resolved | Unverified | Mar 18, 2026 |
| Artificial Analysis Intelligence Index | Reasoning | 49.6 | Index | Unverified | Mar 18, 2026 |
| SWE-rebench | Coding | 51.9% | Resolved rate (pass@1) | Unverified | Mar 18, 2026 |
| τ²-bench (Telecom) | Tool use | 84.8% | pass^1 | Official | Mar 18, 2026 |
| GDPval | Agents | 1505 | Elo | Official | Mar 18, 2026 |
| SciCode | Coding | 47.0% | accuracy | Unverified | Mar 18, 2026 |
| AA-Omniscience: Knowledge and Hallucination Benchmark | Reasoning | 1 | AA-Omniscience Index | Official | Mar 18, 2026 |
| IFBench | Reasoning | 75.7% | accuracy | Official | Mar 18, 2026 |
| WebDev Arena | Chat preference | 1401 | Elo | Verified | Mar 18, 2026 |
| Design Arena | Chat preference | 1285 | Elo | Verified | Mar 18, 2026 |
| Vectara Hallucination Leaderboard | Other | 12.9% | Hallucination Rate | Official | Mar 18, 2026 |